Friday 7 September 2007

Mail Status

  • All sites are up apart from our www.tomeraider.com.
  • We are currently working on mail backups.
  • We have mail backups for EVERY domain - this is very good.
  • Some mail backups have been messed up a bit by people using remedial options in the downtime like the webmail and the redirects, but it shouldn't be too bad...
  • We haven't restored mail like this before so we are going to try with a test account first.

Thursday 6 September 2007

All Client Sites are Done!

At last!

Something like 9 years after the server went down (it feels like that), all the client sites are up and being QA'd. There is lots more to do but we are in the home straight.

We are overwhelmed by the clients' understanding

Really, I know a lot of clients have taken a big hit this week; we all have. But I just can't say thank you enough for how understanding and sympathetic every single one of them has been, bar one who hasn't been reading this blog (using this blog for updates has been a godsend too, I reckon, for all involved).


Thank you. Thank you. Thank you.

It's been pretty hellish but would have been zillions of times worse without this understanding :)



It's not all bad...

I just got a call from a guy named Paul...

He said "Andrew... recommended you for a new website."

I said, "He did? His site's been down nearly a week!"

"I know," he said with a telephone smile.

"But I still want and Andrew still recommends...."

:)

Recovering "Old Mails"

Once the client sites are set up and the email is working, the next big issue is the "missing mails" that were sent during the various stages of downtime.

In theory we should be able to get all of these, but I won't sugar the pill: the myriad mail systems that have been set up as temporary measures and the general server problems mean that some mails might have gone missing. We don't expect many, we hope none, but we just can't tell yet.

Some more sites are up....

http://www.tm-clothing.com http://cornwallsustainabilityawards.org/

Morning update 9.12

There have been some small issues since the last couple of sites; we have resolved those and are continuing to update the sites.

It's going well again.

It's a beautiful day outside in Truro and in Colombo it's "sunny" (it's either "sunny" or "torrential").
www.burrowsestateagents.co.uk
www.duckworthpotter.co.uk

The Shipping Forecast

It's 4.30 am and we are uploading and it's going well. We haven't actually encountered ANY problems at all in the last couple of hours, so that's quite remarkable for the week.

I'm now going to try to get a few hours' sleep. If you're a client please try not to ring me too early, but do email myprojects{at}yadabyte.com if your site isn't live by the time you're up.

Launch Priorities

The remainder of the sites are being rolled out on a pretty ad hoc basis. If your site isn't up when you get up, please email the myprojects{at}yadabyte.com address and just ask for it to be expedited.

The last sites we are going to do will be for our software business (tomeraider.com etc.), which like our client sites have pretty much had their balls ripped out this week :(


Epic thanks to Sumith and Waruna who are on the last 2 hours of a 27 hour non-stop workthrough.

New Sites Up... even more....

http://leapmedia.co.uk/
http://nfasoccerskills.com
http://www.ipsf.org/

New Sites Up...

http://www.envisionsw.org.uk/
http://auntieannes.co.uk/
http://www.smorgers.com/

More sites up.....

http://www.fingerprint-jewellery.co.uk/
girlstravelclub.co.uk

First site is up!

We have the first site up and all seems well with it. www.vivastay.com

We have two types of sites: ones with what are called Plesk backups (Plesk is a server system) and ones with folder-level backups. We are going to be doing the Plesk backups first; these are generally the most complex sites.
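The ordering described above can be sketched in a few lines of Python. This is only an illustration: the site names and their backup types below are hypothetical, and `rollout_order` is a made-up helper, not a script we actually ran.

```python
# Hypothetical site list; names and backup types are made up for illustration.
SITES = [
    ("site-a.example", "folder"),
    ("site-b.example", "plesk"),
    ("site-c.example", "folder"),
    ("site-d.example", "plesk"),
]

# Lower number = restored earlier. Plesk backups go first.
PRIORITY = {"plesk": 0, "folder": 1}


def rollout_order(sites):
    """Return the sites with Plesk backups first, then folder-level backups.

    sorted() is stable, so sites keep their original relative order
    within each backup type.
    """
    return sorted(sites, key=lambda site: PRIORITY[site[1]])


for name, backup_type in rollout_order(SITES):
    print(f"{backup_type:>6}  {name}")
```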



In terms of the order we roll them out: it's nearly 2am UK and 6.30 Sri Lankan time, and we have a bit of a "production line" going in SL.....

I see a light!

87.106.9.47

Put that in your browser!! There is the seed of a website there!!

on our server!!!

Wednesday 5 September 2007

Update 22.23

It's still ongoing... it's taking ages... it's going to be an all-nighter in Truro and Colombo and I know who has the best coffee...

Update 17.43

There has been a delay of an hour with the new server company.




Update 15.00

We are finally moving files over to the new server house. The server isn't quite ready (it should be by 4) but this new company has bent over backwards to help.


It's not over, but whatever happens, without the 3 day and 3 night dedication of the Yadabyte Sri Lanka team I would still be on the phone to the Philippines. Sumith, Waruna, Amila, Vaj, Pras, you guys especially, I can't thank you enough.

UPDATE 13.40

We have the sites ready to start being uploaded.

This is a break in the clouds.

I'm praying to the binary gods it's not just the eye of the storm.

UPDATE 13.30

The new company we have to provide the server is finalizing it. That's good news.

Server Update

The backup should be complete in 2 hours

Server Update 08 AM

We are taking a backup of the server now, so we have access and nothing should have been lost (between the 1and1 backups and our own backups).

We are also preparing the sites for setting up on the new server as follows:

  1. We will get a "Down for unscheduled maintenance" holding page up on all sites.
  2. We will set up emails again on the sites.
  3. We will upload the site to the new server.
Note that these can be done fairly simultaneously.
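As a rough illustration of step 1, a holding page can be served with a 503 status so that browsers and search engines treat the outage as temporary. This is only a sketch using Python's standard library; the page wording, the port and the `Retry-After` value are assumptions, and the `holding_response` and `serve` helpers are hypothetical names rather than anything we actually deployed.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical holding-page text; the real wording may differ.
HOLDING_HTML = (
    "<html><head><title>Down for unscheduled maintenance</title></head>"
    "<body><h1>Down for unscheduled maintenance</h1>"
    "<p>We are working to restore this site.</p></body></html>"
)


def holding_response(retry_after_seconds=3600):
    """Build the status code, headers and body for the holding page."""
    body = HOLDING_HTML.encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
        # Hint that the outage is temporary (the value is an assumption).
        "Retry-After": str(retry_after_seconds),
    }
    return 503, headers, body


class HoldingPageHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same holding page."""

    def do_GET(self):
        status, headers, body = holding_response()
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Serve the holding page until interrupted (port is an arbitrary choice)."""
    HTTPServer(("", port), HoldingPageHandler).serve_forever()
```

Calling `serve()` starts the page on port 8080 for local testing; a real site would sit behind whatever web server the host provides.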

Tuesday 4 September 2007

It's not directly relevant but...

Stress

1and1 problems

One of our clients just sent this.... it seems we are not alone:


Thank you for your updates, it's not going to help in any way but I had a quick look at 1&1's impressive track record..
the first link talks about problems terminating contracts..
Kindest regards,

Some progress...

We have made some progress with the site. Our main aim is to get a complete backup of the site before we reimage it, but we are having problems mounting the RAID (hard drive) and getting no help from 1and1 :(

The guys in SL are still working on it as well; it's past midnight in Colombo.

Update and Some Closure

As of 14.45 this is how it stands:

Email

We have emails working for all sites. If you think you are missing mails please contact me asap so we can investigate.


  • We are currently taking the sites off the 1and1.co.uk server and tomorrow, when we have them all, we will start putting them on a new server.
  • We will then start pointing the name servers, which can take up to 72 hours but usually is close to instant.
  • So tomorrow all sites (apart from our own tomeraider.com, which will be a monster move) should be up and running.
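For anyone curious how the name server change can be checked, here is a minimal Python sketch. It only asks whether a domain currently resolves to the new server's address from the machine running it, so a local DNS cache can give a stale answer. The domains and the IP address are placeholders, and `has_propagated` and `pending_domains` are hypothetical helper names.

```python
import socket


def has_propagated(domain, expected_ip, resolve=socket.gethostbyname):
    """Return True if `domain` currently resolves to `expected_ip`.

    `resolve` can be swapped for another lookup function, which makes
    the check easy to test without touching the network.
    """
    try:
        return resolve(domain) == expected_ip
    except OSError:
        # Covers socket.gaierror: the name did not resolve at all.
        return False


def pending_domains(domains, expected_ip, resolve=socket.gethostbyname):
    """List the domains that do not yet point at the new server."""
    return [d for d in domains if not has_propagated(d, expected_ip, resolve)]


def report(domains, expected_ip):
    """Print one status line per domain (performs real DNS lookups)."""
    waiting = set(pending_domains(domains, expected_ip))
    for domain in domains:
        state = "still waiting" if domain in waiting else "pointing at new server"
        print(f"{domain}: {state}")
```

Running `report(["client-site.example"], "203.0.113.10")` every so often would show which sites have switched over; both arguments here are placeholder values.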

Conclusion

I hope that any clients reading realise that what has happened to your site, and the other sites, was very, very far out of all of our control. I guess the only mistake we made is that 2 years ago we went with 1and1.co.uk.

Nonetheless, if anyone would like to stop hosting with Yadabyte we will give you all the help you need to move your site to another server, and continue to support it as if it was on our server, free of charge.

Thank you to all of our clients for being so very understanding. It's been a horrible and helpless 2 days but it would have been much, much worse if you hadn't been :)

Mat


Some good news

We still don't have the server but we can now telnet into it and retrieve the databases for the sites, which we are doing.

This means that when we get the new server fully installed we will be able to pick up pretty much where we left off... at least that's the idea...

This is officially the worst 36 hours of my working life.


It didn't work

Same message: "Bringing up interface eth0:
Determining IP information for eth0... failed; no link present. Check cable?
[FAILED]
"

There is hope....

Dear Matthew Ripley, (Customer ID: 10236598)

Thank you for contacting us.

This is in regards to the phone conversation I had with you earlier.

With regards to your concern, we will advice you to please reboot your
server to rescue mode and then choose options.
Linux Rescue System (debian/woody - 2.4.x)
Linux Rescue System (debian/woody - 2.6.x)

.....


Watch this space..

We have been contacted by 1and1.co.uk !

And lo! From the darkness!

1and1.co.uk have managed, after about 26 hours, to contact us with the following:



Dear Matthew Ripley, (Customer ID: 10236598)

Thank you for contacting us.

We really understand your frustrations right now, unfortunately, we
really cannot give any estimated timeframe for this. What we can only
suggest here is to wait for further resplies from our highest level of
support. Rest assured that your issue has already and still currently
being worked on.

If you have any further questions please do not hesitate to contact us.

--
Sincerely,
Ell Sworth Guy Caballero
Technical Support
1&1 Internet


It's not actually any news because this came from: support@1and1.co.uk

The mysterious "Server Department", who are the only ones who can deal with this, use this email: server247@1and1.com

They don't reply, but if you are a client feel free to email them about your particular domain (that might help?)

Some Good news - emails

We think that we can create new domains for email and thus the sites can start receiving emails again. We will have to do this one by one but we are about to start.

Update to the down date

The server has now been down just over 24 hours.

The Good News:


The Bad News:

The server is down.

By 10 we are going to find a new company to host our server and will start moving the sites, client sites first.

Hopefully the server will be up today and we can do a more leisurely move rather than an evacuation.


Again, to our clients, I'm so very sorry. And thank you for being so very understanding.

Monday 3 September 2007

Goodnight Cruel Internet

It's twenty past midnight... after my last call, this time to the Philippines via the 1and1 head office, I get the same old bull.

Meh...

1and1 server down problem update 19.10

Still no joy, I cannot believe it. This is the worst customer service of any company I have ever dealt with, ever, including Orange, Sky and even Compuserve.

1and1 server down problem update 17.20

I have spoken to the head office of United Internet in Germany.

They were no help. It seems that nobody in America, Germany or the Philippines can communicate directly with the UK server cluster.

I have just been on the phone again to the Philippines. Everyone is saying it's being worked on, but the evidence doesn't suggest that.

The feeling of helplessness is maddening!

Yadabyte Server Issues

It's been a hell of a day. Our server went down this morning and hasn't been up since.

These things happen, although this hasn't happened to us in many years of serving; when it did go bad, it went bad in spades.

But what's been really terrible is the appalling support that www.1and1.co.uk internet have been providing us. We have been lied to, passed all around the globe between the Philippines and basically treated as worthless - which considering the large amount of money we pay them is even more frustrating.

1and1 are supposed to be the biggest ISP in Europe, and they are pretty big in the States too. They guarantee great uptime and speed but I guess when it comes to it, as a company you need to get judged by how you deal with the bad times too.

Clients have understandably been calling us all day to get the lowdown and I have been unable to give them any real information; it's been stressful to say the least.

One of the real issues I have with them today is the fact that their entire support infrastructure has no connection with the hardware, as in the servers. So the guys and gals in the Philippines can't do anything about hardware issues like the ones we are having.

  • At 8 am we were told to call back in 30 mins.
  • We call back; they have no idea about it, but then escalate it.
  • We call back again, another escalation; by 10am UK we were supposedly at the highest level.
  • Then we get told it's a problem with UDP flooding (we didn't think that made sense).
  • Then we get told it's a BILLING ISSUE - I can't believe this so chase it up with their accounts department. No billing problem - what kind of internet system allows such misinformation?
  • We can see now that they are doing something on the server. Not sure what, but now, over 12 hours after the initial report, maybe we will have our sites, and most importantly our client sites, back up.
Stay tuned... and if this has affected you... sorry

Mat