Now that the dust has settled, we know what happened. Our tech didn't 
completely disconnect the SAN connections (he unplugged them, but not far 
enough) when installing ESX v3.5 on a new physical host, and the installer 
formatted a SAN drive instead of the local drive. If we had known this before 
powering off the VMs, we could have VMotioned them to the other SAN, but at 
the time we didn't know this.

I still shouldn't have had all my eggs on one SAN (and now don't). Version 4 
of ESX doesn't allow this without having to click through some very prominent 
"are you sure!?!?!" boxes, whereas apparently v3.5 just throws it wherever, 
making it easy to shoot yourself in the foot.

Dave

From: Brian Desmond [mailto:br...@briandesmond.com]
Sent: Friday, October 08, 2010 2:13 PM
To: NT System Admin Issues
Subject: RE: How'd this for a bad day? AKA bad me

Sounds like you should home the redundant sets of VMs on different SAN 
volumes/whatever?

Thanks,
Brian Desmond
br...@briandesmond.com

c - 312.731.3132


From: David Lum [mailto:david....@nwea.org]
Sent: Friday, October 08, 2010 11:51 AM
To: NT System Admin Issues
Subject: How'd this for a bad day? AKA bad me

I have 7 production systems running on 3 different ESX boxes in an ESX cluster, 
and 2 different logical SAN volumes (sorry am not SAN savvy, I just know I have 
two different SAN volumes to choose from when making a VM).

Today, a SAN blows up and takes out half - our SharePoint server (heavily 
used), a Terminal Server, and an internal occasionally-used web server 
(Namescape rDirectory). Then somehow, when I was told to power down the other 4 
VMs so our VMware guy could reboot a vCenter server, 3 of the 4 remaining VMs 
decided to go AWOL (a combination of "missing" and "disconnected"). That took 
out my other two Terminal Servers and another lightly used internal web server.

Did I mention I don't have the normal backups for these things because 
...well...I'm an idiot and didn't confirm our backup guy installed backup 
software on these servers as I stood them up (process error on my part since I 
should confirm it's on there)? None of these store data - they all talk to a 
backend SQL, and the Terminal Servers are used to run apps that are slow when 
run over VPN. SharePoint we got back quick because we do have a staging 
equivalent of it, so it was a repoint to a config and content DB, a DNS 
change, and done.

I do have copious notes on how I built the others and can rebuild from scratch 
easily enough (I just finished the three TS boxes), but dude...six servers at 
once?

The most frustrating part was discovering that the 4 systems that had been 
powered off could have been "migrated" before power off and there would have 
been no issue with them - the power down nuked 'em.

Oh, and the lone surviving server - the PGP Universal Server that manages the 
encrypted machines. (Yes, the PGP machines will still boot w/out the server up, 
but still, I've been on this server 50% of my time over the last two weeks!).

Dave

~ Finally, powerful endpoint security that ISN'T a resource hog! ~
~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/>  ~

---
To manage subscriptions click here: 
http://lyris.sunbelt-software.com/read/my_forums/
or send an email to 
listmana...@lyris.sunbeltsoftware.com
with the body: unsubscribe ntsysadmin
