Richard Maloley II <rich...@rrcomputerconsulting.com> writes:

> Earlier this week I went to a client’s business to finish some work on their
> server.  Something went horribly wrong and I could use some help figuring
> out what happened.
>
> Some background information first… This is a small business with about 15
> workstations and a single Server 2003 SBS installation. Their C: drive is a
> mirror of two 80GB drives using a Silicon Graphics SATA RAID card (add on
> card, not an original part of the server).

I assume you mean "Silicon Image", who make SATA fakeRAID cards, rather than
Silicon Graphics.  Given your later comments about low-cost "RAID", I think
that is a pretty safe assumption.

[...]

> I unplugged all the SATA cables and put it all back together. I am fairly
> confident that the two drives for the OS were reconnected to the proper
> ports on the RAID card, otherwise I feel the RAID BIOS would have given me
> an error message.

You have a lot more confidence in the BIOS than my experience supports.

[...]

> I checked the event logs – same thing! Log files are all blank from
> 2/11/2010 until 4/14/2010.

[...]

> It appears as though Windows lost two months of data. I’m at a loss as to
> how this could have happened.

Well, my guess is that you hit the same problem that I had a handful of
clients hit in the past:

The vendor-supplied Windows software RAID drivers for things like the Silicon
Image "RAID" controllers are terrible, and are prone to things like...

> The only initial thought is that I swapped the SATA cables on the OS mirror
> set… but it’s a mirror, so it should be in sync 100%, not 2 months behind!

...dropping a drive out of the array, so that you discover well after the fact
that one disk was not up to date.  You usually find out because the system
reboots and decides to use the out-of-date disk for some reason, most often a
failure of the second disk.

Worse, our experience is that the RAID BIOS or software RAID driver usually
ends up overwriting the second disk, which it considers "out of date" even
though it holds the newer data, if that disk is at all accessible, to "fix"
the broken array after a reboot.

> Since this is a consumer level card it has no usable tools or log files that
> I can see.  Swapping the cables again sounds dangerous to me – I don’t
> believe that it would be a safe thing to do.
>
> Has anyone heard or experienced anything like this?

Sure.  Several times, and on that basis I pretty much started telling clients
that they had better maintain excellent, well-tested backups if they intend
to run critical systems on any of the low-end "RAID" hardware.
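
As a rough illustration of what "well-tested" can mean, here is a minimal
sketch that compares a test restore against the data it was taken from, file
by file, using SHA-256.  The function names and the idea of restoring into a
scratch directory are my assumptions for the example, not a feature of any
particular backup product, and on a live system some differences will simply
be files that changed since the backup ran.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_root, restored_root):
    """List files under source_root that are missing from, or differ in,
    the restored copy.  Both arguments are hypothetical directory paths."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            source_path = os.path.join(dirpath, name)
            rel = os.path.relpath(source_path, source_root)
            restored_path = os.path.join(restored_root, rel)
            if not os.path.isfile(restored_path):
                problems.append((rel, "missing from restore"))
            elif sha256_of(source_path) != sha256_of(restored_path):
                problems.append((rel, "content differs"))
    return sorted(problems)
```

An empty list from `verify_restore` only says the restore matched at that
moment; it is no substitute for actually booting or using the restored data.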

Intel were the least-worst vendor for software RAID solutions, though I still
have very little faith in their tools.  At least they put a little more effort
into keeping them up to date than many of the other vendors.

> Does the community have any other options that I might be able to try?

You could *try* pulling the second disk and reading it in another machine, to
see if it contains the extra data, then hand-integrating that back into the
primary device.
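
If the second disk does turn out to be readable, something like the following
sketch could help find what is worth hand-integrating.  It assumes both
filesystems are mounted (read-only, ideally) on the other machine, and that
file modification times are trustworthy enough to compare; the function name
and the two-second tolerance are made up for illustration.

```python
import os


def find_newer_files(primary_root, secondary_root, tolerance=2.0):
    """Return (relative path, reason) pairs for files that exist only on the
    secondary copy, or whose modification time there is later than on the
    primary copy by more than `tolerance` seconds."""
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(secondary_root):
        for name in filenames:
            secondary_path = os.path.join(dirpath, name)
            rel = os.path.relpath(secondary_path, secondary_root)
            primary_path = os.path.join(primary_root, rel)
            try:
                secondary_mtime = os.path.getmtime(secondary_path)
            except OSError:
                continue  # unreadable on the pulled disk; skip it
            if not os.path.exists(primary_path):
                candidates.append((rel, "missing on primary"))
            elif secondary_mtime - os.path.getmtime(primary_path) > tolerance:
                candidates.append((rel, "newer on secondary"))
    return sorted(candidates)
```

The result is only a list of candidates to review by hand; it copies nothing,
which is deliberate given how little either disk can be trusted at this point.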

I wouldn't hold my breath, however: in my experience, the odds are good that
the controller overwrote the second (good) disk with the old data.

        Daniel

-- 
✣ Daniel Pittman            ✉ dan...@rimspace.net            ☎ +61 401 155 707
               ♽ made with 100 percent post-consumer electrons

_______________________________________________
Tech mailing list
Tech@lopsa.org
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
