I've been experimenting with my 4-disk RAID5 setup for a few weeks now and have been impressed - so far.
 
Today, I succumbed to the temptation of simulating a disk failure (or, more accurately, a power supply failure to a disk) by 'hot' unplugging its power lead. Nothing appeared to happen at first - mdstat reported that all disks were working. Then the system stopped responding to console commands - not even to a shutdown. 'Never mind,' I thought, 'I'll power cycle, and when it fires up again the array will get reconstructed and all will be fine.' However, on restarting, md reported that _2_ disks were non-fresh and kicked them from the array. This left only 2 disks for reconstruction - and the system gave up with a kernel panic. I'm now left with a system that I can't do anything with except a reinstallation from scratch.
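
For what it's worth, this is roughly where I was poking around before giving up. The device names (/dev/md0, /dev/sda1 and friends) are just guesses from my own setup, and I'm assuming the mdadm tool is available - I gather a forced assembly can sometimes resurrect an array whose disks were only marked non-fresh, though I haven't dared try it on anything I care about:

    # Check the array state - this is where md reported the non-fresh disks
    cat /proc/mdstat
    mdadm --detail /dev/md0

    # Inspect the event counter on each member - 'non-fresh' means a
    # disk's counter lags behind the others
    mdadm --examine /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

    # Supposedly a forced assembly will bring the array back up, at the
    # risk of some stale data on the lagging disks
    mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1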
 
Where did I go wrong - what strategy should be adopted in such a situation? It seems to me that cutting the power to a disk is a reasonable test - it simulates a faulty connection. Should I have waited longer before shutting off the power to the system - and was that the reason the second disk went down? If I'd had a spare disk in the array, would it have reconstructed OK?
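
In hindsight, I suppose the safer way to run this sort of test would have been to fail the disk in software rather than pulling its lead - something like the following, I believe (again assuming mdadm and my device names):

    # Mark a member faulty instead of cutting its power
    mdadm /dev/md0 --fail /dev/sdc1

    # Remove it from the array, then re-add it to trigger a rebuild
    mdadm /dev/md0 --remove /dev/sdc1
    mdadm /dev/md0 --add /dev/sdc1

    # Watch the reconstruction progress
    cat /proc/mdstat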
 
Now - this is the interesting one - if the system had _not_ been RAID5, I could probably have done the same thing and still ended up with a usable system. In all likelihood, all that would have happened is that the filesystem would have been marked dirty and an fsck carried out at the next reboot. This suggests that RAID5 _can_ be more fragile than a non-RAID setup.
 
Regards: Jim Ford
