On Thu, May 21, 2009 at 04:05:18PM +0000, Uwe Dippel wrote:
> Marco Peereboom <slash <at> peereboom.us> writes:
> 
> 
> > The plugging in of the disk is a non-event.  The disk is dead to the
> > OS and by extension to softraid.
> 
> Let me follow up on this topic, please, and report some more experiments and
> results and thoughts.
> 
> I recreated the mirror from scratch, and put /tmp, /var, /usr, /home and 
> /backup directories on it. (No need to point out this is kind of stupid.)
> Running for 2 days. 
> Hot unplugged drive A. Then 'echo Nonsense > /backup/testo'
> Good outcome, though not tested intensely yet: the system keeps running on 
> B as if nothing had happened. 
> Shutdown and plugged A back, restart.

Upon reboot the mirror should be brought up with only the surviving
member.  If this isn't the case please show me a trace so that I can go
fix that bug.

> Fails at the file system check, with 'help!' and dropping to a shell at /var. 
> Problem is that the .pid files had been properly removed on B, but not on A; 
> and I needed to delete those one by one at fsck. I also fsck-ed all other
> partitions, and as to be expected, the 'testo' was on B, not on A, 

Drive A is DEAD.  Do not EVER use it again.

> and therefore it needed to be deleted.
> Reboot, alas, ending in a hang. Reboot.
> Another time /var drops to a shell, it has some trouble with 'lost+found',
> another manual fsck is needed, reboot.
> Finally, the mirror comes up properly. 
> 
> Next, I'd like to do a real test on a production machine. What scares me 
> is the lack of physical access, so the hang and the drop to shell for 
> fsck are not good. And, on a production box here, there might be thousands 
> of files accumulating on the plugged drive that won't be available on the 
> unplugged one,

YOU CAN NEVER USE THE UNPLUGGED DRIVE EVER EVER EVER EVER EVER
AGAIN!!!!!  IT IS DEAD AND IS CORRUPT AND PUPPIES DIE WHEN YOU USE IT!!

Hope this sinks in.

> and I will be asked to delete those. Also, this is not good. 
> My question/suggestion: I for one would be happy if the state after reboot 
> would by default be identical to the (degraded) state before the reboot: 
> Because then I would hope to get the system started without the earlier 
> defunct drive; that means, hopefully starting okay, and more relevant, 
> not require me to do anything, not to delete any files. Simply start with 
> the sane drive of the broken mirror as it was shut down. Then I could 
> dump and restore the data to a freshly created RAID, without any further ado.
> Then, at least, a broken drive or a flimsy controller would not interfere 
> with the proper running and restarting of the box, and would give me the 
> chance to retrieve all data, including the most recent.

If the dead drive becomes a participating member of a raid set,
something is broken; very badly broken.  Show me a trace, including
bioctl output, if this is the case.
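
To illustrate why the unplugged drive cannot simply rejoin, here is a toy
model in Python.  It is NOT how softraid is implemented; the Member and
Mirror classes and the per-member "generation" counter are all made up for
this sketch.  It only shows that once a member misses writes, its contents
diverge from the survivor, and that assembly should keep the survivor only.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    online: bool = True
    generation: int = 0  # hypothetical write counter, not a softraid field
    blocks: dict = field(default_factory=dict)


class Mirror:
    def __init__(self, *members):
        self.members = list(members)

    def write(self, key, data):
        # Only members that are still online receive the write.
        for m in self.members:
            if m.online:
                m.blocks[key] = data
                m.generation += 1

    def fail(self, name):
        # Mark a member as gone, e.g. after a hot unplug.
        for m in self.members:
            if m.name == name:
                m.online = False

    def assemble(self):
        # On "reboot", accept only members holding the newest generation.
        newest = max(m.generation for m in self.members)
        return [m.name for m in self.members if m.generation == newest]


a, b = Member("A"), Member("B")
mirror = Mirror(a, b)
mirror.write("/var/run/foo.pid", "123")    # both members get this write
mirror.fail("A")                           # hot unplug drive A
mirror.write("/backup/testo", "Nonsense")  # only B gets this write
usable = mirror.assemble()
print(usable)
```

In this model A never sees /backup/testo, so it is stale and is left out
when the mirror is assembled; only B is usable.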

> 
> Does this make sense?

Not sure.  I have a hard time following what you are doing vs. what your
expectations are.
