That is essentially the same problem I have with raid-1. If I unplug or power
off a drive, the machine never recovers, even though the raid kernel code
appears to detect that the raidset has gone degraded.

-- Nathan

------------------------------------------------------------
Nathan Neulinger                       EMail:  [EMAIL PROTECTED]
University of Missouri - Rolla         Phone: (573) 341-4841
Computing Services                       Fax: (573) 341-4216


> -----Original Message-----
> From: Gordon Henderson [mailto:[EMAIL PROTECTED]]
> Sent: Monday, June 14, 1999 6:08 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Dump a raid5?
> 
> 
> On Sun, 13 Jun 1999, Carlos Carvalho wrote:
> 
> > Gordon Henderson ([EMAIL PROTECTED]) wrote on 12 June 1999 17:27:
> >  >2 problems though - I powered down a drive to see what happened and I
> >  >got a spew of messages to say it was running in degraded mode, but it
> >  >didn't actually carry on. It seemed that all accesses to the array were
> >  >blocked. I couldn't even halt the machine cleanly - had to hit the reset
> >  >button (it sat there trying to unmount the disks for a long time)
> >  >Rebooted, it wouldn't automatically rebuild the array
> > 
> > This is strange... for me it rebuilds fine if the raid wasn't cleanly
> > stopped. About powering down, you must have hardware that supports
> > this. If there's a problem on the SCSI bus, the machine will hang. On
> > the same issue, if on reboot it finds that both disks on the
> > controller where you stopped the disk are out of sync it won't
> > rebuild, because there'll be two drives out, and raid5 gives
> > redundancy for just one failure. The boot msgs. show this.
> 
> I can individually power down each drive. They are in removable caddies.
> Maybe having the drive powered down but still connected to the bus was the
> problem and I should have just unplugged it totally. Once I'd rebooted it
> and fsck'd it (3 times before it passed a clean fsck - some very odd
> messages, but it did keep all the files!) then it ran fine in degraded
> mode and I had to use the --really-force option to mkraid --force-resync
> to get it to rebuild the array which it did do satisfactorily. I'm going
> to do some more tests with it today though.
> 
> As for the boot messages... They are way too verbose! The dmesg buffer
> isn't large enough to hold them all )-:
> 
> 
> For dump: I did this:
> 
> Script started on Mon Jun 14 11:33:58 1999
> 
> voxel:/# uname -a
> Linux voxel 2.2.6 #1 SMP Fri Jun 11 21:03:24 BST 1999 i686 unknown
> 
> voxel:/# cat /proc/mdstat 
> Personalities : [linear] [raid0] [raid1] [raid5] [translucent] 
> read_ahead 1024 sectors
> md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
>   53761152 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
> unused devices: <none>
> 
> voxel:/# ls -l /mnt
> total 0
> -rw-r--r--   1 root     root            0 Jun 14 11:30 testFile
> 
> voxel:/# mount /dev/md0 /mnt
> 
> voxel:/# df -m /mnt
> Filesystem         MB-blocks    Used Available Capacity Mounted on
> /dev/md0               50849    1025    47199      2%   /mnt
> 
> voxel:/# ls -l /mnt
> total 1049604
> -rw-r--r--   1 root     root     1073741824 Jun 12 15:13 f1
> 
> voxel:/# dump 0f /var/tmp/dum1 /dev/md0
>   DUMP: Date of this level 0 dump: Mon Jun 14 11:35:19 1999
>   DUMP: Date of last level 0 dump: the epoch
>   DUMP: Dumping /dev/hda1 (/) to /var/tmp/dum2
>   DUMP: mapping (Pass I) [regular files]
> /dev/hda1: Ext2 inode is not a directory while mapping files in dev/md0
> 
> voxel:/# dump 0f /var/tmp/dum1 /mnt
>   DUMP: Date of this level 0 dump: Mon Jun 14 11:35:10 1999
>   DUMP: Date of last level 0 dump: the epoch
>   DUMP: Dumping /dev/hda1 (/) to /var/tmp/dum1
>   DUMP: mapping (Pass I) [regular files]
>   DUMP: mapping (Pass II) [directories]
>   DUMP: estimated 25 tape blocks on 0.00 tape(s).
>   DUMP: dumping (Pass III) [directories]
>   DUMP: dumping (Pass IV) [regular files]
>   DUMP: DUMP: 22 tape blocks on 1 volumes(s)
>   DUMP: finished in less than a second
>   DUMP: Closing /var/tmp/dum1
>   DUMP: DUMP IS DONE
> 
> voxel:/# restore -i -f /var/tmp/dum1
> restore > ls
> .:
> mnt/
> 
> restore > cd mnt
> restore > ls
> ./mnt:
> testFile
> 
> restore > quit
> voxel:/# exit
> exit
> 
> Script done on Mon Jun 14 11:35:58 1999
> 
> 
> If someone can explain what I'm doing wrong... I want to run Amanda with
> dump on this machine... (it's also got a 35/70GB DLT drive connected and I
> currently run amanda + dump on another 2 Linux servers without any
> problems ...)
> 
> Gordon
> -- 
> Gordon Henderson,             \  Pixelfusion Ltd.
> Senior Systems Administrator   \  2430 The Quadrant,
>  +44 0 1454 878 740             \  Aztec West,
>  +44 0 1454 878 644 (fax)        \  Almondsbury, Bristol. BS32 4AQ
> 
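
For anyone who would rather check for a degraded array from a script than
dig through the boot messages, here is a minimal sketch in Python. It assumes
the /proc/mdstat layout shown in the quoted session: an "mdN :" line followed
by a line ending in something like "[4/4] [UUUU]", where the first number is
the configured member count, the second is the number of working members,
and an underscore marks a missing member. The function name degraded_arrays
and the SAMPLE text are made up for illustration and are not part of any
raid tool.

```python
import re

# "md0 : active raid5 ..." introduces an array.
ARRAY_RE = re.compile(r"^(md\d+)\s*:")
# "[4/4] [UUUU]" gives configured/working member counts and a status map.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

# Sample text copied from the quoted session (quote prefixes removed).
SAMPLE = """\
Personalities : [linear] [raid0] [raid1] [raid5] [translucent]
read_ahead 1024 sectors
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
  53761152 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of arrays whose status line shows a missing member."""
    degraded = []
    current = None  # name of the array whose status line we are waiting for
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            configured = int(status.group(1))
            working = int(status.group(2))
            if working < configured or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a live machine the same function could be fed the contents of
/proc/mdstat instead of SAMPLE. It only reports what the kernel already
says about the array; it does nothing about the hang itself.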
