That is essentially the same problem I have with raid-1. If I unplug/power
off a drive, the machine never recovers, even though it looks like the raid
kernel code detects that the raidset has gone degraded.
-- Nathan
------------------------------------------------------------
Nathan Neulinger EMail: [EMAIL PROTECTED]
University of Missouri - Rolla Phone: (573) 341-4841
Computing Services Fax: (573) 341-4216
> -----Original Message-----
> From: Gordon Henderson [mailto:[EMAIL PROTECTED]]
> Sent: Monday, June 14, 1999 6:08 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Dump a raid5?
>
>
> On Sun, 13 Jun 1999, Carlos Carvalho wrote:
>
> > Gordon Henderson ([EMAIL PROTECTED]) wrote on 12 June 1999 17:27:
> > >2 problems though - I powered down a drive to see what happened and I got
> > >a spew of messages to say it was running in degraded mode, but it didn't
> > >actually carry on. It seemed that all accesses to the array were blocked.
> > >I couldn't even halt the machine cleanly - had to hit the reset button (it
> > >sat there trying to unmount the disks for a long time) Rebooted, it
> > >wouldn't automatically rebuild the array
> >
> > This is strange... for me it rebuilds fine if the raid wasn't cleanly
> > stopped. About powering down, you must have hardware that supports
> > this. If there's a problem on the SCSI bus, the machine will hang. On
> > the same issue, if on reboot it finds that both disks on the
> > controller where you stopped the disk are out of sync it won't
> > rebuild, because there'll be two drives out, and raid5 gives
> > redundancy for just one failure. The boot msgs. show this.
>
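
The single-failure limit mentioned above can be illustrated with plain XOR parity. This is only a minimal sketch, not the md driver's code: the block contents and the names xor_blocks and rebuild are made up for illustration, and the rotating parity layout a real raid5 uses is ignored.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-disk array: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]


def rebuild(stripe, missing):
    """Rebuild the one missing block from the survivors.

    `missing` is a list of disk indices that are gone. With a single
    parity block, only one missing index can be recovered.
    """
    if len(missing) != 1:
        raise ValueError("single parity can only recover one missing block")
    survivors = [blk for i, blk in enumerate(stripe) if i not in missing]
    return xor_blocks(survivors)


if __name__ == "__main__":
    print(rebuild(stripe, [1]))   # the block that was on disk 1
    try:
        rebuild(stripe, [1, 2])   # two disks out: nothing to rebuild from
    except ValueError as err:
        print("cannot rebuild:", err)
```

With one block gone, XORing the three survivors gives it back; with two gone there is one equation and two unknowns, which is why the array will not come back if two members are out of sync.
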
> I can individually power down each drive. They are in removable caddies.
> Maybe having the drive powered down but still connected to the bus was the
> problem and I should have just unplugged it totally. Once I'd rebooted it
> and fsck'd it (3 times before it passed a clean fsck - some very odd
> messages, but it did keep all the files!) then it ran fine in degraded
> mode and I had to use the --really-force option to mkraid --force-resync
> to get it to rebuild the array, which it did do satisfactorily. I'm going
> to do some more tests with it today though.
>
> As for the boot messages... They are way too verbose! The dmesg buffer
> isn't large enough to hold them all )-:
>
>
> For dump: I did this:
>
> Script started on Mon Jun 14 11:33:58 1999
>
> voxel:/# uname -a
> Linux voxel 2.2.6 #1 SMP Fri Jun 11 21:03:24 BST 1999 i686 unknown
>
> voxel:/# cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid5] [translucent]
> read_ahead 1024 sectors
> md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
> 53761152 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
> unused devices: <none>
>
> voxel:/# ls -l /mnt
> total 0
> -rw-r--r-- 1 root root 0 Jun 14 11:30 testFile
>
> voxel:/# mount /dev/md0 /mnt
>
> voxel:/# df -m /mnt
> Filesystem MB-blocks Used Available Capacity Mounted on
> /dev/md0 50849 1025 47199 2% /mnt
>
> voxel:/# ls -l /mnt
> total 1049604
> -rw-r--r-- 1 root root 1073741824 Jun 12 15:13 f1
>
> voxel:/# dump 0f /var/tmp/dum1 /dev/md0
> DUMP: Date of this level 0 dump: Mon Jun 14 11:35:19 1999
> DUMP: Date of last level 0 dump: the epoch
> DUMP: Dumping /dev/hda1 (/) to /var/tmp/dum2
> DUMP: mapping (Pass I) [regular files]
> /dev/hda1: Ext2 inode is not a directory while mapping files in dev/md0
>
> voxel:/# dump 0f /var/tmp/dum1 /mnt
> DUMP: Date of this level 0 dump: Mon Jun 14 11:35:10 1999
> DUMP: Date of last level 0 dump: the epoch
> DUMP: Dumping /dev/hda1 (/) to /var/tmp/dum1
> DUMP: mapping (Pass I) [regular files]
> DUMP: mapping (Pass II) [directories]
> DUMP: estimated 25 tape blocks on 0.00 tape(s).
> DUMP: dumping (Pass III) [directories]
> DUMP: dumping (Pass IV) [regular files]
> DUMP: DUMP: 22 tape blocks on 1 volumes(s)
> DUMP: finished in less than a second
> DUMP: Closing /var/tmp/dum1
> DUMP: DUMP IS DONE
>
> voxel:/# restore -i -f /var/tmp/dum1
> restore > ls
> .:
> mnt/
>
> restore > cd mnt
> restore > ls
> ./mnt:
> testFile
>
> restore > quit
> voxel:/# exit
> exit
>
> Script done on Mon Jun 14 11:35:58 1999
>
>
> If someone can explain what I'm doing wrong... I want to run Amanda with
> dump on this machine... (it's also got a 35/70GB DLT drive connected and I
> currently run amanda + dump on another 2 Linux servers without any
> problems ...)
>
> Gordon
> --
> Gordon Henderson, \ Pixelfusion Ltd.
> Senior Systems Administrator \ 2430 The Quadrant,
> +44 0 1454 878 740 \ Aztec West,
> +44 0 1454 878 644 (fax) \ Almondsbury, Bristol. BS32 4AQ
>
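
For checking from a script whether the kernel considers an array degraded, something like the following could be used. It is a rough sketch that assumes the /proc/mdstat layout shown in the session above, and that the bracketed pair on the status line means [configured members/active members]. The function name degraded_arrays and the SAMPLE text are illustrative only, and other kernel or raid patch versions may format the file differently.

```python
import re

# Sample text modelled on the /proc/mdstat output quoted above.
SAMPLE = """Personalities : [linear] [raid0] [raid1] [raid5] [translucent]
read_ahead 1024 sectors
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      53761152 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array: (configured, active)} for arrays missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Assumed status form: "[4/4]" = configured members / active members.
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and counts:
            configured, active = int(counts.group(1)), int(counts.group(2))
            if active < configured:
                result[current] = (configured, active)
            current = None
    return result


if __name__ == "__main__":
    try:
        with open("/proc/mdstat") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on machines without md
    print(degraded_arrays(text))
```

This only reports what the kernel has already noticed; it says nothing about why I/O to the array blocks after a drive is powered off.
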