> While doing the mkraid --force i accidentally mounted the /dev/md without
> ext2fsing it. This probably damaged the persistent superblock and mkraid
> stopped. Redoing the mkraid was not possible!

How did you manage to mount a device which did not have a filesystem on it? Did
you apply the RAID over an existing filesystem? Either way, you probably could
not redo the mkraid because you first needed to unmount and raidstop the array,
and then use the --really-force option to ignore the existing RAID superblock.

> Using badblocks -w on hdd1 fixed the problem. I think it simply overwrote
> the corrupt superblock.

Overwriting the corrupt superblock sounds like a _very_ bad idea. I would guess
that anything other than the RAID code writing to the superblock would cause
problems. I would go through the procedure below: set the partition as faulty,
raidhotremove it and raidhotadd it, so that the RAID code resyncs the mirrors.
Better do a backup first though, just in case.

> While this happened i noticed i have no idea what to do in case a disk
> failed. Any hints?

If you think there is a problem with a disk, but the RAID code hasn't noticed
it, tell the RAID code there is a problem with:

raidsetfaulty /dev/mdX /dev/sdYn

for each array (X) containing a partition (n) on the faulty disk (Y).

If the partitions are marked as faulty in /proc/mdstat (their entry is followed
by (F)), either because you have set them as faulty, or because the RAID code
noticed there was a problem and marked them as faulty itself, you can remove
them from the array with:

raidhotremove /dev/mdX /dev/sdYn

again for each array, as above. This step is unnecessary if the RAID code felt
that the error was serious enough to kick the partition out of the array.
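As a rough sketch of the two steps so far, the helper below only prints the
commands for each array/partition pair rather than running them, so you can
check the list before typing anything for real. The function name
print_fail_cmds and the "array:partition" argument format are my own invention
for illustration, and the device names in the example call are placeholders.

```shell
# print_fail_cmds: dry run only; prints the commands instead of running them.
# Each argument is a hypothetical "array:partition" pair, e.g. /dev/md0:/dev/sdb1.
print_fail_cmds() {
    for pair in "$@"; do
        md=${pair%%:*}     # text before the first colon: the array
        part=${pair#*:}    # text after the first colon: the partition
        echo "raidsetfaulty $md $part"
        echo "raidhotremove $md $part"
    done
}

# Example: a disk sdb carrying one partition in each of two arrays.
print_fail_cmds /dev/md0:/dev/sdb1 /dev/md1:/dev/sdb2
```

Skip the raidsetfaulty lines for any partition that /proc/mdstat already shows
as (F).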

The arrays are now operating in degraded mode and ready for you to remove the
disk. You can make sure by looking at the last part of each array's entry in
/proc/mdstat. This is a list of Us or _s surrounded by square brackets. Each U
indicates an active partition, and each _ indicates a failed partition. The list
starts with the first disk in the array and works upwards. So [U_] means that
the first disk in the array is active and the second disk is failed. The disks
are ordered as in your /etc/raidtab, and the order is confirmed by bracketed
numbers (starting from 0) after each partition's entry in /proc/mdstat (a failed
disk may not be listed at all).
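To make the bracket notation concrete, here is a small sketch that spells out
a status string slot by slot. The function name decode_md_status is made up
for this example, and it expects you to pass in the bracketed string yourself
(copied from /proc/mdstat); it does not read the file.

```shell
# decode_md_status: print one line per slot of an mdstat status string.
# Slots are numbered from 0, matching the bracketed numbers in /proc/mdstat.
decode_md_status() {
    status=${1#\[}        # strip the leading [
    status=${status%\]}   # strip the trailing ]
    i=0
    while [ -n "$status" ]; do
        rest=${status#?}          # everything after the first character
        c=${status%"$rest"}       # the first character itself
        case $c in
            U) echo "slot $i: active" ;;
            _) echo "slot $i: failed" ;;
            *) echo "slot $i: unknown ($c)" ;;
        esac
        status=$rest
        i=$((i + 1))
    done
}
```

For example, decode_md_status '[U_]' reports slot 0 as active and slot 1 as
failed.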

Remember to back up and unmount any non-RAID partitions that are on that disk.
Then remove it.

When you have replaced the disk, partition the replacement with an identical
partition scheme to the removed disk. Use fdisk to mark any RAID partitions as
type fd. Then do:

raidhotadd /dev/mdX /dev/sdYn

again for each array. Resyncing will then start. You can follow its progress in
/proc/mdstat, which shows the percentage completed and an estimated time to
completion. You can use the arrays while the system is resyncing. The arrays
will return to normal operation once resyncing has finished.
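If you only want the progress lines, something along these lines may do. It is
a sketch that assumes the progress line in /proc/mdstat contains the word
"resync" or "recovery"; check the output of cat /proc/mdstat on your own
kernel first. The function name show_resync and its optional file argument are
just for illustration.

```shell
# show_resync: print mdstat lines that mention a resync or recovery.
# Assumption: progress lines contain the word "resync" or "recovery".
# An optional argument names the file to read (default: /proc/mdstat).
show_resync() {
    grep -E 'resync|recovery' "${1:-/proc/mdstat}" \
        || echo "no resync in progress"
}
```

Running show_resync every minute or so by hand is enough to see whether the
percentage is moving.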

In practice, most of the above will be unnecessary. Generally, the RAID code
will notice a failure and automatically fail and remove the appropriate
partitions. In that case, you only need to follow the steps above from backing
up the non-RAID partitions onwards.

Cheers,


Bruno Prior         [EMAIL PROTECTED]

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Joachim Zobel
> Sent: 18 November 1999 20:19
> To: [EMAIL PROTECTED]
> Subject: Little quirk while setting up raid1
>
>
>
> Hi,
>
> i've got a raid1 with 0.90 up and running. Cool. Autorebuild was definitely
> a feature i wanted to have.
>
> While doing the mkraid --force i accidentally mounted the /dev/md without
> ext2fsing it. This probably damaged the persistent superblock and mkraid
> stopped. Redoing the mkraid was not possible!
>
> kernel: hdd: read_intr: status=0x59 { DriveReady SeekComplete DataRequest
> Error }
> kernel: hdd: read_intr: error=0x40 { UncorrectableError }, LBAsect=415009,
> sector=414944
> kernel: end_request: I/O error, dev 16:41 (hdd), sector 414944
> kernel: interrupting MD-thread pid 583
> kernel: raid1: mirror resync was not fully finished, restarting next time.
> kernel: raid1: Disk failure on hdd1, disabling device.
> kernel:        Operation continuing on 1 devices
>
> Using badblocks -w on hdd1 fixed the problem. I think it simply overwrote
> the corrupt superblock.
>
> While this happened i noticed i have no idea what to do in case a disk
> failed. Any hints?
>
> Thanx,
> Joachim
>
> --
> "... ein Geschlecht erfinderischer Zwerge, die fuer alles gemietet werden
> koennen."                            - Bertolt Brecht - Leben des Galilei
>
>
>
