In any case, it would be prudent to check the event counters on the "good"
drives and make sure they are all higher than that of the "bad" drive, hdh2,
before re-adding it to the array. That might require more than one reboot.
It would be especially BAD if the counters just happened to be an exact
match after a reboot and the RAID logic decided that everything was in sync
without any further effort!
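
For what it's worth, here is a rough Python sketch of that check. It assumes
the kernel log lines keep the "hdX2's event counter: NNNNNNNN" format shown in
the quoted log below, with the counter in hex. The function names are my own
(hypothetical), not part of the raid tools, and I have trimmed the sample to
three drives.

```python
import re

# Matches kernel lines such as "hdh2's event counter: 0000000a".
# Assumption: the counter is printed in hexadecimal, as in the quoted log.
COUNTER_RE = re.compile(r"(\w+)'s event counter: ([0-9a-fA-F]+)")


def parse_event_counters(log_text):
    """Return {device: counter} parsed from md kernel log lines."""
    return {dev: int(val, 16) for dev, val in COUNTER_RE.findall(log_text)}


def safe_to_readd(counters, stale_dev):
    """True only if every other device's counter is strictly higher
    than the stale device's counter (an exact match is NOT safe)."""
    stale = counters[stale_dev]
    others = [v for d, v in counters.items() if d != stale_dev]
    return bool(others) and min(others) > stale


sample = """\
May 18 16:38:27 backup kernel: hdh2's event counter: 0000000a
May 18 16:38:27 backup kernel: hdg2's event counter: 00000008
May 18 16:38:27 backup kernel: hda2's event counter: 00000008
"""
counters = parse_event_counters(sample)
print(safe_to_readd(counters, "hdh2"))
```

With the counters from the log (0xa on hdh2, 0x8 on the rest) this reports
that it is not yet safe, which is the situation described above.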
Thanks, Rich B
----- Original Message -----
From: "Corin Hartland-Swann" <[EMAIL PROTECTED]>
To: "Richard Bollinger" <[EMAIL PROTECTED]>
Cc: "Pavel Kucera" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Thursday, May 18, 2000 4:49 PM
Subject: Re: Help with RAID5 damage please
>
> Hi there,
>
> On Thu, 18 May 2000, Richard Bollinger wrote:
> > > May 18 16:38:27 backup kernel: hdh2's event counter: 0000000a
> > > May 18 16:38:27 backup kernel: hdg2's event counter: 00000008
> > > May 18 16:38:27 backup kernel: hdf2's event counter: 00000008
> > > May 18 16:38:27 backup kernel: hde2's event counter: 00000008
> > > May 18 16:38:27 backup kernel: hdd2's event counter: 00000008
> > > May 18 16:38:27 backup kernel: hdc2's event counter: 00000008
> > > May 18 16:38:27 backup kernel: hdb2's event counter: 00000008
> > > May 18 16:38:27 backup kernel: hda2's event counter: 00000008
> > > May 18 16:38:27 backup kernel: md: superblock update time inconsistency
> > > May 18 16:38:27 backup kernel: unbind<hdb2,2>
> >
> > Your logs indicate that the Raid code decided to look at hdh2 as gospel
> > and dismiss all of the rest. The easiest solution is to temporarily
> > disconnect or disable hdh2, then restart the system. It will accept the
> > data on all of the other drives as OK now and start up the array in
> > "degraded" mode due to the missing hdh2 drive. Shut the system down once
> > more, reattach hdh2 and start it up one more time. This time, all of the
> > drives should be there, but with hdh2 listed as out of step. Now you
> > should be able to do a "raidhotadd /dev/md0 /dev/hdh2" to start
> > reconstruction with hdh2 included.
>
> I've thought of a problem with this. IIRC, the event counter is
> incremented once for each successful mount. If you follow this procedure,
> then the raid driver will increment hd[a-g]'s event counters to 9, and
> when you boot back up again you'll be in the same situation.
>
> The best suggestion I can give is to reboot three times so that the event
> counters cycle through '9', 'a' and then 'b'. When you reattach hdh2 and
> reboot, the event counters for hd[a-g] will be greater than hdh2's, and you
> can do the raidhotadd then!
>
> Hope this helps!
>
> Corin