Walter Haidinger writes:
> First of all: Thanks for replying to an issue with a
> non-generic kernel! I really appreciate that!

That it was a non-generic kernel didn't even cross my mind... it was 
an issue w/ RAIDframe, and that's why I responded...

> On Thu, 29 Jun 2006, Greg Oster wrote:
> 
> > > Adding a spare did work:
> > > 
> > > # raidctl -a /dev/wd1g raid1
> > 
> > Isn't that the spare you used for raid2 ?
> 
> Sorry, cut&paste error, should have been wd1f.
>  
> > Hmm.. where are the lines saying reconstruction is "n% complete"? 
> > (they aren't pretty, but in this case they'd be useful)
> 
> I'm sorry, I did not record those. Reconstructing did take some time,
> though, I recall checking the progress, nothing suspicious there,

So did the reconstruction actually complete? 

> > > raidctl -F subsequently fails with "error rewriting parity".
> > 
> > That error will only come from attempting to check parity on a RAID 
> > set with a failed component.  It has nothing to do with "raidctl -F".
> 
> Oh yes, of course! Should have mentioned that I've tried raidctl -P 
> after raidctl -F ...

Ok... so the big question is still: how far along was the 
reconstruction?  "raidctl -P" would fail even if the reconstruction 
was still in progress.
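
Something along these lines would guard the parity check.  This is a 
rough sketch only: it assumes 'raidctl -s' prints one 
"<component>: <state>" line per component, and parity_check_ok is 
just a name made up for illustration.

```shell
#!/bin/sh
# Sketch: read 'raidctl -s'-style status text on stdin and succeed only
# if no component is listed as failed or still reconstructing.
# The "<name>: <state>" line format is an assumption; check your output.
parity_check_ok() {
    ! grep -Eq ': (failed|reconstructing)[[:space:]]*$'
}

# Hypothetical usage on a live system (not run here):
#   raidctl -s raid1 | parity_check_ok && raidctl -P raid1
```

A component showing "spared" passes the check, since by then the 
spare is serving all accesses for it.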
 
> > How long did you wait for the reconstruction to finish?  
> > For the above output, note that it still says "reconstructing" 
> > for component1...  When that finishes, it will say "spared".  
> 
> And what about the spare? Shouldn't it replace component1?

It won't replace it in the output of 'raidctl -s', but it will 
replace component1 for all accesses and so on.  With autoconfig 
turned on, it will also take its proper place after a reboot.  
(Well... except for a known bug in rf_reconstruct.c, where this line:

  c_label.partitionSize = raidPtr->Disks[srow][scol].partitionSize;

should be added where it says:

  /* XXXX MORE NEEDED HERE. */
)

> That never happened. Instead, component1 sequence was:
> failed -> reconstructing -> failed. 

Hmm... I think you should see failed->reconstructing->spared
(that's what you'd see if 'component1' was a "normal" disk...)

You might want to check /var/log/messages* for some indication as to 
why the reconstruction failed... (as well, there should be something 
in there indicating the reconstruction completed, if it did...)
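
A quick way to pull the relevant lines out might look like the 
sketch below.  The search patterns are only a guess at what to look 
for, not the actual kernel message text, and raid_log_lines is a 
made-up helper name.  If your rotated logs are compressed, they 
would need to be decompressed first.

```shell
#!/bin/sh
# Sketch: print lines from the given log files that mention a raid
# device (raid0, raid1, ...) or reconstruction, case-insensitively.
# The patterns are assumptions, not known RAIDframe message formats.
raid_log_lines() {
    grep -i -E 'raid[0-9]+|reconstruct' "$@"
}

# Hypothetical usage (not run here):
#   raid_log_lines /var/log/messages
```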

Later...

Greg Oster
