Ingo,

I can fairly regularly generate corruption (data or ext2 filesystem) on a busy
RAID-5 by adding a spare drive to a degraded array and letting it build the
parity. Could the problem be from the bad (illegal) buffer interactions you
mentioned, or are there other areas that need fixing as well? I have been
looking into this issue for a long time with no resolution. Since you may be
aware of possible problem areas, any ideas, code, or encouragement would be
greatly welcome.

<>< Lance.


Ingo Molnar wrote:

> On Wed, 12 Jan 2000, Gadi Oxman wrote:
>
> > As far as I know, we took care not to poke into the buffer cache to
> > find clean buffers -- in raid5.c, the only code which does a find_buffer()
> > is:
>
> yep, this is still the case. (Sorry Stephen, my bad.) We will have these
> problems once we try to eliminate the current copying overhead.
> Nevertheless there are bad (illegal) interactions between the RAID code
> and the buffer cache, i'm cleaning up this for 2.3 right now. Especially
> the reconstruction code is a rathole. Unfortunately blocking
> reconstruction if b_count == 0 is not acceptable because several
> filesystems (such as ext2fs) keep metadata caches around (eg. the block
> group descriptors in the ext2fs case) which have b_count == 1 for a longer
> time.
