On Fri, Oct 18, 2002 at 11:35:54AM -0700, Matthew Dillon wrote:
>
> :> :
> :> :There is a very easy way to trigger the problem: insert blank floppy
> :> :...
> :>
> :> Your patch looks slightly incomplete to me, but the concept is reasonable.
> :> The BIO_NORETRY test that sets B_INVAL should probably be done in
> :> brelse(), not in bufwait(). It is the code in brelse() that actually
> :> does the re-dirtying of the buffer in case of a write-error.
> :
> :Ah, actually I've initially put it into brelse() but then reconsidered
> :a decision and moved it down into bufwait(). I'll move it back. ;)
>
>     Heh heh. Well, it seems to me that since it is the BUF abstraction
>     that has the error check / redirtying / retry code, then the BUF
>     abstraction should probably be responsible for the no-retry case as
>     well. The BIO abstraction is really designed to hold an I/O operation,
>     not really to hold meta operations. You could still specify a BIO
>     flag for it since it's a media hack of sorts, but the BUF code should
>     be responsible for processing it.

OK, thank you for the detailed explanation.

>     I dunno about a formal abstraction. We need to differentiate between
>     media which can and cannot remap blocks. A 'perfect' solution
>     would be far more complex. File data blocks would have to be
>     remapped at the filesystem level and meta-data would have to be
>     invalidated in-core (bitmap, inode blocks with write errors), and
>     the filesystem would have to be marked dirty on unmount. Then unmount
>     could safely destroy the buffers representing the write-error'd meta
>     data.
>
>     The VFS layer would definitely need to be involved. We have the
>     advantage in that the buffer cache is already logically mapped, but
>     it would still be a fairly sophisticated piece of work.
>
> :>     This re-dirtying is necessary in most cases to prevent filesystem
> :>     corruption. Otherwise the buffer may be thrown away and a re-read
> :>     may return the original pre-modified data, causing massive filesystem
> :>     corruption elsewhere (consider what that would mean for a bitmap block).
> :>
> :>     I think it's perfectly reasonable to do away with the buffer in the
> :>     case of a floppy error, though.
>
>     Just a bit of history. Originally the buffer cache did not retry error'd
>     out writes. I changed it several years ago because the mechanism
>     was producing massive filesystem corruption in the face of disk write
>     errors. The floppy issue was a known issue at the time and I am quite
>     happy that someone is tackling the problem now!

Hmm, the current approach doesn't look all that "right" to me, because we
are retrying the operation even though the upper-layer code that initiated
it has already been notified of the failure (e.g. received EIO), so it
should not assume that the data was actually written successfully. Or am I
missing something?

-Maxim
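
For concreteness, here is a small user-space C model of the two behaviours discussed in this thread: re-dirtying a buffer after a failed write so that it gets retried, versus invalidating it when the media is flagged as no-retry. It is only a sketch, not the kernel code. The struct, the function toy_brelse() and all TOY_* flag values are made up for illustration; only the names B_INVAL and BIO_NORETRY are taken from the discussion, and using "delayed write" and "error" flags to stand for "dirty" and "failed" is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this toy model; NOT the kernel's definitions. */
#define TOY_B_DELWRI    0x01  /* buffer is dirty; the write will be retried later */
#define TOY_B_INVAL     0x02  /* buffer contents are to be thrown away */
#define TOY_B_ERROR     0x04  /* the last I/O on this buffer failed */
#define TOY_BIO_NORETRY 0x08  /* media hint: retrying the write is pointless */

/* Minimal stand-in for a buffer header (hypothetical, far simpler than struct buf). */
struct toy_buf {
    int flags;     /* combination of the TOY_* flags above */
    int is_write;  /* non-zero if the failed operation was a write */
};

/*
 * Toy release path.  On a failed write it either re-dirties the buffer so
 * the data is not lost and the write can be retried, or, when the no-retry
 * hint is set, invalidates the buffer instead.
 */
static void
toy_brelse(struct toy_buf *bp)
{
    assert(bp != NULL);

    if (bp->is_write && (bp->flags & TOY_B_ERROR)) {
        if (bp->flags & TOY_BIO_NORETRY) {
            /* No-retry case (e.g. a bad floppy): give up on the data. */
            bp->flags |= TOY_B_INVAL;
            bp->flags &= ~TOY_B_DELWRI;
        } else {
            /* Default case: clear the error and keep the buffer dirty. */
            bp->flags &= ~TOY_B_ERROR;
            bp->flags |= TOY_B_DELWRI;
        }
    }
}
```

Calling toy_brelse() on a write buffer with TOY_B_ERROR | TOY_BIO_NORETRY leaves it marked TOY_B_INVAL and no longer dirty. Without the no-retry flag the buffer ends up dirty again, which is the retry-after-EIO behaviour questioned above.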