On Tue, Aug 07, 2007 at 06:09:03PM -0700, Andrew Morton wrote:
> On Tue, 7 Aug 2007 07:51:29 +0200
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> 
> > nobh mode error handling is not just pretty slack, it's wrong.
> > 
> > One cannot zero out the whole page to ensure new blocks are zeroed,
> > because it just brings the whole page "uptodate" with zeroes even if
> > that may not be the correct uptodate data. Also, other parts of the
> > page may already contain dirty data which would get lost by zeroing
> > it out. Thirdly, the writeback of zeroes to the new blocks will also
> > erase existing blocks. All these conditions are pagecache and/or
> > filesystem corruption.
> > 
> > The problem comes about because we didn't keep track of which buffers
> > actually are new or old. However it is not enough just to keep only
> > this state, because at the point we start dirtying parts of the page
> > (new blocks, with zeroes), the handling of IO errors becomes impossible
> > without buffers because the page may only be partially uptodate, in
> > which case the page flags allone cannot capture the state of the parts
> > of the page.
> > 
> > So allocate all buffers for the page upfront, but leave them unattached
> > so that they don't pick up any other references and can be freed when
> > we're done. If the error path is hit, then zero the new buffers as the
> > regular buffer path does, then attach the buffers to the page so that
> > it can actually be written out correctly and be subject to the normal
> > IO error handling paths.
> > 
> > As an upshot, we save 1K of kernel stack on ia64 or powerpc 64K page
> > systems.
> > 
> > Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
> > 
> 
> With this change, nobh_prepare_write() can magically attach buffers to the
> page.  But a filesystem which is running in nobh mode wouldn't expect that,
> and could quite legitimately go BUG, or leak data or, more seriously and
> much less fixably, just go and overwrite page->private, because it "knows"
> that nobody else is using ->private.

I was fairly sure that a filesystem can not assume buffers won't be
attached, because there are various error case paths thta do exactly
the same thing (eg. nobh_writepage can call __block_write_full_page
which will attach buffers). 

Does any filesystem assume this? Is it not already broken?


> I'd have thought that it would be better to not attach the buffers and to
> go ahead and do whatever synchronous IO is needed in the error recovery
> code, then free those buffers again.

It is hard because if the synchronous IO fails, then what do you do?
You could try making it up as you go along, but of course if we _can_
attach the buffers here then it would be preferable to do that. IMO.
 

> Also, you have a couple of (cheerily uncommented) PagePrivate() tests in
> there which should be page_has_buffers().

Yeah, I guess the whole thing needs more commenting :P
page_has_buffers... right, I'll change that.


-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to