Erez Zadok <[EMAIL PROTECTED]> writes:
> In message <[EMAIL PROTECTED]>, Ion Badulescu writes:
>
> [...]
> > The current implementation will also populate the page cache with pages
> > that are not Uptodate, but are not Locked either, which is clearly a bug.
> > It will always happen if there is a partial write to a page, e.g. if a
> > program creates a file and then writes 1.5k worth of data, on a 1k-block
> > filesystem.
> >
> > It should be fixed either by getting all the buffers within the page
> > Uptodate, or by throwing away the page at the end of the write operation.
> >
> >
> > Ion
>
> Right. This messed up our stacking code a bit. The VFS essentially does
> this in generic_file_read:
>
> read_cache_page()
> wait_on_page()
> if (!Page_uptodate()) {
> report an error
> }
Nope. generic_file_read does essentially:
read_cache_page();
wait_on_page();
if (!Page_Uptodate()) {
mapping->a_ops->readpage()
wait_on_page();
if (!Page_Uptodate()) {
report an error;
}
}
This is important as it greatly speeds of the case of writing
partial pages. It eliminates the read/modify write case, if possible.
For block based filesystems if we aren't writing on block boundaries
we probably shouldn't mess with it and do the read/modify/write, if
we aren't writing on a block boundary.
For network filesystems with no block alignment constraints, and much
greater latency the win is real.
>
> Since our stacking behaves like a VFS, we have to reproduce the above code
> in our readpage(). For some file systems, such as cryptfs, there are two
> pages in memory for each normal page: one ciphertext and one cleartext. But
> now we have a problem in the following scenario:
>
> (1) you copy a file through the lower level file system (ext2) which has a
> 1k block size for a 4k page size (intel)
>
> (2) the file you copy isn't an exact multiple of PAGE_CACHE_SIZE
>
> (3) the caching at the ext2 level will put the last page in the cache, with
> only some of the buffers being BH_Uptodate. This is
> fs/buffer.c:__block_commit_write() which ext2 uses. The code in that
> function will not set the page uptodate flag on partial pages that only
> have a few buffers uptodate. So now we have a page in the cache that is
> not up-to-date. Now see what happens when our stacking layer executes
> the code similar to generic_file_read() in our own readpage():
O.k. This case looks like a performance bug.
Worth fixing but it should not prevent correct operations.
> Summary: a page should not be in the cache and not be up-to-date. If it is
> not up-to-date, then it should also probably be locked, but only b/c it is
> probably in transit from the disk to the cache.
Nope.
> We have tested the patch included here, which tries to ensure that no pages
> are in the cache and not up-to-date; it fixes __block_prepare_write(). Can
> someone who knows the buffer.c code well comment on this? I can't see a way
> out of this situation without one of the following:
>
> - partial pages are put in the cache if they are the last page of a file,
> but they are then marked up-to-date, and the rest of the code changed to
> handle this special situation
This should be done, but just because it is a performance optimization.
This case has been done for years, and works fine.
Eric