On 7/30/00, 7:14:16 AM, Daniel Phillips
<[EMAIL PROTECTED]> wrote regarding Re: Questions about
the buffer+page cache in 2.4.0:
> Daniel Phillips wrote:
> > There are two obvious ways to do filesystem-specific special handling of the
> > tail block: (1) in the 'read actor' that does the actual copy-to-user (see? yet
> > another problem solved by an extra level of indirection!) or (2) in the
> > inode->i_mapping->a_ops->readpage function that retrieves the page: essentially
> > we would do a copy-on-write to produce a new page that has the tail fragment in
> > the correct place.
> >
> > I have a very distinct preference for (1), and I'll proceed on the assumption
> > that that's what I'll be doing unless someone shows me why it's bad.
> After digging a little deeper I can see that using the read actor won't work
> because the read actor doesn't take the inode, or anything that can be
> dereferenced to find the inode, as a parameter. So it's not possible to do the
> tail offset check and adjustment there.
> That's ok - it's the wrong place to do it anyway because the check then has to
> be performed each time around the loop. A much better way is to replace
> generic_file_read in the Ext2 file_operations struct by a new ext2_file_read:
> proposed_ext2_file_read:
> - generic_file_read stopping before any tail with nonzero offset
> - If necessary, generic_file_read of the tail with source offset
For reading the tail, take a look at how these functions interact:
get_block
generic_file_read
block_read_full_page (ext2's readpage func)
Putting the tail knowledge into ext2_file_read won't be enough, because it
won't cover mmaps. You have to make sure your readpage/writepage functions
keep the page and buffer caches in sync. Reiserfs does most of this from
get_block...
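
To illustrate why the readpage level covers mmaps as well, here is a
minimal userspace sketch, not kernel code. Everything prefixed toy_ or
TOY_ is hypothetical, the "page" is shrunk to 8 bytes so it can be traced
by hand, and a page is assumed to equal one block. The point is only that
once readpage places the tail fragment at the start of the page, every
consumer of the page cache (read or a page fault) sees the same bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 8 /* assumption: tiny page, equal to one block */

/* Hypothetical stand-in for an inode whose last block may be shared. */
struct toy_inode {
    const char *const *block; /* logical block -> data (models get_block) */
    size_t size;              /* file size in bytes, must be > 0 */
    size_t tail_offset;       /* offset of the tail in the last block, 0 if unmerged */
};

/* Models a filesystem readpage: fill `page` with the contents of page `index`. */
static void toy_readpage(const struct toy_inode *inode, size_t index, char *page)
{
    size_t last, tail_len;

    assert(inode->size > 0);
    last = (inode->size - 1) / TOY_PAGE_SIZE;
    tail_len = inode->size - last * TOY_PAGE_SIZE;

    if (index != last || inode->tail_offset == 0) {
        /* Ordinary block, or a tail that already starts at offset 0. */
        memcpy(page, inode->block[index], TOY_PAGE_SIZE);
        return;
    }

    /* Merged tail: move the fragment to the start of a fresh, zeroed page. */
    memset(page, 0, TOY_PAGE_SIZE);
    memcpy(page, inode->block[index] + inode->tail_offset, tail_len);
}

/* Models an mmap access: fault the page in via readpage, then read one byte. */
static char toy_fault_byte(const struct toy_inode *inode, size_t pos)
{
    char page[TOY_PAGE_SIZE];

    toy_readpage(inode, pos / TOY_PAGE_SIZE, page);
    return page[pos % TOY_PAGE_SIZE];
}
```

For an 11-byte file whose 3-byte tail sits at offset 3 of a shared block,
toy_fault_byte(&ino, 8) returns the first tail byte without the caller
knowing anything about the tail offset.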
> This imposes just one extra check when the tail isn't merged or happens to be at
> the beginning of the tail block, so read overhead for tailmerging is negligible
> when the feature isn't used.
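
The two-step read proposed above can be sketched in userspace as follows.
This is a model under stated assumptions, not the real ext2 or VFS code:
toy_generic_read stands in for generic_file_read, the struct and all toy_
names are hypothetical, and the block size is shrunk to 8 bytes. It shows
the single extra check on the fast path and the split into "everything
before the tail" plus "the tail, read with a source offset".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE 8 /* assumption: tiny block so the example is traceable */

/* Hypothetical stand-in for an inode whose last block may be shared. */
struct toy_inode {
    const char *const *block; /* logical block -> data (models get_block) */
    size_t size;              /* file size in bytes */
    size_t tail_offset;       /* offset of the tail in the last block, 0 if unmerged */
};

/* Models generic_file_read: copy [pos, pos + count), shifting the source
 * position inside each block by src_off. */
static size_t toy_generic_read(const struct toy_inode *inode, size_t pos,
                               size_t src_off, char *buf, size_t count)
{
    size_t done = 0;

    while (done < count) {
        size_t blk = (pos + done) / TOY_BLOCK_SIZE;
        size_t off = (pos + done) % TOY_BLOCK_SIZE;
        size_t n = TOY_BLOCK_SIZE - off;

        if (n > count - done)
            n = count - done;
        memcpy(buf + done, inode->block[blk] + off + src_off, n);
        done += n;
    }
    return done;
}

/* Sketch of the proposed ext2_file_read. Returns the number of bytes read. */
static size_t toy_ext2_file_read(const struct toy_inode *inode, size_t pos,
                                 char *buf, size_t count)
{
    size_t tail_start = inode->size - inode->size % TOY_BLOCK_SIZE;
    size_t done = 0;

    assert(inode->tail_offset < TOY_BLOCK_SIZE);
    if (pos >= inode->size)
        return 0;
    if (count > inode->size - pos)
        count = inode->size - pos;

    /* The one extra check: unmerged tail, or tail at offset 0. */
    if (inode->tail_offset == 0)
        return toy_generic_read(inode, pos, 0, buf, count);

    /* Step 1: generic read, stopping before the tail. */
    if (pos < tail_start) {
        size_t n = tail_start - pos;

        if (n > count)
            n = count;
        done = toy_generic_read(inode, pos, 0, buf, n);
    }

    /* Step 2: if necessary, generic read of the tail with a source offset. */
    if (done < count)
        done += toy_generic_read(inode, pos + done, inode->tail_offset,
                                 buf + done, count - done);
    return done;
}
```

Note that this sketch only covers the read path; it says nothing about
pages that are reached through mmap.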
> Now I have to address the question of how tail blocks can be shared between
> files. This does not seem to me to be an easy question at all. I'll summarize
> the previous discussion here...
You have two real choices. Unpack the tail before any writes to it, then
repack later (file close, whatever). This allows you to use all the
generic functions for writing data, and keeps the synchronization down
(only write to shared data on pack/unpack).
Or, change your prepare, commit, writepage, and get_block routines to
write directly to the shared block. This is somewhat more difficult, and
I suspect slower since you'll have to touch all the inodes in the ring as
you shift data around for each write.
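
The first choice (unpack before writing, repack later) might look roughly
like the userspace sketch below. It is only an illustration: the struct,
the toy_ names, the fixed slot_len and the 8-byte block are all
assumptions, and locking and the ring of inodes sharing the block are left
out entirely. What it shows is that ordinary writes touch only a private
copy, and the shared block is touched only at unpack and repack time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE 8 /* assumption: tiny block so the example is traceable */

/* Hypothetical in-memory state for one file's tail. */
struct toy_tail {
    char *shared;                     /* block shared with other files' tails */
    size_t offset;                    /* where this file's tail sits in `shared` */
    size_t slot_len;                  /* room reserved for this tail in `shared` */
    size_t len;                       /* current tail length in bytes */
    char private_blk[TOY_BLOCK_SIZE]; /* private copy used while unpacked */
    int unpacked;                     /* nonzero while the private copy is live */
};

/* Copy the tail out of the shared block into a private, zero-padded block. */
static void toy_unpack(struct toy_tail *t)
{
    if (t->unpacked)
        return;
    assert(t->offset + t->len <= TOY_BLOCK_SIZE);
    memset(t->private_blk, 0, TOY_BLOCK_SIZE);
    memcpy(t->private_blk, t->shared + t->offset, t->len);
    t->unpacked = 1;
}

/* Models the generic write path: it only ever sees the private block. */
static size_t toy_write(struct toy_tail *t, size_t pos, const char *buf,
                        size_t count)
{
    toy_unpack(t); /* unpack before any write to the tail */
    if (pos >= TOY_BLOCK_SIZE)
        return 0;
    if (count > TOY_BLOCK_SIZE - pos)
        count = TOY_BLOCK_SIZE - pos;
    memcpy(t->private_blk + pos, buf, count);
    if (pos + count > t->len)
        t->len = pos + count;
    return count;
}

/* Repack later (e.g. on file close). Returns 1 if the tail went back into
 * the shared block, 0 if it no longer fits and stays unpacked. */
static int toy_repack(struct toy_tail *t)
{
    if (!t->unpacked)
        return 1;
    if (t->len > t->slot_len)
        return 0;
    memcpy(t->shared + t->offset, t->private_blk, t->len);
    t->unpacked = 0;
    return 1;
}
```

The second choice would instead have toy_write modify `shared` directly,
which is where the per-write shifting of other files' data would come in.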
-chris