Re: readpage/writepage completion inode ops

Ion Badulescu Thu, 14 Sep 2000 12:16:23 -0700
In article <[EMAIL PROTECTED]> you wrote:

> Thanks, Ion.
> That sounds close to what I had in mind.
> 
> But one problem that I could see in terms of using this address space op
> in a stackable filesystem is on the reverse stacking path needed during I/O
> completion.
> i.e. We want things to happen in the following sequence:
>      upper layer readpage
>      lower layer readpage
>      lower layer readpagecomplete
>      upper layer readpagecomplete
> 
> As you have rightly observed in your earlier mail, the lower layer must
> not know anything about the layer above it, and we do need to think about
> how to use this address space op to achieve stacking without violating this
> requirement.

This is fairly easy, at least the way we implement stacking right now. Since
each filesystem layer uses the page cache, each one has its own page
allocated for a given piece of data. Each of those pages would have a pointer
to a method belonging to the layer above (which allocated the page in the
first place and then passed it down), so the notification can propagate
up without any problems.
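
A minimal userspace sketch of that propagation, under stated assumptions:
the struct page below is a stand-in rather than the kernel's definition, and
the io_done/io_done_data fields and every function name are hypothetical.
It only illustrates that the lower layer can notify upward through the page
without knowing what is stacked on top of it.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_EVENTS 8

/* Userspace stand-in for struct page; NOT the kernel's definition. */
struct page {
    void (*io_done)(struct page *, int); /* hypothetical: set by the layer above */
    void *io_done_data;                  /* hypothetical: the upper layer's page */
    int uptodate;
};

enum event { UPPER_READ, LOWER_READ, LOWER_DONE, UPPER_DONE };

static const char *const event_names[] = {
    "upper layer readpage", "lower layer readpage",
    "lower layer readpagecomplete", "upper layer readpagecomplete",
};

static enum event events[MAX_EVENTS];
static int nevents;

/* Record and print each step so the ordering can be inspected. */
static void trace(enum event ev)
{
    assert(nevents < MAX_EVENTS);
    events[nevents++] = ev;
    printf("%s\n", event_names[ev]);
}

/* Called when I/O on a page finishes: notify the layer above, if any. */
static void page_io_done(struct page *page, int uptodate)
{
    page->uptodate = uptodate;
    if (page->io_done)
        page->io_done(page, uptodate);
}

/* Lower layer: it never names the upper layer, it only uses the page. */
static void lower_readpage(struct page *page)
{
    trace(LOWER_READ);
    /* Real code would submit I/O here; this sketch completes immediately. */
    trace(LOWER_DONE);
    page_io_done(page, 1);
}

/* Upper layer's callback; it is handed the LOWER page. */
static void upper_readpage_complete(struct page *lower_page, int uptodate)
{
    struct page *upper_page = lower_page->io_done_data;

    trace(UPPER_DONE);
    page_io_done(upper_page, uptodate); /* keeps propagating if stacked again */
}

/* Upper layer: installs its callback on the lower page, then passes it down. */
static void upper_readpage(struct page *upper_page, struct page *lower_page)
{
    trace(UPPER_READ);
    lower_page->io_done = upper_readpage_complete;
    lower_page->io_done_data = upper_page;
    lower_readpage(lower_page);
}
```

Calling upper_readpage() with two zero-initialized pages walks through the
four steps in the order listed in the quoted message, and the back-pointer
stored in the lower page is how the upper layer finds its own page again.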

I know this multiple-caching is suboptimal, but I see no clean way around it.
A good optimization would be to mark all the lower layer pages as "discard",
but I don't think the page cache currently supports such a feature. Please
correct me if I'm wrong -- I'd love to be proven wrong here. :-)

BTW, I now realize that I was in fact right in my first reply. :-) You can't
use an address_space op for the callback, because it must call back into
the upper layer, whereas the address_space ops reachable from struct page
belong to the current layer. So an addition to struct page is inevitable.
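
To make the distinction concrete, here is a small userspace sketch. The
mapping and a_ops names mirror the kernel's, but the structs are simplified
stand-ins, and readpage_complete and the per-page io_done pointer are
hypothetical; neither is claimed to exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct page;

/* Hypothetical op table: readpage_complete is NOT an existing kernel method. */
struct address_space_operations {
    int (*readpage_complete)(struct page *);
};

struct address_space {
    const struct address_space_operations *a_ops;
};

/* Simplified stand-in for struct page. */
struct page {
    struct address_space *mapping; /* belongs to the page's own layer */
    int (*io_done)(struct page *); /* hypothetical per-page callback */
};

enum layer { LOWER = 1, UPPER = 2 };

/* Each completion routine just reports which layer it belongs to. */
static int lower_complete(struct page *page) { (void)page; return LOWER; }
static int upper_complete(struct page *page) { (void)page; return UPPER; }

static const struct address_space_operations lower_aops = { lower_complete };
static struct address_space lower_mapping = { &lower_aops };

/* Dispatch through the op table: always lands in the page's own layer. */
static int complete_via_aops(struct page *page)
{
    assert(page->mapping && page->mapping->a_ops);
    return page->mapping->a_ops->readpage_complete(page);
}

/* Dispatch through the per-page pointer: lands wherever the installer chose. */
static int complete_via_page(struct page *page)
{
    return page->io_done ? page->io_done(page) : 0;
}
```

For a lower-layer page whose io_done was set to upper_complete, the first
dispatcher returns LOWER and the second returns UPPER, which is why the
callback has to live in the page rather than in the op table.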

> (BTW, regarding the converse, what I really meant to say but perhaps didn't
> manage to express properly is that the layer on top should not need to know
> anything about the internals of how the lower layer implements its
> functionality - since we would like to be able to layer over any file
> system type, without needing to know which type it is.)

Yes, I think we are in violent agreement here. :-) See also my reply to Ben.

> One option that I first thought of was that the upper layer could change
> the lower layer's completion op to point to the upper layer's
> readpagecomplete. The upper layer readpagecomplete would invoke the lower
> layer's completion op, and then perform its own completion processing. The
> only complication is that the upper layer's readpagecomplete would still be
> passed in the lower layer's inode/page and would hence need to devise some
> means of locating the corresponding upper layer inode/page (and from that
> the original lower layer's completion op too).

Ah, but that is too ugly to contemplate. No, modifying the inode's op table
is completely out of the question, especially since it is (implementation-wise)
shared across all the filesystem inodes. No way. A separate callback mechanism
added to struct page is a lot cleaner.
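
A tiny userspace sketch of the sharing problem, with hypothetical names
throughout (read_done is not a real inode operation, and the structs are
simplified stand-ins): because every inode of a filesystem points at the
same op table, patching it on behalf of one inode changes them all.

```c
#include <assert.h>
#include <stddef.h>

struct inode;

/* Hypothetical op table with a single completion method. */
struct inode_ops {
    int (*read_done)(struct inode *);
};

/* Simplified stand-in for struct inode. */
struct inode {
    int ino;
    struct inode_ops *i_op; /* shared by all inodes of the filesystem */
};

static int lower_read_done(struct inode *inode) { (void)inode; return 0; }
static int upper_read_done(struct inode *inode) { (void)inode; return 1; }

/* One table for the whole lower filesystem. */
static struct inode_ops lowerfs_ops = { lower_read_done };

/* What "changing the lower layer's completion op" amounts to. */
static void hijack_completion(struct inode *inode)
{
    assert(inode != NULL && inode->i_op != NULL);
    inode->i_op->read_done = upper_read_done; /* rewrites the SHARED table */
}
```

After hijack_completion() runs on one inode, any other inode using
lowerfs_ops also completes through upper_read_done, even if nothing is
stacked on top of it.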


Ion

-- 
  It is better to keep your mouth shut and be thought a fool,
            than to open it and remove all doubt.