Stephen, I've looked through the current stuff with truncate()
(BTW, minixfs is broken too - rmdir() hangs solid) and I think that I have
a more-or-less tolerable solution. You definitely know the VFS/VM
interaction better (I'm mostly dealing with the namespace side of things),
so I'd really like to hear your comments on the stuff below.
        a) On the normal filesystems we have 4 kinds of blocks - data,
per-inode metadata, per-fs metadata and free ones.
        b) We can't allow several instances of the buffer_head for the
same block, at least not the stale dirty ones.
        c) Currently we keep the stuff for the first class in the page
cache and the rest in the buffer cache. A large part of our problems comes
from the fact that we need to detect migration of a block from one class to
another, and scanning the whole buffer cache is way too slow.
        d) Moving the second class into the page cache will cause problems
with the bigmem stuff. Besides, I have reasons of my own for keeping those
beasts separate - the softupdates code will get simpler that way. I suspect
that you have similar reasons wrt journalling. The bottom line: we don't
want it there.
        e) We might get away with just dirty block lists, but I think
that we can do better than that: keep a per-inode cache for metadata. It
is going to be separate from the data pagecache. It is never exported to
user space and lives completely in kernel context. Ability to search
there doesn't mean much for normal filesystems, but weirdies like AFFS
will _really_ benefit from it - I've just realized that I was crufting up
an equivalent of such cache there anyway.
        f) Essentially we have several address spaces _and_ mappings
between them - some of them are done via the hardware MMU (process ones),
some are done by hand (per-inode pagecache). I think that we might take
it further and add per-inode metadata address spaces. Moreover, a large
part of the code will be shared anyway, so we might want the ability to make
address spaces full-fledged objects - e.g. Steve Dodd^W^WNTFS driver may
want to keep separate caches for forks, etc.[1]
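
To make (b), (e) and (f) a bit more concrete, here is a rough userspace
sketch of a per-inode metadata cache. It is only an illustration under
stated assumptions: struct meta_buf, struct meta_space, struct toy_inode
and the meta_*() helpers are made-up names, not anything in the tree, and
meta_buf is a toy stand-in rather than the real buffer_head. Locking,
writeback and the actual I/O are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The four block classes from point (a); listed for reference only. */
enum blk_class { BLK_FREE, BLK_DATA, BLK_INODE_META, BLK_FS_META };

/* Hypothetical stand-in for a cached metadata block (NOT buffer_head). */
struct meta_buf {
    unsigned long blocknr;
    int dirty;
    struct meta_buf *next;          /* hash chain */
};

/* Hypothetical per-inode metadata address space, hashed by block number. */
#define META_HASH 16
struct meta_space {
    struct meta_buf *hash[META_HASH];
    unsigned long nr_dirty;
};

/* Toy inode: the metadata space lives next to (not inside) the data cache. */
struct toy_inode {
    struct meta_space meta;
};

/* Look a block up in this inode's metadata space only. */
static struct meta_buf *meta_find(struct meta_space *ms, unsigned long blocknr)
{
    struct meta_buf *b = ms->hash[blocknr % META_HASH];

    while (b && b->blocknr != blocknr)
        b = b->next;
    return b;
}

/* Return the single instance for a block, creating it if needed (point b). */
static struct meta_buf *meta_get(struct meta_space *ms, unsigned long blocknr)
{
    struct meta_buf *b = meta_find(ms, blocknr);

    if (b)
        return b;
    b = calloc(1, sizeof(*b));
    if (!b)
        return NULL;
    b->blocknr = blocknr;
    b->next = ms->hash[blocknr % META_HASH];
    ms->hash[blocknr % META_HASH] = b;
    return b;
}

/* Mark a buffer dirty; the dirty count is bumped only on the transition. */
static void meta_mark_dirty(struct meta_space *ms, struct meta_buf *b)
{
    if (!b->dirty) {
        b->dirty = 1;
        ms->nr_dirty++;
    }
}

/*
 * The block leaves the per-inode metadata class (say, truncate() frees an
 * indirect block): drop it, dirty or not, by walking one hash chain of one
 * inode instead of scanning a global cache. Returns 1 if something was
 * dropped, 0 if the block was not cached here.
 */
static int meta_forget(struct meta_space *ms, unsigned long blocknr)
{
    struct meta_buf **pp = &ms->hash[blocknr % META_HASH];

    for (; *pp; pp = &(*pp)->next) {
        if ((*pp)->blocknr == blocknr) {
            struct meta_buf *b = *pp;

            *pp = b->next;
            if (b->dirty) {
                assert(ms->nr_dirty > 0);
                ms->nr_dirty--;
            }
            free(b);
            return 1;
        }
    }
    return 0;
}
```

The idea is that truncate() would call something like meta_forget() for
each metadata block it frees, so a stale dirty copy can't outlive the
block's move to the free class. Whether a hash, a list or a tree is the
right container is left open here; the hash is just the simplest thing to
show.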

        Sorry for the less than coherent text - I'm wading through the current
code trying to figure out what it is supposed to do and I'd feel much
better if I had a better understanding of the ideology behind the thing. Could
you comment on/ACK/NAK the stuff above? I'm Cc'ing it to fsdevel - this
stuff affects all filesystems and IMO everybody will benefit from
clearly formulated rules regarding work with the buffer cache.
                                                Cheers,
                                                        Al

[1] MULTICS segments with human face, anyone? ;-)

-- 
Two of my imaginary friends reproduced once ... with negative results.
                                Ben <[EMAIL PROTECTED]> in ASR
