On Sun, 28 Nov 1999, Manfred Spraul wrote:

> Alexander Viro wrote:
> > >
> > > I think it's a Bad Thing (tm) that file write operations are single
> > > threaded [generic_file_write() calls down(&i_sem)], and I'd like to
> > > change that.
> > 
> > Danger: clashing patches ahead. 
> 
> I assumed that. If you want help, just send me your patches: I could 
> review them, or I could implement missing parts.
> 
> > In particular, any stuff related to
> > areas, NFS atomicity, etc. would better live on address_space level.
> 
> Are you sure? I thought that the address_space was added to clean up
> the page cache [to get rid of the dummy inode for the swapper].

Erm... That too ;-) Actually the thing is _more_ than page cache cleanup.
At least I intended it to be more than that and I think that it will be.
swapper_inode breakage was the last straw.
Consider an MMU that caches by virtual address. Unlike the cache-by-PA
designs it has the cache _before_ the TLB and address translation. That's
what the pagecache is. struct address_space is the front part of the MMU
context. Since we have the cache ahead of the MMU proper we can use the
same caching mechanics for different MMUs. We have at least 4 of them -
the normal buffer cache, the buffer cache for filesystems with contiguous
files, NFS and swap. Moreover, loopback, compression, etc. stuff fits
there too. Probably we'll need to move readpage(), flushpage() and
friends into address_space - we still have tons of ad-hackery and
special-casing here, but it can be cleaned up. Moreover, some of the
inode methods are not a VFS-to-filesystem interface - they are
MMU-to-filesystem (and their _set_ depends on the MMU - e.g.
->get_block() has different semantics for normal and contiguous-file
filesystems and makes no sense for NFS et al.)
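
As a rough illustration of that split (a sketch only - the names below
are hypothetical, not the actual interface), the idea is a
per-address_space method table, so the same page cache front end can be
driven by whichever backend the data really lives in:

/* sketch only: hypothetical names, not the real structures */
struct page;                            /* opaque page cache page */
struct address_space_sketch;

/* per-backend methods: the same cache front end can be fed by the
 * buffer cache, a contiguous-file filesystem, NFS or the swapper */
struct address_space_ops_sketch {
	int (*readpage)(struct address_space_sketch *as, struct page *p);
	int (*writepage)(struct address_space_sketch *as, struct page *p);
	int (*flushpage)(struct address_space_sketch *as, struct page *p,
			 unsigned long offset);
};

struct address_space_sketch {
	const struct address_space_ops_sketch *ops; /* which "MMU" backs it */
	void *host;                                 /* inode, swap area, ... */
};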
        IMO the right thing would be to have something like a vma tree for
the address space. It's _definitely_ better than what we have for locks
right now - the current implementation sucks severely. NFS atomicity _is_
locking of a range in the address space; another question is whether we
can combine it nicely with file locks. I hope that we can - moreover,
that we can do it better than the current fs/lockd does.
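
A minimal user-space sketch of that range-locking idea (hypothetical
names, nothing to do with fs/lockd or anything in the tree; error
handling omitted): keep a list of locked byte ranges and put a new
locker to sleep until nothing overlapping it remains locked.

#include <pthread.h>
#include <stdlib.h>

struct range_lock {
	unsigned long start, end;       /* locked range [start, end) */
	struct range_lock *next;
};

static struct range_lock *locked;       /* list of currently held ranges */
static pthread_mutex_t lock_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t lock_cond = PTHREAD_COND_INITIALIZER;

static int overlaps(unsigned long s, unsigned long e)
{
	struct range_lock *r;
	for (r = locked; r; r = r->next)
		if (s < r->end && r->start < e)
			return 1;
	return 0;
}

void lock_range(unsigned long start, unsigned long end)
{
	struct range_lock *r = malloc(sizeof(*r));
	r->start = start;
	r->end = end;
	pthread_mutex_lock(&lock_mutex);
	while (overlaps(start, end))    /* sleep until no conflicting range */
		pthread_cond_wait(&lock_cond, &lock_mutex);
	r->next = locked;
	locked = r;
	pthread_mutex_unlock(&lock_mutex);
}

void unlock_range(unsigned long start, unsigned long end)
{
	struct range_lock **p, *r;
	pthread_mutex_lock(&lock_mutex);
	for (p = &locked; (r = *p); p = &r->next)
		if (r->start == start && r->end == end) {
			*p = r->next;
			free(r);
			break;
		}
	pthread_cond_broadcast(&lock_cond); /* wake anyone waiting on us */
	pthread_mutex_unlock(&lock_mutex);
}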
        Now, the metadata is a totally different can of worms. Here the
rwlock (truncate and O_APPEND writes are writers, normal writes are
readers) is exactly what we need. Doing truncate() in a clean way will
take some patches, but with rwlocks and the most nasty part of the old
truncate patch already in the tree they will be easy. BTW, symlinks are
going into the pagecache too - the VM-related patches in -pre3 came from
that (read: they were needed in the new symlink patch and happened to be
the least controversial part ;-)
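
To make that rule concrete, a user-space sketch with POSIX rwlocks (the
function names here are made up): ordinary writes take the lock shared,
since they only need to exclude size changes and not each other, while
truncate() and O_APPEND writes take it exclusive.

#include <pthread.h>

/* sketch of the proposed rule, not kernel code */
static pthread_rwlock_t i_size_lock = PTHREAD_RWLOCK_INITIALIZER;

void ordinary_write(void)
{
	pthread_rwlock_rdlock(&i_size_lock);  /* many writers in parallel */
	/* ... copy data into the page cache ... */
	pthread_rwlock_unlock(&i_size_lock);
}

void truncate_or_append_write(void)
{
	pthread_rwlock_wrlock(&i_size_lock);  /* i_size changes: exclusive */
	/* ... update i_size, drop or extend pages ... */
	pthread_rwlock_unlock(&i_size_lock);
}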

> NFS atomicity has nothing to do with the page cache; I'd implement
> it either within "struct inode" or with a third structure.
> 
> [ie struct inode, contains struct atomic_access,
> contains struct address_space]
> 
> Btw, do you know if the file pointer must be atomically updated?
> E.g. you have a file with fixed-size, unordered records. You could open
> the file, start 10 worker threads, and all of them just executed:
> 
> for (;;) {
>       ssize_t n = read(fd, &data, sizeof(data));
>       if (n <= 0)             /* EOF or error */
>               break;
>       process_data(&data);
> }
> 
> Do you know if we have to ensure that no record is returned twice and
> no record is missed? I just checked the POSIX standard, but I found no
> clear answer. Neither 2.2.13 nor 2.3.29 enforces that.

Hell knows. Check the other Unices - POSIX is a politics-ridden piece of
work. Usually if it says nothing it means that some BMissed'em'VFH contained
some mind-boggling lossage and the vendor could not be bothered to fix it.
I would not rely on such a construction in a portable program. So I'd say
that it depends on how hard it will be to implement ;-)
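
For what it's worth, a portable way to write the worker loop from the
quoted example without relying on shared-file-pointer atomicity (a
sketch; process_data() and the record layout are stand-ins from the
example above) is to hand out offsets explicitly and use pread(), so the
kernel's f_pos semantics never enter the picture:

#include <pthread.h>
#include <unistd.h>

struct data_rec { char payload[128]; }; /* stand-in for the real record */
#define RECSIZE sizeof(struct data_rec)

static int fd;                          /* opened by main() */
static pthread_mutex_t off_mutex = PTHREAD_MUTEX_INITIALIZER;
static off_t next_off;

void process_data(struct data_rec *d);  /* provided elsewhere */

void *worker(void *arg)
{
	struct data_rec data;

	for (;;) {
		off_t off;

		/* hand out record offsets explicitly instead of trusting
		 * that ten threads sharing one file pointer never collide */
		pthread_mutex_lock(&off_mutex);
		off = next_off;
		next_off += RECSIZE;
		pthread_mutex_unlock(&off_mutex);

		if (pread(fd, &data, RECSIZE, off) != (ssize_t)RECSIZE)
			break;          /* EOF, short read or error */
		process_data(&data);
	}
	return NULL;
}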
