Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-18 Thread Trond Myklebust
> " " == Michael Eisler <[EMAIL PROTECTED]> writes: > What if someone has written to multiple, non-contiguous regions > of a page? This has been foreseen and hence we only allow 1 contiguous region per page. If somebody tries to schedule a write to a second region that is not conti

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-18 Thread Michael Eisler
> > " " == Michael Eisler <[EMAIL PROTECTED]> writes: > > >> I'm not clear on why you want to enforce page alignedness > >> though? As long as writes respect the lock boundaries (and not > >> page boundaries) why would use of a page cache change matters? > > > For the reason

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-18 Thread Trond Myklebust
> " " == Michael Eisler <[EMAIL PROTECTED]> writes: >> I'm not clear on why you want to enforce page alignedness >> though? As long as writes respect the lock boundaries (and not >> page boundaries) why would use of a page cache change matters? > For the reason that was poin

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-18 Thread Michael Eisler
> Yes. fs/read_write calls the NFS subsystem. The problem then is that > NFS uses the generic_file_{read,write,mmap}() interfaces. These are > what enforce use of the page cache. So, don't use generic*() when locking is active. It's what most other UNIX-based NFS clients do. Even if it is "stupi

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-16 Thread Linus Torvalds
On 16 Sep 2000, Trond Myklebust wrote: > > Yes. fs/read_write calls the NFS subsystem. The problem then is that > NFS uses the generic_file_{read,write,mmap}() interfaces. These are > what enforce use of the page cache. > > You could drop these functions, but that would mean designing an > ent

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-16 Thread Alan Cox
> I'm not a Linux kernel literate. However, I found your > assertion surprising. Does procfs do page i/o as well? No. An fs isn't required to use the page caches at all. It makes life a lot saner to do so

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-16 Thread Trond Myklebust
> " " == Michael Eisler <[EMAIL PROTECTED]> writes: > I'm not a Linux kernel literate. However, I found your > assertion surprising. Does procfs do page i/o as well? No. It has its own setup. > file.c in fs/nfs suggests that the Linux VFS has non-page > interfaces in add

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-16 Thread Michael Eisler
> > " " == Michael Eisler <[EMAIL PROTECTED]> writes: > > > Focus on correctness and do the expedient thing first, which > > is: > > - The first time a file is locked, flush dirty pages > >to the server, and then invalidate the page cache > > This would be

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-16 Thread Trond Myklebust
> " " == Michael Eisler <[EMAIL PROTECTED]> writes: > Focus on correctness and do the expedient thing first, which > is: > - The first time a file is locked, flush dirty pages > to the server, and then invalidate the page cache This would be implemented with the

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-15 Thread Michael Eisler
> > " " == James Yarbrough <[EMAIL PROTECTED]> writes: > > > What is done for bypassing the cache when the size of a file > > lock held by the reading/writing process is not a multiple of > > the caching granularity? Consider two different clients with > > processes shari

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-15 Thread Trond Myklebust
> " " == James Yarbrough <[EMAIL PROTECTED]> writes: > What is done for bypassing the cache when the size of a file > lock held by the reading/writing process is not a multiple of > the caching granularity? Consider two different clients with > processes sharing a file an

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-15 Thread James Yarbrough
> This will always invalidate the page cache whenever we try to obtain > the lock, hence you are guaranteed that the cache will be reread after > the lock was grabbed. > After unlocking however one needs no guarantees other than ensuring > that any modifications were committed while we held the l
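
The lock/unlock discipline quoted above (invalidate the cache when the lock is obtained, commit modifications before it is released) can be illustrated with a toy model. This is only a sketch under stated assumptions: `Server`, `Client`, and all of their methods are hypothetical names, a whole-file cache stands in for the page cache, and the lock itself (mutual exclusion between clients) is not modelled at all. It is not the Linux NFS client code.

```python
class Server:
    """Hypothetical stand-in for an NFS server holding one file."""
    def __init__(self, data=b""):
        self.data = data


class Client:
    """Toy client with a whole-file cache (a stand-in for the page cache)."""
    def __init__(self, server):
        self.server = server
        self.cache = None      # cached copy of the file, or None if invalid
        self.dirty = False     # True when the cache holds unwritten changes

    def flush(self):
        # Commit local modifications to the server.
        if self.dirty:
            self.server.data = self.cache
            self.dirty = False

    def lock(self):
        # Read barrier: flush our own writes, then drop the cache so the
        # next read has to go back to the server.
        self.flush()
        self.cache = None

    def unlock(self):
        # Write barrier: everything modified under the lock reaches the
        # server before the lock is given up.
        self.flush()

    def read(self):
        if self.cache is None:
            self.cache = self.server.data
        return self.cache

    def write(self, data):
        self.cache = data
        self.dirty = True


server = Server(b"v1")
a, b = Client(server), Client(server)

b.read()                 # b caches the old contents
a.lock()
a.write(b"v2")
a.unlock()               # a's change is now on the server

stale = b.read()         # without locking, b still sees its cached copy
b.lock()
fresh = b.read()         # after locking, b rereads from the server
b.unlock()
```

In this model a reader that skips the lock keeps its stale copy, which is why the scheme only helps when every process touching the file takes the lock.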

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-15 Thread Trond Myklebust
> " " == Michael Eisler <[EMAIL PROTECTED]> writes: > The fix still does not provide coherency guarantees in all > situations, and at minimum, there ought to be a way to force > the client provide a coherency guarantee. Yes. I came to the same conclusion after having sent it o

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Jeff Epler
Jeff Epler <[EMAIL PROTECTED]> writes: > > > Is there a solution that would allow the kind of guarantee our > > > software wants with non-linux nfsds without the cache-blowing > > > that the change I'm suggesting causes? Trond: > > As you can see, the idea is to look at whether or

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Michael Eisler
> > " " == Jeff Epler <[EMAIL PROTECTED]> writes: > > > Is there a solution that would allow the kind of guarantee our > > software wants with non-linux nfsds without the cache-blowing > > that the change I'm suggesting causes? > > How about something like the following compro

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Albert D. Cahalan
Theodore Y. Ts'o writes: > From: "Albert D. Cahalan" <[EMAIL PROTECTED]> >> The ext2 inode has 6 obviously free bytes, 6 that are only used >> on filesystems marked as Hurd-type, and 8 that seem to be claimed >> by competing security and EA projects. So, being wasteful, it would >> be possible to

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Theodore Y. Ts'o
Date: Thu, 14 Sep 2000 17:03:11 +0200 (CEST) From: Trond Myklebust <[EMAIL PROTECTED]> For the timestamps, yes, but inode caching will take most of that hit. After all, the only time stat() reads from disk is when the inode has completely fallen out of the cache. For commonly used

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Trond Myklebust
> " " == Theodore Y Ts'o <[EMAIL PROTECTED]> writes: >Would it perhaps make sense to use one of these last 'free' >fields as a pointer to an 'inode extension'? If you still >want ext2fs to be able to accommodate new projects and >ideas, then it seems that

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Theodore Y. Ts'o
Date: Thu, 14 Sep 2000 15:09:35 +0200 (CEST) From: Trond Myklebust <[EMAIL PROTECTED]> Would it perhaps make sense to use one of these last 'free' fields as a pointer to an 'inode extension'? If you still want ext2fs to be able to accommodate new projects and ideas, then it seem

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Trond Myklebust
> " " == Theodore Y Ts'o <[EMAIL PROTECTED]> writes: > There has been some talk of doubling the size of the ext2 > inode, which will of course cause some backwards compatibility > problems and would mean that you would only be able to use > certain advanced features on new

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Theodore Y. Ts'o
From: "Albert D. Cahalan" <[EMAIL PROTECTED]> Date: Wed, 13 Sep 2000 19:20:42 -0400 (EDT) The ext2 inode has 6 obviously free bytes, 6 that are only used on filesystems marked as Hurd-type, and 8 that seem to be claimed by competing security and EA projects. So, being wastef

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Andi Kleen
On Thu, Sep 14, 2000 at 11:51:20AM +0200, Trond Myklebust wrote: > The reason I'm sceptical to this and other in-core type solutions is > that knfsd doesn't keep files open for long: basically it opens a file > and closes it upon processing of each and every request from a > client. You can therefo

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Trond Myklebust
> " " == Andi Kleen <[EMAIL PROTECTED]> writes: > On Thu, Sep 14, 2000 at 10:49:59AM +0200, Trond Myklebust > wrote: >> > " " == Albert D Cahalan <[EMAIL PROTECTED]> writes: >> >> > The ext2 inode has 6 obviously free bytes, 6 that are only >> > used on filesyste

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Andi Kleen
On Thu, Sep 14, 2000 at 10:49:59AM +0200, Trond Myklebust wrote: > > " " == Albert D Cahalan <[EMAIL PROTECTED]> writes: > > > The ext2 inode has 6 obviously free bytes, 6 that are only used > > on filesystems marked as Hurd-type, and 8 that seem to be > > claimed by competing

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Trond Myklebust
> " " == Albert D Cahalan <[EMAIL PROTECTED]> writes: > The ext2 inode has 6 obviously free bytes, 6 that are only used > on filesystems marked as Hurd-type, and 8 that seem to be > claimed by competing security and EA projects. So, being > wasteful, it would be possible t

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-14 Thread Trond Myklebust
> " " == Jeff Epler <[EMAIL PROTECTED]> writes: > Is there a solution that would allow the kind of guarantee our > software wants with non-linux nfsds without the cache-blowing > that the change I'm suggesting causes? How about something like the following compromise? I haven

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Jeff Epler
On Wed, Sep 13, 2000 at 11:56:56PM +0200, Trond Myklebust wrote: > After consultation with the appropriate authorities, it turns out that > the Sun-code based clients do in fact turn off data caching entirely > when using NLM file locking. Entirely? That's interesting, because for our consistenc

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Jeff Epler
I think that both the NFS server changes that Trond is suggesting and the NFS client changes that I am suggesting have their place. The fact that the tuple (mtime, size) is used to test for unchangedness of a file indicates that the people who designed NFS understood that just using mtime
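
The weakness of the (mtime, size) test that this thread is about can be shown with a small model. This is a hedged sketch, not NFS code: `Attrs`, `ToyFile`, `get_attrs`, and `looks_unchanged` are hypothetical names, and the only assumption carried over from the thread is that the server records mtime in whole seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attrs:
    mtime: int   # whole seconds, as on a server with 1-second timestamps
    size: int    # file length in bytes


class ToyFile:
    """Hypothetical server-side file whose mtime has 1-second resolution."""
    def __init__(self, data, now):
        self.data = data
        self.mtime = int(now)

    def write(self, data, now):
        self.data = data
        self.mtime = int(now)   # the sub-second part is discarded

    def get_attrs(self):
        return Attrs(self.mtime, len(self.data))


def looks_unchanged(cached, current):
    # The (mtime, size) comparison: equal attributes are taken to mean
    # that the cached data is still valid.
    return cached == current


f = ToyFile(b"AAAA", now=1000.10)
cached_attrs = f.get_attrs()
cached_data = f.data

f.write(b"BBBB", now=1000.70)      # same second, same length
missed = looks_unchanged(cached_attrs, f.get_attrs())

f.write(b"CCCC", now=1001.20)      # a write in a later second
caught = not looks_unchanged(cached_attrs, f.get_attrs())
```

Here `missed` is true even though the data differs from the cached copy: a same-length rewrite within the same second is invisible to the comparison, while the later write is detected.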

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Albert D. Cahalan
Trond Myklebust writes: > > " " == Albert D Cahalan <[EMAIL PROTECTED]> writes: >> 20 bits * 3 timestamps == 60 bits 60 bits <= 8 bytes >> >> So you do need 8 bytes. > > Yes. Assuming that you want to implement the microseconds field on > all 3 timestamps. For NFS I would only need that preci

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Andi Kleen <[EMAIL PROTECTED]> writes: > Steal a couple of bytes from what? From the 32-bit storage area currently allocated to i_generation on the on-disk ext2fs inode (as per Ted's suggestion). With current disk/computing speeds, you probably don't need the full 32 bits to

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Jeff Epler <[EMAIL PROTECTED]> writes: > But "ctime and file size are the same" does not prove that the > file is unchanged. That's the root of this problem, and why > NFS_CACHEINV(inode) is not enough to ensure coherency. > Furthermore, according to NFSv4, what

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Theodore Y Ts'o <[EMAIL PROTECTED]> writes: > Why? i_generation is a field which is only used by the NFS > server. As far as ext2 is concerned, it's a black box value. > Currently, as I understand things (and you're much more the > expert on the NFS code than I

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Andi Kleen
On Wed, Sep 13, 2000 at 10:35:10PM +0200, Trond Myklebust wrote: > > No, it does not help. i_generation is only in the NFS file > > handle, but not in the fattr/file id this is used for cache > > checks. The NFS file handle has to stay identical anyways, as > > long as the inod

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Frank van Maarseveen
On Wed, Sep 13, 2000 at 12:42:15PM +0200, Trond Myklebust wrote: > > Is there any 'standard' function for reading the microseconds fields > in userland? I couldn't find anything with a brief search in the > X/Open docs. > Both Digital OSF/1 and Solaris use 3 undocumented spare fields in the stru

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Theodore Y. Ts'o
Date: Wed, 13 Sep 2000 22:35:10 +0200 (CEST) From: Trond Myklebust <[EMAIL PROTECTED]> You might be able to steal a couple of bytes and then rewrite ext2fs to mask those out from the 'i_generation' field, but it would mean that you could no longer boot your old 2.2.16 kernel withou

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Andi Kleen <[EMAIL PROTECTED]> writes: > On Wed, Sep 13, 2000 at 11:51:45AM -0400, Theodore Y. Ts'o > wrote: >> >> So this is a really stupid question, but I'll ask it anyway. >> If you just need a cookie, is there any way that you might be >> able to steal

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Albert D Cahalan <[EMAIL PROTECTED]> writes: > 20 bits * 3 timestamps == 60 bits 60 bits <= 8 bytes > So you do need 8 bytes. Yes. Assuming that you want to implement the microseconds field on all 3 timestamps. For NFS I would only need that precision on mtime. Does anyb

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Andi Kleen
On Wed, Sep 13, 2000 at 11:51:45AM -0400, Theodore Y. Ts'o wrote: > > So this is a really stupid question, but I'll ask it anyway. If you > just need a cookie, is there any way that you might be able to steal a > few bits from i_generation field for that purpose? (This assumes that > we only wo

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Theodore Y. Ts'o
Date: Wed, 13 Sep 2000 12:54:49 +0200 (CEST) From: Trond Myklebust <[EMAIL PROTECTED]> Don't forget that 2^20 > 10^6, hence if you really want units of microseconds, you actually only need to save 3 bytes worth of data per timestamp. For the purposes of NFS, however the
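
The storage figures traded in this part of the thread (2^20 > 10^6, 3 bytes per timestamp, 60 bits fitting in 8 bytes, 4 bytes for milliseconds) can be checked with a few lines of arithmetic. The helper names `bits_for` and `bytes_for` are made up for this sketch, and packing the three timestamps back to back is an assumption about the layout, not a description of any actual ext2 format.

```python
def bits_for(units_per_second):
    # Number of bits needed to store any value in 0 .. units_per_second - 1.
    return (units_per_second - 1).bit_length()


def bytes_for(bits):
    # Round a bit count up to whole bytes.
    return (bits + 7) // 8


usec_bits = bits_for(10**6)          # sub-second field in microseconds
msec_bits = bits_for(10**3)          # sub-second field in milliseconds

usec_bytes_each = bytes_for(usec_bits)           # one timestamp stored alone
usec_bytes_packed = bytes_for(3 * usec_bits)     # atime, mtime, ctime packed
msec_bytes_packed = bytes_for(3 * msec_bits)     # same, at millisecond units
```

Under these assumptions a microsecond field takes 20 bits, so 3 bytes if stored on its own, and three of them packed together take 60 bits, which fits in 8 bytes. At millisecond resolution the three fields need 30 bits, which fits in 4 bytes.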

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Albert D. Cahalan
Trond Myklebust writes: > > " " == Albert D Cahalan <[EMAIL PROTECTED]> writes: >> It would not be reasonable to try extending ext2 to 64-bit >> times, but milliseconds might be doable. You'd need 4 bytes to >> support the 3 standard time stamps. >> >> Going to microseconds would require 8 fr

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Jeff Epler <[EMAIL PROTECTED]> writes: > On Wed, Sep 13, 2000 at 02:53:02PM +0200, Trond Myklebust > wrote: >> No. Things fall in and out of the inode cache all the >> time. That's a vicious circle that's going to lead to a lot of >> unnecessary traffic. >

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Jeff Epler
On Wed, Sep 13, 2000 at 02:53:02PM +0200, Trond Myklebust wrote: > No. Things fall in and out of the inode cache all the time. That's a > vicious circle that's going to lead to a lot of unnecessary traffic. The traffic is not "unnecessary" if it's needed to work with today's NFS servers and get c

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Jamie Lokier <[EMAIL PROTECTED]> writes: > If we're getting serious about adding finer-grained mtimes to > ext2, could we please consider using those bits as a way to > know, for sure, whether a file has changed? Agreed. > Btw, the whole cache coherency thing wo

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Jamie Lokier
Trond Myklebust wrote: > > It would not be reasonable to try extending ext2 to 64-bit > > times, but milliseconds might be doable. You'd need 4 bytes to > > support the 3 standard time stamps. > > > Going to microseconds would require 8 free bytes, which I don't > > think

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Andi Kleen
On Wed, Sep 13, 2000 at 12:42:15PM +0200, Trond Myklebust wrote: > > " " == Andi Kleen <[EMAIL PROTECTED]> writes: > > > On Tue, Sep 12, 2000 at 09:10:56PM +0200, Trond Myklebust > > wrote: > >> Making mtime a true 64-bit cookie on Linux servers would be a > >> solution that

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Albert D Cahalan <[EMAIL PROTECTED]> writes: > It would not be reasonable to try extending ext2 to 64-bit > times, but milliseconds might be doable. You'd need 4 bytes to > support the 3 standard time stamps. > Going to microseconds would require 8 free bytes, wh

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-13 Thread Trond Myklebust
> " " == Andi Kleen <[EMAIL PROTECTED]> writes: > On Tue, Sep 12, 2000 at 09:10:56PM +0200, Trond Myklebust > wrote: >> Making mtime a true 64-bit cookie on Linux servers would be a >> solution that works on all clients. > Making mtime 64bit would also be useful for lo

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Albert D. Cahalan
Trond Myklebust writes: > Relying on the sub-second timing fields is much more > implementation-specific. It depends on the capabilities of both the > filesystem and the server OS. > Linux and the knfsd server code could easily be modified to provide > such a service, but the problem (as I've s

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Andi Kleen
On Tue, Sep 12, 2000 at 09:10:56PM +0200, Trond Myklebust wrote: > Making mtime a true 64-bit cookie on Linux servers would be a solution > that works on all clients. Making mtime 64bit would also be useful for local parallel make runs, the current second resolution leads to race conditions on b

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Jeff Epler
On Tue, Sep 12, 2000 at 09:10:56PM +0200, Trond Myklebust wrote: > > " " == Alan Cox <[EMAIL PROTECTED]> writes: > > > Providing everyone is careful to hold a lock I think it is > > > lockf() is a read barrier providing the local cache is flushed, > > the unlock is a write bar

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Jeff Epler
On Tue, Sep 12, 2000 at 07:08:09PM +0200, Trond Myklebust wrote: > This is a known issue, and is not easy to fix. Neither of the > solutions you propose are correct since they will both cause a cache > invalidation. This is not the same as cache coherency checking. > > > The correct solution in

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Alan Cox
> > The fix we have found is to *either* have NFS_CACHEINV(inode) > > change inode-> i_mtime to an artificial value (0) or to call > > nfs_zap_caches instead. Since I am not sure which fix is > > appropriate, I'm not enclosing an actual patch. > > This is a known issue, and i

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Trond Myklebust
> " " == Alan Cox <[EMAIL PROTECTED]> writes: > Providing everyone is careful to hold a lock I think it is > lockf() is a read barrier providing the local cache is flushed, > the unlock is a write barrier providing the local cache is > flushed first. Providing all users a

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Trond Myklebust
> " " == Jeff Epler <[EMAIL PROTECTED]> writes: > Is there any solution that's available today, and doesn't > depend on using Linux in the server? I suspect that we will > have to distribute a modified nfs client with our app, and > we're prepared to accept the cache inva

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Trond Myklebust
> " " == Jeff Epler <[EMAIL PROTECTED]> writes: > But "ctime and file size are the same" does not prove that the ^ mtime > file is unchanged. That's the root of this problem, and why > NFS_CACHEINV(inode) is not enough to ensure coherency. Yes, but I just demo

Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Alan Cox
> If we have a purely Linux-specific hack to ensure cache coherency, > that will still corrupt the cache on those *NIX clients that use > ordinary cache coherency checking (i.e. checking mtime + file size) > rather than cache invalidation. It's what Solaris implements and what SunOS back down to 3

NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee

2000-09-12 Thread Trond Myklebust
> " " == Jeff Epler <[EMAIL PROTECTED]> writes: > I believe I have discovered a problem in the Linux 2.2 nfs > client. The short story is that NFS_CACHEINV(inode) does not > "act as a cache coherency point" (as nfs/file.c:nfs_lock()'s > comment implies) when multiple modi