On Thu, 2007-05-24 at 22:35 +0100, David Howells wrote:
> Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > Could you please flesh this out a bit, from a higher level?
>
> See the description for patch 3/4.
>
> > As in: what does it mean for a server to "reject" a write?  What's
> > actually going on here?
>
> Simply, it means that the server refused to perform the write.  The
> example I'm thinking of is that it refused to do so because the ACL
> governing that file changed between the permission() check and the
> attempt to write the data back.
>
> All it takes is a write-back cache and a tiny little delay between the
> copy into the page cache and the write back to the server, and there's
> a window in which another client can leap in and make it impossible
> for this client to write the data.
>
> There are other possibilities, for instance:
>
>  (1) The key governing the writeback expires.
>
>  (2) The user's account is deleted or disabled.
>
> The file being deleted is handled elsewhere.  In such a case,
> everyone's writes are just discarded.
>
> > Why does the server do this?
>
> Because it's told to by some other client.  Think chmod, or in AFS's
> case:
>
> 	fs setacl -dir . -acl system:administrators rlidka
>
> This removes write access by not including a 'w' amongst 'rlidka'.
>
> (Note that in AFS, the ACLs attached to a directory control the files
> in that directory.)
>
> > I assume that the application will see a short write or an EIO or
> > something?
>
> EACCES or similar, most likely.  In AFS's case you should get an abort
> with some appropriate error code, which is then mapped to EACCES,
> EKEYREVOKED, EKEYEXPIRED, etc.
>
> > How does this interact with MAP_SHARED?
>
> MAP_SHARED/PROT_WRITE is just another way of doing writes to the page
> cache; it isn't really special.  The way I've done it is to use
> page_mkwrite() to track the key with which a write through a mapping
> is written back.
>
> > Do we expect that any other networked filesystem would want to do
> > this?  (and if not, why not?)  Or do they already attempt to do it
> > in some other fashion?
>
> _Any_ network filesystem that (a) has access controls, and (b) does
> writeback caching (be the interval ever so brief) is liable to this
> issue.  It may be possible for a netfs to avoid it by getting an
> upfront guarantee that the write will be accepted (CIFS might do
> this), or by blocking attempts to change the access controls through
> some sort of locking (again, CIFS might do this).
>
> However, looking at NFS, it appears that NFS may be liable to this
> problem too.  It does appear to use writeback caching, though it
> flushes its writes back fairly promptly, which makes this quite
> difficult to test.  I talked to Steve Dickson about this, and I get
> the impression that NFS will just sit there and keep retrying the
> writeback.
No.  If the write fails, then NFS will mark the mapping as invalid and
attempt to call invalidate_inode_pages2() at the earliest possible
moment.

I'm also adding a patch to defer marking the page as uptodate until the
write has succeeded, for the cases where NFS is writing out a pristine
page.

As for pages that are already marked uptodate, well, there you already
have a race: the page write was deferred, and so other processes may
already have read the rejected data before you tried to write it out.
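In outline, that deferral looks something like the sketch below.  This
is purely illustrative, not the actual patch: myfs_write_done(), struct
myfs_inode and MYFS_INO_INVALID_DATA are made-up stand-ins (NFS itself
keys the revalidation off NFS_INO_INVALID_DATA in its own nfs_inode):

#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/pagemap.h>

struct myfs_inode {
	unsigned long	flags;		/* MYFS_INO_* */
	struct inode	vfs_inode;
};

#define MYFS_INO_INVALID_DATA	0x0001	/* cached data needs tossing */

static inline struct myfs_inode *MYFS_I(struct inode *inode)
{
	return container_of(inode, struct myfs_inode, vfs_inode);
}

/*
 * Completion of the writeback of a pristine page.  The page only
 * becomes uptodate once the server has actually accepted the data.
 */
static void myfs_write_done(struct page *page, int error)
{
	struct inode *inode = page->mapping->host;

	if (error == 0) {
		/* The server took the write; readers may now treat the
		 * page as valid cache. */
		SetPageUptodate(page);
	} else {
		/* The server rejected the write (EACCES, revoked key,
		 * ...): don't let readers see the rejected data, and
		 * flag the mapping so that invalidate_inode_pages2()
		 * gets run at the earliest opportunity. */
		SetPageError(page);
		MYFS_I(inode)->flags |= MYFS_INO_INVALID_DATA;
	}
	end_page_writeback(page);
}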
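For comparison, the page_mkwrite()-based key tracking David describes
above would be roughly along these lines.  Again, just a sketch:
myfs_key_for() is invented for illustration, and a real implementation
would need locking and would have to drop the key reference once the
page has been cleaned:

#include <linux/mm.h>
#include <linux/key.h>
#include <linux/err.h>

/*
 * A shared, writable mapping is about to dirty this page: record the
 * key that is in force now, so that the eventual writeback is
 * performed (and may be failed) with the correct credentials.
 */
static int myfs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
{
	struct inode *inode = vma->vm_file->f_mapping->host;
	struct key *key = myfs_key_for(current, inode);	/* hypothetical */

	if (IS_ERR(key))
		return PTR_ERR(key);	/* the faulting process gets SIGBUS */

	/* Stash the key where ->writepage() can find it, so it knows
	 * whose credentials to write this page back with. */
	set_page_private(page, (unsigned long)key_get(key));
	SetPagePrivate(page);
	return 0;
}

Trond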