Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Hugh Dickins <[EMAIL PROTECTED]> wrote: > > VM in a way that people are not unhappy with :) > > I'm hoping you intended one less negative ;) Or did you mean one fewer negative? :-) David - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Hugh Dickins wrote: > On Thu, 17 May 2007, Nick Piggin wrote: >> ... and I don't want to change the >> VM in a way that people are not unhappy with :) > > I'm hoping you intended one less negative ;) Or one more... - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Hugh Dickins wrote: On Thu, 17 May 2007, Nick Piggin wrote: ... and I don't want to change the VM in a way that people are not unhappy with :) I'm hoping you intended one less negative ;) Derrr... I'm an idiot! -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
On Thu, 17 May 2007, Nick Piggin wrote: > > ... and I don't want to change the > VM in a way that people are not unhappy with :) I'm hoping you intended one less negative ;) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Hugh Dickins wrote: On Wed, 16 May 2007, Nick Piggin wrote: Andrew Morton wrote: I need to work out what to do with mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch mm-merge-nopfn-into-fault.patch convert-hugetlbfs-to-use-vm_ops-fault.patch mm-remove-legacy-cruft.patch mm-debug-check-for-the-fault-vs-invalidate-race.patch mm-fix-clear_page_dirty_for_io-vs-fault-race.patch Probably merge them, I guess. Hugh had concerns, I think over small additional overhead from the lock_page()? That's right, the overhead of the lock_page()/unlock_page() in the common path of faulting, and of the extra call to unmap_mapping_range() when truncating (because page lock doesn't satisfactorily replace the old sequence count when COWing). Yes he did. It seems to only be noticable in microbenchmarks. So far, yes. I expect it'll surface in some reallife workload sometime, but let's not get too depressed about that. I guess your blithe "Scalability is not an issue" comment still irks me. I say I believe scalability will not be a huge issue, because for concurrent faulters on the same page, they still have cacheline contention beginning before we lock the page (tree_lock), and ending after we unlock it (page->_count), and a few others in the middle for good mesure. I sure don't think it is going to help, but I don't think it would be a great impact on an alrady sucky workload. We would have contention against other sources of lock_page, but OTOH, we want to fix up the clear_page_dirty_for_io race window by doing a wait_for_page_locked here anyway. We could introduce some contention for some other lockers... but again I think they are often situations where the page fields are going to be modified anyway, or where the fault code would be contending on something anyway (eg. vs !uptodate read, truncate). The lock hold time in the fault should be smaller than many. I'm not saying it would never be contended, but it is such a granular thing and it is so often surrounded by file level locking. Still, I have some lock_page speedup work that eliminates that regression anyway. Again, rather too blithely said. You have a deep well of ingenuity, but I doubt it can actually wash away all of the small overhead added. OK, I haven't proven to eliminate the overhead completely, but we can see that options exist... And we should be able to at least help the more heavily impacted powerpc architecture and bring the hit it to a more reasonable level there. However, Hugh hasn't exactly said yes or no yet... Getting a "yes" or "no" out of me is very hard work indeed. But I didn't realize that was gating this work: if the world had to wait on me, we'd be in trouble. I think there are quite a few people eager to see the subsequent ->fault changes go in. And I think I'd just like to minimize the difference between -mm and mainline, so maximize comprehensibilty, by getting this all in. I've not heard of any correctness issues with it, and we all agree that the page lock there is attractively simple. Well I value your opinion of course, and I don't want to change the VM in a way that people are not unhappy with :) If I ever find a better way of handling it, I'm free to send patches in future, after all. That's definitely what I had in mind -- the page lock is a simple starting point (that is hopefully correct) that we would always be able to build on with something more fancy in future. Did that sound something like a "yes"? A little bit :) -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
On Wed, 16 May 2007, Nick Piggin wrote: > Andrew Morton wrote: > > I need to work out what to do with > > > > mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch > > mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch > > mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch > > mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch > > mm-merge-nopfn-into-fault.patch > > convert-hugetlbfs-to-use-vm_ops-fault.patch > > mm-remove-legacy-cruft.patch > > mm-debug-check-for-the-fault-vs-invalidate-race.patch > > mm-fix-clear_page_dirty_for_io-vs-fault-race.patch > > > > Probably merge them, I guess. Hugh had concerns, I think over small > > additional overhead from the lock_page()? That's right, the overhead of the lock_page()/unlock_page() in the common path of faulting, and of the extra call to unmap_mapping_range() when truncating (because page lock doesn't satisfactorily replace the old sequence count when COWing). > Yes he did. It seems to only be noticable in microbenchmarks. So far, yes. I expect it'll surface in some reallife workload sometime, but let's not get too depressed about that. I guess your blithe "Scalability is not an issue" comment still irks me. > In my opinion not enough to withhold pagecache corruption bug fixes. It is a pity to be adding overhead to a common path in order to fix such very rare cases, but yes, we probably have to live with that. > Still, I have some lock_page speedup work that eliminates that regression > anyway. Again, rather too blithely said. You have a deep well of ingenuity, but I doubt it can actually wash away all of the small overhead added. > However, Hugh hasn't exactly said yes or no yet... Getting a "yes" or "no" out of me is very hard work indeed. But I didn't realize that was gating this work: if the world had to wait on me, we'd be in trouble. I think there are quite a few people eager to see the subsequent ->fault changes go in. And I think I'd just like to minimize the difference between -mm and mainline, so maximize comprehensibilty, by getting this all in. I've not heard of any correctness issues with it, and we all agree that the page lock there is attractively simple. If I ever find a better way of handling it, I'm free to send patches in future, after all. Did that sound something like a "yes"? Hugh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Christoph Hellwig wrote: On Tue, May 15, 2007 at 04:52:31PM +0100, David Howells wrote: +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page) +{ + struct afs_vnode *vnode = AFS_FS_I(vma->vm_file->f_mapping->host); + struct key *key = vma->vm_file->private_data; + int ret; + + _enter("{{%x:%u},%x},{%lx}", + vnode->fid.vid, vnode->fid.vnode, key_serial(key), page->index); + + lock_page(page); + ret = afs_prepare_write(vma->vm_file, page, 0, 0); + unlock_page(page); + + _leave(" = %d", ret); + return ret; +} It looks like you really want Dave's generic page_mkwrite. Is that appropriate for a non buffer based filesystem? And we should really get that one merged for 2.6.22. Dave asked me about that the other day (in relation to the ->fault ops)... I have no problem merging it for 2.6.22 and rebasing my patches on top. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
On Tue, May 15, 2007 at 04:52:31PM +0100, David Howells wrote: > +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page) > +{ > + struct afs_vnode *vnode = AFS_FS_I(vma->vm_file->f_mapping->host); > + struct key *key = vma->vm_file->private_data; > + int ret; > + > + _enter("{{%x:%u},%x},{%lx}", > +vnode->fid.vid, vnode->fid.vnode, key_serial(key), page->index); > + > + lock_page(page); > + ret = afs_prepare_write(vma->vm_file, page, 0, 0); > + unlock_page(page); > + > + _leave(" = %d", ret); > + return ret; > +} It looks like you really want Dave's generic page_mkwrite. And we should really get that one merged for 2.6.22. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
On Tue, May 15, 2007 at 04:52:31PM +0100, David Howells wrote: +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page) +{ + struct afs_vnode *vnode = AFS_FS_I(vma-vm_file-f_mapping-host); + struct key *key = vma-vm_file-private_data; + int ret; + + _enter({{%x:%u},%x},{%lx}, +vnode-fid.vid, vnode-fid.vnode, key_serial(key), page-index); + + lock_page(page); + ret = afs_prepare_write(vma-vm_file, page, 0, 0); + unlock_page(page); + + _leave( = %d, ret); + return ret; +} It looks like you really want Dave's generic page_mkwrite. And we should really get that one merged for 2.6.22. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Christoph Hellwig wrote: On Tue, May 15, 2007 at 04:52:31PM +0100, David Howells wrote: +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page) +{ + struct afs_vnode *vnode = AFS_FS_I(vma-vm_file-f_mapping-host); + struct key *key = vma-vm_file-private_data; + int ret; + + _enter({{%x:%u},%x},{%lx}, + vnode-fid.vid, vnode-fid.vnode, key_serial(key), page-index); + + lock_page(page); + ret = afs_prepare_write(vma-vm_file, page, 0, 0); + unlock_page(page); + + _leave( = %d, ret); + return ret; +} It looks like you really want Dave's generic page_mkwrite. Is that appropriate for a non buffer based filesystem? And we should really get that one merged for 2.6.22. Dave asked me about that the other day (in relation to the -fault ops)... I have no problem merging it for 2.6.22 and rebasing my patches on top. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
On Wed, 16 May 2007, Nick Piggin wrote: Andrew Morton wrote: I need to work out what to do with mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch mm-merge-nopfn-into-fault.patch convert-hugetlbfs-to-use-vm_ops-fault.patch mm-remove-legacy-cruft.patch mm-debug-check-for-the-fault-vs-invalidate-race.patch mm-fix-clear_page_dirty_for_io-vs-fault-race.patch Probably merge them, I guess. Hugh had concerns, I think over small additional overhead from the lock_page()? That's right, the overhead of the lock_page()/unlock_page() in the common path of faulting, and of the extra call to unmap_mapping_range() when truncating (because page lock doesn't satisfactorily replace the old sequence count when COWing). Yes he did. It seems to only be noticable in microbenchmarks. So far, yes. I expect it'll surface in some reallife workload sometime, but let's not get too depressed about that. I guess your blithe Scalability is not an issue comment still irks me. In my opinion not enough to withhold pagecache corruption bug fixes. It is a pity to be adding overhead to a common path in order to fix such very rare cases, but yes, we probably have to live with that. Still, I have some lock_page speedup work that eliminates that regression anyway. Again, rather too blithely said. You have a deep well of ingenuity, but I doubt it can actually wash away all of the small overhead added. However, Hugh hasn't exactly said yes or no yet... Getting a yes or no out of me is very hard work indeed. But I didn't realize that was gating this work: if the world had to wait on me, we'd be in trouble. I think there are quite a few people eager to see the subsequent -fault changes go in. And I think I'd just like to minimize the difference between -mm and mainline, so maximize comprehensibilty, by getting this all in. I've not heard of any correctness issues with it, and we all agree that the page lock there is attractively simple. If I ever find a better way of handling it, I'm free to send patches in future, after all. Did that sound something like a yes? Hugh - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Hugh Dickins wrote: On Wed, 16 May 2007, Nick Piggin wrote: Andrew Morton wrote: I need to work out what to do with mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch mm-merge-nopfn-into-fault.patch convert-hugetlbfs-to-use-vm_ops-fault.patch mm-remove-legacy-cruft.patch mm-debug-check-for-the-fault-vs-invalidate-race.patch mm-fix-clear_page_dirty_for_io-vs-fault-race.patch Probably merge them, I guess. Hugh had concerns, I think over small additional overhead from the lock_page()? That's right, the overhead of the lock_page()/unlock_page() in the common path of faulting, and of the extra call to unmap_mapping_range() when truncating (because page lock doesn't satisfactorily replace the old sequence count when COWing). Yes he did. It seems to only be noticable in microbenchmarks. So far, yes. I expect it'll surface in some reallife workload sometime, but let's not get too depressed about that. I guess your blithe Scalability is not an issue comment still irks me. I say I believe scalability will not be a huge issue, because for concurrent faulters on the same page, they still have cacheline contention beginning before we lock the page (tree_lock), and ending after we unlock it (page-_count), and a few others in the middle for good mesure. I sure don't think it is going to help, but I don't think it would be a great impact on an alrady sucky workload. We would have contention against other sources of lock_page, but OTOH, we want to fix up the clear_page_dirty_for_io race window by doing a wait_for_page_locked here anyway. We could introduce some contention for some other lockers... but again I think they are often situations where the page fields are going to be modified anyway, or where the fault code would be contending on something anyway (eg. vs !uptodate read, truncate). The lock hold time in the fault should be smaller than many. I'm not saying it would never be contended, but it is such a granular thing and it is so often surrounded by file level locking. Still, I have some lock_page speedup work that eliminates that regression anyway. Again, rather too blithely said. You have a deep well of ingenuity, but I doubt it can actually wash away all of the small overhead added. OK, I haven't proven to eliminate the overhead completely, but we can see that options exist... And we should be able to at least help the more heavily impacted powerpc architecture and bring the hit it to a more reasonable level there. However, Hugh hasn't exactly said yes or no yet... Getting a yes or no out of me is very hard work indeed. But I didn't realize that was gating this work: if the world had to wait on me, we'd be in trouble. I think there are quite a few people eager to see the subsequent -fault changes go in. And I think I'd just like to minimize the difference between -mm and mainline, so maximize comprehensibilty, by getting this all in. I've not heard of any correctness issues with it, and we all agree that the page lock there is attractively simple. Well I value your opinion of course, and I don't want to change the VM in a way that people are not unhappy with :) If I ever find a better way of handling it, I'm free to send patches in future, after all. That's definitely what I had in mind -- the page lock is a simple starting point (that is hopefully correct) that we would always be able to build on with something more fancy in future. Did that sound something like a yes? A little bit :) -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
On Thu, 17 May 2007, Nick Piggin wrote: ... and I don't want to change the VM in a way that people are not unhappy with :) I'm hoping you intended one less negative ;) - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Hugh Dickins wrote: On Thu, 17 May 2007, Nick Piggin wrote: ... and I don't want to change the VM in a way that people are not unhappy with :) I'm hoping you intended one less negative ;) Derrr... I'm an idiot! -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Hugh Dickins wrote: On Thu, 17 May 2007, Nick Piggin wrote: ... and I don't want to change the VM in a way that people are not unhappy with :) I'm hoping you intended one less negative ;) Or one more... - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Hugh Dickins [EMAIL PROTECTED] wrote: VM in a way that people are not unhappy with :) I'm hoping you intended one less negative ;) Or did you mean one fewer negative? :-) David - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Andrew Morton wrote: On Tue, 15 May 2007 16:52:31 +0100 David Howells <[EMAIL PROTECTED]> wrote: Implement shared-writable mmap for AFS. This blows up in -mm: fs/afs/file.c:59: error: 'filemap_nopage' undeclared here (not in a function) fs/afs/file.c:60: error: unknown field 'populate' specified in initializer fs/afs/file.c:60: error: 'filemap_populate' undeclared here (not in a function) because Nick went and renamed half the VM and deleted the other half. And page_mkwrite is next ;) I need to work out what to do with mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch mm-merge-nopfn-into-fault.patch convert-hugetlbfs-to-use-vm_ops-fault.patch mm-remove-legacy-cruft.patch mm-debug-check-for-the-fault-vs-invalidate-race.patch mm-fix-clear_page_dirty_for_io-vs-fault-race.patch Probably merge them, I guess. Hugh had concerns, I think over small additional overhead from the lock_page()? Yes he did. It seems to only be noticable in microbenchmarks. In my opinion not enough to withhold pagecache corruption bug fixes. Still, I have some lock_page speedup work that eliminates that regression anyway. However, Hugh hasn't exactly said yes or no yet... -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
David Howells wrote: Implement shared-writable mmap for AFS. The key with which to access the file is obtained from the VMA at the point where the PTE is made writable by the page_mkwrite() VMA op and cached in the affected page. If there's an outstanding write on the page made with a different key, then page_mkwrite() will flush it before attaching a record of the new key. Good, will be nice to get a page_mkwrite() user in the tree. +/* + * notification that a previously read-only page is about to become writable + * - if it returns an error, the caller will deliver a bus error signal + * + * we use this to make a record of the key with which the writeback should be + * performed and to flush any outstanding writes made with a different key + * + * the key to be used is attached to the file pinned by the VMA + */ +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page) +{ + struct afs_vnode *vnode = AFS_FS_I(vma->vm_file->f_mapping->host); + struct key *key = vma->vm_file->private_data; + int ret; + + _enter("{{%x:%u},%x},{%lx}", + vnode->fid.vid, vnode->fid.vnode, key_serial(key), page->index); + + lock_page(page); + ret = afs_prepare_write(vma->vm_file, page, 0, 0); + unlock_page(page); + + _leave(" = %d", ret); + return ret; +} By the looks of afs_prepare_write, it is going to go bang when the page gets truncated before lock_page. Checking page->mapping after lock_page should do the trick. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
On Tue, 15 May 2007 16:52:31 +0100 David Howells <[EMAIL PROTECTED]> wrote: > Implement shared-writable mmap for AFS. This blows up in -mm: fs/afs/file.c:59: error: 'filemap_nopage' undeclared here (not in a function) fs/afs/file.c:60: error: unknown field 'populate' specified in initializer fs/afs/file.c:60: error: 'filemap_populate' undeclared here (not in a function) because Nick went and renamed half the VM and deleted the other half. I need to work out what to do with mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch mm-merge-nopfn-into-fault.patch convert-hugetlbfs-to-use-vm_ops-fault.patch mm-remove-legacy-cruft.patch mm-debug-check-for-the-fault-vs-invalidate-race.patch mm-fix-clear_page_dirty_for_io-vs-fault-race.patch Probably merge them, I guess. Hugh had concerns, I think over small additional overhead from the lock_page()? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] AFS: Implement shared-writable mmap
Implement shared-writable mmap for AFS. The key with which to access the file is obtained from the VMA at the point where the PTE is made writable by the page_mkwrite() VMA op and cached in the affected page. If there's an outstanding write on the page made with a different key, then page_mkwrite() will flush it before attaching a record of the new key. Signed-off-by: David Howells <[EMAIL PROTECTED]> --- fs/afs/file.c | 22 -- fs/afs/internal.h |1 + fs/afs/write.c| 28 3 files changed, 49 insertions(+), 2 deletions(-) diff --git a/fs/afs/file.c b/fs/afs/file.c index 9c0e721..da2a18b 100644 --- a/fs/afs/file.c +++ b/fs/afs/file.c @@ -22,6 +22,7 @@ static int afs_readpage(struct file *file, struct page *page); static void afs_invalidatepage(struct page *page, unsigned long offset); static int afs_releasepage(struct page *page, gfp_t gfp_flags); static int afs_launder_page(struct page *page); +static int afs_mmap(struct file *file, struct vm_area_struct *vma); const struct file_operations afs_file_operations = { .open = afs_open, @@ -31,7 +32,7 @@ const struct file_operations afs_file_operations = { .write = do_sync_write, .aio_read = generic_file_aio_read, .aio_write = afs_file_write, - .mmap = generic_file_readonly_mmap, + .mmap = afs_mmap, .sendfile = generic_file_sendfile, .fsync = afs_fsync, }; @@ -54,6 +55,12 @@ const struct address_space_operations afs_fs_aops = { .writepages = afs_writepages, }; +static struct vm_operations_struct afs_file_vm_ops = { + .nopage = filemap_nopage, + .populate = filemap_populate, + .page_mkwrite = afs_page_mkwrite, +}; + /* * open an AFS file or directory and attach a key to it */ @@ -266,7 +273,6 @@ static void afs_invalidatepage(struct page *page, unsigned long offset) static int afs_launder_page(struct page *page) { _enter("{%lu}", page->index); - return 0; } @@ -293,3 +299,15 @@ static int afs_releasepage(struct page *page, gfp_t gfp_flags) _leave(" = 0"); return 0; } + +/* + * memory map part of an AFS file + */ +static int afs_mmap(struct file *file, struct vm_area_struct *vma) +{ + _enter(""); + + file_accessed(file); + vma->vm_ops = _file_vm_ops; + return 0; +} diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 61038da..141e073 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -710,6 +710,7 @@ extern ssize_t afs_file_write(struct kiocb *, const struct iovec *, unsigned long, loff_t); extern int afs_writeback_all(struct afs_vnode *); extern int afs_fsync(struct file *, struct dentry *, int); +extern int afs_page_mkwrite(struct vm_area_struct *, struct page *); /*/ diff --git a/fs/afs/write.c b/fs/afs/write.c index a03b92a..6ef818d 100644 --- a/fs/afs/write.c +++ b/fs/afs/write.c @@ -174,6 +174,8 @@ static int afs_prepare_page(struct afs_vnode *vnode, struct page *page, * prepare to perform part of a write to a page * - the caller holds the page locked, preventing it from being written out or * modified by anyone else + * - may be called from afs_page_mkwrite() to set up a page for modification + * through shared-writable mmap */ int afs_prepare_write(struct file *file, struct page *page, unsigned offset, unsigned to) @@ -825,3 +827,29 @@ int afs_fsync(struct file *file, struct dentry *dentry, int datasync) _leave(" = %d", ret); return ret; } + +/* + * notification that a previously read-only page is about to become writable + * - if it returns an error, the caller will deliver a bus error signal + * + * we use this to make a record of the key with which the writeback should be + * performed and to flush any outstanding writes made with a different key + * + * the key to be used is attached to the file pinned by the VMA + */ +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page) +{ + struct afs_vnode *vnode = AFS_FS_I(vma->vm_file->f_mapping->host); + struct key *key = vma->vm_file->private_data; + int ret; + + _enter("{{%x:%u},%x},{%lx}", + vnode->fid.vid, vnode->fid.vnode, key_serial(key), page->index); + + lock_page(page); + ret = afs_prepare_write(vma->vm_file, page, 0, 0); + unlock_page(page); + + _leave(" = %d", ret); + return ret; +} - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] AFS: Implement shared-writable mmap
Implement shared-writable mmap for AFS. The key with which to access the file is obtained from the VMA at the point where the PTE is made writable by the page_mkwrite() VMA op and cached in the affected page. If there's an outstanding write on the page made with a different key, then page_mkwrite() will flush it before attaching a record of the new key. Signed-off-by: David Howells [EMAIL PROTECTED] --- fs/afs/file.c | 22 -- fs/afs/internal.h |1 + fs/afs/write.c| 28 3 files changed, 49 insertions(+), 2 deletions(-) diff --git a/fs/afs/file.c b/fs/afs/file.c index 9c0e721..da2a18b 100644 --- a/fs/afs/file.c +++ b/fs/afs/file.c @@ -22,6 +22,7 @@ static int afs_readpage(struct file *file, struct page *page); static void afs_invalidatepage(struct page *page, unsigned long offset); static int afs_releasepage(struct page *page, gfp_t gfp_flags); static int afs_launder_page(struct page *page); +static int afs_mmap(struct file *file, struct vm_area_struct *vma); const struct file_operations afs_file_operations = { .open = afs_open, @@ -31,7 +32,7 @@ const struct file_operations afs_file_operations = { .write = do_sync_write, .aio_read = generic_file_aio_read, .aio_write = afs_file_write, - .mmap = generic_file_readonly_mmap, + .mmap = afs_mmap, .sendfile = generic_file_sendfile, .fsync = afs_fsync, }; @@ -54,6 +55,12 @@ const struct address_space_operations afs_fs_aops = { .writepages = afs_writepages, }; +static struct vm_operations_struct afs_file_vm_ops = { + .nopage = filemap_nopage, + .populate = filemap_populate, + .page_mkwrite = afs_page_mkwrite, +}; + /* * open an AFS file or directory and attach a key to it */ @@ -266,7 +273,6 @@ static void afs_invalidatepage(struct page *page, unsigned long offset) static int afs_launder_page(struct page *page) { _enter({%lu}, page-index); - return 0; } @@ -293,3 +299,15 @@ static int afs_releasepage(struct page *page, gfp_t gfp_flags) _leave( = 0); return 0; } + +/* + * memory map part of an AFS file + */ +static int afs_mmap(struct file *file, struct vm_area_struct *vma) +{ + _enter(); + + file_accessed(file); + vma-vm_ops = afs_file_vm_ops; + return 0; +} diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 61038da..141e073 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -710,6 +710,7 @@ extern ssize_t afs_file_write(struct kiocb *, const struct iovec *, unsigned long, loff_t); extern int afs_writeback_all(struct afs_vnode *); extern int afs_fsync(struct file *, struct dentry *, int); +extern int afs_page_mkwrite(struct vm_area_struct *, struct page *); /*/ diff --git a/fs/afs/write.c b/fs/afs/write.c index a03b92a..6ef818d 100644 --- a/fs/afs/write.c +++ b/fs/afs/write.c @@ -174,6 +174,8 @@ static int afs_prepare_page(struct afs_vnode *vnode, struct page *page, * prepare to perform part of a write to a page * - the caller holds the page locked, preventing it from being written out or * modified by anyone else + * - may be called from afs_page_mkwrite() to set up a page for modification + * through shared-writable mmap */ int afs_prepare_write(struct file *file, struct page *page, unsigned offset, unsigned to) @@ -825,3 +827,29 @@ int afs_fsync(struct file *file, struct dentry *dentry, int datasync) _leave( = %d, ret); return ret; } + +/* + * notification that a previously read-only page is about to become writable + * - if it returns an error, the caller will deliver a bus error signal + * + * we use this to make a record of the key with which the writeback should be + * performed and to flush any outstanding writes made with a different key + * + * the key to be used is attached to the file pinned by the VMA + */ +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page) +{ + struct afs_vnode *vnode = AFS_FS_I(vma-vm_file-f_mapping-host); + struct key *key = vma-vm_file-private_data; + int ret; + + _enter({{%x:%u},%x},{%lx}, + vnode-fid.vid, vnode-fid.vnode, key_serial(key), page-index); + + lock_page(page); + ret = afs_prepare_write(vma-vm_file, page, 0, 0); + unlock_page(page); + + _leave( = %d, ret); + return ret; +} - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
On Tue, 15 May 2007 16:52:31 +0100 David Howells [EMAIL PROTECTED] wrote: Implement shared-writable mmap for AFS. This blows up in -mm: fs/afs/file.c:59: error: 'filemap_nopage' undeclared here (not in a function) fs/afs/file.c:60: error: unknown field 'populate' specified in initializer fs/afs/file.c:60: error: 'filemap_populate' undeclared here (not in a function) because Nick went and renamed half the VM and deleted the other half. I need to work out what to do with mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch mm-merge-nopfn-into-fault.patch convert-hugetlbfs-to-use-vm_ops-fault.patch mm-remove-legacy-cruft.patch mm-debug-check-for-the-fault-vs-invalidate-race.patch mm-fix-clear_page_dirty_for_io-vs-fault-race.patch Probably merge them, I guess. Hugh had concerns, I think over small additional overhead from the lock_page()? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
David Howells wrote: Implement shared-writable mmap for AFS. The key with which to access the file is obtained from the VMA at the point where the PTE is made writable by the page_mkwrite() VMA op and cached in the affected page. If there's an outstanding write on the page made with a different key, then page_mkwrite() will flush it before attaching a record of the new key. Good, will be nice to get a page_mkwrite() user in the tree. +/* + * notification that a previously read-only page is about to become writable + * - if it returns an error, the caller will deliver a bus error signal + * + * we use this to make a record of the key with which the writeback should be + * performed and to flush any outstanding writes made with a different key + * + * the key to be used is attached to the file pinned by the VMA + */ +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page) +{ + struct afs_vnode *vnode = AFS_FS_I(vma-vm_file-f_mapping-host); + struct key *key = vma-vm_file-private_data; + int ret; + + _enter({{%x:%u},%x},{%lx}, + vnode-fid.vid, vnode-fid.vnode, key_serial(key), page-index); + + lock_page(page); + ret = afs_prepare_write(vma-vm_file, page, 0, 0); + unlock_page(page); + + _leave( = %d, ret); + return ret; +} By the looks of afs_prepare_write, it is going to go bang when the page gets truncated before lock_page. Checking page-mapping after lock_page should do the trick. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] AFS: Implement shared-writable mmap
Andrew Morton wrote: On Tue, 15 May 2007 16:52:31 +0100 David Howells [EMAIL PROTECTED] wrote: Implement shared-writable mmap for AFS. This blows up in -mm: fs/afs/file.c:59: error: 'filemap_nopage' undeclared here (not in a function) fs/afs/file.c:60: error: unknown field 'populate' specified in initializer fs/afs/file.c:60: error: 'filemap_populate' undeclared here (not in a function) because Nick went and renamed half the VM and deleted the other half. And page_mkwrite is next ;) I need to work out what to do with mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch mm-merge-nopfn-into-fault.patch convert-hugetlbfs-to-use-vm_ops-fault.patch mm-remove-legacy-cruft.patch mm-debug-check-for-the-fault-vs-invalidate-race.patch mm-fix-clear_page_dirty_for_io-vs-fault-race.patch Probably merge them, I guess. Hugh had concerns, I think over small additional overhead from the lock_page()? Yes he did. It seems to only be noticable in microbenchmarks. In my opinion not enough to withhold pagecache corruption bug fixes. Still, I have some lock_page speedup work that eliminates that regression anyway. However, Hugh hasn't exactly said yes or no yet... -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/