Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread David Howells
Hugh Dickins <[EMAIL PROTECTED]> wrote:

> > VM in a way that people are not unhappy with :)
> 
> I'm hoping you intended one less negative ;)

Or did you mean one fewer negative? :-)

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Chuck Ebbert
Hugh Dickins wrote:
> On Thu, 17 May 2007, Nick Piggin wrote:
>> ... and I don't want to change the
>> VM in a way that people are not unhappy with :)
> 
> I'm hoping you intended one less negative ;)

Or one more...
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Nick Piggin

Hugh Dickins wrote:

On Thu, 17 May 2007, Nick Piggin wrote:


... and I don't want to change the
VM in a way that people are not unhappy with :)



I'm hoping you intended one less negative ;)


Derrr... I'm an idiot!

--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Hugh Dickins
On Thu, 17 May 2007, Nick Piggin wrote:
> 
> ... and I don't want to change the
> VM in a way that people are not unhappy with :)

I'm hoping you intended one less negative ;)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Nick Piggin

Hugh Dickins wrote:

On Wed, 16 May 2007, Nick Piggin wrote:


Andrew Morton wrote:


I need to work out what to do with

mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch
mm-merge-nopfn-into-fault.patch
convert-hugetlbfs-to-use-vm_ops-fault.patch
mm-remove-legacy-cruft.patch
mm-debug-check-for-the-fault-vs-invalidate-race.patch
mm-fix-clear_page_dirty_for_io-vs-fault-race.patch

Probably merge them, I guess.  Hugh had concerns, I think over small
additional overhead from the lock_page()?



That's right, the overhead of the lock_page()/unlock_page() in the common
path of faulting, and of the extra call to unmap_mapping_range() when
truncating (because page lock doesn't satisfactorily replace the old
sequence count when COWing).



Yes he did. It seems to only be noticable in microbenchmarks.



So far, yes.  I expect it'll surface in some reallife workload
sometime, but let's not get too depressed about that.  I guess
your blithe "Scalability is not an issue" comment still irks me.



I say I believe scalability will not be a huge issue, because for
concurrent faulters on the same page, they still have cacheline
contention beginning before we lock the page (tree_lock), and
ending after we unlock it (page->_count), and a few others in the
middle for good mesure. I sure don't think it is going to help,
but I don't think it would be a great impact on an alrady sucky
workload.

We would have contention against other sources of lock_page, but
OTOH, we want to fix up the clear_page_dirty_for_io race window
by doing a wait_for_page_locked here anyway.

We could introduce some contention for some other lockers... but
again I think they are often situations where the page fields are
going to be modified anyway, or where the fault code would be
contending on something anyway (eg. vs !uptodate read, truncate).
The lock hold time in the fault should be smaller than many.

I'm not saying it would never be contended, but it is such a
granular thing and it is so often surrounded by file level locking.



Still, I have some lock_page speedup work that eliminates that regression
anyway.



Again, rather too blithely said.  You have a deep well of ingenuity,
but I doubt it can actually wash away all of the small overhead added.


OK, I haven't proven to eliminate the overhead completely, but we can
see that options exist... And we should be able to at least help the
more heavily impacted powerpc architecture and bring the hit it to a
more reasonable level there.



However, Hugh hasn't exactly said yes or no yet...



Getting a "yes" or "no" out of me is very hard work indeed.
But I didn't realize that was gating this work: if the world
had to wait on me, we'd be in trouble.

I think there are quite a few people eager to see the subsequent
->fault changes go in.  And I think I'd just like to minimize the
difference between -mm and mainline, so maximize comprehensibilty,
by getting this all in.  I've not heard of any correctness issues
with it, and we all agree that the page lock there is attractively
simple.


Well I value your opinion of course, and I don't want to change the
VM in a way that people are not unhappy with :)



If I ever find a better way of handling it,
I'm free to send patches in future, after all.


That's definitely what I had in mind -- the page lock is a simple
starting point (that is hopefully correct) that we would always be
able to build on with something more fancy in future.



Did that sound something like a "yes"?


A little bit :)


--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Hugh Dickins
On Wed, 16 May 2007, Nick Piggin wrote:
> Andrew Morton wrote:
> > I need to work out what to do with
> > 
> > mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch
> > mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch
> > mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch
> > mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch
> > mm-merge-nopfn-into-fault.patch
> > convert-hugetlbfs-to-use-vm_ops-fault.patch
> > mm-remove-legacy-cruft.patch
> > mm-debug-check-for-the-fault-vs-invalidate-race.patch
> > mm-fix-clear_page_dirty_for_io-vs-fault-race.patch
> > 
> > Probably merge them, I guess.  Hugh had concerns, I think over small
> > additional overhead from the lock_page()?

That's right, the overhead of the lock_page()/unlock_page() in the common
path of faulting, and of the extra call to unmap_mapping_range() when
truncating (because page lock doesn't satisfactorily replace the old
sequence count when COWing).

> Yes he did. It seems to only be noticable in microbenchmarks.

So far, yes.  I expect it'll surface in some reallife workload
sometime, but let's not get too depressed about that.  I guess
your blithe "Scalability is not an issue" comment still irks me.

> In my opinion not enough to withhold pagecache corruption bug fixes.

It is a pity to be adding overhead to a common path in order to fix
such very rare cases, but yes, we probably have to live with that.

> Still, I have some lock_page speedup work that eliminates that regression
> anyway.

Again, rather too blithely said.  You have a deep well of ingenuity,
but I doubt it can actually wash away all of the small overhead added.

> However, Hugh hasn't exactly said yes or no yet...

Getting a "yes" or "no" out of me is very hard work indeed.
But I didn't realize that was gating this work: if the world
had to wait on me, we'd be in trouble.

I think there are quite a few people eager to see the subsequent
->fault changes go in.  And I think I'd just like to minimize the
difference between -mm and mainline, so maximize comprehensibilty,
by getting this all in.  I've not heard of any correctness issues
with it, and we all agree that the page lock there is attractively
simple.

If I ever find a better way of handling it,
I'm free to send patches in future, after all.

Did that sound something like a "yes"?

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Nick Piggin

Christoph Hellwig wrote:

On Tue, May 15, 2007 at 04:52:31PM +0100, David Howells wrote:


+int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
+{
+   struct afs_vnode *vnode = AFS_FS_I(vma->vm_file->f_mapping->host);
+   struct key *key = vma->vm_file->private_data;
+   int ret;
+
+   _enter("{{%x:%u},%x},{%lx}",
+  vnode->fid.vid, vnode->fid.vnode, key_serial(key), page->index);
+
+   lock_page(page);
+   ret = afs_prepare_write(vma->vm_file, page, 0, 0);
+   unlock_page(page);
+
+   _leave(" = %d", ret);
+   return ret;
+}



It looks like you really want Dave's generic page_mkwrite.


Is that appropriate for a non buffer based filesystem?



And we should
really get that one merged for 2.6.22.


Dave asked me about that the other day (in relation to the ->fault ops)...
I have no problem merging it for 2.6.22 and rebasing my patches on top.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Christoph Hellwig
On Tue, May 15, 2007 at 04:52:31PM +0100, David Howells wrote:
> +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
> +{
> + struct afs_vnode *vnode = AFS_FS_I(vma->vm_file->f_mapping->host);
> + struct key *key = vma->vm_file->private_data;
> + int ret;
> +
> + _enter("{{%x:%u},%x},{%lx}",
> +vnode->fid.vid, vnode->fid.vnode, key_serial(key), page->index);
> +
> + lock_page(page);
> + ret = afs_prepare_write(vma->vm_file, page, 0, 0);
> + unlock_page(page);
> +
> + _leave(" = %d", ret);
> + return ret;
> +}

It looks like you really want Dave's generic page_mkwrite.  And we should
really get that one merged for 2.6.22.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Christoph Hellwig
On Tue, May 15, 2007 at 04:52:31PM +0100, David Howells wrote:
 +int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
 +{
 + struct afs_vnode *vnode = AFS_FS_I(vma-vm_file-f_mapping-host);
 + struct key *key = vma-vm_file-private_data;
 + int ret;
 +
 + _enter({{%x:%u},%x},{%lx},
 +vnode-fid.vid, vnode-fid.vnode, key_serial(key), page-index);
 +
 + lock_page(page);
 + ret = afs_prepare_write(vma-vm_file, page, 0, 0);
 + unlock_page(page);
 +
 + _leave( = %d, ret);
 + return ret;
 +}

It looks like you really want Dave's generic page_mkwrite.  And we should
really get that one merged for 2.6.22.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Nick Piggin

Christoph Hellwig wrote:

On Tue, May 15, 2007 at 04:52:31PM +0100, David Howells wrote:


+int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
+{
+   struct afs_vnode *vnode = AFS_FS_I(vma-vm_file-f_mapping-host);
+   struct key *key = vma-vm_file-private_data;
+   int ret;
+
+   _enter({{%x:%u},%x},{%lx},
+  vnode-fid.vid, vnode-fid.vnode, key_serial(key), page-index);
+
+   lock_page(page);
+   ret = afs_prepare_write(vma-vm_file, page, 0, 0);
+   unlock_page(page);
+
+   _leave( = %d, ret);
+   return ret;
+}



It looks like you really want Dave's generic page_mkwrite.


Is that appropriate for a non buffer based filesystem?



And we should
really get that one merged for 2.6.22.


Dave asked me about that the other day (in relation to the -fault ops)...
I have no problem merging it for 2.6.22 and rebasing my patches on top.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Hugh Dickins
On Wed, 16 May 2007, Nick Piggin wrote:
 Andrew Morton wrote:
  I need to work out what to do with
  
  mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch
  mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch
  mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch
  mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch
  mm-merge-nopfn-into-fault.patch
  convert-hugetlbfs-to-use-vm_ops-fault.patch
  mm-remove-legacy-cruft.patch
  mm-debug-check-for-the-fault-vs-invalidate-race.patch
  mm-fix-clear_page_dirty_for_io-vs-fault-race.patch
  
  Probably merge them, I guess.  Hugh had concerns, I think over small
  additional overhead from the lock_page()?

That's right, the overhead of the lock_page()/unlock_page() in the common
path of faulting, and of the extra call to unmap_mapping_range() when
truncating (because page lock doesn't satisfactorily replace the old
sequence count when COWing).

 Yes he did. It seems to only be noticable in microbenchmarks.

So far, yes.  I expect it'll surface in some reallife workload
sometime, but let's not get too depressed about that.  I guess
your blithe Scalability is not an issue comment still irks me.

 In my opinion not enough to withhold pagecache corruption bug fixes.

It is a pity to be adding overhead to a common path in order to fix
such very rare cases, but yes, we probably have to live with that.

 Still, I have some lock_page speedup work that eliminates that regression
 anyway.

Again, rather too blithely said.  You have a deep well of ingenuity,
but I doubt it can actually wash away all of the small overhead added.

 However, Hugh hasn't exactly said yes or no yet...

Getting a yes or no out of me is very hard work indeed.
But I didn't realize that was gating this work: if the world
had to wait on me, we'd be in trouble.

I think there are quite a few people eager to see the subsequent
-fault changes go in.  And I think I'd just like to minimize the
difference between -mm and mainline, so maximize comprehensibilty,
by getting this all in.  I've not heard of any correctness issues
with it, and we all agree that the page lock there is attractively
simple.

If I ever find a better way of handling it,
I'm free to send patches in future, after all.

Did that sound something like a yes?

Hugh
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Nick Piggin

Hugh Dickins wrote:

On Wed, 16 May 2007, Nick Piggin wrote:


Andrew Morton wrote:


I need to work out what to do with

mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch
mm-merge-nopfn-into-fault.patch
convert-hugetlbfs-to-use-vm_ops-fault.patch
mm-remove-legacy-cruft.patch
mm-debug-check-for-the-fault-vs-invalidate-race.patch
mm-fix-clear_page_dirty_for_io-vs-fault-race.patch

Probably merge them, I guess.  Hugh had concerns, I think over small
additional overhead from the lock_page()?



That's right, the overhead of the lock_page()/unlock_page() in the common
path of faulting, and of the extra call to unmap_mapping_range() when
truncating (because page lock doesn't satisfactorily replace the old
sequence count when COWing).



Yes he did. It seems to only be noticable in microbenchmarks.



So far, yes.  I expect it'll surface in some reallife workload
sometime, but let's not get too depressed about that.  I guess
your blithe Scalability is not an issue comment still irks me.



I say I believe scalability will not be a huge issue, because for
concurrent faulters on the same page, they still have cacheline
contention beginning before we lock the page (tree_lock), and
ending after we unlock it (page-_count), and a few others in the
middle for good mesure. I sure don't think it is going to help,
but I don't think it would be a great impact on an alrady sucky
workload.

We would have contention against other sources of lock_page, but
OTOH, we want to fix up the clear_page_dirty_for_io race window
by doing a wait_for_page_locked here anyway.

We could introduce some contention for some other lockers... but
again I think they are often situations where the page fields are
going to be modified anyway, or where the fault code would be
contending on something anyway (eg. vs !uptodate read, truncate).
The lock hold time in the fault should be smaller than many.

I'm not saying it would never be contended, but it is such a
granular thing and it is so often surrounded by file level locking.



Still, I have some lock_page speedup work that eliminates that regression
anyway.



Again, rather too blithely said.  You have a deep well of ingenuity,
but I doubt it can actually wash away all of the small overhead added.


OK, I haven't proven to eliminate the overhead completely, but we can
see that options exist... And we should be able to at least help the
more heavily impacted powerpc architecture and bring the hit it to a
more reasonable level there.



However, Hugh hasn't exactly said yes or no yet...



Getting a yes or no out of me is very hard work indeed.
But I didn't realize that was gating this work: if the world
had to wait on me, we'd be in trouble.

I think there are quite a few people eager to see the subsequent
-fault changes go in.  And I think I'd just like to minimize the
difference between -mm and mainline, so maximize comprehensibilty,
by getting this all in.  I've not heard of any correctness issues
with it, and we all agree that the page lock there is attractively
simple.


Well I value your opinion of course, and I don't want to change the
VM in a way that people are not unhappy with :)



If I ever find a better way of handling it,
I'm free to send patches in future, after all.


That's definitely what I had in mind -- the page lock is a simple
starting point (that is hopefully correct) that we would always be
able to build on with something more fancy in future.



Did that sound something like a yes?


A little bit :)


--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Hugh Dickins
On Thu, 17 May 2007, Nick Piggin wrote:
 
 ... and I don't want to change the
 VM in a way that people are not unhappy with :)

I'm hoping you intended one less negative ;)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Nick Piggin

Hugh Dickins wrote:

On Thu, 17 May 2007, Nick Piggin wrote:


... and I don't want to change the
VM in a way that people are not unhappy with :)



I'm hoping you intended one less negative ;)


Derrr... I'm an idiot!

--
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread Chuck Ebbert
Hugh Dickins wrote:
 On Thu, 17 May 2007, Nick Piggin wrote:
 ... and I don't want to change the
 VM in a way that people are not unhappy with :)
 
 I'm hoping you intended one less negative ;)

Or one more...
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-16 Thread David Howells
Hugh Dickins [EMAIL PROTECTED] wrote:

  VM in a way that people are not unhappy with :)
 
 I'm hoping you intended one less negative ;)

Or did you mean one fewer negative? :-)

David
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-15 Thread Nick Piggin

Andrew Morton wrote:

On Tue, 15 May 2007 16:52:31 +0100
David Howells <[EMAIL PROTECTED]> wrote:



Implement shared-writable mmap for AFS.



This blows up in -mm:

fs/afs/file.c:59: error: 'filemap_nopage' undeclared here (not in a function)
fs/afs/file.c:60: error: unknown field 'populate' specified in initializer
fs/afs/file.c:60: error: 'filemap_populate' undeclared here (not in a function)

because Nick went and renamed half the VM and deleted the other half.


And page_mkwrite is next ;)



I need to work out what to do with

mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch
mm-merge-nopfn-into-fault.patch
convert-hugetlbfs-to-use-vm_ops-fault.patch
mm-remove-legacy-cruft.patch
mm-debug-check-for-the-fault-vs-invalidate-race.patch
mm-fix-clear_page_dirty_for_io-vs-fault-race.patch

Probably merge them, I guess.  Hugh had concerns, I think over small
additional overhead from the lock_page()?


Yes he did. It seems to only be noticable in microbenchmarks. In my opinion
not enough to withhold pagecache corruption bug fixes.

Still, I have some lock_page speedup work that eliminates that regression
anyway.

However, Hugh hasn't exactly said yes or no yet...

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-15 Thread Nick Piggin

David Howells wrote:

Implement shared-writable mmap for AFS.

The key with which to access the file is obtained from the VMA at the point
where the PTE is made writable by the page_mkwrite() VMA op and cached in the
affected page.

If there's an outstanding write on the page made with a different key, then
page_mkwrite() will flush it before attaching a record of the new key.


Good, will be nice to get a page_mkwrite() user in the tree.




+/*
+ * notification that a previously read-only page is about to become writable
+ * - if it returns an error, the caller will deliver a bus error signal
+ *
+ * we use this to make a record of the key with which the writeback should be
+ * performed and to flush any outstanding writes made with a different key
+ *
+ * the key to be used is attached to the file pinned by the VMA
+ */
+int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
+{
+   struct afs_vnode *vnode = AFS_FS_I(vma->vm_file->f_mapping->host);
+   struct key *key = vma->vm_file->private_data;
+   int ret;
+
+   _enter("{{%x:%u},%x},{%lx}",
+  vnode->fid.vid, vnode->fid.vnode, key_serial(key), page->index);
+
+   lock_page(page);
+   ret = afs_prepare_write(vma->vm_file, page, 0, 0);
+   unlock_page(page);
+
+   _leave(" = %d", ret);
+   return ret;
+}


By the looks of afs_prepare_write, it is going to go bang when the page
gets truncated before lock_page.

Checking page->mapping after lock_page should do the trick.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-15 Thread Andrew Morton
On Tue, 15 May 2007 16:52:31 +0100
David Howells <[EMAIL PROTECTED]> wrote:

> Implement shared-writable mmap for AFS.

This blows up in -mm:

fs/afs/file.c:59: error: 'filemap_nopage' undeclared here (not in a function)
fs/afs/file.c:60: error: unknown field 'populate' specified in initializer
fs/afs/file.c:60: error: 'filemap_populate' undeclared here (not in a function)

because Nick went and renamed half the VM and deleted the other half.

I need to work out what to do with

mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch
mm-merge-nopfn-into-fault.patch
convert-hugetlbfs-to-use-vm_ops-fault.patch
mm-remove-legacy-cruft.patch
mm-debug-check-for-the-fault-vs-invalidate-race.patch
mm-fix-clear_page_dirty_for_io-vs-fault-race.patch

Probably merge them, I guess.  Hugh had concerns, I think over small
additional overhead from the lock_page()?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-15 Thread David Howells
Implement shared-writable mmap for AFS.

The key with which to access the file is obtained from the VMA at the point
where the PTE is made writable by the page_mkwrite() VMA op and cached in the
affected page.

If there's an outstanding write on the page made with a different key, then
page_mkwrite() will flush it before attaching a record of the new key.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/afs/file.c |   22 --
 fs/afs/internal.h |1 +
 fs/afs/write.c|   28 
 3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 9c0e721..da2a18b 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -22,6 +22,7 @@ static int afs_readpage(struct file *file, struct page *page);
 static void afs_invalidatepage(struct page *page, unsigned long offset);
 static int afs_releasepage(struct page *page, gfp_t gfp_flags);
 static int afs_launder_page(struct page *page);
+static int afs_mmap(struct file *file, struct vm_area_struct *vma);
 
 const struct file_operations afs_file_operations = {
.open   = afs_open,
@@ -31,7 +32,7 @@ const struct file_operations afs_file_operations = {
.write  = do_sync_write,
.aio_read   = generic_file_aio_read,
.aio_write  = afs_file_write,
-   .mmap   = generic_file_readonly_mmap,
+   .mmap   = afs_mmap,
.sendfile   = generic_file_sendfile,
.fsync  = afs_fsync,
 };
@@ -54,6 +55,12 @@ const struct address_space_operations afs_fs_aops = {
.writepages = afs_writepages,
 };
 
+static struct vm_operations_struct afs_file_vm_ops = {
+   .nopage = filemap_nopage,
+   .populate   = filemap_populate,
+   .page_mkwrite   = afs_page_mkwrite,
+};
+
 /*
  * open an AFS file or directory and attach a key to it
  */
@@ -266,7 +273,6 @@ static void afs_invalidatepage(struct page *page, unsigned 
long offset)
 static int afs_launder_page(struct page *page)
 {
_enter("{%lu}", page->index);
-
return 0;
 }
 
@@ -293,3 +299,15 @@ static int afs_releasepage(struct page *page, gfp_t 
gfp_flags)
_leave(" = 0");
return 0;
 }
+
+/*
+ * memory map part of an AFS file
+ */
+static int afs_mmap(struct file *file, struct vm_area_struct *vma)
+{
+   _enter("");
+
+   file_accessed(file);
+   vma->vm_ops = _file_vm_ops;
+   return 0;
+}
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 61038da..141e073 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -710,6 +710,7 @@ extern ssize_t afs_file_write(struct kiocb *, const struct 
iovec *,
  unsigned long, loff_t);
 extern int afs_writeback_all(struct afs_vnode *);
 extern int afs_fsync(struct file *, struct dentry *, int);
+extern int afs_page_mkwrite(struct vm_area_struct *, struct page *);
 
 
 /*/
diff --git a/fs/afs/write.c b/fs/afs/write.c
index a03b92a..6ef818d 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -174,6 +174,8 @@ static int afs_prepare_page(struct afs_vnode *vnode, struct 
page *page,
  * prepare to perform part of a write to a page
  * - the caller holds the page locked, preventing it from being written out or
  *   modified by anyone else
+ * - may be called from afs_page_mkwrite() to set up a page for modification
+ *   through shared-writable mmap
  */
 int afs_prepare_write(struct file *file, struct page *page,
  unsigned offset, unsigned to)
@@ -825,3 +827,29 @@ int afs_fsync(struct file *file, struct dentry *dentry, 
int datasync)
_leave(" = %d", ret);
return ret;
 }
+
+/*
+ * notification that a previously read-only page is about to become writable
+ * - if it returns an error, the caller will deliver a bus error signal
+ *
+ * we use this to make a record of the key with which the writeback should be
+ * performed and to flush any outstanding writes made with a different key
+ *
+ * the key to be used is attached to the file pinned by the VMA
+ */
+int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
+{
+   struct afs_vnode *vnode = AFS_FS_I(vma->vm_file->f_mapping->host);
+   struct key *key = vma->vm_file->private_data;
+   int ret;
+
+   _enter("{{%x:%u},%x},{%lx}",
+  vnode->fid.vid, vnode->fid.vnode, key_serial(key), page->index);
+
+   lock_page(page);
+   ret = afs_prepare_write(vma->vm_file, page, 0, 0);
+   unlock_page(page);
+
+   _leave(" = %d", ret);
+   return ret;
+}

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-15 Thread David Howells
Implement shared-writable mmap for AFS.

The key with which to access the file is obtained from the VMA at the point
where the PTE is made writable by the page_mkwrite() VMA op and cached in the
affected page.

If there's an outstanding write on the page made with a different key, then
page_mkwrite() will flush it before attaching a record of the new key.

Signed-off-by: David Howells [EMAIL PROTECTED]
---

 fs/afs/file.c |   22 --
 fs/afs/internal.h |1 +
 fs/afs/write.c|   28 
 3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 9c0e721..da2a18b 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -22,6 +22,7 @@ static int afs_readpage(struct file *file, struct page *page);
 static void afs_invalidatepage(struct page *page, unsigned long offset);
 static int afs_releasepage(struct page *page, gfp_t gfp_flags);
 static int afs_launder_page(struct page *page);
+static int afs_mmap(struct file *file, struct vm_area_struct *vma);
 
 const struct file_operations afs_file_operations = {
.open   = afs_open,
@@ -31,7 +32,7 @@ const struct file_operations afs_file_operations = {
.write  = do_sync_write,
.aio_read   = generic_file_aio_read,
.aio_write  = afs_file_write,
-   .mmap   = generic_file_readonly_mmap,
+   .mmap   = afs_mmap,
.sendfile   = generic_file_sendfile,
.fsync  = afs_fsync,
 };
@@ -54,6 +55,12 @@ const struct address_space_operations afs_fs_aops = {
.writepages = afs_writepages,
 };
 
+static struct vm_operations_struct afs_file_vm_ops = {
+   .nopage = filemap_nopage,
+   .populate   = filemap_populate,
+   .page_mkwrite   = afs_page_mkwrite,
+};
+
 /*
  * open an AFS file or directory and attach a key to it
  */
@@ -266,7 +273,6 @@ static void afs_invalidatepage(struct page *page, unsigned 
long offset)
 static int afs_launder_page(struct page *page)
 {
_enter({%lu}, page-index);
-
return 0;
 }
 
@@ -293,3 +299,15 @@ static int afs_releasepage(struct page *page, gfp_t 
gfp_flags)
_leave( = 0);
return 0;
 }
+
+/*
+ * memory map part of an AFS file
+ */
+static int afs_mmap(struct file *file, struct vm_area_struct *vma)
+{
+   _enter();
+
+   file_accessed(file);
+   vma-vm_ops = afs_file_vm_ops;
+   return 0;
+}
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 61038da..141e073 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -710,6 +710,7 @@ extern ssize_t afs_file_write(struct kiocb *, const struct 
iovec *,
  unsigned long, loff_t);
 extern int afs_writeback_all(struct afs_vnode *);
 extern int afs_fsync(struct file *, struct dentry *, int);
+extern int afs_page_mkwrite(struct vm_area_struct *, struct page *);
 
 
 /*/
diff --git a/fs/afs/write.c b/fs/afs/write.c
index a03b92a..6ef818d 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -174,6 +174,8 @@ static int afs_prepare_page(struct afs_vnode *vnode, struct 
page *page,
  * prepare to perform part of a write to a page
  * - the caller holds the page locked, preventing it from being written out or
  *   modified by anyone else
+ * - may be called from afs_page_mkwrite() to set up a page for modification
+ *   through shared-writable mmap
  */
 int afs_prepare_write(struct file *file, struct page *page,
  unsigned offset, unsigned to)
@@ -825,3 +827,29 @@ int afs_fsync(struct file *file, struct dentry *dentry, 
int datasync)
_leave( = %d, ret);
return ret;
 }
+
+/*
+ * notification that a previously read-only page is about to become writable
+ * - if it returns an error, the caller will deliver a bus error signal
+ *
+ * we use this to make a record of the key with which the writeback should be
+ * performed and to flush any outstanding writes made with a different key
+ *
+ * the key to be used is attached to the file pinned by the VMA
+ */
+int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
+{
+   struct afs_vnode *vnode = AFS_FS_I(vma-vm_file-f_mapping-host);
+   struct key *key = vma-vm_file-private_data;
+   int ret;
+
+   _enter({{%x:%u},%x},{%lx},
+  vnode-fid.vid, vnode-fid.vnode, key_serial(key), page-index);
+
+   lock_page(page);
+   ret = afs_prepare_write(vma-vm_file, page, 0, 0);
+   unlock_page(page);
+
+   _leave( = %d, ret);
+   return ret;
+}

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-15 Thread Andrew Morton
On Tue, 15 May 2007 16:52:31 +0100
David Howells [EMAIL PROTECTED] wrote:

 Implement shared-writable mmap for AFS.

This blows up in -mm:

fs/afs/file.c:59: error: 'filemap_nopage' undeclared here (not in a function)
fs/afs/file.c:60: error: unknown field 'populate' specified in initializer
fs/afs/file.c:60: error: 'filemap_populate' undeclared here (not in a function)

because Nick went and renamed half the VM and deleted the other half.

I need to work out what to do with

mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch
mm-merge-nopfn-into-fault.patch
convert-hugetlbfs-to-use-vm_ops-fault.patch
mm-remove-legacy-cruft.patch
mm-debug-check-for-the-fault-vs-invalidate-race.patch
mm-fix-clear_page_dirty_for_io-vs-fault-race.patch

Probably merge them, I guess.  Hugh had concerns, I think over small
additional overhead from the lock_page()?

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-15 Thread Nick Piggin

David Howells wrote:

Implement shared-writable mmap for AFS.

The key with which to access the file is obtained from the VMA at the point
where the PTE is made writable by the page_mkwrite() VMA op and cached in the
affected page.

If there's an outstanding write on the page made with a different key, then
page_mkwrite() will flush it before attaching a record of the new key.


Good, will be nice to get a page_mkwrite() user in the tree.




+/*
+ * notification that a previously read-only page is about to become writable
+ * - if it returns an error, the caller will deliver a bus error signal
+ *
+ * we use this to make a record of the key with which the writeback should be
+ * performed and to flush any outstanding writes made with a different key
+ *
+ * the key to be used is attached to the file pinned by the VMA
+ */
+int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
+{
+   struct afs_vnode *vnode = AFS_FS_I(vma-vm_file-f_mapping-host);
+   struct key *key = vma-vm_file-private_data;
+   int ret;
+
+   _enter({{%x:%u},%x},{%lx},
+  vnode-fid.vid, vnode-fid.vnode, key_serial(key), page-index);
+
+   lock_page(page);
+   ret = afs_prepare_write(vma-vm_file, page, 0, 0);
+   unlock_page(page);
+
+   _leave( = %d, ret);
+   return ret;
+}


By the looks of afs_prepare_write, it is going to go bang when the page
gets truncated before lock_page.

Checking page-mapping after lock_page should do the trick.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] AFS: Implement shared-writable mmap

2007-05-15 Thread Nick Piggin

Andrew Morton wrote:

On Tue, 15 May 2007 16:52:31 +0100
David Howells [EMAIL PROTECTED] wrote:



Implement shared-writable mmap for AFS.



This blows up in -mm:

fs/afs/file.c:59: error: 'filemap_nopage' undeclared here (not in a function)
fs/afs/file.c:60: error: unknown field 'populate' specified in initializer
fs/afs/file.c:60: error: 'filemap_populate' undeclared here (not in a function)

because Nick went and renamed half the VM and deleted the other half.


And page_mkwrite is next ;)



I need to work out what to do with

mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-doc-fix.patch
mm-merge-populate-and-nopage-into-fault-fixes-nonlinear-fix.patch
mm-merge-nopfn-into-fault.patch
convert-hugetlbfs-to-use-vm_ops-fault.patch
mm-remove-legacy-cruft.patch
mm-debug-check-for-the-fault-vs-invalidate-race.patch
mm-fix-clear_page_dirty_for_io-vs-fault-race.patch

Probably merge them, I guess.  Hugh had concerns, I think over small
additional overhead from the lock_page()?


Yes he did. It seems to only be noticable in microbenchmarks. In my opinion
not enough to withhold pagecache corruption bug fixes.

Still, I have some lock_page speedup work that eliminates that regression
anyway.

However, Hugh hasn't exactly said yes or no yet...

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/