Re: [PATCH 14/14] NFS: Use local caching [try #2]

2007-08-10 Thread David Howells
Trond Myklebust [EMAIL PROTECTED] wrote:

   Dang, that's a lot of inlines... AFAICS, approx half of fs/nfs/fscache.h
   should really be moved into fscache.c.
  
  If you wish.  It seems a shame since a lot of them have only one caller.
 
 ...however it also forces you to export a lot of stuff which is really
 private to fscache.c (the atomics etc).

The atomics is actually a bad example.  These are referred to directly by part
of the table in fs/nfs/sysctl.c.  Is there a better way of exporting
statistics than through /proc/sys/ files?

David
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 14/14] NFS: Use local caching [try #2]

2007-08-09 Thread David Howells
The attached patch makes it possible for the NFS filesystem to make use of the
network filesystem local caching service (FS-Cache).

To be able to use this, an updated mount program is required.  This can be
obtained from:

http://people.redhat.com/steved/cachefs/util-linux/

To mount an NFS filesystem to use caching, add an fsc option to the mount:

mount warthog:/ /a -o fsc

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/Kconfig |8 +
 fs/nfs/Makefile|1 
 fs/nfs/client.c|   11 +
 fs/nfs/file.c  |   38 +++-
 fs/nfs/fscache.c   |  303 +
 fs/nfs/fscache.h   |  464 
 fs/nfs/inode.c |   16 ++
 fs/nfs/internal.h  |8 +
 fs/nfs/read.c  |   27 ++-
 fs/nfs/super.c |1 
 fs/nfs/sysctl.c|   43 
 fs/nfs/write.c |3 
 include/linux/nfs4_mount.h |3 
 include/linux/nfs_fs.h |6 +
 include/linux/nfs_fs_sb.h  |5 
 include/linux/nfs_mount.h  |3 
 16 files changed, 931 insertions(+), 9 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 7feb4cb..76d5d16 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -1600,6 +1600,14 @@ config NFS_V4
 
  If unsure, say N.
 
+config NFS_FSCACHE
+   bool Provide NFS client caching support (EXPERIMENTAL)
+   depends on EXPERIMENTAL
+   depends on NFS_FS=m  FSCACHE || NFS_FS=y  FSCACHE=y
+   help
+ Say Y here if you want NFS data to be cached locally on disc through
+ the general filesystem cache manager
+
 config NFS_DIRECTIO
bool Allow direct I/O on NFS files
depends on NFS_FS
diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile
index b55cb23..c9e7c43 100644
--- a/fs/nfs/Makefile
+++ b/fs/nfs/Makefile
@@ -16,4 +16,5 @@ nfs-$(CONFIG_NFS_V4)  += nfs4proc.o nfs4xdr.o nfs4state.o 
nfs4renewd.o \
   nfs4namespace.o
 nfs-$(CONFIG_NFS_DIRECTIO) += direct.o
 nfs-$(CONFIG_SYSCTL) += sysctl.o
+nfs-$(CONFIG_NFS_FSCACHE) += fscache.o
 nfs-objs   := $(nfs-y)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index a49f9fe..7be7807 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -137,6 +137,8 @@ static struct nfs_client *nfs_alloc_client(const char 
*hostname,
clp-cl_state = 1  NFS4CLNT_LEASE_EXPIRED;
 #endif
 
+   nfs_fscache_get_client_cookie(clp);
+
return clp;
 
 error_3:
@@ -168,6 +170,8 @@ static void nfs_free_client(struct nfs_client *clp)
 
nfs4_shutdown_client(clp);
 
+   nfs_fscache_release_client_cookie(clp);
+
/* -EIO all pending I/O */
if (!IS_ERR(clp-cl_rpcclient))
rpc_shutdown_client(clp-cl_rpcclient);
@@ -1308,7 +1312,7 @@ static int nfs_volume_list_show(struct seq_file *m, void 
*v)
 
/* display header on line 1 */
if (v == nfs_volume_list) {
-   seq_puts(m, NV SERVER   PORT DEV FSID\n);
+   seq_puts(m, NV SERVER   PORT DEV FSID  FSC\n);
return 0;
}
/* display one transport per line on subsequent lines */
@@ -1322,12 +1326,13 @@ static int nfs_volume_list_show(struct seq_file *m, 
void *v)
 (unsigned long long) server-fsid.major,
 (unsigned long long) server-fsid.minor);
 
-   seq_printf(m, v%d %02x%02x%02x%02x %4hx %-7s %-17s\n,
+   seq_printf(m, v%d %02x%02x%02x%02x %4hx %-7s %-17s %s\n,
   clp-cl_nfsversion,
   NIPQUAD(clp-cl_addr.sin_addr),
   ntohs(clp-cl_addr.sin_port),
   dev,
-  fsid);
+  fsid,
+  nfs_server_fscache_state(server));
 
return 0;
 }
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index c87dc71..dfd36e0 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -34,6 +34,7 @@
 
 #include delegation.h
 #include iostat.h
+#include internal.h
 
 #define NFSDBG_FACILITYNFSDBG_FILE
 
@@ -259,6 +260,10 @@ nfs_file_mmap(struct file * file, struct vm_area_struct * 
vma)
status = nfs_revalidate_mapping(inode, file-f_mapping);
if (!status)
status = generic_file_mmap(file, vma);
+
+   if (status == 0)
+   nfs_fscache_install_vm_ops(inode, vma);
+
return status;
 }
 
@@ -311,22 +316,51 @@ static int nfs_commit_write(struct file *file, struct 
page *page, unsigned offse
return status;
 }
 
+/*
+ * partially or wholly invalidate a page
+ * - release the private state associated with a page if undergoing complete
+ *   page invalidation
+ * - caller holds page lock
+ */
 static void nfs_invalidate_page(struct page *page, unsigned long offset)
 {
if (offset != 0)
return;
/* Cancel any unstarted writes on this page */
nfs_wb_page_priority(page-mapping-host, page, FLUSH_INVALIDATE);
+
+   

Re: [PATCH 14/14] NFS: Use local caching [try #2]

2007-08-09 Thread Trond Myklebust
Dang, that's a lot of inlines... AFAICS, approx half of fs/nfs/fscache.h
should really be moved into fscache.c.

Otherwise, this looks a lot less intrusive than previous patches.

See inlined comments.

On Thu, 2007-08-09 at 17:05 +0100, David Howells wrote:
 The attached patch makes it possible for the NFS filesystem to make use of the
 network filesystem local caching service (FS-Cache).
 
 To be able to use this, an updated mount program is required.  This can be
 obtained from:
 
   http://people.redhat.com/steved/cachefs/util-linux/
 
 To mount an NFS filesystem to use caching, add an fsc option to the mount:
 
   mount warthog:/ /a -o fsc
 
 Signed-Off-By: David Howells [EMAIL PROTECTED]
 ---
 
  fs/Kconfig |8 +
  fs/nfs/Makefile|1 
  fs/nfs/client.c|   11 +
  fs/nfs/file.c  |   38 +++-
  fs/nfs/fscache.c   |  303 +
  fs/nfs/fscache.h   |  464 
 
  fs/nfs/inode.c |   16 ++
  fs/nfs/internal.h  |8 +
  fs/nfs/read.c  |   27 ++-
  fs/nfs/super.c |1 
  fs/nfs/sysctl.c|   43 
  fs/nfs/write.c |3 
  include/linux/nfs4_mount.h |3 
  include/linux/nfs_fs.h |6 +
  include/linux/nfs_fs_sb.h  |5 
  include/linux/nfs_mount.h  |3 
  16 files changed, 931 insertions(+), 9 deletions(-)
 
 diff --git a/fs/Kconfig b/fs/Kconfig
 index 7feb4cb..76d5d16 100644
 --- a/fs/Kconfig
 +++ b/fs/Kconfig
 @@ -1600,6 +1600,14 @@ config NFS_V4
  
 If unsure, say N.
  
 +config NFS_FSCACHE
 + bool Provide NFS client caching support (EXPERIMENTAL)
 + depends on EXPERIMENTAL
 + depends on NFS_FS=m  FSCACHE || NFS_FS=y  FSCACHE=y
 + help
 +   Say Y here if you want NFS data to be cached locally on disc through
 +   the general filesystem cache manager
 +
  config NFS_DIRECTIO
   bool Allow direct I/O on NFS files
   depends on NFS_FS
 diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile
 index b55cb23..c9e7c43 100644
 --- a/fs/nfs/Makefile
 +++ b/fs/nfs/Makefile
 @@ -16,4 +16,5 @@ nfs-$(CONFIG_NFS_V4)+= nfs4proc.o nfs4xdr.o 
 nfs4state.o nfs4renewd.o \
  nfs4namespace.o
  nfs-$(CONFIG_NFS_DIRECTIO) += direct.o
  nfs-$(CONFIG_SYSCTL) += sysctl.o
 +nfs-$(CONFIG_NFS_FSCACHE) += fscache.o
  nfs-objs := $(nfs-y)
 diff --git a/fs/nfs/client.c b/fs/nfs/client.c
 index a49f9fe..7be7807 100644
 --- a/fs/nfs/client.c
 +++ b/fs/nfs/client.c
 @@ -137,6 +137,8 @@ static struct nfs_client *nfs_alloc_client(const char 
 *hostname,
   clp-cl_state = 1  NFS4CLNT_LEASE_EXPIRED;
  #endif
  
 + nfs_fscache_get_client_cookie(clp);
 +
   return clp;
  
  error_3:
 @@ -168,6 +170,8 @@ static void nfs_free_client(struct nfs_client *clp)
  
   nfs4_shutdown_client(clp);
  
 + nfs_fscache_release_client_cookie(clp);
 +
   /* -EIO all pending I/O */
   if (!IS_ERR(clp-cl_rpcclient))
   rpc_shutdown_client(clp-cl_rpcclient);
 @@ -1308,7 +1312,7 @@ static int nfs_volume_list_show(struct seq_file *m, 
 void *v)
  
   /* display header on line 1 */
   if (v == nfs_volume_list) {
 - seq_puts(m, NV SERVER   PORT DEV FSID\n);
 + seq_puts(m, NV SERVER   PORT DEV FSID  FSC\n);
   return 0;
   }
   /* display one transport per line on subsequent lines */
 @@ -1322,12 +1326,13 @@ static int nfs_volume_list_show(struct seq_file *m, 
 void *v)
(unsigned long long) server-fsid.major,
(unsigned long long) server-fsid.minor);
  
 - seq_printf(m, v%d %02x%02x%02x%02x %4hx %-7s %-17s\n,
 + seq_printf(m, v%d %02x%02x%02x%02x %4hx %-7s %-17s %s\n,
  clp-cl_nfsversion,
  NIPQUAD(clp-cl_addr.sin_addr),
  ntohs(clp-cl_addr.sin_port),
  dev,
 -fsid);
 +fsid,
 +nfs_server_fscache_state(server));

Please send these changes as a separate patch: they change an existing
interface.

  
   return 0;
  }
 diff --git a/fs/nfs/file.c b/fs/nfs/file.c
 index c87dc71..dfd36e0 100644
 --- a/fs/nfs/file.c
 +++ b/fs/nfs/file.c
 @@ -34,6 +34,7 @@
  
  #include delegation.h
  #include iostat.h
 +#include internal.h
  
  #define NFSDBG_FACILITY  NFSDBG_FILE
  
 @@ -259,6 +260,10 @@ nfs_file_mmap(struct file * file, struct vm_area_struct 
 * vma)
   status = nfs_revalidate_mapping(inode, file-f_mapping);
   if (!status)
   status = generic_file_mmap(file, vma);
 +
 + if (status == 0)
 + nfs_fscache_install_vm_ops(inode, vma);
 +
   return status;

Please note that in 2.6.24 the NFS client will be gaining a
page_mkwrite() method in order to fix the remaining mmap() races.
AFAICS, it should be fairly trivial to merge with your
nfs_file_page_mkwrite.

  }
  
 

Re: [PATCH 14/14] NFS: Use local caching [try #2]

2007-08-09 Thread David Howells
Trond Myklebust [EMAIL PROTECTED] wrote:

 Dang, that's a lot of inlines... AFAICS, approx half of fs/nfs/fscache.h
 should really be moved into fscache.c.

If you wish.  It seems a shame since a lot of them have only one caller.

  +   /* we can do this here as the bits are only set with the page lock
  +* held, and our caller is holding that */
  +   if (!page-private)
  +   ClearPagePrivate(page);
 
 Why would PG_private be set at this point?

Looks like I've got a bit more cleaning up to do.  PG_private isn't set by
FS-Cache, so this bit shouldn't be here.

 In any case, please send this and other PagePrivate changes as a
 separate patch. Any changes to the PagePrivate semantics must be made
 easy to debug.

There shouldn't be any.

Note that due to patch #2, PG_fscache causes releasepage() and
invalidatepage() to be called in addition to PG_private.

  +
  +   if (!nfs_fscache_release_page(page, gfp))
  +   return 0;
 
 This looks _very_ dubious. Why shouldn't I be able to discard a page
 just because fscache isn't done writing it out? I may have very good
 reasons to do so.

Hmmm...  Looking at the truncate routines, I suppose this ought to be okay,
provided the cache retains a reference on the page whilst it's writing it out
(put_page() won't can the page until we release it).

It also seems dubious, though, to release the page when the filesystem is
doing stuff to it, even if it's by proxy in the cache.  I'll have to test
that, but I'm slightly concerned that the netfs could end up releasing its
cookie before the cache has finished with its pages.  On the other hand, with
the new asynchronous stuff I've done, I'm not sure this'll be an actual
problem.

   static int nfs_launder_page(struct page *page)
   {
  +   nfs_fscache_invalidate_page(page, page-mapping-host, 0);
 
 Why? The function of launder_page() is to clean the page, not to
 truncate it.

Okay.

  @@ -1000,11 +1007,13 @@ static int nfs_update_inode(struct inode *inode, 
  struct nfs_fattr *fattr)
  if (data_stable) {
  inode-i_size = new_isize;
  invalid |= NFS_INO_INVALID_DATA;
  +   nfs_fscache_attr_changed(inode);
 
 Can't fscache_attr_changed() call kmalloc(GFP_KERNEL)? You are in a
 spinlocked context here.

Hmmm...  How about I move the call to fscache_attr_changed() to the callers of
nfs_update_inode(), to just after the spinlock is unlocked.  The operation is
going to be asynchronous, so the delay shouldn't matter.

 I'm not too happy about the change to the binary mount interface. As far
 as I'm concerned, the binary interface should really be considered
 frozen, and should not be touched.

I hadn't come across that.  I'll have a look.

 Instead, feel free to update the text-based mount interface (which can
 be found in 2.6.23-rc1 and later). Please put any such mount interface
 changes in a separate patch together with the Kconfig changes.

If you wish, but doesn't that violate the rules of patch division laid down by
Linus and Andrew?

David
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 14/14] NFS: Use local caching [try #2]

2007-08-09 Thread David Howells

  Instead, feel free to update the text-based mount interface (which can
  be found in 2.6.23-rc1 and later).

I presume you're referring to nfs_mount_option_tokens[] and friends.  Is there
a mount program that can drive this?

David
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 14/14] NFS: Use local caching [try #2]

2007-08-09 Thread Trond Myklebust
On Thu, 2007-08-09 at 19:52 +0100, David Howells wrote:
 Trond Myklebust [EMAIL PROTECTED] wrote:
 
  Dang, that's a lot of inlines... AFAICS, approx half of fs/nfs/fscache.h
  should really be moved into fscache.c.
 
 If you wish.  It seems a shame since a lot of them have only one caller.

...however it also forces you to export a lot of stuff which is really
private to fscache.c (the atomics etc).

 Note that due to patch #2, PG_fscache causes releasepage() and
 invalidatepage() to be called in addition to PG_private.



   +
   + if (!nfs_fscache_release_page(page, gfp))
   + return 0;
  
  This looks _very_ dubious. Why shouldn't I be able to discard a page
  just because fscache isn't done writing it out? I may have very good
  reasons to do so.
 
 Hmmm...  Looking at the truncate routines, I suppose this ought to be okay,
 provided the cache retains a reference on the page whilst it's writing it out
 (put_page() won't can the page until we release it).
 
 It also seems dubious, though, to release the page when the filesystem is
 doing stuff to it, even if it's by proxy in the cache.  I'll have to test
 that, but I'm slightly concerned that the netfs could end up releasing its
 cookie before the cache has finished with its pages.  On the other hand, with
 the new asynchronous stuff I've done, I'm not sure this'll be an actual
 problem.

Actually, as long as launder_page() and invalidate_page() are doing
their thing, then your current code might be OK. The important thing is
to ensure that invalidate_inode_pages2() and truncate_inode_pages() work
as expected.

static int nfs_launder_page(struct page *page)
{
   + nfs_fscache_invalidate_page(page, page-mapping-host, 0);
  
  Why? The function of launder_page() is to clean the page, not to
  truncate it.
 
 Okay.

What you should be doing here is probably to make a call to
wait_on_page_fscache_write(page). That should suffice to ensure that the
page is clean w.r.t. fscache activity afaics.

   @@ -1000,11 +1007,13 @@ static int nfs_update_inode(struct inode *inode, 
   struct nfs_fattr *fattr)
 if (data_stable) {
 inode-i_size = new_isize;
 invalid |= NFS_INO_INVALID_DATA;
   + nfs_fscache_attr_changed(inode);
  
  Can't fscache_attr_changed() call kmalloc(GFP_KERNEL)? You are in a
  spinlocked context here.
 
 Hmmm...  How about I move the call to fscache_attr_changed() to the callers of
 nfs_update_inode(), to just after the spinlock is unlocked.  The operation is
 going to be asynchronous, so the delay shouldn't matter.

Right. You might also want to add a flag NFS_INO_INVALID_FSCACHE_ATTR or
something like that in order to trigger it.

  I'm not too happy about the change to the binary mount interface. As far
  as I'm concerned, the binary interface should really be considered
  frozen, and should not be touched.
 
 I hadn't come across that.  I'll have a look.
 
  Instead, feel free to update the text-based mount interface (which can
  be found in 2.6.23-rc1 and later). Please put any such mount interface
  changes in a separate patch together with the Kconfig changes.
 
 If you wish, but doesn't that violate the rules of patch division laid down by
 Linus and Andrew?

Why? It should be possible to set this up so that the NFS+fscache code
compiles correctly without the mount patch.

Trond

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html