Re: [PATCH 14/14] NFS: Use local caching [try #2]
Trond Myklebust <[EMAIL PROTECTED]> wrote:

> > Dang, that's a lot of inlines... AFAICS, approx half of fs/nfs/fscache.h
> > should really be moved into fscache.c.
>
> If you wish.  It seems a shame since a lot of them have only one caller.
> ...however it also forces you to export a lot of stuff which is really
> private to fscache.c (the atomics etc).

The atomics is actually a bad example.  These are referred to directly by
part of the table in fs/nfs/sysctl.c.

Is there a better way of exporting statistics than through /proc/sys/ files?

David
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
[PATCH 14/14] NFS: Use local caching [try #2]
The attached patch makes it possible for the NFS filesystem to make use of
the network filesystem local caching service (FS-Cache).

To be able to use this, an updated mount program is required.  This can be
obtained from:

	http://people.redhat.com/steved/cachefs/util-linux/

To mount an NFS filesystem to use caching, add an "fsc" option to the mount:

	mount warthog:/ /a -o fsc

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 fs/Kconfig                 |    8 +
 fs/nfs/Makefile            |    1
 fs/nfs/client.c            |   11 +
 fs/nfs/file.c              |   38 +++-
 fs/nfs/fscache.c           |  303 +
 fs/nfs/fscache.h           |  464
 fs/nfs/inode.c             |   16 ++
 fs/nfs/internal.h          |    8 +
 fs/nfs/read.c              |   27 ++-
 fs/nfs/super.c             |    1
 fs/nfs/sysctl.c            |   43
 fs/nfs/write.c             |    3
 include/linux/nfs4_mount.h |    3
 include/linux/nfs_fs.h     |    6 +
 include/linux/nfs_fs_sb.h  |    5
 include/linux/nfs_mount.h  |    3
 16 files changed, 931 insertions(+), 9 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 7feb4cb..76d5d16 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -1600,6 +1600,14 @@ config NFS_V4
 	  If unsure, say N.
+config NFS_FSCACHE
+	bool "Provide NFS client caching support (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
+	help
+	  Say Y here if you want NFS data to be cached locally on disc through
+	  the general filesystem cache manager
+
 config NFS_DIRECTIO
 	bool "Allow direct I/O on NFS files"
 	depends on NFS_FS
diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile
index b55cb23..c9e7c43 100644
--- a/fs/nfs/Makefile
+++ b/fs/nfs/Makefile
@@ -16,4 +16,5 @@ nfs-$(CONFIG_NFS_V4)	+= nfs4proc.o nfs4xdr.o nfs4state.o nfs4renewd.o \
 			   nfs4namespace.o
 nfs-$(CONFIG_NFS_DIRECTIO) += direct.o
 nfs-$(CONFIG_SYSCTL)	+= sysctl.o
+nfs-$(CONFIG_NFS_FSCACHE) += fscache.o
 nfs-objs		:= $(nfs-y)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index a49f9fe..7be7807 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -137,6 +137,8 @@ static struct nfs_client *nfs_alloc_client(const char *hostname,
 	clp->cl_state = 1 << NFS4CLNT_LEASE_EXPIRED;
 #endif

+	nfs_fscache_get_client_cookie(clp);
+
 	return clp;

 error_3:
@@ -168,6 +170,8 @@ static void nfs_free_client(struct nfs_client *clp)
 	nfs4_shutdown_client(clp);

+	nfs_fscache_release_client_cookie(clp);
+
 	/* -EIO all pending I/O */
 	if (!IS_ERR(clp->cl_rpcclient))
 		rpc_shutdown_client(clp->cl_rpcclient);
@@ -1308,7 +1312,7 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
 	/* display header on line 1 */
 	if (v == &nfs_volume_list) {
-		seq_puts(m, "NV SERVER PORT DEV FSID\n");
+		seq_puts(m, "NV SERVER PORT DEV FSID FSC\n");
 		return 0;
 	}
 	/* display one transport per line on subsequent lines */
@@ -1322,12 +1326,13 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
 		 (unsigned long long) server->fsid.major,
 		 (unsigned long long) server->fsid.minor);
-	seq_printf(m, "v%d %02x%02x%02x%02x %4hx %-7s %-17s\n",
+	seq_printf(m, "v%d %02x%02x%02x%02x %4hx %-7s %-17s %s\n",
 		   clp->cl_nfsversion,
 		   NIPQUAD(clp->cl_addr.sin_addr),
 		   ntohs(clp->cl_addr.sin_port),
 		   dev,
-		   fsid);
+		   fsid,
+		   nfs_server_fscache_state(server));
 	return 0;
 }
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index c87dc71..dfd36e0 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -34,6 +34,7 @@
 #include "delegation.h"
 #include "iostat.h"
+#include "internal.h"

 #define NFSDBG_FACILITY		NFSDBG_FILE
@@ -259,6 +260,10 @@ nfs_file_mmap(struct file * file, struct vm_area_struct * vma)
 	status = nfs_revalidate_mapping(inode, file->f_mapping);
 	if (!status)
 		status = generic_file_mmap(file, vma);
+
+	if (status == 0)
+		nfs_fscache_install_vm_ops(inode, vma);
+
 	return status;
 }
@@ -311,22 +316,51 @@ static int nfs_commit_write(struct file *file, struct page *page, unsigned offse
 	return status;
 }

+/*
+ * partially or wholly invalidate a page
+ * - release the private state associated with a page if undergoing complete
+ *   page invalidation
+ * - caller holds page lock
+ */
 static void nfs_invalidate_page(struct page *page, unsigned long offset)
 {
 	if (offset != 0)
 		return;
 	/* Cancel any unstarted writes on this page */
 	nfs_wb_page_priority(page->mapping->host, page, FLUSH_INVALIDATE);
+
Re: [PATCH 14/14] NFS: Use local caching [try #2]
Dang, that's a lot of inlines... AFAICS, approx half of fs/nfs/fscache.h
should really be moved into fscache.c.

Otherwise, this looks a lot less intrusive than previous patches.  See
inlined comments.

On Thu, 2007-08-09 at 17:05 +0100, David Howells wrote:
> The attached patch makes it possible for the NFS filesystem to make use of
> the network filesystem local caching service (FS-Cache).
>
> To be able to use this, an updated mount program is required.  This can be
> obtained from:
>
> 	http://people.redhat.com/steved/cachefs/util-linux/
>
> To mount an NFS filesystem to use caching, add an "fsc" option to the
> mount:
>
> 	mount warthog:/ /a -o fsc
>
> Signed-Off-By: David Howells <[EMAIL PROTECTED]>
> ---
> [...]
> [...]
> @@ -1308,7 +1312,7 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
>  	/* display header on line 1 */
>  	if (v == &nfs_volume_list) {
> -		seq_puts(m, "NV SERVER PORT DEV FSID\n");
> +		seq_puts(m, "NV SERVER PORT DEV FSID FSC\n");
>  		return 0;
>  	}
>  	/* display one transport per line on subsequent lines */
> @@ -1322,12 +1326,13 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
>  		 (unsigned long long) server->fsid.major,
>  		 (unsigned long long) server->fsid.minor);
> -	seq_printf(m, "v%d %02x%02x%02x%02x %4hx %-7s %-17s\n",
> +	seq_printf(m, "v%d %02x%02x%02x%02x %4hx %-7s %-17s %s\n",
>  		   clp->cl_nfsversion,
>  		   NIPQUAD(clp->cl_addr.sin_addr),
>  		   ntohs(clp->cl_addr.sin_port),
>  		   dev,
> -		   fsid);
> +		   fsid,
> +		   nfs_server_fscache_state(server));

Please send these changes as a separate patch: they change an existing
interface.

>  	return 0;
>  }
> diff --git a/fs/nfs/file.c b/fs/nfs/file.c
> index c87dc71..dfd36e0 100644
> --- a/fs/nfs/file.c
> +++ b/fs/nfs/file.c
> @@ -34,6 +34,7 @@
>  #include "delegation.h"
>  #include "iostat.h"
> +#include "internal.h"
>
>  #define NFSDBG_FACILITY		NFSDBG_FILE
> @@ -259,6 +260,10 @@ nfs_file_mmap(struct file * file, struct vm_area_struct * vma)
>  	status = nfs_revalidate_mapping(inode, file->f_mapping);
>  	if (!status)
>  		status = generic_file_mmap(file, vma);
> +
> +	if (status == 0)
> +		nfs_fscache_install_vm_ops(inode, vma);
> +
>  	return status;

Please note that in 2.6.24 the NFS client will be gaining a page_mkwrite()
method in order to fix the remaining mmap() races.  AFAICS, it should be
fairly trivial to merge with your nfs_file_page_mkwrite.

>  }
Re: [PATCH 14/14] NFS: Use local caching [try #2]
Trond Myklebust <[EMAIL PROTECTED]> wrote:

> Dang, that's a lot of inlines... AFAICS, approx half of fs/nfs/fscache.h
> should really be moved into fscache.c.

If you wish.  It seems a shame since a lot of them have only one caller.
...however it also forces you to export a lot of stuff which is really
private to fscache.c (the atomics etc).

> > +	/* we can do this here as the bits are only set with the page lock
> > +	 * held, and our caller is holding that */
> > +	if (!page->private)
> > +		ClearPagePrivate(page);
>
> Why would PG_private be set at this point?

Looks like I've got a bit more cleaning up to do.  PG_private isn't set by
FS-Cache, so this bit shouldn't be here.

> In any case, please send this and other PagePrivate changes as a separate
> patch.  Any changes to the PagePrivate semantics must be made easy to
> debug.

There shouldn't be any.  Note that due to patch #2, PG_fscache causes
releasepage() and invalidatepage() to be called in addition to PG_private.

> > +
> > +	if (!nfs_fscache_release_page(page, gfp))
> > +		return 0;
>
> This looks _very_ dubious.  Why shouldn't I be able to discard a page just
> because fscache isn't done writing it out?  I may have very good reasons
> to do so.

Hmmm...  Looking at the truncate routines, I suppose this ought to be okay,
provided the cache retains a reference on the page whilst it's writing it
out (put_page() won't can the page until we release it).

It also seems dubious, though, to release the page when the filesystem is
doing stuff to it, even if it's by proxy in the cache.  I'll have to test
that, but I'm slightly concerned that the netfs could end up releasing its
cookie before the cache has finished with its pages.  On the other hand,
with the new asynchronous stuff I've done, I'm not sure this'll be an actual
problem.

> >  static int nfs_launder_page(struct page *page)
> >  {
> > +	nfs_fscache_invalidate_page(page, page->mapping->host, 0);
>
> Why?  The function of launder_page() is to clean the page, not to truncate
> it.

Okay.
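David's "put_page() won't can the page until we release it" argument is plain
reference counting: as long as the cache takes its own reference before
starting the write-out, the VM dropping its reference cannot free the page.
A toy userspace sketch of that invariant (all of these names are invented
for illustration; this is not the kernel's page API):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel's page refcounting. */
struct fake_page {
	int refcount;
	bool freed;
};

static void fake_get_page(struct fake_page *p)
{
	p->refcount++;
}

static void fake_put_page(struct fake_page *p)
{
	/* the page is only really destroyed when the last ref goes */
	if (--p->refcount == 0)
		p->freed = true;
}

/* The cache pins the page before starting the write-out... */
static void cache_begin_write(struct fake_page *p)
{
	fake_get_page(p);
}

/* ...and drops the pin only once the write-out has completed. */
static void cache_end_write(struct fake_page *p)
{
	fake_put_page(p);
}
```

With this, truncate can drop the VM's reference while the cache is still
writing, and the page survives until cache_end_write() runs.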
> > @@ -1000,11 +1007,13 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> >  		if (data_stable) {
> >  			inode->i_size = new_isize;
> >  			invalid |= NFS_INO_INVALID_DATA;
> > +			nfs_fscache_attr_changed(inode);
>
> Can't fscache_attr_changed() call kmalloc(GFP_KERNEL)?  You are in a
> spinlocked context here.

Hmmm...  How about I move the call to fscache_attr_changed() to the callers
of nfs_update_inode(), to just after the spinlock is unlocked.  The
operation is going to be asynchronous, so the delay shouldn't matter.

> I'm not too happy about the change to the binary mount interface.  As far
> as I'm concerned, the binary interface should really be considered frozen,
> and should not be touched.

I hadn't come across that.  I'll have a look.

> Instead, feel free to update the text-based mount interface (which can be
> found in 2.6.23-rc1 and later).  Please put any such mount interface
> changes in a separate patch together with the Kconfig changes.

If you wish, but doesn't that violate the rules of patch division laid down
by Linus and Andrew?

David
Re: [PATCH 14/14] NFS: Use local caching [try #2]
Trond Myklebust <[EMAIL PROTECTED]> wrote:

> Instead, feel free to update the text-based mount interface (which can be
> found in 2.6.23-rc1 and later).

I presume you're referring to nfs_mount_option_tokens[] and friends.  Is
there a mount program that can drive this?

David
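For readers unfamiliar with it, the text-based interface David is asking
about boils down to a token-table lookup over the comma-separated mount
options.  A minimal userspace sketch of that pattern (the kernel uses
match_token() from lib/parser.c against the real nfs_mount_option_tokens[]
table; the table contents and helper below are simplified stand-ins):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's mount-option token table.
 * Only the caching-related options are shown; "nofsc" as the negative
 * form is an assumption for illustration. */
enum { Opt_fscache, Opt_nofscache, Opt_err };

static const struct {
	const char *pattern;
	int token;
} nfs_mount_option_tokens[] = {
	{ "fsc",	Opt_fscache },
	{ "nofsc",	Opt_nofscache },
	{ NULL,		Opt_err },
};

/* Look a single option string up in the table; Opt_err if unknown. */
static int nfs_match_option(const char *opt)
{
	size_t i;

	for (i = 0; nfs_mount_option_tokens[i].pattern != NULL; i++)
		if (strcmp(opt, nfs_mount_option_tokens[i].pattern) == 0)
			return nfs_mount_option_tokens[i].token;
	return Opt_err;
}
```

The point of the table-driven form is that adding a new option such as
"fsc" is a one-line table entry plus a case in the option switch, with no
change to the frozen binary mount structure.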
Re: [PATCH 14/14] NFS: Use local caching [try #2]
On Thu, 2007-08-09 at 19:52 +0100, David Howells wrote:
> Trond Myklebust <[EMAIL PROTECTED]> wrote:
>
> > Dang, that's a lot of inlines... AFAICS, approx half of fs/nfs/fscache.h
> > should really be moved into fscache.c.
>
> If you wish.  It seems a shame since a lot of them have only one caller.
> ...however it also forces you to export a lot of stuff which is really
> private to fscache.c (the atomics etc).
>
> Note that due to patch #2, PG_fscache causes releasepage() and
> invalidatepage() to be called in addition to PG_private.
>
> > > +
> > > +	if (!nfs_fscache_release_page(page, gfp))
> > > +		return 0;
> >
> > This looks _very_ dubious.  Why shouldn't I be able to discard a page
> > just because fscache isn't done writing it out?  I may have very good
> > reasons to do so.
>
> Hmmm...  Looking at the truncate routines, I suppose this ought to be
> okay, provided the cache retains a reference on the page whilst it's
> writing it out (put_page() won't can the page until we release it).
>
> It also seems dubious, though, to release the page when the filesystem is
> doing stuff to it, even if it's by proxy in the cache.  I'll have to test
> that, but I'm slightly concerned that the netfs could end up releasing its
> cookie before the cache has finished with its pages.  On the other hand,
> with the new asynchronous stuff I've done, I'm not sure this'll be an
> actual problem.

Actually, as long as launder_page() and invalidate_page() are doing their
thing, then your current code might be OK.  The important thing is to
ensure that invalidate_inode_pages2() and truncate_inode_pages() work as
expected.

> > >  static int nfs_launder_page(struct page *page)
> > >  {
> > > +	nfs_fscache_invalidate_page(page, page->mapping->host, 0);
> >
> > Why?  The function of launder_page() is to clean the page, not to
> > truncate it.
>
> Okay.

What you should be doing here is probably to make a call to
wait_on_page_fscache_write(page).  That should suffice to ensure that the
page is clean w.r.t. fscache activity afaics.
> > > @@ -1000,11 +1007,13 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> > >  		if (data_stable) {
> > >  			inode->i_size = new_isize;
> > >  			invalid |= NFS_INO_INVALID_DATA;
> > > +			nfs_fscache_attr_changed(inode);
> >
> > Can't fscache_attr_changed() call kmalloc(GFP_KERNEL)?  You are in a
> > spinlocked context here.
>
> Hmmm...  How about I move the call to fscache_attr_changed() to the
> callers of nfs_update_inode(), to just after the spinlock is unlocked.
> The operation is going to be asynchronous, so the delay shouldn't matter.

Right.  You might also want to add a flag NFS_INO_INVALID_FSCACHE_ATTR or
something like that in order to trigger it.

> > I'm not too happy about the change to the binary mount interface.  As
> > far as I'm concerned, the binary interface should really be considered
> > frozen, and should not be touched.
>
> I hadn't come across that.  I'll have a look.
>
> > Instead, feel free to update the text-based mount interface (which can
> > be found in 2.6.23-rc1 and later).  Please put any such mount interface
> > changes in a separate patch together with the Kconfig changes.
>
> If you wish, but doesn't that violate the rules of patch division laid
> down by Linus and Andrew?

Why?  It should be possible to set this up so that the NFS+fscache code
compiles correctly without the mount patch.

Trond
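The flag-based deferral being discussed here is a common pattern: record the
event under the spinlock, then make the potentially-sleeping call only after
the lock is dropped.  A toy userspace sketch (NFS_INO_INVALID_FSCACHE_ATTR
is the flag name proposed above; the value, structures and helper names are
invented for illustration, with a bool standing in for the spinlock):

```c
#include <assert.h>
#include <stdbool.h>

/* Invented value mirroring the proposed NFS_INO_INVALID_FSCACHE_ATTR flag */
#define NFS_INO_INVALID_FSCACHE_ATTR	0x0100

struct fake_inode {
	bool locked;			/* stand-in for the inode spinlock */
	unsigned long cache_validity;
	int attr_changed_calls;		/* times the sleeping call ran */
};

/* Runs under the "spinlock": may only record that the attrs changed. */
static void fake_update_inode(struct fake_inode *inode)
{
	assert(inode->locked);
	inode->cache_validity |= NFS_INO_INVALID_FSCACHE_ATTR;
}

/* Stand-in for fscache_attr_changed(): may sleep in the real code, so
 * it must never be called with the lock held. */
static void fake_attr_changed(struct fake_inode *inode)
{
	assert(!inode->locked);
	inode->attr_changed_calls++;
}

/* Caller of fake_update_inode(): acts on the flag after unlocking. */
static void fake_revalidate(struct fake_inode *inode)
{
	bool notify;

	inode->locked = true;		/* spin_lock() */
	fake_update_inode(inode);
	notify = (inode->cache_validity & NFS_INO_INVALID_FSCACHE_ATTR) != 0;
	inode->cache_validity &= ~NFS_INO_INVALID_FSCACHE_ATTR;
	inode->locked = false;		/* spin_unlock() */

	if (notify)
		fake_attr_changed(inode);	/* safe: lock is dropped */
}
```

The asserts in the two helpers encode the locking rule itself, so moving
fake_attr_changed() back inside the locked region would trip immediately.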