completely revised ioctl for notification of when all space from a snapshot delete is freed and similar
I have started rewriting the ioctl#21 patch using a less hacky
architecture. The rewrite defines a new kind of structure that explicitly
organizes the information common to delayed/deferred systems (the list of
work pieces, and an atomic for counting active worker threads), and
replaces the lists of work pieces in the ioctl#21-tracked deferred systems
with this structure.

I started on March 25, but haven't had time to work on this since then and
would like to hand it off. The structure described above is in place, and
some macros for use within the ioctl body are written; the next step is to
try compiling and to replace all references to the replaced list heads
with the new syntax.

Any takers?

David Nicol
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
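The patch itself is not included in this thread, so the following is only
a guess at the shape of such a structure, written as user-space C11 rather
than kernel code. Every name in it (work_piece, deferred_queue and the
helpers) is hypothetical, and locking of the list is left out to keep the
sketch short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; the real patch is not shown in the thread. */
struct work_piece {
	struct work_piece *next;
};

/* One of these per deferred system, in place of a bare list head. */
struct deferred_queue {
	struct work_piece *head;	/* pending work pieces */
	atomic_int active_workers;	/* threads currently processing a piece */
};

static void deferred_queue_init(struct deferred_queue *q)
{
	q->head = NULL;
	atomic_init(&q->active_workers, 0);
}

/* Push a piece on the front of the list (list locking omitted here). */
static void deferred_queue_add(struct deferred_queue *q, struct work_piece *w)
{
	assert(w != NULL);
	w->next = q->head;
	q->head = w;
}

/* A worker takes a piece and counts itself active before working on it. */
static struct work_piece *deferred_queue_take(struct deferred_queue *q)
{
	struct work_piece *w = q->head;

	if (w) {
		q->head = w->next;
		atomic_fetch_add(&q->active_workers, 1);
	}
	return w;
}

/* Called by a worker once its piece is finished. */
static void deferred_queue_done(struct deferred_queue *q)
{
	atomic_fetch_sub(&q->active_workers, 1);
}

/* What a notification ioctl could wait for: nothing queued, nobody working. */
static bool deferred_queue_idle(struct deferred_queue *q)
{
	return q->head == NULL && atomic_load(&q->active_workers) == 0;
}
```

The point of keeping the counter next to the list is that an empty list
alone does not mean the work is finished; a worker may still hold the last
piece.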
[PATCH V8 2/8] fs: add field to superblock to support cleancache
This second patch of eight in this cleancache series adds a field to
the generic superblock to squirrel away a pool identifier that is
dynamically provided by cleancache-enabled filesystems at mount time
to uniquely identify files and pages belonging to this mounted
filesystem.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

[v8: trivial merge conflict update]
Signed-off-by: Dan Magenheimer
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
Cc: Andrew Morton
Cc: Al Viro
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Jan Beulich
Cc: Chris Mason
Cc: Andreas Dilger
Cc: Ted Ts'o
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Nitin Gupta

---

Diffstat:
 include/linux/fs.h |    5 +
 1 file changed, 5 insertions(+)

--- linux-2.6.39-rc3/include/linux/fs.h 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/include/linux/fs.h 2011-04-13 16:46:44.444861817 -0600
@@ -1430,6 +1430,11 @@ struct super_block {
 	 */
 	char __rcu *s_options;
 	const struct dentry_operations *s_d_op; /* default d_op for dentries */
+
+	/*
+	 * Saved pool identifier for cleancache (-1 means none)
+	 */
+	int cleancache_poolid;
 };
 
 extern struct timespec current_fs_time(struct super_block *sb);
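As a rough illustration of how the new field is meant to be read, here is
a minimal user-space sketch. struct sb_sketch and both helpers are
hypothetical stand-ins, not kernel code; only the field name and the
"-1 means none" convention come from the patch, and treating every
non-negative value as a valid pool is an assumption made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct super_block; only the new field is modeled. */
struct sb_sketch {
	int cleancache_poolid;	/* -1 means none, as in the patch comment */
};

/* A freshly set up superblock starts without a pool. */
static void sb_sketch_init(struct sb_sketch *sb)
{
	assert(sb != NULL);
	sb->cleancache_poolid = -1;
}

/* Assumption: any non-negative id means the filesystem opted in at mount. */
static bool sb_sketch_has_pool(const struct sb_sketch *sb)
{
	return sb->cleancache_poolid >= 0;
}
```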
[PATCH V8 0/8] Cleancache
This is a courtesy repost to lkml and linux-mm. As of 2.6.39-rc1,
Linus has said that he will review cleancache but hasn't yet, so I am
updating the patchset to the very latest bits. The patchset can be
pulled from:

git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem.git
(branch stable/cleancache-v8-with-tmem)

Version 8 of the cleancache patchset:
- Rebase to 2.6.39-rc3
- Resolve trivial merge conflicts for linux-next
- Adapt to recent remove_from_page_cache patchset by Minchan Kim
- Fix exportfs issue that affected btrfs under certain circumstances
- Change two macros to static inlines (per akpm)
- Minor documentation changes

Signed-off-by: Dan Magenheimer
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
(see individual patches for additional Acks/SOBs etc)

 Documentation/ABI/testing/sysfs-kernel-mm-cleancache |   11
 Documentation/vm/cleancache.txt                      |  278 +++
 fs/btrfs/extent_io.c                                 |    9
 fs/btrfs/super.c                                     |    2
 fs/buffer.c                                          |    5
 fs/ext3/super.c                                      |    2
 fs/ext4/super.c                                      |    2
 fs/mpage.c                                           |    7
 fs/ocfs2/super.c                                     |    2
 fs/super.c                                           |    3
 include/linux/cleancache.h                           |  122
 include/linux/fs.h                                   |    5
 mm/Kconfig                                           |   23 +
 mm/Makefile                                          |    1
 mm/cleancache.c                                      |  244
 mm/filemap.c                                         |   11
 mm/truncate.c                                        |    6
 17 files changed, 733 insertions(+)
[PATCH V8 1/8] mm/fs: cleancache documentation
This patchset introduces cleancache, an optional new feature exposed
by the VFS layer that potentially dramatically increases page cache
effectiveness for many workloads in many environments at a negligible
cost. It does this by providing an interface to transcendent memory,
which is memory/storage that is not otherwise visible to and/or
directly addressable by the kernel.

Instead of discarding clean pages, hooks in the reclaim code "put"
them to cleancache. Filesystems that "opt-in" may "get" pages from
cleancache that were previously put, but pages in cleancache are
"ephemeral", meaning they may disappear at any time. The size of
cleancache is entirely dynamic and unknowable to the kernel.

Filesystems currently supported by this patchset include ext3, ext4,
btrfs, and ocfs2. Other filesystems (especially those built entirely
on VFS) should be easy to add, but should first be thoroughly tested
to ensure coherency.

Details and a FAQ are provided in Documentation/vm/cleancache.txt

This first patch of eight in this cleancache series only adds two
new documentation files.
[v8: minor documentation changes by author]
[v3: a...@linux-foundation.org: document sysfs API]
[v3: h...@infradead.org: move detailed description to Documentation/vm]
Signed-off-by: Dan Magenheimer
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
Acked-by: Andrew Morton
Acked-by: Randy Dunlap
Cc: Al Viro
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Jan Beulich
Cc: Chris Mason
Cc: Andreas Dilger
Cc: Ted Ts'o
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Nitin Gupta

---

Diffstat:
 Documentation/ABI/testing/sysfs-kernel-mm-cleancache |   11
 Documentation/vm/cleancache.txt                      |  278 ++
 2 files changed, 289 insertions(+)

--- linux-2.6.39-rc3/Documentation/ABI/testing/sysfs-kernel-mm-cleancache 1969-12-31 17:00:00.0 -0700
+++ linux-2.6.39-rc3-cleancache/Documentation/ABI/testing/sysfs-kernel-mm-cleancache 2011-04-13 16:44:53.079859372 -0600
@@ -0,0 +1,11 @@
+What:		/sys/kernel/mm/cleancache/
+Date:		April 2011
+Contact:	Dan Magenheimer
+Description:
+		/sys/kernel/mm/cleancache/ contains a number of files which
+		record a count of various cleancache operations
+		(sum across all filesystems):
+			succ_gets
+			failed_gets
+			puts
+			flushes
--- linux-2.6.39-rc3/Documentation/vm/cleancache.txt 1969-12-31 17:00:00.0 -0700
+++ linux-2.6.39-rc3-cleancache/Documentation/vm/cleancache.txt 2011-04-13 16:45:53.581879931 -0600
@@ -0,0 +1,278 @@
+MOTIVATION
+
+Cleancache is a new optional feature provided by the VFS layer that
+potentially dramatically increases page cache effectiveness for
+many workloads in many environments at a negligible cost.
+
+Cleancache can be thought of as a page-granularity victim cache for clean
+pages that the kernel's pageframe replacement algorithm (PFRA) would like
+to keep around, but can't since there isn't enough memory.  So when the
+PFRA "evicts" a page, it first attempts to use cleancache code to
+put the data contained in that page into "transcendent memory", memory
+that is not directly accessible or addressable by the kernel and is
+of unknown and possibly time-varying size.
+
+Later, when a cleancache-enabled filesystem wishes to access a page
+in a file on disk, it first checks cleancache to see if it already
+contains it; if it does, the page of data is copied into the kernel
+and a disk access is avoided.
+
+Transcendent memory "drivers" for cleancache are currently implemented
+in Xen (using hypervisor memory) and zcache (using in-kernel compressed
+memory) and other implementations are in development.
+
+FAQs are included below.
+
+IMPLEMENTATION OVERVIEW
+
+A cleancache "backend" that provides transcendent memory registers itself
+to the kernel's cleancache "frontend" by calling cleancache_register_ops,
+passing a pointer to a cleancache_ops structure with funcs set appropriately.
+Note that cleancache_register_ops returns the previous settings so that
+chaining can be performed if desired. The functions provided must conform to
+certain semantics as follows:
+
+Most important, cleancache is "ephemeral".  Pages which are copied into
+cleancache have an indefinite lifetime which is completely unknowable
+by the kernel and so may or may not still be in cleancache at any later time.
+Thus, as its name implies, cleancache is not suitable for dirty pages.
+Cleancache has complete discretion over what pages to preserve and what
+pages to discard and when.
+
+Mounting a cleancache-enabled filesystem should call "init_fs" to obtain a
+pool id which, if positive, must be saved in the filesystem's superblock;
+a negative return value indicates failure.  A "put_page" will copy a
+(presumably about-to-
[PATCH V8 4/8] mm/fs: add hooks to support cleancache
This fourth patch of eight in this cleancache series provides the
core hooks in VFS for: initializing cleancache per filesystem;
capturing clean pages reclaimed by page cache; attempting to get
pages from cleancache before filesystem read; and ensuring coherency
between pagecache, disk, and cleancache. Note that the placement
of these hooks was stable from 2.6.18 to 2.6.38; a minor semantic
change was required due to a patchset in 2.6.39.

All hooks become no-ops if CONFIG_CLEANCACHE is unset, or become
a check of a boolean global if CONFIG_CLEANCACHE is set but no
cleancache "backend" has claimed cleancache_ops.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

[v8: minchan@gmail.com: adapt to new remove_from_page_cache function]
Signed-off-by: Chris Mason
Signed-off-by: Dan Magenheimer
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
Cc: Andrew Morton
Cc: Al Viro
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Jan Beulich
Cc: Andreas Dilger
Cc: Ted Ts'o
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Nitin Gupta

---

Diffstat:
 fs/buffer.c   |    5 +
 fs/mpage.c    |    7 +++
 fs/super.c    |    3 +++
 mm/filemap.c  |   11 +++
 mm/truncate.c |    6 ++
 5 files changed, 32 insertions(+)

--- linux-2.6.39-rc3/fs/super.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/fs/super.c 2011-04-13 17:08:09.175853426 -0600
@@ -31,6 +31,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 
 
@@ -112,6 +113,7 @@ static struct super_block *alloc_super(s
 		s->s_maxbytes = MAX_NON_LFS;
 		s->s_op = &default_op;
 		s->s_time_gran = 10;
+		s->cleancache_poolid = -1;
 	}
 out:
 	return s;
@@ -177,6 +179,7 @@ void deactivate_locked_super(struct supe
 {
 	struct file_system_type *fs = s->s_type;
 	if (atomic_dec_and_test(&s->s_active)) {
+		cleancache_flush_fs(s);
 		fs->kill_sb(s);
 		/*
 		 * We need to call rcu_barrier so all the delayed rcu free
--- linux-2.6.39-rc3/fs/buffer.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/fs/buffer.c 2011-04-13 17:07:24.700917174 -0600
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include
 
 static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
 
@@ -269,6 +270,10 @@ void invalidate_bdev(struct block_device
 	invalidate_bh_lrus();
 	lru_add_drain_all();	/* make sure all lru add caches are flushed */
 	invalidate_mapping_pages(mapping, 0, -1);
+	/* 99% of the time, we don't need to flush the cleancache on the bdev.
+	 * But, for the strange corners, lets be cautious
+	 */
+	cleancache_flush_inode(mapping);
 }
 EXPORT_SYMBOL(invalidate_bdev);
 
--- linux-2.6.39-rc3/fs/mpage.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/fs/mpage.c 2011-04-13 17:07:24.706913410 -0600
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * I/O completion handler for multipage BIOs.
@@ -271,6 +272,12 @@ do_mpage_readpage(struct bio *bio, struc
 		SetPageMappedToDisk(page);
 	}
 
+	if (fully_mapped && blocks_per_page == 1 && !PageUptodate(page) &&
+	    cleancache_get_page(page) == 0) {
+		SetPageUptodate(page);
+		goto confused;
+	}
+
 	/*
 	 * This page will go to BIO.  Do we need to send this BIO off first?
 	 */
--- linux-2.6.39-rc3/mm/filemap.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/mm/filemap.c 2011-04-13 17:09:46.367852002 -0600
@@ -34,6 +34,7 @@
 #include /* for BUG_ON(!in_atomic()) only */
 #include
 #include /* for page_is_file_cache() */
+#include
 #include "internal.h"
 
 /*
@@ -118,6 +119,16 @@ void __delete_from_page_cache(struct pag
 {
 	struct address_space *mapping = page->mapping;
 
+	/*
+	 * if we're uptodate, flush out into the cleancache, otherwise
+	 * invalidate any existing cleancache entries.  We can't leave
+	 * stale data around in the cleancache once our page is gone
+	 */
+	if (PageUptodate(page) && PageMappedToDisk(page))
+		cleancache_put_page(page);
+	else
+		cleancache_flush_page(mapping, page);
+
 	radix_tree_delete(&mapping->page_tree, page->index);
 	page->mapping = NULL;
 	mapping->nrpages--;
--- linux-2.6.39-rc3/mm/truncate.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/mm/truncate.c 2011-04-13 17:07:24.710911759 -0600
@@ -19,6 +19,7 @@
 #include
 #include /* grr. try_to_release_page, do_inva
[PATCH V8 3/8] mm: cleancache core ops functions and config
This third patch of eight in this cleancache series provides
the core code for cleancache that interfaces between the hooks in
VFS and individual filesystems and a cleancache backend. It also
includes build and config patches.

Two new files are added: mm/cleancache.c and include/linux/cleancache.h.

Note that CONFIG_CLEANCACHE defaults to on; in systems that do
not provide a cleancache backend, all hooks devolve to a simple
check of a global enable flag, so performance impact should
be negligible but can be reduced to zero impact if config'ed off.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

Credits: Cleancache_ops design derived from Jeremy Fitzhardinge
design for tmem

[v8: dan.magenhei...@oracle.com: fix exportfs call affecting btrfs]
[v8: a...@linux-foundation.org: use static inline function, not macro]
[v7: dan.magenhei...@oracle.com: cleanup sysfs and remove cleancache prefix]
[v6: jbeul...@novell.com: robustly handle buggy fs encode_fh actor definition]
[v5: jer...@goop.org: clean up global usage and static var names]
[v5: jer...@goop.org: simplify init hook and any future fs init changes]
[v5: h...@infradead.org: cleaner non-global interface for ops registration]
[v4: adil...@sun.com: interface must support exportfs FS's]
[v4: h...@infradead.org: interface must support 64-bit FS on 32-bit kernel]
[v3: a...@linux-foundation.org: use one ops struct to avoid pointer hops]
[v3: a...@linux-foundation.org: document and ensure PageLocked reqts are met]
[v3: ngu...@vflare.org: fix success/fail codes, change funcs to void]
[v2: v...@zeniv.linux.org.uk: use sane types]
Signed-off-by: Dan Magenheimer
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
Acked-by: Al Viro
Acked-by: Andrew Morton
Acked-by: Nitin Gupta
Acked-by: Minchan Kim
Acked-by: Andreas Dilger
Acked-by: Jan Beulich
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Chris Mason
Cc: Ted Ts'o
Cc: Mark Fasheh
Cc: Joel Becker

---

Diffstat:
 include/linux/cleancache.h |  122 ++
 mm/Kconfig                 |   23 +
 mm/Makefile                |    1
 mm/cleancache.c            |  244 +
 4 files changed, 390 insertions(+)

--- linux-2.6.39-rc3/include/linux/cleancache.h 1969-12-31 17:00:00.0 -0700
+++ linux-2.6.39-rc3-cleancache/include/linux/cleancache.h 2011-04-13 16:59:42.029957587 -0600
@@ -0,0 +1,122 @@
+#ifndef _LINUX_CLEANCACHE_H
+#define _LINUX_CLEANCACHE_H
+
+#include
+#include
+#include
+
+#define CLEANCACHE_KEY_MAX 6
+
+/*
+ * cleancache requires every file with a page in cleancache to have a
+ * unique key unless/until the file is removed/truncated.  For some
+ * filesystems, the inode number is unique, but for "modern" filesystems
+ * an exportable filehandle is required (see exportfs.h)
+ */
+struct cleancache_filekey {
+	union {
+		ino_t ino;
+		__u32 fh[CLEANCACHE_KEY_MAX];
+		u32 key[CLEANCACHE_KEY_MAX];
+	} u;
+};
+
+struct cleancache_ops {
+	int (*init_fs)(size_t);
+	int (*init_shared_fs)(char *uuid, size_t);
+	int (*get_page)(int, struct cleancache_filekey,
+			pgoff_t, struct page *);
+	void (*put_page)(int, struct cleancache_filekey,
+			pgoff_t, struct page *);
+	void (*flush_page)(int, struct cleancache_filekey, pgoff_t);
+	void (*flush_inode)(int, struct cleancache_filekey);
+	void (*flush_fs)(int);
+};
+
+extern struct cleancache_ops
+	cleancache_register_ops(struct cleancache_ops *ops);
+extern void __cleancache_init_fs(struct super_block *);
+extern void __cleancache_init_shared_fs(char *, struct super_block *);
+extern int  __cleancache_get_page(struct page *);
+extern void __cleancache_put_page(struct page *);
+extern void __cleancache_flush_page(struct address_space *, struct page *);
+extern void __cleancache_flush_inode(struct address_space *);
+extern void __cleancache_flush_fs(struct super_block *);
+extern int cleancache_enabled;
+
+#ifdef CONFIG_CLEANCACHE
+static inline bool cleancache_fs_enabled(struct page *page)
+{
+	return page->mapping->host->i_sb->cleancache_poolid >= 0;
+}
+static inline bool cleancache_fs_enabled_mapping(struct address_space *mapping)
+{
+	return mapping->host->i_sb->cleancache_poolid >= 0;
+}
+#else
+#define cleancache_enabled (0)
+#define cleancache_fs_enabled(_page) (0)
+#define cleancache_fs_enabled_mapping(_page) (0)
+#endif
+
+/*
+ * The shim layer provided by these inline functions allows the compiler
+ * to reduce all cleancache hooks to nothingness if CONFIG_CLEANCACHE
+ * is disabled, to a single global variable check if CONFIG_CLEANCACHE
+ * is enabled but no cleancache "backend" has dynamically enabled it,
+ * and, for the most frequent cleancache ops, to a single global variable
+ * check
[PATCH V8 6/8] btrfs: add cleancache support
This sixth patch of eight in this cleancache series "opts-in"
cleancache for btrfs. Filesystems must explicitly enable
cleancache by calling cleancache_init_fs anytime an instance
of the filesystem is mounted. Btrfs uses its own readpage
which must be hooked, but all other cleancache hooks are in
the VFS layer including the matching cleancache_flush_fs
hook which must be called on unmount.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

[v6-v8: no changes]
[v5: jer...@goop.org: simplify init hook and any future fs init changes]
Signed-off-by: Dan Magenheimer
Signed-off-by: Chris Mason
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
Cc: Andrew Morton
Cc: Al Viro
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Jan Beulich
Cc: Andreas Dilger
Cc: Ted Ts'o
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Nitin Gupta

---

Diffstat:
 fs/btrfs/extent_io.c |    9 +
 fs/btrfs/super.c     |    2 ++
 2 files changed, 11 insertions(+)

--- linux-2.6.39-rc3/fs/btrfs/super.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/fs/btrfs/super.c 2011-04-13 17:10:46.357852791 -0600
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include
 #include "compat.h"
 #include "ctree.h"
 #include "disk-io.h"
@@ -610,6 +611,7 @@ static int btrfs_fill_super(struct super
 	sb->s_root = root_dentry;
 
 	save_mount_options(sb, data);
+	cleancache_init_fs(sb);
 	return 0;
 
 fail_close:
--- linux-2.6.39-rc3/fs/btrfs/extent_io.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/fs/btrfs/extent_io.c 2011-04-13 17:10:46.368921914 -0600
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include "extent_io.h"
 #include "extent_map.h"
 #include "compat.h"
@@ -1990,6 +1991,13 @@ static int __extent_read_full_page(struc
 
 	set_page_extent_mapped(page);
 
+	if (!PageUptodate(page)) {
+		if (cleancache_get_page(page) == 0) {
+			BUG_ON(blocksize != PAGE_SIZE);
+			goto out;
+		}
+	}
+
 	end = page_end;
 	while (1) {
 		lock_extent(tree, start, end, GFP_NOFS);
@@ -2117,6 +2125,7 @@ static int __extent_read_full_page(struc
 		cur = cur + iosize;
 		page_offset += iosize;
 	}
+out:
 	if (!nr) {
 		if (!PageError(page))
 			SetPageUptodate(page);
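The readpage hook above follows a "try the cache, fall back to I/O"
pattern. The sketch below mimics that pattern in user space, assuming a
one-slot mock store; page_sketch, mock_get_page, mock_put_page,
mock_disk_read and read_full_page are all hypothetical names, and only
the "0 means the page was filled" return convention is taken from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 16	/* tiny "page" so the example stays readable */

struct page_sketch {
	bool uptodate;
	char data[SKETCH_PAGE_SIZE];
};

/* One-slot mock of an ephemeral store; a real backend may drop pages anytime. */
static struct {
	bool valid;
	char data[SKETCH_PAGE_SIZE];
} mock_cache;

static int disk_reads;	/* counts how often the slow path ran */

/* Same convention as in the patch: 0 means the page was filled. */
static int mock_get_page(struct page_sketch *page)
{
	if (!mock_cache.valid)
		return -1;
	memcpy(page->data, mock_cache.data, SKETCH_PAGE_SIZE);
	return 0;
}

/* Copy a clean page into the store, as reclaim would before dropping it. */
static void mock_put_page(const struct page_sketch *page)
{
	memcpy(mock_cache.data, page->data, SKETCH_PAGE_SIZE);
	mock_cache.valid = true;
}

/* Stand-in for the real block I/O path. */
static void mock_disk_read(struct page_sketch *page)
{
	memset(page->data, 'd', SKETCH_PAGE_SIZE);
	disk_reads++;
}

/* Try the cache first and only fall back to I/O on a miss. */
static void read_full_page(struct page_sketch *page)
{
	assert(page != NULL);
	if (!page->uptodate && mock_get_page(page) != 0)
		mock_disk_read(page);
	page->uptodate = true;
}
```

Because a miss is always a legal answer, the caller never needs to know
whether the store kept the page; it only saves a disk read when it did.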
[PATCH V8 5/8] ext3: add cleancache support
This fifth patch of eight in this cleancache series "opts-in"
cleancache for ext3. Filesystems must explicitly enable
cleancache by calling cleancache_init_fs anytime an instance
of the filesystem is mounted. For ext3, all other cleancache
hooks are in the VFS layer including the matching
cleancache_flush_fs hook which must be called on unmount.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

[v6-v8: no changes]
[v5: jer...@goop.org: simplify init hook and any future fs init changes]
Signed-off-by: Dan Magenheimer
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
Acked-by: Andreas Dilger
Cc: Ted Ts'o
Cc: Andrew Morton
Cc: Al Viro
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Jan Beulich
Cc: Chris Mason
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Nitin Gupta

---

Diffstat:
 fs/ext3/super.c |    2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.39-rc3/fs/ext3/super.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/fs/ext3/super.c 2011-04-13 17:10:40.918915872 -0600
@@ -36,6 +36,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -1367,6 +1368,7 @@ static int ext3_setup_super(struct super
 	} else {
 		ext3_msg(sb, KERN_INFO, "using internal journal");
 	}
+	cleancache_init_fs(sb);
 	return res;
 }
 
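To make the opt-in step concrete, here is a simplified user-space sketch
of what a mount-time init call could do once a backend has registered.
It is not the code from mm/cleancache.c: ops_sketch, mount_sb,
register_ops_sketch, init_fs_sketch and mock_init_fs are hypothetical
names, the page size of 4096 is an assumption, and unlike the real
cleancache_register_ops this version does not return the previous ops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced ops table: only the mount-time call is modeled. */
struct ops_sketch {
	int (*init_fs)(size_t pagesize);	/* pool id, negative on failure */
};

/* Hypothetical stand-in for the superblock. */
struct mount_sb {
	int cleancache_poolid;	/* -1 means none */
};

static struct ops_sketch registered_ops;
static int backend_enabled;	/* stays 0 until a backend registers */

/* Simplified: the real call also hands back the previous ops for chaining. */
static void register_ops_sketch(const struct ops_sketch *ops)
{
	assert(ops != NULL && ops->init_fs != NULL);
	registered_ops = *ops;
	backend_enabled = 1;
}

/* Called once per mount; without a backend it is just a flag check. */
static void init_fs_sketch(struct mount_sb *sb)
{
	if (!backend_enabled)
		return;
	sb->cleancache_poolid = registered_ops.init_fs(4096);
}

/* Mock backend that hands out increasing pool ids. */
static int mock_next_pool;

static int mock_init_fs(size_t pagesize)
{
	(void)pagesize;
	return mock_next_pool++;
}
```

A filesystem that never calls the init hook keeps its pool id at -1, so
the per-page hooks have nothing to do for it.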
[PATCH V8 7/8] ext4: add cleancache support
This seventh patch of eight in this cleancache series "opts-in"
cleancache for ext4. Filesystems must explicitly enable
cleancache by calling cleancache_init_fs anytime an instance
of the filesystem is mounted. For ext4, all other cleancache
hooks are in the VFS layer including the matching
cleancache_flush_fs hook which must be called on unmount.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

[v6-v8: no changes]
[v5: jer...@goop.org: simplify init hook and any future fs init changes]
Signed-off-by: Dan Magenheimer
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
Acked-by: Andreas Dilger
Cc: Ted Ts'o
Cc: Andrew Morton
Cc: Al Viro
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Jan Beulich
Cc: Chris Mason
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Nitin Gupta

---

Diffstat:
 fs/ext4/super.c |    2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.39-rc3/fs/ext4/super.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/fs/ext4/super.c 2011-04-13 17:10:52.708850707 -0600
@@ -38,6 +38,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include
@@ -1932,6 +1933,7 @@ static int ext4_setup_super(struct super
 			EXT4_INODES_PER_GROUP(sb),
 			sbi->s_mount_opt, sbi->s_mount_opt2);
 
+	cleancache_init_fs(sb);
 	return res;
 }
 
[PATCH] xen: cleancache shim to Xen Transcendent Memory
This patch provides a shim between the kernel-internal cleancache
API (see Documentation/vm/cleancache.txt) and the Xen Transcendent
Memory ABI (see http://oss.oracle.com/projects/tmem).

Xen tmem provides "hypervisor RAM" as an ephemeral page-oriented
pseudo-RAM store for cleancache pages, shared cleancache pages,
and frontswap pages. Tmem provides enterprise-quality concurrency,
full save/restore and live migration support, compression
and deduplication.

A presentation showing up to 8% faster performance and up to 52%
reduction in sectors read on a kernel compile workload, despite
aggressive in-kernel page reclamation ("self-ballooning") can be
found at:

http://oss.oracle.com/projects/tmem/dist/documentation/presentations/TranscendentMemoryXenSummit2010.pdf

Signed-off-by: Dan Magenheimer
Reviewed-by: Jeremy Fitzhardinge
Cc: Konrad Rzeszutek Wilk
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Jan Beulich
Cc: Chris Mason
Cc: Andreas Dilger
Cc: Ted Ts'o
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Nitin Gupta

---

Diffstat:
 arch/x86/include/asm/xen/hypercall.h |    7
 drivers/xen/Makefile                 |    1
 drivers/xen/tmem.c                   |  264 +
 include/xen/interface/xen.h          |   22 +
 4 files changed, 294 insertions(+)

diff -Napur -X linux-2.6.39-rc3/Documentation/dontdiff linux-2.6.39-rc3-cleancache/arch/x86/include/asm/xen/hypercall.h linux-2.6.39-rc3-cleancache-xen/arch/x86/include/asm/xen/hypercall.h
--- linux-2.6.39-rc3-cleancache/arch/x86/include/asm/xen/hypercall.h 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache-xen/arch/x86/include/asm/xen/hypercall.h 2011-04-14 09:54:19.210857901 -0600
@@ -447,6 +447,13 @@ HYPERVISOR_hvm_op(int op, void *arg)
 	return _hypercall2(unsigned long, hvm_op, op, arg);
 }
 
+static inline int
+HYPERVISOR_tmem_op(
+	struct tmem_op *op)
+{
+	return _hypercall1(int, tmem_op, op);
+}
+
 static inline void
 MULTI_fpu_taskswitch(struct multicall_entry *mcl, int set)
 {
diff -Napur -X linux-2.6.39-rc3/Documentation/dontdiff linux-2.6.39-rc3-cleancache/drivers/xen/Makefile linux-2.6.39-rc3-cleancache-xen/drivers/xen/Makefile
--- linux-2.6.39-rc3-cleancache/drivers/xen/Makefile 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache-xen/drivers/xen/Makefile 2011-04-14 09:54:19.236096051 -0600
@@ -1,5 +1,6 @@
 obj-y	+= grant-table.o features.o events.o manage.o balloon.o
 obj-y	+= xenbus/
+obj-y	+= tmem.o
 
 nostackp := $(call cc-option, -fno-stack-protector)
 CFLAGS_features.o			:= $(nostackp)
diff -Napur -X linux-2.6.39-rc3/Documentation/dontdiff linux-2.6.39-rc3-cleancache/drivers/xen/tmem.c linux-2.6.39-rc3-cleancache-xen/drivers/xen/tmem.c
--- linux-2.6.39-rc3-cleancache/drivers/xen/tmem.c 1969-12-31 17:00:00.0 -0700
+++ linux-2.6.39-rc3-cleancache-xen/drivers/xen/tmem.c 2011-04-14 09:54:19.236917913 -0600
@@ -0,0 +1,264 @@
+/*
+ * Xen implementation for transcendent memory (tmem)
+ *
+ * Copyright (C) 2009-2010 Oracle Corp.  All rights reserved.
+ * Author: Dan Magenheimer
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+#define TMEM_CONTROL               0
+#define TMEM_NEW_POOL              1
+#define TMEM_DESTROY_POOL          2
+#define TMEM_NEW_PAGE              3
+#define TMEM_PUT_PAGE              4
+#define TMEM_GET_PAGE              5
+#define TMEM_FLUSH_PAGE            6
+#define TMEM_FLUSH_OBJECT          7
+#define TMEM_READ                  8
+#define TMEM_WRITE                 9
+#define TMEM_XCHG                 10
+
+/* Bits for HYPERVISOR_tmem_op(TMEM_NEW_POOL) */
+#define TMEM_POOL_PERSIST          1
+#define TMEM_POOL_SHARED           2
+#define TMEM_POOL_PAGESIZE_SHIFT   4
+#define TMEM_VERSION_SHIFT        24
+
+
+struct tmem_pool_uuid {
+	u64 uuid_lo;
+	u64 uuid_hi;
+};
+
+struct tmem_oid {
+	u64 oid[3];
+};
+
+#define TMEM_POOL_PRIVATE_UUID	{ 0, 0 }
+
+/* flags for tmem_ops.new_pool */
+#define TMEM_POOL_PERSIST          1
+#define TMEM_POOL_SHARED           2
+
+/* xen tmem foundation ops/hypercalls */
+
+static inline int xen_tmem_op(u32 tmem_cmd, u32 tmem_pool, struct tmem_oid oid,
+	u32 index, unsigned long gmfn, u32 tmem_offset, u32 pfn_offset, u32 len)
+{
+	struct tmem_op op;
+	int rc = 0;
+
+	op.cmd = tmem_cmd;
+	op.pool_id = tmem_pool;
+	op.u.gen.oid[0] = oid.oid[0];
+	op.u.gen.oid[1] = oid.oid[1];
+	op.u.gen.oid[2] = oid.oid[2];
+	op.u.gen.index = index;
+	op.u.gen.tmem_offset = tmem_offset;
+	op.u.gen.pfn_offset = pfn_offset;
+	op.u.gen.len = len;
+	set_xen_guest_handle(op.u.gen.gmfn, (void *)gmfn);
+	rc = HYPERVISOR_tmem_op(&op);
+	return rc;
+}
+
+static int xen_tmem_new_pool(struct tmem_pool_uuid uu
[PATCH V8 8/8] ocfs2: add cleancache support
This eighth patch of eight in this cleancache series "opts-in"
cleancache for ocfs2. Clustered filesystems must explicitly enable
cleancache by calling cleancache_init_shared_fs anytime an instance
of the filesystem is mounted. Ocfs2 is currently the only user of
the clustered filesystem interface but nevertheless, the cleancache
hooks in the VFS layer are sufficient for ocfs2 including the matching
cleancache_flush_fs hook which must be called on unmount.

Details and a FAQ can be found in Documentation/vm/cleancache.txt

[v8: trivial merge conflict update]
[v5: jer...@goop.org: simplify init hook and any future fs init changes]
Signed-off-by: Dan Magenheimer
Signed-off-by: Joel Becker
Reviewed-by: Jeremy Fitzhardinge
Reviewed-by: Konrad Rzeszutek Wilk
Cc: Mark Fasheh
Cc: Andrew Morton
Cc: Al Viro
Cc: Matthew Wilcox
Cc: Nick Piggin
Cc: Mel Gorman
Cc: Rik Van Riel
Cc: Jan Beulich
Cc: Chris Mason
Cc: Andreas Dilger
Cc: Ted Ts'o
Cc: Nitin Gupta

---

Diffstat:
 fs/ocfs2/super.c |    2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.39-rc3/fs/ocfs2/super.c 2011-04-11 18:21:51.0 -0600
+++ linux-2.6.39-rc3-cleancache/fs/ocfs2/super.c 2011-04-13 17:11:45.664861458 -0600
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include
 
 #define CREATE_TRACE_POINTS
 #include "ocfs2_trace.h"
@@ -2352,6 +2353,7 @@ static int ocfs2_initialize_super(struct
 		mlog_errno(status);
 		goto bail;
 	}
+	cleancache_init_shared_fs((char *)&uuid_net_key, sb);
 
 bail:
 	return status;
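The patch does not show what a backend does with the uuid it is handed.
One plausible reading, and it is only an assumption here, is that mounts
presenting the same uuid end up attached to the same pool. The sketch
below shows that idea with a hypothetical mock backend; mock_init_shared_fs,
UUID_LEN and MAX_POOLS are invented for the example and are not the
kernel or Xen interface.

```c
#include <assert.h>
#include <string.h>

#define UUID_LEN 16	/* assumed uuid size for this sketch */
#define MAX_POOLS 4	/* arbitrary limit of the mock backend */

static char pool_uuid[MAX_POOLS][UUID_LEN];
static int pool_count;

/*
 * Hypothetical backend call: the same uuid maps to the same pool id, a new
 * uuid gets a new pool, and -1 is returned when the table is full.
 */
static int mock_init_shared_fs(const char *uuid)
{
	assert(uuid != NULL);
	for (int i = 0; i < pool_count; i++)
		if (memcmp(pool_uuid[i], uuid, UUID_LEN) == 0)
			return i;
	if (pool_count == MAX_POOLS)
		return -1;
	memcpy(pool_uuid[pool_count], uuid, UUID_LEN);
	return pool_count++;
}
```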
Re: [PATCH V8 4/8] mm/fs: add hooks to support cleancache
Hi Dan,

On Fri, Apr 15, 2011 at 6:17 AM, Dan Magenheimer wrote:
> [PATCH V8 4/8] mm/fs: add hooks to support cleancache
>
> This fourth patch of eight in this cleancache series provides the
> core hooks in VFS for: initializing cleancache per filesystem;
> capturing clean pages reclaimed by page cache; attempting to get
> pages from cleancache before filesystem read; and ensuring coherency
> between pagecache, disk, and cleancache.  Note that the placement
> of these hooks was stable from 2.6.18 to 2.6.38; a minor semantic
> change was required due to a patchset in 2.6.39.
>
> All hooks become no-ops if CONFIG_CLEANCACHE is unset, or become
> a check of a boolean global if CONFIG_CLEANCACHE is set but no
> cleancache "backend" has claimed cleancache_ops.
>
> Details and a FAQ can be found in Documentation/vm/cleancache.txt
>
> [v8: minchan@gmail.com: adapt to new remove_from_page_cache function]
> Signed-off-by: Chris Mason
> Signed-off-by: Dan Magenheimer
> Reviewed-by: Jeremy Fitzhardinge
> Reviewed-by: Konrad Rzeszutek Wilk
> Cc: Andrew Morton
> Cc: Al Viro
> Cc: Matthew Wilcox
> Cc: Nick Piggin
> Cc: Mel Gorman
> Cc: Rik Van Riel
> Cc: Jan Beulich
> Cc: Andreas Dilger
> Cc: Ted Ts'o
> Cc: Mark Fasheh
> Cc: Joel Becker
> Cc: Nitin Gupta
>
> ---
>
> Diffstat:
>  fs/buffer.c   |    5 +
>  fs/mpage.c    |    7 +++
>  fs/super.c    |    3 +++
>  mm/filemap.c  |   11 +++
>  mm/truncate.c |    6 ++
>  5 files changed, 32 insertions(+)
>
> --- linux-2.6.39-rc3/fs/super.c 2011-04-11 18:21:51.0 -0600
> +++ linux-2.6.39-rc3-cleancache/fs/super.c 2011-04-13 17:08:09.175853426 -0600
> @@ -31,6 +31,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include "internal.h"
>
>
> @@ -112,6 +113,7 @@ static struct super_block *alloc_super(s
>  		s->s_maxbytes = MAX_NON_LFS;
>  		s->s_op = &default_op;
>  		s->s_time_gran = 10;
> +		s->cleancache_poolid = -1;
>  	}
>  out:
>  	return s;
> @@ -177,6 +179,7 @@ void deactivate_locked_super(struct supe
>  {
>  	struct file_system_type *fs = s->s_type;
>  	if (atomic_dec_and_test(&s->s_active)) {
> +		cleancache_flush_fs(s);
>  		fs->kill_sb(s);
>  		/*
>  		 * We need to call rcu_barrier so all the delayed rcu free
> --- linux-2.6.39-rc3/fs/buffer.c 2011-04-11 18:21:51.0 -0600
> +++ linux-2.6.39-rc3-cleancache/fs/buffer.c 2011-04-13 17:07:24.700917174 -0600
> @@ -41,6 +41,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
>
> @@ -269,6 +270,10 @@ void invalidate_bdev(struct block_device
>  	invalidate_bh_lrus();
>  	lru_add_drain_all();	/* make sure all lru add caches are flushed */
>  	invalidate_mapping_pages(mapping, 0, -1);
> +	/* 99% of the time, we don't need to flush the cleancache on the bdev.
> +	 * But, for the strange corners, lets be cautious
> +	 */
> +	cleancache_flush_inode(mapping);
>  }
>  EXPORT_SYMBOL(invalidate_bdev);
>
> --- linux-2.6.39-rc3/fs/mpage.c 2011-04-11 18:21:51.0 -0600
> +++ linux-2.6.39-rc3-cleancache/fs/mpage.c 2011-04-13 17:07:24.706913410 -0600
> @@ -27,6 +27,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * I/O completion handler for multipage BIOs.
> @@ -271,6 +272,12 @@ do_mpage_readpage(struct bio *bio, struc
>  		SetPageMappedToDisk(page);
>  	}
>
> +	if (fully_mapped && blocks_per_page == 1 && !PageUptodate(page) &&
> +	    cleancache_get_page(page) == 0) {
> +		SetPageUptodate(page);
> +		goto confused;
> +	}
> +
>  	/*
>  	 * This page will go to BIO.  Do we need to send this BIO off first?
>  	 */
> --- linux-2.6.39-rc3/mm/filemap.c 2011-04-11 18:21:51.0 -0600
> +++ linux-2.6.39-rc3-cleancache/mm/filemap.c 2011-04-13 17:09:46.367852002 -0600
> @@ -34,6 +34,7 @@
>  #include /* for BUG_ON(!in_atomic()) only */
>  #include
>  #include /* for page_is_file_cache() */
> +#include
>  #include "internal.h"
>
>  /*
> @@ -118,6 +119,16 @@ void __delete_from_page_cache(struct pag
>  {
>  	struct address_space *mapping = page->mapping;
>
> +	/*
> +	 * if we're uptodate, flush out into the cleancache, otherwise
> +	 * invalidate any existing cleancache entries.  We can't leave
> +	 * stale data around in the cleancache once our page is gone
> +	 */
> +	if (PageUptodate(page) && PageMappedToDisk(page))
> +		cleancache_put_page(page);
> +	else
> +		cleancache_flush_page(mapping, page);
> +

First of all, thanks for resolving conflict with my patch.

Before I suggested a thin
[PATCH 1/3] fs: remove FS_COW_FL
FS_COW_FL and FS_NOCOW_FL were newly introduced to control per-file COW
in btrfs, but FS_NOCOW_FL alone is sufficient.

The fact is that we don't have a corresponding BTRFS_INODE_COW flag. COW
is the default, and FS_NOCOW_FL can be used to switch off COW for a
single file.

If we mount btrfs with nodatacow, a newly created file will have the
FS_NOCOW_FL flag set. So to turn COW on for it, we can just clear the
FS_NOCOW_FL flag.

Signed-off-by: Li Zefan
---
 fs/btrfs/ioctl.c   |   15 ++-
 include/linux/fs.h |    1 -
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f580a3a..3240dd9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -144,16 +144,13 @@ static int check_flags(unsigned int flags)
 	if (flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \
 		      FS_NOATIME_FL | FS_NODUMP_FL | \
 		      FS_SYNC_FL | FS_DIRSYNC_FL | \
-		      FS_NOCOMP_FL | FS_COMPR_FL | \
-		      FS_NOCOW_FL | FS_COW_FL))
+		      FS_NOCOMP_FL | FS_COMPR_FL |
+		      FS_NOCOW_FL))
 		return -EOPNOTSUPP;

 	if ((flags & FS_NOCOMP_FL) && (flags & FS_COMPR_FL))
 		return -EINVAL;

-	if ((flags & FS_NOCOW_FL) && (flags & FS_COW_FL))
-		return -EINVAL;
-
 	return 0;
 }

@@ -218,6 +215,10 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 		ip->flags |= BTRFS_INODE_DIRSYNC;
 	else
 		ip->flags &= ~BTRFS_INODE_DIRSYNC;
+	if (flags & FS_NOCOW_FL)
+		ip->flags |= BTRFS_INODE_NODATACOW;
+	else
+		ip->flags &= ~BTRFS_INODE_NODATACOW;

 	/*
 	 * The COMPRESS flag can only be changed by users, while the NOCOMPRESS
@@ -231,10 +232,6 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 		ip->flags |= BTRFS_INODE_COMPRESS;
 		ip->flags &= ~BTRFS_INODE_NOCOMPRESS;
 	}
-	if (flags & FS_NOCOW_FL)
-		ip->flags |= BTRFS_INODE_NODATACOW;
-	else if (flags & FS_COW_FL)
-		ip->flags &= ~BTRFS_INODE_NODATACOW;

 	trans = btrfs_join_transaction(root, 1);
 	BUG_ON(IS_ERR(trans));
diff --git a/include/linux/fs.h b/include/linux/fs.h
index de9dd81..56a4141 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -365,7 +365,6 @@ struct inodes_stat_t {
 #define FS_EXTENT_FL 0x0008 /* Extents */
 #define FS_DIRECTIO_FL 0x0010 /* Use direct i/o */
 #define FS_NOCOW_FL 0x0080 /* Do not cow file */
-#define FS_COW_FL 0x0200 /* Cow file */
 #define FS_RESERVED_FL 0x8000 /* reserved for ext2 lib */

 #define FS_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */
--
1.7.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
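The change in semantics described in this patch can be sketched as a small standalone model. This is only an illustration, not kernel code: the MODEL_* names and their bit values are hypothetical and do not match the real FS_*_FL or BTRFS_INODE_* definitions. It contrasts the old two-flag logic, where COW was re-enabled only when FS_COW_FL was passed, with the new logic, where the inode bit simply mirrors FS_NOCOW_FL.

```c
#include <stdint.h>

/*
 * Hypothetical bit values, chosen only for this illustration. The real
 * definitions live in the kernel headers and are not reproduced here.
 */
#define MODEL_FS_NOCOW_FL      (1u << 0)
#define MODEL_FS_COW_FL        (1u << 1)  /* removed by this patch */
#define MODEL_INODE_NODATACOW  (1u << 4)

/* Old logic: clearing NODATACOW required an explicit FS_COW_FL. */
uint32_t model_apply_nocow_old(uint32_t inode_flags, uint32_t ioctl_flags)
{
	if (ioctl_flags & MODEL_FS_NOCOW_FL)
		inode_flags |= MODEL_INODE_NODATACOW;
	else if (ioctl_flags & MODEL_FS_COW_FL)
		inode_flags &= ~MODEL_INODE_NODATACOW;
	return inode_flags;
}

/* New logic: the inode bit follows FS_NOCOW_FL, set or cleared. */
uint32_t model_apply_nocow_new(uint32_t inode_flags, uint32_t ioctl_flags)
{
	if (ioctl_flags & MODEL_FS_NOCOW_FL)
		inode_flags |= MODEL_INODE_NODATACOW;
	else
		inode_flags &= ~MODEL_INODE_NODATACOW;
	return inode_flags;
}
```

In this model, calling model_apply_nocow_new() with ioctl_flags == 0 on an inode that has MODEL_INODE_NODATACOW set clears the bit, which corresponds to turning COW back on by clearing FS_NOCOW_FL.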
[PATCH 3/3] Btrfs: fix FS_IOC_SETFLAGS ioctl
Steps to reproduce the bug:

- Call FS_IOC_SETFLAGS ioctl with flags=FS_COMPR_FL
- Call FS_IOC_SETFLAGS ioctl with flags=0
- Call FS_IOC_GETFLAGS ioctl, and you'll see FS_COMPR_FL is still set!

Signed-off-by: Li Zefan
---
 fs/btrfs/ioctl.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index aeabf6b..3e7031d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -238,6 +238,8 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 	} else if (flags & FS_COMPR_FL) {
 		ip->flags |= BTRFS_INODE_COMPRESS;
 		ip->flags &= ~BTRFS_INODE_NOCOMPRESS;
+	} else {
+		ip->flags &= ~(BTRFS_INODE_COMPRESS | BTRFS_INODE_NOCOMPRESS);
 	}

 	trans = btrfs_join_transaction(root, 1);
--
1.7.3.1
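The reproduction steps above can be traced in a small userspace model. This is a sketch under assumptions, not the kernel function: the MODEL_* names and bit values are hypothetical, and the first branch (handling the NOCOMP flag) is assumed, since the patch context only shows the FS_COMPR_FL branch. The with_fix argument toggles the else branch added by this patch.

```c
#include <stdint.h>

/* Hypothetical bit values for illustration; not the kernel's. */
#define MODEL_FS_COMPR_FL       (1u << 0)
#define MODEL_FS_NOCOMP_FL      (1u << 1)
#define MODEL_INODE_COMPRESS    (1u << 4)
#define MODEL_INODE_NOCOMPRESS  (1u << 5)

/*
 * Model of the compression part of the setflags path.
 * with_fix == 0: behaviour before the patch (no final else branch).
 * with_fix != 0: behaviour after the patch.
 */
uint32_t model_set_compress(uint32_t ip_flags, uint32_t flags, int with_fix)
{
	if (flags & MODEL_FS_NOCOMP_FL) {
		/* Assumed branch: not visible in the patch context. */
		ip_flags &= ~MODEL_INODE_COMPRESS;
		ip_flags |= MODEL_INODE_NOCOMPRESS;
	} else if (flags & MODEL_FS_COMPR_FL) {
		ip_flags |= MODEL_INODE_COMPRESS;
		ip_flags &= ~MODEL_INODE_NOCOMPRESS;
	} else if (with_fix) {
		/* The fix: neither flag requested, so clear both bits. */
		ip_flags &= ~(MODEL_INODE_COMPRESS | MODEL_INODE_NOCOMPRESS);
	}
	return ip_flags;
}
```

Without the fix, a second call with flags == 0 leaves MODEL_INODE_COMPRESS set, which is the stale FS_COMPR_FL seen in the third step; with the fix, the same call clears it.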
[PATCH 2/3] Btrfs: fix FS_IOC_GETFLAGS ioctl
FS_IOC_GETFLAGS should also report FS_NOCOW_FL, FS_COMPR_FL and
FS_NOCOMP_FL, now that we've added per-file compression/COW support.

Signed-off-by: Li Zefan
---
 fs/btrfs/ioctl.c |    7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3240dd9..aeabf6b 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -81,6 +81,13 @@ static unsigned int btrfs_flags_to_ioctl(unsigned int flags)
 		iflags |= FS_NOATIME_FL;
 	if (flags & BTRFS_INODE_DIRSYNC)
 		iflags |= FS_DIRSYNC_FL;
+	if (flags & BTRFS_INODE_NODATACOW)
+		iflags |= FS_NOCOW_FL;
+
+	if ((flags & BTRFS_INODE_COMPRESS) && !(flags & BTRFS_INODE_NOCOMPRESS))
+		iflags |= FS_COMPR_FL;
+	else if (flags & BTRFS_INODE_NOCOMPRESS)
+		iflags |= FS_NOCOMP_FL;

 	return iflags;
 }
--
1.7.3.1
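The mapping added by this patch can be illustrated with a standalone model of the new lines. The MODEL_* names and bit values below are hypothetical stand-ins, not the kernel's definitions, and only the three flags touched by the patch are modelled.

```c
#include <stdint.h>

/* Hypothetical bit values for illustration; not the kernel's. */
#define MODEL_FS_COMPR_FL       (1u << 0)
#define MODEL_FS_NOCOMP_FL      (1u << 1)
#define MODEL_FS_NOCOW_FL       (1u << 2)
#define MODEL_INODE_NODATACOW   (1u << 4)
#define MODEL_INODE_COMPRESS    (1u << 5)
#define MODEL_INODE_NOCOMPRESS  (1u << 6)

/* Translate modelled inode flags into modelled ioctl flags. */
uint32_t model_flags_to_ioctl(uint32_t flags)
{
	uint32_t iflags = 0;

	if (flags & MODEL_INODE_NODATACOW)
		iflags |= MODEL_FS_NOCOW_FL;

	/* NOCOMPRESS takes precedence when both inode bits are set. */
	if ((flags & MODEL_INODE_COMPRESS) && !(flags & MODEL_INODE_NOCOMPRESS))
		iflags |= MODEL_FS_COMPR_FL;
	else if (flags & MODEL_INODE_NOCOMPRESS)
		iflags |= MODEL_FS_NOCOMP_FL;

	return iflags;
}
```

Note the precedence in the model: if both inode compression bits happen to be set, only the NOCOMP flag is reported, so the two ioctl compression flags are never returned together.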