RE: [PATCH V8 4/8] mm/fs: add hooks to support cleancache

2011-04-26 Thread Dan Magenheimer
 From: Minchan Kim [mailto:minchan@gmail.com]
 On Sat, Apr 16, 2011 at 3:53 AM, Dan Magenheimer
 dan.magenhei...@oracle.com wrote:
  From: OGAWA Hirofumi [mailto:hirof...@mail.parknet.co.jp]
 
  Andrew Morton a...@linux-foundation.org writes:
 
  Andrew, I wonder if you would be so kind as to read the following
  and make a ruling.  If you determine a preferable set of names,
  I will abide by your decision and repost (if necessary).
 
  The problem is this: The English language has a limited number
  of words that can be used to represent data motion and mapping
  and most/all of them are already used in the kernel, often,
  to quote Andrew, in confusing ways.  Complicating this, I
  think the semantics of the cleancache operations are different
  from the semantics of any other kernel operation... intentionally
  so, because the value of cleancache is a direct result of those
  differing semantics.  And the cleancache semantics
  are fairly complex (again intentionally) so a single function
  name can't possibly describe the semantics.
 
  The cleancache operations are:
  - put (page)
  - get (page)
  - flush page
  - flush inode
  - init fs
  - flush fs
 
  I think these names are reasonable representations of the
  semantics of the operations performed... but I'm not a kernel
  expert so there is certainly room for disagreement.  Though I
  absolutely recognize the importance of a name, I am primarily
  interested in merging the semantics of the operations and
  would happily accept any name that kernel developers could
  agree on.  However, I fear that there will be NO name that
  will satisfy all, so would prefer to keep the existing names.
  If some renaming is eventually agreed upon, this could be done
  post-merge.
 
  Here's a brief description of the semantics:
  :
   semantics for other operations elided
  :
  The cleancache operation currently known as get has the
  following semantics:  Derive the filesystem-determined handle
  from this struct page.  If cleancache contains a page matching
  that handle, recreate the page of data from cleancache and
  place the results in the pageframe referred to by the
  struct page.  Then delete in cleancache any record of the
  handle and any data associated with it, so that a
  subsequent get will no longer find a match for the handle;
  any space used for the data can also be freed.
  :
   semantics for other operations elided
  :
 
 At least, I didn't confused your semantics except just flush. That's
 why I suggested only flush but after seeing your explaining, there is
 another thing I want to change. The get/put is common semantic of
 reference counting in kernel but in POV your semantics, it makes sense
 to me but get has a exclusive semantic so I want to represent it with
 API name. Maybe cleancache_get_page_exclusive.
 
 The summary is that I don't want to change all API name. Just two
 thing.
 (I am not sure you and others agree on me. It's just suggestion).
 
 1. cleancache_flush_page - cleancache_[invalidate|remove]_page
 2. cleancache_get_page - cleancache_get_page_exclusive
 

Hi Minchan --

Thanks for continuing to be interested in this and sorry for my
delayed response.

Actually, your comment about get_page_exclusive points out an
incompleteness in my description of the semantics for
cleancache_get_page.

First, I forgot to list cleancache_init_shared_fs, which is
the equivalent of cleancache_init_fs but used for clustered
filesystems.  (Support is included in the patch for ocfs2 but
I haven't played with it in quite some time and my focus has
been on the other filesystems, so it slipped my mind :-}

The cleancache_get_page operation has a slightly different semantics
depending on which of the init_fs calls was used.  However, the
location of the cleancache_get_page hook is the same regardless
of the fs, so the name of the operation must represent both
semantics.  In the case of init_fs (non-shared), the behavior
of cleancache_get_page is that the get is destructive; the page
is removed from cleancache on a successful get.  In the case of
a init_shared_fs, however, the get is non-destructive; the
page is NOT removed from cleancache.  When cleancache contains
pages from multiple kernels (e.g. Xen guests or different machines
in a RAMster cluster), this semantic difference can make a big
performance difference for a clustered filesystem.  Since zcache
only contains pages for a single kernel, the difference is moot.

Because of this, I am hesitant to add exclusive to the
end of the name of the operation.

 BTW, Nice description.
 Please include it in documentation if we can't reach the conclusion.
 It will help others to understand semantic of cleancache.

Thanks!  Nearly all of the description already exists in various
places in the patch but I agree that it would be good if I add
a new section to the Documentation file with the exact semantics.

Dan
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs 

Re: [PATCH V8 4/8] mm/fs: add hooks to support cleancache

2011-04-17 Thread Minchan Kim
On Sat, Apr 16, 2011 at 3:53 AM, Dan Magenheimer
dan.magenhei...@oracle.com wrote:
 From: OGAWA Hirofumi [mailto:hirof...@mail.parknet.co.jp]

 Andrew Morton a...@linux-foundation.org writes:

   Before I suggested a thing about cleancache_flush_page,
   cleancache_flush_inode.
  
   what's the meaning of flush's semantic?
   I thought it means invalidation.
   AFAIC, how about change flush with invalidate?
 
  I'm not sure the words flush and invalidate are defined
  precisely or used consistently everywhere in computer
  science, but I think that invalidate is to destroy
  a pointer to some data, but not necessarily destroy the
  data itself.   And flush means to actually remove
  the data.  So one would invalidate a mapping but one
  would flush a cache.
 
  Since cleancache_flush_page and cleancache_flush_inode
  semantically remove data from cleancache, I think flush
  is a better name than invalidate.
 
  Does that make sense?
 
  nope ;)
 
  Kernel code freely uses flush to refer to both invalidation and to
  writeback, sometimes in confusing ways.  In this case,
  cleancache_flush_inode and cleancache_flush_page rather sound like
 they
  might write those things to backing store.

 I'd like to mention about *_{get,put}_page too. In linux get/put is not
 meaning read/write. There is {get,put}_page those are refcount stuff
 (Yeah, and I felt those methods does refcount by quick read. But it
 seems to be false. There is no xen codes, so I don't know actually
 though.).

 And I agree, I also think the needing thing is consistency on the linux
 codes (term).

 Thanks.
 --
 OGAWA Hirofumi hirof...@mail.parknet.co.jp

 Hmmm, yes, that's a point of confusion also.  No, cleancache put/get
 do not have any relationship with reference counting.

 Andrew, I wonder if you would be so kind as to read the following
 and make a ruling.  If you determine a preferable set of names,
 I will abide by your decision and repost (if necessary).

 The problem is this: The English language has a limited number
 of words that can be used to represent data motion and mapping
 and most/all of them are already used in the kernel, often,
 to quote Andrew, in confusing ways.  Complicating this, I
 think the semantics of the cleancache operations are different
 from the semantics of any other kernel operation... intentionally
 so, because the value of cleancache is a direct result of those
 differing semantics.  And the cleancache semantics
 are fairly complex (again intentionally) so a single function
 name can't possibly describe the semantics.

 The cleancache operations are:
 - put (page)
 - get (page)
 - flush page
 - flush inode
 - init fs
 - flush fs

 I think these names are reasonable representations of the
 semantics of the operations performed... but I'm not a kernel
 expert so there is certainly room for disagreement.  Though I
 absolutely recognize the importance of a name, I am primarily
 interested in merging the semantics of the operations and
 would happily accept any name that kernel developers could
 agree on.  However, I fear that there will be NO name that
 will satisfy all, so would prefer to keep the existing names.
 If some renaming is eventually agreed upon, this could be done
 post-merge.

 Here's a brief description of the semantics:

 The cleancache operation currently known as put has the
 following semantics:  If *possible*, please take the data
 contained in the pageframe referred to by this struct page
 into cleancache and associate it with the filesystem-determined
 handle derived from the struct page.

 The cleancache operation currently known as get has the
 following semantics:  Derive the filesystem-determined handle
 from this struct page.  If cleancache contains a page matching
 that handle, recreate the page of data from cleancache and
 place the results in the pageframe referred to by the
 struct page.  Then delete in cleancache any record of the
 handle and any data associated with it, so that a
 subsequent get will no longer find a match for the handle;
 any space used for the data can also be freed.

 (Note that take the data and recreate the page of data are
 similar in semantics to copy to and copy from, but since
 the cleancache operation may perform an inflight transformation
 on the data, and copy usually means a byte-for-byte replication,
 the word copy is also misleading.)

 The cleancache operation currently known as flush has the
 following semantics:  Derive the filesystem-determined handle
 from this struct page and struct mapping.  If cleancache
 contains a page matching that handle, delete in cleancache any
 record of the handle and any data associated with it, so that a
 subsequent get will no longer find a match for the handle;
 any space used for the data can also be freed

 The cleancache operation currently known as flush inode has
 the following semantics: Derive the filesystem-determined filekey
 from this struct mapping.  If cleancache contains ANY handles
 matching 

RE: [PATCH V8 4/8] mm/fs: add hooks to support cleancache

2011-04-15 Thread Dan Magenheimer
Hi Minchan --

 First of all, thanks for resolving conflict with my patch.

You're welcome!  As I pointed out offlist, yours was the first
change in MM that caused any semantic changes to the cleancache
core hooks patch since before 2.6.18.
 
 Before I suggested a thing about cleancache_flush_page,
 cleancache_flush_inode.
 
 what's the meaning of flush's semantic?
 I thought it means invalidation.
 AFAIC, how about change flush with invalidate?

I'm not sure the words flush and invalidate are defined
precisely or used consistently everywhere in computer
science, but I think that invalidate is to destroy
a pointer to some data, but not necessarily destroy the
data itself.   And flush means to actually remove
the data.  So one would invalidate a mapping but one
would flush a cache.

Since cleancache_flush_page and cleancache_flush_inode
semantically remove data from cleancache, I think flush
is a better name than invalidate.

Does that make sense?

Thanks,
Dan
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V8 4/8] mm/fs: add hooks to support cleancache

2011-04-15 Thread Andrew Morton
On Fri, 15 Apr 2011 07:47:57 -0700 (PDT) Dan Magenheimer 
dan.magenhei...@oracle.com wrote:

 Hi Minchan --
 
  First of all, thanks for resolving conflict with my patch.
 
 You're welcome!  As I pointed out offlist, yours was the first
 change in MM that caused any semantic changes to the cleancache
 core hooks patch since before 2.6.18.
  
  Before I suggested a thing about cleancache_flush_page,
  cleancache_flush_inode.
  
  what's the meaning of flush's semantic?
  I thought it means invalidation.
  AFAIC, how about change flush with invalidate?
 
 I'm not sure the words flush and invalidate are defined
 precisely or used consistently everywhere in computer
 science, but I think that invalidate is to destroy
 a pointer to some data, but not necessarily destroy the
 data itself.   And flush means to actually remove
 the data.  So one would invalidate a mapping but one
 would flush a cache.
 
 Since cleancache_flush_page and cleancache_flush_inode
 semantically remove data from cleancache, I think flush
 is a better name than invalidate.
 
 Does that make sense?
 

nope ;)

Kernel code freely uses flush to refer to both invalidation and to
writeback, sometimes in confusing ways.  In this case,
cleancache_flush_inode and cleancache_flush_page rather sound like they
might write those things to backing store.  
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V8 4/8] mm/fs: add hooks to support cleancache

2011-04-15 Thread OGAWA Hirofumi
Andrew Morton a...@linux-foundation.org writes:

  Before I suggested a thing about cleancache_flush_page,
  cleancache_flush_inode.
  
  what's the meaning of flush's semantic?
  I thought it means invalidation.
  AFAIC, how about change flush with invalidate?
 
 I'm not sure the words flush and invalidate are defined
 precisely or used consistently everywhere in computer
 science, but I think that invalidate is to destroy
 a pointer to some data, but not necessarily destroy the
 data itself.   And flush means to actually remove
 the data.  So one would invalidate a mapping but one
 would flush a cache.
 
 Since cleancache_flush_page and cleancache_flush_inode
 semantically remove data from cleancache, I think flush
 is a better name than invalidate.
 
 Does that make sense?
 

 nope ;)

 Kernel code freely uses flush to refer to both invalidation and to
 writeback, sometimes in confusing ways.  In this case,
 cleancache_flush_inode and cleancache_flush_page rather sound like they
 might write those things to backing store.  

I'd like to mention about *_{get,put}_page too. In linux get/put is not
meaning read/write. There is {get,put}_page those are refcount stuff
(Yeah, and I felt those methods does refcount by quick read. But it
seems to be false. There is no xen codes, so I don't know actually
though.).

And I agree, I also think the needing thing is consistency on the linux
codes (term).

Thanks.
-- 
OGAWA Hirofumi hirof...@mail.parknet.co.jp
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V8 4/8] mm/fs: add hooks to support cleancache

2011-04-15 Thread Dan Magenheimer
 From: OGAWA Hirofumi [mailto:hirof...@mail.parknet.co.jp]
 
 Andrew Morton a...@linux-foundation.org writes:
 
   Before I suggested a thing about cleancache_flush_page,
   cleancache_flush_inode.
  
   what's the meaning of flush's semantic?
   I thought it means invalidation.
   AFAIC, how about change flush with invalidate?
 
  I'm not sure the words flush and invalidate are defined
  precisely or used consistently everywhere in computer
  science, but I think that invalidate is to destroy
  a pointer to some data, but not necessarily destroy the
  data itself.   And flush means to actually remove
  the data.  So one would invalidate a mapping but one
  would flush a cache.
 
  Since cleancache_flush_page and cleancache_flush_inode
  semantically remove data from cleancache, I think flush
  is a better name than invalidate.
 
  Does that make sense?
 
  nope ;)
 
  Kernel code freely uses flush to refer to both invalidation and to
  writeback, sometimes in confusing ways.  In this case,
  cleancache_flush_inode and cleancache_flush_page rather sound like
 they
  might write those things to backing store.
 
 I'd like to mention about *_{get,put}_page too. In linux get/put is not
 meaning read/write. There is {get,put}_page those are refcount stuff
 (Yeah, and I felt those methods does refcount by quick read. But it
 seems to be false. There is no xen codes, so I don't know actually
 though.).
 
 And I agree, I also think the needing thing is consistency on the linux
 codes (term).
 
 Thanks.
 --
 OGAWA Hirofumi hirof...@mail.parknet.co.jp

Hmmm, yes, that's a point of confusion also.  No, cleancache put/get
do not have any relationship with reference counting.

Andrew, I wonder if you would be so kind as to read the following
and make a ruling.  If you determine a preferable set of names,
I will abide by your decision and repost (if necessary).

The problem is this: The English language has a limited number
of words that can be used to represent data motion and mapping
and most/all of them are already used in the kernel, often,
to quote Andrew, in confusing ways.  Complicating this, I
think the semantics of the cleancache operations are different
from the semantics of any other kernel operation... intentionally
so, because the value of cleancache is a direct result of those
differing semantics.  And the cleancache semantics
are fairly complex (again intentionally) so a single function
name can't possibly describe the semantics.

The cleancache operations are:
- put (page)
- get (page)
- flush page
- flush inode
- init fs
- flush fs

I think these names are reasonable representations of the
semantics of the operations performed... but I'm not a kernel
expert so there is certainly room for disagreement.  Though I
absolutely recognize the importance of a name, I am primarily
interested in merging the semantics of the operations and
would happily accept any name that kernel developers could
agree on.  However, I fear that there will be NO name that
will satisfy all, so would prefer to keep the existing names.
If some renaming is eventually agreed upon, this could be done
post-merge.

Here's a brief description of the semantics:

The cleancache operation currently known as put has the
following semantics:  If *possible*, please take the data
contained in the pageframe referred to by this struct page
into cleancache and associate it with the filesystem-determined
handle derived from the struct page.

The cleancache operation currently known as get has the
following semantics:  Derive the filesystem-determined handle
from this struct page.  If cleancache contains a page matching
that handle, recreate the page of data from cleancache and
place the results in the pageframe referred to by the
struct page.  Then delete in cleancache any record of the
handle and any data associated with it, so that a
subsequent get will no longer find a match for the handle;
any space used for the data can also be freed.

(Note that take the data and recreate the page of data are
similar in semantics to copy to and copy from, but since
the cleancache operation may perform an inflight transformation
on the data, and copy usually means a byte-for-byte replication,
the word copy is also misleading.)

The cleancache operation currently known as flush has the
following semantics:  Derive the filesystem-determined handle
from this struct page and struct mapping.  If cleancache
contains a page matching that handle, delete in cleancache any
record of the handle and any data associated with it, so that a
subsequent get will no longer find a match for the handle;
any space used for the data can also be freed

The cleancache operation currently known as flush inode has
the following semantics: Derive the filesystem-determined filekey
from this struct mapping.  If cleancache contains ANY handles
matching that filekey, delete in cleancache any record of
any matching handle and any data associated with those handles;
any space used for the data 

Re: [PATCH V8 4/8] mm/fs: add hooks to support cleancache

2011-04-14 Thread Minchan Kim
Hi Dan,

On Fri, Apr 15, 2011 at 6:17 AM, Dan Magenheimer
dan.magenhei...@oracle.com wrote:
 [PATCH V8 4/8] mm/fs: add hooks to support cleancache

 This fourth patch of eight in this cleancache series provides the
 core hooks in VFS for: initializing cleancache per filesystem;
 capturing clean pages reclaimed by page cache; attempting to get
 pages from cleancache before filesystem read; and ensuring coherency
 between pagecache, disk, and cleancache.  Note that the placement
 of these hooks was stable from 2.6.18 to 2.6.38; a minor semantic
 change was required due to a patchset in 2.6.39.

 All hooks become no-ops if CONFIG_CLEANCACHE is unset, or become
 a check of a boolean global if CONFIG_CLEANCACHE is set but no
 cleancache backend has claimed cleancache_ops.

 Details and a FAQ can be found in Documentation/vm/cleancache.txt

 [v8: minchan@gmail.com: adapt to new remove_from_page_cache function]
 Signed-off-by: Chris Mason chris.ma...@oracle.com
 Signed-off-by: Dan Magenheimer dan.magenhei...@oracle.com
 Reviewed-by: Jeremy Fitzhardinge jer...@goop.org
 Reviewed-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 Cc: Andrew Morton a...@linux-foundation.org
 Cc: Al Viro v...@zeniv.linux.org.uk
 Cc: Matthew Wilcox matt...@wil.cx
 Cc: Nick Piggin npig...@kernel.dk
 Cc: Mel Gorman m...@csn.ul.ie
 Cc: Rik Van Riel r...@redhat.com
 Cc: Jan Beulich jbeul...@novell.com
 Cc: Andreas Dilger adil...@sun.com
 Cc: Ted Ts'o ty...@mit.edu
 Cc: Mark Fasheh mfas...@suse.com
 Cc: Joel Becker joel.bec...@oracle.com
 Cc: Nitin Gupta ngu...@vflare.org

 ---

 Diffstat:
  fs/buffer.c                              |    5 +
  fs/mpage.c                               |    7 +++
  fs/super.c                               |    3 +++
  mm/filemap.c                             |   11 +++
  mm/truncate.c                            |    6 ++
  5 files changed, 32 insertions(+)

 --- linux-2.6.39-rc3/fs/super.c 2011-04-11 18:21:51.0 -0600
 +++ linux-2.6.39-rc3-cleancache/fs/super.c      2011-04-13 17:08:09.175853426 
 -0600
 @@ -31,6 +31,7 @@
  #include linux/mutex.h
  #include linux/backing-dev.h
  #include linux/rculist_bl.h
 +#include linux/cleancache.h
  #include internal.h


 @@ -112,6 +113,7 @@ static struct super_block *alloc_super(s
                s-s_maxbytes = MAX_NON_LFS;
                s-s_op = default_op;
                s-s_time_gran = 10;
 +               s-cleancache_poolid = -1;
        }
  out:
        return s;
 @@ -177,6 +179,7 @@ void deactivate_locked_super(struct supe
  {
        struct file_system_type *fs = s-s_type;
        if (atomic_dec_and_test(s-s_active)) {
 +               cleancache_flush_fs(s);
                fs-kill_sb(s);
                /*
                 * We need to call rcu_barrier so all the delayed rcu free
 --- linux-2.6.39-rc3/fs/buffer.c        2011-04-11 18:21:51.0 -0600
 +++ linux-2.6.39-rc3-cleancache/fs/buffer.c     2011-04-13 17:07:24.700917174 
 -0600
 @@ -41,6 +41,7 @@
  #include linux/bitops.h
  #include linux/mpage.h
  #include linux/bit_spinlock.h
 +#include linux/cleancache.h

  static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);

 @@ -269,6 +270,10 @@ void invalidate_bdev(struct block_device
        invalidate_bh_lrus();
        lru_add_drain_all();    /* make sure all lru add caches are flushed */
        invalidate_mapping_pages(mapping, 0, -1);
 +       /* 99% of the time, we don't need to flush the cleancache on the bdev.
 +        * But, for the strange corners, lets be cautious
 +        */
 +       cleancache_flush_inode(mapping);
  }
  EXPORT_SYMBOL(invalidate_bdev);

 --- linux-2.6.39-rc3/fs/mpage.c 2011-04-11 18:21:51.0 -0600
 +++ linux-2.6.39-rc3-cleancache/fs/mpage.c      2011-04-13 17:07:24.706913410 
 -0600
 @@ -27,6 +27,7 @@
  #include linux/writeback.h
  #include linux/backing-dev.h
  #include linux/pagevec.h
 +#include linux/cleancache.h

  /*
  * I/O completion handler for multipage BIOs.
 @@ -271,6 +272,12 @@ do_mpage_readpage(struct bio *bio, struc
                SetPageMappedToDisk(page);
        }

 +       if (fully_mapped  blocks_per_page == 1  !PageUptodate(page) 
 +           cleancache_get_page(page) == 0) {
 +               SetPageUptodate(page);
 +               goto confused;
 +       }
 +
        /*
         * This page will go to BIO.  Do we need to send this BIO off first?
         */
 --- linux-2.6.39-rc3/mm/filemap.c       2011-04-11 18:21:51.0 -0600
 +++ linux-2.6.39-rc3-cleancache/mm/filemap.c    2011-04-13 17:09:46.367852002 
 -0600
 @@ -34,6 +34,7 @@
  #include linux/hardirq.h /* for BUG_ON(!in_atomic()) only */
  #include linux/memcontrol.h
  #include linux/mm_inline.h /* for page_is_file_cache() */
 +#include linux/cleancache.h
  #include internal.h

  /*
 @@ -118,6 +119,16 @@ void __delete_from_page_cache(struct pag
  {
        struct address_space *mapping = page-mapping;

 +       /*
 +        * if we're uptodate, flush out into the