Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-22 Thread Dave Hansen
On 03/22/2013 03:34 AM, Kirill A. Shutemov wrote:
> Dave Hansen wrote:
>> On 03/14/2013 10:50 AM, Kirill A. Shutemov wrote:
> >> > +   error = radix_tree_insert(&mapping->page_tree,
>>> +   offset + i, page + i);
>>> +   if (error) {
>>> +   page_cache_release(page + i);
>>> +   break;
>>> +   }
>>> }
>>
>> Throughout all this new code, I'd really challenge you to try as much as
>> possible to minimize the code stuck under "if (PageTransHuge(page))".
> 
> I put thp-related code under the 'if' intentionally to be able to optimize
> it out if !CONFIG_TRANSPARENT_HUGEPAGE. The config option is disabled by
> default.

You've gotta give the compiler some credit! :)  In this function's case,
the compiler should be able to realize that nr=1 is constant if
!TRANSPARENT_HUGEPAGE.  It'll realize that all your for loops are:

for (i = 0; i < 1; i++) {

and will end up generating _very_ similar code to what you get with the
explicit #ifdefs.  You already _have_ #ifdefs, but they're just up in
the headers around PageTransHuge()'s definition.

The advantages for you are *huge* since it means that folks will have a
harder time inadvertently breaking your CONFIG_TRANSPARENT_HUGEPAGE code.

>> Does the cgroup code know how to handle these large pages internally
>> somehow?  It looks like the charge/uncharge is only being done for the
>> head page.
> 
> It can. We only need to remove PageCompound() check there. Patch is in
> git.

OK, cool.  This _might_ deserve a comment or something here.  Again, it
looks asymmetric and fishy to someone just reading the code.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-22 Thread Kirill A. Shutemov
Dave Hansen wrote:
> On 03/14/2013 10:50 AM, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" 
> > 
> > For huge page we add to radix tree HPAGE_CACHE_NR pages at once: head
> > page for the specified index and HPAGE_CACHE_NR-1 tail pages for
> > following indexes.
> > 
> > Signed-off-by: Kirill A. Shutemov 
> > ---
> >  mm/filemap.c |   76 
> > --
> >  1 file changed, 53 insertions(+), 23 deletions(-)
> > 
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 2d99191..6bac9e2 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -447,6 +447,7 @@ int add_to_page_cache_locked(struct page *page, struct 
> > address_space *mapping,
> > pgoff_t offset, gfp_t gfp_mask)
> >  {
> > int error;
> > +   int nr = 1;
> >  
> > VM_BUG_ON(!PageLocked(page));
> > VM_BUG_ON(PageSwapBacked(page));
> > @@ -454,32 +455,61 @@ int add_to_page_cache_locked(struct page *page, 
> > struct address_space *mapping,
> > error = mem_cgroup_cache_charge(page, current->mm,
> > gfp_mask & GFP_RECLAIM_MASK);
> > if (error)
> > -   goto out;
> > +   return error;
> >  
> > -   error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> > -   if (error == 0) {
> > -   page_cache_get(page);
> > -   page->mapping = mapping;
> > -   page->index = offset;
> > +   if (PageTransHuge(page)) {
> > +   BUILD_BUG_ON(HPAGE_CACHE_NR > RADIX_TREE_PRELOAD_NR);
> > +   nr = HPAGE_CACHE_NR;
> > +   }
> 
> That seems like a slightly odd place to put a BUILD_BUG_ON().  I guess
> it doesn't matter to some degree, but does putting it inside the if()
> imply anything?

It actually matters.

HPAGE_CACHE_NR is BUILD_BUG() if !CONFIG_TRANSPARENT_HUGEPAGE, so we need
to hide it inside 'if (PageTransHuge(page))'. PageTransHuge(page) is 0 at
compile time if !CONFIG_TRANSPARENT_HUGEPAGE, so the compiler can be smart
and optimize out the check.

> > +   error = radix_tree_preload_count(nr, gfp_mask & ~__GFP_HIGHMEM);
> > +   if (error) {
> > +   mem_cgroup_uncharge_cache_page(page);
> > +   return error;
> > +   }
> >  
> > -   spin_lock_irq(&mapping->tree_lock);
> > -   error = radix_tree_insert(&mapping->page_tree, offset, page);
> > -   if (likely(!error)) {
> > -   mapping->nrpages++;
> > -   __inc_zone_page_state(page, NR_FILE_PAGES);
> > -   spin_unlock_irq(&mapping->tree_lock);
> > -   trace_mm_filemap_add_to_page_cache(page);
> > -   } else {
> > -   page->mapping = NULL;
> > -   /* Leave page->index set: truncation relies upon it */
> > -   spin_unlock_irq(&mapping->tree_lock);
> > -   mem_cgroup_uncharge_cache_page(page);
> > -   page_cache_release(page);
> 
> I do really like how this rewrite de-indents this code. :)

:)

> > +   page_cache_get(page);
> > +   spin_lock_irq(&mapping->tree_lock);
> > +   page->mapping = mapping;
> > +   page->index = offset;
> > +   error = radix_tree_insert(&mapping->page_tree, offset, page);
> > +   if (unlikely(error))
> > +   goto err;
> > +   if (PageTransHuge(page)) {
> > +   int i;
> > +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
> > +   page_cache_get(page + i);
> > +   page[i].index = offset + i;
> 
> Is it OK to leave page->mapping unset for these?

Good catch, thanks.
Seems nobody really uses it, since I haven't gotten any oops, but we need
to set it anyway.

> > +   error = radix_tree_insert(&mapping->page_tree,
> > +   offset + i, page + i);
> > +   if (error) {
> > +   page_cache_release(page + i);
> > +   break;
> > +   }
> > }
> 
> Throughout all this new code, I'd really challenge you to try as much as
> possible to minimize the code stuck under "if (PageTransHuge(page))".

I put thp-related code under the 'if' intentionally to be able to optimize
it out if !CONFIG_TRANSPARENT_HUGEPAGE. The config option is disabled by
default.

> For instance, could you change the for() loop a bit and have it shared
> between both cases, like:
> 
> > +   for (i = 0; i < nr; i++) {
> > +   page_cache_get(page + i);
> > +   page[i].index = offset + i;
> > +   error = radix_tree_insert(&mapping->page_tree,
> > +   offset + i, page + i);
> > +   if (error) {
> > +   page_cache_release(page + i);
> > +   break;
> > +   }
> > }
> 
> > -   radix_tree_preload_end();
> > -   } else
> > -   mem_cgroup_uncharge_cache_page(page);
> > -out:
> > +   if (error) {
> > +   error = ENOSPC; /* no space for a huge page */
> > +   for (i--; i > 0; i--) {
> > +   radix_tree_delete(&mapping->page_tree,
> > +   offset + i);

Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-21 Thread Dave Hansen
On 03/14/2013 10:50 AM, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" 
> 
> For huge page we add to radix tree HPAGE_CACHE_NR pages at once: head
> page for the specified index and HPAGE_CACHE_NR-1 tail pages for
> following indexes.
> 
> Signed-off-by: Kirill A. Shutemov 
> ---
>  mm/filemap.c |   76 
> --
>  1 file changed, 53 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 2d99191..6bac9e2 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -447,6 +447,7 @@ int add_to_page_cache_locked(struct page *page, struct 
> address_space *mapping,
>   pgoff_t offset, gfp_t gfp_mask)
>  {
>   int error;
> + int nr = 1;
>  
>   VM_BUG_ON(!PageLocked(page));
>   VM_BUG_ON(PageSwapBacked(page));
> @@ -454,32 +455,61 @@ int add_to_page_cache_locked(struct page *page, struct 
> address_space *mapping,
>   error = mem_cgroup_cache_charge(page, current->mm,
>   gfp_mask & GFP_RECLAIM_MASK);
>   if (error)
> - goto out;
> + return error;
>  
> - error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> - if (error == 0) {
> - page_cache_get(page);
> - page->mapping = mapping;
> - page->index = offset;
> + if (PageTransHuge(page)) {
> + BUILD_BUG_ON(HPAGE_CACHE_NR > RADIX_TREE_PRELOAD_NR);
> + nr = HPAGE_CACHE_NR;
> + }

That seems like a slightly odd place to put a BUILD_BUG_ON().  I guess
it doesn't matter to some degree, but does putting it inside the if()
imply anything?

> + error = radix_tree_preload_count(nr, gfp_mask & ~__GFP_HIGHMEM);
> + if (error) {
> + mem_cgroup_uncharge_cache_page(page);
> + return error;
> + }
>  
> - spin_lock_irq(&mapping->tree_lock);
> - error = radix_tree_insert(&mapping->page_tree, offset, page);
> - if (likely(!error)) {
> - mapping->nrpages++;
> - __inc_zone_page_state(page, NR_FILE_PAGES);
> - spin_unlock_irq(&mapping->tree_lock);
> - trace_mm_filemap_add_to_page_cache(page);
> - } else {
> - page->mapping = NULL;
> - /* Leave page->index set: truncation relies upon it */
> - spin_unlock_irq(&mapping->tree_lock);
> - mem_cgroup_uncharge_cache_page(page);
> - page_cache_release(page);

I do really like how this rewrite de-indents this code. :)

> + page_cache_get(page);
> + spin_lock_irq(&mapping->tree_lock);
> + page->mapping = mapping;
> + page->index = offset;
> + error = radix_tree_insert(&mapping->page_tree, offset, page);
> + if (unlikely(error))
> + goto err;
> + if (PageTransHuge(page)) {
> + int i;
> + for (i = 1; i < HPAGE_CACHE_NR; i++) {
> + page_cache_get(page + i);
> + page[i].index = offset + i;

Is it OK to leave page->mapping unset for these?

> + error = radix_tree_insert(&mapping->page_tree,
> + offset + i, page + i);
> + if (error) {
> + page_cache_release(page + i);
> + break;
> + }
>   }

Throughout all this new code, I'd really challenge you to try as much as
possible to minimize the code stuck under "if (PageTransHuge(page))".

For instance, could you change the for() loop a bit and have it shared
between both cases, like:

> + for (i = 0; i < nr; i++) {
> + page_cache_get(page + i);
> + page[i].index = offset + i;
> + error = radix_tree_insert(&mapping->page_tree,
> + offset + i, page + i);
> + if (error) {
> + page_cache_release(page + i);
> + break;
> + }
>   }

> - radix_tree_preload_end();
> - } else
> - mem_cgroup_uncharge_cache_page(page);
> -out:
> + if (error) {
> + error = ENOSPC; /* no space for a huge page */
> + for (i--; i > 0; i--) {
> + radix_tree_delete(&mapping->page_tree,
> + offset + i);
> + page_cache_release(page + i);
> + }
> + radix_tree_delete(&mapping->page_tree, offset);

I wonder if this would look any nicer if you just did all the
page_cache_get()s for the entire huge page along with the head page, and
then released them all in one place.  I think it might shrink the error
handling paths here.

> + goto err;
> + }
> + }
> + __mod_zone_page_state(page_zone(page), NR_FILE_PAGES, nr);
> + mapping->nrpages += nr;
> + 

Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Kirill A. Shutemov
Hillf Danton wrote:
> On Fri, Mar 15, 2013 at 9:50 PM, Kirill A. Shutemov
>  wrote:
> > Hillf Danton wrote:
> >> On Fri, Mar 15, 2013 at 9:23 PM, Kirill A. Shutemov
> >>  wrote:
> >> > Hillf Danton wrote:
> >> >> On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
> >> >>  wrote:
> >> >> > +   page_cache_get(page);
> >> >> > +   spin_lock_irq(&mapping->tree_lock);
> >> >> > +   page->mapping = mapping;
> >> >> > +   page->index = offset;
> >> >> > +   error = radix_tree_insert(&mapping->page_tree, offset, page);
> >> >> > +   if (unlikely(error))
> >> >> > +   goto err;
> >> >> > +   if (PageTransHuge(page)) {
> >> >> > +   int i;
> >> >> > +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
> >> >>   struct page *tail = page + i; to ease the reader
> >> >>
> >> >> > +   page_cache_get(page + i);
> >> >> s/page_cache_get/get_page_foll/ ?
> >> >
> >> > Why?
> >> >
> >> see follow_trans_huge_pmd() please.
> >
> > Sorry, I fail to see how follow_trans_huge_pmd() is relevant here.
> > Could you elaborate?
> >
> Let's see the code
> 
>   page += (addr & ~HPAGE_PMD_MASK) >> PAGE_SHIFT;
> //page is tail now
>   VM_BUG_ON(!PageCompound(page));
>   if (flags & FOLL_GET)
>   get_page_foll(page);
> //raise page count with the foll function
> 
> thus I raised question.

get_page_foll() is designed to be part of the follow_page*() call chain.
get_page() can handle compound pages properly.

-- 
 Kirill A. Shutemov


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Hillf Danton
On Fri, Mar 15, 2013 at 9:50 PM, Kirill A. Shutemov
 wrote:
> Hillf Danton wrote:
>> On Fri, Mar 15, 2013 at 9:23 PM, Kirill A. Shutemov
>>  wrote:
>> > Hillf Danton wrote:
>> >> On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
>> >>  wrote:
>> >> > +   page_cache_get(page);
>> >> > +   spin_lock_irq(&mapping->tree_lock);
>> >> > +   page->mapping = mapping;
>> >> > +   page->index = offset;
>> >> > +   error = radix_tree_insert(&mapping->page_tree, offset, page);
>> >> > +   if (unlikely(error))
>> >> > +   goto err;
>> >> > +   if (PageTransHuge(page)) {
>> >> > +   int i;
>> >> > +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
>> >>   struct page *tail = page + i; to ease the reader
>> >>
>> >> > +   page_cache_get(page + i);
>> >> s/page_cache_get/get_page_foll/ ?
>> >
>> > Why?
>> >
>> see follow_trans_huge_pmd() please.
>
> Sorry, I fail to see how follow_trans_huge_pmd() is relevant here.
> Could you elaborate?
>
Let's see the code

page += (addr & ~HPAGE_PMD_MASK) >> PAGE_SHIFT;
//page is tail now
VM_BUG_ON(!PageCompound(page));
if (flags & FOLL_GET)
get_page_foll(page);
//raise page count with the foll function

thus I raised question.


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Kirill A. Shutemov
Hillf Danton wrote:
> On Fri, Mar 15, 2013 at 9:23 PM, Kirill A. Shutemov
>  wrote:
> > Hillf Danton wrote:
> >> On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
> >>  wrote:
> >> > +   page_cache_get(page);
> >> > +   spin_lock_irq(&mapping->tree_lock);
> >> > +   page->mapping = mapping;
> >> > +   page->index = offset;
> >> > +   error = radix_tree_insert(&mapping->page_tree, offset, page);
> >> > +   if (unlikely(error))
> >> > +   goto err;
> >> > +   if (PageTransHuge(page)) {
> >> > +   int i;
> >> > +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
> >>   struct page *tail = page + i; to ease the reader
> >>
> >> > +   page_cache_get(page + i);
> >> s/page_cache_get/get_page_foll/ ?
> >
> > Why?
> >
> see follow_trans_huge_pmd() please.

Sorry, I fail to see how follow_trans_huge_pmd() is relevant here.
Could you elaborate?

-- 
 Kirill A. Shutemov


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Hillf Danton
On Fri, Mar 15, 2013 at 9:23 PM, Kirill A. Shutemov
 wrote:
> Hillf Danton wrote:
>> On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
>>  wrote:
>> > +   page_cache_get(page);
>> > +   spin_lock_irq(&mapping->tree_lock);
>> > +   page->mapping = mapping;
>> > +   page->index = offset;
>> > +   error = radix_tree_insert(&mapping->page_tree, offset, page);
>> > +   if (unlikely(error))
>> > +   goto err;
>> > +   if (PageTransHuge(page)) {
>> > +   int i;
>> > +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
>>   struct page *tail = page + i; to ease the reader
>>
>> > +   page_cache_get(page + i);
>> s/page_cache_get/get_page_foll/ ?
>
> Why?
>
see follow_trans_huge_pmd() please.


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Kirill A. Shutemov
Hillf Danton wrote:
> On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
>  wrote:
> > +   page_cache_get(page);
> > +   spin_lock_irq(&mapping->tree_lock);
> > +   page->mapping = mapping;
> > +   page->index = offset;
> > +   error = radix_tree_insert(&mapping->page_tree, offset, page);
> > +   if (unlikely(error))
> > +   goto err;
> > +   if (PageTransHuge(page)) {
> > +   int i;
> > +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
>   struct page *tail = page + i; to ease the reader
> 
> > +   page_cache_get(page + i);
> s/page_cache_get/get_page_foll/ ?

Why?

> > +   page[i].index = offset + i;
> > +   error = radix_tree_insert(&mapping->page_tree,
> > +   offset + i, page + i);
> > +   if (error) {
> > +   page_cache_release(page + i);
> > +   break;
> > +   }
> > }
> > -   radix_tree_preload_end();
> > -   } else
> > -   mem_cgroup_uncharge_cache_page(page);
> > -out:
> > +   if (error) {
> > +   error = ENOSPC; /* no space for a huge page */
> s/E/-E/

Good catch! Thanks.

> > +   for (i--; i > 0; i--) {
> > +   radix_tree_delete(&mapping->page_tree,
> > +   offset + i);
> > +   page_cache_release(page + i);
> > +   }
> > +   radix_tree_delete(&mapping->page_tree, offset);
> > +   goto err;
> > +   }
> > +   }

-- 
 Kirill A. Shutemov


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Kirill A. Shutemov
Hillf Danton wrote:
 On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
 kirill.shute...@linux.intel.com wrote:
  +   page_cache_get(page);
  +   spin_lock_irq(mapping-tree_lock);
  +   page-mapping = mapping;
  +   page-index = offset;
  +   error = radix_tree_insert(mapping-page_tree, offset, page);
  +   if (unlikely(error))
  +   goto err;
  +   if (PageTransHuge(page)) {
  +   int i;
  +   for (i = 1; i  HPAGE_CACHE_NR; i++) {
   struct page *tail = page + i; to easy reader
 
  +   page_cache_get(page + i);
 s/page_cache_get/get_page_foll/ ?

Why?

  +   page[i].index = offset + i;
  +   error = radix_tree_insert(mapping-page_tree,
  +   offset + i, page + i);
  +   if (error) {
  +   page_cache_release(page + i);
  +   break;
  +   }
  }
  -   radix_tree_preload_end();
  -   } else
  -   mem_cgroup_uncharge_cache_page(page);
  -out:
  +   if (error) {
  +   error = ENOSPC; /* no space for a huge page */
 s/E/-E/

Good catch! Thanks.

  +   for (i--; i  0; i--) {
  +   radix_tree_delete(mapping-page_tree,
  +   offset + i);
  +   page_cache_release(page + i);
  +   }
  +   radix_tree_delete(mapping-page_tree, offset);
  +   goto err;
  +   }
  +   }

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Hillf Danton
On Fri, Mar 15, 2013 at 9:23 PM, Kirill A. Shutemov
kirill.shute...@linux.intel.com wrote:
 Hillf Danton wrote:
 On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
 kirill.shute...@linux.intel.com wrote:
  +   page_cache_get(page);
  +   spin_lock_irq(mapping-tree_lock);
  +   page-mapping = mapping;
  +   page-index = offset;
  +   error = radix_tree_insert(mapping-page_tree, offset, page);
  +   if (unlikely(error))
  +   goto err;
  +   if (PageTransHuge(page)) {
  +   int i;
  +   for (i = 1; i  HPAGE_CACHE_NR; i++) {
   struct page *tail = page + i; to easy reader

  +   page_cache_get(page + i);
 s/page_cache_get/get_page_foll/ ?

 Why?

see follow_trans_huge_pmd() please.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Kirill A. Shutemov
Hillf Danton wrote:
 On Fri, Mar 15, 2013 at 9:23 PM, Kirill A. Shutemov
 kirill.shute...@linux.intel.com wrote:
  Hillf Danton wrote:
  On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
  kirill.shute...@linux.intel.com wrote:
   +   page_cache_get(page);
   +   spin_lock_irq(mapping-tree_lock);
   +   page-mapping = mapping;
   +   page-index = offset;
   +   error = radix_tree_insert(mapping-page_tree, offset, page);
   +   if (unlikely(error))
   +   goto err;
   +   if (PageTransHuge(page)) {
   +   int i;
   +   for (i = 1; i  HPAGE_CACHE_NR; i++) {
struct page *tail = page + i; to easy reader
 
   +   page_cache_get(page + i);
  s/page_cache_get/get_page_foll/ ?
 
  Why?
 
 see follow_trans_huge_pmd() please.

Sorry, I fail to see how follow_trans_huge_pmd() is relevant here.
Could you elaborate?

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Hillf Danton
On Fri, Mar 15, 2013 at 9:50 PM, Kirill A. Shutemov
<kirill.shute...@linux.intel.com> wrote:
> Hillf Danton wrote:
>> On Fri, Mar 15, 2013 at 9:23 PM, Kirill A. Shutemov
>> <kirill.shute...@linux.intel.com> wrote:
>>> Hillf Danton wrote:
>>>> On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
>>>> <kirill.shute...@linux.intel.com> wrote:
>>>>> +   page_cache_get(page);
>>>>> +   spin_lock_irq(&mapping->tree_lock);
>>>>> +   page->mapping = mapping;
>>>>> +   page->index = offset;
>>>>> +   error = radix_tree_insert(&mapping->page_tree, offset, page);
>>>>> +   if (unlikely(error))
>>>>> +   goto err;
>>>>> +   if (PageTransHuge(page)) {
>>>>> +   int i;
>>>>> +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
>>>> struct page *tail = page + i; to easy reader
>>>>
>>>>> +   page_cache_get(page + i);
>>>> s/page_cache_get/get_page_foll/ ?
>>>
>>> Why?
>>
>> see follow_trans_huge_pmd() please.
>
> Sorry, I fail to see how follow_trans_huge_pmd() is relevant here.
> Could you elaborate?

Let's see the code

        page += (addr & ~HPAGE_PMD_MASK) >> PAGE_SHIFT;
//page is tail now
        VM_BUG_ON(!PageCompound(page));
        if (flags & FOLL_GET)
                get_page_foll(page);
//raise page count with the foll function

thus I raised the question.


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-15 Thread Kirill A. Shutemov
Hillf Danton wrote:
> On Fri, Mar 15, 2013 at 9:50 PM, Kirill A. Shutemov
> <kirill.shute...@linux.intel.com> wrote:
>> Hillf Danton wrote:
>>> On Fri, Mar 15, 2013 at 9:23 PM, Kirill A. Shutemov
>>> <kirill.shute...@linux.intel.com> wrote:
>>>> Hillf Danton wrote:
>>>>> On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
>>>>> <kirill.shute...@linux.intel.com> wrote:
>>>>>> +   page_cache_get(page);
>>>>>> +   spin_lock_irq(&mapping->tree_lock);
>>>>>> +   page->mapping = mapping;
>>>>>> +   page->index = offset;
>>>>>> +   error = radix_tree_insert(&mapping->page_tree, offset, page);
>>>>>> +   if (unlikely(error))
>>>>>> +   goto err;
>>>>>> +   if (PageTransHuge(page)) {
>>>>>> +   int i;
>>>>>> +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
>>>>> struct page *tail = page + i; to easy reader
>>>>>
>>>>>> +   page_cache_get(page + i);
>>>>> s/page_cache_get/get_page_foll/ ?
>>>>
>>>> Why?
>>>
>>> see follow_trans_huge_pmd() please.
>>
>> Sorry, I fail to see how follow_trans_huge_pmd() is relevant here.
>> Could you elaborate?
>
> Let's see the code
>
>         page += (addr & ~HPAGE_PMD_MASK) >> PAGE_SHIFT;
> //page is tail now
>         VM_BUG_ON(!PageCompound(page));
>         if (flags & FOLL_GET)
>                 get_page_foll(page);
> //raise page count with the foll function
>
> thus I raised the question.

get_page_foll() is designed to be part of follow_page*() call chain.
get_page() can handle compound pages properly.

-- 
 Kirill A. Shutemov


Re: [PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-14 Thread Hillf Danton
On Fri, Mar 15, 2013 at 1:50 AM, Kirill A. Shutemov
 wrote:
> +   page_cache_get(page);
> +   spin_lock_irq(&mapping->tree_lock);
> +   page->mapping = mapping;
> +   page->index = offset;
> +   error = radix_tree_insert(&mapping->page_tree, offset, page);
> +   if (unlikely(error))
> +   goto err;
> +   if (PageTransHuge(page)) {
> +   int i;
> +   for (i = 1; i < HPAGE_CACHE_NR; i++) {
struct page *tail = page + i; to easy reader

> +   page_cache_get(page + i);
s/page_cache_get/get_page_foll/ ?

> +   page[i].index = offset + i;
> +   error = radix_tree_insert(&mapping->page_tree,
> +   offset + i, page + i);
> +   if (error) {
> +   page_cache_release(page + i);
> +   break;
> +   }
> }
> -   radix_tree_preload_end();
> -   } else
> -   mem_cgroup_uncharge_cache_page(page);
> -out:
> +   if (error) {
> +   error = ENOSPC; /* no space for a huge page */
s/E/-E/

> +   for (i--; i > 0; i--) {
> +   radix_tree_delete(&mapping->page_tree,
> +   offset + i);
> +   page_cache_release(page + i);
> +   }
> +   radix_tree_delete(&mapping->page_tree, offset);
> +   goto err;
> +   }
> +   }


[PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" 

For a huge page we add HPAGE_CACHE_NR pages to the radix tree at once: the
head page for the specified index and HPAGE_CACHE_NR-1 tail pages for the
following indexes.

Signed-off-by: Kirill A. Shutemov 
---
 mm/filemap.c |   76 --
 1 file changed, 53 insertions(+), 23 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 2d99191..6bac9e2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -447,6 +447,7 @@ int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
pgoff_t offset, gfp_t gfp_mask)
 {
int error;
+   int nr = 1;
 
VM_BUG_ON(!PageLocked(page));
VM_BUG_ON(PageSwapBacked(page));
@@ -454,32 +455,61 @@ int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
error = mem_cgroup_cache_charge(page, current->mm,
gfp_mask & GFP_RECLAIM_MASK);
if (error)
-   goto out;
+   return error;
 
-   error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
-   if (error == 0) {
-   page_cache_get(page);
-   page->mapping = mapping;
-   page->index = offset;
+   if (PageTransHuge(page)) {
+   BUILD_BUG_ON(HPAGE_CACHE_NR > RADIX_TREE_PRELOAD_NR);
+   nr = HPAGE_CACHE_NR;
+   }
+   error = radix_tree_preload_count(nr, gfp_mask & ~__GFP_HIGHMEM);
+   if (error) {
+   mem_cgroup_uncharge_cache_page(page);
+   return error;
+   }
 
-   spin_lock_irq(&mapping->tree_lock);
-   error = radix_tree_insert(&mapping->page_tree, offset, page);
-   if (likely(!error)) {
-   mapping->nrpages++;
-   __inc_zone_page_state(page, NR_FILE_PAGES);
-   spin_unlock_irq(&mapping->tree_lock);
-   trace_mm_filemap_add_to_page_cache(page);
-   } else {
-   page->mapping = NULL;
-   /* Leave page->index set: truncation relies upon it */
-   spin_unlock_irq(&mapping->tree_lock);
-   mem_cgroup_uncharge_cache_page(page);
-   page_cache_release(page);
+   page_cache_get(page);
+   spin_lock_irq(&mapping->tree_lock);
+   page->mapping = mapping;
+   page->index = offset;
+   error = radix_tree_insert(&mapping->page_tree, offset, page);
+   if (unlikely(error))
+   goto err;
+   if (PageTransHuge(page)) {
+   int i;
+   for (i = 1; i < HPAGE_CACHE_NR; i++) {
+   page_cache_get(page + i);
+   page[i].index = offset + i;
+   error = radix_tree_insert(&mapping->page_tree,
+   offset + i, page + i);
+   if (error) {
+   page_cache_release(page + i);
+   break;
+   }
}
-   radix_tree_preload_end();
-   } else
-   mem_cgroup_uncharge_cache_page(page);
-out:
+   if (error) {
+   error = ENOSPC; /* no space for a huge page */
+   for (i--; i > 0; i--) {
+   radix_tree_delete(&mapping->page_tree,
+   offset + i);
+   page_cache_release(page + i);
+   }
+   radix_tree_delete(&mapping->page_tree, offset);
+   goto err;
+   }
+   }
+   __mod_zone_page_state(page_zone(page), NR_FILE_PAGES, nr);
+   mapping->nrpages += nr;
+   spin_unlock_irq(&mapping->tree_lock);
+   trace_mm_filemap_add_to_page_cache(page);
+   radix_tree_preload_end();
+   return 0;
+err:
+   page->mapping = NULL;
+   /* Leave page->index set: truncation relies upon it */
+   spin_unlock_irq(&mapping->tree_lock);
+   radix_tree_preload_end();
+   mem_cgroup_uncharge_cache_page(page);
+   page_cache_release(page);
return error;
 }
 EXPORT_SYMBOL(add_to_page_cache_locked);
-- 
1.7.10.4




