Re: [PATCH] mm: numa: disable change protection for vma(VM_HUGETLB)

2015-04-01 Thread Naoya Horiguchi
On Tue, Mar 31, 2015 at 02:35:21PM -0700, Andrew Morton wrote:
> On Tue, 31 Mar 2015 01:45:55 + Naoya Horiguchi
> <n-horigu...@ah.jp.nec.com> wrote:
> 
> > Currently, when a process accesses a hugetlb range protected with PROTNONE,
> > unexpected COWs are triggered, which eventually puts the hugetlb subsystem
> > into a broken/uncontrollable state where, for example, h->resv_huge_pages
> > is decremented too far and wraps around to a very large number, and the
> > free hugepage pool is no longer maintainable.
> > 
> > This patch simply stops changing protection for vma(VM_HUGETLB) to fix the
> > problem. It also avoids the useless overhead of minor faults.
> > 
> > ...
> >
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -2161,8 +2161,10 @@ void task_numa_work(struct callback_head *work)
> >  		vma = mm->mmap;
> >  	}
> >  	for (; vma; vma = vma->vm_next) {
> > -		if (!vma_migratable(vma) || !vma_policy_mof(vma))
> > +		if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
> > +			is_vm_hugetlb_page(vma)) {
> >  			continue;
> > +		}
> >  
> >  		/*
> >  		 * Shared library pages mapped by multiple processes are not
> 
> Which kernel version(s) need this patch?

I haven't fully bisected it, but the problem this patch addresses has been
visible since v4.0-rc1 (I could not reproduce it on v3.19).

Thanks,
Naoya Horiguchi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: numa: disable change protection for vma(VM_HUGETLB)

2015-03-31 Thread Andrew Morton
On Tue, 31 Mar 2015 01:45:55 + Naoya Horiguchi <n-horigu...@ah.jp.nec.com>
wrote:

> Currently, when a process accesses a hugetlb range protected with PROTNONE,
> unexpected COWs are triggered, which eventually puts the hugetlb subsystem
> into a broken/uncontrollable state where, for example, h->resv_huge_pages
> is decremented too far and wraps around to a very large number, and the
> free hugepage pool is no longer maintainable.
> 
> This patch simply stops changing protection for vma(VM_HUGETLB) to fix the
> problem. It also avoids the useless overhead of minor faults.
> 
> ...
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2161,8 +2161,10 @@ void task_numa_work(struct callback_head *work)
>  		vma = mm->mmap;
>  	}
>  	for (; vma; vma = vma->vm_next) {
> -		if (!vma_migratable(vma) || !vma_policy_mof(vma))
> +		if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
> +			is_vm_hugetlb_page(vma)) {
>  			continue;
> +		}
>  
>  		/*
>  		 * Shared library pages mapped by multiple processes are not

Which kernel version(s) need this patch?


[PATCH] mm: numa: disable change protection for vma(VM_HUGETLB)

2015-03-30 Thread Naoya Horiguchi
On Mon, Mar 30, 2015 at 12:59:01PM +0100, Mel Gorman wrote:
> On Mon, Mar 30, 2015 at 07:42:13PM +0900, Naoya Horiguchi wrote:
...
> 
> I note now that the patch was too hasty. By rights, that check
> should be covered by vma_migratable() but it's only checked if
> CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION which means it's x86-only. If you
> are seeing this problem on any other arch then a more correct fix might be
> to remove the CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION check in vma_migratable.

Changing vma_migratable() affects other use cases of hugepage migration such
as mbind(), so simply removing the ifdef doesn't work for those use cases.
I didn't test other archs, but I guess that this problem could happen on all
archs that enable NUMA balancing, whether or not they support
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION.

So I'd like to pick up and push your first suggestion. It passed my testing.
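
To illustrate the trade-off discussed above, here is a minimal userspace
sketch, not kernel code. All toy_* names and flag values are hypothetical.
It assumes that vma_migratable() rejects hugetlb VMAs only when the
architecture does not enable hugepage migration, and it models that config
option as a runtime boolean. The point is that an extra hugetlb check in the
scan loop leaves the predicate shared with callers such as mbind() unchanged.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; they do not match the kernel's VM_* values. */
#define TOY_VM_IO      0x1u
#define TOY_VM_HUGETLB 0x2u

struct toy_vma {
	unsigned int flags;
};

/*
 * Toy model of the shared migratability predicate. The hugetlb rejection
 * applies only when the arch lacks hugepage migration (an assumption about
 * how the ifdef behaves, modeled here as a boolean argument).
 */
static bool toy_vma_migratable(const struct toy_vma *vma,
			       bool arch_hugepage_migration)
{
	assert(vma);
	if (vma->flags & TOY_VM_IO)
		return false;
	if (!arch_hugepage_migration && (vma->flags & TOY_VM_HUGETLB))
		return false;
	return true;
}

/*
 * Toy model of the scan-loop filter after the patch: hugetlb VMAs are
 * skipped regardless of arch support, without touching the predicate above.
 */
static bool toy_numa_scan_skips(const struct toy_vma *vma,
				bool arch_hugepage_migration)
{
	return !toy_vma_migratable(vma, arch_hugepage_migration) ||
	       (vma->flags & TOY_VM_HUGETLB);
}
```

In this model a hugetlb VMA on an arch with hugepage migration is still
reported as migratable, yet the scan loop skips it.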

Thanks,
Naoya Horiguchi
---
From: Naoya Horiguchi <n-horigu...@ah.jp.nec.com>
Subject: [PATCH] mm: numa: disable change protection for vma(VM_HUGETLB)

Currently, when a process accesses a hugetlb range protected with PROTNONE,
unexpected COWs are triggered, which eventually puts the hugetlb subsystem
into a broken/uncontrollable state where, for example, h->resv_huge_pages
is decremented too far and wraps around to a very large number, and the
free hugepage pool is no longer maintainable.

This patch simply stops changing protection for vma(VM_HUGETLB) to fix the
problem. It also avoids the useless overhead of minor faults.

Suggested-by: Mel Gorman <mgor...@suse.de>
Signed-off-by: Naoya Horiguchi <n-horigu...@ah.jp.nec.com>
---
 kernel/sched/fair.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7ce18f3c097a..6ad0d570f38e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2161,8 +2161,10 @@ void task_numa_work(struct callback_head *work)
 		vma = mm->mmap;
 	}
 	for (; vma; vma = vma->vm_next) {
-		if (!vma_migratable(vma) || !vma_policy_mof(vma))
+		if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
+			is_vm_hugetlb_page(vma)) {
 			continue;
+		}
 
 		/*
 		 * Shared library pages mapped by multiple processes are not
-- 
1.9.3
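
The wrap-around mentioned in the patch description can be reproduced in
isolation. The following is a minimal userspace sketch, not the hugetlb
code: struct toy_hstate and the toy_* helpers are hypothetical, and the only
property borrowed from the kernel is that the reservation counter is an
unsigned long. It shows that one decrement too many turns a counter of zero
into a very large number instead of a negative one.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the structure holding the reservation count. */
struct toy_hstate {
	unsigned long resv_huge_pages;
};

/*
 * Unchecked decrement. If this runs more often than pages were reserved
 * (as an unexpected COW would cause in this toy model), the unsigned
 * counter wraps around.
 */
static void toy_consume_reservation(struct toy_hstate *h)
{
	assert(h);
	h->resv_huge_pages--;
}

/* Returns the counter value after `faults` decrements from `reserved`. */
static unsigned long toy_resv_after_faults(unsigned long reserved,
					   unsigned long faults)
{
	struct toy_hstate h = { .resv_huge_pages = reserved };

	for (unsigned long i = 0; i < faults; i++)
		toy_consume_reservation(&h);
	return h.resv_huge_pages;
}
```

With two reserved pages, two decrements leave the counter at 0, while a
third leaves it at ULONG_MAX, because unsigned arithmetic in C is defined
to wrap modulo 2^N.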