Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE

2017-05-31 Thread Andrea Arcangeli
On Wed, May 31, 2017 at 03:39:22PM +0300, Mike Rapoport wrote: > For the CRIU usecase, disabling THP for a while and re-enabling it > back will do the trick, provided VMAs flags are not affected, like > in the patch you've sent. Moreover, we may even get away with Are you going to check uname -r

Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE

2017-05-30 Thread Andrea Arcangeli
On Tue, May 30, 2017 at 04:56:33PM +0200, Michal Hocko wrote: > On Tue 30-05-17 16:39:41, Michal Hocko wrote: > > On Tue 30-05-17 16:04:56, Andrea Arcangeli wrote: > [...] > > > About the proposed madvise, it just clear bits, but it doesn't change > > > at all how

Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE

2017-05-30 Thread Andrea Arcangeli
On Tue, May 30, 2017 at 04:39:41PM +0200, Michal Hocko wrote: > I sysctl for the mapcount can be increased, right? I also assume that > those vmas will get merged after the post copy is done. Assuming you enlarge the sysctl to the worst possible case, with 64bit address space you can have

Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE

2017-05-30 Thread Andrea Arcangeli
On Tue, May 30, 2017 at 12:39:30PM +0200, Michal Hocko wrote: > On Tue 30-05-17 13:19:22, Mike Rapoport wrote: > > On Tue, May 30, 2017 at 09:44:08AM +0200, Michal Hocko wrote: > > > On Wed 24-05-17 17:27:36, Mike Rapoport wrote: > > > > On Wed, May 24, 2017 at 01:18:00PM +0200, Michal Hocko

Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE

2017-05-24 Thread Andrea Arcangeli
Hello, On Wed, May 24, 2017 at 05:27:36PM +0300, Mike Rapoport wrote: > khugepaged does skip over VMAs which have userfault. We could register the > regions with userfault before populating them to avoid collapses in the > transition period. But then we'll have to populate these regions with >

Re: [PATCH 2/4] thp: fix MADV_DONTNEED vs. numa balancing race

2017-05-16 Thread Andrea Arcangeli
On Wed, Apr 12, 2017 at 03:33:35PM +0200, Vlastimil Babka wrote: > On 03/02/2017 04:10 PM, Kirill A. Shutemov wrote: > > In case prot_numa, we are under down_read(mmap_sem). It's critical > > to not clear pmd intermittently to avoid race with MADV_DONTNEED > > which is also under

Re: Q. drm/i915 shrinker, synchronize_rcu_expedited() from handlers

2017-05-10 Thread Andrea Arcangeli
:05 +0900, J. R. Okajima wrote: > > > > > Thanx for the reply. > > > > >  > > > > > Andrea Arcangeli: > > > > > >  > > > > > > Yes I already reported this, my original fix was way more efficient > >

Re: Review request: draft ioctl_userfaultfd(2) manual page

2017-05-03 Thread Andrea Arcangeli
On Fri, Apr 21, 2017 at 11:11:18AM +0200, Michael Kerrisk (man-pages) wrote: > Hello Mike, > Hello Andrea (we need your help!), Sorry for not answering sooner! (I had a vacation last week) > > On 03/22/2017 02:54 PM, Mike Rapoport wrote: > >>The various ioctl(2) operations are

Re: Q. drm/i915 shrinker, synchronize_rcu_expedited() from handlers

2017-04-30 Thread Andrea Arcangeli
On Sun, Apr 30, 2017 at 03:07:58PM +0900, J. R. Okajima wrote: > Hello, > > Since v4.11-rc7 I can see the workqueue stops on my development/test system. > Git-bisecting tells me the suspicious commit is > c053b5a 2017-04-11 drm/i915: Don't call synchronize_rcu_expedited under >

Re: Is it safe for kthreadd to drain_all_pages?

2017-04-13 Thread Andrea Arcangeli
Hello, On Sat, Apr 08, 2017 at 07:09:10PM +0100, Mel Gorman wrote: > On Sat, Apr 08, 2017 at 10:04:20AM -0700, Hugh Dickins wrote: > > On Fri, 7 Apr 2017, Hugh Dickins wrote: > > > On Fri, 7 Apr 2017, Michal Hocko wrote: > > > > On Fri 07-04-17 09:58:17, Hugh Dickins wrote: > > > > > On Fri, 7

Re: [PATCH 2/5] i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker

2017-04-09 Thread Andrea Arcangeli
On Fri, Apr 07, 2017 at 04:30:11PM +0100, Chris Wilson wrote: > Not getting hangs is a good sign, but lockdep doesn't like it: > > [ 460.684901] WARNING: CPU: 1 PID: 172 at kernel/workqueue.c:2418 > check_flush_dependency+0x92/0x130 > [ 460.684924] workqueue: PF_MEMALLOC task 172(kworker/1:1H)

Re: [PATCH 5/5] i915: fence workqueue optimization

2017-04-07 Thread Andrea Arcangeli
On Fri, Apr 07, 2017 at 10:58:38AM +0100, Chris Wilson wrote: > On Fri, Apr 07, 2017 at 01:23:47AM +0200, Andrea Arcangeli wrote: > > Insist to run llist_del_all() until the free_list is found empty, this > > may avoid having to schedule more workqueues. > > The work will

Re: [PATCH 2/5] i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker

2017-04-07 Thread Andrea Arcangeli
On Fri, Apr 07, 2017 at 11:02:11AM +0100, Chris Wilson wrote: > On Fri, Apr 07, 2017 at 01:23:44AM +0200, Andrea Arcangeli wrote: > > Waiting a RCU grace period only guarantees the work gets queued, but > > until after the queued workqueue returns, there's no guarantee the > >

[PATCH 4/5] i915: schedule while freeing the lists of gem objects

2017-04-06 Thread Andrea Arcangeli
Add cond_resched(). Signed-off-by: Andrea Arcangeli <aarca...@redhat.com> --- drivers/gpu/drm/i915/i915_gem.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 612fde3..c81baeb 100644 --- a/drivers/gpu/dr

[PATCH 4/5] i915: schedule while freeing the lists of gem objects

2017-04-06 Thread Andrea Arcangeli
Add cond_resched(). Signed-off-by: Andrea Arcangeli --- drivers/gpu/drm/i915/i915_gem.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 612fde3..c81baeb 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu

[PATCH 2/5] i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker

2017-04-06 Thread Andrea Arcangeli
-by: Andrea Arcangeli <aarca...@redhat.com> --- drivers/gpu/drm/i915/i915_gem.c | 2 ++ drivers/gpu/drm/i915/i915_gem_shrinker.c | 1 + 2 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 3982489..612fde3 100644 --- a/drive

[PATCH 2/5] i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker

2017-04-06 Thread Andrea Arcangeli
-by: Andrea Arcangeli --- drivers/gpu/drm/i915/i915_gem.c | 2 ++ drivers/gpu/drm/i915/i915_gem_shrinker.c | 1 + 2 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 3982489..612fde3 100644 --- a/drivers/gpu/drm/i915/i915_gem.c

[PATCH 5/5] i915: fence workqueue optimization

2017-04-06 Thread Andrea Arcangeli
Insist to run llist_del_all() until the free_list is found empty, this may avoid having to schedule more workqueues. Signed-off-by: Andrea Arcangeli <aarca...@redhat.com> --- drivers/gpu/drm/i915/intel_display.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/d

[PATCH 0/5] Re: [Intel-gfx] [BUG][REGRESSION] i915 gpu hangs under load

2017-04-06 Thread Andrea Arcangeli
. Andrea Arcangeli (5): i915: avoid kernel hang caused by synchronize rcu struct_mutex deadlock i915: flush gem obj freeing workqueues to add accuracy to the i915 shrinker i915: initialize the free_list of the fencing atomic_helper i915: schedule while freeing the lists of gem objects

[PATCH 5/5] i915: fence workqueue optimization

2017-04-06 Thread Andrea Arcangeli
Insist to run llist_del_all() until the free_list is found empty, this may avoid having to schedule more workqueues. Signed-off-by: Andrea Arcangeli --- drivers/gpu/drm/i915/intel_display.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915

[PATCH 3/5] i915: initialize the free_list of the fencing atomic_helper

2017-04-06 Thread Andrea Arcangeli
Just in case the llist model changes and NULL isn't valid initialization. Signed-off-by: Andrea Arcangeli <aarca...@redhat.com> --- drivers/gpu/drm/i915/intel_display.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/dr

[PATCH 1/5] i915: avoid kernel hang caused by synchronize rcu struct_mutex deadlock

2017-04-06 Thread Andrea Arcangeli
/0x40 ? ret_from_fork+0x23/0x30 Signed-off-by: Andrea Arcangeli <aarca...@redhat.com> --- drivers/gpu/drm/i915/i915_gem.c | 9 + drivers/gpu/drm/i915/i915_gem_shrinker.c | 14 ++ 2 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/dr

[PATCH 3/5] i915: initialize the free_list of the fencing atomic_helper

2017-04-06 Thread Andrea Arcangeli
Just in case the llist model changes and NULL isn't valid initialization. Signed-off-by: Andrea Arcangeli --- drivers/gpu/drm/i915/intel_display.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ed1f4f2

[PATCH 1/5] i915: avoid kernel hang caused by synchronize rcu struct_mutex deadlock

2017-04-06 Thread Andrea Arcangeli
/0x40 ? ret_from_fork+0x23/0x30 Signed-off-by: Andrea Arcangeli --- drivers/gpu/drm/i915/i915_gem.c | 9 + drivers/gpu/drm/i915/i915_gem_shrinker.c | 14 ++ 2 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu

Re: ZONE_NORMAL vs. ZONE_MOVABLE

2017-03-16 Thread Andrea Arcangeli
Hello Joonsoo, On Thu, Mar 16, 2017 at 02:31:22PM +0900, Joonsoo Kim wrote: > I don't follow up previous discussion so please let me know if I miss > something. I'd just like to mention about sticky pageblocks. The interesting part of the previous discussion relevant for the sticky movable

Re: ZONE_NORMAL vs. ZONE_MOVABLE

2017-03-15 Thread Andrea Arcangeli
On Wed, Mar 15, 2017 at 02:11:40PM +0100, Michal Hocko wrote: > OK, I see now. I am afraid there is quite a lot of code which expects > that zones do not overlap. We can have holes in zones but not different > zones interleaving. Probably something which could be addressed but far > from trivial

Re: WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks)

2017-03-14 Thread Andrea Arcangeli
Hello, On Mon, Mar 13, 2017 at 10:21:45AM +0100, Michal Hocko wrote: > On Fri 10-03-17 13:00:37, Reza Arbab wrote: > > On Fri, Mar 10, 2017 at 04:53:33PM +0100, Michal Hocko wrote: > > >OK, so while I was playing with this setup some more I probably got why > > >this is done this way. All new

Re: [LSF/MM TOPIC][LSF/MM,ATTEND] shared TLB, hugetlb reservations

2017-03-14 Thread Andrea Arcangeli
Hello, On Wed, Mar 08, 2017 at 05:30:55PM -0800, Mike Kravetz wrote: > On 01/10/2017 03:02 PM, Mike Kravetz wrote: > > Another more concrete topic is hugetlb reservations. Michal Hocko > > proposed the topic "mm patches review bandwidth", and brought up the > > related subject of areas in need

Re: mm: use-after-free in zap_page_range

2017-03-03 Thread Andrea Arcangeli
Hello Dmitry, On Fri, Mar 03, 2017 at 02:54:26PM +0100, Dmitry Vyukov wrote: > The following program triggers use-after-free in zap_page_range: > https://gist.githubusercontent.com/dvyukov/b59dfbaa0cb1e5231094d228fa57c9bd/raw/95c4da18cb96f8aaa47c10012d8c4484fd5917ad/gistfile1.txt I posted the

Re: fs: use-after-free in userfaultfd_exit

2017-03-01 Thread Andrea Arcangeli
On Wed, Mar 01, 2017 at 07:48:00PM +0100, Dmitry Vyukov wrote: > Hello, > > I've got the following use-after-free report while running syzkaller > fuzzer on 86292b33d4b79ee03e2f43ea0381ef85f077c760: Yes, I posted the fix for this one last Friday, I found it during stress testing, it triggered

Re: mm: fault in __do_fault

2017-02-28 Thread Andrea Arcangeli
On Tue, Feb 28, 2017 at 06:32:20PM +0300, Kirill A. Shutemov wrote: > Andrea, does it look okay for you? > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 625b7285a37b..56f61f1a1dc1 100644 > --- a/fs/userfaultfd.c > +++ b/fs/userfaultfd.c > @@ -489,7 +489,7 @@ int

Re: mm: fault in __do_fault

2017-02-28 Thread Andrea Arcangeli
n you verify this fix: From a65381bc86d2963713b6a9c4a73cded7dd184282 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli <aarca...@redhat.com> Date: Tue, 28 Feb 2017 16:36:59 +0100 Subject: [PATCH 1/1] userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE __do_fault assumes vmf->p

Re: mm: fault in __do_fault

2017-02-28 Thread Andrea Arcangeli
n you verify this fix: From a65381bc86d2963713b6a9c4a73cded7dd184282 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Tue, 28 Feb 2017 16:36:59 +0100 Subject: [PATCH 1/1] userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE __do_fault assumes vmf->page has been initia

Re: [GIT PULL] Kselftest update for 4.11-rc1

2017-02-27 Thread Andrea Arcangeli
Hello, On Sun, Feb 26, 2017 at 12:40:36PM -0800, Mike Kravetz wrote: > Another option is to only have a single userfaultfd executable and pass > an option (anon, hugetlb, shmem ...) that indicates the type of pages/mapping > to test. That sounds good to me. I guess it was quicker and it

Re: [PATCH] userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings

2017-02-22 Thread Andrea Arcangeli
On Tue, Feb 21, 2017 at 04:25:45PM +0300, Kirill A. Shutemov wrote: > I think more strict vma_is_anonymous() is a good thing. > > But I don't see a point introducing one more helper. Let's just make the > existing helper work better. That would be simpler agreed. The point of having an "unsafe"

Re: [PATCH] userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings

2017-02-17 Thread Andrea Arcangeli
On Fri, Feb 17, 2017 at 01:08:55PM -0800, Andrew Morton wrote: > I had a bunch more rejects to fix in that function. Below is the final > result - please check it carefully. Sure, reviewed and this is the diff that remains (vm_shared assignment location is irrelevant, I put it at the end as it's

Re: [PATCH] userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings

2017-02-17 Thread Andrea Arcangeli
On Fri, Feb 17, 2017 at 12:17:38PM -0800, Andrew Morton wrote: > I merged this up and a small issue remains: Great! > The value of `err' here is EINVAL. That sems appropriate, but it only > happens by sheer luck. It might have been programmer luck but just for completeness, at runtime no luck

Re: [PATCH] userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings

2017-02-17 Thread Andrea Arcangeli
eppability (if such word exist) calling it vm_shared_alloc would have been preferable. We can clean it up post upstream merge or it should be diffed against mm latest or it may cause more rejects. Reviewed-by: Andrea Arcangeli <aarca...@redhat.com> The patches were not against latest -mm so I solve

Re: [PATCH] userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings

2017-02-17 Thread Andrea Arcangeli
eppability (if such word exist) calling it vm_shared_alloc would have been preferable. We can clean it up post upstream merge or it should be diffed against mm latest or it may cause more rejects. Reviewed-by: Andrea Arcangeli The patches were not against latest -mm so I solved the rejects during merge

Re: [PATCH] userfaultfd: hugetlbfs: add UFFDIO_COPY support for shared mappings

2017-02-16 Thread Andrea Arcangeli
ma->vm_ops = _vm_ops; So I would turn it into: /* * shmem_zero_setup is invoked in mmap for MAP_ANONYMOUS|MAP_SHARED but * it will overwrite vm_ops, so vma_is_anonymous must return false. */ if (WARN_ON_ONCE(vma_is_anonymous(dst_vma) && dst_vma->vm_flags & VM_SHARED))

Re: [PATCH v3 03/14] mm: use pmd lock instead of racy checks in zap_pmd_range()

2017-02-13 Thread Andrea Arcangeli
Hello! On Mon, Feb 13, 2017 at 01:59:06PM +0300, Kirill A. Shutemov wrote: > On Sun, Feb 12, 2017 at 06:25:09PM -0600, Zi Yan wrote: > > Since in mm/compaction.c, the kernel does not down_read(mmap_sem) during > > memory > > compaction. Namely, base page migrations do not hold

Re: [PATCH] mprotect: drop overprotective lock_pte_protection()

2017-02-08 Thread Andrea Arcangeli
his? > > Right. Except, it doesn't drop unneeded pmd_trans_unstable(pmd) check after > __split_huge_pmd(). > > Could you fold this part of my patch into Andrea's? Reviewed-by: Andrea Arcangeli <aarca...@redhat.com> > > diff --git a/mm/mprotect.c b/mm/mprotect.c >

Re: [PATCH] mprotect: drop overprotective lock_pte_protection()

2017-02-08 Thread Andrea Arcangeli
drop unneeded pmd_trans_unstable(pmd) check after > __split_huge_pmd(). > > Could you fold this part of my patch into Andrea's? Reviewed-by: Andrea Arcangeli > > diff --git a/mm/mprotect.c b/mm/mprotect.c > index f9c07f54dd62..e919e4613eab 100644 > --- a/mm/mprotect.c > +

Re: [PATCH] mprotect: drop overprotective lock_pte_protection()

2017-02-07 Thread Andrea Arcangeli
On Tue, Feb 07, 2017 at 05:33:47PM +0300, Kirill A. Shutemov wrote: > lock_pte_protection() uses pmd_lock() to make sure that we have stable > PTE page table before walking pte range. > > That's not necessary. We only need to make sure that PTE page table is > established. It cannot vanish under

Re: [PATCH] userfaultfd: mcopy_atomic: update cases returning -ENOENT

2017-02-07 Thread Andrea Arcangeli
t; [1] http://www.spinics.net/lists/linux-mm/msg121267.html > > Signed-off-by: Mike Rapoport <r...@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>

Re: [PATCH] userfaultfd: mcopy_atomic: update cases returning -ENOENT

2017-02-07 Thread Andrea Arcangeli
/linux-mm/msg121267.html > > Signed-off-by: Mike Rapoport Reviewed-by: Andrea Arcangeli

Re: [PATCH v2 4/5] userfaultfd: mcopy_atomic: return -ENOENT when no compatible VMA found

2017-02-02 Thread Andrea Arcangeli
On Fri, Jan 27, 2017 at 08:44:32PM +0200, Mike Rapoport wrote: > - err = -EINVAL; > + err = -ENOENT; > dst_vma = find_vma(dst_mm, dst_start); > if (!dst_vma || !is_vm_hugetlb_page(dst_vma)) > goto out_unlock; > +

Re: [PATCH v2 1/1] mm/ksm: improve deduplication of zero pages with colouring

2017-01-19 Thread Andrea Arcangeli
Hello, On Thu, Jan 19, 2017 at 07:35:53PM +0100, Claudio Imbrenda wrote: > +/* Checksum of an empty (zeroed) page */ > +static unsigned int zero_checksum; > + > +/* Whether to merge empty (zeroed) pages with actual zero pages */ > +static bool ksm_use_zero_pages; Both could be defined as

Re: [PATCH v1 1/1] mm/ksm: improve deduplication of zero pages with colouring

2017-01-18 Thread Andrea Arcangeli
On Wed, Jan 18, 2017 at 06:17:09PM +0100, Claudio Imbrenda wrote: > That's true. As I said above, my previous example was not very well > thought. The more realistic scenario is that of having the colored zero > pages of a guest merged. That's a good point for making a special case that retains

Re: [PATCH v1 1/1] mm/ksm: improve deduplication of zero pages with colouring

2017-01-18 Thread Andrea Arcangeli
On Wed, Jan 18, 2017 at 04:15:56PM +0100, Claudio Imbrenda wrote: > I'm not sure it would make sense to have this for archs that don't have > page coloring. Merging empty pages together instead of with the It's still good to be able to exercise this code on all archs (if nothing else for

Re: [PATCH v1 1/1] mm/ksm: improve deduplication of zero pages with colouring

2017-01-12 Thread Andrea Arcangeli
Hello Claudio, On Thu, Jan 12, 2017 at 05:17:14PM +0100, Claudio Imbrenda wrote: > +#ifdef __HAVE_COLOR_ZERO_PAGE > + /* > + * Same checksum as an empty page. We attempt to merge it with the > + * appropriate zero page. > + */ > + if (checksum == zero_checksum) { > +

Re: [PATCH v7 08/11] x86, kvm/x86.c: support vcpu preempted check

2016-12-19 Thread Andrea Arcangeli
not fundamental for correct functionality of the guest pv spinlock code. This bug was introduced in commit 0b9f6c4615c993d2b552e0d2bd1ade49b56e5beb in v4.9-rc7. From 458897fd44aa9b91459a006caa4051a7d1628a23 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli <aarca...@redhat.com> Date: Sat, 1

Re: [PATCH v7 08/11] x86, kvm/x86.c: support vcpu preempted check

2016-12-19 Thread Andrea Arcangeli
tionality of the guest pv spinlock code. This bug was introduced in commit 0b9f6c4615c993d2b552e0d2bd1ade49b56e5beb in v4.9-rc7. From 458897fd44aa9b91459a006caa4051a7d1628a23 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Sat, 17 Dec 2016 18:43:52 +0100 Subject: [PATCH 1/2] kvm: fix schedule in atomic in kvm_steal_tim

Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-16 Thread Andrea Arcangeli
On Thu, Dec 15, 2016 at 05:40:45PM -0800, Dave Hansen wrote: > On 12/15/2016 05:38 PM, Li, Liang Z wrote: > > > > Use 52 bits for 'pfn', 12 bits for 'length', when the 12 bits is not long > > enough for the 'length' > > Set the 'length' to a special value to indicate the "actual length in next

Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-16 Thread Andrea Arcangeli
On Fri, Dec 16, 2016 at 01:12:21AM +, Li, Liang Z wrote: > There still exist the case if the MAX_ORDER is configured to a large value, > e.g. 36 for a system > with huge amount of memory, then there is only 28 bits left for the pfn, > which is not enough. Not related to the balloon but how

Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-09 Thread Andrea Arcangeli
Hello, On Fri, Dec 09, 2016 at 05:35:45AM +, Li, Liang Z wrote: > > On 12/08/2016 08:45 PM, Li, Liang Z wrote: > > > What's the conclusion of your discussion? It seems you want some > > > statistic before deciding whether to ripping the bitmap from the ABI, > > > am I right? > > > > I think

Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-07 Thread Andrea Arcangeli
On Wed, Dec 07, 2016 at 11:54:34AM -0800, Dave Hansen wrote: > We're talking about a bunch of different stuff which is all being > conflated. There are 3 issues here that I can see. I'll attempt to > summarize what I think is going on: > > 1. Current patches do a hypercall for each order in the

Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-07 Thread Andrea Arcangeli
On Wed, Dec 07, 2016 at 10:44:31AM -0800, Dave Hansen wrote: > On 12/07/2016 10:38 AM, Andrea Arcangeli wrote: > >> > and leaves room for the bitmap size to be encoded as well, if we decide > >> > we need a bitmap in the future. > > How would a bitmap ever be

Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-07 Thread Andrea Arcangeli
Hello, On Wed, Dec 07, 2016 at 08:57:01AM -0800, Dave Hansen wrote: > It is more space-efficient. We're fitting the order into 6 bits, which > would allows the full 2^64 address space to be represented in one entry, Very large order is the same as very large len, 6 bits of order or 8 bytes of

Re: BUG Re: mm: vma_merge: fix vm_page_prot SMP race condition against rmap_walk

2016-09-27 Thread Andrea Arcangeli
Hello, On Tue, Sep 27, 2016 at 05:16:15AM -0500, Shaun Tancheff wrote: > git bisect points at commit c9634dcf00c9c93b ("mm: vma_merge: fix > vm_page_prot SMP race condition against rmap_walk") I assume linux-next? But I can't find the commit, but I should know what this is. > > Last lines to

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-13 Thread Andrea Arcangeli
Hello, On Tue, Sep 13, 2016 at 04:53:49PM +0800, Huang, Ying wrote: > I am glad to discuss my final goal, that is, swapping out/in the full > THP without splitting. Why I want to do that is copied as below, I think that is a fine objective. It wasn't implemented initially just to keep things
