Hello,
On Tue, Sep 13, 2016 at 04:53:49PM +0800, Huang, Ying wrote:
> I am glad to discuss my final goal, that is, swapping out/in the full
> THP without splitting. Why I want to do that is copied as below,
I think that is a fine objective. It wasn't implemented initially just
to keep things
ing mapped ptes.
>
> Signed-off-by: Ebru Akagunduz <ebru.akagun...@gmail.com>
> Suggested-by: Andrea Arcangeli <aarca...@redhat.com>
> ---
> mm/khugepaged.c | 10 +-
> 1 file changed, 5 insertions(+), 5 deletions(-)
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
lock was dropped.
>
> [1]
> http://lkml.kernel.org/r/cact4y+z3gigbvhca9krjfcjx0g70v_nrhbwkbu+ygoesbdk...@mail.gmail.com
>
> Signed-off-by: Kirill A. Shutemov
> Reported-by: Dmitry Vyukov
> ---
> mm/khugepaged.c | 15 ++++---
> 1 file changed, 8 insertions(+), 7 deletions(-)
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
Hello Kirill,
On Mon, Aug 29, 2016 at 03:42:33PM +0300, Kirill A. Shutemov wrote:
> @@ -898,13 +899,13 @@ static bool __collapse_huge_page_swapin(struct
> mm_struct *mm,
> /* do_swap_page returns VM_FAULT_RETRY with released mmap_sem */
> if (ret & VM_FAULT_RETRY) {
>
On Fri, Aug 19, 2016 at 03:53:59PM +0100, Mel Gorman wrote:
> Compaction is not the same as LRU management.
Sure but compaction is invoked by reclaim and if reclaim is node-wide,
it makes more sense if compaction would be node-wide as well.
Otherwise what do you compact? Just the higher zone, or
On Fri, Aug 19, 2016 at 03:23:20PM +0200, Vlastimil Babka wrote:
> What's that? Never heard of this before, but sounds scary :) I thought
> that zone_reclaim itself was rather discouraged nowadays, not a big
> candidate for further improvement...
It's some fix that I tried to push upstream but
Hello Mel,
On Fri, Jul 08, 2016 at 10:34:36AM +0100, Mel Gorman wrote:
> Minor changes this time
>
> Changelog since v8
> This is the latest version of a series that moves LRUs from the zones to
I'm afraid this is a bit incomplete...
I had troubles in rebasing the compaction-enabled
Hi Mike,
On Thu, Aug 04, 2016 at 11:14:11AM +0300, Mike Rapoport wrote:
> These patches enable userfaultfd support for shared memory mappings. The
> VMAs backed with shmem/tmpfs can be registered with userfaultfd which
> allows management of page faults in these areas by userland.
>
> This patch
Hello,
On Wed, Jul 27, 2016 at 01:33:35PM +0300, Kirill A. Shutemov wrote:
> I guess you can get 64k blocks to work with 4k pages if you *always* allocate
> order-4 pages for page cache of the filesystem. But I don't think it's
> sustainable. It's significant pressure on buddy allocator and
Hello Michal,
CC'ed Hugh,
On Fri, Jun 03, 2016 at 04:46:00PM +0200, Michal Hocko wrote:
> What do you think about the external dependencies mentioned above. Do
> you think this is a sufficient argument wrt. occasional higher
> latencies?
It's a tradeoff and both latencies would be short and
On Thu, Jun 02, 2016 at 02:21:10PM +0200, Michal Hocko wrote:
> Testing with the patch makes some sense as well, but I would like to
> hear from Andrea whether the approach is good because I am wondering why
> he hasn't done that before - it feels so much simpler than the current
> code.
The
On Tue, May 24, 2016 at 11:12:23AM +0300, Mika Westerberg wrote:
> Hmm, the kernel shipped with Fedora 23 has that enabled:
>
> lahna % grep CONFIG_DEBUG_VM /boot/config-4.4.9-300.fc23.x86_64
> CONFIG_DEBUG_VM=y
> # CONFIG_DEBUG_VM_VMACACHE is not set
> # CONFIG_DEBUG_VM_RB is not set
Yes, it
On Tue, May 24, 2016 at 12:49:42AM +0300, Kirill A. Shutemov wrote:
> That's what we do now and that's not enough.
>
> We would need to serialize against pmd_lock() during normal page-fault
> path (and other pte manipulation), which we don't do now if pmd points to
> page table.
Yes, mmap_sem
>
> Note that we use address only in CONFIG_DEBUG_VM=y case and the bug is not
> visible on production kernels with the option disabled.
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 8a839935b18c..0ea5d9071b32 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1098,6 +1098,8 @@ void page_move_anon_rmap(struct page *page,
>
> VM_BUG_ON_PAGE(!PageLocked(page), page);
> VM_BUG_ON_VMA(!anon_vma, vma);
> + if (IS_ENABLED(CONFIG_DEBUG_VM) && PageTransHuge(page))
> + address &= HPAGE_PMD_MASK;
> VM_BUG_ON_PAGE(page->index != linear_page_index(vma, address), page);
>
> anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
Just sent a patch doing the exact same thing just embedded in the
VM_BUG_ON_PAGE, either version is fine with me.
test this to shut off the false positive?
From 4db87e3e44837a0b038e58eaa3fea29db84723ec Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli <aarca...@redhat.com>
Date: Mon, 23 May 2016 17:03:57 +0200
Subject: [PATCH 1/1] mm: thp: avoid false positive VM_BUG_ON_PAGE in
 page_move_anon_rmap()
If the page_move_anon_rmap() is refiling
ome
latency in the footprint reduction in the future non-cooperative
usage).
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
> +static inline bool userfaultfd_get_mm(struct userfaultfd_ctx *ctx)
> +{
> + return atomic_inc_not_zero(&ctx->mm->mm_users);
> +}
Nice cleanup, but wouldn't it be more ge
ion.
Reviewed-by: "Kirill A. Shutemov" <kir...@shutemov.name>
Signed-off-by: Andrea Arcangeli <aarca...@redhat.com>
---
include/linux/mm.h | 9 +++
include/linux/swap.h | 6 ++---
mm/huge_memory.c | 71 +---
mm/memory.c | 22 ++--
mm/swapfile.c
Hello Nicolas,
On Thu, May 12, 2016 at 05:31:52PM +0200, Nicolas Morey-Chaisemartin wrote:
>
>
> On 05/12/2016 at 03:52 PM, Jerome Glisse wrote:
> > On Thu, May 12, 2016 at 03:30:24PM +0200, Nicolas Morey-Chaisemartin wrote:
> >> On 05/12/2016 at 11:36 AM, Jerome Glisse wrote:
> >>> On Thu,
used only once
now, while with the previous code reuse_swap_page(page++) would have
called page_mapcount on page+1 and it would have increased page twice
instead of just once.
Reviewed-by: "Kirill A. Shutemov" <kir...@shutemov.name>
Signed-off-by: Andrea Arcangeli <aarca...@redhat.com>
---
include/linux/mm.h | 9 +++
anyway.
Andrea Arcangeli (3):
mm: thp: calculate the mapcount correctly for THP pages during WP
faults
mm: thp: microoptimize compound_mapcount()
mm: thp: split_huge_pmd_address() comment improvement
include/linux/mm.h | 12 +++--
include/linux/swap.h | 8 +++---
mm
ill A. Shutemov" <kir...@shutemov.name>
Signed-off-by: Andrea Arcangeli <aarca...@redhat.com>
---
include/linux/mm.h | 9 +++
include/linux/swap.h | 8 ---
mm/huge_memory.c | 67 +---
mm/memory.c | 22 ++---
mm/swapfile.c | 13 +-
5 files cha
compound_mapcount() is only called after PageCompound() has already
been checked by the caller, so there's no point to check it again. Gcc
may optimize it away too because it's inline but this will remove the
runtime check for sure and it'll add an assert instead.
Signed-off-by: Andrea
Comment is partly wrong, this improves it by including the case of
split_huge_pmd_address() called by try_to_unmap_one if
TTU_SPLIT_HUGE_PMD is set.
Signed-off-by: Andrea Arcangeli <aarca...@redhat.com>
---
mm/huge_memory.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/mm/huge_memory.c b
> + spin_unlock(&ksm_mmlist_lock);
>
> /* Repeat until we've completed scanning the whole list */
> slot = ksm_scan.mm_slot;
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
While the above patch is correct, I would however prefer if you could
update it to keep releasing the ksm_mmli
Hello Zhou,
Great catch.
On Thu, May 05, 2016 at 08:42:56PM +0800, Zhou Chengming wrote:
> remove_trailing_rmap_items(slot, ksm_scan.rmap_list);
> + up_read(&mm->mmap_sem);
>
> spin_lock(&ksm_mmlist_lock);
> ksm_scan.mm_slot = list_entry(slot->mm_list.next,
> @@ -1666,16 +1667,12
On Thu, May 05, 2016 at 06:11:10PM +0300, Kirill A. Shutemov wrote:
> Hm. How total_mapcount equal to NULL wouldn't lead to NULL-pointer
> dereference inside page_trans_huge_mapcount()?
Sorry for the confusion, this was still work in progress and then I've
seen the email from Alex and I sent the
On Thu, May 05, 2016 at 04:39:24PM +0200, Andrea Arcangeli wrote:
> I'm currently testing this:
I must have been testing an earlier version, this below has better
chance not to oops. There's a reason I didn't attempt a proper submit
yet.. this is just for testing until we're sure this is ok.
I a
Hello Alex,
On Wed, May 04, 2016 at 07:19:27PM -0600, Alex Williamson wrote:
> On Mon, 2 May 2016 20:03:07 +0200
> Andrea Arcangeli <aarca...@redhat.com> wrote:
>
> > On Mon, May 02, 2016 at 07:00:42PM +0300, Kirill A. Shutemov wrote:
> > > Agreed. I just didn't see the two-refcounts sol
On Mon, May 02, 2016 at 07:12:52PM +0300, Kirill A. Shutemov wrote:
> Any reason why mmu_notifier is not an option?
No way to trigger a hardware re-tried secondary MMU fault as a result
of PCI DMA memory access, and expensive to do an MMU notifier
invalidate if it requires waiting for the DMA to
On Mon, May 02, 2016 at 05:22:49PM +0200, Jerome Glisse wrote:
> I think this is still fine as it means that device will read only and thus
> you can migrate to different page (ie the guest is not expecting to read back
> anything written by the device and device writing to the page would be
On Mon, May 02, 2016 at 06:00:13PM +0300, Kirill A. Shutemov wrote:
> Switching to non-fast GUP would help :-P
If we had a race in khugepaged or ksmd against gup_fast O_DIRECT we'd
get flood of bugreports of data corruption with KVM run with
cache=direct.
Just wanted to reassure there's no race,
On Mon, May 02, 2016 at 03:14:02PM +0300, Kirill A. Shutemov wrote:
> Quick look around:
>
> - I don't see any check page_count() around __replace_page() in uprobes,
>so it can easily replace pinned page.
>
> - KSM has the page_count() check, there's still race wrt GUP_fast: it can
>
On Mon, May 02, 2016 at 07:00:42PM +0300, Kirill A. Shutemov wrote:
> Sounds correct, but code is going to be ugly :-/
Now if a page is not shared in the parent, it is already in the local
anon_vma. The only thing we could lose here is a pmd split in the
child caused by swapping and then parent
On Mon, May 02, 2016 at 01:41:19PM +0300, Kirill A. Shutemov wrote:
> I don't think this would work correctly. Let's check one of callers:
>
> static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> unsigned long address, pte_t *page_table, pmd_t *pmd,
>
00 2001
From: Andrea Arcangeli <aarca...@redhat.com>
Date: Fri, 29 Apr 2016 01:05:06 +0200
Subject: [PATCH 1/1] mm: thp: calculate the mapcount correctly for THP pages
 during WP faults
This will provide full accuracy to the mapcount calculation in the
write protect faults, so page pinning will not get broken by false
p
k vs 4MB. The problem of
course is when we really need a COW, we'll waste an additional 32k,
but then it doesn't matter that much as we'd be forced to load 4MB of
cache anyway in such case. There's room for optimizations but even the
simple below patch would be ok for now.
From 09e3d1ff10b49fb9c3ab77f
On Wed, Apr 27, 2016 at 05:57:30PM +0200, Andrea Arcangeli wrote:
> couldn't do a fix as cleaner as this one for 4.6.
ehm "cleaner then"
If you've suggestions for a better name than PageTransCompoundMap I
can respin a new patch though, I considered "CanMap" but I opte
On Wed, Apr 27, 2016 at 06:18:34PM +0300, Kirill A. Shutemov wrote:
> Okay, I see.
>
> But do we really want to make PageTransCompoundMap() visible beyond KVM
> code? It looks like too KVM-specific.
Any other secondary MMU notifier manager (KVM is just one of the many
MMU notifier users) will
Hello Andres,
On Tue, Apr 19, 2016 at 10:07:29AM -0700, Andres Lagar-Cavilla wrote:
> Andrea, we provide the, ahem, adjustments to
> transparent_hugepage_adjust. Rest assured we aggressively use mmu
> notifiers with no further changes required.
Did you notice I just fixed a THP related bug in
On Wed, Apr 27, 2016 at 04:50:30PM +0300, Kirill A. Shutemov wrote:
> I know nothing about kvm. How do you protect against pmd splitting between
> get_user_pages() and the check?
get_user_pages_fast() runs fully lockless and unpins the page right
away (we need a get_user_pages_fast without the
), KVM would map the whole compound page
into the shadow pagetables, despite regular faults or userfaults (like
UFFDIO_COPY) may map regular pages into the primary MMU as result of
the pte faults, leading to the guest mode and userland mode going out
of sync and not working on the same memory at all time
Hello Pavel and Mike,
On Wed, Apr 20, 2016 at 12:44:48PM +0300, Pavel Emelyanov wrote:
> On 03/20/2016 03:42 PM, Mike Rapoport wrote:
> > Hi,
> >
> > This set is to address the issues that appear in userfaultfd usage
> > scenarios when the task monitoring the uffd and the mm-owner do not
> >
Hello,
On Mon, Apr 18, 2016 at 03:55:44PM -0700, Shi, Yang wrote:
> Hi Kirill,
>
> Finally, I got some time to look into and try yours and Hugh's patches,
> got two problems.
One thing that comes to mind to test is this: qemu with -machine
accel=kvm -mem-path=/dev/shm/,share=on .
The THP
Hello,
On Mon, Apr 04, 2016 at 03:06:25PM +0300, Kirill A. Shutemov wrote:
> On Mon, Apr 04, 2016 at 02:03:54PM +0200, Vlastimil Babka wrote:
> > [+CC Andrea]
> >
> > On 04/02/2016 11:48 AM, Dmitry Vyukov wrote:
> > >Hello,
> > >
> > >The following program triggers a BUG in
> #endif
>
> Or perhaps better, centralise the non-SMP definitions:
>
> arch/x86/include/asm/pgtable-2level.h | 6 --
> arch/x86/include/asm/pgtable-3level.h | 7 +--
> arch/x86/include/asm/pgtable.h | 5 +
> arch/x86/include/asm/pgtable_64.h | 18 ++
> 4 files changed, 8 insertions(+), 28 deletions(-)
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
Hello everyone,
On Fri, Mar 04, 2016 at 03:30:18PM -0500, Matthew Wilcox wrote:
> On Wed, Feb 03, 2016 at 08:48:35AM +0100, Ingo Molnar wrote:
> > > @@ -111,8 +111,10 @@ static inline pud_t native_pudp_get_and_
> > > #ifdef CONFIG_SMP
> > > return native_make_pud(xchg(&xp->pud, 0));
> > > #else
>
Hello,
On Thu, Mar 03, 2016 at 08:46:41AM +0100, Sedat Dilek wrote:
> One technical question:
> How do I get the latest Linux version shipped userfaultfd first?
> ( Maybe there exist more elegant ways I do. Always open to improve my
> Git knowledge. )
Perhaps there are cleaner ways, I would do
unlock ctx->wqh.lock
> + * lock ctx->wqh.lock (in poll_wait)
> + * __add_wait_queue
> + * unlock ctx->wqh.lock
> + * eventfd_poll returns 0
> + */
> + count = READ_ONCE(ctx->count);
>
> if (count > 0)
> events |= POLLIN;
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
On Fri, Feb 26, 2016 at 01:32:53PM +0300, Kirill A. Shutemov wrote:
> Could you elaborate on problems with rmap? I haven't looked into this deeply
> yet.
>
> Do you see anything what would prevent following basic scheme:
>
> - Identify series of small pages as candidate for collapsing into
>a
think some more about this and come up with
> solutions how to avoid these kinds of "very late user space accesses"
> cleanly, I think that would be great.
Agreed.
Thanks,
Andrea
From 03f7e43aab4e4b6f02599f4e4675581f691e Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli <aarca...@redhat.com>
Hello,
On Wed, Mar 02, 2016 at 12:48:46AM +0000, Al Viro wrote:
> On Tue, Mar 01, 2016 at 12:06:49PM -0800, Linus Torvalds wrote:
>
> > So the only access we really care about is the child tid-pointer
> > clearing one, and that always happens after PF_EXITING has been set
> > afaik.
> >
> > No