Re: [RFC PATCH V2 0/5] vhost: accelerate metadata access through vmap()

2019-03-12 Thread Andrea Arcangeli
On Tue, Mar 12, 2019 at 02:19:15PM -0700, James Bottomley wrote: > I mean in the sequence > > flush_dcache_page(page); > flush_dcache_page(page); > > The first flush_dcache_page did all the work and the second is a > tightly pipelined no-op. That's what I mean by there not really being > a

Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Andrea Arcangeli
Hello Jason, On Fri, Mar 08, 2019 at 04:50:36PM +0800, Jason Wang wrote: > Just to make sure I understand here. For boosting through huge TLB, do > you mean we can do that in the future (e.g. by mapping more userspace > pages to kernel) or it can be done by this series (only about three 4K >

Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Andrea Arcangeli
On Fri, Mar 08, 2019 at 04:58:44PM +0800, Jason Wang wrote: > Can I simply call set_page_dirty() before vunmap() in the mmu notifier > callback, or is there any reason that it must be called within vunmap()? I also don't see any problem in doing it before vunmap. As far as the mmu notifier and

Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-08 Thread Andrea Arcangeli
On Fri, Mar 08, 2019 at 05:13:26PM +0800, Jason Wang wrote: > Actually not wrapping around,  the pages for used ring was marked as > dirty after a round of virtqueue processing when we're sure vhost wrote > something there. Thanks for the clarification. So we need to convert it to

Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-07 Thread Andrea Arcangeli
Hello Jerome, On Thu, Mar 07, 2019 at 03:17:22PM -0500, Jerome Glisse wrote: > So for the above the easiest thing is to call set_page_dirty() from > the mmu notifier callback. It is always safe to use the non locking > variant from such callback. Well it is safe only if the page was > map with

Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-07 Thread Andrea Arcangeli
On Thu, Mar 07, 2019 at 02:09:10PM -0500, Jerome Glisse wrote: > I thought this patch was only for anonymous memory ie not file back ? Yes, the other common usages are on hugetlbfs/tmpfs that also don't need to implement writeback and are obviously safe too. > If so then set dirty is mostly

Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

2019-03-07 Thread Andrea Arcangeli
On Thu, Mar 07, 2019 at 12:56:45PM -0500, Michael S. Tsirkin wrote: > On Thu, Mar 07, 2019 at 10:47:22AM -0500, Michael S. Tsirkin wrote: > > On Wed, Mar 06, 2019 at 02:18:12AM -0500, Jason Wang wrote: > > > +static const struct mmu_notifier_ops vhost_mmu_notifier_ops = { > > > + .invalidate_range

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-06 Thread Andrea Arcangeli
Hello Zhong, On Wed, Mar 06, 2019 at 09:07:00PM +0800, zhong jiang wrote: > The patch use call_rcu to delay free the task_struct, but It is possible to > free the task_struct > ahead of get_mem_cgroup_from_mm. is it right? Yes it is possible to free before get_mem_cgroup_from_mm, but if it's

Re: KASAN: use-after-free Read in get_mem_cgroup_from_mm

2019-03-05 Thread Andrea Arcangeli
However that mm is on its way to exit_mmap as soon as the ioctl returns and this only ever happens during race conditions, so the way CRIU monitor works there wasn't anything fundamentally concerning about this detail, despite it's remarkably "strange". Our priority was to keep the fork code

Re: [PATCH v2] mm/memory.c: do_fault: avoid usage of stale vm_area_struct

2019-03-02 Thread Andrea Arcangeli
rdered after up_read(mmap_sem) either. Other than the above detail: Reviewed-by: Andrea Arcangeli Thanks, Andrea

Re: [RFC][Patch v8 0/7] KVM: Guest Free Page Hinting

2019-02-18 Thread Andrea Arcangeli
Hello, On Mon, Feb 18, 2019 at 03:47:22PM -0800, Alexander Duyck wrote: > essentially fragmented them. I guess hugepaged went through and > started trying to reassemble the huge pages and as a result there have > been apps that ended up consuming more memory than they would have > otherwise since

Re: [RFC PATCH 0/4] Restore change_pte optimization to its former glory

2019-02-18 Thread Andrea Arcangeli
On Mon, Feb 18, 2019 at 11:04:13AM -0500, Jerome Glisse wrote: > So i run 2 exact same VMs side by side (copy of same COW image) and > built the same kernel tree inside each (that is the only important > workload that exist ;)) but the change_pte did not have any impact: > > before mean {real:

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-14 Thread Andrea Arcangeli
On Thu, Feb 14, 2019 at 04:07:37PM +0800, Huang, Ying wrote: > Before, we choose to use stop_machine() to reduce the overhead of hot > path (page fault handler) as much as possible. But now, I found > rcu_read_lock_sched() is just a wrapper of preempt_disable(). So maybe > we can switch to RCU

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-14 Thread Andrea Arcangeli
Hello, On Thu, Feb 14, 2019 at 12:30:02PM -0800, Andrew Morton wrote: > This was discussed to death and I think the changelog explains the > conclusions adequately. swapoff is super-rare so a stop_machine() in > that path is appropriate if its use permits more efficiency in the > regular swap

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-13 Thread Andrea Arcangeli
Hello everyone, On Mon, Feb 11, 2019 at 04:38:46PM +0800, Huang, Ying wrote: > @@ -2386,7 +2463,17 @@ static void enable_swap_info(struct swap_info_struct > *p, int prio, > frontswap_init(p->type, frontswap_map); > spin_lock(&swap_lock); > spin_lock(&p->lock); > -

Re: [RFC PATCH 0/4] Restore change_pte optimization to its former glory

2019-02-11 Thread Andrea Arcangeli
On Mon, Feb 11, 2019 at 02:09:31PM -0500, Jerome Glisse wrote: > Yeah, between do you have any good workload for me to test this ? I > was thinking of running few same VM and having KSM work on them. Is > there some way to trigger KVM to fork ? As the other case is breaking > COW after fork. KVM

Re: [RFC PATCH 2/4] mm/mmu_notifier: use unsigned for event field in range struct

2019-02-01 Thread Andrea Arcangeli
On Thu, Jan 31, 2019 at 01:37:04PM -0500, Jerome Glisse wrote: > From: Jérôme Glisse > > Use unsigned for event field in range struct so that we can also set > flags with the event. This patch change the field and introduce the > helper. > > Signed-off-by: Jérôme Glisse

Re: [RFC PATCH 1/4] uprobes: use set_pte_at() not set_pte_at_notify()

2019-02-01 Thread Andrea Arcangeli
On Thu, Jan 31, 2019 at 01:37:03PM -0500, Jerome Glisse wrote: > @@ -207,8 +207,7 @@ static int __replace_page(struct vm_area_struct *vma, > unsigned long addr, > > flush_cache_page(vma, addr, pte_pfn(*pvmw.pte)); > ptep_clear_flush_notify(vma, addr, pvmw.pte); > -

Re: [RFC PATCH 0/4] Restore change_pte optimization to its former glory

2019-02-01 Thread Andrea Arcangeli
On Fri, Feb 01, 2019 at 06:57:38PM -0500, Andrea Arcangeli wrote: > If it's cleared with ptep_clear_flush_notify, change_pte still won't > work. The above text needs updating with > "ptep_clear_flush". set_pte_at_notify is all about having > ptep_clear_flush only befo

Re: [RFC PATCH 0/4] Restore change_pte optimization to its former glory

2019-02-01 Thread Andrea Arcangeli
Hello everyone, On Thu, Jan 31, 2019 at 01:37:02PM -0500, Jerome Glisse wrote: > From: Jérôme Glisse > > This patchset is on top of my patchset to add context information to > mmu notifier [1] you can find a branch with everything [2]. I have not > tested it but i wanted to get the discussion

Re: [PATCH] powerpc/powernv/npu: Remove redundant change_pte() hook

2019-01-31 Thread Andrea Arcangeli
> invalidate_range() already. > > CC: Benjamin Herrenschmidt > CC: Paul Mackerras > CC: Michael Ellerman > CC: Alistair Popple > CC: Alexey Kardashevskiy > CC: Mark Hairgrove > CC: Balbir Singh > CC: David Gibson > CC: Andrea Arcangeli > CC: Jerome Glisse

Re: [LSF/MM TOPIC]: userfaultfd (was: [LSF/MM TOPIC] NUMA remote THP vs NUMA local non-THP under MADV_HUGEPAGE)

2019-01-30 Thread Andrea Arcangeli
Hello Mike, On Wed, Jan 30, 2019 at 10:13:36AM +0200, Mike Rapoport wrote: > We (CRIU) have some concerns about obsoleting soft-dirty in favor of > uffd-wp. If there are other soft-dirty users these concerns would be > relevant to them as well. > > With soft-dirty we collect the information

[LSF/MM TOPIC] NUMA remote THP vs NUMA local non-THP under MADV_HUGEPAGE

2019-01-29 Thread Andrea Arcangeli
Hello, I'd like to attend the LSF/MM Summit 2019. I'm interested in most MM topics and it's enlightening to listen to the common non-MM topics too. One current topic that could be of interest is the THP / NUMA tradeoff in subject. One issue about a change in MADV_HUGEPAGE behavior made ~3 years

Re: [PATCH 0/1] RFC: sched/fair: skip select_idle_sibling() in presence of sync wakeups

2019-01-09 Thread Andrea Arcangeli
On Wed, Jan 09, 2019 at 10:07:51AM +, Mel Gorman wrote: > I agree with Mike here. Many previous attempts to strictly obey the strict > hint has led to regressions elsewhere -- specifically a task waking 2+ > wakees that temporarily stack on one CPU when nearby CPUs sharing LLC sync-waking 2

Re: [PATCH 0/1] RFC: sched/fair: skip select_idle_sibling() in presence of sync wakeups

2019-01-09 Thread Andrea Arcangeli
Hello Mike, On Wed, Jan 09, 2019 at 05:19:48AM +0100, Mike Galbraith wrote: > On Tue, 2019-01-08 at 22:49 -0500, Andrea Arcangeli wrote: > > Hello, > > > > we noticed some unexpected performance regressions in the scheduler by > > switching the guest CPU topology fro

[PATCH 0/1] RFC: sched/fair: skip select_idle_sibling() in presence of sync wakeups

2019-01-08 Thread Andrea Arcangeli
} else { while (n--) { write(pipe1[1], buf, 1); read(pipe2[0], buf, 1); } } return 0; } Andrea Arcangeli (1): sched/fair: skip select_idle_sibling() in presence of sync wakeups kernel/sched/fair.c | 13 +++
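
The snippet above preserves only the tail of the test program from the cover letter. As a rough idea of what such a pipe ping-pong looks like, here is a minimal sketch; it assumes a fork() plus two pipes, and the function name pipe_pingpong() and the child-side echo loop are my own illustration, not the code from the original posting.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: bounce one byte between parent and child n times
 * over two pipes. Returns the number of completed round trips, or -1 if
 * the pipes or the child could not be set up.
 */
long pipe_pingpong(long n)
{
	int pipe1[2], pipe2[2];
	char buf[1] = { 0 };
	long done = 0;
	pid_t pid;

	assert(n >= 0);

	if (pipe(pipe1) < 0)
		return -1;
	if (pipe(pipe2) < 0) {
		close(pipe1[0]);
		close(pipe1[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(pipe1[0]);
		close(pipe1[1]);
		close(pipe2[0]);
		close(pipe2[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: wait for a byte on pipe1, echo it back on pipe2. */
		close(pipe1[1]);
		close(pipe2[0]);
		while (read(pipe1[0], buf, 1) == 1)
			if (write(pipe2[1], buf, 1) != 1)
				break;
		_exit(0);
	}

	/* Parent: same write/read pair as the loop quoted above. */
	close(pipe1[0]);
	close(pipe2[1]);
	while (done < n) {
		if (write(pipe1[1], buf, 1) != 1)
			break;
		if (read(pipe2[0], buf, 1) != 1)
			break;
		done++;
	}

	/* Closing the write end gives the child EOF so it exits. */
	close(pipe1[1]);
	close(pipe2[0]);
	waitpid(pid, NULL, 0);
	return done;
}
```

Every iteration blocks the writer until the other side answers, so each wakeup is effectively synchronous. Timing a large n under different CPU topologies would be one way to exercise the wakeup path the patch title refers to.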

[PATCH 1/1] sched/fair: skip select_idle_sibling() in presence of sync wakeups

2019-01-08 Thread Andrea Arcangeli
sed at 100% utilization and that increases performance for those common workloads. Signed-off-by: Andrea Arcangeli --- kernel/sched/fair.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index d1907506318a..b2ac152a6935 100644 --

Re: [PATCH V6 1/4] mm/cma: Add PF flag to force non cma alloc

2019-01-08 Thread Andrea Arcangeli
take a page pin by > migrating pages from CMA region. Marking the section PF_MEMALLOC_NOCMA ensures > that we avoid unnecessary page migration later. > > Suggested-by: Andrea Arcangeli > Signed-off-by: Aneesh Kumar K.V Reviewed-by: Andrea Arcangeli

[PATCH 0/1] mm/hugetlb.c: teach follow_hugetlb_page() to handle FOLL_NOWAIT

2019-01-08 Thread Andrea Arcangeli
reproduces it easily because it's a heavy user of VM_FAULT_RETRY retvals. Thanks, Andrea Andrea Arcangeli (1): mm/hugetlb.c: teach follow_hugetlb_page() to handle FOLL_NOWAIT mm/hugetlb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)

[PATCH 1/1] mm/hugetlb.c: teach follow_hugetlb_page() to handle FOLL_NOWAIT

2019-01-08 Thread Andrea Arcangeli
witch get_user_page_nowait() to get_user_pages_unlocked()") Signed-off-by: Andrea Arcangeli Tested-by: "Dr. David Alan Gilbert" Reported-by: "Dr. David Alan Gilbert" --- mm/hugetlb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/hugetlb.c b/m

Re: [PATCH V6 3/4] powerpc/mm/iommu: Allow migration of cma allocated pages during mm_iommu_get

2019-01-08 Thread Andrea Arcangeli
Hello, On Tue, Jan 08, 2019 at 10:21:09AM +0530, Aneesh Kumar K.V wrote: > @@ -187,41 +149,25 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, > unsigned long ua, > goto unlock_exit; > } > > + ret = get_user_pages_cma_migrate(ua, entries, 1, mem->hpages); In

Re: KASAN: use-after-free Read in handle_userfault (2)

2019-01-04 Thread Andrea Arcangeli
On Wed, Jan 02, 2019 at 02:37:58PM +0100, Dmitry Vyukov wrote: > If we are proceeding with "mm: some enhancements to the page fault > mechanism", that's good as it will eliminate at least part of this > output. Agreed. > There are 2 types of debug configs: ones add additional checks for >

Re: [PATCH v2] mm: page_mapped: don't assume compound page is huge or THP

2019-01-04 Thread Andrea Arcangeli
blob/master/kernel/page_mapped_crash/repro.c > > Fix the loop to iterate for "1 << compound_order" pages. > > Debugged-by: Laszlo Ersek > Suggested-by: "Kirill A. Shutemov" > Signed-off-by: Jan Stancek > --- > mm/util.c | 2 +- > 1 file change
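
The quoted changelog says the loop must cover "1 << compound_order" pages instead of assuming a fixed huge page size. A userspace model of that idea, under loud assumptions: struct compound_model and model_page_mapped() are hypothetical stand-ins for illustration only, and the plain "count > 0" test is a simplification, not the kernel's struct page or its mapcount convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a compound page; not the kernel's struct page. */
struct compound_model {
	unsigned int order;   /* the compound page spans 1 << order base pages */
	const int *mapcount;  /* one map count per base page */
};

/*
 * Return true if any base page of the compound page is mapped.
 * The bound is derived from the page's own order, so a compound page
 * that is smaller (or larger) than a THP is walked correctly.
 */
bool model_page_mapped(const struct compound_model *cp)
{
	size_t nr, i;

	assert(cp != NULL && cp->mapcount != NULL);

	nr = (size_t)1 << cp->order;
	for (i = 0; i < nr; i++)
		if (cp->mapcount[i] > 0)
			return true;
	return false;
}
```

With a hard-coded THP-sized bound, an order-2 compound page would be read far past its four entries, which is the class of bug the changelog describes.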

Re: KASAN: use-after-free Read in handle_userfault (2)

2018-12-30 Thread Andrea Arcangeli
Hello, On Sun, Dec 30, 2018 at 08:48:05AM +0100, Dmitry Vyukov wrote: > On Wed, Dec 12, 2018 at 10:58 AM Dmitry Vyukov wrote: > > > > On Wed, Dec 12, 2018 at 10:45 AM syzbot > > wrote: > > > > > > Hello, > > > > > > syzbot found the following crash on: > > > > > > HEAD commit:14cf8c1d5b90

Re: [PATCH 1/2] mm: vmscan: skip KSM page in direct reclaim if priority is low

2018-12-21 Thread Andrea Arcangeli
Hello Yang, On Thu, Dec 20, 2018 at 10:33:26PM -0800, Yang Shi wrote: > > > On 12/20/18 10:04 PM, Hugh Dickins wrote: > > On Thu, 20 Dec 2018, Andrew Morton wrote: > >> Is anyone interested in reviewing this? Seems somewhat serious. > >> Thanks. > > Somewhat serious, but no need to rush. > > >

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-12 Thread Andrea Arcangeli
On Wed, Dec 12, 2018 at 10:50:51AM +0100, Michal Hocko wrote: > I can be convinced that larger pages really require a different behavior > than base pages but you should better show _real_ numbers on a wider > variety workloads to back your claims. I have only heard hand waving and I agree with

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-12 Thread Andrea Arcangeli
Hello, I now found a two socket EPYC (is this Naples?) to try to confirm the THP effect of intra-socket THP. CPU(s): 128 On-line CPU(s) list: 0-127 Thread(s) per core: 2 Core(s) per socket: 32 Socket(s): 2 NUMA node(s): 8 NUMA node0 CPU(s):

Re: [PATCH] userfaultfd: clear flag if remap event not enabled

2018-12-10 Thread Andrea Arcangeli
e should not generate the remap event, and at the same > > time we should clear all the uffd flags on the new VMA. Without > > this patch, we can still have the VM_UFFD_MISSING|VM_UFFD_WP > > flags on the new VMA even the fault handling process does not > > even know the exista

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-09 Thread Andrea Arcangeli
Hello, On Sun, Dec 09, 2018 at 04:29:13PM -0800, David Rientjes wrote: > [..] on this platform, at least, hugepages are > preferred on the same socket but there isn't a significant benefit from > getting a cross socket hugepage over small page. [..] You didn't release the proprietary software

[PATCH 0/1] userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered

2018-12-06 Thread Andrea Arcangeli
. This should be applied on top of 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas") to shut off the false positive warning. Thanks, Andrea Andrea Arcangeli (1): userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered fs/userfau

[PATCH 1/1] userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered

2018-12-06 Thread Andrea Arcangeli
allow to register VM_MAYWRITE vmas") Reported-by: syzbot+06c7092e7d71218a2...@syzkaller.appspotmail.com Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index cd58939dc977..7a85e609f

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Andrea Arcangeli
On Wed, Dec 05, 2018 at 04:18:14PM -0800, David Rientjes wrote: > On Wed, 5 Dec 2018, Andrea Arcangeli wrote: > > > __GFP_COMPACT_ONLY gave an hope it could give some middle ground but > > it shows awful compaction results, it basically destroys compaction > > effec

Re: [patch 0/2 for-4.20] mm, thp: fix remote access and allocation regressions

2018-12-05 Thread Andrea Arcangeli
use transparent huge pages (THP) when transparent_hugepage/enabled=madvise. Otherwise THP is only used when it's enabled system wide. Signed-off-by: Luiz Capitulino Signed-off-by: Anthony Liguori Signed-off-by: Andrea Arcangeli --- exec.c | 1 + osdep.h | 5 + 2 files changed, 6 i
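
The changelog quoted above relies on THP being opt-in when transparent_hugepage/enabled=madvise. The patch body is not shown here, so the following is only a sketch of how a userspace program can opt a region in, assuming the usual madvise(MADV_HUGEPAGE) hint; map_with_thp_hint() is a hypothetical helper name, not something taken from the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory and ask the
 * kernel to back it with transparent huge pages.
 * Returns the mapping, or NULL if mmap() failed. *advised is set to 1
 * only if the madvise() hint was accepted.
 */
void *map_with_thp_hint(size_t len, int *advised)
{
	void *p;

	assert(advised != NULL);
	*advised = 0;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

#ifdef MADV_HUGEPAGE
	/*
	 * Only a hint: it can fail (for example on a kernel built without
	 * THP), and even on success the kernel may still use small pages.
	 */
	if (madvise(p, len, MADV_HUGEPAGE) == 0)
		*advised = 1;
#endif
	return p;
}
```

The mapping stays usable whether or not the hint is accepted, so callers should treat the advised flag as informational rather than as an error condition.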

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Andrea Arcangeli
Hello, On Wed, Dec 05, 2018 at 01:59:32PM -0800, David Rientjes wrote: > [..] and the kernel test robot has reported, [..] Just for completeness you may have missed one email: https://lkml.kernel.org/r/87tvk1yjkp@yhuang-dev.intel.com 'So I think the report should have been a "performance

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Andrea Arcangeli
On Wed, Dec 05, 2018 at 02:03:10PM -0800, Linus Torvalds wrote: > On Wed, Dec 5, 2018 at 12:40 PM Andrea Arcangeli wrote: > > > > So ultimately we decided that the saner behavior that gives the least > > risk of regression for the short term, until we can do something

Re: [patch 1/2 for-4.20] mm, thp: restore node-local hugepage allocations

2018-12-05 Thread Andrea Arcangeli
On Wed, Dec 05, 2018 at 09:15:28PM +0100, Michal Hocko wrote: > If the __GFP_THISNODE should be really used then it should be applied to > all other types of pages. Not only THP. And as such done in a separate > patch. Not a part of the revert. The cleanup was meant to unify THP > allocations and

Re: [patch 0/2 for-4.20] mm, thp: fix remote access and allocation regressions

2018-12-05 Thread Andrea Arcangeli
On Wed, Dec 05, 2018 at 11:49:26AM -0800, David Rientjes wrote: > High thp utilization is not always better, especially when those hugepages > are accessed remotely and introduce the regressions that I've reported. > Seeking high thp utilization at all costs is not the goal if it causes >

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Andrea Arcangeli
Hello, Sorry, it has been challenging to keep up with all fast replies, so I'll start by answering to the critical result below: On Tue, Dec 04, 2018 at 10:45:58AM +, Mel Gorman wrote: > thpscale Percentage Faults Huge >4.20.0-rc4 4.20.0-rc4 >

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-03 Thread Andrea Arcangeli
On Mon, Dec 03, 2018 at 11:28:07AM -0800, Linus Torvalds wrote: > On Mon, Dec 3, 2018 at 10:59 AM Michal Hocko wrote: > > > > You are misinterpreting my words. I haven't dismissed anything. I do > > recognize both usecases under discussion. > > > > I have merely said that a better THP locality

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-03 Thread Andrea Arcangeli
On Mon, Dec 03, 2018 at 07:59:54PM +0100, Michal Hocko wrote: > I have merely said that a better THP locality needs more work and during > the review discussion I have even volunteered to work on that. There > are other reclaim related fixes under work right now. All I am saying > is that

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-11-28 Thread Andrea Arcangeli
On Wed, Nov 28, 2018 at 08:48:46AM -0800, Linus Torvalds wrote: > On Tue, Nov 27, 2018 at 7:20 PM Huang, Ying wrote: > > > > From the above data, for the parent commit 3 processes exited within > > 14s, another 3 exited within 100s. For this commit, the first process > > exited at 203s. That

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-11-27 Thread Andrea Arcangeli
a "stable" state without introducing new (minor) features. The below is for further review of the potential alternative (which has still margin for improvement). === From: Andrea Arcangeli Subject: [PATCH 1/2] mm: thp: consolidate policy_nodemask call Just a minor cleanup. Signed-o

Re: [patch V2 27/28] x86/speculation: Add seccomp Spectre v2 user space protection mode

2018-11-26 Thread Andrea Arcangeli
Hello, On Sun, Nov 25, 2018 at 11:28:59PM +0100, Thomas Gleixner wrote: > Indeed. Just checked the documentation again, it's also not clear whether > IBPB is required if STIBP is in use. I tried to ask this question too earlier: https://lkml.kernel.org/r/20181119234528.gj29...@redhat.com If

[PATCH 1/5] userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails

2018-11-26 Thread Andrea Arcangeli
ultfd support") Signed-off-by: Andrea Arcangeli --- mm/hugetlb.c | 2 +- mm/shmem.c | 2 +- mm/userfaultfd.c | 6 +++--- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 7f2a28ab46d5..705a3e9cc910 100644 --- a/mm/hugetlb.c +++ b/mm

[PATCH 4/5] userfaultfd: shmem: add i_size checks

2018-11-26 Thread Andrea Arcangeli
("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support") Cc: sta...@vger.kernel.org Signed-off-by: Andrea Arcangeli --- mm/shmem.c | 18 -- mm/userfaultfd.c | 26 -- 2 files changed, 40 insertions(+), 4 deletions(-) diff

[PATCH 2/5] userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem

2018-11-26 Thread Andrea Arcangeli
. Reported-by: Mike Rapoport Reviewed-by: Hugh Dickins Cc: sta...@vger.kernel.org Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support") Signed-off-by: Andrea Arcangeli --- mm/userfaultfd.c | 15 +-- 1 file changed, 13 insertions(+), 2

[PATCH 0/5] userfaultfd shmem updates

2018-11-26 Thread Andrea Arcangeli
, Andrea Andrea Arcangeli (5): userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas userfaultfd: shmem: add i_size checks userfaultfd

[PATCH 5/5] userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set

2018-11-26 Thread Andrea Arcangeli
c_pte for userfaultfd support") Reported-by: Hugh Dickins Signed-off-by: Andrea Arcangeli --- mm/shmem.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/mm/shmem.c b/mm/shmem.c index c3ece7a51949..82a381d463bc 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2272,6 +2272,16 @@

[PATCH 3/5] userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas

2018-11-26 Thread Andrea Arcangeli
ernel.org Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 15 +++ mm/userfaultfd.c | 15 ++- 2 files changed, 21 insertions(+), 9 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 356d2b8568c1..cd58939dc977 100644 --- a/fs/userfaultfd.c ++

Re: [PATCH] mm: put_and_wait_on_page_locked() while page is migrated

2018-11-24 Thread Andrea Arcangeli
w treat a PageWaiters page as if an extra > reference were held? Perhaps, but I don't think it matters much, since > shrink_page_list() already had to win its trylock_page(), so waiters are > not very common there: I noticed no difference when trying the bigger > change, and it's surely not needed while put_and_wait_on_page_locked() > is only used for page migration. > > Reported-and-tested-by: Baoquan He > Signed-off-by: Hugh Dickins > Acked-by: Michal Hocko Reviewed-by: Andrea Arcangeli

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-20 Thread Andrea Arcangeli
On Tue, Nov 20, 2018 at 12:11:22PM +0300, Kirill A. Shutemov wrote: > On Sat, Nov 10, 2018 at 11:44:12AM -0500, Andrea Arcangeli wrote: > > I would prefer to add intelligence to detect when COWs after fork > > should be done at 2m or 4k granularity (in the latter case by > &

Re: [Patch v5 11/16] x86/speculation: Add Spectre v2 app to app protection modes

2018-11-19 Thread Andrea Arcangeli
On Mon, Nov 19, 2018 at 03:25:41PM -0800, Dave Hansen wrote: > On 11/19/18 3:16 PM, Andrea Arcangeli wrote: > > So you may want to ask why it wasn't written as your "any" vs "any" email: > > Presumably because the authors really and truly meant what they said.

Re: [Patch v5 11/16] x86/speculation: Add Spectre v2 app to app protection modes

2018-11-19 Thread Andrea Arcangeli
On Mon, Nov 19, 2018 at 01:33:08PM -0800, Dave Hansen wrote: > On 11/19/18 11:32 AM, Andrea Arcangeli wrote: > > The specs don't say if by making it immune from BTB mistraining, it > > also could prevent to mistrain the BTB in order to attack what's > > outside the SECCOMP ja

Re: [Patch v5 11/16] x86/speculation: Add Spectre v2 app to app protection modes

2018-11-19 Thread Andrea Arcangeli
On Mon, Nov 19, 2018 at 08:39:41PM +0100, Jiri Kosina wrote: > On Mon, 19 Nov 2018, Andrea Arcangeli wrote: > > > Generally speaking the untrusted code that would try to use spectrev2 > > to attack the other processes is more likely to run inside SECCOMP > > jail tha

Re: [Patch v5 11/16] x86/speculation: Add Spectre v2 app to app protection modes

2018-11-19 Thread Andrea Arcangeli
Hello everyone, On Mon, Nov 19, 2018 at 02:49:36PM +0100, Jiri Kosina wrote: > On Mon, 19 Nov 2018, Thomas Gleixner wrote: > > > > On Sat, 17 Nov 2018, Jiri Kosina wrote: > > > > > Subject: [PATCH] x86/speculation: enforce STIBP for SECCOMP tasks in lite > > > mode > > > > > > If 'lite' mode

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-10 Thread Andrea Arcangeli
On Sat, Nov 10, 2018 at 01:22:49PM +, Mel Gorman wrote: > On Fri, Nov 09, 2018 at 02:51:50PM -0500, Andrea Arcangeli wrote: > > And if you're in the camp that is concerned about the use of more RAM > > or/and about the higher latency of COW faults, I'm afraid the > >

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-09 Thread Andrea Arcangeli
Hello, On Fri, Nov 09, 2018 at 03:13:18PM +0300, Kirill A. Shutemov wrote: > On Thu, Nov 08, 2018 at 10:48:58PM -0800, Anthony Yznaga wrote: > > The basic idea as outlined by Mel Gorman in [2] is: > > > > 1) On first fault in a sufficiently sized range, allocate a huge page > >sized and

Re: [PATCH 1/2] mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings

2018-10-29 Thread Andrea Arcangeli
Hello, On Mon, Oct 29, 2018 at 11:08:34AM +0100, Michal Hocko wrote: > This seems like a separate issue which should better be debugged. Please > open a new thread describing the problem and the state of the node. Yes, in my view it should be evaluated separately too, because it's overall less

Re: [PATCH] hugetlbfs: dirty pages as they are added to pagecache

2018-10-18 Thread Andrea Arcangeli
On Thu, Oct 18, 2018 at 04:16:40PM -0700, Mike Kravetz wrote: > I was not sure about this, and expected someone could come up with > something better. It just seems there are filesystems like hugetlbfs, > where it makes no sense wasting cycles traversing the filesystem. So, > let's not even try.

Re: [PATCH v3 2/2] sysctl: handle overflow for file-max

2018-10-18 Thread Andrea Arcangeli
Hi Al, On Wed, Oct 17, 2018 at 01:35:48AM +0100, Al Viro wrote: > On Wed, Oct 17, 2018 at 12:33:22AM +0200, Christian Brauner wrote: > > Currently, when writing > > > > echo 18446744073709551616 > /proc/sys/fs/file-max > > > > /proc/sys/fs/file-max will overflow and be set to 0. That quickly >
