Re: [RFC PATCH] mm: align anon mmap for THP

2019-01-14 Thread Kirill A. Shutemov
ers. Users may still want to get a specific range as THP (to avoid false sharing or something). But I still believe userspace has all the required tools to get it right. -- Kirill A. Shutemov
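As an illustration of the "userspace has the tools" point, here is a minimal sketch of how a program could obtain a huge-page-aligned anonymous range on its own. It is not taken from the thread: the helper name mmap_thp_aligned() and the 2 MiB HPAGE_SIZE constant are assumptions (2 MiB is the PMD size on x86-64 with 4 KiB base pages), and madvise(MADV_HUGEPAGE) is only a hint that the kernel may ignore.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumed THP size: 2 MiB (PMD size on x86-64 with 4 KiB base pages). */
#define HPAGE_SIZE ((size_t)2 << 20)

/*
 * Hypothetical helper: map at least `len` bytes of anonymous memory whose
 * start address is HPAGE_SIZE-aligned. `len` is rounded up to a multiple
 * of HPAGE_SIZE. Returns NULL on failure.
 */
static void *mmap_thp_aligned(size_t len)
{
	assert(len > 0);
	len = (len + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);

	/* Over-allocate by one huge page so an aligned start must exist. */
	size_t padded = len + HPAGE_SIZE;
	uint8_t *raw = mmap(NULL, padded, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;

	uintptr_t start = ((uintptr_t)raw + HPAGE_SIZE - 1) &
			  ~((uintptr_t)HPAGE_SIZE - 1);
	size_t head = start - (uintptr_t)raw;
	size_t tail = padded - head - len;

	/* Trim the unaligned head and the unused tail. */
	if (head)
		munmap(raw, head);
	if (tail)
		munmap((uint8_t *)start + len, tail);

	/* Advisory only: the result is ignored if THP is unavailable. */
	(void)madvise((void *)start, len, MADV_HUGEPAGE);
	return (void *)start;
}
```

The trade-off is extra address-space churn per mapping (one mmap plus up to two munmap calls), which the application pays only for the ranges it actually wants backed by huge pages.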

Re: [RFC PATCH] mm: align anon mmap for THP

2019-01-11 Thread Kirill A. Shutemov
ion (increases the number of VMAs and therefore page fault cost). I think any change in this direction has to be way more data driven. -- Kirill A. Shutemov

Re: [PATCH] mm/mincore: allow for making sys_mincore() privileged

2019-01-08 Thread Kirill A. Shutemov
that makes the page temporarily unmapped. For instance migration (including NUMA balancing), compaction, khugepaged... -- Kirill A. Shutemov

Re: kernel BUG at mm/huge_memory.c:LINE!

2019-01-08 Thread Kirill A. Shutemov
8 EFLAGS: 0246 ORIG_RAX: 0001
> <4>[220723.488851] RAX: ffda RBX: 7fe785e0 RCX: 7fe7a60d8330
> <4>[220723.489511] RDX: 11be3b28 RSI: 7fe785e0 RDI: 0003
> <4>[220723.490156] RBP: 11be3b28 R08: 7fe7a69097c0 R09: 7ffc05104bb7
> <4>[220723.490770] R10: 7ffc051048c0 R11: 0246 R12: 000107b88c00
> <4>[220723.491364] R13: 7ffc051057b8 R14: 11be3b28 R15: 0001072886b8
> <4>[220723.491933] Code: 8b 54 24 08 48 c7 c7 28 54 06 82 e8 51 59 eb ff 49 8b 45 20 a8 01 0f 85 1b 01 00 00 48 c7 c6 8c 50 06 82 4c 89 ef e8 cb 64 fb ff <0f> 0b 48 c7 c6 60 53 06 82 4c 89 ff e8 ba 64 fb ff 0f 0b e8 23
> <1>[220723.493137] RIP: split_huge_page_to_list+0x7b5/0x8d0 RSP: c900311d76a0
> <4>[220723.493731] ---[ end trace 74a2900540d3546c ]---
> <5>[220723.494322] ---[ now 2018-12-24 07:08:37+03 ]---
>
> > ---
> > This bug is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkal...@googlegroups.com.
> >
> > syzbot will keep track of this bug report. See:
> > https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> > syzbot.
-- Kirill A. Shutemov

Re: KASAN: use-after-free Read in filemap_fault

2018-12-28 Thread Kirill A. Shutemov
count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
 		ret = VM_FAULT_MAJOR;
+		fpin = do_sync_mmap_readahead(vmf);
 retry_find:
 		page = pagecache_get_page(mapping, offset,
 					  FGP_CREAT|FGP_FOR_MMAP,
-- Kirill A. Shutemov

Re: KASAN: use-after-free Read in filemap_fault

2018-12-28 Thread Kirill A. Shutemov
On Sat, Dec 29, 2018 at 01:01:52AM +0300, Kirill A. Shutemov wrote: > On Fri, Dec 28, 2018 at 12:51:04PM -0800, syzbot wrote: > > Allocated by task 8196: > > ... > > > Freed by task 8197: > > Hm. VMA allocated by one process (I don't see threads in the test cas

Re: KASAN: use-after-free Read in filemap_fault

2018-12-28 Thread Kirill A. Shutemov
On Fri, Dec 28, 2018 at 12:51:04PM -0800, syzbot wrote: > Allocated by task 8196: ... > Freed by task 8197: Hm. VMA allocated by one process (I don't see threads in the test case) gets freed by another one. Looks fishy to me. -- Kirill A. Shutemov

Re: [PATCH v3 0/2] hugetlbfs: use i_mmap_rwsem for better synchronization

2018-12-24 Thread Kirill A. Shutemov
aring synchronization > hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race Looks good to me. Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH v2 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-22 Thread Kirill A. Shutemov
ce there will be no contention
> * on the semaphore, overhead is negligible.
> */
> i_mmap_lock_write(mapping);
> remove_inode_hugepages(inode, 0, LLONG_MAX);
> i_mmap_unlock_write(mapping);
LGTM.
-- Kirill A. Shutemov

Re: [PATCH v2 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-21 Thread Kirill A. Shutemov
On Fri, Dec 21, 2018 at 10:28:25AM -0800, Mike Kravetz wrote: > On 12/21/18 2:28 AM, Kirill A. Shutemov wrote: > > On Tue, Dec 18, 2018 at 02:35:57PM -0800, Mike Kravetz wrote: > >> Instead of writing the required complicated code for this rare > >> occurren

Re: [PATCH] x86/cpu: sort cpuinfo flags

2018-12-21 Thread Kirill A. Shutemov
On Fri, Dec 21, 2018 at 02:04:03PM +0100, Borislav Petkov wrote: > On Fri, Dec 21, 2018 at 03:40:37PM +0300, Kirill A. Shutemov wrote: > > But I don't see an improvement in readability of data presented to user as > > a silly idea. > > Improving readability is not a silly

Re: [PATCH] x86/cpu: sort cpuinfo flags

2018-12-21 Thread Kirill A. Shutemov
On Thu, Dec 20, 2018 at 04:04:11PM +, Borislav Petkov wrote: > On Thu, Dec 20, 2018 at 03:02:41PM +0300, Kirill A. Shutemov wrote: > > Below is my attempt on doing the same. The key difference is that the > > sorted array is generated at compile-time by mkcapflags.sh. >

Re: [PATCH] ubifs: Get/put page when changing PG_private

2018-12-21 Thread Kirill A. Shutemov
=2 > > Until this regression is not fully understood, including the implications > for UBIFS, I'll not merge this patch. This looks like a reasonable resolution to me: http://lkml.kernel.org/r/20181221093919.ga2...@lst.de But let's wait for its inclusion (or an objection). -- Kirill A. Shutemov

Re: [PATCH v2 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-21 Thread Kirill A. Shutemov
n truncation and hole punch code to cover the call to > remove_inode_hugepages. One of the remove_inode_hugepages() callers is noticeably missing -- hugetlbfs_evict_inode(). Why? It at least deserves a comment on why the lock rule doesn't apply to it. -- Kirill A. Shutemov

Re: [PATCH v2 1/2] hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization

2018-12-21 Thread Kirill A. Shutemov
mented? It *looks* correct to me, but it's better to write it down somewhere. Maybe add it to the header of mm/rmap.c?
> ret = handle_userfault(, VM_UFFD_MISSING);
> +
> + i_mmap_lock_read(mapping);
> mutex_lock(_fault_mutex_table[hash]);
> goto out;
> }
-- Kirill A. Shutemov

Re: [PATCH] x86/cpu: sort cpuinfo flags

2018-12-20 Thread Kirill A. Shutemov
; i < 32*NBUGINTS; i++) {
> - unsigned int bug_bit = 32*NCAPINTS + i;
> + unsigned int bug_bit = x86_NR_CAPS + i;
s/x86_NR_CAPS/X86_NR_CAPS/
>
> if (cpu_has_bug(c, bug_bit) && x86_bug_flags[i])
> seq_printf(m, " %s", x86_bug_flags[i]);
> _
-- Kirill A. Shutemov

Re: [PATCH] x86/cpu: sort cpuinfo flags

2018-12-20 Thread Kirill A. Shutemov
break;
+
+		bug = bug_bit - 32*NCAPINTS;
+
+		if (cpu_has_bug(c, bug_bit) && x86_bug_flags[bug])
+			seq_printf(m, " %s", x86_bug_flags[bug]);
+	}
+}
+
 static int show_cpuinfo(struct seq_file *m, void *v)
 {
 	struct cpuinfo_x86 *c = v;
@@ -96,19 +141,8 @@ static int show_cpuinfo(struct seq_file *m, void *v)
 	show_cpuinfo_core(m, c, cpu);
 	show_cpuinfo_misc(m, c);
-
-	seq_puts(m, "flags\t\t:");
-	for (i = 0; i < 32*NCAPINTS; i++)
-		if (cpu_has(c, i) && x86_cap_flags[i] != NULL)
-			seq_printf(m, " %s", x86_cap_flags[i]);
-
-	seq_puts(m, "\nbugs\t\t:");
-	for (i = 0; i < 32*NBUGINTS; i++) {
-		unsigned int bug_bit = 32*NCAPINTS + i;
-
-		if (cpu_has_bug(c, bug_bit) && x86_bug_flags[i])
-			seq_printf(m, " %s", x86_bug_flags[i]);
-	}
+	show_cpuinfo_flags(m, c);
+	show_cpuinfo_bugs(m, c);
 	seq_printf(m, "\nbogomips\t: %lu.%02lu\n",
 		   c->loops_per_jiffy/(50/HZ),
-- Kirill A. Shutemov

Re: [PATCH] mm: Remove __hugepage_set_anon_rmap()

2018-12-17 Thread Kirill A. Shutemov
ned-off-by: Kirill Tkhai Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH] ubifs: Get/put page when changing PG_private

2018-12-17 Thread Kirill A. Shutemov
. > The lead to situations where the CMA memory allocator failed to > allocate memory. > > Fix this by using get/put_page when changing PG_private. Looks good to me. Acked-by: Kirill A. Shutemov > Cc: > Cc: zhangjun > Fixes: 4ac1c17b2044 ("UBIFS: Implement ->migratepage()"

Re: [PATCH v2] ubifs: fix page_count in ->ubifs_migrate_page()

2018-12-13 Thread Kirill A. Shutemov
if (rc != MIGRATEPAGE_SUCCESS) > > return rc; > > > > Let's wait a few days to give Kirill a chance to review, then I'll apply the > patch. I don't remember much context now... Could you remind me why ubifs doesn't take an additional pin when it sets PG_private? Migration is not the only place where the additional pin is implied. See all users of the page_has_private() helper. Notably the reclaim path. -- Kirill A. Shutemov

Re: [PATCH v3] mm, memcg: fix reclaim deadlock with writeback

2018-12-13 Thread Kirill A. Shutemov
hidden __GFP_ACCOUNT | GFP_KERNEL allocations > from under a fs page locked but they should be really rare. I am not > aware of a better solution unfortunately. > > Reported-and-Debugged-by: Liu Bo > Cc: stable > Fixes: c3b94f44fcb0 ("memcg: further prevent OOM with too many dirty pages") > Signed-off-by: Michal Hocko Acked-by: Kirill A. Shutemov Will you take care of converting vmf_insert_* to use the pre-allocated page table? -- Kirill A. Shutemov

Re: [PATCH v3] mm: thp: fix flags for pmd migration when split

2018-12-13 Thread Kirill A. Shutemov
n > the migrating pmd pages (on x86_64 we're fetching bit 11 which is part > of swap offset instead of bit 2) and it could potentially corrupt the > memory of an userspace program which depends on the dirty bit. > > CC: Andrea Arcangeli > CC: Andrew Morton > CC: "Kirill A

Re: [PATCH] mm, memcg: fix reclaim deadlock with writeback

2018-12-12 Thread Kirill A. Shutemov
to find a better solution too. But I cannot say I like it. I think we need to spend more time on making ->prealloc_pte useful: looks like it would help to convert vmf_insert_* helpers to take struct vm_fault * as input and propagate it down to pmd population point. Otherwise DAX and drivers would allocate the page table for nothing. Have you considered whether we need anything similar for the anon path? Is it possible to have a similar deadlock with swapping rather than writeback? -- Kirill A. Shutemov

Re: [PATCH] mm, memcg: fix reclaim deadlock with writeback

2018-12-12 Thread Kirill A. Shutemov
A deadlock. Side note: Do we have PG_writeback vs. PG_locked ordering documented somewhere? IIUC, the trace from task2 suggests that we must not wait for writeback on the locked page. But that's not what I see for many wait_on_page_writeback() users: it is usually called with the page locked. I see it for truncate, shmem, swapfile, splice... Maybe the problem is within task2 codepath after all? -- Kirill A. Shutemov

Re: [PATCH] mm, memcg: fix reclaim deadlock with writeback

2018-12-11 Thread Kirill A. Shutemov
The trick with ->prealloc_pte works for faultaround because we can rely on ->map_pages() to not sleep and we know how it will set up the page table entry. Basically, core controls most of the path. It's not the case with ->fault(). It is free to sleep and allocate whatever it wants. For instance, DAX page fault will set up the page table entry on its own and return VM_FAULT_NOPAGE. It uses vmf_insert_mixed() to set up the page table and ignores your pre-allocated page table. But it's just an example. The problem is that ->fault() is not bounded in what it can do, unlike ->map_pages(). -- Kirill A. Shutemov

[tip:x86/urgent] x86/mm: Fix guard hole handling

2018-12-11 Thread tip-bot for Kirill A. Shutemov
Commit-ID: 16877a5570e0c5f4270d5b17f9bab427bcae9514 Gitweb: https://git.kernel.org/tip/16877a5570e0c5f4270d5b17f9bab427bcae9514 Author: Kirill A. Shutemov AuthorDate: Fri, 30 Nov 2018 23:23:27 +0300 Committer: Thomas Gleixner CommitDate: Tue, 11 Dec 2018 11:19:24 +0100 x86/mm: Fix

[tip:x86/urgent] x86/dump_pagetables: Fix LDT remap address marker

2018-12-11 Thread tip-bot for Kirill A. Shutemov
Commit-ID: 254eb5505ca0ca749d3a491fc6668b6c16647a99 Gitweb: https://git.kernel.org/tip/254eb5505ca0ca749d3a491fc6668b6c16647a99 Author: Kirill A. Shutemov AuthorDate: Fri, 30 Nov 2018 23:23:28 +0300 Committer: Thomas Gleixner CommitDate: Tue, 11 Dec 2018 11:19:24 +0100 x86

Re: [PATCH] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support

2018-12-10 Thread Kirill A. Shutemov
efine OBJ_INDEX_MASK ((_AC(1, UL) << OBJ_INDEX_BITS) - 1) Have you tested it with CONFIG_X86_5LEVEL=y? AFAICS, the patch makes OBJ_INDEX_BITS and what depends on it dynamic -- it depends on which paging mode we are booting in. ZS_SIZE_CLASSES depends indirectly on OBJ_INDEX_BITS and I don't see how struct zs_pool definition can compile with dynamic ZS_SIZE_CLASSES. Hm? -- Kirill A. Shutemov

Re: [PATCH] selftests/vm/gup_benchmark.c: match gup struct to kernel

2018-12-10 Thread Kirill A. Shutemov
ils with EINVAL. > > Signed-off-by: Alison Schofield Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [patch for-4.20] Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"

2018-12-10 Thread Kirill A. Shutemov
s and I don't feel that my opinion should have much weight here. Do not gate it on me. (I do follow the discussion, but I don't have anything meaningful to contribute so far.) -- Kirill A. Shutemov

Re: [PATCHv2 0/2] Fixups for LDT remap placement change

2018-12-10 Thread Kirill A. Shutemov
On Fri, Nov 30, 2018 at 08:23:26PM +, Kirill A. Shutemov wrote: > There's a couple fixes for the recent LDT remap placement change. Ping? -- Kirill A. Shutemov

Re: [PATCH v2] mm: page_mapped: don't assume compound page is huge or THP

2018-12-03 Thread Kirill A. Shutemov
k PAGE_SIZE, compound_order() == 31 means 8 TiB pages. I doubt we will see such allocation requests any time soon. Even with 1k base page size, it's still 2 TiB. We will see other limitations in the page allocation path before the compound order type becomes an issue. -- Kirill A. Shutemov
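The arithmetic behind those figures can be checked with a short sketch. The snippet above is truncated, so the 4 KiB base page size used here is an assumption (it is the value consistent with the quoted 8 TiB); compound_page_bytes() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of a compound page of the given
 * order, i.e. base_page_size * 2^order.
 */
static uint64_t compound_page_bytes(unsigned int order, uint64_t base_page_size)
{
	assert(order < 64);
	return base_page_size << order;
}
```

With order 31 this gives 2^31 * 2^12 = 2^43 bytes (8 TiB) for 4 KiB base pages and 2^31 * 2^10 = 2^41 bytes (2 TiB) for 1 KiB base pages, matching the numbers in the message.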

Re: [PATCHv3 1/3] x86/mm: Move LDT remap out of KASLR region on 5-level paging

2018-12-03 Thread Kirill A. Shutemov
tity (direct mapping, LDT remap, whatever). It's too fragile. -- Kirill A. Shutemov

Re: [PATCH v2 0/5] x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()

2018-12-03 Thread Kirill A. Shutemov
86/mm: Validate kernel_physical_mapping_init() pte population > x86/mm: Drop usage of __flush_tlb_all() in > kernel_physical_mapping_init() Looks good to me. For the whole patchset: Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH v2] mm: page_mapped: don't assume compound page is huge or THP

2018-11-30 Thread Kirill A. Shutemov
On Fri, Nov 30, 2018 at 01:45:46PM +0100, Michal Hocko wrote: > On Fri 30-11-18 15:36:51, Kirill A. Shutemov wrote: > > On Fri, Nov 30, 2018 at 01:18:51PM +0100, Michal Hocko wrote: > > > On Fri 30-11-18 13:06:57, Jan Stancek wrote: > > > > LTP proc01 testcase has b

Re: [PATCH v2] mm: page_mapped: don't assume compound page is huge or THP

2018-11-30 Thread Kirill A. Shutemov
> > This is much less magic than the previous version. It is still not clear > to me how is mapping higher order pages to page tables other than THP > though. So a more detailed information about the source would be really > welcome. Once we know that we can add a Fixes tag and als

Re: [PATCH 1/2] x86/mm: Fix guard hole handling

2018-11-30 Thread Kirill A. Shutemov
On Fri, Nov 30, 2018 at 12:03:33PM +, Juergen Gross wrote: > On 30/11/2018 12:57, Kirill A. Shutemov wrote: > > There is a guard hole at the beginning of kernel address space, also > > used by hypervisors. It occupies 16 PGD entries. > > > > We do not state

[PATCH 0/2] Fixups for LDT remap placement change

2018-11-30 Thread Kirill A. Shutemov
There's a couple fixes for the recent LDT remap placement change. The first patch fixes a crash when the kernel is booted as Xen dom0. The second patch fixes address space markers in dump_pagetables output. It's a purely cosmetic change; backporting to the stable tree is optional. Kirill A. Shutemov (2

[PATCH 1/2] x86/mm: Fix guard hole handling

2018-11-30 Thread Kirill A. Shutemov
of kernel memory layout. [1] https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03313.html Signed-off-by: Kirill A. Shutemov Reported-by: Hans van Kranenburg Fixes: d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-level paging") --- arch/x86/i

[PATCH 2/2] x86/dump_pagetables: Fix LDT remap address marker

2018-11-30 Thread Kirill A. Shutemov
The LDT remap placement has been changed. It's now placed before direct mapping in the kernel virtual address space for both paging modes. Change address markers order accordingly. Signed-off-by: Kirill A. Shutemov Fixes: d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-

Re: [PATCH] mm: page_mapped: don't assume compound page is huge or THP

2018-11-30 Thread Kirill A. Shutemov
Thanks for catching this. But I think the right fix would be to change the loop condition: for (i = 0; i < (1 << compound_order(page)); i++) { Non-THP compound pages can also be mapped and we need to check the mapcount of subpages. Any objections? If not, please update the patch. -- Kirill A. Shutemov
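The proposed loop bound can be illustrated with a small userspace model. This is a sketch only: struct toy_compound_page and toy_page_mapped() are hypothetical stand-ins, not the kernel's struct page or page_mapped(), and the model uses plain non-negative counts rather than the kernel's mapcount encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a compound page; NOT the kernel's struct page. */
struct toy_compound_page {
	unsigned int order;   /* page spans 1 << order base pages */
	const int *mapcount;  /* per-subpage map counts, 1 << order entries */
};

/*
 * Walk every subpage, using the compound order as the loop bound
 * instead of assuming a fixed huge-page size.
 */
static bool toy_page_mapped(const struct toy_compound_page *page)
{
	assert(page->order < 31);
	for (size_t i = 0; i < ((size_t)1 << page->order); i++) {
		if (page->mapcount[i] > 0)
			return true;
	}
	return false;
}
```

Bounding the loop by the order means an order-2 compound page is scanned over exactly four subpages, so a mapping of the last subpage is still detected and nothing past the end of the page is read.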

Re: [PATCH] mm: warn only once if page table misaccounting is detected

2018-11-27 Thread Kirill A. Shutemov
and the console with > hardly any added information. Therefore print the warning only once. > > Cc: Kirill A. Shutemov > Cc: Martin Schwidefsky > Signed-off-by: Heiko Carstens Fair enough. Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH 3/3] s390/mm: fix mis-accounting of pgtable_bytes

2018-11-27 Thread Kirill A. Shutemov
On Tue, Nov 27, 2018 at 08:34:12AM +0100, Heiko Carstens wrote: > On Wed, Oct 31, 2018 at 01:36:23PM +0300, Kirill A. Shutemov wrote: > > On Wed, Oct 31, 2018 at 11:09:44AM +0100, Heiko Carstens wrote: > > > On Wed, Oct 31, 2018 at 07:31:49AM +0100, Martin Schwidefsky wro

Re: [PATCH 2/2] page cache: Store only head pages in i_pages

2018-11-23 Thread Kirill A. Shutemov
On Fri, Nov 23, 2018 at 09:19:00AM -0800, Matthew Wilcox wrote: > On Fri, Nov 23, 2018 at 01:56:44PM +0300, Kirill A. Shutemov wrote: > > On Thu, Nov 22, 2018 at 01:32:24PM -0800, Matthew Wilcox wrote: > > > Transparent Huge Pages are currently stored in i_pages as pointers to

Re: [PATCHv3 1/3] x86/mm: Move LDT remap out of KASLR region on 5-level paging

2018-11-23 Thread Kirill A. Shutemov
it used as parameter to compiler. That's the reason KASAN area alignment looks strange. A possibly better solution would be to actually include LDT in KASLR: randomize the area along with direct mapping, vmalloc and vmemmap. But it's more complexity than I found reasonable for a fix. Do you want to try this? :) -- Kirill A. Shutemov

Re: [PATCH 2/2] page cache: Store only head pages in i_pages

2018-11-23 Thread Kirill A. Shutemov
efficiently in i_pages. I probably miss something, I don't see how it wouldn't break split_huge_page(). I don't see what would replace head pages in i_pages with formerly-tail-pages? Hm? -- Kirill A. Shutemov

Re: [PATCH 1/2] mm: Remove redundant test from find_get_pages_contig

2018-11-23 Thread Kirill A. Shutemov
ter we've > got the reference, they can change after we return the page to the caller. Hm. IIRC, page->mapping can be set to NULL due to truncation, but what about index? When can it be changed? Truncation doesn't touch it. -- Kirill A. Shutemov

Re: [LKP] dd2283f260 [ 97.263072] WARNING:at_kernel/locking/lockdep.c:#lock_downgrade

2018-11-21 Thread Kirill A. Shutemov
On Wed, Nov 21, 2018 at 08:35:28AM +0800, Yang Shi wrote: > > > On 11/20/18 9:42 PM, Kirill A. Shutemov wrote: > > On Tue, Nov 20, 2018 at 08:10:51PM +0800, Yang Shi wrote: > > > > > > On 11/20/18 4:57 PM, Kirill A. Shutemov wrote: > > > > On Fr

Re: [RFC PATCH 3/3] mm, fault_around: do not take a reference to a locked page

2018-11-20 Thread Kirill A. Shutemov
On Tue, Nov 20, 2018 at 03:12:07PM +0100, Michal Hocko wrote: > On Tue 20-11-18 17:07:15, Kirill A. Shutemov wrote: > > On Tue, Nov 20, 2018 at 02:43:23PM +0100, Michal Hocko wrote: > > > From: Michal Hocko > > > > > > filemap_map_pages takes

Re: [RFC PATCH 3/3] mm, fault_around: do not take a reference to a locked page

2018-11-20 Thread Kirill A. Shutemov
line of comment in the code. As is it might be confusing for a reader. -- Kirill A. Shutemov

Re: [LKP] dd2283f260 [ 97.263072] WARNING:at_kernel/locking/lockdep.c:#lock_downgrade

2018-11-20 Thread Kirill A. Shutemov
On Tue, Nov 20, 2018 at 08:10:51PM +0800, Yang Shi wrote: > > > On 11/20/18 4:57 PM, Kirill A. Shutemov wrote: > > On Fri, Nov 16, 2018 at 08:56:04AM -0800, Yang Shi wrote: > > > > a8dda165ec vfree: add debug might_sleep() > > > > dd2283f260 mm: mmap:

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-20 Thread Kirill A. Shutemov
ver THP back after multiple 4k COW in the range. Currently khugepaged is not able to collapse PTE entries backed by compound page back to PMD. I have had this on my todo list for a long time, but... -- Kirill A. Shutemov

Re: [LKP] dd2283f260 [ 97.263072] WARNING:at_kernel/locking/lockdep.c:#lock_downgrade

2018-11-20 Thread Kirill A. Shutemov
nk* we need to understand more about what detached VMAs mean for rmap. The anon_vma for these VMAs is still reachable for the rmap and therefore the VMA is too. I don't quite grasp what the implications of this are, but it doesn't look good. I'll look into this more when I get some free cycles. It's better to disable the optimization for now (by ignoring 'downgrade' in __do_munmap()). Before it hits release. -- Kirill A. Shutemov

Re: "x86/mm: Introduce the 'no5lvl' kernel parameter" broke SETUP_DTB?

2018-11-19 Thread Kirill A. Shutemov
le-missing-cells for /testcase-dat1 [2.801606] OF: /testcase-data/phandle-tests/consumer-b: could not find phandle [2.801842] OF: /testcase-data/phandle-tests/consumer-b: arguments longer than property [2.811520] ### dt-test ### end of unittest - 162 passed, 0 failed I've checked 372fddf70904 and v4.19. I don't see a difference compared to v4.17. Were you able to track down the issue? -- Kirill A. Shutemov

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-10 Thread Kirill A. Shutemov
On Fri, Nov 09, 2018 at 10:34:07AM -0500, Zi Yan wrote: > On 9 Nov 2018, at 8:11, Mel Gorman wrote: > > > On Fri, Nov 09, 2018 at 03:13:18PM +0300, Kirill A. Shutemov wrote: > >> On Thu, Nov 08, 2018 at 10:48:58PM -0800, Anthony Yznaga wrote: > >>> The basic ide

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-09 Thread Kirill A. Shutemov
it's no-go to me. Prove me wrong with performance data. :) -- Kirill A. Shutemov

[tip:x86/urgent] x86/ldt: Remove unused variable in map_ldt_struct()

2018-11-06 Thread tip-bot for Kirill A. Shutemov
Commit-ID: b082f2dd80612015cd6d9d84e52099734ec9a0e1 Gitweb: https://git.kernel.org/tip/b082f2dd80612015cd6d9d84e52099734ec9a0e1 Author: Kirill A. Shutemov AuthorDate: Fri, 26 Oct 2018 15:28:56 +0300 Committer: Thomas Gleixner CommitDate: Tue, 6 Nov 2018 21:35:11 +0100 x86/ldt: Remove

[tip:x86/urgent] x86/ldt: Unmap PTEs for the slot before freeing LDT pages

2018-11-06 Thread tip-bot for Kirill A. Shutemov
Commit-ID: a0e6e0831c516860fc7f9be1db6c081fe902ebcf Gitweb: https://git.kernel.org/tip/a0e6e0831c516860fc7f9be1db6c081fe902ebcf Author: Kirill A. Shutemov AuthorDate: Fri, 26 Oct 2018 15:28:55 +0300 Committer: Thomas Gleixner CommitDate: Tue, 6 Nov 2018 21:35:11 +0100 x86/ldt: Unmap

[tip:x86/urgent] x86/mm: Move LDT remap out of KASLR region on 5-level paging

2018-11-06 Thread tip-bot for Kirill A. Shutemov
Commit-ID: d52888aa2753e3063a9d3a0c9f72f94aa9809c15 Gitweb: https://git.kernel.org/tip/d52888aa2753e3063a9d3a0c9f72f94aa9809c15 Author: Kirill A. Shutemov AuthorDate: Fri, 26 Oct 2018 15:28:54 +0300 Committer: Thomas Gleixner CommitDate: Tue, 6 Nov 2018 21:35:11 +0100 x86/mm: Move LDT

Re: [PATCH] mremap: properly flush TLB before releasing the page

2018-11-02 Thread Kirill A. Shutemov
On Fri, Nov 02, 2018 at 04:00:17PM +0100, Jann Horn wrote: > On Fri, Nov 2, 2018 at 3:56 PM Kirill A. Shutemov > wrote: > > On Fri, Nov 02, 2018 at 01:22:42PM +, Will Deacon wrote: > > > From: Linus Torvalds > > > > > > Commit eb66ae030829605d61fbe

Re: [PATCH] mremap: properly flush TLB before releasing the page

2018-11-02 Thread Kirill A. Shutemov
old_addr, old_pmd);
> @@ -224,10 +238,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
>  extent = LATENCY_LIMIT;
>  move_ptes(vma, old_pmd, old_addr, old_addr + extent,
>  new_vma, new_pmd, new_addr, need_rmap_locks);
> - need_flush = true;
>  }
> - if (likely(need_flush))
> - flush_tlb_range(vma, old_end-len, old_addr);
>
>  mmu_notifier_invalidate_range_end(vma->vm_mm, mmun_start, mmun_end);
>
> --
> 2.1.4
>
-- Kirill A. Shutemov

Re: [PATCH 2/4] mm: introduce mm_[p4d|pud|pmd]_folded

2018-10-31 Thread Kirill A. Shutemov
On Wed, Oct 31, 2018 at 01:59:59PM +0100, Martin Schwidefsky wrote: > Add three architecture overrideable functions to test if the > p4d, pud, or pmd layer of a page table is folded or not. > > Signed-off-by: Martin Schwidefsky Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH 1/4] mm: make the __PAGETABLE_PxD_FOLDED defines non-empty

2018-10-31 Thread Kirill A. Shutemov
efine exists. > > Signed-off-by: Martin Schwidefsky Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCHv3 2/3] x86/ldt: Unmap PTEs for the slot before freeing LDT pages

2018-10-31 Thread Kirill A. Shutemov
On Fri, Oct 26, 2018 at 12:28:55PM +, Kirill A. Shutemov wrote: > + va = (unsigned long)ldt_slot_va(ldt->slot); > + flush_tlb_mm_range(mm, va, va + nr_pages * PAGE_SIZE, 0, false); I've got it wrong on rebase. It has to be PAGE_SHIFT instead of 0. Here's the fix up. diff --g

Re: [PATCH 3/3] s390/mm: fix mis-accounting of pgtable_bytes

2018-10-31 Thread Kirill A. Shutemov
put is done with > pr_alert() and not with VM_BUG_ON() or one of the WARN*() variants? > > That would to get more information with DEBUG_VM and / or > panic_on_warn=1 set. At least for automated testing it would be nice > to have such triggers. Stack trace is not helpful there. It will always show the exit path which is useless. -- Kirill A. Shutemov

Re: [PATCH 1/3] mm: introduce mm_[p4d|pud|pmd]_folded

2018-10-31 Thread Kirill A. Shutemov
> #define __PAGETABLE_P4D_FOLDED
> #define __PAGETABLE_PMD_FOLDED
> #define __PAGETABLE_PUD_FOLDED
>
> While the definition of CONFIG_xxx symbols looks like this
>
> #define CONFIG_xxx 1
>
> The __is_defined needs the value for the __take_second_arg trick.

I guess this is easily fixable :)

-- Kirill A. Shutemov
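
For readers unfamiliar with the __take_second_arg trick mentioned in the quoted text, here is a minimal user-space sketch of why an empty define is not detected. The helper macros are reproduced from memory of the kernel's include/linux/kconfig.h, so treat their exact spelling as an assumption; the DEMO_* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Helper macros, assumed to match include/linux/kconfig.h. */
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define __is_defined(x)              ___is_defined(x)
#define ___is_defined(val)           ____is_defined(__ARG_PLACEHOLDER_##val)
#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)

/* Hypothetical symbols, for illustration only. */
#define DEMO_FOLDED_EMPTY      /* empty define, like the old __PAGETABLE_PxD_FOLDED */
#define DEMO_FOLDED_ONE 1      /* defined to 1, like CONFIG_xxx symbols */
/* DEMO_NOT_DEFINED is deliberately left undefined. */

void show_is_defined(void)
{
	/*
	 * A value of 1 pastes to __ARG_PLACEHOLDER_1, which expands to "0,"
	 * and shifts the literal 1 into the second argument slot.
	 * An empty or missing define pastes to an unknown token, so the
	 * second argument stays the trailing 0.
	 */
	printf("defined to 1: %d\n", __is_defined(DEMO_FOLDED_ONE));
	printf("empty define: %d\n", __is_defined(DEMO_FOLDED_EMPTY));
	printf("not defined:  %d\n", __is_defined(DEMO_NOT_DEFINED));
}
```

Under these assumptions only the symbol defined to 1 is reported as defined, which is consistent with the point in the quoted text that the defines need a value before __is_defined can be used on them.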
