Re: [PATCH 0/3] elide generated code for folded p4d/pud

2019-10-10 Thread Kirill A. Shutemov
Agreed. Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH v3] ARC: mm: remove __ARCH_USE_5LEVEL_HACK

2019-10-10 Thread Kirill A. Shutemov
d new delta
> | free_pgd_range   546   656   +110
> | p4d_clear_bad      2    20    +18
> | Total: Before=4137148, After=4137276, chg 0.00%
>
> Cc: Kirill A. Shutemov
> Signed-off-by: Vineet Gupta

Acked-by: Kirill A. Shutemov

-- Kirill A. Shutemov

Re: [PATCH v4 3/9] mm: pagewalk: Don't split transhuge pmds when a pmd_entry is present

2019-10-09 Thread Kirill A. Shutemov
use a high value that allows a > large range of positive caller-defined return codes for future users. > > Cc: Matthew Wilcox > Cc: Will Deacon > Cc: Peter Zijlstra > Cc: Rik van Riel > Cc: Minchan Kim > Cc: Michal Hocko > Cc: Huang Ying > Cc: Jérôme Glisse &

Re: [PATCH v4 2/9] mm: pagewalk: Take the pagetable lock in walk_pte_range()

2019-10-09 Thread Kirill A. Shutemov
> Cc: Peter Zijlstra > Cc: Rik van Riel > Cc: Minchan Kim > Cc: Michal Hocko > Cc: Huang Ying > Cc: Jérôme Glisse > Cc: Kirill A. Shutemov > Suggested-by: Linus Torvalds > Signed-off-by: Thomas Hellstrom > --- > mm/pagewalk.c | 5 +++-- > 1 file chan

[RFC, PATCH] mm, thp: Try to bound number of pages on deferred split queue

2019-10-09 Thread Kirill A. Shutemov
, triggering the split would cost CPU time (~100ms per GB of THPs to split).

I have doubts about the approach, so:

Not-Signed-off-by: Kirill A. Shutemov
---
 include/linux/mmzone.h |   5 ++
 mm/huge_memory.c       | 129 -
 mm/memcontrol.c        |   3 +
 mm

Re: [PATCH] ARC: mm: remove __ARCH_USE_5LEVEL_HACK

2019-10-08 Thread Kirill A. Shutemov
-off-by: Vineet Gupta
> ---
>  arch/arc/include/asm/pgtable.h |  1 -
>  arch/arc/mm/fault.c            | 10 --
>  2 files changed, 8 insertions(+), 3 deletions(-)

Have you tested it with CONFIG_HIGHMEM=y? alloc_kmap_pgtable() has to be converted too.

-- Kirill A. Shutemov

Re: [PATCH] mm: thp: move deferred split queue to memcg's nodeinfo

2019-10-08 Thread Kirill A. Shutemov
before going oom? Maybe there would be also
> > other things that would benefit from this scheme instead of traditional
> > reclaim and shrinkers?
>
> That is exactly what we have discussed some time ago.

Yes, I've posted the patch:

http://lkml.kernel.org/r/20190827125911.boya23eowxhqmopa@box

But I'm still not sure that the approach is right. I expect it to trigger performance regressions. For systems with plenty of free memory, we will just pay the split cost for nothing in many cases.

-- Kirill A. Shutemov

Re: [PATCH v10 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-10-08 Thread Kirill A. Shutemov
> > ; James Morse ; Marc > > Zyngier ; Matthew Wilcox ; Kirill A. > > Shutemov ; linux-arm- > > ker...@lists.infradead.org; linux-kernel@vger.kernel.org; linux- > > m...@kvack.org; Punit Agrawal ; Thomas > > Gleixner ; Andrew Morton > foundation.org>; hejia..

Re: [PATCH V4 2/2] mm/pgtable/debug: Add test validating architecture page table helpers

2019-10-07 Thread Kirill A. Shutemov
On Mon, Oct 07, 2019 at 03:51:58PM +0200, Ingo Molnar wrote: > > * Kirill A. Shutemov wrote: > > > On Mon, Oct 07, 2019 at 03:06:17PM +0200, Ingo Molnar wrote: > > > > > > * Anshuman Khandual wrote: > > > > > > > This adds a t

Re: [PATCH V4 2/2] mm/pgtable/debug: Add test validating architecture page table helpers

2019-10-07 Thread Kirill A. Shutemov
inline function + define. Something like:

#define mm_p4d_folded mm_p4d_folded
static inline bool mm_p4d_folded(struct mm_struct *mm)
{
        return !pgtable_l5_enabled();
}

But I don't see much reason to be more verbose here than needed.

-- Kirill A. Shutemov

Re: [PATCH] Make SPLIT_RSS_COUNTING configurable

2019-10-06 Thread Kirill A. Shutemov
On Fri, Oct 04, 2019 at 06:45:21AM -0700, Daniel Colascione wrote: > On Fri, Oct 4, 2019 at 6:26 AM Kirill A. Shutemov > wrote: > > On Fri, Oct 04, 2019 at 02:33:49PM +0200, Michal Hocko wrote: > > > On Wed 02-10-19 19:08:16, Daniel Colascione wrote: > > > > On

Re: [PATCH] Make SPLIT_RSS_COUNTING configurable

2019-10-04 Thread Kirill A. Shutemov
s it? probably on
> the number of cpus) and it should be auto-enabled or it should be
> dropped altogether. You cannot really expect people know how to enable
> this without a deep understanding of the MM internals. Not to mention
> all those users using distro kernels/configs.
>
> A config option sounds like a bad way forward.

And I don't see much point anyway. Reading RSS counters from proc is inherently racy. The value can change either way right after the read due to process behaviour.

-- Kirill A. Shutemov

Re: [PATCH v3 2/7] mm: Add a walk_page_mapping() function to the pagewalk code

2019-10-04 Thread Kirill A. Shutemov
On Fri, Oct 04, 2019 at 02:58:59PM +0200, Thomas Hellström (VMware) wrote: > On 10/4/19 2:37 PM, Kirill A. Shutemov wrote: > > On Thu, Oct 03, 2019 at 01:32:45PM +0200, Thomas Hellström (VMware) wrote: > > > > > + * If @mapping allows faulting of huge pmds and puds, it

Re: [PATCH v3 2/7] mm: Add a walk_page_mapping() function to the pagewalk code

2019-10-04 Thread Kirill A. Shutemov
> It looks like if a concurrent thread faults in a huge pud just after the
> test for pud_none in that pmd_alloc, things might go pretty bad.

Hm? It will fail the next pmd_none() check under ptl. Do you have a particular racing scenario?

-- Kirill A. Shutemov

Re: [PATCH v3 2/7] mm: Add a walk_page_mapping() function to the pagewalk code

2019-10-03 Thread Kirill A. Shutemov
The walk_page_mapping() function will be initially be used for dirty- > tracking in virtual graphics drivers. > > Cc: Andrew Morton > Cc: Matthew Wilcox > Cc: Will Deacon > Cc: Peter Zijlstra > Cc: Rik van Riel > Cc: Minchan Kim > Cc: Michal Hocko > Cc: Huang Ying >

Re: [PATCH v3 1/7] mm: Remove BUG_ON mmap_sem not held from xxx_trans_huge_lock()

2019-10-03 Thread Kirill A. Shutemov
ocko
> Cc: Huang Ying
> Cc: Jérôme Glisse
> Cc: Kirill A. Shutemov
> Signed-off-by: Thomas Hellstrom

The patch looks good to me:

Acked-by: Kirill A. Shutemov

But I looked at the usage in pagewalk.c and it is inconsistent. The walker takes ptl before calling ->pud_entry(), but

Re: [PATCH 14/15] mm: Align THP mappings for non-DAX

2019-10-01 Thread Kirill A. Shutemov
On Tue, Oct 01, 2019 at 10:08:30AM -0600, William Kucharski wrote: > > > > On Oct 1, 2019, at 8:20 AM, Kirill A. Shutemov wrote: > > > > On Tue, Oct 01, 2019 at 06:18:28AM -0600, William Kucharski wrote: > >> > >> > >> On 10/1/19 5:32 AM, K

Re: [PATCH 14/15] mm: Align THP mappings for non-DAX

2019-10-01 Thread Kirill A. Shutemov
On Tue, Oct 01, 2019 at 06:18:28AM -0600, William Kucharski wrote: > > > On 10/1/19 5:32 AM, Kirill A. Shutemov wrote: > > On Tue, Oct 01, 2019 at 05:21:26AM -0600, William Kucharski wrote: > > > > > > > > > > On Oct 1, 2019, a

Re: [PATCH v2 2/3] mm, page_owner: decouple freeing stack trace from debug_pagealloc

2019-10-01 Thread Kirill A. Shutemov
On Tue, Oct 01, 2019 at 08:26:28AM -0400, Qian Cai wrote: > On Tue, 2019-10-01 at 14:51 +0300, Kirill A. Shutemov wrote: > > On Tue, Oct 01, 2019 at 10:07:44AM +0200, Vlastimil Babka wrote: > > > On 10/1/19 1:49 AM, Qian Cai wrote: > > > > > > > > > &

Re: [PATCH] mm/thp: make set_huge_zero_page() return void

2019-10-01 Thread Kirill A. Shutemov
ntry);
> 	if (pgtable)
> 		pgtable_trans_huge_deposit(mm, pmd, pgtable);
> 	set_pmd_at(mm, haddr, pmd, entry);
> 	mm_inc_nr_ptes(mm);
> -	return true;
> }
>
> vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
> --
> 2.20.1
>

-- Kirill A. Shutemov

Re: [PATCH v2 2/3] mm, page_owner: decouple freeing stack trace from debug_pagealloc

2019-10-01 Thread Kirill A. Shutemov
can do that and it was intention, yes. The extra parameter was
> requested by Kirill, so I'll defer the answer to him :)

DEBUG_PAGEALLOC is a much more intrusive debug option. Not all architectures support it in an efficient way. Some require hibernation. I don't see a reason to tie these two options together.

-- Kirill A. Shutemov

Re: [PATCH v2 2/3] mm, page_owner: decouple freeing stack trace from debug_pagealloc

2019-10-01 Thread Kirill A. Shutemov
I don't know too much about KASAN internals, but I > assume it's not possible to use it that way on production kernels yet? I don't know about production, but QEMU (without KVM acceleration) is painfully slow if KASAN is enabled. -- Kirill A. Shutemov

Re: [PATCH 14/15] mm: Align THP mappings for non-DAX

2019-10-01 Thread Kirill A. Shutemov
On Tue, Oct 01, 2019 at 05:21:26AM -0600, William Kucharski wrote: > > > > On Oct 1, 2019, at 4:45 AM, Kirill A. Shutemov wrote: > > > > On Tue, Sep 24, 2019 at 05:52:13PM -0700, Matthew Wilcox wrote: > >> > >> diff --git a/mm/huge_memory.c

Re: [PATCH 14/15] mm: Align THP mappings for non-DAX

2019-10-01 Thread Kirill A. Shutemov
lags, PMD_SIZE);
> 	if (addr)

I think you are reducing ASLR without any real indication that THP is relevant for the VMA. We need to know if any huge page allocation will be *attempted* for the VMA or the file.

-- Kirill A. Shutemov

Re: [PATCH 13/15] mm: Add a huge page fault handler for files

2019-10-01 Thread Kirill A. Shutemov
op of mm prep patches -- Matthew]
> Signed-off-by: Matthew Wilcox (Oracle)

As we discussed before, I think it's the wrong direction to separate ->fault() from ->huge_fault(). If ->fault() returned a huge page, alloc_set_pte() would do The Right Thing™.

-- Kirill A. Shutemov

Re: [PATCH 12/15] mm: Support removing arbitrary sized pages from mapping

2019-10-01 Thread Kirill A. Shutemov
= 2;
> +	refcount = 1 + compound_nr(page);
> 	if (!page_ref_freeze(page, refcount))
> 		goto cannot_free;
> 	/* note: atomic_cmpxchg in page_ref_freeze provides the smp_rmb */
> --
> 2.23.0
>

-- Kirill A. Shutemov

Re: [PATCH 11/15] mm: Remove hpage_nr_pages

2019-10-01 Thread Kirill A. Shutemov
so it's better to remove it and avoid the confusion > by just using compound_nr() or page_size(). > > Signed-off-by: Matthew Wilcox (Oracle) Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH 10/15] mm: Allow find_get_page to be used for large pages

2019-10-01 Thread Kirill A. Shutemov
ge) {
> +		if (fgp_order(fgp_flags))
> +			count_vm_event(THP_FILE_ALLOC);
>
> -		/*
> -		 * add_to_page_cache_lru locks the page, and for mmap we expect
> -		 * an unlocked page.
> -		 */
> -		if (page && (fgp_flags & FGP_FOR_MMAP))
> -			unlock_page(page);
> +		/*
> +		 * add_to_page_cache_lru locks the page, and
> +		 * for mmap we expect an unlocked page.
> +		 */
> +		if (fgp_flags & FGP_FOR_MMAP)
> +			unlock_page(page);
> +	}
> }
>
> 	return page;
> --
> 2.23.0

-- Kirill A. Shutemov

Re: Ack to merge through DRM? WAS Re: [PATCH v2 1/5] mm: Add write-protect and clean utilities for address space ranges

2019-09-30 Thread Kirill A. Shutemov
On Fri, Sep 27, 2019 at 09:39:48AM -0700, Linus Torvalds wrote: > On Fri, Sep 27, 2019 at 5:17 AM Kirill A. Shutemov > wrote: > > > > > Call it "walk_page_mapping()". And talk extensively about how the > > > locking differs a lot from the usual "walk

Re: [PATCH v4 03/11] mm/gup: Applies counting method to monitor gup_pgd_range

2019-09-30 Thread Kirill A. Shutemov
On Fri, Sep 27, 2019 at 08:40:00PM -0300, Leonardo Bras wrote:
> As decribed, gup_pgd_range is a lockless pagetable walk. So, in order to
     ^ typo

-- Kirill A. Shutemov

Re: [PATCH] mm/page_alloc: fix a crash in free_pages_prepare()

2019-09-30 Thread Kirill A. Shutemov
ike Power also uses the hook. Have you checked that this patch will not affect Power?

-- Kirill A. Shutemov

Re: [PATCH] mm: drop mmap_sem before calling balance_dirty_pages() in write fault

2019-09-27 Thread Kirill A. Shutemov
On Thu, Sep 26, 2019 at 03:26:24PM -0400, Johannes Weiner wrote:
> I have just one nitpick:

Here's addressed one:

>From 22a9d58a79a3ebb727d9e909c8f833cd0a751c08 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov"
Date: Thu, 26 Sep 2019 16:34:26 +0300
Subject: [PATCH] shm

Re: [PATCH 09/15] mm: Allow large pages to be added to the page cache

2019-09-26 Thread Kirill A. Shutemov
;
> 	} while (xas_nomem(, gfp_mask & GFP_RECLAIM_MASK));
> @@ -907,7 +922,7 @@ static int __add_to_page_cache_locked(struct page *page,
> 	/* Leave page->index set: truncation relies upon it */
> 	if (!huge)
> 		mem_cgroup_cancel_charge(page, memcg, false);
> -	put_page(page);
> +	page_ref_sub(page, nr);
> 	return xas_error();
> }
> ALLOW_ERROR_INJECTION(__add_to_page_cache_locked, ERRNO);
> --
> 2.23.0
>

-- Kirill A. Shutemov

Re: [PATCH 08/15] mm: Add __page_cache_alloc_order

2019-09-26 Thread Kirill A. Shutemov
On Tue, Sep 24, 2019 at 05:52:07PM -0700, Matthew Wilcox wrote: > From: "Matthew Wilcox (Oracle)" > > This new function allows page cache pages to be allocated that are > larger than an order-0 page. > > Signed-off-by: Matthew Wilcox (Oracle) Acked-by: Kiri

Re: [PATCH 07/15] mm: Make prep_transhuge_page tail-callable

2019-09-26 Thread Kirill A. Shutemov
t; instead of assigning the result to a temporary variable and conditionally
> passing that to prep_transhuge_page().

Patch makes sense, but "tail-callable" made me think you want it to be able to accept tail pages. Can we re-phrase this?

> Signed-off-by: Matthew Wilcox (Oracle)

Acked-by: Kirill A. Shutemov

-- Kirill A. Shutemov

Re: [PATCH 03/15] mm: Add file_offset_of_ helpers

2019-09-26 Thread Kirill A. Shutemov
line loff_t file_offset_of_next_page(struct page *page)
> +{
> +	return ((loff_t)page->index + compound_nr(page)) << PAGE_SHIFT;

Wouldn't it be more readable as

	return file_offset_of_page(page) + page_size(page);

?

> +}
> +
> static inline loff_t page_file_offset(struct page *page)
> {
> 	return ((loff_t)page_index(page)) << PAGE_SHIFT;
> --
> 2.23.0
>

-- Kirill A. Shutemov

Re: [PATCH 01/15] mm: Use vm_fault error code directly

2019-09-26 Thread Kirill A. Shutemov
On Tue, Sep 24, 2019 at 05:52:00PM -0700, Matthew Wilcox wrote: > From: "Matthew Wilcox (Oracle)" > > Use VM_FAULT_OOM instead of indirecting through vmf_error(-ENOMEM). > > Signed-off-by: Matthew Wilcox (Oracle) Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH] mm: drop mmap_sem before calling balance_dirty_pages() in write fault

2019-09-26 Thread Kirill A. Shutemov
ompletely read over the context of that email you linked - that > there is a bug in the existing code - and looked at it as mere > refactoring patch. My apologies. > > Switching that shmem site to maybe_unlock_mmap_for_io() to indirectly > pin the inode (in a separate bug fix patch)

Re: [PATCH v9 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-26 Thread Kirill A. Shutemov
s error > in case there can be some obscure use-case.(by Kirill) > > [1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork > > Signed-off-by: Jia He > Reported-by: Yibo Cai Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH v3 2/2] x86/boot/64: round memory hole size up to next PMD page.

2019-09-26 Thread Kirill A. Shutemov
of the space to be mapped is usable RAM and includes no reserved > areas. > > Signed-off-by: Steve Wahl > Cc: sta...@vger.kernel.org Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH v3 1/2] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.

2019-09-26 Thread Kirill A. Shutemov
in 1 GiB of > reserved space). > > The solution is to invalidate the pages of this table outside the > kernel image's space before the page table is activated. This patch > has been validated to fix this problem on our hardware. > > Signed-off-by: Steve Wahl > Cc: sta...@vger.kernel.org Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH v3 3/4] mm: don't expose non-hugetlb page to fast gup prematurely

2019-09-26 Thread Kirill A. Shutemov
ecially those
> looping set_pte_at()s in mm/huge_memory.c.

We can rename the current set_pte_at() to __set_pte_at() or something and leave it in places where the barrier is not needed. The new set_pte_at() will be used in the rest of the places with the barrier inside.

BTW, have you looked at the other levels of the page table hierarchy? Do we have the same issue for PMD/PUD/... pages?

-- Kirill A. Shutemov

Re: [PATCH 3/3] mm, page_owner: rename flag indicating that page is allocated

2019-09-26 Thread Kirill A. Shutemov
irril suggested naming it PAGE_EXT_OWNER_ALLOCED to make it
 ^ typo

And PAGE_EXT_OWNER_ALLOCED is my typo. I meant PAGE_EXT_OWNER_ALLOCATED :P

-- Kirill A. Shutemov

Re: [PATCH 2/3] mm, debug, kasan: save and dump freeing stack trace for kasan

2019-09-26 Thread Kirill A. Shutemov
f (static_branch_unlikely(_owner_free_stack)) {
> +		handle = READ_ONCE(page_owner->free_handle);
> +		if (!handle) {
> +			pr_alert("page_owner free stack trace missing\n");
> +		} else {
> +			nr_entries = stack_depot_fetch(handle, );
> +			pr_alert("page last free stack trace:\n");
> +			stack_trace_print(entries, nr_entries, 0);
> +		}
> 	}
> #endif
>
> --
> 2.23.0
>

-- Kirill A. Shutemov

Re: [PATCH 1/3] mm, page_owner: fix off-by-one error in __set_page_owner_handle()

2019-09-26 Thread Kirill A. Shutemov
ull size of struct page_ext is not known at compile time, we can't > use a simple page_ext++ statement, so introduce a page_ext_next() inline > function for that. > > Reported-by: Kirill A. Shutemov > Fixes: 7e2f2a0cd17c ("mm, page_owner: record page owner for each s

Re: [PATCH v2] mm: don't expose page to fast gup prematurely

2019-09-25 Thread Kirill A. Shutemov
On Tue, Sep 24, 2019 at 04:05:50PM -0600, Yu Zhao wrote: > On Tue, Sep 24, 2019 at 02:23:16PM +0300, Kirill A. Shutemov wrote: > > On Sat, Sep 14, 2019 at 01:05:18AM -0600, Yu Zhao wrote: > > > We don't want to expose page to fast gup running on a remote CPU > > > befo

Re: [PATCH v2 4/4] mm, page_owner, debug_pagealloc: save and dump freeing stack trace

2019-09-24 Thread Kirill A. Shutemov
On Tue, Sep 24, 2019 at 05:15:01PM +0200, Vlastimil Babka wrote: > On 9/24/19 1:42 PM, Kirill A. Shutemov wrote: > >> --- a/mm/page_owner.c > >> +++ b/mm/page_owner.c > >> @@ -24,6 +24,9 @@ struct page_owner { > >>short last_migrate_reason; > >>

Re: [PATCH v2 2/4] mm, page_owner: record page owner for each subpage

2019-09-24 Thread Kirill A. Shutemov
On Tue, Sep 24, 2019 at 05:10:59PM +0200, Vlastimil Babka wrote: > On 9/24/19 1:31 PM, Kirill A. Shutemov wrote: > > On Tue, Aug 20, 2019 at 03:18:26PM +0200, Vlastimil Babka wrote: > >> Currently, page owner info is only recorded for the first page of a > >>

Re: [PATCH v8 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-24 Thread Kirill A. Shutemov
addr, PAGE_SIZE))
> > > > +		if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
> > > > +			/* Give a warn in case there can be some obscure
> > > > +			 * use-case
> > > > +			 */
> > > > +			WARN_ON_ONCE(1);
> > >
> > > That's more of a question for the mm guys: at this point we do the
> > > copying with the ptl released; is there anything else that could have
> > > made the pte old in the meantime? I think unuse_pte() is only called on
> > > anonymous vmas, so it shouldn't be the case here.
>
> If we need to hold the ptl here, you could as well have an enclosing
> kmap/kunmap_atomic (option 2) with some goto instead of "return false".

Yeah, looks like we need to hold ptl for longer. There is nothing I see that would prevent clearing the young bit under us otherwise.

-- Kirill A. Shutemov

Re: [PATCH V3 0/2] mm/debug: Add tests for architecture exported page table helpers

2019-09-24 Thread Kirill A. Shutemov
wn) in pud_clear_tests() as there were no available
> __pgd() definitions.
>
> - ARM32
> - IA64

Hm. Grep shows __pgd() definitions for both of them. Is it for a specific config?

-- Kirill A. Shutemov

Re: [PATCH v2 4/4] mm, page_owner, debug_pagealloc: save and dump freeing stack trace

2019-09-24 Thread Kirill A. Shutemov
last_migrate_reason;
> 	gfp_t gfp_mask;
> 	depot_stack_handle_t handle;
> +#ifdef CONFIG_DEBUG_PAGEALLOC
> +	depot_stack_handle_t free_handle;
> +#endif

I think it's possible to add space for the second stack handle at runtime: adjust page_owner_ops->size inside the ->need().

The second stack might be useful beyond CONFIG_DEBUG_PAGEALLOC. We probably should not tie these features.

-- Kirill A. Shutemov

Re: [PATCH v2 3/4] mm, page_owner: keep owner info when freeing the page

2019-09-24 Thread Kirill A. Shutemov
to "page owner > info has been set at least once" and add new PAGE_EXT_OWNER_ACTIVE for > tracking > whether page is supposed to be currently tracked allocated or free. Why not PAGE_EXT_OWNER_ALLOCED if it means "allocated"? Active is somewhat loaded term for a page. -- Kirill A. Shutemov

Re: [PATCH v2 2/4] mm, page_owner: record page owner for each subpage

2019-09-24 Thread Kirill A. Shutemov
t; @@ -562,7 +577,8 @@ static void init_pages_in_zone(pg_data_t *pgdat, struct zone *zone)
> 		continue;
>
> 	/* Found early allocated page */
> -	__set_page_owner_handle(page_ext, early_handle, 0, 0);
> +	__set_page_owner_handle(page, page_ext, early_handle,
> +				0, 0);
> 	count++;
> }
> cond_resched();
> --
> 2.22.0
>

-- Kirill A. Shutemov

Re: [PATCH v2] mm: don't expose page to fast gup prematurely

2019-09-24 Thread Kirill A. Shutemov
nge()
> __SetPageSwapBacked()
> SetPageReferenced()

Is there a particular codepath that has what you listed for CPU? After a quick look, I only saw that we call page_add_new_anon_rmap() before set_pte_at().

-- Kirill A. Shutemov

Re: [PATCH v8 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-23 Thread Kirill A. Shutemov
s error > in case there can be some obscure use-case.(by Kirill) > > [1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork > > Reported-by: Yibo Cai > Signed-off-by: Jia He Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH v7 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-20 Thread Kirill A. Shutemov
...
> 	pte_unmap_unlock(vmf->pte, vmf->ptl);
> 	return false;
> ...
> 	return true;
> ...
> 	if (!cow_user_page(new_page, old_page, vmf)) {
>
> That reads more sensibly for me.

I like this idea too.

-- Kirill A. Shutemov

Re: [PATCH v7 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-20 Thread Kirill A. Shutemov
s error > in case there can be some obscure use-case.(by Kirill) > > [1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork > > Reported-by: Yibo Cai > Signed-off-by: Jia He Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH] mm, memcg: assign shrinker_map before kvfree

2019-09-20 Thread Kirill A. Shutemov
hrinker_maps and mem_cgroup_css_free only) still > this looks suspicious and we can easily eliminate the gap at all. With this logic it will still look suspicious since you don't wait a grace period before freeing the map. -- Kirill A. Shutemov

Re: [PATCH v6 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-20 Thread Kirill A. Shutemov
* from the second attempt.
> +	 */
> +	put_page(new_page);

I think you also need to give the reference on the old page back:

	if (old_page)
		put_page(old_page);

> +	goto normal;

I don't see much point in this goto. Just return 0.

> +	}
> }
>
> 	if (mem_cgroup_try_charge_delay(new_page, mm, GFP_KERNEL, , false))
> @@ -2420,6 +2471,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
> 	}
> 	put_page(old_page);
> }
> +
> +normal:
> 	return page_copied ? VM_FAULT_WRITE : 0;
> oom_free_new:
> 	put_page(new_page);
> --
> 2.17.1
>

-- Kirill A. Shutemov

Re: [PATCH v4 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-19 Thread Kirill A. Shutemov
On Thu, Sep 19, 2019 at 03:41:43PM +, Catalin Marinas wrote: > On Thu, Sep 19, 2019 at 06:00:07PM +0300, Kirill A. Shutemov wrote: > > On Wed, Sep 18, 2019 at 07:00:30PM +0100, Catalin Marinas wrote: > > > On Wed, Sep 18, 2019 at 05:00:27PM +0300, Kirill A. Shutemov wrote: &

Re: [PATCH v4 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-19 Thread Kirill A. Shutemov
On Wed, Sep 18, 2019 at 07:00:30PM +0100, Catalin Marinas wrote: > On Wed, Sep 18, 2019 at 05:00:27PM +0300, Kirill A. Shutemov wrote: > > On Wed, Sep 18, 2019 at 09:19:14PM +0800, Jia He wrote: > > > @@ -2152,20 +2163,34 @@ static inline void cow_user_page(struct page >

Re: [PATCH v4 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-19 Thread Kirill A. Shutemov
On Thu, Sep 19, 2019 at 10:16:34AM +0800, Jia He wrote: > Hi Kirill > > [On behalf of justin...@arm.com because some mails are filted...] > > On 2019/9/18 22:00, Kirill A. Shutemov wrote: > > On Wed, Sep 18, 2019 at 09:19:14PM +0800, Jia He wrote: > > > Wh

Re: [PATCH v4 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-18 Thread Kirill A. Shutemov
copy_user_highpage(dst, src, addr, vma);
> }
>
> static gfp_t __get_fault_gfp_mask(struct vm_area_struct *vma)
> @@ -2318,7 +2343,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
> 						vmf->address);
> 	if (!new_page)
> 		goto oom;
> -	cow_user_page(new_page, old_page, vmf->address, vma);
> +	cow_user_page(new_page, old_page, vmf);
> }
>
> 	if (mem_cgroup_try_charge_delay(new_page, mm, GFP_KERNEL, , false))
> --
> 2.17.1
>

-- Kirill A. Shutemov

Re: KASAN: use-after-free Read in shmem_fault (2)

2019-09-17 Thread Kirill A. Shutemov
On Mon, Sep 09, 2019 at 06:04:12PM +0300, Kirill A. Shutemov wrote: > On Mon, Sep 09, 2019 at 06:55:21AM -0700, Matthew Wilcox wrote: > > On Mon, Sep 02, 2019 at 05:20:30PM +0300, Kirill A. Shutemov wrote: > > > On Mon, Sep 02, 2019 at 06:52:54AM -0700, Matthew Wilcox wrote: >

Re: [PATCH v3] mm: memory: fix /proc/meminfo reporting for MLOCK_ONFAULT

2019-09-17 Thread Kirill A. Shutemov
On Mon, Sep 16, 2019 at 06:26:19PM +0300, Kirill A. Shutemov wrote: > > --- > > mm/memory.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/mm/memory.c b/mm/memory.c > > index e0c232fe81d9..55da24f33bc4 100644 > > --- a/mm/memory.c &g

Re: [PATCH v3] mm: memory: fix /proc/meminfo reporting for MLOCK_ONFAULT

2019-09-17 Thread Kirill A. Shutemov
On Tue, Sep 17, 2019 at 12:15:19PM +0200, Michal Hocko wrote: > On Mon 16-09-19 18:26:19, Kirill A. Shutemov wrote: > > On Fri, Sep 13, 2019 at 02:11:19PM -0700, Lucian Adrian Grijincu wrote: > > > As pages are faulted in MLOCK_ONFAULT correctly updates > > > /proc/sel

Re: [PATCH v3] mm: memory: fix /proc/meminfo reporting for MLOCK_ONFAULT

2019-09-16 Thread Kirill A. Shutemov
---
>
> Signed-off-by: Lucian Adrian Grijincu
> Acked-by: Souptick Joarder
> ---
>  mm/memory.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index e0c232fe81d9..55da24f33bc4 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3311,6 +3311,8 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
> 	} else {
> 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
> 		page_add_file_rmap(page, false);
> +		if (vma->vm_flags & VM_LOCKED && !PageTransCompound(page))
> +			mlock_vma_page(page);

Why do you only do this for file pages?

> 	}
> 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
>
> --
> 2.17.1
>

-- Kirill A. Shutemov

[tip: x86/mm] x86/mm: Enable 5-level paging support by default

2019-09-16 Thread tip-bot2 for Kirill A. Shutemov
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     18ec1eaf58fbf2d9009a752a102a3d8e0d905a0f
Gitweb:        https://git.kernel.org/tip/18ec1eaf58fbf2d9009a752a102a3d8e0d905a0f
Author:        Kirill A. Shutemov
AuthorDate:    Fri, 13 Sep 2019 12:54:52 +03:00

Re: [PATCH v3 2/2] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-16 Thread Kirill A. Shutemov
On Mon, Sep 16, 2019 at 09:35:21AM +, Justin He (Arm Technology China) wrote: > > Hi Kirill > > -Original Message- > > From: Kirill A. Shutemov > > Sent: September 16, 2019 17:16 > > To: Justin He (Arm Technology China) > > Cc: Catalin Marinas ; Will Dea

Re: [PATCH v3 1/2] arm64: mm: implement arch_faults_on_old_pte() on arm64

2019-09-16 Thread Kirill A. Shutemov
s_on_old_pte arch_faults_on_old_pte
> +
> #endif /* !__ASSEMBLY__ */
>
> #endif /* __ASM_PGTABLE_H */
> --
> 2.17.1
>

-- Kirill A. Shutemov

Re: [PATCH v3 2/2] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-16 Thread Kirill A. Shutemov
copy_user_highpage(dst, src, vmf->address, vmf->vma);
> }
>
> static gfp_t __get_fault_gfp_mask(struct vm_area_struct *vma)
> @@ -2318,7 +2338,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
> 						vmf->address);
> 	if (!new_page)
> 		goto oom;
> -	cow_user_page(new_page, old_page, vmf->address, vma);
> +	cow_user_page(new_page, old_page, vmf);
> }
>
> 	if (mem_cgroup_try_charge_delay(new_page, mm, GFP_KERNEL, , false))
> --
> 2.17.1
>

-- Kirill A. Shutemov

Re: [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.

2019-09-16 Thread Kirill A. Shutemov
On Fri, Sep 13, 2019 at 10:14:15AM -0500, Steve Wahl wrote: > On Thu, Sep 12, 2019 at 01:19:17PM +0300, Kirill A. Shutemov wrote: > > On Wed, Sep 11, 2019 at 03:08:35PM -0500, Steve Wahl wrote: > > > Thank you for your time looking into this with me! > > > > With all

Re: [PATCH v3 1/2] mm: clean up validate_slab()

2019-09-16 Thread Kirill A. Shutemov
nvalid > object, free or in use. This shouldn't matter because the original > behavior isn't intended anyway. > > Signed-off-by: Yu Zhao Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

[PATCH] x86/mm: Enable 5-level paging support by default

2019-09-13 Thread Kirill A. Shutemov
regressions I saw were in early boot code that runs independently from CONFIG_X86_5LEVEL.

The next major releases of distributions are expected to have CONFIG_X86_5LEVEL=y. Enable the option by default. It may help to catch obscure bugs early.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/Kconfig | 1 +
 1

[PATCH] mm, thp: Do not queue fully unmapped pages for deferred split

2019-09-13 Thread Kirill A. Shutemov
Adding fully unmapped pages into the deferred split queue is not productive: these pages are about to be freed or they are pinned and cannot be split anyway.

Signed-off-by: Kirill A. Shutemov
---
 mm/rmap.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/mm/rmap.c

Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-13 Thread Kirill A. Shutemov
but its just a single line. Kirill suggested this in the
> previous version. There is a generic fallback definition but s390 has it's
> own. This change overrides the generic one for x86 probably as a fix or as
> an improvement. Kirill should be able to help classify it in which case it
> can be a separate patch.

I don't think it's worth a separate patch.

-- Kirill A. Shutemov

Re: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers

2019-09-13 Thread Kirill A. Shutemov
/x86/mm/highmem_32.c#L34
> >>
> >> I have not checked others, but I guess it is like that for all.
> >
> > Seems like I answered too quickly. All kmap_atomic() do preempt_disable(),
> > but not all pte_alloc_map() call kmap_atomic().
> >
> > However, for instance ARM does:
> >
> > https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/arm/include/asm/pgtable.h#L200
> >
> > And X86 as well:
> >
> > https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/x86/include/asm/pgtable_32.h#L51
> >
> > Microblaze also:
> >
> > https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/microblaze/include/asm/pgtable.h#L495
>
> All the above platforms checks out to be using k[un]map_atomic(). I am
> wondering whether any of the intermediate levels will have similar problems
> on any these 32 bit platforms or any other platforms which might be using
> generic k[un]map_atomic().

No. The kernel only allocates pte page tables from highmem. All other page tables are always visible in the kernel address space.

-- Kirill A. Shutemov

Re: [PATCH v2 1/4] mm: correct mask size for slub page->objects

2019-09-12 Thread Kirill A. Shutemov
On Thu, Sep 12, 2019 at 03:11:14PM -0600, Yu Zhao wrote: > On Thu, Sep 12, 2019 at 12:40:35PM +0300, Kirill A. Shutemov wrote: > > On Wed, Sep 11, 2019 at 08:31:08PM -0600, Yu Zhao wrote: > > > Mask of slub objects per page shouldn't be larger than what > >

Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-12 Thread Kirill A. Shutemov
code here __init (or its variants) so it can be discarded after boot. It has no use after that. -- Kirill A. Shutemov

Re: [PATCH v9 0/8] stg mail -e --version=v9 \

2019-09-12 Thread Kirill A. Shutemov
very well obvious in the code including the consequences. I do not > really like an idea of hiding similar constraints behind a generic > looking feature which is completely detached from the allocator and so > any future change of the allocator might subtly break it. I don't necessarily follow why shuffling is more integral to the page allocator than reporting would be. It's in shuffle.c under mm/ and integrated in a similar way. -- Kirill A. Shutemov

Re: [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.

2019-09-12 Thread Kirill A. Shutemov
why it's crucial not to map more than needed. I think we also need to make it clear that this is a workaround for broken hardware: speculative execution must not trigger a halt. -- Kirill A. Shutemov

Re: [PATCH v2 4/4] mm: lock slub page when listing objects

2019-09-12 Thread Kirill A. Shutemov
. Can we rephrase it to indicate that slab_lock is required to protect page->objects? -- Kirill A. Shutemov

Re: [PATCH v2 3/4] mm: avoid slub allocation while holding list_lock

2019-09-12 Thread Kirill A. Shutemov
ock)->rlock){-.-.}, at: __kmem_cache_shutdown+0x81/0x3cb > > other info that might help us debug this: >Possible unsafe locking scenario: > > CPU0 > > lock(&(&n->list_lock)->rlock); > lock(&(&n->list_lock)->rlock); > >*** DEADLOCK *** > > Signed-off-by: Yu Zhao Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH v2 2/4] mm: clean up validate_slab()

2019-09-12 Thread Kirill A. Shutemov
f (!check_object(s, page, p, SLUB_RED_ACTIVE)) > - return 0; > - return 1; > + if (!check_object(s, page, p, val)) > + break; > + } > } > > static void validate_slab_slab(struct kmem_cache *s, struct page *page, > -- > 2.23.0.162.g0b9fbb3734-goog > -- Kirill A. Shutemov

Re: [PATCH v2 1/4] mm: correct mask size for slub page->objects

2019-09-12 Thread Kirill A. Shutemov
s not used and always zero. So what? I can imagine that on some microarchitectures accessing the higher 16 bits of an int is cheaper than shifting by 15. -- Kirill A. Shutemov

Re: [PATCH 2/3] mm: avoid slub allocation while holding list_lock

2019-09-11 Thread Kirill A. Shutemov
not something sane to do. -- Kirill A. Shutemov

Re: [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.

2019-09-10 Thread Kirill A. Shutemov
On Tue, Sep 10, 2019 at 09:28:10AM -0500, Steve Wahl wrote: > On Mon, Sep 09, 2019 at 11:14:14AM +0300, Kirill A. Shutemov wrote: > > On Fri, Sep 06, 2019 at 04:29:50PM -0500, Steve Wahl wrote: > > > ... > > > The answer is to invalidate the pages of this table outs

Re: [PATCH 1/2] memory_hotplug: Add a bounds check to check_hotplug_memory_range()

2019-09-10 Thread Kirill A. Shutemov
static int check_hotplug_memory_range(u64 start, u64 size) > { > /* memory range must be block size aligned */ > @@ -1040,7 +1057,7 @@ static int check_hotplug_memory_range(u64 start, u64 > size) > return -EINVAL; > } > > - return 0; > + return check_hotplug_memory_addressable(start, size); > } > > static int online_memory_block(struct memory_block *mem, void *arg) > -- > 2.21.0 > -- Kirill A. Shutemov

Re: [PATCH v2 1/2] mm/page_ext: support to record the last stack of page

2019-09-10 Thread Kirill A. Shutemov
t; > +#endif > > > }; > > > > > > static unsigned long total_usage; > > > > > > > Are you sure this patch compiles? > > > This is a patchset, it needs the other patch (patch 2). > We have verified it by running KASAN UT on Qemu. Any patchset must be bisectable: do not break anything in the middle of a patchset. -- Kirill A. Shutemov

Re: [PATCH v2] mm: fix double page fault on arm64 if PTE_AF is cleared

2019-09-10 Thread Kirill A. Shutemov
entry = pte_mkyoung(vmf->orig_pte); > + if (ptep_set_access_flags(vmf->vma, vmf->address, > + vmf->pte, entry, 0)) > + update_mmu_cache(vmf->vma, vmf->address, > + vmf->pte); > + } > + I don't see where you take ptl. -- Kirill A. Shutemov

Re: [PATCH] mm: avoid slub allocation while holding list_lock

2019-09-10 Thread Kirill A. Shutemov
On Mon, Sep 09, 2019 at 03:39:38PM -0600, Yu Zhao wrote: > On Tue, Sep 10, 2019 at 05:57:22AM +0900, Tetsuo Handa wrote: > > On 2019/09/10 1:00, Kirill A. Shutemov wrote: > > > On Mon, Sep 09, 2019 at 12:10:16AM -0600, Yu Zhao wrote: > > >> If we are already under

Re: [PATCH v9 2/8] mm: Adjust shuffle code to allow for future coalescing

2019-09-09 Thread Kirill A. Shutemov
t noticed this. -- Kirill A. Shutemov

Re: [PATCH v9 6/8] mm: Introduce Reported pages

2019-09-09 Thread Kirill A. Shutemov
reader. > > Would something like the following work for you? > /* > * Allocate space to store the boundaries for the zone we are > * actively reporting on. We will need to store one boundary > * pointer per migratetype, and then we need to have one of these > * arrays per order for orders greater than or equal to > * PAGE_REPORTING_MIN_ORDER. > */ Ack. -- Kirill A. Shutemov

Re: [PATCH] mm: avoid slub allocation while holding list_lock

2019-09-09 Thread Kirill A. Shutemov
0 > > lock(&(&n->list_lock)->rlock); > lock(&(&n->list_lock)->rlock); > >*** DEADLOCK *** > > Signed-off-by: Yu Zhao Looks sane to me: Acked-by: Kirill A. Shutemov -- Kirill A. Shutemov

Re: [PATCH 0/3] Remove __online_page_set_limits()

2019-09-09 Thread Kirill A. Shutemov
lloon.c | 1 - > include/linux/memory_hotplug.h | 1 - > mm/memory_hotplug.c| 5 - > 4 files changed, 8 deletions(-) Do we really need 3 separate patches to remove 8 lines of code? -- Kirill A. Shutemov

Re: [PATCH v9 2/8] mm: Adjust shuffle code to allow for future coalescing

2019-09-09 Thread Kirill A. Shutemov
> > > extern void __shuffle_free_memory(pg_data_t *pgdat); > > > +extern bool __shuffle_pick_tail(void); > > > static inline void shuffle_free_memory(pg_data_t *pgdat) > > > { > > > if (!static_branch_unlikely(&page_alloc_shuffle_key)) > > > @@ -43,6 +45,11 @@ static inline bool is_shuffle_order(int order) > > > return false; > > > return order >= SHUFFLE_ORDER; > > > } > > > + > > > +static inline bool shuffle_pick_tail(void) > > > +{ > > > + return __shuffle_pick_tail(); > > > +} > > > > I don't see a reason for __shuffle_pick_tail() to exist if you call it > > unconditionally. > > That is for compilation purposes. The function is not used in the > shuffle_pick_tail below that always returns false. Wouldn't it be the same if you rename __shuffle_pick_tail() to shuffle_pick_tail() and put its declaration under the same #ifdef? -- Kirill A. Shutemov

Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture page table helpers

2019-09-09 Thread Kirill A. Shutemov
le entry from generic code like this test case is bit tricky. That > >> is because there are not enough helpers to create entries with an absolute > >> value. This would have been easier if all the platforms provided functions > >> like __pxx() which is not the case now. Otherwise something like this > >> should > >> have worked. > >> > >> > >> pud_t pud = READ_ONCE(*pudp); > >> pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0)) > >> WRITE_ONCE(*pudp, pud); > >> > >> But __pud() will fail to build in many platforms. > > > > Hmm, I simply used this on my system to make pud_clear_tests() work, not > > sure if it works on all archs: > > > > pud_val(*pudp) |= RANDOM_NZVALUE; > > Which compiles on arm64 but then fails on x86 because of the way pmd_val() > has been defined there. Use instead *pudp = __pud(pud_val(*pudp) | RANDOM_NZVALUE); It *should* be more portable. -- Kirill A. Shutemov

Re: KASAN: use-after-free Read in shmem_fault (2)

2019-09-09 Thread Kirill A. Shutemov
On Mon, Sep 09, 2019 at 06:55:21AM -0700, Matthew Wilcox wrote: > On Mon, Sep 02, 2019 at 05:20:30PM +0300, Kirill A. Shutemov wrote: > > On Mon, Sep 02, 2019 at 06:52:54AM -0700, Matthew Wilcox wrote: > > > On Sat, Aug 31, 2019 at 12:58:26PM +0800, Hillf Danton wrote: > >

Re: [PATCH v9 6/8] mm: Introduce Reported pages

2019-09-09 Thread Kirill A. Shutemov
eporting_free_stats(struct zone *zone) > +{ > + /* free reported_page statistics */ > + kfree(zone->reported_pages); > + zone->reported_pages = NULL; > +} > + > +static DEFINE_MUTEX(page_reporting_mutex); > +DEFINE_STATIC_KEY_FALSE(page_reporting_notify_enabled); > + > +void page_reporting_shutdown(struct page_reporting_dev_info *phdev) > +{ > + mutex_lock(&page_reporting_mutex); > + > + if (rcu_access_pointer(ph_dev_info) == phdev) { > + /* Disable page reporting notification */ > + static_branch_disable(&page_reporting_notify_enabled); > + RCU_INIT_POINTER(ph_dev_info, NULL); > + synchronize_rcu(); > + > + /* Flush any existing work, and lock it out */ > + cancel_delayed_work_sync(&phdev->work); > + > + /* Free scatterlist */ > + kfree(phdev->sg); > + phdev->sg = NULL; > + > + /* Free boundaries */ > + kfree(boundary); > + boundary = NULL; > + } > + > + mutex_unlock(&page_reporting_mutex); > +} > +EXPORT_SYMBOL_GPL(page_reporting_shutdown); > + > +int page_reporting_startup(struct page_reporting_dev_info *phdev) > +{ > + struct zone *zone; > + int err = 0; > + > + /* No point in enabling this if it cannot handle any pages */ > + if (!phdev->capacity) > + return -EINVAL; Looks like a usage error. Maybe WARN_ON()? > + > + mutex_lock(&page_reporting_mutex); > + > + /* nothing to do if already in use */ > + if (rcu_access_pointer(ph_dev_info)) { > + err = -EBUSY; > + goto err_out; > + } Again, it's from the "something went horribly wrong" category. Maybe WARN_ON()? > + > + boundary = kcalloc(MAX_ORDER - PAGE_REPORTING_MIN_ORDER, > +sizeof(struct list_head *) * MIGRATE_TYPES, > +GFP_KERNEL); Could you comment here on why this size of array is allocated? The calculation is not obvious to a reader. 
> + if (!boundary) { > + err = -ENOMEM; > + goto err_out; > + } > + > + /* allocate scatterlist to store pages being reported on */ > + phdev->sg = kcalloc(phdev->capacity, sizeof(*phdev->sg), GFP_KERNEL); > + if (!phdev->sg) { > + err = -ENOMEM; > + > + kfree(boundary); > + boundary = NULL; > + > + goto err_out; > + } > + > + > + /* initialize refcnt and work structures */ > + atomic_set(&phdev->refcnt, 0); > + INIT_DELAYED_WORK(&phdev->work, page_reporting_process); > + > + /* assign device, and begin initial flush of populated zones */ > + rcu_assign_pointer(ph_dev_info, phdev); > + for_each_populated_zone(zone) { > + spin_lock_irq(&zone->lock); > + __page_reporting_request(zone); > + spin_unlock_irq(&zone->lock); > + } > + > + /* enable page reporting notification */ > + static_branch_enable(&page_reporting_notify_enabled); > +err_out: > + mutex_unlock(&page_reporting_mutex); > + > + return err; > +} > +EXPORT_SYMBOL_GPL(page_reporting_startup); > > -- Kirill A. Shutemov

Re: [PATCH v9 3/8] mm: Move set/get_pcppage_migratetype to mmzone.h

2019-09-09 Thread Kirill A. Shutemov
How would we avoid messing with ->index when the page is not in the right state of its life-cycle? Can we add some VM_BUG_ON()s here? -- Kirill A. Shutemov
