[PATCH] hugetlb_cgroup: fix offline of hugetlb cgroup with reservations

2020-12-03 Thread Mike Kravetz
issue, a related bug in hugetlb_cgroup_css_offline was noticed. The hstate index is not reinitialized each time through the do-while loop. Fix this as well. Fixes: 1adc4d419aa2 ("hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations") Cc: Reported-by: Adrian Moren
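The bug class mentioned in this snippet, an index that is not reset on each pass of a do-while loop, can be sketched in user space as follows. This is only an illustration of the pattern, not the kernel code: `NUM_STATES`, `counters`, and `drain_all()` are hypothetical names, and the loop merely mimics the shape of a retry loop that must visit every hstate on every pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_STATES 3 /* hypothetical stand-in for the number of hstates */

/*
 * Decrement every positive counter by one per pass until a full pass
 * finds nothing left to do. Returns the number of passes taken.
 */
static int drain_all(int counters[NUM_STATES])
{
	int passes = 0;
	bool busy;
	size_t idx;

	assert(counters != NULL);

	do {
		busy = false;
		idx = 0; /* the fix: restart the index on every pass */
		for (; idx < NUM_STATES; idx++) {
			if (counters[idx] > 0) {
				counters[idx]--;
				busy = true;
			}
		}
		passes++;
	} while (busy);

	return passes;
}
```

If `idx = 0` were instead done once before the `do`, the second pass would start with `idx == NUM_STATES`, skip the inner loop, and exit with work still pending.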

Re: [PATCH v7 00/15] Free some vmemmap pages of hugetlb page

2020-12-03 Thread Mike Kravetz
As previously mentioned, I feel qualified to review the hugetlb changes and some other closely related changes. However, this patch set is touching quite a few areas and I do not feel qualified to make authoritative statements about them all. I too hope others will take a look. -- Mike Kravetz

Re: [PATCH v18 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page

2021-03-11 Thread Mike Kravetz
On 3/11/21 9:59 AM, Mike Kravetz wrote: > On 3/11/21 4:17 AM, Michal Hocko wrote: >>> Yeah per cpu preempt counting shouldn't be noticeable but I have to >>> confess I haven't benchmarked it. >> >> But all this seems moot now >> http://lkml.kern

Re: [PATCH 0/3] Add support for free vmemmap pages of HugeTLB for arm64

2021-03-12 Thread Mike Kravetz
going wrong. > Are you specifying 'hugetlb_free_vmemmap=on' on the kernel command line? This feature is only enabled if you 'opt in' via the command line option. -- Mike Kravetz

Re: [PATCH v18 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page

2021-03-12 Thread Mike Kravetz
On 3/12/21 12:15 AM, Michal Hocko wrote: > On Thu 11-03-21 14:53:08, Mike Kravetz wrote: >> On 3/11/21 9:59 AM, Mike Kravetz wrote: >>> On 3/11/21 4:17 AM, Michal Hocko wrote: >>>>> Yeah per cpu preempt counting shouldn't be noticeable but I have to

Re: [PATCH v2] hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings

2021-03-12 Thread Mike Kravetz
-- > mm/hugetlb.c | 42 ++ > mm/hugetlb_cgroup.c| 11 +++-- > 3 files changed, 60 insertions(+), 8 deletions(-) Just a few minor nits below, all in comments. It is not required, but would be nice to update these. Code lo

Re: [PATCH 3/5] hugetlb_cgroup: remove unnecessary VM_BUG_ON_PAGE in hugetlb_cgroup_migrate()

2021-03-12 Thread Mike Kravetz
> --- > mm/hugetlb_cgroup.c | 1 - > 1 file changed, 1 deletion(-) Reviewed-by: Mike Kravetz -- Mike Kravetz > > diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c > index 8668ba87cfe4..3dde6ddf0170 100644 > --- a/mm/hugetlb_cgroup.c > +++ b/mm/hugetlb_cgroup.c > @@ -785,

Re: [PATCH 4/5] mm/hugetlb: simplify the code when alloc_huge_page() failed in hugetlb_no_page()

2021-03-12 Thread Mike Kravetz
xisting code made that very clear. Would have been even more clear with an unlikely modifier. In any case, the lengthy comment above this code makes it clear why the check is there. Code changes are fine. Reviewed-by: Mike Kravetz -- Mike Kravetz >

Re: [PATCH 5/5] mm/hugetlb: avoid calculating fault_mutex_hash in truncate_op case

2021-03-12 Thread Mike Kravetz
sh = 0; Do we need to initialize hash here? I would not bring this up normally, but the purpose of the patch is to save cpu cycles. -- Mike Kravetz > > index = page->index; > - hash = hug

Re: [PATCH 5/5] mm/hugetlb: avoid calculating fault_mutex_hash in truncate_op case

2021-03-13 Thread Mike Kravetz
On 3/12/21 6:49 PM, Miaohe Lin wrote: > Hi: > On 2021/3/13 4:03, Mike Kravetz wrote: >> On 3/8/21 3:28 AM, Miaohe Lin wrote: >>> The fault_mutex hashing overhead can be avoided in truncate_op case because >>> page faults can not race with truncation in this rout

Re: [PATCH v2 1/5] hugetlb: use page.private for hugetlb specific page flags

2021-01-20 Thread Mike Kravetz
On 1/20/21 1:30 AM, Oscar Salvador wrote: > On Tue, Jan 19, 2021 at 05:30:45PM -0800, Mike Kravetz wrote: >> + * Macros to create test, set and clear function definitions for >> + * hugetlb specific page flags. >> + */ >> +#ifdef CONFIG_HUGETLB_PAGE >> +#d
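The idea quoted in this snippet, macros that generate test, set and clear functions for hugetlb specific flags kept in a private word, can be sketched in user space roughly as below. Everything here is an assumption made for illustration: `struct sketch_page`, the `SKETCH_` names, and the `SetHPage`/`ClearHPage` prefixes are not taken from the snippet, and the bit operations are plain (non-atomic) C rather than kernel bitops.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the private word matters. */
struct sketch_page {
	unsigned long private;
};

/* Hypothetical bit numbers, named after flags mentioned in the series. */
enum sketch_hpage_flags {
	SKETCH_HPG_MIGRATABLE,
	SKETCH_HPG_TEMPORARY,
	SKETCH_HPG_FREED,
};

/* Generate test, set and clear helpers for one flag bit. */
#define SKETCH_HPAGEFLAG(uname, bit)					\
static inline bool HPage##uname(const struct sketch_page *page)	\
{									\
	assert(page);							\
	return (page->private >> (bit)) & 1UL;				\
}									\
static inline void SetHPage##uname(struct sketch_page *page)		\
{									\
	assert(page);							\
	page->private |= 1UL << (bit);					\
}									\
static inline void ClearHPage##uname(struct sketch_page *page)		\
{									\
	assert(page);							\
	page->private &= ~(1UL << (bit));				\
}

SKETCH_HPAGEFLAG(Migratable, SKETCH_HPG_MIGRATABLE)
SKETCH_HPAGEFLAG(Temporary, SKETCH_HPG_TEMPORARY)
SKETCH_HPAGEFLAG(Freed, SKETCH_HPG_FREED)
```

Generating the accessors from one macro keeps every flag's helpers uniform, so adding a flag is a one-line change plus a new enum entry.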

Re: [PATCH v2 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag

2021-01-20 Thread Mike Kravetz
On 1/20/21 2:09 AM, Oscar Salvador wrote: > On Tue, Jan 19, 2021 at 05:30:48PM -0800, Mike Kravetz wrote: >> Use new hugetlb specific HPageTemporary flag to replace the >> PageHugeTemporary() interfaces. >> >> Signed-off-by: Mike Kravetz > > I would have added

Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag

2021-01-20 Thread Mike Kravetz
On 1/20/21 2:00 AM, Oscar Salvador wrote: > On Wed, Jan 20, 2021 at 10:59:05AM +0100, Oscar Salvador wrote: >> On Tue, Jan 19, 2021 at 05:30:46PM -0800, Mike Kravetz wrote: >>> Use the new hugetlb page specific flag HPageMigratable to replace the >>> page_huge_activ

Re: [PATCH v18 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page

2021-03-10 Thread Mike Kravetz
on configurations. I'll put together a separate patch where we can discuss the merits of making the change from !in_task to in_atomic, and what work remains in this put_page area. -- Mike Kravetz

Re: [PATCH] mm/hugetlb: Fix build with !ARCH_WANT_HUGE_PMD_SHARE

2021-03-10 Thread Mike Kravetz
xactly sure how this is supposed to be handled. > Cc: Andrew Morton > Cc: Mike Kravetz > Cc: Axel Rasmussen > Reported-by: Naresh Kamboju > Tested-by: Naresh Kamboju > Signed-off-by: Peter Xu > --- > mm/hugetlb.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deleti

Re: [RFC PATCH 0/3] hugetlb: add demote/split page functionality

2021-03-10 Thread Mike Kravetz
On 3/10/21 8:23 AM, Michal Hocko wrote: > On Mon 08-03-21 16:18:52, Mike Kravetz wrote: > [...] >> Converting larger to smaller hugetlb pages can be accomplished today by >> first freeing the larger page to the buddy allocator and then allocating >> the smaller pages.

Re: [RFC PATCH 0/3] hugetlb: add demote/split page functionality

2021-03-10 Thread Mike Kravetz
may sound crazy, but I think it may be the long term goal. -- Mike Kravetz

Re: [PATCH v18 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page

2021-03-10 Thread Mike Kravetz
On 3/10/21 1:49 PM, Paul E. McKenney wrote: > On Wed, Mar 10, 2021 at 10:11:22PM +0100, Michal Hocko wrote: >> On Wed 10-03-21 10:56:08, Mike Kravetz wrote: >>> On 3/10/21 7:19 AM, Michal Hocko wrote: >>>> On Mon 08-03-21 18:28:02, Muchun Song wrote: >>>

[PATCH] hugetlb: select PREEMPT_COUNT if HUGETLB_PAGE for in_atomic use

2021-03-10 Thread Mike Kravetz
[2] https://lore.kernel.org/linux-mm/yejji9oawhuza...@dhcp22.suse.cz/ [3] https://lore.kernel.org/linux-mm/ydzaawk41k4gd...@dhcp22.suse.cz/ Suggested-by: Michal Hocko Signed-off-by: Mike Kravetz --- fs/Kconfig | 1 + mm/hugetlb.c | 10 +- 2 files changed, 6 insertions(+), 5 deletions(-) diff

Re: [PATCH] hugetlb: select PREEMPT_COUNT if HUGETLB_PAGE for in_atomic use

2021-03-11 Thread Mike Kravetz
>> >> The code really doesn't look _that_ complicated. > > Fair enough. As I've said I am not a great fan of this patch either > but it is a quick fix for a likely long term problem. If reworking the > hugetlb locking is preferable then be it. Thank you Michal and Peter. This patch was mostly about starting a discussion, as this topic came up in a couple different places. I included the 'train wreck' of how we got here just for a bit of history. I'll start working on a proper fix. -- Mike Kravetz

Re: [PATCH v18 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page

2021-03-11 Thread Mike Kravetz
ge requests must be sent to a workqueue. Any ideas on how to address this? -- Mike Kravetz

Re: possible deadlock in sk_clone_lock

2021-03-01 Thread Mike Kravetz
tlb_lock irq safe would not help. Again, I may be missing something. Note that we also are considering doing more with the hugetlb lock dropped in this path in the 'free vmemmap of hugetlb pages' series. Since we need to do some work that could block in this path, it seems like we really need to use a workqueue. It is too bad that there is not an interface to identify all the cases where interrupts are disabled. -- Mike Kravetz

[RFC PATCH 1/3] hugetlb: add demote hugetlb page sysfs interfaces

2021-03-08 Thread Mike Kravetz
number of hugetlb pages to an appropriate number of demote_size pages. This patch does not provide full demote functionality. It only provides the sysfs interfaces and uses existing code to free pages to the buddy allocator if demote_size == PAGESIZE. Signed-off-by: Mike Kravetz --- include

[RFC PATCH 3/3] hugetlb: add hugetlb demote page support

2021-03-08 Thread Mike Kravetz
Demote page functionality will split a huge page into a number of huge pages of a smaller size. For example, on x86 a 1GB huge page can be demoted into 512 2M huge pages. Demotion is done 'in place' by simply splitting the huge page. Signed-off-by: Mike Kravetz --- mm/huge
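The arithmetic behind the example in this snippet (one 1GB huge page becoming 512 2M huge pages) is a plain size ratio. A minimal sketch, assuming power-of-two page sizes; `demoted_page_count()` and the `SKETCH_SZ_*` constants are hypothetical names used only here:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_2M (UINT64_C(2) << 20)
#define SKETCH_SZ_1G (UINT64_C(1) << 30)

/* Number of smaller pages produced by splitting one larger page in place. */
static uint64_t demoted_page_count(uint64_t src_size, uint64_t dst_size)
{
	/* The source page must be an exact multiple of the target size. */
	assert(dst_size != 0 && src_size >= dst_size);
	assert(src_size % dst_size == 0);
	return src_size / dst_size;
}
```

Calling `demoted_page_count(SKETCH_SZ_1G, SKETCH_SZ_2M)` returns 512, matching the x86 example in the patch description.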

[RFC PATCH 0/3] hugetlb: add demote/split page functionality

2021-03-08 Thread Mike Kravetz
erved huge pages. Therefore, when a value is written to the sysfs demote file that value is only the maximum number of pages which will be demoted. It is possible fewer will actually be demoted. If demote_size is PAGESIZE, demote will simply free pages to the buddy allocator. Mike Kravetz (3): hu

[RFC PATCH 2/3] hugetlb: add HPageCma flag and code to free non-gigantic pages in CMA

2021-03-08 Thread Mike Kravetz
appropriate action. Signed-off-by: Mike Kravetz --- include/linux/hugetlb.h | 7 +++ mm/hugetlb.c| 27 +-- 2 files changed, 32 insertions(+), 2 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 5e9d6c8ab411..b4ec2daea5aa

Re: [PATCH] hugetlbfs: make hugepage size conversion more readable

2021-01-21 Thread Mike Kravetz
the kernel, size in KB is often calculated as (size << (PAGE_SHIFT - 10)). If you change the calculation in the hugetlb code to be: huge_page_size(h) << (PAGE_SHIFT - 10) my compiler will actually reduce the size of the routine by one instruction. -- Mike Kravetz > return mnt; > } > >

Re: [PATCH v2] hugetlbfs: remove meaningless variable avoid_reserve

2021-01-21 Thread Mike Kravetz
dd a comment offered by Mike Kravetz to explain this. > > Reviewed-by: David Hildenbrand > Signed-off-by: Miaohe Lin > Cc: Mike Kravetz > --- > fs/hugetlbfs/inode.c | 12 +--- > 1 file changed, 9 insertions(+), 3 deletions(-) Reviewed-by: Mike Kravetz > >

Re: [PATCH] hugetlbfs: make hugepage size conversion more readable

2021-01-21 Thread Mike Kravetz
On 1/21/21 5:42 PM, Miaohe Lin wrote: > Hi: > On 2021/1/22 3:00, Mike Kravetz wrote: >> On 1/20/21 1:23 AM, Miaohe Lin wrote: >>> The calculation 1U << (h->order + PAGE_SHIFT - 10) is actually equal to >>> (PAGE_SHIFT << (h->order)) >> 10.
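The two ways of computing a huge page size in KB discussed in this thread can be compared with a small user-space sketch. It assumes 4 KB base pages (`SKETCH_PAGE_SHIFT` of 12), and it assumes the readable form shifts the page size in bytes (1 << PAGE_SHIFT) by the order; both function names are hypothetical.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12u /* assumption: 4 KB base pages */
#define SKETCH_PAGE_SIZE  (1ul << SKETCH_PAGE_SHIFT)

/* Compact form: shift 1 by the combined exponent, minus 10 for KB. */
static unsigned long size_kb_by_shift(unsigned int order)
{
	assert(order + SKETCH_PAGE_SHIFT - 10 < 8 * sizeof(unsigned long));
	return 1ul << (order + SKETCH_PAGE_SHIFT - 10);
}

/* Readable form: compute the size in bytes, then divide by 1024. */
static unsigned long size_kb_by_bytes(unsigned int order)
{
	assert(order + SKETCH_PAGE_SHIFT < 8 * sizeof(unsigned long));
	return (SKETCH_PAGE_SIZE << order) >> 10;
}
```

For order 9 (a 2 MB page with 4 KB base pages) both functions return 2048, and they agree for every order that fits in an `unsigned long`.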

Re: [RFC PATCH 0/3] hugetlb: add demote/split page functionality

2021-03-09 Thread Mike Kravetz
On 3/9/21 1:01 AM, David Hildenbrand wrote: > On 09.03.21 01:18, Mike Kravetz wrote: >> To address these issues, introduce the concept of hugetlb page demotion. >> Demotion provides a means of 'in place' splitting a hugetlb page to >> pages of a smaller size. For

Re: [RFC PATCH 0/3] hugetlb: add demote/split page functionality

2021-03-09 Thread Mike Kravetz
On 3/9/21 9:50 AM, David Hildenbrand wrote: > On 09.03.21 18:11, Mike Kravetz wrote: >> On 3/9/21 1:01 AM, David Hildenbrand wrote: >>> On 09.03.21 01:18, Mike Kravetz wrote: >>>> To address these issues, introduce the concept of hugetlb page demotion. >>>

Re: [PATCH RFC 00/30] userfaultfd-wp: Support shmem and hugetlbfs

2021-02-05 Thread Mike Kravetz
l try to take a closer look at the areas where efforts overlap. -- Mike Kravetz

Re: [PATCH] mm: hugetlb: fix missing put_page in gather_surplus_pages()

2021-01-26 Thread Mike Kravetz
the hugetlb page on the free list with a count of 1. There is no check in the enqueue code. When we dequeue the page, set_page_refcounted() is used to set the count to 1 without looking at the current value. And, all the other VM_DEBUG macros are off so we mostly do not notice the bug. Thanks again, Reviewed-by: Mike Kravetz -- Mike Kravetz > } > free: >

Re: [PATCH 1/2] mm/hugetlb: grab head page refcount once per group of subpages

2021-01-26 Thread Mike Kravetz
+++--- > 3 files changed, 29 insertions(+), 22 deletions(-) Thanks. Nice straight forward improvement. Reviewed-by: Mike Kravetz -- Mike Kravetz > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index a5d618d08506..0d793486822b 100644 > --- a/include/li

Re: [PATCH 2/2] mm/hugetlb: refactor subpage recording

2021-01-26 Thread Mike Kravetz
pages are so large that we do not guarantee that page++ pointer * arithmetic will work across the entire page. We need something more * specialized. */ static void __copy_gigantic_page(struct page *dst, struct page *src, int nr_pages) -- Mike Kravetz > +

Re: [PATCH] mm/hugetlb: Fix use after free when subpool max_hpages accounting is not enabled

2021-01-26 Thread Mike Kravetz
by: Miaohe Lin > --- > mm/hugetlb.c | 16 +--- > 1 file changed, 13 insertions(+), 3 deletions(-) Thanks, Reviewed-by: Mike Kravetz -- Mike Kravetz > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 777bc0e45bf3..53ea65d1c5ab 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c

Re: [PATCH 2/2] mm/hugetlb: refactor subpage recording

2021-01-26 Thread Mike Kravetz
On 1/26/21 4:07 PM, Jason Gunthorpe wrote: > On Tue, Jan 26, 2021 at 01:21:46PM -0800, Mike Kravetz wrote: >> On 1/26/21 11:21 AM, Joao Martins wrote: >>> On 1/26/21 6:08 PM, Mike Kravetz wrote: >>>> On 1/25/21 12:57 PM, Joao Martins wrote: >>>>> >

Re: [PATCH] mm/hugetlb: Simplify the calculation of variables

2021-01-26 Thread Mike Kravetz
> 1 file changed, 1 insertion(+), 2 deletions(-) Thanks, Reviewed-by: Mike Kravetz -- Mike Kravetz > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index cbf32d2..5e6a6e7 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -3367,8 +3367,7 @@ static unsigned in

Re: [PATCH 2/2] mm/hugetlb: refactor subpage recording

2021-01-26 Thread Mike Kravetz
On 1/26/21 11:21 AM, Joao Martins wrote: > On 1/26/21 6:08 PM, Mike Kravetz wrote: >> On 1/25/21 12:57 PM, Joao Martins wrote: >>> >>> +static void record_subpages_vmas(struct page *page, struct vm_area_struct >>> *vma, >>> +

Re: [PATCH 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one

2021-01-04 Thread Mike Kravetz
+) Thanks! Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH 2/6] hugetlbfs: fix cannot migrate the fallocated HugeTLB page

2021-01-04 Thread Mike Kravetz
); > + putback_active_hugepage(page); I'm curious why you used putback_active_hugepage() here instead of simply calling set_page_huge_active() before the put_page()? When the page was allocated, it was placed on the active list (alloc_huge_page). Therefore, the hugetlb_lock locking and

Re: [PATCH 3/6] mm: hugetlb: fix a race between freeing and dissolving the page

2021-01-04 Thread Mike Kravetz
* // page is freed to the buddy > + * spin_unlock(&hugetlb_lock) > + * spin_lock(&hugetlb_lock) > + * enqueue_huge_page(page) > + * // It is wrong, the pa

Re: [PATCH 4/6] mm: hugetlb: add return -EAGAIN for dissolve_free_huge_page

2021-01-04 Thread Mike Kravetz
Is it acceptable to keep retrying in that case? In addition, the 'Free some vmemmap' series may slow the free_huge_page path even more. In these worst case scenarios, I am not sure we want to just spin retrying. -- Mike Kravetz > > Signed-off-by: Muchun Song >

Re: [PATCH 5/6] mm: hugetlb: fix a race between isolating and freeing page

2021-01-04 Thread Mike Kravetz
on CPU0. Because > it is already freed to the buddy allocator. > > Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle > hugepage") > Signed-off-by: Muchun Song > --- > mm/hugetlb.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Thanks! Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH 6/6] mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active

2021-01-04 Thread Mike Kravetz
ged, 1 deletion(-) Thanks! Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Mike Kravetz
On 1/15/21 1:17 AM, Oscar Salvador wrote: > On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote: >> Use the new hugetlb page specific flag to replace the page_huge_active >> interfaces. By its name, page_huge_active implied that a huge page >> was on the acti

Re: [RFC PATCH 3/3] hugetlb: convert PageHugeTemporary() to HPageTempSurplus

2021-01-15 Thread Mike Kravetz
On 1/15/21 2:16 AM, Oscar Salvador wrote: > On Mon, Jan 11, 2021 at 01:01:52PM -0800, Mike Kravetz wrote: >> Use new hugetlb specific flag HPageTempSurplus to replace the >> PageHugeTemporary() interfaces. >> >> Signed-off-by: Mike Kravetz

Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Mike Kravetz
On 1/15/21 9:43 AM, Mike Kravetz wrote: > On 1/15/21 1:17 AM, Oscar Salvador wrote: >> On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote: >>> Use the new hugetlb page specific flag to replace the page_huge_active >>> interfaces. By its name, page_huge

Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

2021-01-15 Thread Mike Kravetz
to be something like: 1) allocate a fresh hugetlb page from buddy 2) free the 'migrated' free huge page back to buddy I do not think we can use the existing 'isolate-migrate' flow. Isolating a page would make it unavailable for allocation and that could cause application issues. -- Mike Kravetz

[PATCH 4/5] hugetlb: convert PageHugeTemporary() to HP_Temporary flag

2021-01-15 Thread Mike Kravetz
Use new hugetlb specific HP_Temporary flag to replace the PageHugeTemporary() interfaces. Signed-off-by: Mike Kravetz --- include/linux/hugetlb.h | 5 + mm/hugetlb.c| 36 +++- 2 files changed, 12 insertions(+), 29 deletions(-) diff --git a

[PATCH 3/5] hugetlb: only set HP_Migratable for migratable hstates

2021-01-15 Thread Mike Kravetz
onger necessary. If migration is not supported for the hstate, HP_Migratable will not be set, the page will not be isolated and no attempt will be made to migrate. Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c| 2 +- include/linux/hugetlb.h | 9 + mm/hugetlb.c

[PATCH 5/5] hugetlb: convert PageHugeFreed to HP_Freed flag

2021-01-15 Thread Mike Kravetz
Use new hugetlb specific HP_Freed flag to replace the PageHugeFreed interfaces. Signed-off-by: Mike Kravetz --- include/linux/hugetlb.h | 2 ++ mm/hugetlb.c| 23 --- 2 files changed, 6 insertions(+), 19 deletions(-) diff --git a/include/linux/hugetlb.h b

[PATCH 1/5] hugetlb: use page.private for hugetlb specific page flags

2021-01-15 Thread Mike Kravetz
subsequent patches. Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c| 12 ++-- include/linux/hugetlb.h | 61 + mm/hugetlb.c| 46 +++ 3 files changed, 87 insertions(+), 32 deletions(-) diff --git a/fs/hugetlbfs

[PATCH 0/5] create hugetlb flags to consolidate state

2021-01-15 Thread Mike Kravetz
of flag manipulation routines (Oscar) Moved flags and routines to hugetlb.h (Muchun) Changed format of page flag names (Muchun) Changed subpool routine names (Matthew) More comments in code (Oscar) Based on v5.11-rc3-mmotm-2021-01-12-01-57 Mike Kravetz (5): hugetlb: use page.private

[PATCH 2/5] hugetlb: convert page_huge_active() to HP_Migratable flag

2021-01-15 Thread Mike Kravetz
race with code freeing the page. The extra check in page_huge_active shortened the race window, but did not prevent the race. Offline code calling scan_movable_pages already deals with these races, so removing the check is acceptable. Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c

Re: [PATCH 2/5] hugetlb: convert page_huge_active() to HP_Migratable flag

2021-01-16 Thread Mike Kravetz
s need to be set/tested outside hugetlb code, so > it indeed looks nicer and more consistent to follow page-flags.h convention. > > Sorry for the noise. Thanks everyone! I was unsure about the best way to go for this. Will send out a new version in a few days using the page-flag style macros. -- Mike Kravetz

Re: [PATCH v2] hugetlbfs: make hugepage size conversion more readable

2021-01-22 Thread Mike Kravetz
Miaohe Lin > --- > fs/hugetlbfs/inode.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Thanks, Reviewed-by: Mike Kravetz -- Mike Kravetz > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > index 25c1857ff45d..c87894b221da 100644 > --- a/fs/hugetlbfs/i

[PATCH v3 4/5] hugetlb: convert PageHugeTemporary() to HPageTemporary flag

2021-01-22 Thread Mike Kravetz
. Signed-off-by: Mike Kravetz Reviewed-by: Oscar Salvador --- include/linux/hugetlb.h | 6 ++ mm/hugetlb.c| 36 +++- 2 files changed, 13 insertions(+), 29 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index cd1960541f2a

[PATCH v3 0/5] create hugetlb flags to consolidate state

2021-01-22 Thread Mike Kravetz
Based on v5.11-rc4-mmotm-2021-01-21-20-07 Mike Kravetz (5): hugetlb: use page.private for hugetlb specific page flags hugetlb: convert page_huge_active() HPageMigratable flag hugetlb: only set HPageMigratable for migratable hstates hugetlb: convert PageHugeTemporary() to HPageTemporary flag

[PATCH v3 2/5] hugetlb: convert page_huge_active() HPageMigratable flag

2021-01-22 Thread Mike Kravetz
d can race with code freeing the page. The extra check in page_huge_active shortened the race window, but did not prevent the race. Offline code calling scan_movable_pages already deals with these races, so removing the check is acceptable. Add comment to racy code. Signed-off-by: Mike Kr

[PATCH v3 3/5] hugetlb: only set HPageMigratable for migratable hstates

2021-01-22 Thread Mike Kravetz
l not be isolated and no attempt will be made to migrate. We should never get to unmap_and_move_huge_page for a page where migration is not supported, so throw a warning if we do. Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c| 2 +- include/linux/hugetlb.h | 9 + mm/huge

[PATCH v3 1/5] hugetlb: use page.private for hugetlb specific page flags

2021-01-22 Thread Mike Kravetz
rsion of other state information will happen in subsequent patches. Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c| 12 ++-- include/linux/hugetlb.h | 68 + mm/hugetlb.c| 48 +++-- 3 files changed, 96 inserti

[PATCH v3 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag

2021-01-22 Thread Mike Kravetz
Use new hugetlb specific HPageFreed flag to replace the PageHugeFreed interfaces. Signed-off-by: Mike Kravetz Reviewed-by: Oscar Salvador Reviewed-by: Muchun Song --- include/linux/hugetlb.h | 3 +++ mm/hugetlb.c| 23 --- 2 files changed, 7 insertions(+), 19

Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag

2021-01-22 Thread Mike Kravetz
On 1/21/21 10:53 PM, Miaohe Lin wrote: > Hi: > On 2021/1/20 9:30, Mike Kravetz wrote: >> Use the new hugetlb page specific flag HPageMigratable to replace the >> page_huge_active interfaces. By its name, page_huge_active implied >> that a huge page was on the activ

Re: [PATCH v13 03/12] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page

2021-01-22 Thread Mike Kravetz
, *next; > + > + list_for_each_entry_safe(page, next, list, lru) { > + list_del(&page->lru); > + free_vmemmap_page(page); > + } > +} > + > +static void vmemmap_remap_pte(pte_t *pte, unsigned long addr, > + struc

Re: [PATCH v2] Documentation/admin-guide: kernel-parameters: update CMA entries

2021-01-25 Thread Mike Kravetz
c: Jonathan Corbet > Cc: linux-...@vger.kernel.org > Cc: linux...@kvack.org > Cc: Andrew Morton > Cc: Mike Kravetz > --- > v2: rebase & resend > > Documentation/admin-guide/kernel-parameters.txt |8 > 1 file changed, 4 insertions(+), 4 de

Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page

2021-01-25 Thread Mike Kravetz
VABLE, in that case you also >> shouldn't allocate the vmemmap from there ... > > Yeah, you are right. So I tend to trigger OOM to kill other processes to > reclaim some memory when we allocate memory fails. IIUC, even non-gigantic hugetlb pages can exist in CMA. They can be migrated out of CMA if needed (except free pages in the pool, but that is a separate issue David H already noted in another thread). When we first started discussing this patch set, one suggestion was to force hugetlb pool pages to be allocated at boot time and never permit them to be freed back to the buddy allocator. A primary reason for the suggestion was to avoid this issue of needing to allocate memory when freeing a hugetlb page to buddy. IMO, that would be an unreasonable restriction for many existing hugetlb use cases. A simple thought is that we simply fail the 'freeing hugetlb page to buddy' if we can not allocate the required vmemmap pages. However, as David R says, freeing hugetlb pages to buddy is a reasonable way to free up memory in oom situations. However, failing the operation 'might' be better than looping forever trying to allocate the pages needed? As mentioned in the previous patch, it would be better to use GFP_ATOMIC to at least dip into reserves if we can. I think using pages of the hugetlb for vmemmap to cover pages of the hugetlb is the only way we can guarantee success of freeing a hugetlb page to buddy. However, this should only be used when there is no other option and could result in vmemmap pages residing in CMA or ZONE_MOVABLE. I'm not sure how much better this is than failing the free to buddy operation. I don't have a solution. Just wanted to share some thoughts. BTW, just thought of something else. Consider offlining a memory section that contains a free hugetlb page. The offline code will try to dissolve the hugetlb page (free to buddy). So, vmemmap pages will need to be allocated. We will try to allocate vmemmap pages on the same node as the hugetlb page. But, if this memory section is the last of the node all the pages will have been isolated and no allocations will succeed. Is that a possible scenario, or am I just having too many negative thoughts? -- Mike Kravetz

Re: [PATCH v3 1/5] hugetlb: use page.private for hugetlb specific page flags

2021-01-27 Thread Mike Kravetz
On 1/27/21 2:20 AM, Michal Hocko wrote: > [sorry for jumping in late] > > On Fri 22-01-21 11:52:27, Mike Kravetz wrote: >> As hugetlbfs evolved, state information about hugetlb pages was added. >> One 'convenient' way of doing this was to use available fields in ta

Re: [PATCH v3 2/5] hugetlb: convert page_huge_active() HPageMigratable flag

2021-01-27 Thread Mike Kravetz
On 1/27/21 2:25 AM, Michal Hocko wrote: > On Fri 22-01-21 11:52:28, Mike Kravetz wrote: >> Use the new hugetlb page specific flag HPageMigratable to replace the >> page_huge_active interfaces. By its name, page_huge_active implied >> that a huge page was on the active l

Re: [PATCH v3 5/5] hugetlb: convert PageHugeFreed to HPageFreed flag

2021-01-27 Thread Mike Kravetz
On 1/27/21 2:41 AM, Michal Hocko wrote: > On Fri 22-01-21 11:52:31, Mike Kravetz wrote: >> Use new hugetlb specific HPageFreed flag to replace the >> PageHugeFreed interfaces. >> >> Signed-off-by: Mike Kravetz >> Reviewed-by: Oscar Salvador >> Reviewed

Re: [PATCH v3 3/5] hugetlb: only set HPageMigratable for migratable hstates

2021-01-27 Thread Mike Kravetz
On 1/27/21 2:35 AM, Michal Hocko wrote: > On Fri 22-01-21 11:52:29, Mike Kravetz wrote: >> The HP_Migratable flag indicates a page is a candidate for migration. >> Only set the flag if the page's hstate supports migration. This allows >> the migration paths to detect no

Re: [External] Re: [PATCH 2/6] hugetlbfs: fix cannot migrate the fallocated HugeTLB page

2021-01-05 Thread Mike Kravetz
On 1/4/21 6:44 PM, Muchun Song wrote: > On Tue, Jan 5, 2021 at 6:40 AM Mike Kravetz wrote: >> >> On 1/3/21 10:58 PM, Muchun Song wrote: >>> Because we only can isolate a active page via isolate_huge_page() >>> and hugetlbfs_fallocate() forget to mark it as

Re: [External] Re: [PATCH 3/6] mm: hugetlb: fix a race between freeing and dissolving the page

2021-01-05 Thread Mike Kravetz
On 1/4/21 6:55 PM, Muchun Song wrote: > On Tue, Jan 5, 2021 at 8:02 AM Mike Kravetz wrote: >> >> On 1/3/21 10:58 PM, Muchun Song wrote: >>> There is a race condition between __free_huge_page() >>> and dissolve_free_huge_page(). >>> >>> CPU0:

Re: [External] Re: [PATCH 4/6] mm: hugetlb: add return -EAGAIN for dissolve_free_huge_page

2021-01-05 Thread Mike Kravetz
On 1/4/21 7:46 PM, Muchun Song wrote: > On Tue, Jan 5, 2021 at 11:14 AM Muchun Song wrote: >> >> On Tue, Jan 5, 2021 at 9:33 AM Mike Kravetz wrote: >>> >>> On 1/3/21 10:58 PM, Muchun Song wrote: >>>> When dissolve_free_huge_page() races with __free

Re: [PATCH v2 2/6] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page

2021-01-06 Thread Mike Kravetz
migration could race with the page fault and the page could be migrated before being added to the page table of the faulting task. This was an issue when hugetlb_no_page set_page_huge_active right after allocating and clearing the huge page. Commit cb6acd01e2e4 moved the set_page_huge_active after adding the page to the page table to address this issue. -- Mike Kravetz

Re: [PATCH v2 3/6] mm: hugetlb: fix a race between freeing and dissolving the page

2021-01-06 Thread Mike Kravetz
return page; >> @@ -1291,6 +1308,17 @@ static inline void >> destroy_compound_gigantic_page(struct page *page, >> unsigned int order) { } >> #endif >> >> +/* >> + * Because we reuse the mapping field of so

Re: [PATCH v2 2/6] mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page

2021-01-06 Thread Mike Kravetz
On 1/6/21 12:02 PM, Michal Hocko wrote: > On Wed 06-01-21 11:30:25, Mike Kravetz wrote: >> On 1/6/21 8:35 AM, Michal Hocko wrote: >>> On Wed 06-01-21 16:47:35, Muchun Song wrote: >>>> Because we only can isolate a active page via isolate_huge_page() >>>> a

Re: [PATCH] mm/hugetlb: fix deadlock in hugetlb_cow error path

2020-12-15 Thread Mike Kravetz
On 12/14/20 5:06 PM, Mike Kravetz wrote: > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index d029d938d26d..8713f8ef0f4c 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -4106,10 +4106,30 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, > struct v

Re: [PATCH v9 02/11] mm/hugetlb: Introduce a new config HUGETLB_PAGE_FREE_VMEMMAP

2020-12-15 Thread Mike Kravetz
an be saved for each 1GB HugeTLB page. When a HugeTLB page is allocated or freed, the vmemmap array representing the range associated with the page will need to be remapped. When a page is allocated, vmemmap pages are freed after remapping. When a page

Re: [PATCH v9 02/11] mm/hugetlb: Introduce a new config HUGETLB_PAGE_FREE_VMEMMAP

2020-12-15 Thread Mike Kravetz
On 12/15/20 5:03 PM, Mike Kravetz wrote: > On 12/13/20 7:45 AM, Muchun Song wrote: >> diff --git a/fs/Kconfig b/fs/Kconfig >> index 976e8b9033c4..4c3a9c614983 100644 >> --- a/fs/Kconfig >> +++ b/fs/Kconfig >> @@ -245,6 +245,21 @@ config HUGETLBFS >> config

Re: [PATCH v9 03/11] mm/hugetlb: Free the vmemmap pages associated with each HugeTLB page

2020-12-16 Thread Mike Kravetz
Not sure if the word '_reuse' is best in this function name. To me, the name implies this routine will reuse vmemmap pages. Perhaps, it makes more sense to rename as 'vmemmap_remap_free'? It will first remap, then free vmemmap. But, then I looked at the code

Re: [PATCH v9 03/11] mm/hugetlb: Free the vmemmap pages associated with each HugeTLB page

2020-12-16 Thread Mike Kravetz
On 12/16/20 2:25 PM, Oscar Salvador wrote: > On Wed, Dec 16, 2020 at 02:08:30PM -0800, Mike Kravetz wrote: >>> + * vmemmap_rmap_walk - walk vmemmap page table >>> + >>> +static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr, >>> +

Re: [PATCH v9 04/11] mm/hugetlb: Defer freeing of HugeTLB pages

2020-12-16 Thread Mike Kravetz
FP_ATOMIC to allocate the vmemmap pages. > > Signed-off-by: Muchun Song It is unfortunate we need to add this complexity, but I can not think of another way. One small comment (no required change) below. Reviewed-by: Mike Kravetz > --- > m

Re: [PATCH v9 05/11] mm/hugetlb: Allocate the vmemmap pages associated with each HugeTLB page

2020-12-16 Thread Mike Kravetz
* handle allocation failures. Once we allocate > + * vmemmap pages successfully, then we can free > + * a HugeTLB page. > + */ > + goto retry; > + } > + list_add_tail(&page->lru, list); > + } > +} > + -- Mike Kravetz

[PATCH] mm/hugetlb: fix deadlock in hugetlb_cow error path

2020-12-14 Thread Mike Kravetz
tmail.com Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") Cc: Signed-off-by: Mike Kravetz --- mm/hugetlb.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d029d938d26d..

Re: [PATCH v5 00/21] Free some vmemmap pages of hugetlb page

2020-11-20 Thread Mike Kravetz
this feature was enabled. This would eliminate a bunch of the complex code doing page table manipulation. It does not address the issue of struct page pages going away, which is being discussed here, but it could be a way to simplify the first version of this code. If this is going to be an 'opt in' feature as previously suggested, then eliminating the PMD/huge page vmemmap mapping may be acceptable. My guess is that sysadmins would only 'opt in' if they expect most of system memory to be used by hugetlb pages. We certainly have database and virtualization use cases where this is true. -- Mike Kravetz

Re: [PATCH v4 09/10] hugetlbfs: add hugetlbfs_fallocate()

2015-07-22 Thread Mike Kravetz
On 07/22/2015 03:03 PM, Andrew Morton wrote: On Tue, 21 Jul 2015 11:09:43 -0700 Mike Kravetz wrote: ... + + if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) + return -EOPNOTSUPP; EOPNOTSUPP is a networking thing. It's inappropriate here. The problem i
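
The mode check quoted above can be sketched in isolation as follows. This is a userspace illustration only, not the code from the patch: the function name and the SKETCH_ constants are hypothetical, with the flag values assumed to match FALLOC_FL_KEEP_SIZE (0x01) and FALLOC_FL_PUNCH_HOLE (0x02) from the Linux UAPI headers. It keeps the -EOPNOTSUPP return from the quoted hunk, which is the value the review comment objects to.

```c
#include <assert.h>
#include <errno.h>

/* Assumed to mirror FALLOC_FL_KEEP_SIZE and FALLOC_FL_PUNCH_HOLE;
 * defined locally so the sketch does not depend on kernel headers. */
#define SKETCH_FALLOC_FL_KEEP_SIZE  0x01
#define SKETCH_FALLOC_FL_PUNCH_HOLE 0x02

static_assert((SKETCH_FALLOC_FL_KEEP_SIZE & SKETCH_FALLOC_FL_PUNCH_HOLE) == 0,
              "flag bits must not overlap");

/* Hypothetical helper: reject any mode bit outside the supported set,
 * returning the same error value as the quoted hunk. */
static int sketch_check_fallocate_mode(int mode)
{
    if (mode & ~(SKETCH_FALLOC_FL_KEEP_SIZE | SKETCH_FALLOC_FL_PUNCH_HOLE))
        return -EOPNOTSUPP;
    return 0;
}
```

A mode of 0, or any combination of the two supported flags, passes; any other bit is rejected.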

Re: [PATCH v4 00/10] hugetlbfs: add fallocate support

2015-07-22 Thread Mike Kravetz
tlb selftests in the kernel and pointing people to libhugetlbfs is the way to go. From a very quick scan of the selftests, I would guess libhugetlbfs covers everything in those tests. I'm willing to verify the testing provided by selftests is included in libhugetlbfs, and remove selftests if that i

Re: [patch] mmap.2: document the munmap exception for underlying page size

2015-07-22 Thread Mike Kravetz
etlbfs, I believe the offset must be a multiple of the hugetlb page size. A similar comment/exception about using the "underlying page size" would apply here as well. -- Mike Kravetz -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a m

Re: [PATCH v4 00/10] hugetlbfs: add fallocate support

2015-07-23 Thread Mike Kravetz
On 07/23/2015 08:17 AM, Eric B Munson wrote: On Wed, 22 Jul 2015, Mike Kravetz wrote: On 07/22/2015 03:30 PM, Andrew Morton wrote: On Wed, 22 Jul 2015 15:19:54 -0700 Davidlohr Bueso wrote: I didn't know that libhugetlbfs has tests. I wonder if that makes tools/testing/selftests

Re: [PATCH v4 00/10] hugetlbfs: add fallocate support

2015-07-23 Thread Mike Kravetz
On 07/23/2015 10:17 AM, Eric B Munson wrote: On Thu, 23 Jul 2015, Mike Kravetz wrote: On 07/23/2015 08:17 AM, Eric B Munson wrote: On Wed, 22 Jul 2015, Mike Kravetz wrote: On 07/22/2015 03:30 PM, Andrew Morton wrote: On Wed, 22 Jul 2015 15:19:54 -0700 Davidlohr Bueso wrote: I didn't

Re: [RFC v5 PATCH 8/9] hugetlbfs: add hugetlbfs_fallocate()

2015-07-24 Thread Mike Kravetz
: one if CONFIG_NUMA is defined and one (a no-op) if not. I am happy with either, but am a relative newbie in this area so am looking for a little guidance. -- Mike Kravetz --- From 04c37a979c5ce8cd39d3243e4e2c12905e4f1e6e Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Fri, 24 Jul 2015 08:14

Re: [PATCH 1/3] Reverted "selftests: add hugetlbfstest"

2015-08-03 Thread Mike Kravetz
igned-off-by: Mike Kravetz --- tools/testing/selftests/vm/Makefile| 1 - tools/testing/selftests/vm/hugetlbfstest.c | 86 -- tools/testing/selftests/vm/run_vmtests | 11 3 files changed, 98 deletions(-) delete mode 100644 tools/testing/selftes

[PATCH v2 00/10] hugetlbfs: add fallocate support

2015-07-08 Thread Mike Kravetz
ror handling in region_del() when kmalloc() fails still needs to be addressed madvise remove support remains Mike Kravetz (10): mm/hugetlb: add cache of descriptors to resv_map for region_add mm/hugetlb: add region_del() to delete a specific range of entries mm/hugetlb: expose hugetlb fault m

[PATCH v2 01/10] mm/hugetlb: add cache of descriptors to resv_map for region_add

2015-07-08 Thread Mike Kravetz
or callers creating reservations with vma_needs_reservation/vma_commit_reservation. Signed-off-by: Mike Kravetz --- include/linux/hugetlb.h | 3 + mm/hugetlb.c| 168 ++-- 2 files changed, 152 insertions(+), 19 deletions(-) diff --git a/inc

[PATCH v2 04/10] hugetlbfs: hugetlb_vmtruncate_list() needs to take a range to delete

2015-07-08 Thread Mike Kravetz
callers to add 0 as end of range. Since the routine will be used in hole punch as well as truncate operations, it is more appropriately renamed to hugetlb_vmdelete_list(). Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c | 25 ++--- 1 file changed, 18 insertions(+), 7

[PATCH v2 06/10] mm/hugetlb: vma_has_reserves() needs to handle fallocate hole punch

2015-07-08 Thread Mike Kravetz
). vma_has_reserves is passed "chg" which indicates whether or not a region/reserve map is present. Use this to determine if reserves are actually present or were removed via hole punch. Signed-off-by: Mike Kravetz --- mm/hugetlb.c | 16 +--- 1 file changed, 13 insertions(+), 3

[PATCH v2 03/10] mm/hugetlb: expose hugetlb fault mutex for use by fallocate

2015-07-08 Thread Mike Kravetz
changes to be more consistent with other global hugetlb symbols. Signed-off-by: Mike Kravetz --- include/linux/hugetlb.h | 5 + mm/hugetlb.c| 20 ++-- 2 files changed, 15 insertions(+), 10 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h

[PATCH v2 10/10] mm: madvise allow remove operation for hugetlbfs

2015-07-08 Thread Mike Kravetz
Now that we have hole punching support for hugetlbfs, we can also support the MADV_REMOVE interface to it. Signed-off-by: Dave Hansen Signed-off-by: Mike Kravetz --- mm/madvise.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/madvise.c b/mm/madvise.c index 70ce0d4

[PATCH v2 05/10] hugetlbfs: truncate_hugepages() takes a range of pages

2015-07-08 Thread Mike Kravetz
() is also modified to take a range of pages. hugetlb_unreserve_pages is modified to detect an error from region_del and pass it back to the caller. Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c| 98 - include/linux/hugetlb.h | 4 +- mm
