issue, a related bug in hugetlb_cgroup_css_offline was noticed.
The hstate index is not reinitialized each time through the do-while loop.
Fix this as well.
Fixes: 1adc4d419aa2 ("hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations")
Cc:
Reported-by: Adrian Moren
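For reference, a rough sketch of the loop shape being described (abbreviated,
not a verbatim copy of the upstream routine); the point is only that idx must
be reset on every pass of the do-while, otherwise later passes call
hugetlb_cgroup_move_parent() with a stale, ever-growing index:

static void hugetlb_cgroup_css_offline(struct cgroup_subsys_state *css)
{
	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
	struct hstate *h;
	struct page *page;
	int idx;

	do {
		idx = 0;	/* reinitialize the hstate index each pass */
		for_each_hstate(h) {
			spin_lock(&hugetlb_lock);
			list_for_each_entry(page, &h->hugepage_activelist, lru)
				hugetlb_cgroup_move_parent(idx, h_cg, page);
			spin_unlock(&hugetlb_lock);
			idx++;
		}
		cond_resched();
	} while (hugetlb_cgroup_have_usage(h_cg));
}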
As previously mentioned, I feel qualified to review the hugetlb changes
and some other closely related changes. However, this patch set is
touching quite a few areas and I do not feel qualified to make authoritative
statements about them all. I too hope others will take a look.
--
Mike Kravetz
On 3/11/21 9:59 AM, Mike Kravetz wrote:
> On 3/11/21 4:17 AM, Michal Hocko wrote:
>>> Yeah per cpu preempt counting shouldn't be noticeable but I have to
>>> confess I haven't benchmarked it.
>>
>> But all this seems moot now
>> http://lkml.kern
going wrong.
>
Are you specifying 'hugetlb_free_vmemmap=on' on the kernel command line?
This feature is only enabled if you 'opt in' via the command line option.
--
Mike Kravetz
On 3/12/21 12:15 AM, Michal Hocko wrote:
> On Thu 11-03-21 14:53:08, Mike Kravetz wrote:
>> On 3/11/21 9:59 AM, Mike Kravetz wrote:
>>> On 3/11/21 4:17 AM, Michal Hocko wrote:
>>>>> Yeah per cpu preempt counting shouldn't be noticeable but I have to
--
> mm/hugetlb.c | 42 ++
> mm/hugetlb_cgroup.c| 11 +++--
> 3 files changed, 60 insertions(+), 8 deletions(-)
Just a few minor nits below, all in comments. It is not required, but
would be nice to update these. Code lo
> ---
> mm/hugetlb_cgroup.c | 1 -
> 1 file changed, 1 deletion(-)
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index 8668ba87cfe4..3dde6ddf0170 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -785,
xisting code
made that very clear. Would have been even more clear with an unlikely
modifier. In any case, the lengthy comment above this code makes it
clear why the check is there. Code changes are fine.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
sh = 0;
Do we need to initialize hash here?
I would not bring this up normally, but the purpose of the patch is to save
cpu cycles.
--
Mike Kravetz
>
> index = page->index;
> - hash = hug
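For context, a minimal sketch of the pattern under discussion (presumably in
remove_inode_hugepages(); variable names approximate): the hash is computed
and used only under !truncate_op, so initializing it up front buys nothing:

	u32 hash = 0;		/* the initialization being questioned */

	if (!truncate_op) {
		/* hole punch: page faults can race, take the fault mutex */
		hash = hugetlb_fault_mutex_hash(mapping, index);
		mutex_lock(&hugetlb_fault_mutex_table[hash]);
	}

	/* ... remove the page from the page cache ... */

	if (!truncate_op)
		mutex_unlock(&hugetlb_fault_mutex_table[hash]);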
On 3/12/21 6:49 PM, Miaohe Lin wrote:
> Hi:
> On 2021/3/13 4:03, Mike Kravetz wrote:
>> On 3/8/21 3:28 AM, Miaohe Lin wrote:
>>> The fault_mutex hashing overhead can be avoided in truncate_op case because
>>> page faults can not race with truncation in this rout
On 1/20/21 1:30 AM, Oscar Salvador wrote:
> On Tue, Jan 19, 2021 at 05:30:45PM -0800, Mike Kravetz wrote:
>> + * Macros to create test, set and clear function definitions for
>> + * hugetlb specific page flags.
>> + */
>> +#ifdef CONFIG_HUGETLB_PAGE
>> +#d
On 1/20/21 2:09 AM, Oscar Salvador wrote:
> On Tue, Jan 19, 2021 at 05:30:48PM -0800, Mike Kravetz wrote:
>> Use new hugetlb specific HPageTemporary flag to replace the
>> PageHugeTemporary() interfaces.
>>
>> Signed-off-by: Mike Kravetz
>
> I would have added
On 1/20/21 2:00 AM, Oscar Salvador wrote:
> On Wed, Jan 20, 2021 at 10:59:05AM +0100, Oscar Salvador wrote:
>> On Tue, Jan 19, 2021 at 05:30:46PM -0800, Mike Kravetz wrote:
>>> Use the new hugetlb page specific flag HPageMigratable to replace the
>>> page_huge_activ
on
configurations.
I'll put together a separate patch where we can discuss the merits of
making the change from !in_task to in_atomic, and what work remains in
this put_page area.
--
Mike Kravetz
xactly sure how this is
supposed to be handled.
> Cc: Andrew Morton
> Cc: Mike Kravetz
> Cc: Axel Rasmussen
> Reported-by: Naresh Kamboju
> Tested-by: Naresh Kamboju
> Signed-off-by: Peter Xu
> ---
> mm/hugetlb.c | 8 +---
> 1 file changed, 5 insertions(+), 3 deleti
On 3/10/21 8:23 AM, Michal Hocko wrote:
> On Mon 08-03-21 16:18:52, Mike Kravetz wrote:
> [...]
>> Converting larger to smaller hugetlb pages can be accomplished today by
>> first freeing the larger page to the buddy allocator and then allocating
>> the smaller pages.
may sound crazy, but I think it may be the long term goal.
--
Mike Kravetz
On 3/10/21 1:49 PM, Paul E. McKenney wrote:
> On Wed, Mar 10, 2021 at 10:11:22PM +0100, Michal Hocko wrote:
>> On Wed 10-03-21 10:56:08, Mike Kravetz wrote:
>>> On 3/10/21 7:19 AM, Michal Hocko wrote:
>>>> On Mon 08-03-21 18:28:02, Muchun Song wrote:
>>>&
[2] https://lore.kernel.org/linux-mm/yejji9oawhuza...@dhcp22.suse.cz/
[3] https://lore.kernel.org/linux-mm/ydzaawk41k4gd...@dhcp22.suse.cz/
Suggested-by: Michal Hocko
Signed-off-by: Mike Kravetz
---
fs/Kconfig | 1 +
mm/hugetlb.c | 10 +-
2 files changed, 6 insertions(+), 5 deletions(-)
diff
>>
>> The code really doesn't look _that_ complicated.
>
> Fair enough. As I've said I am not a great fan of this patch either
> but it is a quick fix for a likely long term problem. If reworking the
> hugetlb locking is preferable then be it.
Thank you Michal and Peter. This patch was mostly about starting a
discussion, as this topic came up in a couple different places. I
included the 'train wreck' of how we got here just for a bit of history.
I'll start working on a proper fix.
--
Mike Kravetz
ge requests must be sent to a workqueue.
Any ideas on how to address this?
--
Mike Kravetz
tlb_lock irq safe would not help.
Again, I may be missing something.
Note that we also are considering doing more with the hugetlb lock
dropped in this path in the 'free vmemmap of hugetlb pages' series.
Since we need to do some work that could block in this path, it seems
like we really need to use a workqueue. It is too bad that there is not
an interface to identify all the cases where interrupts are disabled.
--
Mike Kravetz
number of hugetlb pages to an appropriate number of demote_size pages.
This patch does not provide full demote functionality. It only provides
the sysfs interfaces and uses existing code to free pages to the buddy
allocator if demote_size == PAGESIZE.
Signed-off-by: Mike Kravetz
---
include
Demote page functionality will split a huge page into a number of huge
pages of a smaller size. For example, on x86 a 1GB huge page can be
demoted into 512 2M huge pages. Demotion is done 'in place' by simply
splitting the huge page.
Signed-off-by: Mike Kravetz
---
mm/huge
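To make the 1GB -> 512 x 2MB example above concrete, a purely illustrative
helper (not part of the series) showing the size arithmetic:

/* how many dst-size pages one src-size page splits into, e.g. 1GB/2MB = 512 */
static unsigned long nr_demoted_pages(struct hstate *src, struct hstate *dst)
{
	return pages_per_huge_page(src) / pages_per_huge_page(dst);
}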
erved huge pages. Therefore, when a value is written to the sysfs demote
file that value is only the maximum number of pages which will be demoted.
It is possible fewer will actually be demoted.
If demote_size is PAGESIZE, demote will simply free pages to the buddy
allocator.
Mike Kravetz (3):
hu
appropriate action.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 7 +++
mm/hugetlb.c| 27 +--
2 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5e9d6c8ab411..b4ec2daea5aa
the kernel, size in KB is often calculated as (size << (PAGE_SHIFT - 10)).
If you change the calculation in the hugetlb code to be:
huge_page_size(h) << (PAGE_SHIFT - 10)
my compiler will actually reduce the size of the routine by one instruction.
--
Mike Kravetz
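Purely to illustrate the shift convention referenced above (illustrative
helper, not from the patch; assumes the usual case of PAGE_SHIFT >= 10):

/* nr_pages << (PAGE_SHIFT - 10) == nr_pages * PAGE_SIZE / 1024, e.g. 1 page -> 4 KB */
static inline unsigned long pages_to_kb(unsigned long nr_pages)
{
	return nr_pages << (PAGE_SHIFT - 10);
}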
> return mnt;
> }
>
>
dd a comment offered by Mike Kravetz to explain this.
>
> Reviewed-by: David Hildenbrand
> Signed-off-by: Miaohe Lin
> Cc: Mike Kravetz
> ---
> fs/hugetlbfs/inode.c | 12 +---
> 1 file changed, 9 insertions(+), 3 deletions(-)
Reviewed-by: Mike Kravetz
>
>
On 1/21/21 5:42 PM, Miaohe Lin wrote:
> Hi:
> On 2021/1/22 3:00, Mike Kravetz wrote:
>> On 1/20/21 1:23 AM, Miaohe Lin wrote:
>>> The calculation 1U << (h->order + PAGE_SHIFT - 10) is actually equal to
>>> (PAGE_SHIFT << (h->order)) >> 10.
On 3/9/21 1:01 AM, David Hildenbrand wrote:
> On 09.03.21 01:18, Mike Kravetz wrote:
>> To address these issues, introduce the concept of hugetlb page demotion.
>> Demotion provides a means of 'in place' splitting a hugetlb page to
>> pages of a smaller size. For
On 3/9/21 9:50 AM, David Hildenbrand wrote:
> On 09.03.21 18:11, Mike Kravetz wrote:
>> On 3/9/21 1:01 AM, David Hildenbrand wrote:
>>> On 09.03.21 01:18, Mike Kravetz wrote:
>>>> To address these issues, introduce the concept of hugetlb page demotion.
>>>&
l try to take a closer look at the areas where efforts overlap.
--
Mike Kravetz
the
hugetlb page on the free list with a count of 1. There is no check in the
enqueue code. When we dequeue the page, set_page_refcounted() is used to
set the count to 1 without looking at the current value. And, all the other
VM_DEBUG macros are off so we mostly do not notice the bug.
Thanks again,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
> }
> free:
>
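A minimal sketch of the missing sanity check mentioned above (the surrounding
function body is approximate, not the exact upstream code):

static void enqueue_huge_page(struct hstate *h, struct page *page)
{
	int nid = page_to_nid(page);

	/* pages on the hugetlb free lists are expected to have refcount 0 */
	VM_BUG_ON_PAGE(page_count(page), page);

	list_move(&page->lru, &h->hugepage_freelists[nid]);
	h->free_huge_pages++;
	h->free_huge_pages_node[nid]++;
}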
+++---
> 3 files changed, 29 insertions(+), 22 deletions(-)
Thanks. Nice straightforward improvement.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a5d618d08506..0d793486822b 100644
> --- a/include/li
pages are so large that we do not guarantee that page++ pointer
* arithmetic will work across the entire page. We need something more
* specialized.
*/
static void __copy_gigantic_page(struct page *dst, struct page *src,
int nr_pages)
--
Mike Kravetz
> +
by: Miaohe Lin
> ---
> mm/hugetlb.c | 16 +---
> 1 file changed, 13 insertions(+), 3 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 777bc0e45bf3..53ea65d1c5ab 100644
> --- a/mm/hugetlb.c
&
On 1/26/21 4:07 PM, Jason Gunthorpe wrote:
> On Tue, Jan 26, 2021 at 01:21:46PM -0800, Mike Kravetz wrote:
>> On 1/26/21 11:21 AM, Joao Martins wrote:
>>> On 1/26/21 6:08 PM, Mike Kravetz wrote:
>>>> On 1/25/21 12:57 PM, Joao Martins wrote:
>>>>>
>&g
> 1 file changed, 1 insertion(+), 2 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index cbf32d2..5e6a6e7 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3367,8 +3367,7 @@ static unsigned in
On 1/26/21 11:21 AM, Joao Martins wrote:
> On 1/26/21 6:08 PM, Mike Kravetz wrote:
>> On 1/25/21 12:57 PM, Joao Martins wrote:
>>>
>>> +static void record_subpages_vmas(struct page *page, struct vm_area_struct
>>> *vma,
>>> +
+)
Thanks!
Reviewed-by: Mike Kravetz
--
Mike Kravetz
);
> + putback_active_hugepage(page);
I'm curious why you used putback_active_hugepage() here instead of simply
calling set_page_huge_active() before the put_page()?
When the page was allocated, it was placed on the active list (alloc_huge_page).
Therefore, the hugetlb_lock locking and
* // page is freed to the buddy
> + * spin_unlock(&hugetlb_lock)
> + * spin_lock(&hugetlb_lock)
> + * enqueue_huge_page(page)
> + * // It is wrong, the pa
Is it acceptable
to keep retrying in that case? In addition, the 'Free some vmemmap' series
may slow the free_huge_page path even more.
In these worst case scenarios, I am not sure we want to just spin retrying.
--
Mike Kravetz
>
> Signed-off-by: Muchun Song
>
on CPU0. Because
> it is already freed to the buddy allocator.
>
> Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
> Signed-off-by: Muchun Song
> ---
> mm/hugetlb.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Thanks!
Reviewed-by: Mike Kravetz
--
Mike Kravetz
ged, 1 deletion(-)
Thanks!
Reviewed-by: Mike Kravetz
--
Mike Kravetz
On 1/15/21 1:17 AM, Oscar Salvador wrote:
> On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag to replace the page_huge_active
>> interfaces. By its name, page_huge_active implied that a huge page
>> was on the acti
On 1/15/21 2:16 AM, Oscar Salvador wrote:
> On Mon, Jan 11, 2021 at 01:01:52PM -0800, Mike Kravetz wrote:
>> Use new hugetlb specific flag HPageTempSurplus to replace the
>> PageHugeTemporary() interfaces.
>>
>> Signed-off-by: Mike Kravetz
&
On 1/15/21 9:43 AM, Mike Kravetz wrote:
> On 1/15/21 1:17 AM, Oscar Salvador wrote:
>> On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote:
>>> Use the new hugetlb page specific flag to replace the page_huge_active
>>> interfaces. By its name, page_huge
to
be something like:
1) allocate a fresh hugetlb page from buddy
2) free the 'migrated' free huge page back to buddy
I do not think we can use the existing 'isolate-migrate' flow. Isolating
a page would make it unavailable for allocation and that could cause
application issues.
--
Mike Kravetz
Use the new hugetlb specific HP_Temporary flag to replace the
PageHugeTemporary() interfaces.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 5 +
mm/hugetlb.c| 36 +++-
2 files changed, 12 insertions(+), 29 deletions(-)
diff --git a
onger necessary. If migration is not supported for the hstate,
HP_Migratable will not be set, the page will not be isolated and no
attempt will be made to migrate.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 2 +-
include/linux/hugetlb.h | 9 +
mm/hugetlb.c
Use the new hugetlb specific HP_Freed flag to replace the
PageHugeFreed interfaces.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 2 ++
mm/hugetlb.c| 23 ---
2 files changed, 6 insertions(+), 19 deletions(-)
diff --git a/include/linux/hugetlb.h b
subsequent patches.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 12 ++--
include/linux/hugetlb.h | 61 +
mm/hugetlb.c| 46 +++
3 files changed, 87 insertions(+), 32 deletions(-)
diff --git a/fs/hugetlbfs
of flag manipulation routines (Oscar)
Moved flags and routines to hugetlb.h (Muchun)
Changed format of page flag names (Muchun)
Changed subpool routine names (Matthew)
More comments in code (Oscar)
Based on v5.11-rc3-mmotm-2021-01-12-01-57
Mike Kravetz (5):
hugetlb: use page.private
race with
code freeing the page. The extra check in page_huge_active shortened the
race window, but did not prevent the race. Offline code calling
scan_movable_pages already deals with these races, so removing the check
is acceptable.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c
s need to be set/tested outside hugetlb code, so
> it indeed looks nicer and more consistent to follow page-flags.h convention.
>
> Sorry for the noise.
Thanks everyone!
I was unsure about the best way to go for this. Will send out a new version
in a few days using the page-flag style macros.
--
Mike Kravetz
Miaohe Lin
> ---
> fs/hugetlbfs/inode.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 25c1857ff45d..c87894b221da 100644
> --- a/fs/hugetlbfs/i
.
Signed-off-by: Mike Kravetz
Reviewed-by: Oscar Salvador
---
include/linux/hugetlb.h | 6 ++
mm/hugetlb.c| 36 +++-
2 files changed, 13 insertions(+), 29 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index cd1960541f2a
Based on v5.11-rc4-mmotm-2021-01-21-20-07
Mike Kravetz (5):
hugetlb: use page.private for hugetlb specific page flags
hugetlb: convert page_huge_active() HPageMigratable flag
hugetlb: only set HPageMigratable for migratable hstates
hugetlb: convert PageHugeTemporary() to HPageTemporary flag
d can race with
code freeing the page. The extra check in page_huge_active shortened the
race window, but did not prevent the race. Offline code calling
scan_movable_pages already deals with these races, so removing the check
is acceptable. Add comment to racy code.
Signed-off-by: Mike Kr
l not be isolated and no attempt will be made to migrate. We should
never get to unmap_and_move_huge_page for a page where migration is not
supported, so throw a warning if we do.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 2 +-
include/linux/hugetlb.h | 9 +
mm/huge
rsion of other state information will happen in subsequent patches.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 12 ++--
include/linux/hugetlb.h | 68 +
mm/hugetlb.c| 48 +++--
3 files changed, 96 inserti
Use new hugetlb specific HPageFreed flag to replace the
PageHugeFreed interfaces.
Signed-off-by: Mike Kravetz
Reviewed-by: Oscar Salvador
Reviewed-by: Muchun Song
---
include/linux/hugetlb.h | 3 +++
mm/hugetlb.c| 23 ---
2 files changed, 7 insertions(+), 19
On 1/21/21 10:53 PM, Miaohe Lin wrote:
> Hi:
> On 2021/1/20 9:30, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag HPageMigratable to replace the
>> page_huge_active interfaces. By its name, page_huge_active implied
>> that a huge page was on the activ
, *next;
> +
> + list_for_each_entry_safe(page, next, list, lru) {
> + list_del(&page->lru);
> + free_vmemmap_page(page);
> + }
> +}
> +
> +static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
> + struc
c: Jonathan Corbet
> Cc: linux-...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: Andrew Morton
> Cc: Mike Kravetz
> ---
> v2: rebase & resend
>
> Documentation/admin-guide/kernel-parameters.txt |8
> 1 file changed, 4 insertions(+), 4 de
VABLE, in that case you also
>> shouldn't allocate the vmemmap from there ...
>
> Yeah, you are right. So I tend to trigger OOM to kill other processes to
> reclaim some memory when we allocate memory fails.
IIUC, even non-gigantic hugetlb pages can exist in CMA. They can be migrated
out of CMA if needed (except free pages in the pool, but that is a separate
issue David H already noted in another thread).
When we first started discussing this patch set, one suggestion was to force
hugetlb pool pages to be allocated at boot time and never permit them to be
freed back to the buddy allocator. A primary reason for the suggestion was
to avoid this issue of needing to allocate memory when freeing a hugetlb page
to buddy. IMO, that would be an unreasonable restriction for many existing
hugetlb use cases.
A simple thought is that we simply fail the 'freeing hugetlb page to buddy'
if we can not allocate the required vmemmap pages. However, as David R says
freeing hugetlb pages to buddy is a reasonable way to free up memory in oom
situations. However, failing the operation 'might' be better than looping
forever trying to allocate the pages needed? As mentioned in the previous
patch, it would be better to use GFP_ATOMIC to at least dip into reserves if
we can.
I think using pages of the hugetlb for vmemmap to cover pages of the hugetlb
is the only way we can guarantee success of freeing a hugetlb page to buddy.
However, this should only be used when there is no other option and could
result in vmemmap pages residing in CMA or ZONE_MOVABLE. I'm not sure how
much better this is than failing the free to buddy operation.
I don't have a solution. Just wanted to share some thoughts.
BTW, just thought of something else. Consider offlining a memory section that
contains a free hugetlb page. The offline code will try to dissolve the hugetlb
page (free to buddy). So, vmemmap pages will need to be allocated. We will
try to allocate vmemmap pages on the same node as the hugetlb page. But, if
this memory section is the last of the node all the pages will have been
isolated and no allocations will succeed. Is that a possible scenario, or am
I just having too many negative thoughts?
--
Mike Kravetz
On 1/27/21 2:20 AM, Michal Hocko wrote:
> [sorry for jumping in late]
>
> On Fri 22-01-21 11:52:27, Mike Kravetz wrote:
>> As hugetlbfs evolved, state information about hugetlb pages was added.
>> One 'convenient' way of doing this was to use available fields in ta
On 1/27/21 2:25 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:28, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag HPageMigratable to replace the
>> page_huge_active interfaces. By its name, page_huge_active implied
>> that a huge page was on the active l
On 1/27/21 2:41 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:31, Mike Kravetz wrote:
>> Use new hugetlb specific HPageFreed flag to replace the
>> PageHugeFreed interfaces.
>>
>> Signed-off-by: Mike Kravetz
>> Reviewed-by: Oscar Salvador
>> Reviewed
On 1/27/21 2:35 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:29, Mike Kravetz wrote:
>> The HP_Migratable flag indicates a page is a candidate for migration.
>> Only set the flag if the page's hstate supports migration. This allows
>> the migration paths to detect no
On 1/4/21 6:44 PM, Muchun Song wrote:
> On Tue, Jan 5, 2021 at 6:40 AM Mike Kravetz wrote:
>>
>> On 1/3/21 10:58 PM, Muchun Song wrote:
>>> Because we only can isolate a active page via isolate_huge_page()
>>> and hugetlbfs_fallocate() forget to mark it as
On 1/4/21 6:55 PM, Muchun Song wrote:
> On Tue, Jan 5, 2021 at 8:02 AM Mike Kravetz wrote:
>>
>> On 1/3/21 10:58 PM, Muchun Song wrote:
>>> There is a race condition between __free_huge_page()
>>> and dissolve_free_huge_page().
>>>
>>> CPU0:
On 1/4/21 7:46 PM, Muchun Song wrote:
> On Tue, Jan 5, 2021 at 11:14 AM Muchun Song wrote:
>>
>> On Tue, Jan 5, 2021 at 9:33 AM Mike Kravetz wrote:
>>>
>>> On 1/3/21 10:58 PM, Muchun Song wrote:
>>>> When dissolve_free_huge_page() races with __free
migration could race with the page fault
and the page could be migrated before being added to the page table of
the faulting task. This was an issue when hugetlb_no_page called
set_page_huge_active right after allocating and clearing the huge page.
Commit cb6acd01e2e4 moved the set_page_huge_active call to after adding the
page to the page table
to address this issue.
--
Mike Kravetz
return page;
>> @@ -1291,6 +1308,17 @@ static inline void
>> destroy_compound_gigantic_page(struct page *page,
>> unsigned int order) { }
>> #endif
>>
>> +/*
>> + * Because we reuse the mapping field of so
On 1/6/21 12:02 PM, Michal Hocko wrote:
> On Wed 06-01-21 11:30:25, Mike Kravetz wrote:
>> On 1/6/21 8:35 AM, Michal Hocko wrote:
>>> On Wed 06-01-21 16:47:35, Muchun Song wrote:
>>>> Because we only can isolate a active page via isolate_huge_page()
>>>> a
On 12/14/20 5:06 PM, Mike Kravetz wrote:
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index d029d938d26d..8713f8ef0f4c 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4106,10 +4106,30 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm,
> struct v
an be saved for each 1GB HugeTLB page.
When a HugeTLB page is allocated or freed, the vmemmap array
representing the range associated with the page will need to be
remapped. When a page is allocated, vmemmap pages are freed
after remapping. When a page
On 12/15/20 5:03 PM, Mike Kravetz wrote:
> On 12/13/20 7:45 AM, Muchun Song wrote:
>> diff --git a/fs/Kconfig b/fs/Kconfig
>> index 976e8b9033c4..4c3a9c614983 100644
>> --- a/fs/Kconfig
>> +++ b/fs/Kconfig
>> @@ -245,6 +245,21 @@ config HUGETLBFS
>> config
Not sure if the word '_reuse' is best in this function name. To me, the name
implies this routine will reuse vmemmap pages. Perhaps, it makes more sense
to rename as 'vmemmap_remap_free'? It will first remap, then free vmemmap.
But, then I looked at the code
On 12/16/20 2:25 PM, Oscar Salvador wrote:
> On Wed, Dec 16, 2020 at 02:08:30PM -0800, Mike Kravetz wrote:
>>> + * vmemmap_rmap_walk - walk vmemmap page table
>>> +
>>> +static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr,
>>> +
FP_ATOMIC to allocate the vmemmap pages.
>
> Signed-off-by: Muchun Song
It is unfortunate we need to add this complexity, but I cannot think
of another way. One small comment (no required change) below.
Reviewed-by: Mike Kravetz
> ---
> m
* handle allocation failures. Once we allocate
> + * vmemmap pages successfully, then we can free
> + * a HugeTLB page.
> + */
> + goto retry;
> + }
> + list_add_tail(&page->lru, list);
> + }
> +}
> +
--
Mike Kravetz
tmail.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Cc:
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 22 +-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d029d938d26d..
this feature was enabled. This would eliminate a bunch
of the complex code doing page table manipulation. It does not address
the issue of struct page pages going away which is being discussed here,
but it could be a way to simplify the first version of this code. If this
is going to be an 'opt in' feature as previously suggested, then eliminating
the PMD/huge page vmemmap mapping may be acceptable. My guess is that
sysadmins would only 'opt in' if they expect most of system memory to be used
by hugetlb pages. We certainly have database and virtualization use cases
where this is true.
--
Mike Kravetz
On 07/22/2015 03:03 PM, Andrew Morton wrote:
On Tue, 21 Jul 2015 11:09:43 -0700 Mike Kravetz wrote:
...
+
+ if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
+ return -EOPNOTSUPP;
EOPNOTSUPP is a networking thing. It's inappropriate here.
The problem i
tlb selftests in the kernel and pointing
people to libhugetlbfs is the way to go. From a very quick scan
of the selftests, I would guess libhugetlbfs covers everything
in those tests.
I'm willing to verify the testing provided by selftests is included
in libhugetlbfs, and remove selftests if that i
etlbfs, I believe the offset must be a multiple of the
hugetlb page size. A similar comment/exception about using
the "underlying page size" would apply here as well.
--
Mike Kravetz
On 07/23/2015 08:17 AM, Eric B Munson wrote:
On Wed, 22 Jul 2015, Mike Kravetz wrote:
On 07/22/2015 03:30 PM, Andrew Morton wrote:
On Wed, 22 Jul 2015 15:19:54 -0700 Davidlohr Bueso wrote:
I didn't know that libhugetlbfs has tests. I wonder if that makes
tools/testing/selftests
On 07/23/2015 10:17 AM, Eric B Munson wrote:
On Thu, 23 Jul 2015, Mike Kravetz wrote:
On 07/23/2015 08:17 AM, Eric B Munson wrote:
On Wed, 22 Jul 2015, Mike Kravetz wrote:
On 07/22/2015 03:30 PM, Andrew Morton wrote:
On Wed, 22 Jul 2015 15:19:54 -0700 Davidlohr Bueso wrote:
I didn
: one
if CONFIG_NUMA is defined and one (a no-op) if not. I am happy
with either, but am a relative newbie in this area so am looking
for a little guidance.
--
Mike Kravetz
---
From 04c37a979c5ce8cd39d3243e4e2c12905e4f1e6e Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Fri, 24 Jul 2015 08:14
igned-off-by: Mike Kravetz
---
tools/testing/selftests/vm/Makefile| 1 -
tools/testing/selftests/vm/hugetlbfstest.c | 86
--
tools/testing/selftests/vm/run_vmtests | 11
3 files changed, 98 deletions(-)
delete mode 100644 tools/testing/selftes
ror handling in region_del() when kmalloc() fails still needs
to be addressed
madvise remove support remains
Mike Kravetz (10):
mm/hugetlb: add cache of descriptors to resv_map for region_add
mm/hugetlb: add region_del() to delete a specific range of entries
mm/hugetlb: expose hugetlb fault m
or callers creating
reservations with vma_needs_reservation/vma_commit_reservation.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 3 +
mm/hugetlb.c| 168 ++--
2 files changed, 152 insertions(+), 19 deletions(-)
diff --git a/inc
callers to add 0 as end of range.
Since the routine will be used in hole punch as well as truncate
operations, it is more appropriately renamed to hugetlb_vmdelete_list().
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 25 ++---
1 file changed, 18 insertions(+), 7
). vma_has_reserves is passed "chg" which
indicates whether or not a region/reserve map is present. Use
this to determine if reserves are actually present or were removed
via hole punch.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 16 +---
1 file changed, 13 insertions(+), 3
changes to
be more consistent with other global hugetlb symbols.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 5 +
mm/hugetlb.c| 20 ++--
2 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.
Signed-off-by: Dave Hansen
Signed-off-by: Mike Kravetz
---
mm/madvise.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 70ce0d4
() is also modified to take a range of pages.
hugetlb_unreserve_pages is modified to detect an error from
region_del and pass it back to the caller.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 98 -
include/linux/hugetlb.h | 4 +-
mm