Of course, this would require that we leave the call to vma_shareable() at the
beginning of huge_pmd_share. It also means we are always making a
function call into huge_pmd_share to determine if sharing is possible.
That is no different from today. If we do not want to make that extra
function c
he find_vma() call.
>
> Suggested-by: Mike Kravetz
> Signed-off-by: Peter Xu
> Signed-off-by: Axel Rasmussen
> ---
> arch/arm64/mm/hugetlbpage.c | 4 ++--
> arch/ia64/mm/hugetlbpage.c | 3 ++-
> arch/mips/mm/hugetlbpage.c | 4 ++--
> arch/parisc/mm/hugetlbpage.c
On 2/11/21 12:47 PM, Zi Yan wrote:
> On 28 Jan 2021, at 16:53, Mike Kravetz wrote:
>
>> On 1/28/21 10:26 AM, Joao Martins wrote:
>>> For a given hugepage backing a VA, there's a rather inefficient
>>> loop which is solely responsible for storing subpages in GUP
if we can not figure out how to move forward on this
issue.
It would be great if David H, David R and Michal could share their opinions
on this. No need to review details the code yet (unless you want), but
let's start a discussion on how to move past this issue if we can.
--
Mike Kravetz
ret = true;
Should probably check for -EBUSY as this means someone started using
the page while we were allocating a new one. It would complicate the
code to try and do the 'right thing'. The right thing might be
dissolving the new pool page and then trying to isolate this in-use
page.
, but since we are scanning lockless there is no way to eliminate
them all. Best to just minimize the windows and document.
--
Mike Kravetz
> + /*
> + * Hugetlb page in-use. Isolate and migrate.
> +
Peter Xu's
"userfaultfd-wp: Support shmem and hugetlbfs" work touches the same areas
and has similar issues. Therefore, this code is being sent earlier than
normal so that efforts in common areas can be coordinated.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=211287
Mike Kravetz (5):
hugetlb:
Pagemap was only using the vma flag PM_SOFT_DIRTY for hugetlb vmas.
This is insufficient. Check the individual pte entries.
Signed-off-by: Mike Kravetz
---
fs/proc/task_mmu.c | 4
1 file changed, 4 insertions(+)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 602e3a52884d
There is no hugetlb specific routine for clearing soft dirty and
other references. The 'default' routines would only clear the
VM_SOFTDIRTY flag in the vma.
Add a new routine specifically for hugetlb vmas.
Signed-off-by: Mike Kravetz
---
fs/proc/task_mmu.c | 110
to perform pmd sharing.
A subsequent patch will add code to allow soft dirty monitoring for hugetlb
vmas. Any existing pmd sharing will be undone at that time.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
it would actually allocate and install a COW page.
Modify the code to not call hugetlb_cow for SHARED mappings and just
update the pte.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 23 ---
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
Add interfaces to set and clear soft dirty in hugetlb ptes. Make
hugetlb interfaces needed for /proc clear_refs available outside
hugetlb.c.
arch/s390 has its own version of most routines in asm-generic/hugetlb.h,
so add new routines there as well.
Signed-off-by: Mike Kravetz
---
arch/s390
On 2/5/21 6:36 PM, Peter Xu wrote:
> On Fri, Feb 05, 2021 at 01:53:34PM -0800, Mike Kravetz wrote:
>> On 1/29/21 2:49 PM, Peter Xu wrote:
>>> On Fri, Jan 15, 2021 at 12:08:37PM -0500, Peter Xu wrote:
>>>> This is a RFC series to support userfaultfd upon shmem and hug
On 2/8/21 7:27 PM, Miaohe Lin wrote:
> On 2021/2/9 3:52, Mike Kravetz wrote:
>> On 1/23/21 1:31 AM, Miaohe Lin wrote:
>>> The current implementation of hugetlb_cgroup for shared mappings could have
>>> different behavior. Consider the following two scenarios:
>>>
On 2/8/21 11:11 PM, Miaohe Lin wrote:
> All callers know they are operating on a hugetlb head page. So this
> VM_BUG_ON_PAGE can not catch anything useful.
>
> Signed-off-by: Miaohe Lin
> ---
> mm/hugetlb.c | 1 -
> 1 file changed, 1 deletion(-)
Thanks,
Reviewed-by: Mike Kravetz
On 2/8/21 6:10 PM, Miaohe Lin wrote:
> Hi:
> On 2021/2/9 9:26, Mike Kravetz wrote:
>> On 2/8/21 12:37 AM, Miaohe Lin wrote:
>>> PageHead(page) is implicitly checked in set_page_huge_active() via the
>>> PageHeadHuge(page) check. So remove this explicit one.
>>
On 2/8/21 5:24 PM, Miaohe Lin wrote:
> Hi:
> On 2021/2/9 8:45, Mike Kravetz wrote:
>> On 2/8/21 12:24 AM, Miaohe Lin wrote:
>>> We can use helper huge_page_size() to get the hugepage size directly to
>>> simplify the code slightly.
>>>
>>> Signed-off-by: Miaohe Lin
> ---
> fs/hugetlbfs/inode.c | 7 ++-
> 1 file changed, 2 insertions(+), 5 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 394da2ab08ad..701c82c36138 100644
> --- a/fs/
d page.
--
Mike Kravetz
>
> Signed-off-by: Miaohe Lin
> ---
> mm/hugetlb.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6cdb59d8f663..bbbe013a3a2d 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5577,7 +5577,
we change this to
huge_page_size(h) / SZ_1K);
as in hugetlb_report_meminfo above? Or, is that one where it takes an
additional instruction to do the divide as opposed to the shift? I would
rather add the instruction and keep everything consistent.
--
Mike Kravetz
sertions(+), 4 deletions(-)
Reviewed-by: Mike Kravetz
--
Mike Kravetz
each individual reservation
which adds the complexity. I can not think of a better way to do things.
Please update commit message with an explanation of what users might see
because of this issue and resubmit as a patch.
Thanks,
--
Mike Kravetz
>
> In order to fix this, we have to make su
er look at the areas where efforts overlap.
--
Mike Kravetz
On 2/4/21 5:43 PM, Peter Xu wrote:
> On Thu, Feb 04, 2021 at 03:25:37PM -0800, Mike Kravetz wrote:
>> On 2/4/21 6:50 AM, Peter Xu wrote:
>>> This is the last missing piece of the COW-during-fork effort when there're
>>> pinned pages found. One can reference 70e8
))) {
> + if (!prealloc) {
> + put_page(ptepage);
> + spin_unlock(src_ptl);
> + spin_unlock(dst_ptl);
> + prealloc = alloc_huge_page
s,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index cf82629319ed..442705be052a 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5237,7 +5237,7 @@ static unsigned long page_table_shareable(struct
> vm_area_struct *sv
s is, or modifying to include
the information above. To me, the three distinct blocks of code handling
the NORESERVE, shared and private cases makes things fairly clear and
the comment does apply in that context.
--
Mike Kravetz
> if (vma->vm_flags & VM_MAYSHARE) {
> /*
> * We know VM_NORESERVE is not set. Therefore, there SHOULD
>
On 2/1/21 3:49 AM, Michal Hocko wrote:
> On Fri 29-01-21 10:46:15, Mike Kravetz wrote:
>> On 1/28/21 2:15 PM, Andrew Morton wrote:
>>> On Thu, 28 Jan 2021 14:00:29 -0800 Mike Kravetz
>>> wrote:
>>>>
>>>> Michal suggested that comments des
*/
> - if (rg->from > t)
> + if (rg->from >= t)
> break;
>
> /* Add an entry for last_accounted_offset -> rg->from, and
>
Changing any of this code makes me nervous. However, I agree with your
analysis. The change makes the code match the comment WRT the [from, to)
nature of regions.
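For illustration, a minimal userspace model of the half-open [from, to) convention behind the '>' to '>=' change (all names here are stand-ins, not the kernel's reserve-map code):

```c
#include <assert.h>
#include <stdbool.h>

/* A region covers offsets from..to-1, i.e. the half-open range [from, to). */
struct region {
	long from;
	long to;
};

/*
 * Return true if the scan over sorted regions should stop at this entry
 * for target offset t: an entry whose 'from' equals t already lies
 * entirely at or beyond t, which is why '>=' (not '>') matches the
 * [from, to) semantics described in the comment.
 */
static bool region_at_or_past(const struct region *rg, long t)
{
	return rg->from >= t;
}
```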
Reviewed-by: Mike Kravetz
--
Mike Kravetz
On 2/2/21 11:42 PM, Muchun Song wrote:
> On Wed, Jan 20, 2021 at 9:33 AM Mike Kravetz wrote:
>>
>> Signed-off-by: Mike Kravetz
>
> Hi Mike,
>
> I found that you may forget to remove set_page_huge_active()
> from include/linux/hugetlb.h.
>
> diff --git a/inc
-
> 1 file changed, 27 insertions(+), 24 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 18f6ee317900..d2859c2aecc9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
Thanks, that is a pretty straightforward change. A cleanup with no
functiona
spin_unlock(src_ptl);
> + spin_unlock(dst_ptl);
> + prealloc = alloc_huge_page(vma, addr,
> 0);
One quick que
set_compound_order(page, 0);
> - page[1].compound_nr = 0;
I may be reading the code wrong, but set_compound_order(page, 0) will
set page[1].compound_nr to the value of 1. That is different than the
explicit setting to 0 in the existing code.
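For reference, a minimal model of why the two differ (illustrative only; the struct and function below are stand-ins for the kernel's compound-page metadata, not the real definitions):

```c
#include <assert.h>

/* Simplified stand-in for the tail-page fields being discussed. */
struct tail_page_model {
	unsigned int compound_order;
	unsigned long compound_nr;
};

/*
 * Models the behavior described above: storing the order and deriving
 * compound_nr as 1 << order means order 0 yields compound_nr == 1,
 * which is different from the old explicit page[1].compound_nr = 0.
 */
static void model_set_compound_order(struct tail_page_model *p,
				     unsigned int order)
{
	p->compound_order = order;
	p->compound_nr = 1UL << order;	/* order 0 -> nr == 1, not 0 */
}
```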
If that is correct, then you should say
| 2 ++
> 2 files changed, 6 insertions(+)
When you move this back into the "userfaultfd: add minor fault handling"
series, feel free to add:
Reviewed-by: Mike Kravetz
Thanks,
--
Mike Kravetz
b gigantic page' being <= 1 order, so this change
makes sense. Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index a3e4fa2c5e94..dac5db569ccb 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1219
olding
'big chunks' of memory for a specific purpose and dumping them when needed.
They were not doing this with hugetlb pages, but nothing would surprise me.
In this series, vmmap freeing is 'opt in' at boot time. I would expect
the use cases that want to opt in rarely if ever free/dissolve hugetlb
pages. But, I could be wrong.
--
Mike Kravetz
into the same ifdef:
>
> #ifdef CONFIG_USERFAULTFD
> static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> pte_t *dst_pte,
> struct vm_area_struct *dst_vma,
> unsigned long dst_addr,
> unsigned long src_addr,
> struct page **pagep)
> {
> BUG();
> return 0;
> }
> #endif /* CONFIG_USERFAULTFD */
>
> Let's also see whether Mike would have a preference on this.
>
No real preference. Just need to fix up the argument list in that
second definition.
--
Mike Kravetz
ck to see if vma is sharable. Might be as
simple as !(vma->vm_flags & VM_MAYSHARE). I see a comment/question in
a later patch about only doing minor fault processing on shared mappings.
Code below looks fine, but it would be a waste to do all that for a vma
that could not be
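The quick test suggested above might look like the following sketch (userspace model; the flag value and names are illustrative stand-ins, not the kernel definitions):

```c
#include <assert.h>
#include <stdbool.h>

/* Example flag bit standing in for VM_MAYSHARE; value is illustrative. */
#define VM_MAYSHARE_MODEL 0x00000080UL

struct vma_model {
	unsigned long vm_flags;
};

/*
 * Model of the suggested early check: a vma without VM_MAYSHARE can
 * never participate in pmd sharing, so bail out before doing the
 * heavier per-fault work.
 */
static bool vma_shareable_quick(const struct vma_model *vma)
{
	return vma->vm_flags & VM_MAYSHARE_MODEL;
}
```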
-
> 2 files changed, 8 insertions(+), 8 deletions(-)
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 4508136c8376..f94a35296618 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -96
return false;
We are testing for uffd conditions that prevent sharing to determine if
huge_pmd_share should be called. Since we have the vma, perhaps we should
do the vma_sharable() test here as well? Or, perhaps delay all checks
until we are in huge_pmd_share and add uffd_disable_huge_pmd_share t
On 2/1/21 1:38 PM, Mike Kravetz wrote:
> On 1/28/21 3:42 PM, Axel Rasmussen wrote:
>> From: Peter Xu
>>
>> It is a preparation work to be able to behave differently in the per
>> architecture huge_pte_alloc() according to different VMA attributes.
>>
>>
d code, but would not be as efficient.
I prefer passing the vma argument as is done here.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
On 1/28/21 2:15 PM, Andrew Morton wrote:
> On Thu, 28 Jan 2021 14:00:29 -0800 Mike Kravetz
> wrote:
>>
>> Michal suggested that comments describing synchronization be added for each
>> flag. Since I did 'one patch per flag', that would be an update to each
>> pa
t fails, we use part of the hugepage to
> remap.
I honestly am not sure about this. This would only happen for pages in
NORMAL. The only time using part of the huge page for vmemmap would help is
if we are trying to dissolve huge pages to free up memory for other uses.
> What's your opinion about this? Should we take this approach?
I think trying to solve all the issues that could happen as the result of
not being able to dissolve a hugetlb page has made this extremely complex.
I know this is something we need to address/solve. We do not want to add
more unexpected behavior in corner cases. However, I can not help but think
about similar issues today. For example, if a huge page is in use in
ZONE_MOVABLE or CMA there is no guarantee that it can be migrated today.
Correct? We may need to allocate another huge page for the target of the
migration, and there is no guarantee we can do that.
--
Mike Kravetz
On 1/28/21 1:37 PM, Andrew Morton wrote:
> On Thu, 28 Jan 2021 06:52:21 +0100 Oscar Salvador wrote:
>
>> On Wed, Jan 27, 2021 at 03:36:41PM -0800, Mike Kravetz wrote:
>>> Yes, this patch is somewhat optional. It should be a minor improvement
>>> in cases wh
+-
> 1 file changed, 28 insertions(+), 21 deletions(-)
Thanks for updating this.
Reviewed-by: Mike Kravetz
I think there still is an open general question about whether we can always
assume page structs are contiguous for really big pages. That is outside
On 1/27/21 2:35 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:29, Mike Kravetz wrote:
>> The HP_Migratable flag indicates a page is a candidate for migration.
>> Only set the flag if the page's hstate supports migration. This allows
>> the migration paths to detect non-mig
On 1/27/21 2:41 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:31, Mike Kravetz wrote:
>> Use new hugetlb specific HPageFreed flag to replace the
>> PageHugeFreed interfaces.
>>
>> Signed-off-by: Mike Kravetz
>> Reviewed-by: Oscar Salvador
>> Reviewed
On 1/27/21 2:25 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:28, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag HPageMigratable to replace the
>> page_huge_active interfaces. By its name, page_huge_active implied
>> that a huge page was on the
On 1/27/21 2:20 AM, Michal Hocko wrote:
> [sorry for jumping in late]
>
> On Fri 22-01-21 11:52:27, Mike Kravetz wrote:
>> As hugetlbfs evolved, state information about hugetlb pages was added.
>> One 'convenient' way of doing this was to use available fields in tail
>>
On 1/26/21 11:21 AM, Joao Martins wrote:
> On 1/26/21 6:08 PM, Mike Kravetz wrote:
>> On 1/25/21 12:57 PM, Joao Martins wrote:
>>>
>>> +static void record_subpages_vmas(struct page *page, struct vm_area_struct
>>> *vma,
>>> +
gt; 1 file changed, 1 insertion(+), 2 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index cbf32d2..5e6a6e7 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3367,8 +3367,7 @@ static unsigned in
On 1/26/21 4:07 PM, Jason Gunthorpe wrote:
> On Tue, Jan 26, 2021 at 01:21:46PM -0800, Mike Kravetz wrote:
>> On 1/26/21 11:21 AM, Joao Martins wrote:
>>> On 1/26/21 6:08 PM, Mike Kravetz wrote:
>>>> On 1/25/21 12:57 PM, Joao Martins wrote:
>>>>>
> Signed-off-by: Miaohe Lin
> ---
> mm/hugetlb.c | 16 +---
> 1 file changed, 13 insertions(+), 3 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 777bc0e45bf3..53ea65d1c5ab 100644
> --- a/mm/hugetlb.c
so large that we do not guarantee that page++ pointer
* arithmetic will work across the entire page. We need something more
* specialized.
*/
static void __copy_gigantic_page(struct page *dst, struct page *src,
int nr_pages)
--
Mike Kravetz
> +
+++---
> 3 files changed, 29 insertions(+), 22 deletions(-)
Thanks. Nice straightforward improvement.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a5d618d08506..0d793486822b 100644
> --- a/include/li
hugetlb page on the free list with a count of 1. There is no check in the
enqueue code. When we dequeue the page, set_page_refcounted() is used to
set the count to 1 without looking at the current value. And, all the other
VM_DEBUG macros are off so we mostly do not notice the bug.
Thanks again,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
> }
> free:
>
from there ...
>
> Yeah, you are right. So I tend to trigger OOM to kill other processes to
> reclaim some memory when we allocate memory fails.
IIUC, even non-gigantic hugetlb pages can exist in CMA. They can be migrated
out of CMA if needed (except free pages in the pool, but that is a separate
issue David H already noted in another thread).
When we first started discussing this patch set, one suggestion was to force
hugetlb pool pages to be allocated at boot time and never permit them to be
freed back to the buddy allocator. A primary reason for the suggestion was
to avoid this issue of needing to allocate memory when freeing a hugetlb page
to buddy. IMO, that would be an unreasonable restriction for many existing
hugetlb use cases.
A simple thought is that we simply fail the 'freeing hugetlb page to buddy'
if we can not allocate the required vmemmap pages. However, as David R says
freeing hugetlb pages to buddy is a reasonable way to free up memory in oom
situations. However, failing the operation 'might' be better than looping
forever trying to allocate the pages needed? As mentioned in the previous
patch, it would be better to use GFP_ATOMIC to at least dip into reserves if
we can.
I think using pages of the hugetlb for vmemmap to cover pages of the hugetlb
is the only way we can guarantee success of freeing a hugetlb page to buddy.
However, this should only be used when there is no other option and could
result in vmemmap pages residing in CMA or ZONE_MOVABLE. I'm not sure how
much better this is than failing the free to buddy operation.
I don't have a solution. Just wanted to share some thoughts.
BTW, just thought of something else. Consider offlining a memory section that
contains a free hugetlb page. The offline code will try to dissolve the hugetlb
page (free to buddy). So, vmemmap pages will need to be allocated. We will
try to allocate vmemmap pages on the same node as the hugetlb page. But, if
this memory section is the last of the node all the pages will have been
isolated and no allocations will succeed. Is that a possible scenario, or am
I just having too many negative thoughts?
--
Mike Kravetz
n Corbet
> Cc: linux-...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: Andrew Morton
> Cc: Mike Kravetz
> ---
> v2: rebase & resend
>
> Documentation/admin-guide/kernel-parameters.txt |8
> 1 file changed, 4 insertions(+), 4 deletions(-)
> + list_for_each_entry_safe(page, next, list, lru) {
> + list_del(&page->lru);
> + free_vmemmap_page(page);
> + }
> +}
> +
> +static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
> + struct vmemmap_remap_walk *walk)
> +{
On 1/21/21 10:53 PM, Miaohe Lin wrote:
> Hi:
> On 2021/1/20 9:30, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag HPageMigratable to replace the
>> page_huge_active interfaces. By its name, page_huge_active implied
>> that a huge page was on t
Use new hugetlb specific HPageFreed flag to replace the
PageHugeFreed interfaces.
Signed-off-by: Mike Kravetz
Reviewed-by: Oscar Salvador
Reviewed-by: Muchun Song
---
include/linux/hugetlb.h | 3 +++
mm/hugetlb.c| 23 ---
2 files changed, 7 insertions(+), 19
information will happen in subsequent patches.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 12 ++--
include/linux/hugetlb.h | 68 +
mm/hugetlb.c| 48 +++--
3 files changed, 96 insertions(+), 32 deletions
race with
code freeing the page. The extra check in page_huge_active shortened the
race window, but did not prevent the race. Offline code calling
scan_movable_pages already deals with these races, so removing the check
is acceptable. Add comment to racy code.
Signed-off-by: Mike Kravetz
will not be isolated and no attempt will be made to migrate. We should
never get to unmap_and_move_huge_page for a page where migration is not
supported, so throw a warning if we do.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 2 +-
include/linux/hugetlb.h | 9 +
mm/hugetlb.c
v5.11-rc4-mmotm-2021-01-21-20-07
Mike Kravetz (5):
hugetlb: use page.private for hugetlb specific page flags
hugetlb: convert page_huge_active() HPageMigratable flag
hugetlb: only set HPageMigratable for migratable hstates
hugetlb: convert PageHugeTemporary() to HPageTemporary flag
hugetlb
.
Signed-off-by: Mike Kravetz
Reviewed-by: Oscar Salvador
---
include/linux/hugetlb.h | 6 ++
mm/hugetlb.c| 36 +++-
2 files changed, 13 insertions(+), 29 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index cd1960541f2a
Miaohe Lin
> ---
> fs/hugetlbfs/inode.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 25c1857ff45d..c87894b221da 100644
> --- a/fs/hugetlbfs/i
On 1/21/21 5:42 PM, Miaohe Lin wrote:
> Hi:
> On 2021/1/22 3:00, Mike Kravetz wrote:
>> On 1/20/21 1:23 AM, Miaohe Lin wrote:
>>> The calculation 1U << (h->order + PAGE_SHIFT - 10) is actually equal to
>>> (PAGE_SHIFT << (h->order)) >&
Add a comment offered by Mike Kravetz to explain this.
>
> Reviewed-by: David Hildenbrand
> Signed-off-by: Miaohe Lin
> Cc: Mike Kravetz
> ---
> fs/hugetlbfs/inode.c | 12 +---
> 1 file changed, 9 insertions(+), 3 deletions(-)
Reviewed-by: Mike Kravetz
>
>
In the kernel, size in KB is often calculated as (size << (PAGE_SHIFT - 10)).
If you change the calculation in the hugetlb code to be:
huge_page_size(h) << (PAGE_SHIFT - 10)
my compiler will actually reduce the size of the routine by one instruction.
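For illustration, both forms of the KB calculation agree; a small userspace model using example values (PAGE_SHIFT 12 and order 9, i.e. a 2MB huge page; the macro and function names are stand-ins, not the kernel's):

```c
#include <assert.h>

#define EX_PAGE_SHIFT 12	/* example value, matches common 4K pages */

/* KB computed directly from the order, as in the patch under review. */
static unsigned long kb_via_order(unsigned int order)
{
	return 1UL << (order + EX_PAGE_SHIFT - 10);
}

/* KB computed from the page size in bytes, as hugetlb_report_meminfo
 * does with huge_page_size(h) / SZ_1K (a right shift by 10 for a
 * power-of-two size). */
static unsigned long kb_via_size(unsigned int order)
{
	unsigned long size_bytes = 1UL << (order + EX_PAGE_SHIFT);
	return size_bytes >> 10;
}
```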
--
Mike Kravetz
> return mnt;
> }
>
>
On 1/20/21 2:00 AM, Oscar Salvador wrote:
> On Wed, Jan 20, 2021 at 10:59:05AM +0100, Oscar Salvador wrote:
>> On Tue, Jan 19, 2021 at 05:30:46PM -0800, Mike Kravetz wrote:
>>> Use the new hugetlb page specific flag HPageMigratable to replace the
>>> page_huge_activ
On 1/20/21 2:09 AM, Oscar Salvador wrote:
> On Tue, Jan 19, 2021 at 05:30:48PM -0800, Mike Kravetz wrote:
>> Use new hugetlb specific HPageTemporary flag to replace the
>> PageHugeTemporary() interfaces.
>>
>> Signed-off-by: Mike Kravetz
>
> I would have add
On 1/20/21 1:30 AM, Oscar Salvador wrote:
> On Tue, Jan 19, 2021 at 05:30:45PM -0800, Mike Kravetz wrote:
>> + * Macros to create test, set and clear function definitions for
>> + * hugetlb specific page flags.
>> + */
>> +#ifdef CONFIG_HUGETLB_PAGE
>> +#d
will not be isolated and no attempt will be made to migrate. We should
never get to unmap_and_move_huge_page for a page where migration is not
supported, so throw a warning if we do.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 2 +-
include/linux/hugetlb.h | 9 +
mm/hugetlb.c
race with
code freeing the page. The extra check in page_huge_active shortened the
race window, but did not prevent the race. Offline code calling
scan_movable_pages already deals with these races, so removing the check
is acceptable. Add comment to racy code.
Signed-off-by: Mike Kravetz
---
fs
Use new hugetlb specific HPageFreed flag to replace the
PageHugeFreed interfaces.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 3 +++
mm/hugetlb.c| 23 ---
2 files changed, 7 insertions(+), 19 deletions(-)
diff --git a/include/linux/hugetlb.h b
Changed subpool routine names (Matthew)
More comments in code (Oscar)
Based on v5.11-rc3-mmotm-2021-01-12-01-57
Mike Kravetz (5):
hugetlb: use page.private for hugetlb specific page flags
hugetlb: convert page_huge_active() HPageMigratable flag
hugetlb: only set HPageMigratable for migratable h
information will happen in subsequent patches.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 12 ++-
include/linux/hugetlb.h | 72 +
mm/hugetlb.c| 45 +-
3 files changed, 97 insertions(+), 32 deletions(-)
diff
Use new hugetlb specific HPageTemporary flag to replace the
PageHugeTemporary() interfaces.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 6 ++
mm/hugetlb.c| 36 +++-
2 files changed, 13 insertions(+), 29 deletions(-)
diff --git
Miaohe Lin
>>
>> I would avoid mentioning gbl_reserve as not all callers use it, and focus
>> on what delta means:
>>
>> "When reservation accounting remains unchanged..", but anyway:
>
> Sounds good. Maybe Andrew could kindly do this if this patch is picked up ?
Thank you and Andrew.
Looks like Andrew updated the commit message and added to his tree.
--
Mike Kravetz
--
> fs/hugetlbfs/inode.c | 2 --
> 1 file changed, 2 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 016c863b493b..79464963f95e 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/i
s with
* reserves for the file at the inode level. If we fallocate
* pages in these areas, we need to consume the reserves
* to keep reservation accounting consistent.
*/
--
Mike Kravetz
> - page = alloc_hug
Signed-off-by: Miaohe Lin
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
> ---
> fs/hugetlbfs/inode.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 9b221b87fbea..88751e35e69d 100644
> --- a/fs/hugetlbfs/ino
t; to generic_file_buffered_read(). So replace do_generic_mapping_read() with
> generic_file_buffered_read() to keep the comment up to date.
>
> Signed-off-by: Miaohe Lin
> ---
> fs/hugetlbfs/inode.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Thanks,
Reviewed-by: Mike Kravetz
CC Andrew
On 1/19/21 9:53 AM, Mike Kravetz wrote:
> On 1/16/21 1:18 AM, Miaohe Lin wrote:
>> Since commit e5ff215941d5 ("hugetlb: multiple hstates for multiple page
>> sizes"), we can use macro default_hstate to get the struct hstate which
>> we use by default.
s need to be set/tested outside hugetlb code, so
> it indeed looks nicer and more consistent to follow page-flags.h convention.
>
> Sorry for the noise.
Thanks everyone!
I was unsure about the best way to go for this. Will send out a new version
in a few days using the page-flag style macros.
--
Mike Kravetz
with
code freeing the page. The extra check in page_huge_active shortened the
race window, but did not prevent the race. Offline code calling
scan_movable_pages already deals with these races, so removing the check
is acceptable.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 2
patches.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 12 ++--
include/linux/hugetlb.h | 61 +
mm/hugetlb.c| 46 +++
3 files changed, 87 insertions(+), 32 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b
set of flag manipulation routines (Oscar)
Moved flags and routines to hugetlb.h (Muchun)
Changed format of page flag names (Muchun)
Changed subpool routine names (Matthew)
More comments in code (Oscar)
Based on v5.11-rc3-mmotm-2021-01-12-01-57
Mike Kravetz (5):
hugetlb: use page.priv
Use new hugetlb specific flag HP_Freed flag to replace the
PageHugeFreed interfaces.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 2 ++
mm/hugetlb.c| 23 ---
2 files changed, 6 insertions(+), 19 deletions(-)
diff --git a/include/linux/hugetlb.h b
Use new hugetlb specific flag HP_Temporary flag to replace the
PageHugeTemporary() interfaces.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 5 +
mm/hugetlb.c| 36 +++-
2 files changed, 12 insertions(+), 29 deletions(-)
diff --git
necessary. If migration is not supported for the hstate,
HP_Migratable will not be set, the page will not be isolated and no
attempt will be made to migrate.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 2 +-
include/linux/hugetlb.h | 9 +
mm/hugetlb.c| 8
to
be something like:
1) allocate a fresh hugetlb page from buddy
2) free the 'migrated' free huge page back to buddy
I do not think we can use the existing 'isolate-migrate' flow. Isolating
a page would make it unavailable for allocation and that could cause
application issues.
--
Mike Kravetz
On 1/15/21 9:43 AM, Mike Kravetz wrote:
> On 1/15/21 1:17 AM, Oscar Salvador wrote:
>> On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote:
>>> Use the new hugetlb page specific flag to replace the page_huge_active
>>> interfaces. By its name, page_huge_acti
On 1/15/21 2:16 AM, Oscar Salvador wrote:
> On Mon, Jan 11, 2021 at 01:01:52PM -0800, Mike Kravetz wrote:
>> Use new hugetlb specific flag HPageTempSurplus to replace the
>> PageHugeTemporary() interfaces.
>>
>> Signed-off-by: Mike Kravetz
On 1/15/21 1:17 AM, Oscar Salvador wrote:
> On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag to replace the page_huge_active
>> interfaces. By its name, page_huge_active implied that a huge page
>> was on the
; use it.
>
> Signed-off-by: Miaohe Lin
> ---
> mm/hugetlb.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
this code path. However,
there are other code paths where hugetlb_acct_memory is called with a delta
value of 0 as well. I would rather see a simple check at the beginning of
hugetlb_acct_memory like.
if (!delta)
return 0;
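As a minimal model of that early-return suggestion (illustrative only; this is not the real hugetlb_acct_memory(), and the accounting body is a stand-in):

```c
#include <assert.h>

/*
 * Models the check suggested above: a zero delta changes no accounting,
 * so return success immediately before taking locks or walking any
 * reservation state.  The increment stands in for the real bookkeeping.
 */
static int model_acct_memory(long *acct, long delta)
{
	if (!delta)
		return 0;	/* nothing to account */
	*acct += delta;		/* stand-in for the real accounting work */
	return 0;
}
```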
--
Mike Kravetz
>>
>> return 0;
>> }
>>
>
>
" was done
before I even noticed your efforts here.
At least we agree the metadata could be better organized. :)
IMO, using page.private of the head page to consolidate flags will be
easier to manage. So, I would like to use that.
The BUILD_BUG_ON in this patch makes sense.
--
Mike Kravetz