On 9/3/19 10:57 AM, Mike Kravetz wrote:
> On 8/29/19 12:18 AM, Michal Hocko wrote:
>> [Cc cgroups maintainers]
>>
>> On Wed 28-08-19 10:58:00, Mina Almasry wrote:
>>> On Wed, Aug 28, 2019 at 4:23 AM Michal Hocko wrote:
>>>>
>>>> On Mon 26-0
the page table lock and check for huge_pte_none before
returning an error. This is the same check that must be made further
in the code even if page allocation is successful.
Reported-by: Li Wang
Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
Signed-off-by: Mike Kravetz
Tested-b
ptep)))
goto backout;
--
Mike Kravetz
On 8/8/19 12:47 AM, Michal Hocko wrote:
> On Thu 08-08-19 09:46:07, Michal Hocko wrote:
>> On Wed 07-08-19 17:05:33, Mike Kravetz wrote:
>>> Li Wang discovered that LTP/move_page12 V2 sometimes triggers SIGBUS
>>> in the kernel-v5.2.3 testing. This is caused by a ra
On 8/15/19 4:04 PM, Mina Almasry wrote:
> On Wed, Aug 14, 2019 at 9:46 AM Mike Kravetz wrote:
>>
>> On 8/13/19 4:54 PM, Mike Kravetz wrote:
>>> On 8/8/19 4:13 PM, Mina Almasry wrote:
>>>> For shared mappings, the pointer to the hugetlb_cgroup to uncharg
On 8/15/19 4:08 PM, Mina Almasry wrote:
> On Tue, Aug 13, 2019 at 4:54 PM Mike Kravetz wrote:
>>> mm/hugetlb.c | 208 +--
>>> 1 file changed, 170 insertions(+), 38 deletions(-)
>>>
>>> diff --git
t would seem to be related to commit 3e2c19f9bef7e
> * mm-swap-fix-race-between-swapoff-and-some-swap-operations.patch
--
Mike Kravetz
the swap devices that may cause warning messages.
>
> Fixes: 6a946753dbe6 ("mm/swap_state.c: simplify total_swapcache_pages() with
> get_swap_device()")
> Signed-off-by: "Huang, Ying"
Thank you, this eliminates the messages for me:
Tested-by: Mike Kravetz
--
Mike Kravetz
will still succeed if there is memory available, but it will not try
as hard to free up memory.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 87 ++--
1 file changed, 77 insertions(+), 10 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index
Signed-off-by: Mike Kravetz
---
mm/compaction.c | 18 +-
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 952dc2fb24e5..325b746068d1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2294,9 +2294,15 @@ static enum
From: Hillf Danton
Address the issue of should_continue_reclaim continuing true too often
for __GFP_RETRY_MAYFAIL attempts when !nr_reclaimed and nr_scanned.
This could happen during hugetlb page allocation causing stalls for
minutes or hours.
Restructure code so that false will be returned in
ton (1):
mm, reclaim: make should_continue_reclaim perform dryrun detection
Mike Kravetz (2):
mm, compaction: use MIN_COMPACT_COSTLY_PRIORITY everywhere for costly
orders
hugetlbfs: don't retry when pool page allocations start to fail
mm/compaction.c | 18 +++---
mm/hugetlb.c
cdb07bdea28e ("mm/rmap.c: remove redundant variable cend")
It appears Commit 0f10851ea475 ("mm/mmu_notifier: avoid double notification
when it is useless") is what removed the use of cstart and cend. And, they
should have been removed then.
> Reported-by: Hulk Robot
> Sig
ages to
the hugetlb pool and then using them within applications? Or, are you
dynamically allocating them at fault time (hugetlb overcommit/surplus)?
Latency time for use of such pages includes:
- Putting together 1G contiguous
- Clearing 1G memory
In the 'allocation at fault time' mode you incur both costs at fault time.
If using pages from the pool, your only cost at fault time is clearing the
page.
--
Mike Kravetz
le the race', I think it might be acceptable to just
put a big semaphore around it.
--
Mike Kravetz
Hello Andrew,
Unless someone objects, can you add patches 1-3 of this series to your tree.
They have been reviewed and are fairly simple cleanups.
--
Mike Kravetz
On 7/22/20 8:22 PM, Baoquan He wrote:
> v1 is here:
> https://lore.kernel.org/linux-mm/20200720062623.13135-1-...@redh
On 8/24/20 8:01 PM, Muchun Song wrote:
> On Tue, Aug 25, 2020 at 5:21 AM Mike Kravetz wrote:
>>
>> I too am looking at this now and do not completely understand the race.
>> It could be that:
>>
>> hugetlb_sysctl_handler_common
>> ...
>> table-
On 9/2/20 3:49 AM, Vlastimil Babka wrote:
> On 9/1/20 3:46 AM, Wei Yang wrote:
>> The page allocated from buddy is not on any list, so just use list_add()
>> is enough.
>>
>> Signed-off-by: Wei Yang
>> Reviewed-by: Baoquan He
>> Reviewed-by: Mike Kravet
before normal memory allocators, so use the memblock
allocator.
Signed-off-by: Mike Kravetz
---
arch/arm/mm/dma-mapping.c | 29 ---
arch/mips/configs/cu1000-neo_defconfig | 1 -
arch/mips/configs/cu1830-neo_defconfig | 1 -
include/linux/cma.h
wframe+0x44/0xa9
>
> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
> Signed-off-by: Muchun Song
Thank you!
Reviewed-by: Mike Kravetz
--
Mike Kravetz
On 8/27/20 8:32 PM, Wei Yang wrote:
> We are sure to get a valid file_region, otherwise the
> VM_BUG_ON(resv->region_cache_count <= 0) at the very beginning would be
> triggered.
>
> Let's remove the redundant one.
>
> Signed-off-by: Wei Yang
Thank you.
Reviewed-
t; Signed-off-by: Wei Yang
> Reviewed-by: Mike Kravetz
Commit bbe88753bd42 (mm/hugetlb: make hugetlb migration callback CMA aware)
in v5.9-rc2 modified dequeue_huge_page_node_exact. This patch will need
to be updated to take those changes into account.
--
Mike Kravetz
can modify table->data
in the global data structure without any synchronization. Worse yet, that
value is local to their stacks. That was the root cause of the
issue addressed by Muchun's patch.
Does that analysis make sense? Or, are we missing something?
--
Mike Kravetz
> nasty free_huge_page
>
> mm/hugetlb.c | 101 ++-
> 1 file changed, 44 insertions(+), 57 deletions(-)
Thanks Wei Yang!
I'll take a look at these next week.
--
Mike Kravetz
4002c8 RCX: 00440329
> RDX: RSI: 4000 RDI: 20001000
> RBP: 006ca018 R08: R09:
> R10: 0003 R11: 0246 R12: 00401b30
> R13: 00401bc0 R14: R15:
On 9/16/20 1:40 PM, Joe Perches wrote:
> Convert the unbound sprintf in hugetlb_report_node_meminfo to use
> sysfs_emit_at so that no possible overrun of a PAGE_SIZE buf can occur.
>
> Signed-off-by: Joe Perches
Acked-by: Mike Kravetz
--
Mike Kravetz
gd_none(*pgd))
> + return NULL;
> + p4d = p4d_offset(pgd, addr);
> + if (p4d_none(*p4d))
> + return NULL;
> + pud = pud_offset(p4d, addr);
> +
> + WARN_ON_ONCE(pud_bad(*pud));
> + if (pud_none(*pud) || pud_bad(*pud))
> + return NULL;
> + pmd = pmd_offset(pud, addr);
> +
> + return pmd;
> +}
That routine is not really hugetlb specific. Perhaps we could move it
to sparse-vmemmap.c? Or elsewhere?
--
Mike Kravetz
On 10/28/20 12:26 AM, Muchun Song wrote:
> On Wed, Oct 28, 2020 at 8:33 AM Mike Kravetz wrote:
>> On 10/26/20 7:51 AM, Muchun Song wrote:
>>
>> I see the following routines follow the pattern for vmemmap manipulation
>> in dax.
>
> Did you mean move those
mentioned.
>
More eyes on that series would be appreciated.
That series will dynamically free and allocate memmap pages as hugetlb
pages are allocated or freed. I haven't looked through this series, but
my first thought is that we would need to ensure those allocs/frees are
directed to t
(page, HUGETLB_PAGE_DTOR);
> set_hugetlb_cgroup(page, NULL);
> @@ -1783,6 +1892,14 @@ static struct page *alloc_fresh_huge_page(struct
> hstate *h,
> if (!page)
> return NULL;
>
> + if (vmemmap_pgtable_prealloc(h, page)) {
> + if (hstate_is_gigantic(h))
> + free_gigantic_page(page, huge_page_order(h));
> + else
> + put_page(page);
> + return NULL;
> + }
> +
It seems a bit strange that we will fail a huge page allocation if
vmemmap_pgtable_prealloc fails. Not sure, but it almost seems like we should
allow the allocation and log a warning? It is somewhat unfortunate that
we need to allocate a page to free pages.
> if (hstate_is_gigantic(h))
> prep_compound_gigantic_page(page, huge_page_order(h));
> prep_new_huge_page(h, page, page_to_nid(page));
>
--
Mike Kravetz
e))
is_hugetlb = true;
else
is_thp = true;
Although, the compiler may be able to optimize. I did not check.
> +
> nr_subpages = thp_nr_pages(page);
> + if (is_hugetlb)
> + nr_subpages =
> pages_per_huge_page(page_hstate(page));
Can we just use compound_order() here for all cases?
--
Mike Kravetz
different
code. The performance issues discovered here will be taken into account with
the new code. However, as previously mentioned additional synchronization
is required for functional correctness. As a result, there will be some
regression in this code.
-
On 10/12/20 6:59 PM, Xing Zhengjun wrote:
>
>
> On 10/13/2020 1:40 AM, Mike Kravetz wrote:
>> On 10/11/20 10:29 PM, Xing Zhengjun wrote:
>>> Hi Mike,
>>>
>>> I re-test it in v5.9-rc8, the regression still existed. It is almost
>>>
slate' approach seemed best but I am open to whatever
would be easiest to review.
[1]
https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.anvils/
Mike Kravetz (3):
hugetlbfs: revert use of i_mmap_rwsem for pmd sharing and more sync
hugetlbfs: introduce hinode_rwsem for pmd
s per hugetlb calculation")
commit 87bf91d39bb5 ("hugetlbfs: Use i_mmap_rwsem to address page
fault/truncate race")
commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
synchronization")
Signed-off-by
as necessary.
File truncation (remove_inode_hugepages) needs to handle page mapping
changes that could have happened before locking the page. This could
happen if page was added to page cache and later backed out in fault
processing.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 34
is not taken
if the caller knows the target can not possibly be part of a shared pmd.
lockdep_assert calls are added to huge_pmd_share and huge_pmd_unshare to
help catch callers not using the proper locking.
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 8
include/linux/hugetlb.h | 66
is that this will be easier to review.
Mike Kravetz (4):
Revert hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race
hugetlbfs: add hinode_rwsem to hugetlb specific inode
hugetlbfs: use hinode_rwsem for pmd sharing synchronization
hugetlbfs: handle page fault/truncate races
fs/hugetlbfs/inode.c
d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
synchronization")
Cc:
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 31 +--
include/linux/fs.h | 15
include/linux/hugetlb.h | 8 --
mm/hugetlb.c| 188 +++
ensure proper locking are also added.
Use of the new semaphore and supporting routines will be provided in a
later patch.
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
synchronization")
Cc:
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 12
inc
sem to address page
fault/truncate race")
Cc:
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 28
mm/hugetlb.c | 23 ---
2 files changed, 20 insertions(+), 31 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetl
c:
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 34 --
mm/hugetlb.c | 40 ++--
2 files changed, 58 insertions(+), 16 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index bc9979382a1e..6b
On 1/28/21 2:15 PM, Andrew Morton wrote:
> On Thu, 28 Jan 2021 14:00:29 -0800 Mike Kravetz
> wrote:
>>
>> Michal suggested that comments describing synchronization be added for each
>> flag. Since I did 'one patch per flag', that would be an update to each
>> pa
+-
> 1 file changed, 28 insertions(+), 21 deletions(-)
Thanks for updating this.
Reviewed-by: Mike Kravetz
I think there still is an open general question about whether we can always
assume page structs are contiguous for really big pages. That is outside
On 1/28/21 1:37 PM, Andrew Morton wrote:
> On Thu, 28 Jan 2021 06:52:21 +0100 Oscar Salvador wrote:
>
>> On Wed, Jan 27, 2021 at 03:36:41PM -0800, Mike Kravetz wrote:
>>> Yes, this patch is somewhat optional. It should be a minor improvement
>>> in cases wh
olding
'big chunks' of memory for a specific purpose and dumping them when needed.
They were not doing this with hugetlb pages, but nothing would surprise me.
In this series, vmemmap freeing is 'opt in' at boot time. I would expect
the use cases that want to opt in rarely if ever free/dissolve hugetlb
pages. But, I could be wrong.
--
Mike Kravetz
spin_unlock(src_ptl);
> + spin_unlock(dst_ptl);
> + prealloc = alloc_huge_page(vma, addr,
> 0);
One quick que
On 2/1/21 3:49 AM, Michal Hocko wrote:
> On Fri 29-01-21 10:46:15, Mike Kravetz wrote:
>> On 1/28/21 2:15 PM, Andrew Morton wrote:
>>> On Thu, 28 Jan 2021 14:00:29 -0800 Mike Kravetz
>>> wrote:
>>>>
>>>> Michal suggested that comments des
On 2/2/21 11:42 PM, Muchun Song wrote:
> On Wed, Jan 20, 2021 at 9:33 AM Mike Kravetz wrote:
>>
>> Signed-off-by: Mike Kravetz
>
> Hi Mike,
>
> I found that you may forget to remove set_page_huge_active()
> from include/linux/hugetlb.h.
>
> diff --git a/inc
-
> 1 file changed, 27 insertions(+), 24 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 18f6ee317900..d2859c2aecc9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
Thanks, that is a pretty straightforward change. A cleanup with no
functiona
*/
> - if (rg->from > t)
> + if (rg->from >= t)
> break;
>
> /* Add an entry for last_accounted_offset -> rg->from, and
>
Changing any of this code makes me nervous. However, I agree with your
analysis. The change makes the code match the comment WRT the [from, to)
nature of regions.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
+++---
> 3 files changed, 29 insertions(+), 22 deletions(-)
Thanks. Nice straightforward improvement.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a5d618d08506..0d793486822b 100644
> --- a/include/li
so large that we do not guarantee that page++ pointer
* arithmetic will work across the entire page. We need something more
* specialized.
*/
static void __copy_gigantic_page(struct page *dst, struct page *src,
int nr_pages)
--
Mike Kravetz
> +
lb page on the free list with a count of 1. There is no check in the
enqueue code. When we dequeue the page, set_page_refcounted() is used to
set the count to 1 without looking at the current value. And, all the other
VM_DEBUG macros are off so we mostly do not notice the bug.
Thanks again,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
> }
> free:
>
On 1/27/21 2:41 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:31, Mike Kravetz wrote:
>> Use new hugetlb specific HPageFreed flag to replace the
>> PageHugeFreed interfaces.
>>
>> Signed-off-by: Mike Kravetz
>> Reviewed-by: Oscar Salvador
>> Reviewed
On 1/27/21 2:20 AM, Michal Hocko wrote:
> [sorry for jumping in late]
>
> On Fri 22-01-21 11:52:27, Mike Kravetz wrote:
>> As hugetlbfs evolved, state information about hugetlb pages was added.
>> One 'convenient' way of doing this was to use available fields in tail
>>
On 1/27/21 2:25 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:28, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag HPageMigratable to replace the
>> page_huge_active interfaces. By its name, page_huge_active implied
>> that a huge page was on the
On 1/27/21 2:35 AM, Michal Hocko wrote:
> On Fri 22-01-21 11:52:29, Mike Kravetz wrote:
>> The HP_Migratable flag indicates a page is a candidate for migration.
>> Only set the flag if the page's hstate supports migration. This allows
>> the migration paths to detect non-mig
On 1/26/21 11:21 AM, Joao Martins wrote:
> On 1/26/21 6:08 PM, Mike Kravetz wrote:
>> On 1/25/21 12:57 PM, Joao Martins wrote:
>>>
>>> +static void record_subpages_vmas(struct page *page, struct vm_area_struct
>>> *vma,
>>> +
t fails, we use part of the hugepage to
>remap.
I honestly am not sure about this. This would only happen for pages in
NORMAL. The only time using part of the huge page for vmemmap would help is
if we are trying to dissolve huge pages to free up memory for other uses.
> What's your opinion about this? Should we take this approach?
I think trying to solve all the issues that could happen as the result of
not being able to dissolve a hugetlb page has made this extremely complex.
I know this is something we need to address/solve. We do not want to add
more unexpected behavior in corner cases. However, I cannot help but think
about similar issues today. For example, if a huge page is in use in
ZONE_MOVABLE or CMA there is no guarantee that it can be migrated today.
Correct? We may need to allocate another huge page for the target of the
migration, and there is no guarantee we can do that.
--
Mike Kravetz
On 1/26/21 4:07 PM, Jason Gunthorpe wrote:
> On Tue, Jan 26, 2021 at 01:21:46PM -0800, Mike Kravetz wrote:
>> On 1/26/21 11:21 AM, Joao Martins wrote:
>>> On 1/26/21 6:08 PM, Mike Kravetz wrote:
>>>> On 1/25/21 12:57 PM, Joao Martins wrote:
>>>>>
>&g
gt; 1 file changed, 1 insertion(+), 2 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index cbf32d2..5e6a6e7 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3367,8 +3367,7 @@ static unsigned in
by: Miaohe Lin
> ---
> mm/hugetlb.c | 16 +---
> 1 file changed, 13 insertions(+), 3 deletions(-)
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 777bc0e45bf3..53ea65d1c5ab 100644
> --- a/mm/hugetlb.c
&
set_compound_order(page, 0);
> - page[1].compound_nr = 0;
I may be reading the code wrong, but set_compound_order(page, 0) will
set page[1].compound_nr to the value of 1. That is different than the
explicit setting to 0 in the existing code.
If that is correct, then you should say
b gigantic page' being <= 1 order, so this change
makes sense. Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index a3e4fa2c5e94..dac5db569ccb 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1219
e with
> region_del exists.
>
> Signed-off-by: Mina Almasry
Thanks. I like this modification as it does simplify the code and could
be added as a general cleanup independent of the other changes.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
> ---
> mm/hugetlb.c | 63 +---
n_add, and I want to make that change in one place
> only. It should improve maintainability anyway on its own.
>
> Signed-off-by: Mina Almasry
Like the previous patch, this is a good improvement independent of the
rest of the series. Thanks!
Reviewed-by: Mike Kravetz
--
Mike Kravetz
ere done in the
region_chg call, and it was relatively easy to do in existing code when
region_chg would only need one additional region at most.
I'm thinking that we may have to make region_chg allocate the worst case
number of regions (t - f)/2, OR change the code such that region_add
could return an error.
--
Mike Kravetz
, remove it from the
definition and all callers.
No functional change.
Reported-by: Nathan Chancellor
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c| 4 ++--
include/linux/hugetlb.h | 2 +-
mm/hugetlb.c| 10 +-
mm/userfaultfd.c| 2 +-
4 files changed, 9
you describe above. I have never looked at/for delays in
these environments around pmd sharing (page faults), but that does not mean
they do not exist. I will try to get the DB group to give me access to one
of their large environments for analysis.
We may want to consider making the timeout value and disable threshold user
configurable.
--
Mike Kravetz
one already
knows.
At one time, I thought it was safe to acquire the semaphore in read mode for
huge_pmd_share, but write mode for huge_pmd_unshare. See commit b43a99900559.
This was reverted along with another patch for other reasons.
If we change from write to read mode, this may have a significant impact
on the stalls.
--
Mike Kravetz
y to overlayfs.
IMO - This BUG/report revealed two issues. First is the BUG by mmap'ing
a hugetlbfs file on overlayfs. The other is that core mmap code will skip
any filesystem-specific get_unmapped_area routine if on a union/overlay.
My patch fixes both, but if we go with a whitelist approach and don't allow
hugetlbfs I think we still need to address the filesystem-specific
get_unmapped_area issue. That is easy enough to do by adding a routine to
overlayfs which calls the routine for the underlying fs.
--
Mike Kravetz
On 5/22/20 3:05 AM, Miklos Szeredi wrote:
> On Wed, May 20, 2020 at 10:27:15AM -0700, Mike Kravetz wrote:
>
>> I am fairly confident it is all about checking limits and alignment. The
>> filesystem knows if it can/should align to base or huge page size. DAX has
>> som
if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see
> https://stackoverflow.com/a/37406982]
>
> url:
> https://github.com/0day
he 1st madvise() event.
>
> Do pgd size pages work properly?
Adding Anshuman and Aneesh as they added pgd support for power. And,
this patch will disable that as well IIUC.
This patch makes sense for x86. My only concern/question is for other
archs which may have huge page sizes defi
-or- hugetlbfs, split out the required
memfd code to separate files.
These files are not used until a subsequent patch which deletes
duplicate code in the original files and enables their use.
Signed-off-by: Mike Kravetz
---
include/linux/memfd.h | 16 +++
mm/memfd.c| 341
Remove memfd and file sealing routines from shmem.c, and enable
the use of the new files (memfd.c and memfd.h).
A new config option MEMFD_CREATE is defined that is enabled if
TMPFS -or- HUGETLBFS is enabled.
Signed-off-by: Mike Kravetz
---
fs/Kconfig | 3 +
fs/fcntl.c
HUGETLBFS_I will be referenced (but not used) in code outside #ifdef
CONFIG_HUGETLBFS. Move the definition to prevent compiler errors.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 27 ---
1 file changed, 16 insertions(+), 11 deletions(-)
diff --git
this was sent as an RFC, one comment suggested combining patches 2
and 3 so that we would not have 'new unused' files between patches. If
this is desired, I can make the change. For me, it is easier to read
as separate patches.
Mike Kravetz (3):
mm: hugetlbfs: move HUGETLBFS_I outside #ifdef
entry properly works, and
> + * - other mm code walking over page table is aware of pud-aligned
> + *hwpoison entries.
> + */
> + if (huge_page_size(page_hstate(head)) > PMD_SIZE) {
> + action_result(pfn, MF_MSG_NON_PMD_HUGE, MF_IGNORED
Remove memfd and file sealing routines from shmem.c, and enable
the use of the new files (memfd.c and memfd.h).
A new config option MEMFD_CREATE is defined that is enabled if
TMPFS -or- HUGETLBFS is enabled.
Signed-off-by: Mike Kravetz
---
fs/Kconfig | 3 +
fs/fcntl.c
applied to the wrong git tree, please drop us a note to
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Mike-Kravetz/restructure-memfd-code/20180131-023405
> base: git://git.cmpxchg.org/linux-mmotm.git master
> reproduce:
> # apt-
-off-by: Mike Kravetz
---
include/linux/memfd.h | 16 +++
mm/memfd.c| 342 ++
2 files changed, 358 insertions(+)
create mode 100644 include/linux/memfd.h
create mode 100644 mm/memfd.c
diff --git a/include/linux/memfd.h b/include/linux
HUGETLBFS_I will be referenced (but not used) in code outside #ifdef
CONFIG_HUGETLBFS. Move the definition to prevent compiler errors.
Signed-off-by: Mike Kravetz
---
include/linux/hugetlb.h | 27 ---
1 file changed, 16 insertions(+), 11 deletions(-)
diff --git
han MAX_ORDER contiguous pages?".
Not sure that we should be adding to the current alloc_contig_range
interface until we decide it is something which will be useful long term.
--
Mike Kravetz
this was sent as an RFC, one comment suggested combining patches 2
and 3 so that we would not have 'new unused' files between patches. If
this is desired, I can make the change. For me, it is easier to read
as separate patches.
v2:
- Fixed sparse warnings inherited from existing code
Mike Kravetz (3
On 09/20/2017 12:25 AM, Michael Kerrisk (man-pages) wrote:
> Hello Mike,
>
> On 09/19/2017 11:42 PM, Mike Kravetz wrote:
>> v2: Fix incorrect wording noticed by Jann Horn.
>> Remove deprecated and memfd_create discussion as suggested
>> by Florian Weimer.
>&g
pages to zero, the poisoned page will be counted as 'surplus'.
I was thinking about keeping at least a bad page count (if not
a list) to avoid user confusion. It may be overkill as I have
not given too much thought to this issue. Anyone else have
thoughts here?
Mike Kravetz (1):
mm:hugetlbfs
epage in unrecoverable
memory error")
Cc: Naoya Horiguchi
Cc: Michal Hocko
Cc: Aneesh Kumar
Cc: Anshuman Khandual
Cc: Andrew Morton
Cc:
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlb
empt to mirror private anon mapping will fail.
>
> Suggested-by: Mike Kravetz
> Signed-off-by: Anshuman Khandual
The tests themselves look fine. However, they are pretty simple and
could very easily be combined into one 'mremap_mirror.c' file. I
would prefer that they be combine
On 10/19/2017 07:30 PM, Naoya Horiguchi wrote:
> On Thu, Oct 19, 2017 at 04:00:07PM -0700, Mike Kravetz wrote:
>
> Thank you for addressing this. The patch itself looks good to me, but
> the reported issue (negative reserve count) doesn't reproduce in my trial
> with v4.14-rc5, so
> sorry for the inconvenience.
>
> On 2017-10-08 18:47 Mike Kravetz wrote:
>> You are correct. That check in function vma_to_resize() will prevent
>> mremap from growing or relocating hugetlb backed mappings. This check
>> existed in the 2.6.0 linux kernel, so this restriction
outstanding issue is sorting out the config option dependencies. Although,
IMO this is not a strict requirement for this series. I have addressed this
issue in a follow-on series:
http://lkml.kernel.org/r/20171109014109.21077-1-mike.krav...@oracle.com
--
Mike Kravetz
On 11/07/2017 04:27 AM, Marc-André
On 10/23/2017 12:32 AM, Naoya Horiguchi wrote:
> On Fri, Oct 20, 2017 at 10:49:46AM -0700, Mike Kravetz wrote:
>> On 10/19/2017 07:30 PM, Naoya Horiguchi wrote:
>>> On Thu, Oct 19, 2017 at 04:00:07PM -0700, Mike Kravetz wrote:
>>>
>>> Thank you for addressi
viding a flag to mmap in
> order to make hugepages work correctly.
Well at least this has a built in fall back mechanism. When using hugetlb(fs)
pages, you would need to handle the case where mremap fails due to lack of
configured huge pages.
I assume your allocator will be for somewhat general application usage. Yet,
for the most reliability the user/admin will need to know at boot time how
many huge pages will be needed and set that up.
--
Mike Kravetz
On 10/23/2017 03:10 PM, Dave Hansen wrote:
> On 10/03/2017 04:56 PM, Mike Kravetz wrote:
>> mmap(MAP_CONTIG) would have the following semantics:
>> - The entire mapping (length size) would be backed by physically contiguous
>> pages.
>> - If 'length' phys
On 7/7/19 10:19 PM, Hillf Danton wrote:
> On Mon, 01 Jul 2019 20:15:51 -0700 Mike Kravetz wrote:
>> On 7/1/19 1:59 AM, Mel Gorman wrote:
>>>
>>> I think it would be reasonable to have should_continue_reclaim allow an
>>> exit if scanning at higher priori
On 7/10/19 12:44 PM, Michal Hocko wrote:
> On Wed 10-07-19 11:42:40, Mike Kravetz wrote:
> [...]
>> As Michal suggested, I'm going to do some testing to see what impact
>> dropping the __GFP_RETRY_MAYFAIL flag for these huge page allocations
>> will have on the number of pa
On 7/11/19 10:47 PM, Hillf Danton wrote:
>
> On Thu, 11 Jul 2019 02:42:56 +0800 Mike Kravetz wrote:
>>
>> It is quite easy to hit the condition where:
>> nr_reclaimed == 0 && nr_scanned == 0 is true, but we skip the previous test
>>
> Then skipping ch
e
routines (or their callers) it has been verified that address is within
a vma. In addition, mmap_sem is held so that vmas cannot change.
Therefore, there should be no way for find_vma to return NULL here.
Please let me know if there is something I have overlooked. Otherwise,
there is no