On 7/25/19 1:13 AM, Mel Gorman wrote:
> On Wed, Jul 24, 2019 at 10:50:14AM -0700, Mike Kravetz wrote:
>> When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages,
>> the pages will be interleaved between all nodes of the system. If
>> nodes are not equal, it is q
sting to fix it.
>
> Signed-off-by: Naoya Horiguchi
> Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
> Cc: # v4.19+
Reviewed-by: Mike Kravetz
To follow up on Andrew's comment/question about user-visible effects. Without
this fix, there are cases where madvise(
On 7/31/19 5:06 AM, Vlastimil Babka wrote:
> On 7/24/19 7:50 PM, Mike Kravetz wrote:
>> For PAGE_ALLOC_COSTLY_ORDER allocations, MIN_COMPACT_COSTLY_PRIORITY is
>> minimum (highest priority). Other places in the compaction code key off
>> of MIN_COMPACT_PRIORITY. Cos
few
pages and none of those are reclaimed.
Can we not get nr_scanned == 0 on an arbitrary chunk of the LRU?
I must be missing something, because I do not see how nr_scanned == 0
guarantees a full scan.
--
Mike Kravetz
On 7/31/19 6:23 AM, Vlastimil Babka wrote:
> On 7/25/19 7:15 PM, Mike Kravetz wrote:
>> On 7/25/19 1:13 AM, Mel Gorman wrote:
>>> On Wed, Jul 24, 2019 at 10:50:14AM -0700, Mike Kravetz wrote:
>>>
>>> set_max_huge_pages can fail the NODEMASK_ALLOC() alloc whi
quests. Any suggestions on how to test that?
--
Mike Kravetz
> 8<
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index 9569e7c786d3..b8bfe8d5d2e9 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -129,11 +129,7 @@ sta
[ T1315] el0_svc_handler+0x19c/0x26c
> [ 788.922088][ T1315] el0_svc+0x8/0xc
>
> Ideally, it seems only ipc_findkey() and newseg() in this path need to hold
> the semaphore to protect concurrent access, so it could just be converted
> to a spinlock instead.
I do not have enough experience with this ipc code to comment on your proposed
change. But, I will look into it.
[1] https://lkml.org/lkml/2019/4/23/2
--
Mike Kravetz
_page(),
> which are also cleaned up by this patch.
It may just be me, but I am having a hard time separating the fix for this
issue from the change to the dissolve_free_huge_page routine. Would it be
clearer, or possible, to create separate patches for these?
--
Mike Kravetz
On 5/28/19 2:49 AM, Wanpeng Li wrote:
> Cc Paolo,
> Hi all,
> On Wed, 14 Feb 2018 at 06:34, Mike Kravetz wrote:
>>
>> On 02/12/2018 06:48 PM, Michael Ellerman wrote:
>>> Andrew Morton writes:
>>>
>>>> On Thu, 08 Feb 2018 12:30:45 + Punit
On 8/2/19 5:05 AM, Vlastimil Babka wrote:
>
> On 8/1/19 10:33 PM, Mike Kravetz wrote:
>> On 8/1/19 6:01 AM, Vlastimil Babka wrote:
>>> Could you try testing the patch below instead? It should hopefully
>>> eliminate the stalls. If it makes hugepage allocation give u
From: Vlastimil Babka
Mike Kravetz reports that "hugetlb allocations could stall for minutes or hours
when should_compact_retry() would return true more often than it should.
Specifically, this was in the case where compact_result was COMPACT_DEFERRED
and COMPACT_PARTIAL_SKIPPED and no pro
that there are not
enough inactive lru pages left to satisfy the costly allocation.
We can give up reclaiming pages too if we see dryrun occur, with the
certainty of plenty of inactive pages. IOW with dryrun detected, we are
sure we have reclaimed as many pages as we could.
Cc: Mike Kravetz
Cc: Mel Gorman
will still succeed if there is memory available, but it will not try
as hard to free up memory.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 86 ++--
1 file changed, 76 insertions(+), 10 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index
[1] http://lkml.kernel.org/r/d38a095e-dc39-7e82-bb76-2c9247929...@oracle.com
[2] http://lkml.kernel.org/r/20190724175014.9935-1-mike.krav...@oracle.com
Hillf Danton (1):
mm, reclaim: make should_continue_reclaim perform dryrun detection
Mike Kravetz (1):
hugetlbfs: don't retry when pool page alloc
On 8/5/19 1:42 AM, Vlastimil Babka wrote:
> On 8/3/19 12:39 AM, Mike Kravetz wrote:
>> From: Hillf Danton
>>
>> Address the issue of should_continue_reclaim continuing true too often
>> for __GFP_RETRY_MAYFAIL attempts when !nr_reclaimed and nr_scanned.
>> This
On 8/5/19 3:57 AM, Vlastimil Babka wrote:
> On 8/5/19 10:42 AM, Vlastimil Babka wrote:
>> On 8/3/19 12:39 AM, Mike Kravetz wrote:
>>> From: Hillf Danton
>>>
>>> Address the issue of should_continue_reclaim continuing true too often
>>> for __
On 8/5/19 2:28 AM, Vlastimil Babka wrote:
> On 8/3/19 12:39 AM, Mike Kravetz wrote:
>> When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages,
>> the pages will be interleaved between all nodes of the system. If
>> nodes are not equal, it is quite possible fo
:
mm, reclaim: make should_continue_reclaim perform dryrun detection
Mike Kravetz (1):
hugetlbfs: don't retry when pool page allocations start to fail
Vlastimil Babka (2):
mm, reclaim: cleanup should_continue_reclaim()
mm, compaction: raise compaction priority after it withdrawns
include
as we could.
Cc: Mike Kravetz
Cc: Mel Gorman
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Johannes Weiner
Signed-off-by: Hillf Danton
Tested-by: Mike Kravetz
Acked-by: Mel Gorman
Acked-by: Vlastimil Babka
Signed-off-by: Mike Kravetz
---
v2 - Updated commit message and added SOB.
mm/vmscan.c
From: Vlastimil Babka
Mike Kravetz reports that "hugetlb allocations could stall for minutes
or hours when should_compact_retry() would return true more often than
it should. Specifically, this was in the case where compact_result was
COMPACT_DEFERRED and COMPACT_PARTIAL_SKIPPED and no pro
will still succeed if there is memory available, but it will not try
as hard to free up memory.
Signed-off-by: Mike Kravetz
---
v2 - Removed __GFP_NORETRY from bit mask allocations and added more
comments. OK to pass NULL to NODEMASK_FREE.
mm/hugetlb.c | 89
as been scanned" with nr_scanned == 0 didn't really work.
Signed-off-by: Vlastimil Babka
Acked-by: Mike Kravetz
Signed-off-by: Mike Kravetz
---
Commit message reformatted to avoid line wrap.
mm/vmscan.c | 43 ++-
1 file changed, 14 insertions(+), 2
and allocations? If a combined
controller will work for new use cases, that would be my preference. Of
course, I have not prototyped such a controller so there may be issues when
we get into the details. For a reservation only or combined controller,
the region_* changes proposed by Mina would be used.
--
Mike Kravetz
adds_in_progress = 1
cache entries 1
- region_chg(3,4)
adds_in_progress = 2
cache entries 2
- region_chg(5,6)
adds_in_progress = 3
cache entries 3
At this point, no region descriptors are in the map because only
region_chg has been called.
- region_chg(0,6)
adds_in_progress = 4
cache entries 4
Is that correct so far?
Then the following sequence happens,
- region_add(1,2)
adds_in_progress = 3
cache entries 3
- region_add(3,4)
adds_in_progress = 2
cache entries 2
- region_add(5,6)
adds_in_progress = 1
cache entries 1
list of region descriptors is:
[1->2] [3->4] [5->6]
- region_add(0,6)
This is going to require 3 cache entries but only one is in the cache.
I think we are going to BUG in get_file_region_entry_from_cache() the
second time it is called from add_reservation_in_range().
I stopped looking at the code here as things will need to change if this
is a real issue.
--
Mike Kravetz
nly sticking point left is whether an added controller
> can support both cgroup-v2 and cgroup-v1. If I could get confirmation
> on that I'll provide a patchset.
Sorry, but I can not provide cgroup expertise.
--
Mike Kravetz
On 9/27/19 3:51 PM, Mina Almasry wrote:
> On Fri, Sep 27, 2019 at 2:59 PM Mike Kravetz wrote:
>>
>> On 9/26/19 5:55 PM, Mina Almasry wrote:
>>> Provided we keep the existing controller untouched, should the new
>>> controller track:
>>>
>>>
ng stalls?
If so, can you try the simple change of taking the semaphore in read mode
in huge_pmd_share.
--
Mike Kravetz
On 8/10/19 3:01 PM, Mina Almasry wrote:
> On Sat, Aug 10, 2019 at 11:58 AM Mike Kravetz wrote:
>>
>> On 8/9/19 12:42 PM, Mina Almasry wrote:
>>> On Fri, Aug 9, 2019 at 10:54 AM Mike Kravetz
>>> wrote:
>>>> On 8/8/19 4:13 PM, Mina Almasry wrote:
>
if (!dry_run) {
> + list_del(&rg->link);
> + kfree(rg);
Is it possible that the region struct we are deleting pointed to
a reservation_counter? Perhaps even for another cgroup?
Just concerned with the way regions are coalesced that we may be
deleting counters.
--
Mike Kravetz
On 8/13/19 4:54 PM, Mike Kravetz wrote:
> On 8/8/19 4:13 PM, Mina Almasry wrote:
>> For shared mappings, the pointer to the hugetlb_cgroup to uncharge lives
>> in the resv_map entries, in file_region->reservation_counter.
>>
>> When a file_region entry is added to t
etermine how
many reservations were actually consumed. I did not look closely enough to
determine whether the code drops reservation usage counts as pages are added to shared
mappings.
--
Mike Kravetz
pages, and will SIGBUS you when you try to access the remaining 2
> pages. So the problem persists. Folks would still like to know they
> are crossing the limits on mmap time.
If you got the failure at mmap time in the MAP_POPULATE case would this
be useful?
Just thinking that would be a relatively simple change.
--
Mike Kravetz
On 8/9/19 1:57 PM, Mina Almasry wrote:
> On Fri, Aug 9, 2019 at 1:39 PM Mike Kravetz wrote:
>>
>> On 8/9/19 11:05 AM, Mina Almasry wrote:
>>> On Fri, Aug 9, 2019 at 4:27 AM Michal Koutný wrote:
>>>>> Alternatives considered:
>>>>> [...]
>
On 8/9/19 12:42 PM, Mina Almasry wrote:
> On Fri, Aug 9, 2019 at 10:54 AM Mike Kravetz wrote:
>> On 8/8/19 4:13 PM, Mina Almasry wrote:
>>> Problem:
>>> Currently tasks attempting to allocate more hugetlb memory than is
>>> available get
>>>
b39d0ee2632d to cause regressions and noticeable
behavior changes.
My quick/limited testing in [1] was insufficient. It was also mentioned that
if something like b39d0ee2632d went forward, I would like exemptions for
__GFP_RETRY_MAYFAIL requests as in this patch.
>
> [mho...@suse.com: rewo
32d went
forward there should be an exception for __GFP_RETRY_MAYFAIL requests.
[1] https://lkml.kernel.org/r/3468b605-a3a9-6978-9699-57c52a90b...@oracle.com
--
Mike Kravetz
b.c:4055:40: note: place parentheses around the 'sizeof(u32)'
> expression to silence this warning
> hash = jhash2((u32 *)&key, sizeof(key)/sizeof(u32), 0);
> ^ CC fs/ext4/ialloc.o
>
> Fix the warning adding parentheses aroun
s pretty straightforward, but the idea
was to stress the underlying code. In fact, it did identify issues with
isolation which were corrected.
I exercised this new interface in the same way and am happy to report that
no issues were detected.
--
Mike Kravetz
hstate. It now does
that for the '0' hstate, and 0 is not always equal to default_hstate_idx.
David, was that intentional or an oversight? I can fix it up, just wanted to
make sure there was not some reason for the change.
--
Mike Kravetz
Sorry for noise, left off David
On 10/17/19 5:08 PM, Mike Kravetz wrote:
> Cc: David
> On 10/17/19 3:38 AM, Chengguang Xu wrote:
>> In order to avoid using incorrect mnt, we should set
>> mnt to NULL when we get error from mount_one_hugetlbfs().
>>
>> Signed-off-by
e commit 78911d0e18ac ("userfaultfd: use vma_pagesize
> for all huge page size calculation")
>
Thanks! That should have been removed with the recent cleanups.
> Signed-off-by: YueHaibing
Reviewed-by: Mike Kravetz
--
Mike Kravetz
On 10/9/19 6:23 PM, Wei Yang wrote:
> On Wed, Oct 09, 2019 at 05:45:57PM -0700, Mike Kravetz wrote:
>> On 10/9/19 5:27 AM, YueHaibing wrote:
>>> Fixes gcc '-Wunused-but-set-variable' warning:
>>>
>>> mm/userfaultfd.c: In function '__mcopy_atomic_hugetlb':
On 10/9/19 8:30 PM, Wei Yang wrote:
> On Wed, Oct 09, 2019 at 07:25:18PM -0700, Mike Kravetz wrote:
>> On 10/9/19 6:23 PM, Wei Yang wrote:
>>> On Wed, Oct 09, 2019 at 05:45:57PM -0700, Mike Kravetz wrote:
>>>> On 10/9/19 5:27 AM, YueHaibing wrote:
>>>
of the "hugepagesz=" in arch specific code to a common
routine in arch independent code.
Signed-off-by: Mike Kravetz
Acked-by: Gerald Schaefer [s390]
Acked-by: Will Deacon
---
arch/arm64/mm/hugetlbpage.c | 17 +
arch/powerpc/mm/hugetlbpage.c | 20 +---
arc
independent routine.
- Clean up command line processing to follow desired semantics and
document those semantics.
[1] https://lore.kernel.org/linux-mm/20200305033014.1152-1-longpe...@huawei.com
Mike Kravetz (4):
hugetlbfs: add arch_hugetlb_valid_size
hugetlbfs: move hugepagesz= parsing to arch
processing "hugepagesz=".
After this, calls to size_to_hstate() in arch specific code can be
removed and hugetlb_add_hstate can be called without worrying about
warning messages.
Signed-off-by: Mike Kravetz
Acked-by: Mina Almasry
Acked-by: Gerald Schaefer [s390]
Acked-by: Will Deacon
Test
the bootmem allocator required
for gigantic allocations is not available at this time.
Signed-off-by: Mike Kravetz
Acked-by: Gerald Schaefer [s390]
Acked-by: Will Deacon
Tested-by: Sandipan Das
---
.../admin-guide/kernel-parameters.txt | 40 +++--
Documentation/admin-guide/mm
ed by some
architectures to set up ALL huge pages sizes.
Signed-off-by: Mike Kravetz
Acked-by: Mina Almasry
Reviewed-by: Peter Xu
Acked-by: Gerald Schaefer [s390]
Acked-by: Will Deacon
---
arch/arm64/mm/hugetlbpage.c | 15 ---
arch/powerpc/mm/hugetlbpage.c | 15 ---
On 10/22/19 12:09 AM, Piotr Sarna wrote:
> On 10/21/19 7:17 PM, Mike Kravetz wrote:
>> On 10/15/19 4:37 PM, Mike Kravetz wrote:
>>> On 10/15/19 3:50 AM, Michal Hocko wrote:
>>>> On Tue 15-10-19 11:01:12, Piotr Sarna wrote:
>>>>> With hugetlbfs, a co
On 10/11/19 1:41 PM, Mina Almasry wrote:
> On Fri, Oct 11, 2019 at 12:10 PM Mina Almasry wrote:
>>
>> On Mon, Sep 23, 2019 at 10:47 AM Mike Kravetz
>> wrote:
>>>
>>> On 9/19/19 3:24 PM, Mina Almasry wrote:
>>
>> Mike, note your suggestion a
n. This would use the existing code to prevent all hugetlb usage.
It seems like there may be some discussion about 'the right' way to
do kdump. I can't add to that discussion, but if such an option as
nohugepages is needed, I can help.
--
Mike Kravetz
p22.suse.cz
>
> Reported-by: Michal Hocko
> Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to
> zones until online") # visible after d0dc12e86b319
> Cc: sta...@vger.kernel.org # v4.13+
> Cc: Anshuman Khandual
> Cc: Mike Kravetz
> Cc:
s implemented. So, that is why it does not make (more) use of that
option.
The implementation looks to be straightforward. However, I really do
not want to add more functionality to hugetlbfs unless there is a specific
use case that needs it.
--
Mike Kravetz
ould like to get feedback from anyone that knows how the existing
hugetlb cgroup controller may be used today. Comments from Aneesh would
be very welcome to know if reservations were considered in development of the
existing code.
--
Mike Kravetz
On 9/23/19 12:18 PM, Mina Almasry wrote:
> On Mon, Sep 23, 2019 at 10:47 AM Mike Kravetz wrote:
>>
>> On 9/19/19 3:24 PM, Mina Almasry wrote:
>>> Patch series implements hugetlb_cgroup reservation usage and limits, which
>>> track hugetlb reservations r
es causes allocations to fail sooner in the case of
COMPACT_DEFERRED:
http://lkml.kernel.org/r/20190806014744.15446-4-mike.krav...@oracle.com
hugetlb allocations have the __GFP_RETRY_MAYFAIL flag set. They are willing
to retry and wait and callers are aware of this. Even though my limited
testing did not show regressions caused by this patch, I would prefer if the
quick exit did not apply to __GFP_RETRY_MAYFAIL requests.
--
Mike Kravetz
y, these races are rare and I had to work really hard to produce
them. I'll try to find my testing mechanism. My concern is reintroducing
this abandoning of pageblocks. I have not looked further in your series
to see if this potentially addressed later. If not, then we should not
remove t
urn -EBUSY' removed from the comment and
assumed the code would not return an error code. The code now more
explicitly does return -EBUSY. My concern was when I incorrectly thought
you were removing the error return code. Sorry for the noise.
Acked-by: Mike Kravetz
--
Mike Kravetz
x4d/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator")
Signed-off-by: Mike Kravetz
Cc:
---
mm/cma.c | 23 +--
1 file changed, 9 insertions(+), 14 deletions(-)
diff --git a/mm/cma.c b/mm/cma.c
index
roblem in the existing code
that needs to be fixed in stable. I think the existing code is correct, just
inefficient.
--
Mike Kravetz
On 7/30/20 4:26 PM, Peter Xu wrote:
> Hi, Mike,
>
> On Thu, Jul 30, 2020 at 02:49:18PM -0700, Mike Kravetz wrote:
>> On 7/30/20 1:16 PM, Peter Xu wrote:
>>> This is found by code observation only.
>>>
>>> Firstly, the worst case scenario should assume t
> Signed-off-by: Joonsoo Kim
Thanks for consolidating these.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
itectures. arc looks like it could be
problematic as update_mmu_cache_pmd calls update_mmu_cache and then
operates on (address & PAGE_MASK). That could now be different.
--
Mike Kravetz
>
> Signed-off-by: Bibo Mao
> ---
> mm/huge_memory.c | 2 ++
> 1 file chang
On 6/25/20 5:01 AM, Aneesh Kumar K.V wrote:
> Mike Kravetz writes:
>
>> On 6/24/20 2:26 AM, Bibo Mao wrote:
>>> When set_pmd_at is called in function do_huge_pmd_anonymous_page,
>>> new tlb entry can be added by software on MIPS platform.
>>>
>>
On 6/22/20 3:01 PM, Mike Kravetz wrote:
> On 6/21/20 5:55 PM, kernel test robot wrote:
>> Greeting,
>>
>> FYI, we noticed a -33.4% regression of vm-scalability.throughput due to
>> commit:
>>
>>
>> commit: c0d0381ade79885c04a04c303284b040616b116e (
On 7/19/20 11:22 PM, Anshuman Khandual wrote:
>
>
> On 07/17/2020 10:32 PM, Mike Kravetz wrote:
>> On 7/16/20 10:02 PM, Anshuman Khandual wrote:
>>>
>>>
>>> On 07/16/2020 11:55 PM, Mike Kravetz wrote:
>>>> >From 17c8f37afbf42fe7412e6eeb
ma(vma);
> unsigned long reserve, start, end;
> - long gbl_reserve;
> + long glb_reserve;
I see both 'gbl' and 'glb' being used for global in variable names. grep will
actually return more hits for gbl than glb. Unless there is consensus that
'glb' should be used for glo
On 7/19/20 11:26 PM, Baoquan He wrote:
> Just like his neighbour is_hugetlb_entry_migration() has done.
>
> Signed-off-by: Baoquan He
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
> ---
> mm/hugetlb.c | 8
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
entry_migration()
> and is_hugetlb_entry_hwpoisoned() to simplify code.
>
> Signed-off-by: Baoquan He
Agreed, we can remove the checks for non_swap_entry.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
> ---
> mm/hugetlb.c | 4 ++--
> 1 file changed, 2 insertions(+), 2
onally be followed by the hugepages parameter to preallocate a
> specific number of huge pages of default size. The number of default
>
Unfortunately, this review comment was missed when the typo was introduced.
https://lore.kernel.org/lkml/5ca27419-7496-8799-aeed-3042c9770...@o
> old_max ? "increased" : "decreased",
> + abs(old_max - h->max_huge_pages));
> + }
> spin_unlock(&hugetlb_lock);
I would prefer if we drop the lock before logging the message. That would
involve grabbing the value of h->max_huge_pages before dropping the lock.
--
Mike Kravetz
>
> NODEMASK_FREE(node_alloc_noretry);
>
] mm/hugetlb: better checks before using hugetlb_cma
>
> Signed-off-by: Stephen Rothwell
Thanks Stephen, sorry for missing that in review.
Acked-by: Mike Kravetz
--
Mike Kravetz
> ---
> mm/hugetlb.c | 9 ++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff
you have multiple gigantic page sizes supported at one
time (one system instance) on powerpc?
--
Mike Kravetz
olicy. This new code will help
produce a quick failure as described in the commit message, and it does
not make existing interactions any worse.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
ff81060f80 <__fentry__>
0x8126a3a5 <+5>:  xor    %eax,%eax
0x8126a3a7 <+7>:  mov    %gs:0x17bc0,%rdx
0x8126a3b0 <+16>: testb  $0x1,0x778(%rdx)
0x8126a3b7 <+23>: jne    0x8126a3ba
0x8126a3b9 <+25>: retq
0x8126a3ba <+26>: mov    0x6c(%rdi),%eax
0x8126a3bd <+29>: retq
End of assembler dump.
--
Mike Kravetz
patch summarizes the issues.
IMO, at this time it makes little sense to perform checks for more than
MPOL_BIND at reservation time. If we ever take on the monumental task of
supporting mempolicy directed per-node reservations throughout the life of
a process, support for other policies will need to be taken into account.
--
Mike Kravetz
>>> [-Wdeclaration-after-statement]
>>>
>>> Instead we should switch to C99 and declare it as "for (int __nr_nodes" :P
>>
>> Hmm... I tried what you suggested, but compiler complains.
>>
>> 'for' loop initial declarations are only allowed in C99 or C11 mode
>
> Yes, by "we should switch to C99" I meant that the kernel kbuild system would
> need to switch. Not a trivial change...
> Without that, I don't see how your patch is possible to do safely.
Vlastimil, thanks for pointing out future potential issues with this patch.
I likely would have missed that.
Wei, thanks for taking the time to put together the patch. However, I tend
to agree with Vlastimil's assessment. The cleanup is not worth the risk of
running into issues if someone uses multiple instances of the macro.
--
Mike Kravetz
Gushchin
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: Thomas Gleixner
> Cc: Ingo Molnar
> Cc: Borislav Petkov
> Cc: H. Peter Anvin
> Cc: Mike Kravetz
> Cc: Mike Rapoport
> Cc: Andrew Morton
> Cc: Anshuman Khandual
> Cc: Jonathan Cameron
> Signed-o
On 7/15/20 4:14 AM, Song Bao Hua (Barry Song) wrote:
>> From: Mike Kravetz [mailto:mike.krav...@oracle.com]
>> huge_page_size(h)/1024);
>>
>> +	if (order >= MAX_ORDER && hugetlb_cma_size)
>> +		hugetlb_cma
this seems too simple, we can
revisit that decision.
>> +
>> parsed_hstate = h;
>> }
>>
>> @@ -5647,7 +5650,10 @@ void __init hugetlb_cma_reserve(int order)
>> 	unsigned long size, reserved, per_node;
>> 	int nid;
>>
>> -	cma_reserve_called = true;
>> +	if (cma_reserve_called)
>> +		return;
>> +	else
>> +		cma_reserve_called = true;
>
> (nit: don't need the 'else' here)
Yes, duh!
--
Mike Kravetz
On 7/16/20 10:02 PM, Anshuman Khandual wrote:
>
>
> On 07/16/2020 11:55 PM, Mike Kravetz wrote:
>> >From 17c8f37afbf42fe7412e6eebb3619c6e0b7e1c3c Mon Sep 17 00:00:00 2001
>> From: Mike Kravetz
>> Date: Tue, 14 Jul 2020 15:54:46 -0700
>> Subject: [PATCH] h
On 7/17/20 2:51 AM, Anshuman Khandual wrote:
>
>
> On 07/17/2020 02:06 PM, Will Deacon wrote:
>> On Fri, Jul 17, 2020 at 10:32:53AM +0530, Anshuman Khandual wrote:
>>>
>>>
>>> On 07/16/2020 11:55 PM, Mike Kravetz wrote:
>>>> >From 17c8f37a
.
I do not feel strongly one way or another about adding the warning. Since
it is fairly trivial and could help diagnose issues I am in favor of adding
it. If people feel strongly that it should not be added, I am open to
those arguments.
--
Mike Kravetz
goto out;
> }
There is a big comment before this code in hugetlb_acct_memory. The comment
only talks about cpusets. We should probably update that to include mempolicy
as well. It could be as simple as s/cpuset/cpuset or mempolicy/.
--
Mike Kravetz
On 7/16/20 1:12 AM, Will Deacon wrote:
> On Wed, Jul 15, 2020 at 09:59:24AM -0700, Mike Kravetz wrote:
>>
>> So, everything in the existing code really depends on the hugetlb definition
>> of gigantic page (order >= MAX_ORDER). The code to check for
>> 'order >
Signed-off-by: Naoya Horiguchi
> Signed-off-by: Oscar Salvador
Reviewed-by: Mike Kravetz
--
Mike Kravetz
On 7/22/20 1:49 AM, Baoquan He wrote:
> On 07/20/20 at 05:38pm, Mike Kravetz wrote:
>>> + if (count != h->max_huge_pages) {
>>> + char buf[32];
>>> +
>>> + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
>>> +
s patch?
>
I hope you do nothing with this as the patch is not upstream.
--
Mike Kravetz
On 6/24/19 2:53 PM, Mike Kravetz wrote:
> On 6/24/19 2:30 PM, Qian Cai wrote:
>> So the problem is that ipcget_public() has held the semaphore "ids->rwsem"
>> for
>> too long seems unnecessarily and then goes to sleep sometimes due to direct
>> rec
On 4/24/19 7:35 AM, Vlastimil Babka wrote:
> On 4/23/19 6:39 PM, Mike Kravetz wrote:
>>> That being said, I do not think __GFP_RETRY_MAYFAIL is wrong here. It
>>> looks like there is something wrong in the reclaim going on.
>>
>> Ok, I will start digging into that.
On 7/1/19 1:59 AM, Mel Gorman wrote:
> On Fri, Jun 28, 2019 at 11:20:42AM -0700, Mike Kravetz wrote:
>> On 4/24/19 7:35 AM, Vlastimil Babka wrote:
>>> On 4/23/19 6:39 PM, Mike Kravetz wrote:
>>>>> That being said, I do not think __GFP_RETRY_MAYFAIL i
On 6/5/19 12:58 AM, Vlastimil Babka wrote:
> On 6/5/19 1:30 AM, Mike Kravetz wrote:
>> While looking at some really long hugetlb page allocation times, I noticed
>> instances where should_compact_retry() was returning true more often than
>> I expected. In one allocation atte
;
}
Just curious, is this intentional?
--
Mike Kravetz
TRY and back to hopefully
take into account transient conditions.
>From 528c52397301f02acb614c610bd65f0f9a107481 Mon Sep 17 00:00:00 2001
From: Mike Kravetz
Date: Wed, 3 Jul 2019 13:36:24 -0700
Subject: [PATCH] hugetlbfs: don't retry when pool page allocations start to
fail
When allocating hugetlbfs pool
On 7/4/19 4:09 AM, Michal Hocko wrote:
> On Wed 03-07-19 16:54:35, Mike Kravetz wrote:
>> On 7/3/19 2:43 AM, Mel Gorman wrote:
>>> Indeed. I'm getting knocked offline shortly so I didn't give this the
>>> time it deserves but it appears that part of this problem is
>
g dissolve_free_huge_page(). dissolve_free_huge_pages is called as
part of memory offline processing. We do not know if the memory to be offlined
contains huge pages or not. With your changes, we are taking hugetlb_lock
on each call to dissolve_free_huge_page just to discover that the page is
not a huge page.
You 'could' add a PageHuge(page) check to dissolve_free_huge_page before
taking the lock. However, you would need to check again after taking the
lock.
--
Mike Kravetz
origuchi
Thanks for the updates,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
for
> dissolving, where we should return success for !PageHuge() case because
> the given hugepage is considered as already dissolved.
>
> This change also affects other callers of dissolve_free_huge_page(),
> which are cleaned up together.
>
> Reported-by: Chen, Jerry T
> Tested-by: Chen, Jerry T
> Signed-off-by: Naoya Horiguchi
Thanks,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
n reality, this does not impact powerpc as that architecture has its own
own hugetlb_get_unmapped_area routine.
Because of this, I suggest we add a comment above this code and switch
the if/else order. For example,
+	/*
+	 * Use mm->get_unmapped_area value as a hint to use topdown routine.
+	 * If architectures have special needs, they should define their own
+	 * version of hugetlb_get_unmapped_area.
+	 */
+	if (mm->get_unmapped_area == arch_get_unmapped_area_topdown)
+		return hugetlb_get_unmapped_area_topdown(file, addr, len,
+				pgoff, flags);
+	return hugetlb_get_unmapped_area_bottomup(file, addr, len,
+			pgoff, flags);
Thoughts?
--
Mike Kravetz
> }
> #endif
>
>
On 5/12/20 11:11 AM, Mike Kravetz wrote:
> On 5/12/20 8:04 AM, Miklos Szeredi wrote:
>> On Tue, Apr 7, 2020 at 12:06 AM Mike Kravetz wrote:
>>> On 4/5/20 8:06 PM, syzbot wrote:
>>>
>>> The routine is_file_hugepages() is just comparing the file ops to huegt