Re: [RFC PATCH 3/3] hugetlbfs: don't retry when pool page allocations start to fail

2019-07-25 Thread Mike Kravetz
On 7/25/19 1:13 AM, Mel Gorman wrote: > On Wed, Jul 24, 2019 at 10:50:14AM -0700, Mike Kravetz wrote: >> When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages, >> the pages will be interleaved between all nodes of the system. If >> nodes are not equal, it is q

Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

2019-06-10 Thread Mike Kravetz
sting to fix it. > > Signed-off-by: Naoya Horiguchi > Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining") > Cc: # v4.19+ Reviewed-by: Mike Kravetz To follow-up on Andrew's comment/question about user visible effects. Without this fix, there are cases where madvise(

Re: [RFC PATCH 2/3] mm, compaction: use MIN_COMPACT_COSTLY_PRIORITY everywhere for costly orders

2019-07-31 Thread Mike Kravetz
On 7/31/19 5:06 AM, Vlastimil Babka wrote: > On 7/24/19 7:50 PM, Mike Kravetz wrote: >> For PAGE_ALLOC_COSTLY_ORDER allocations, MIN_COMPACT_COSTLY_PRIORITY is >> minimum (highest priority). Other places in the compaction code key off >> of MIN_COMPACT_PRIORITY. Cos

Re: [RFC PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-07-31 Thread Mike Kravetz
few pages and none of those are reclaimed. Can we not get nr_scanned == 0 on an arbitrary chunk of the LRU? I must be missing something, because I do not see how nr_scanned == 0 guarantees a full scan. -- Mike Kravetz

Re: [RFC PATCH 3/3] hugetlbfs: don't retry when pool page allocations start to fail

2019-07-31 Thread Mike Kravetz
On 7/31/19 6:23 AM, Vlastimil Babka wrote: > On 7/25/19 7:15 PM, Mike Kravetz wrote: >> On 7/25/19 1:13 AM, Mel Gorman wrote: >>> On Wed, Jul 24, 2019 at 10:50:14AM -0700, Mike Kravetz wrote: >>> >>> set_max_huge_pages can fail the NODEMASK_ALLOC() alloc whi

Re: [RFC PATCH 2/3] mm, compaction: use MIN_COMPACT_COSTLY_PRIORITY everywhere for costly orders

2019-08-01 Thread Mike Kravetz
quests. Any suggestions on how to test that? -- Mike Kravetz > 8< > diff --git a/include/linux/compaction.h b/include/linux/compaction.h > index 9569e7c786d3..b8bfe8d5d2e9 100644 > --- a/include/linux/compaction.h > +++ b/include/linux/compaction.h > @@ -129,11 +129,7 @@ sta

Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)

2019-06-24 Thread Mike Kravetz
[ T1315] el0_svc_handler+0x19c/0x26c > [ 788.922088][ T1315] el0_svc+0x8/0xc > > Ideally, it seems only ipc_findkey() and newseg() in this path needs to hold > the > semaphore to protect concurrency access, so it could just be converted to a > spinlock instead. I do not have enough experience with this ipc code to comment on your proposed change. But, I will look into it. [1] https://lkml.org/lkml/2019/4/23/2 -- Mike Kravetz

Re: [PATCH v1] mm: hugetlb: soft-offline: fix wrong return value of soft offline

2019-05-29 Thread Mike Kravetz
_page(), > which are also cleaned up by this patch. It may just be me, but I am having a hard time separating the fix for this issue from the change to the dissolve_free_huge_page routine. Would it be more clear or possible to create separate patches for these? -- Mike Kravetz

Re: [PATCH v2] mm: hwpoison: disable memory error handling on 1GB hugepage

2019-05-29 Thread Mike Kravetz
On 5/28/19 2:49 AM, Wanpeng Li wrote: > Cc Paolo, > Hi all, > On Wed, 14 Feb 2018 at 06:34, Mike Kravetz wrote: >> >> On 02/12/2018 06:48 PM, Michael Ellerman wrote: >>> Andrew Morton writes: >>> >>>> On Thu, 08 Feb 2018 12:30:45 + Punit

Re: [RFC PATCH 2/3] mm, compaction: use MIN_COMPACT_COSTLY_PRIORITY everywhere for costly orders

2019-08-02 Thread Mike Kravetz
On 8/2/19 5:05 AM, Vlastimil Babka wrote: > > On 8/1/19 10:33 PM, Mike Kravetz wrote: >> On 8/1/19 6:01 AM, Vlastimil Babka wrote: >>> Could you try testing the patch below instead? It should hopefully >>> eliminate the stalls. If it makes hugepage allocation give u

[PATCH 2/3] mm, compaction: raise compaction priority after it withdrawns

2019-08-02 Thread Mike Kravetz
From: Vlastimil Babka Mike Kravetz reports that "hugetlb allocations could stall for minutes or hours when should_compact_retry() would return true more often than it should. Specifically, this was in the case where compact_result was COMPACT_DEFERRED and COMPACT_PARTIAL_SKIPPED and no pro

[PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-02 Thread Mike Kravetz
that there are not enough inactive lru pages left to satisfy the costly allocation. We can give up reclaiming pages too if we see dryrun occur, with the certainty of plenty of inactive pages. IOW with dryrun detected, we are sure we have reclaimed as many pages as we could. Cc: Mike Kravetz Cc: Mel Gorman

[PATCH 3/3] hugetlbfs: don't retry when pool page allocations start to fail

2019-08-02 Thread Mike Kravetz
will still succeed if there is memory available, but it will not try as hard to free up memory. Signed-off-by: Mike Kravetz --- mm/hugetlb.c | 86 ++-- 1 file changed, 76 insertions(+), 10 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index

[PATCH 0/3] address hugetlb page allocation stalls

2019-08-02 Thread Mike Kravetz
] http://lkml.kernel.org/r/d38a095e-dc39-7e82-bb76-2c9247929...@oracle.com [2] http://lkml.kernel.org/r/20190724175014.9935-1-mike.krav...@oracle.com Hillf Danton (1): mm, reclaim: make should_continue_reclaim perform dryrun detection Mike Kravetz (1): hugetlbfs: don't retry when pool page alloc

Re: [PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-05 Thread Mike Kravetz
On 8/5/19 1:42 AM, Vlastimil Babka wrote: > On 8/3/19 12:39 AM, Mike Kravetz wrote: >> From: Hillf Danton >> >> Address the issue of should_continue_reclaim continuing true too often >> for __GFP_RETRY_MAYFAIL attempts when !nr_reclaimed and nr_scanned. >> This

Re: [PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-05 Thread Mike Kravetz
On 8/5/19 3:57 AM, Vlastimil Babka wrote: > On 8/5/19 10:42 AM, Vlastimil Babka wrote: >> On 8/3/19 12:39 AM, Mike Kravetz wrote: >>> From: Hillf Danton >>> >>> Address the issue of should_continue_reclaim continuing true too often >>> for __

Re: [PATCH 3/3] hugetlbfs: don't retry when pool page allocations start to fail

2019-08-05 Thread Mike Kravetz
On 8/5/19 2:28 AM, Vlastimil Babka wrote: > On 8/3/19 12:39 AM, Mike Kravetz wrote: >> When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages, >> the pages will be interleaved between all nodes of the system. If >> nodes are not equal, it is quite possible fo

[PATCH v2 0/4] address hugetlb page allocation stalls

2019-08-05 Thread Mike Kravetz
: mm, reclaim: make should_continue_reclaim perform dryrun detection Mike Kravetz (1): hugetlbfs: don't retry when pool page allocations start to fail Vlastimil Babka (2): mm, reclaim: cleanup should_continue_reclaim() mm, compaction: raise compaction priority after it withdrawns include

[PATCH v2 1/4] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-05 Thread Mike Kravetz
as we could. Cc: Mike Kravetz Cc: Mel Gorman Cc: Michal Hocko Cc: Vlastimil Babka Cc: Johannes Weiner Signed-off-by: Hillf Danton Tested-by: Mike Kravetz Acked-by: Mel Gorman Acked-by: Vlastimil Babka Signed-off-by: Mike Kravetz --- v2 - Updated commit message and added SOB. mm/vmscan.c

[PATCH v2 3/4] mm, compaction: raise compaction priority after it withdrawns

2019-08-05 Thread Mike Kravetz
From: Vlastimil Babka Mike Kravetz reports that "hugetlb allocations could stall for minutes or hours when should_compact_retry() would return true more often than it should. Specifically, this was in the case where compact_result was COMPACT_DEFERRED and COMPACT_PARTIAL_SKIPPED and no pro

[PATCH v2 4/4] hugetlbfs: don't retry when pool page allocations start to fail

2019-08-05 Thread Mike Kravetz
will still succeed if there is memory available, but it will not try as hard to free up memory. Signed-off-by: Mike Kravetz --- v2 - Removed __GFP_NORETRY from bit mask allocations and added more comments. OK to pass NULL to NODEMASK_FREE. mm/hugetlb.c | 89
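As an aside, a minimal sketch of the per-node strategy the changelog hints at (the helper name and exact flag handling are assumptions for illustration, not the patch itself): once an allocation fails on a node, remember it and make later attempts on that node cheap with __GFP_NORETRY.

	/* Illustrative sketch only, not the actual patch. */
	static struct page *alloc_pool_page_on_node(struct hstate *h, int nid,
						    nodemask_t *node_alloc_noretry)
	{
		gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_COMP | __GFP_THISNODE;
		struct page *page;

		if (node_isset(nid, *node_alloc_noretry))
			gfp_mask |= __GFP_NORETRY;	 /* a prior attempt failed here */
		else
			gfp_mask |= __GFP_RETRY_MAYFAIL; /* first attempt: try hard */

		page = alloc_pages_node(nid, gfp_mask, huge_page_order(h));
		if (!page)
			node_set(nid, *node_alloc_noretry);   /* stop trying hard here */
		else
			node_clear(nid, *node_alloc_noretry); /* node recovered */

		return page;
	}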

[PATCH v2 2/4] mm, reclaim: cleanup should_continue_reclaim()

2019-08-05 Thread Mike Kravetz
as been scanned" with nr_scanned == 0 didn't really work. Signed-off-by: Vlastimil Babka Acked-by: Mike Kravetz Signed-off-by: Mike Kravetz --- Commit message reformatted to avoid line wrap. mm/vmscan.c | 43 ++- 1 file changed, 14 insertions(+), 2

Re: [PATCH v5 0/7] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-09-26 Thread Mike Kravetz
and allocations? If a combined controller will work for new use cases, that would be my preference. Of course, I have not prototyped such a controller so there may be issues when we get into the details. For a reservation only or combined controller, the region_* changes proposed by Mina would be used. -- Mike Kravetz

Re: [PATCH v5 4/7] hugetlb: disable region_add file_region coalescing

2019-09-27 Thread Mike Kravetz
s_in_progress = 1  cache entries 1
- region_chg(3,4)  adds_in_progress = 2  cache entries 2
- region_chg(5,6)  adds_in_progress = 3  cache entries 3
At this point, no region descriptors are in the map because only region_chg has
been called.
- region_chg(0,6)  adds_in_progress = 4  cache entries 4
Is that correct so far?  Then the following sequence happens,
- region_add(1,2)  adds_in_progress = 3  cache entries 3
- region_add(3,4)  adds_in_progress = 2  cache entries 2
- region_add(5,6)  adds_in_progress = 1  cache entries 1
list of region descriptors is: [1->2] [3->4] [5->6]
- region_add(0,6)
This is going to require 3 cache entries but only one is in the cache.  I think
we are going to BUG in get_file_region_entry_from_cache() the second time it is
called from add_reservation_in_range().  I stopped looking at the code here as
things will need to change if this is a real issue.
-- Mike Kravetz
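For readers following the arithmetic, here is a tiny stand-alone model of the counts in the sequence above (a simplification for illustration only, not the kernel code; it assumes each region_chg() banks exactly one cache entry and each region_add() consumes one entry per descriptor it must insert):

	/* Toy model of the bookkeeping walked through above; not mm/hugetlb.c. */
	#include <stdio.h>

	static int adds_in_progress;	/* outstanding region_chg() calls        */
	static int cache_entries;	/* pre-allocated file_region descriptors */

	static void region_chg(int f, int t)
	{
		adds_in_progress++;
		cache_entries++;	/* each chg reserves one descriptor */
		printf("region_chg(%d,%d):  in_progress=%d  cache=%d\n",
		       f, t, adds_in_progress, cache_entries);
	}

	/* 'needed' = new descriptors this add must pull from the cache */
	static void region_add(int f, int t, int needed)
	{
		adds_in_progress--;
		if (needed > cache_entries) {
			printf("region_add(%d,%d): needs %d descriptors, cache has %d -> BUG\n",
			       f, t, needed, cache_entries);
			return;
		}
		cache_entries -= needed;
		printf("region_add(%d,%d):  in_progress=%d  cache=%d\n",
		       f, t, adds_in_progress, cache_entries);
	}

	int main(void)
	{
		region_chg(1, 2); region_chg(3, 4); region_chg(5, 6); region_chg(0, 6);
		region_add(1, 2, 1); region_add(3, 4, 1); region_add(5, 6, 1);
		/* With coalescing disabled, [0,6) must insert [0,1), [2,3) and [4,5):
		 * three new descriptors, but only one cache entry remains. */
		region_add(0, 6, 3);
		return 0;
	}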

Re: [PATCH v5 0/7] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-09-27 Thread Mike Kravetz
nly sticking point left is whether an added controller > can support both cgroup-v2 and cgroup-v1. If I could get confirmation > on that I'll provide a patchset. Sorry, but I can not provide cgroup expertise. -- Mike Kravetz

Re: [PATCH v5 0/7] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-09-27 Thread Mike Kravetz
On 9/27/19 3:51 PM, Mina Almasry wrote: > On Fri, Sep 27, 2019 at 2:59 PM Mike Kravetz wrote: >> >> On 9/26/19 5:55 PM, Mina Almasry wrote: >>> Provided we keep the existing controller untouched, should the new >>> controller track: >>> >>>

Re: [PATCH 5/5] hugetlbfs: Limit wait time when trying to share huge PMD

2019-09-12 Thread Mike Kravetz
ng stalls? If so, can you try the simple change of taking the semaphore in read mode in huge_pmd_share. -- Mike Kravetz

Re: [RFC PATCH v2 0/5] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-08-13 Thread Mike Kravetz
On 8/10/19 3:01 PM, Mina Almasry wrote: > On Sat, Aug 10, 2019 at 11:58 AM Mike Kravetz wrote: >> >> On 8/9/19 12:42 PM, Mina Almasry wrote: >>> On Fri, Aug 9, 2019 at 10:54 AM Mike Kravetz >>> wrote: >>>> On 8/8/19 4:13 PM, Mina Almasry wrote: >

Re: [RFC PATCH v2 4/5] hugetlb_cgroup: Add accounting for shared mappings

2019-08-13 Thread Mike Kravetz
if (!dry_run) { > + list_del(&rg->link); > + kfree(rg); Is it possible that the region struct we are deleting pointed to a reservation_counter? Perhaps even for another cgroup? Just concerned with the way regions are coalesced that we may be deleting counters. -- Mike Kravetz

Re: [RFC PATCH v2 4/5] hugetlb_cgroup: Add accounting for shared mappings

2019-08-14 Thread Mike Kravetz
On 8/13/19 4:54 PM, Mike Kravetz wrote: > On 8/8/19 4:13 PM, Mina Almasry wrote: >> For shared mappings, the pointer to the hugetlb_cgroup to uncharge lives >> in the resv_map entries, in file_region->reservation_counter. >> >> When a file_region entry is added to t

Re: [RFC PATCH v2 0/5] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-08-09 Thread Mike Kravetz
etermine how many reservations were actually consumed. I did not look closely enough to determine whether the code drops reservation usage counts as pages are added to shared mappings. -- Mike Kravetz

Re: [RFC PATCH] hugetlbfs: Add hugetlb_cgroup reservation limits

2019-08-09 Thread Mike Kravetz
pages, and will SIGBUS you when you try to access the remaining 2 > pages. So the problem persists. Folks would still like to know they > are crossing the limits on mmap time. If you got the failure at mmap time in the MAP_POPULATE case would this be useful? Just thinking that would be a relatively simple change. -- Mike Kravetz

Re: [RFC PATCH] hugetlbfs: Add hugetlb_cgroup reservation limits

2019-08-09 Thread Mike Kravetz
On 8/9/19 1:57 PM, Mina Almasry wrote: > On Fri, Aug 9, 2019 at 1:39 PM Mike Kravetz wrote: >> >> On 8/9/19 11:05 AM, Mina Almasry wrote: >>> On Fri, Aug 9, 2019 at 4:27 AM Michal Koutný wrote: >>>>> Alternatives considered: >>>>> [...] >

Re: [RFC PATCH v2 0/5] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-08-10 Thread Mike Kravetz
On 8/9/19 12:42 PM, Mina Almasry wrote: > On Fri, Aug 9, 2019 at 10:54 AM Mike Kravetz wrote: >> On 8/8/19 4:13 PM, Mina Almasry wrote: >>> Problem: >>> Currently tasks attempting to allocate more hugetlb memory than is >>> available get >>>

Re: [PATCH] mm, hugetlb: allow hugepage allocations to excessively reclaim

2019-10-07 Thread Mike Kravetz
b39d0ee2632d to cause regressions and noticeable behavior changes. My quick/limited testing in [1] was insufficient. It was also mentioned that if something like b39d0ee2632d went forward, I would like exemptions for __GFP_RETRY_MAYFAIL requests as in this patch. > > [mho...@suse.com: rewo

Re: [rfc] mm, hugetlb: allow hugepage allocations to excessively reclaim

2019-10-07 Thread Mike Kravetz
32d went forward there should be an exception for __GFP_RETRY_MAYFAIL requests. [1] https://lkml.kernel.org/r/3468b605-a3a9-6978-9699-57c52a90b...@oracle.com -- Mike Kravetz

Re: [PATCH] hugetlb: Fix clang compilation warning

2019-10-16 Thread Mike Kravetz
b.c:4055:40: note: place parentheses around the 'sizeof(u32)' > expression to silence this warning > hash = jhash2((u32 *)&key, sizeof(key)/sizeof(u32), 0); > ^ CC fs/ext4/ialloc.o > > Fix the warning adding parentheses aroun

Re: [PATCH V2] mm/page_alloc: Add alloc_contig_pages()

2019-10-16 Thread Mike Kravetz
s pretty straight forward, but the idea was to stress the underlying code. In fact, it did identify issues with isolation which were corrected. I exercised this new interface in the same way and am happy to report that no issues were detected. -- Mike Kravetz

Re: [PATCH] hugetlbfs: fix error handling in init_hugetlbfs_fs()

2019-10-17 Thread Mike Kravetz
hstate. It now does that for the '0' hstate, and 0 is not always equal to default_hstate_idx. David was that intentional or an oversight? I can fix up, just wanted to make sure there was not some reason for the change. -- Mike Kravetz

Re: [PATCH] hugetlbfs: fix error handling in init_hugetlbfs_fs()

2019-10-17 Thread Mike Kravetz
Sorry for noise, left off David On 10/17/19 5:08 PM, Mike Kravetz wrote: > Cc: David > On 10/17/19 3:38 AM, Chengguang Xu wrote: >> In order to avoid using incorrect mnt, we should set >> mnt to NULL when we get error from mount_one_hugetlbfs(). >> >> Signed-off-by

Re: [PATCH -next] userfaultfd: remove set but not used variable 'h'

2019-10-09 Thread Mike Kravetz
e commit 78911d0e18ac ("userfaultfd: use vma_pagesize > for all huge page size calculation") > Thanks! That should have been removed with the recent cleanups. > Signed-off-by: YueHaibing Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH -next] userfaultfd: remove set but not used variable 'h'

2019-10-09 Thread Mike Kravetz
On 10/9/19 6:23 PM, Wei Yang wrote: > On Wed, Oct 09, 2019 at 05:45:57PM -0700, Mike Kravetz wrote: >> On 10/9/19 5:27 AM, YueHaibing wrote: >>> Fixes gcc '-Wunused-but-set-variable' warning: >>> >>> mm/userfaultfd.c: In function '__mcopy_atomic_hugetlb':

Re: [PATCH -next] userfaultfd: remove set but not used variable 'h'

2019-10-09 Thread Mike Kravetz
On 10/9/19 8:30 PM, Wei Yang wrote: > On Wed, Oct 09, 2019 at 07:25:18PM -0700, Mike Kravetz wrote: >> On 10/9/19 6:23 PM, Wei Yang wrote: >>> On Wed, Oct 09, 2019 at 05:45:57PM -0700, Mike Kravetz wrote: >>>> On 10/9/19 5:27 AM, YueHaibing wrote: >>>

[PATCH v4 1/4] hugetlbfs: add arch_hugetlb_valid_size

2020-04-28 Thread Mike Kravetz
of the "hugepagesz=" in arch specific code to a common routine in arch independent code. Signed-off-by: Mike Kravetz Acked-by: Gerald Schaefer [s390] Acked-by: Will Deacon --- arch/arm64/mm/hugetlbpage.c | 17 + arch/powerpc/mm/hugetlbpage.c | 20 +--- arc

[PATCH v4 0/4] Clean up hugetlb boot command line processing

2020-04-28 Thread Mike Kravetz
independent routine. - Clean up command line processing to follow desired semantics and document those semantics. [1] https://lore.kernel.org/linux-mm/20200305033014.1152-1-longpe...@huawei.com Mike Kravetz (4): hugetlbfs: add arch_hugetlb_valid_size hugetlbfs: move hugepagesz= parsing to arch

[PATCH v4 3/4] hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate

2020-04-28 Thread Mike Kravetz
processing "hugepagesz=". After this, calls to size_to_hstate() in arch specific code can be removed and hugetlb_add_hstate can be called without worrying about warning messages. Signed-off-by: Mike Kravetz Acked-by: Mina Almasry Acked-by: Gerald Schaefer [s390] Acked-by: Will Deacon Test

[PATCH v4 4/4] hugetlbfs: clean up command line processing

2020-04-28 Thread Mike Kravetz
the bootmem allocator required for gigantic allocations is not available at this time. Signed-off-by: Mike Kravetz Acked-by: Gerald Schaefer [s390] Acked-by: Will Deacon Tested-by: Sandipan Das --- .../admin-guide/kernel-parameters.txt | 40 +++-- Documentation/admin-guide/mm

[PATCH v4 2/4] hugetlbfs: move hugepagesz= parsing to arch independent code

2020-04-28 Thread Mike Kravetz
ed by some architectures to set up ALL huge pages sizes. Signed-off-by: Mike Kravetz Acked-by: Mina Almasry Reviewed-by: Peter Xu Acked-by: Gerald Schaefer [s390] Acked-by: Will Deacon --- arch/arm64/mm/hugetlbpage.c | 15 --- arch/powerpc/mm/hugetlbpage.c | 15 ---

Re: [PATCH] hugetlbfs: add O_TMPFILE support

2019-10-22 Thread Mike Kravetz
On 10/22/19 12:09 AM, Piotr Sarna wrote: > On 10/21/19 7:17 PM, Mike Kravetz wrote: >> On 10/15/19 4:37 PM, Mike Kravetz wrote: >>> On 10/15/19 3:50 AM, Michal Hocko wrote: >>>> On Tue 15-10-19 11:01:12, Piotr Sarna wrote: >>>>> With hugetlbfs, a co

Re: [PATCH v5 0/7] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-10-14 Thread Mike Kravetz
On 10/11/19 1:41 PM, Mina Almasry wrote: > On Fri, Oct 11, 2019 at 12:10 PM Mina Almasry wrote: >> >> On Mon, Sep 23, 2019 at 10:47 AM Mike Kravetz >> wrote: >>> >>> On 9/19/19 3:24 PM, Mina Almasry wrote: >> >> Mike, note your suggestion a

Re: [PATCH] hugetlb: Add nohugepages parameter to prevent hugepages creation

2019-10-14 Thread Mike Kravetz
n. This would use the existing code to prevent all hugetlb usage. It seems like there may be some discussion about 'the right' way to do kdump. I can't add to that discussion, but if such an option as nohugepages is needed, I can help. -- Mike Kravetz

Re: [PATCH v1] hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic()

2019-10-15 Thread Mike Kravetz
p22.suse.cz > > Reported-by: Michal Hocko > Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to > zones until online") # visible after d0dc12e86b319 > Cc: sta...@vger.kernel.org # v4.13+ > Cc: Anshuman Khandual > Cc: Mike Kravetz > Cc:

Re: [PATCH] hugetlbfs: add O_TMPFILE support

2019-10-15 Thread Mike Kravetz
s implemented. So, that is why it does not make (more) use of that option. The implementation looks to be straight forward. However, I really do not want to add more functionality to hugetlbfs unless there is specific use case that needs it. -- Mike Kravetz

Re: [PATCH v5 0/7] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-09-23 Thread Mike Kravetz
ould like to get feedback from anyone that knows how the existing hugetlb cgroup controller may be used today. Comments from Aneesh would be very welcome to know if reservations were considered in development of the existing code. -- Mike Kravetz

Re: [PATCH v5 0/7] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-09-23 Thread Mike Kravetz
On 9/23/19 12:18 PM, Mina Almasry wrote: > On Mon, Sep 23, 2019 at 10:47 AM Mike Kravetz wrote: >> >> On 9/19/19 3:24 PM, Mina Almasry wrote: >>> Patch series implements hugetlb_cgroup reservation usage and limits, which >>> track hugetlb reservations r

Re: [rfc 3/4] mm, page_alloc: avoid expensive reclaim when compaction may not succeed

2019-09-05 Thread Mike Kravetz
es causes allocations to fail sooner in the case of COMPACT_DEFERRED: http://lkml.kernel.org/r/20190806014744.15446-4-mike.krav...@oracle.com hugetlb allocations have the __GFP_RETRY_MAYFAIL flag set. They are willing to retry and wait and callers are aware of this. Even though my limited testing did not show regressions caused by this patch, I would prefer if the quick exit did not apply to __GFP_RETRY_MAYFAIL requests. -- Mike Kravetz

Re: [PATCH v1 2/6] mm/page_isolation: don't dump_page(NULL) in set_migratetype_isolate()

2020-07-29 Thread Mike Kravetz
y, these races are rare and I had to work really hard to produce them. I'll try to find my testing mechanism. My concern is reintroducing this abandoning of pageblocks. I have not looked further in your series to see if this potentially addressed later. If not, then we should not remove t

Re: [PATCH v1 2/6] mm/page_isolation: don't dump_page(NULL) in set_migratetype_isolate()

2020-07-29 Thread Mike Kravetz
urn -EBUSY' removed from the comment and assumed the code would not return an error code. The code now more explicitly does return -EBUSY. My concern was when I incorrectly thought you were removing the error return code. Sorry for the noise. Acked-by: Mike Kravetz -- Mike Kravetz

[PATCH] cma: don't quit at first error when activating reserved areas

2020-07-30 Thread Mike Kravetz
x4d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator") Signed-off-by: Mike Kravetz Cc: --- mm/cma.c | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/mm/cma.c b/mm/cma.c index

Re: [PATCH v2] mm/hugetlb: Fix calculation of adjust_range_if_pmd_sharing_possible

2020-07-30 Thread Mike Kravetz
roblem in the existing code that needs to be fixed in stable. I think the existing code is correct, just inefficient. -- Mike Kravetz

Re: [PATCH v2] mm/hugetlb: Fix calculation of adjust_range_if_pmd_sharing_possible

2020-07-30 Thread Mike Kravetz
On 7/30/20 4:26 PM, Peter Xu wrote: > Hi, Mike, > > On Thu, Jul 30, 2020 at 02:49:18PM -0700, Mike Kravetz wrote: >> On 7/30/20 1:16 PM, Peter Xu wrote: >>> This is found by code observation only. >>> >>> Firstly, the worst case scenario should assume t

Re: [PATCH v3 3/8] mm/hugetlb: unify migration callbacks

2020-06-24 Thread Mike Kravetz
> Signed-off-by: Joonsoo Kim Thanks for consolidating these. Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH 2/3] mm/huge_memory.c: update tlb entry if pmd is changed

2020-06-24 Thread Mike Kravetz
itectures. arc looks like it could be problematic as update_mmu_cache_pmd calls update_mmu_cache and then operates on (address & PAGE_MASK). That could now be different. -- Mike Kravetz > > Signed-off-by: Bibo Mao > --- > mm/huge_memory.c | 2 ++ > 1 file chang

Re: [PATCH 2/3] mm/huge_memory.c: update tlb entry if pmd is changed

2020-06-25 Thread Mike Kravetz
On 6/25/20 5:01 AM, Aneesh Kumar K.V wrote: > Mike Kravetz writes: > >> On 6/24/20 2:26 AM, Bibo Mao wrote: >>> When set_pmd_at is called in function do_huge_pmd_anonymous_page, >>> new tlb entry can be added by software on MIPS platform. >>> >>&

Re: [hugetlbfs] c0d0381ade: vm-scalability.throughput -33.4% regression

2020-06-25 Thread Mike Kravetz
On 6/22/20 3:01 PM, Mike Kravetz wrote: > On 6/21/20 5:55 PM, kernel test robot wrote: >> Greeting, >> >> FYI, we noticed a -33.4% regression of vm-scalability.throughput due to >> commit: >> >> >> commit: c0d0381ade79885c04a04c303284b040616b116e (

Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-20 Thread Mike Kravetz
On 7/19/20 11:22 PM, Anshuman Khandual wrote: > > > On 07/17/2020 10:32 PM, Mike Kravetz wrote: >> On 7/16/20 10:02 PM, Anshuman Khandual wrote: >>> >>> >>> On 07/16/2020 11:55 PM, Mike Kravetz wrote: >>>> >From 17c8f37afbf42fe7412e6eeb

Re: [PATCH 1/5] mm/hugetlb.c: Fix typo of glb_reserve

2020-07-20 Thread Mike Kravetz
ma(vma); > unsigned long reserve, start, end; > - long gbl_reserve; > + long glb_reserve; I see both 'gbl' and 'glb' being used for global in variable names. grep will actually return more hits for gbl than glb. Unless there is consensus that 'glb' should be used for glo

Re: [PATCH 2/5] mm/hugetlb.c: make is_hugetlb_entry_hwpoisoned return bool

2020-07-20 Thread Mike Kravetz
On 7/19/20 11:26 PM, Baoquan He wrote: > Just like his neighbour is_hugetlb_entry_migration() has done. > > Signed-off-by: Baoquan He Thanks, Reviewed-by: Mike Kravetz -- Mike Kravetz > --- > mm/hugetlb.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) &g

Re: [PATCH 3/5] mm/hugetlb.c: Remove the unnecessary non_swap_entry()

2020-07-20 Thread Mike Kravetz
entry_migration() > and is_hugetlb_entry_hwpoisoned() to simplify code. > > Signed-off-by: Baoquan He Agreed, we can remove the checks for non_swap_entry. Reviewed-by: Mike Kravetz -- Mike Kravetz > --- > mm/hugetlb.c | 4 ++-- > 1 file changed, 2 insertions(+), 2

Re: [PATCH 4/5] doc/vm: fix typo in in the hugetlb admin documentation

2020-07-20 Thread Mike Kravetz
onally be followed by the hugepages parameter to preallocate a > specific number of huge pages of default size. The number of default > Unfortunately, this review comment was missed when the typo was introduced. https://lore.kernel.org/lkml/5ca27419-7496-8799-aeed-3042c9770...@o

Re: [PATCH 5/5] mm/hugetl.c: warn out if expected count of huge pages adjustment is not achieved

2020-07-20 Thread Mike Kravetz
> old_max ? "increased" : "decreased",
> +			abs(old_max - h->max_huge_pages));
> +	}
> 	spin_unlock(&hugetlb_lock);

I would prefer if we drop the lock before logging the message.  That would
involve grabbing the value of h->max_huge_pages before dropping the lock.
-- Mike Kravetz

> 	NODEMASK_FREE(node_alloc_noretry);
>
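A rough sketch of the reordering being suggested (the message text and local variable are illustrative assumptions, not the eventual patch): capture h->max_huge_pages while hugetlb_lock is still held, drop the lock, and only then format and print the warning.

	/* sketch only */
	unsigned long final_count = h->max_huge_pages;

	spin_unlock(&hugetlb_lock);

	if (count != final_count) {
		char buf[32];

		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
		pr_warn("HugeTLB: %s pool adjusted to %lu pages, %lu requested\n",
			buf, final_count, count);
	}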

Re: linux-next: build failure after merge of the akpm-current tree

2020-07-21 Thread Mike Kravetz
] mm/hugetlb: better checks before using hugetlb_cma > > Signed-off-by: Stephen Rothwell Thanks Stephen, sorry for missing that in review. Acked-by: Mike Kravetz -- Mike Kravetz > --- > mm/hugetlb.c | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff

Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-27 Thread Mike Kravetz
you have multiple gigantic page sizes supported at one time (one system instance) on powerpc? -- Mike Kravetz

Re: [PATCH v3] mm/hugetlb: add mempolicy check in the reservation routine

2020-07-27 Thread Mike Kravetz
olicy. This new code will help produce a quick failure as described in the commit message, and it does not make existing interactions any worse. Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH v3] mm/hugetlb: add mempolicy check in the reservation routine

2020-07-27 Thread Mike Kravetz
ff81060f80 <__fentry__> 0x8126a3a5 <+5>: xor%eax,%eax 0x8126a3a7 <+7>: mov%gs:0x17bc0,%rdx 0x8126a3b0 <+16>:testb $0x1,0x778(%rdx) 0x8126a3b7 <+23>:jne0x8126a3ba 0x8126a3b9 <+25>:retq 0x8126a3ba <+26>:mov0x6c(%rdi),%eax 0x8126a3bd <+29>:retq End of assembler dump. -- Mike Kravetz

Re: [PATCH v4] mm/hugetlb: add mempolicy check in the reservation routine

2020-07-28 Thread Mike Kravetz
patch summarizes the issues. IMO, at this time it makes little sense to perform checks for more than MPOL_BIND at reservation time. If we ever take on the monumental task of supporting mempolicy directed per-node reservations throughout the life of a process, support for other policies will need to be taken into account. -- Mike Kravetz

Re: [PATCH] mm/hugetlb: hide nr_nodes in the internal of for_each_node_mask_to_[alloc|free]

2020-07-14 Thread Mike Kravetz
>>> [-Wdeclaration-after-statement]
>>>
>>> Instead we should switch to C99 and declare it as "for (int __nr_nodes" :P
>>
>> Hmm... I tried what you suggested, but compiler complains.
>>
>> 'for' loop initial declarations are only allowed in C99 or C11 mode
>
> Yes, by "we should switch to C99" I meant that the kernel kbuild system would
> need to switch. Not a trivial change...
> Without that, I don't see how your patch is possible to do safely.

Vlastimil, thanks for pointing out future potential issues with this patch.  I
likely would have missed that.

Wei, thanks for taking the time to put together the patch.  However, I tend to
agree with Vlastimil's assessment.  The cleanup is not worth the risk of running
into issues if someone uses multiple instances of the macro.
-- Mike Kravetz

Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-14 Thread Mike Kravetz
Gushchin > Cc: Catalin Marinas > Cc: Will Deacon > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: Borislav Petkov > Cc: H. Peter Anvin > Cc: Mike Kravetz > Cc: Mike Rapoport > Cc: Andrew Morton > Cc: Anshuman Khandual > Cc: Jonathan Cameron > Signed-o

Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-15 Thread Mike Kravetz
On 7/15/20 4:14 AM, Song Bao Hua (Barry Song) wrote: >> From: Mike Kravetz [mailto:mike.krav...@oracle.com] >> huge_page_size(h)/1024); >> >> +if (order >= MAX_ORDER && hugetlb_cma_size) >> +hugetlb_cma

Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-15 Thread Mike Kravetz
this seems too simple we can revisit that decision.
>> +
>>  	parsed_hstate = h;
>>  }
>>
>> @@ -5647,7 +5650,10 @@ void __init hugetlb_cma_reserve(int order)
>>  	unsigned long size, reserved, per_node;
>>  	int nid;
>>
>> -	cma_reserve_called = true;
>> +	if (cma_reserve_called)
>> +		return;
>> +	else
>> +		cma_reserve_called = true;
>
> (nit: don't need the 'else' here)

Yes, duh!
-- Mike Kravetz
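Applying the nit, the guard without the redundant 'else' would simply read:

	if (cma_reserve_called)
		return;
	cma_reserve_called = true;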

Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-17 Thread Mike Kravetz
On 7/16/20 10:02 PM, Anshuman Khandual wrote: > > > On 07/16/2020 11:55 PM, Mike Kravetz wrote: >> >From 17c8f37afbf42fe7412e6eebb3619c6e0b7e1c3c Mon Sep 17 00:00:00 2001 >> From: Mike Kravetz >> Date: Tue, 14 Jul 2020 15:54:46 -0700 >> Subject: [PATCH] h

Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-17 Thread Mike Kravetz
On 7/17/20 2:51 AM, Anshuman Khandual wrote: > > > On 07/17/2020 02:06 PM, Will Deacon wrote: >> On Fri, Jul 17, 2020 at 10:32:53AM +0530, Anshuman Khandual wrote: >>> >>> >>> On 07/16/2020 11:55 PM, Mike Kravetz wrote: >>>> >From 17c8f37a

Re: [PATCH v2 4/4] mm/hugetl.c: warn out if expected count of huge pages adjustment is not achieved

2020-07-23 Thread Mike Kravetz
. I do not feel strongly one way or another about adding the warning. Since it is fairly trivial and could help diagnose issues I am in favor of adding it. If people feel strongly that it should not be added, I am open to those arguments. -- Mike Kravetz

Re: [PATCH v2] mm/hugetlb: add mempolicy check in the reservation routine

2020-07-24 Thread Mike Kravetz
goto out; > } There is a big comment before this code in hugetlb_acct_memory. The comment only talks about cpusets. We should probably update that to include mempolicy as well. It could be as simple as s/cpuset/cpuset or mempolicy/. -- Mike Kravetz

Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory

2020-07-16 Thread Mike Kravetz
On 7/16/20 1:12 AM, Will Deacon wrote: > On Wed, Jul 15, 2020 at 09:59:24AM -0700, Mike Kravetz wrote: >> >> So, everything in the existing code really depends on the hugetlb definition >> of gigantic page (order >= MAX_ORDER). The code to check for >> 'order >

Re: [PATCH v4 03/15] mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED

2020-07-16 Thread Mike Kravetz
Signed-off-by: Naoya Horiguchi > Signed-off-by: Oscar Salvador Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH 5/5] mm/hugetl.c: warn out if expected count of huge pages adjustment is not achieved

2020-07-22 Thread Mike Kravetz
On 7/22/20 1:49 AM, Baoquan He wrote: > On 07/20/20 at 05:38pm, Mike Kravetz wrote: >>> + if (count != h->max_huge_pages) { >>> + char buf[32]; >>> + >>> + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32); >>> +

Re: [PATCH 2/3] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2019-06-14 Thread Mike Kravetz
s patch? > I hope you do nothing with this as the patch is not upstream. -- Mike Kravetz

Re: LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)

2019-06-27 Thread Mike Kravetz
On 6/24/19 2:53 PM, Mike Kravetz wrote: > On 6/24/19 2:30 PM, Qian Cai wrote: >> So the problem is that ipcget_public() has held the semaphore "ids->rwsem" >> for >> too long seems unnecessarily and then goes to sleep sometimes due to direct >> rec

Re: [Question] Should direct reclaim time be bounded?

2019-06-28 Thread Mike Kravetz
On 4/24/19 7:35 AM, Vlastimil Babka wrote: > On 4/23/19 6:39 PM, Mike Kravetz wrote: >>> That being said, I do not think __GFP_RETRY_MAYFAIL is wrong here. It >>> looks like there is something wrong in the reclaim going on. >> >> Ok, I will start digging into that.

Re: [Question] Should direct reclaim time be bounded?

2019-07-01 Thread Mike Kravetz
On 7/1/19 1:59 AM, Mel Gorman wrote: > On Fri, Jun 28, 2019 at 11:20:42AM -0700, Mike Kravetz wrote: >> On 4/24/19 7:35 AM, Vlastimil Babka wrote: >>> On 4/23/19 6:39 PM, Mike Kravetz wrote: >>>>> That being said, I do not think __GFP_RETRY_MAYFAIL i

Re: question: should_compact_retry limit

2019-06-05 Thread Mike Kravetz
On 6/5/19 12:58 AM, Vlastimil Babka wrote: > On 6/5/19 1:30 AM, Mike Kravetz wrote: >> While looking at some really long hugetlb page allocation times, I noticed >> instances where should_compact_retry() was returning true more often than >> I expected. In one allocation atte

question: should_compact_retry limit

2019-06-04 Thread Mike Kravetz
; } Just curious, is this intentional? -- Mike Kravetz

Re: [Question] Should direct reclaim time be bounded?

2019-07-03 Thread Mike Kravetz
TRY and back to hopefully take into account transient conditions. >From 528c52397301f02acb614c610bd65f0f9a107481 Mon Sep 17 00:00:00 2001 From: Mike Kravetz Date: Wed, 3 Jul 2019 13:36:24 -0700 Subject: [PATCH] hugetlbfs: don't retry when pool page allocations start to fail When allocating hugetlbfs pool

Re: [Question] Should direct reclaim time be bounded?

2019-07-04 Thread Mike Kravetz
On 7/4/19 4:09 AM, Michal Hocko wrote: > On Wed 03-07-19 16:54:35, Mike Kravetz wrote: >> On 7/3/19 2:43 AM, Mel Gorman wrote: >>> Indeed. I'm getting knocked offline shortly so I didn't give this the >>> time it deserves but it appears that part of this problem is &g

Re: [PATCH v2 2/2] mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge

2019-06-11 Thread Mike Kravetz
g dissolve_free_huge_page(). dissolve_free_huge_pages is called as part of memory offline processing. We do not know if the memory to be offlined contains huge pages or not. With your changes, we are taking hugetlb_lock on each call to dissolve_free_huge_page just to discover that the page is not a huge page. You 'could' add a PageHuge(page) check to dissolve_free_huge_page before taking the lock. However, you would need to check again after taking the lock. -- Mike Kravetz
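The check-then-recheck idea described here is a common pattern; a hedged sketch (not the actual kernel change) could look like:

	/* Sketch only: cheap lockless test to skip obvious non-huge pages, plus a
	 * mandatory re-test under hugetlb_lock since the page can change state
	 * between the two checks. */
	static int dissolve_free_huge_page_sketch(struct page *page)
	{
		int rc = 0;

		if (!PageHuge(page))		/* lockless fast path */
			return 0;

		spin_lock(&hugetlb_lock);
		if (!PageHuge(page))		/* re-check now that we hold the lock */
			goto out;

		/* ... existing dissolve logic would run here ... */
	out:
		spin_unlock(&hugetlb_lock);
		return rc;
	}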

Re: [PATCH v3 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

2019-06-18 Thread Mike Kravetz
origuchi Thanks for the updates, Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH v3 2/2] mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge

2019-06-18 Thread Mike Kravetz
for > dissolving, where we should return success for !PageHuge() case because > the given hugepage is considered as already dissolved. > > This change also affects other callers of dissolve_free_huge_page(), > which are cleaned up together. > > Reported-by: Chen, Jerry T > Tested-by: Chen, Jerry T > Signed-off-by: Naoya Horiguchi Thanks, Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH v5] hugetlbfs: Get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs

2020-05-15 Thread Mike Kravetz
In reality, this does not impact powerpc as that architecture has its own
hugetlb_get_unmapped_area routine.  Because of this, I suggest we add a comment
above this code and switch the if/else order.  For example,

+	/*
+	 * Use mm->get_unmapped_area value as a hint to use topdown routine.
+	 * If architectures have special needs, they should define their own
+	 * version of hugetlb_get_unmapped_area.
+	 */
+	if (mm->get_unmapped_area == arch_get_unmapped_area_topdown)
+		return hugetlb_get_unmapped_area_topdown(file, addr, len,
+				pgoff, flags);
+	return hugetlb_get_unmapped_area_bottomup(file, addr, len,
+			pgoff, flags);

Thoughts?
-- Mike Kravetz

> }
> #endif
>

Re: kernel BUG at mm/hugetlb.c:LINE!

2020-05-15 Thread Mike Kravetz
On 5/12/20 11:11 AM, Mike Kravetz wrote: > On 5/12/20 8:04 AM, Miklos Szeredi wrote: >> On Tue, Apr 7, 2020 at 12:06 AM Mike Kravetz wrote: >>> On 4/5/20 8:06 PM, syzbot wrote: >>> >>> The routine is_file_hugepages() is just comparing the file ops to huegt
