Re: [PATCH] mm,hugetlb: compute page_size_log properly

2017-07-17 Thread Mike Kravetz
the shm user space definitions in the uapi file as previously suggested by Matthew Wilcox. I did not (yet) move the shm definitions to arch specific files as suggested by Aneesh Kumar. [1] https://lkml.org/lkml/2017/7/6/564 Mike Kravetz (3): mm:hugetlb: Define system call hugetlb size encodings

[RFC PATCH 2/3] mm: arch: Use new hugetlb size encoding definitions

2017-07-17 Thread Mike Kravetz
Use the common definitions from hugetlb_encode.h header file for encoding hugetlb size definitions in mmap system call flags. Signed-off-by: Mike Kravetz <mike.krav...@oracle.com> --- arch/alpha/include/uapi/asm/mman.h | 14 ++ arch/mips/include/uapi/asm/mman.h

[RFC PATCH 1/3] mm:hugetlb: Define system call hugetlb size encodings in single file

2017-07-17 Thread Mike Kravetz
for these encodings. Put common definitions in a single header file. arch specific code can still override if desired. Signed-off-by: Mike Kravetz <mike.krav...@oracle.com> --- include/uapi/asm-generic/hugetlb_encode.h | 30 ++ 1 file changed, 30 insertions(+) creat
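As a rough illustration of the encoding discussed in this series: the huge page size is passed as log2(size) in a 6-bit field of the system call flags. The sketch below assumes the shift of 26 and mask of 0x3f found in the upstream hugetlb_encode.h; the sketch_* names are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Assumed values, mirroring upstream include/uapi/asm-generic/hugetlb_encode.h. */
#define SKETCH_HUGETLB_ENCODE_SHIFT 26
#define SKETCH_HUGETLB_ENCODE_MASK  0x3fu

/*
 * Encode a power-of-two huge page size (in bytes) as flag bits.
 * Returns 0 when the size is zero or not a power of two.
 */
static unsigned int sketch_hugetlb_encode(uint64_t page_size)
{
    unsigned int log2 = 0;

    if (page_size == 0 || (page_size & (page_size - 1)) != 0)
        return 0;
    while ((page_size >> log2) != 1)
        log2++;
    return (log2 & SKETCH_HUGETLB_ENCODE_MASK) << SKETCH_HUGETLB_ENCODE_SHIFT;
}

/* Recover the page size in bytes from flag bits; 0 means "no size encoded". */
static uint64_t sketch_hugetlb_decode(unsigned int flags)
{
    unsigned int log2 = (flags >> SKETCH_HUGETLB_ENCODE_SHIFT) &
                        SKETCH_HUGETLB_ENCODE_MASK;

    return log2 ? (uint64_t)1 << log2 : 0;
}
```

With these assumptions a 2 MB page encodes as 21 << 26, which matches the value of MAP_HUGE_2MB in the mmap headers.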

Re: [PATCH] mm/mremap: Fail map duplication attempts for private mappings

2017-07-19 Thread Mike Kravetz
On 07/13/2017 12:11 PM, Vlastimil Babka wrote: > [+CC linux-api] > > On 07/13/2017 05:58 PM, Mike Kravetz wrote: >> mremap will create a 'duplicate' mapping if old_size == 0 is >> specified. Such duplicate mappings make no sense for private >> mappings. If duplication

Re: [PATCH] selftests/vm: Add test to validate mirror functionality with mremap

2017-07-20 Thread Mike Kravetz
On 07/20/2017 02:36 AM, Anshuman Khandual wrote: > This adds a test to validate mirror functionality with mremap() > system call on shared anon mappings. > > Suggested-by: Mike Kravetz <mike.krav...@oracle.com> > Signed-off-by: Anshuman Khandual <khand...@linux.vnet

Re: [PATCH v2] mm/mremap: Fail map duplication attempts for private mappings

2017-07-21 Thread Mike Kravetz
On 07/21/2017 07:36 AM, Michal Hocko wrote: > On Thu 20-07-17 13:37:59, Mike Kravetz wrote: >> mremap will create a 'duplicate' mapping if old_size == 0 is >> specified. Such duplicate mappings make no sense for private >> mappings. > > sorry for the nit picking bu

Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions

2017-07-27 Thread Mike Kravetz
On 07/27/2017 12:50 AM, Michal Hocko wrote: > On Wed 26-07-17 10:39:30, Mike Kravetz wrote: >> On 07/26/2017 03:07 AM, Michal Hocko wrote: >>> On Wed 26-07-17 11:53:38, Michal Hocko wrote: >>>> On Mon 17-07-17 15:28:01, Mike Kravetz wrote: >>>>> Use

Re: [PATCH V3] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level

2017-07-27 Thread Mike Kravetz
this assume that there is a pre-allocated gigantic page sitting unused that will be the target of the migration? alloc_huge_page_node will not allocate a gigantic page. Or, am I missing something? -- Mike Kravetz > > While allocating the new gigantic HugeTLB page, it should not matter > whethe

Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions

2017-07-26 Thread Mike Kravetz
On 07/26/2017 03:07 AM, Michal Hocko wrote: > On Wed 26-07-17 11:53:38, Michal Hocko wrote: >> On Mon 17-07-17 15:28:01, Mike Kravetz wrote: >>> Use the common definitions from hugetlb_encode.h header file for >>> encoding hugetlb size definitions in shmget system c

Re: [PATCH 1/1] mm/hugetlb: Make huge_pte_offset() consistent and document behaviour

2017-07-26 Thread Mike Kravetz
ion of huge_pte_offset might have different semantics. I have not reviewed all the arch specific instances of the routine to know if this is even possible. Just curious if you examined these, or perhaps you think this is not an issue? -- Mike Kravetz

[PATCH v2] mm/mremap: Fail map duplication attempts for private mappings

2017-07-20 Thread Mike Kravetz
-by: Mike Kravetz <mike.krav...@oracle.com> --- mm/mremap.c | 9 + 1 file changed, 9 insertions(+) diff --git a/mm/mremap.c b/mm/mremap.c index cd8a1b1..949f6a7 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -383,6 +383,15 @@ static struct vm_area_struct *vma_to_resize(unsigned long

[RFC PATCH 0/1] mm/mremap: add MREMAP_MIRROR flag

2017-07-06 Thread Mike Kravetz
of a huge page mapping, but it cannot be used to expand or mirror a mapping. Such support is fairly straightforward. [1] https://lkml.org/lkml/2004/1/12/260 Mike Kravetz (1): mm/mremap: add MREMAP_MIRROR flag for existing mirroring functionality include/uapi/linux/mman.h | 5

[RFC PATCH 1/1] mm/mremap: add MREMAP_MIRROR flag for existing mirroring functionality

2017-07-06 Thread Mike Kravetz
be specified. Signed-off-by: Mike Kravetz <mike.krav...@oracle.com> --- include/uapi/linux/mman.h | 5 +++-- mm/mremap.c | 23 --- tools/include/uapi/linux/mman.h | 5 +++-- 3 files changed, 22 insertions(+), 11 deletions(-) diff --git a/include/uapi

Re: [PATCH v3] powerpc/mm: Implemented default_hugepagesz verification for powerpc

2017-08-04 Thread Mike Kravetz
lized in this function [-Wmaybe-uninitialized] You have added a way of getting out of that big if/else if statement without setting mhp. mhp will be examined later in the code, so this is indeed a bug. Like Aneesh, I am not sure if there is great benefit in this patch. You added this change in functionality only for powerpc. IMO, it would be best if behavior was consistent in all architectures. So, if we change it for powerpc we may want to change everywhere. -- Mike Kravetz

Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK

2017-08-04 Thread Mike Kravetz
[ 694.603984] Code: 60 f3 c5 81 e8 2e 7e 03 00 0f 0b 48 c7 c6 60 f3 c5 81 4c 89 e7 e8 1d 7e 03 00 0f 0b 48 c7 c6 00 f4 c5 81 4c 89 e7 e8 0c 7e 03 00 <0f> 0b 48 c7 c6 38 f3 c5 81 4c 89 e7 e8 fb 7d 03 00 0f 0b 48 c7 [ 694.606500] RIP: __delete_from_page_cache+0x344/0x410 RS

Re: gigantic hugepages vs. movable zones

2017-07-28 Thread Mike Kravetz
I did some simple smoke testing of allocating 1G pages with the new code and ensuring they ended up as expected. Reviewed-by: Mike Kravetz <mike.krav...@oracle.com> -- Mike Kravetz > --- > mm/hugetlb.c | 35 --- > 1 file changed, 20 insertions(+

Re: [PATCH -mm] mm: Clear to access sub-page last when clearing huge page

2017-08-07 Thread Mike Kravetz
hugetlb faults as well. hugetlb_fault is the only caller of hugetlb_no_page, so this should be pretty straightforward. Were you thinking of additional improvements? -- Mike Kravetz

Re: [PATCH v2 0/2] mm,fork,security: introduce MADV_WIPEONFORK

2017-08-07 Thread Mike Kravetz
into anonymous mappings? What happens to file references? What about the really ugly case of hugetlb mappings? Do they get 'transformed' to non-hugetlb mappings? Or, do you create a separate hugetlb mapping for the child? -- Mike Kravetz

Re: [PATCH v2 0/2] mm,fork,security: introduce MADV_WIPEONFORK

2017-08-08 Thread Mike Kravetz
On 08/08/2017 06:15 AM, Rik van Riel wrote: > On Tue, 2017-08-08 at 11:58 +0200, Florian Weimer wrote: >> On 08/07/2017 08:23 PM, Mike Kravetz wrote: >>> If my thoughts above are correct, what about returning EINVAL if >>> one >>> attempts to set MADV_DON

[RFC PATCH 0/1] Add hugetlbfs support to memfd_create()

2017-08-07 Thread Mike Kravetz
/564 Mike Kravetz (1): mm/shmem: add hugetlbfs support to memfd_create() include/uapi/linux/memfd.h | 24 mm/shmem.c | 37 +++-- 2 files changed, 55 insertions(+), 6 deletions(-) -- 2.7.5

[RFC PATCH 1/1] mm/shmem: add hugetlbfs support to memfd_create()

2017-08-07 Thread Mike Kravetz
to encode huge page size in the flag arguments. hugetlbfs does not support sealing operations, therefore specifying MFD_ALLOW_SEALING with MFD_HUGETLB will result in EINVAL. Signed-off-by: Mike Kravetz <mike.krav...@oracle.com> --- include/uapi/linux/memfd.h | 24 ++
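A minimal user-space sketch of the interface described here, assuming a glibc that provides the memfd_create() wrapper and a kernel with MFD_HUGETLB support. The helper name sketch_memfd_hugetlb is hypothetical. The snippet above says MFD_ALLOW_SEALING combined with MFD_HUGETLB is rejected with EINVAL; a later series in this listing adds sealing, so the sketch deliberately avoids depending on either behaviour.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallback for older headers; 0x0004U is the value in upstream linux/memfd.h. */
#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U
#endif

/*
 * Create an anonymous hugetlbfs-backed file using the default huge page size.
 * Returns a file descriptor on success or a negative errno value on failure
 * (for example when the kernel lacks hugetlbfs support).
 */
static int sketch_memfd_hugetlb(const char *name)
{
    int fd = memfd_create(name, MFD_HUGETLB);

    return fd >= 0 ? fd : -errno;
}
```

A caller would then ftruncate() the descriptor to a multiple of the huge page size and mmap() it; those steps can still fail if no huge pages are configured.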

Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK

2017-08-18 Thread Mike Kravetz
t.com> My primary concern with the first suggested patch was trying to define semantics if MADV_WIPEONFORK was applied to a shared or file backed mapping. This is no longer allowed. Reviewed-by: Mike Kravetz <mike.krav...@oracle.com> > --- > arch/alpha/include/uapi/asm/mman.

Re: [PATCH v2] mm/hugetlb.c: make huge_pte_offset() consistent and document behaviour

2017-08-18 Thread Mike Kravetz
igu...@ah.jp.nec.com> > Cc: Steve Capper <steve.cap...@arm.com> > Cc: Will Deacon <will.dea...@arm.com> > Cc: Kirill A. Shutemov <kirill.shute...@linux.intel.com> > Cc: Michal Hocko <mho...@suse.com> > Cc: Mike Kravetz <mike.krav...@oracle.com> > --

Re: [patch v2 -mm] mm, hugetlb: schedule when potentially allocating many hugepages

2017-06-09 Thread Mike Kravetz
> > Signed-off-by: David Rientjes <rient...@google.com> Thanks for doing this. Reviewed-by: Mike Kravetz <mike.krav...@oracle.com> -- Mike Kravetz > --- > Based on -mm only to prevent merge conflicts with > "mm/hugetlb.c: warn the user when issues arise on boo

Re: [PATCH 2/3] hugetlb: add support for preferred node to alloc_huge_page_nodemask

2017-06-26 Thread Mike Kravetz
because it is not really needed > - simplify dequeue_huge_page_nodemask and alloc_huge_page_nodemask a bit > as per Vlastimil Reviewed-by: Mike Kravetz <mike.krav...@oracle.com> Tested-by: Mike Kravetz <mike.krav...@oracle.com> -- Mike Kravetz > > Acked-by: Vlastimil Babka <vba...@suse.cz>

Re: [PATCH 1/3] mm, hugetlb: unclutter hugetlb allocation layers

2017-06-26 Thread Mike Kravetz
_GFP_THISNODE there. > > Not only this removes quite some code it also should make those layers > easier to follow and clear wrt responsibilities. > > Changes since v1 > - pulled gfp mask out of __hugetlb_alloc_buddy_huge_page and make it an > explicit argument to allow __

Re: [PATCH 3/3] mm, hugetlb, soft_offline: use new_page_nodemask for soft offline migration

2017-06-26 Thread Mike Kravetz
erently and so alloc_huge_page_node(nid) would check on this > specific node. Reviewed-by: Mike Kravetz <mike.krav...@oracle.com> Tested-by: Mike Kravetz <mike.krav...@oracle.com> -- Mike Kravetz > > Noticed-by: Vlastimil Babka <vba...@suse.cz> > Acked-by: Vlastimil

[PATCH] sparc64: mm: fix copy_tsb to correctly copy huge page TSBs

2017-06-02 Thread Mike Kravetz
<anthony.yzn...@oracle.com> Signed-off-by: Mike Kravetz <mike.krav...@oracle.com> --- arch/sparc/kernel/tsb.S | 11 +++ arch/sparc/mm/tsb.c | 7 +-- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/sparc/kernel/tsb.S b/arch/sparc/kernel/tsb.S index 1068

Re: [patch -mm] mm, hugetlb: schedule when potentially allocating many hugepages

2017-06-07 Thread Mike Kravetz
et = alloc_fresh_gigantic_page(h, nodes_allowed); > else > ret = alloc_fresh_huge_page(h, nodes_allowed); > + cond_resched(); Are not the following lines immediately before the above huge page allocation in set_max_huge_pages, or am I

Re: [RFC PATCH 2/4] hugetlb: add support for preferred node to alloc_huge_page_nodemask

2017-06-14 Thread Mike Kravetz
On 06/14/2017 03:12 PM, Mike Kravetz wrote: > On 06/13/2017 02:00 AM, Michal Hocko wrote: >> From: Michal Hocko <mho...@suse.com> >> >> alloc_huge_page_nodemask tries to allocate from any numa node in the >> allowed node mask starting from lower numa nodes. This

Re: [RFC PATCH 2/4] hugetlb: add support for preferred node to alloc_huge_page_nodemask

2017-06-14 Thread Mike Kravetz
ages 0 surplus_hugepages node1 512 free_hugepages 512 nr_hugepages 0 surplus_hugepages I can take a closer look at the failures tomorrow. -- Mike Kravetz

Re: [patch] mremap.2: Add description of old_size == 0 functionality

2017-09-18 Thread Mike Kravetz
On 09/18/2017 06:45 AM, Florian Weimer wrote: > On 09/15/2017 11:53 PM, Mike Kravetz wrote: >> +If the value of \fIold_size\fP is zero, and \fIold_address\fP refers to >> +a private anonymous mapping, then >> +.BR mremap () >> +will create a new mapping of the

Re: [patch] mremap.2: Add description of old_size == 0 functionality

2017-09-18 Thread Mike Kravetz
On 09/17/2017 06:52 PM, Jann Horn wrote: > On Fri, Sep 15, 2017 at 2:37 PM, Mike Kravetz <mike.krav...@oracle.com> wrote: > [...] >> A recent change was made to mremap so that an attempt to create a >> duplicate of a private mapping will fail. >> >> commit dba5

DAX error inject/page poison

2017-09-19 Thread Mike Kravetz
to work (ever?) in such situations. If so, should we perhaps add a IS_DAX like check and return something like EINVAL? Or, at least document expected behavior? If madvise(MADV_HWPOISON) will not work, how can one inject errors to test error handling code? -- Mike Kravetz

[patch v2] mremap.2: Add description of old_size == 0 functionality

2017-09-19 Thread Mike Kravetz
. This was used to create a 'duplicate mapping'. A recent change was made to mremap so that an attempt to create a duplicate of a private mapping will fail. Document the 'old_size == 0' behavior and new return code from the commit below. commit dba58d3b8c5045ad89c1c95d33d01451e3964db7 Author: Mike Kravetz <mike.k
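The old_size == 0 behaviour can be sketched from user space as follows. This is an illustration under assumptions, not code from the patch: the sketch_* helper names are hypothetical, and the private-mapping result depends on whether the running kernel includes the commit referenced above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Duplicate a shared anonymous mapping with old_size == 0.
 * Returns 1 if both mappings see the same data, 0 if they do not,
 * or a negative errno value if mmap() or mremap() fails.
 */
static int sketch_mremap_duplicate_shared(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *orig = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    char *dup;
    int same;

    if (orig == MAP_FAILED)
        return -errno;

    /* old_size == 0: the original stays mapped, a second mapping appears. */
    dup = mremap(orig, 0, len, MREMAP_MAYMOVE);
    if (dup == MAP_FAILED) {
        int err = errno;

        munmap(orig, len);
        return -err;
    }

    strcpy(orig, "hello");
    same = (strcmp(dup, "hello") == 0);

    munmap(dup, len);
    munmap(orig, len);
    return same;
}

/*
 * Attempt the same duplication on a private anonymous mapping.
 * Returns the errno from mremap() (EINVAL is expected on kernels with the
 * change described above), 0 if the kernel still allowed it, or a negative
 * errno value if the initial mmap() fails.
 */
static int sketch_mremap_duplicate_private(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *orig = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    char *dup;
    int err = 0;

    if (orig == MAP_FAILED)
        return -errno;

    dup = mremap(orig, 0, len, MREMAP_MAYMOVE);
    if (dup == MAP_FAILED)
        err = errno;
    else
        munmap(dup, len);

    munmap(orig, len);
    return err;
}
```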

[patch] mremap.2: Add description of old_size == 0 functionality

2017-09-15 Thread Mike Kravetz
and discourage its use. A recent change was made to mremap so that an attempt to create a duplicate of a private mapping will fail. commit dba58d3b8c5045ad89c1c95d33d01451e3964db7 Author: Mike Kravetz <mike.krav...@oracle.com> Date: Wed Sep 6 16:20:55 2017 -0700 mm/mremap: fail map dupli

Re: [patch] mremap.2: Add description of old_size == 0 functionality

2017-09-15 Thread Mike Kravetz
CC: linux-mm On 09/15/2017 02:37 PM, Mike Kravetz wrote: > Since at least the 2.6 time frame, mremap would create a new mapping > of the same pages if 'old_size == 0'. It would also leave the original > mapping. This was used to create a 'duplicate mapping'. > > Docume

[patch] memfd_create.2: Add description of MFD_HUGETLB (hugetlbfs) support

2017-09-15 Thread Mike Kravetz
hugetlbfs support for memfd_create was recently merged by Linus and should be in the Linux 4.14 release. To request hugetlbfs support a new memfd_create flag (MFD_HUGETLB) was added. This patch documents the following commit: commit 749df87bd7bee5a79cef073f5d032ddb2b211de8 Author: Mike Kravetz

Re: [patch] memfd_create.2: Add description of MFD_HUGETLB (hugetlbfs) support

2017-09-15 Thread Mike Kravetz
CC: linux-mm On 09/15/2017 02:43 PM, Mike Kravetz wrote: > hugetlbfs support for memfd_create was recently merged by Linus and > should be in the Linux 4.14 release. To request hugetlbfs support > a new memfd_create flag (MFD_HUGETLB) was added. > > This patch documents the f

Re: [RFC] mmap(MAP_CONTIG)

2017-10-04 Thread Mike Kravetz
e populated at mmap time, and the pages locked. Therefore, there should be no swap or migration. -- Mike Kravetz

Re: [RFC] mmap(MAP_CONTIG)

2017-10-04 Thread Mike Kravetz
On 10/04/2017 06:49 AM, Anshuman Khandual wrote: > On 10/04/2017 05:26 AM, Mike Kravetz wrote: >> At Plumbers this year, Guy Shattah and Christoph Lameter gave a presentation >> titled 'User space contiguous memory allocation for DMA' [1]. The slides >> point out the

Re: [RFC] mmap(MAP_CONTIG)

2017-10-04 Thread Mike Kravetz
On 10/04/2017 04:54 AM, Michal Nazarewicz wrote: > On Tue, Oct 03 2017, Mike Kravetz wrote: >> At Plumbers this year, Guy Shattah and Christoph Lameter gave a presentation >> titled 'User space contiguous memory allocation for DMA' [1]. The slides >> point out the performanc

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-16 Thread Mike Kravetz
-allocate pages for their use, and this 'might' be something useful for contiguous allocations as well. I wonder if going down the path of a separate device/filesystem/etc for contiguous allocations might be a better option. It would keep the implementation somewhat separate. However, I would then be afraid that we end up with another 'separate/special vm' as in the case of hugetlbfs today. -- Mike Kravetz

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-16 Thread Mike Kravetz
On 10/16/2017 02:03 PM, Laura Abbott wrote: > On 10/16/2017 01:32 PM, Mike Kravetz wrote: >> On 10/16/2017 11:07 AM, Michal Hocko wrote: >>> On Mon 16-10-17 10:43:38, Mike Kravetz wrote: >>>> Just to be clear, the posix standard talks about a typed memory object. >

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-12 Thread Mike Kravetz
On 10/12/2017 07:37 AM, Michal Hocko wrote: > On Wed 11-10-17 18:46:11, Mike Kravetz wrote: >> Add new MAP_CONTIG flag to mmap system call. Check for flag in normal >> mmap flag processing. If present, pre-allocate a contiguous set of >> pages to back the mapping. The

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-16 Thread Mike Kravetz
On 10/16/2017 11:07 AM, Michal Hocko wrote: > On Mon 16-10-17 10:43:38, Mike Kravetz wrote: >> Just to be clear, the posix standard talks about a typed memory object. >> The suggested implementation has one create a connection to the memory >> object to receive a fd, then use

Re: [patch v2] mremap.2: Add description of old_size == 0 functionality

2017-09-25 Thread Mike Kravetz
On 09/20/2017 12:25 AM, Michael Kerrisk (man-pages) wrote: > Hello Mike, > > On 09/19/2017 11:42 PM, Mike Kravetz wrote: >> v2: Fix incorrect wording noticed by Jann Horn. >> Remove deprecated and memfd_create discussion as suggested >> by Florian Weimer. >>

Re: [PATCH] mm/hugetlbfs: Remove the redundant -ENIVAL return from hugetlbfs_setattr()

2017-09-29 Thread Mike Kravetz
return -EINVAL; > error = hugetlb_vmtruncate(inode, attr->ia_size); > Thanks for noticing. I would hope the compiler is smarter than the code and optimizes this away. Reviewed-by: Mike Kravetz <mike.krav...@oracle.com> -- Mike Kravetz

Re: [PATCH] mm, hugetlb: fix "treat_as_movable" condition in htlb_alloc_mask

2017-09-29 Thread Mike Kravetz
upported(), which is only there if ARCH_ENABLE_HUGEPAGE_MIGRATION is defined. IIUC, this functionality was added for powerpc. Yet, powerpc does not define ARCH_ENABLE_HUGEPAGE_MIGRATION (unless I am missing something). -- Mike Kravetz

Re: [PATCH v2] mm/hugetlb.c: make huge_pte_offset() consistent and document behaviour

2017-08-21 Thread Mike Kravetz
On 08/21/2017 11:07 AM, Catalin Marinas wrote: > On Fri, Aug 18, 2017 at 02:29:18PM -0700, Mike Kravetz wrote: >> On 08/18/2017 07:54 AM, Punit Agrawal wrote: >>> When walking the page tables to resolve an address that points to >>> !p*d_present() entry, huge_pte_of

Re: + mm-madvise-fix-freeing-of-locked-page-with-madv_free.patch added to -mm tree

2017-08-25 Thread Mike Kravetz
on the page (taken when allocated). It will still be non-zero as we have successfully added it to the page cache. So, we are not freeing the page here, just dropping the reference count. This should not cause a problem like that seen in madvise. -- Mike Kravetz

Re: + mm-madvise-fix-freeing-of-locked-page-with-madv_free.patch added to -mm tree

2017-08-25 Thread Mike Kravetz
On 08/25/2017 03:51 PM, Nadav Amit wrote: > Mike Kravetz <mike.krav...@oracle.com> wrote: > >> On 08/25/2017 03:02 PM, Nadav Amit wrote: >>> Michal Hocko <mho...@kernel.org> wrote: >>> >>>> Hmm, I do not see this neither in linux-mm nor LK

Re: [PATCH] hugetlbfs: change put_page/unlock_page order in hugetlbfs_fallocate()

2017-08-27 Thread Mike Kravetz
etlbfs_fallocate()") > > cc: Eric Biggers <ebigge...@gmail.com> > cc: Mike Kravetz <mike.krav...@oracle.com> > > Signed-off-by: Nadav Amit <na...@vmware.com> Thank you Nadav. Reviewed-by: Mike Kravetz <mike.krav...@oracle.com> Since hugetlbf

Re: [PATCH] hugetlbfs: change put_page/unlock_page order in hugetlbfs_fallocate()

2017-08-28 Thread Mike Kravetz
Adding Andrew, Michal on CC On 08/27/2017 01:08 PM, Nadav Amit wrote: > Mike Kravetz <mike.krav...@oracle.com> wrote: > >> On 08/26/2017 12:11 PM, Nadav Amit wrote: >>> hugetlbfs_fallocate() currently performs put_page() before unlock_page(). >>> This scen

Re: [PATCH] hugetlbfs: change put_page/unlock_page order in hugetlbfs_fallocate()

2017-08-28 Thread Mike Kravetz
On 08/28/2017 11:09 AM, Michal Hocko wrote: > On Mon 28-08-17 10:45:58, Mike Kravetz wrote: >> Adding Andrew, Michal on CC >> >> On 08/27/2017 01:08 PM, Nadav Amit wrote: >>> Mike Kravetz <mike.krav...@oracle.com> wrote: >>> >>>> On 08/26/

Re: [PATCH 1/2] mm: Introduce wrapper to access mm->nr_ptes

2017-10-04 Thread Mike Kravetz
ernel/fork.c b/kernel/fork.c > index 5624918154db..1c08f0136667 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -813,7 +813,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, > struct task_struct *p, > init_rwsem(&mm->mmap_sem); > INIT_LIST_

[RFC] mmap(MAP_CONTIG)

2017-10-03 Thread Mike Kravetz
) with some kludges to use the pages at fault time. It is really ugly, which is why I am not sharing the code. Hoping for some comments/suggestions. [1] https://www.linuxplumbersconf.org/2017/ocw/proposals/4669 -- Mike Kravetz

[RFC PATCH 2/3] mm/map_contig: Use pre-allocated pages for VM_CONTIG mappings

2017-10-11 Thread Mike Kravetz
When populating mappings backed by contiguous memory allocations (VM_CONTIG), use the preallocated pages instead of allocating new. Signed-off-by: Mike Kravetz <mike.krav...@oracle.com> --- mm/memory.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/mm/me

[RFC PATCH 0/3] Add mmap(MAP_CONTIG) support

2017-10-11 Thread Mike Kravetz
. Also, the allocations should probably be done outside mmap_sem but that was the easiest place to do it in this quick and easy POC. I just wanted to throw out some code to get further ideas. It is far from complete. Mike Kravetz (3): mm/map_contig: Add VM_CONTIG flag to vma struct mm

[RFC PATCH 1/3] mm/map_contig: Add VM_CONTIG flag to vma struct

2017-10-11 Thread Mike Kravetz
Add the flag VM_CONTIG to vma structure to identify vmas which are backed by contiguous memory allocations. This flag is not propagated to child processes, so be sure to clear at fork time. Signed-off-by: Mike Kravetz <mike.krav...@oracle.com> --- include/linux/mm.h | 1 + kernel/

[RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-11 Thread Mike Kravetz
-by: Mike Kravetz <mike.krav...@oracle.com> --- include/uapi/asm-generic/mman.h | 1 + mm/mmap.c | 94 + 2 files changed, 95 insertions(+) diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h index 7162cd

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-17 Thread Mike Kravetz
ence well enough to know if it would be possible for driver code to make CMA reservations. But, it looks doubtful. -- Mike Kravetz

Re: PROBLEM: Remapping hugepages mappings causes kernel to return EINVAL

2017-10-23 Thread Mike Kravetz
viding a flag to mmap in > order to make hugepages work correctly. Well at least this has a built in fall back mechanism. When using hugetlb(fs) pages, you would need to handle the case where mremap fails due to lack of configured huge pages. I assume your allocator will be for somewhat general application usage. Yet, for the most reliability the user/admin will need to know at boot time how many huge pages will be needed and set that up. -- Mike Kravetz

Re: [PATCH 1/1] mm:hugetlbfs: Fix hwpoison reserve accounting

2017-10-23 Thread Mike Kravetz
On 10/23/2017 12:32 AM, Naoya Horiguchi wrote: > On Fri, Oct 20, 2017 at 10:49:46AM -0700, Mike Kravetz wrote: >> On 10/19/2017 07:30 PM, Naoya Horiguchi wrote: >>> On Thu, Oct 19, 2017 at 04:00:07PM -0700, Mike Kravetz wrote: >>> >>> Thank you for addressi

Re: [PATCH v3 0/9] memfd: add sealing to hugetlb-backed memory

2017-11-14 Thread Mike Kravetz
outstanding issue is sorting out the config option dependencies. Although, IMO this is not a strict requirement for this series. I have addressed this issue in a follow on series: http://lkml.kernel.org/r/20171109014109.21077-1-mike.krav...@oracle.com -- Mike Kravetz On 11/07/2017 04:27 AM, Marc-André

Re: [PATCH RFC 1/2] mm, hugetlb: unify core page allocation accounting and initialization

2017-11-28 Thread Mike Kravetz
s, node, nodes_allowed) { > - page = alloc_fresh_huge_page_node(h, node); > - if (page) { > - ret = 1; > + page = __hugetlb_alloc_buddy_huge_page(h, gfp_mask, > + n

Re: [PATCH 1/1] mm/cma: fix alloc_contig_range ret code/potential leak

2017-11-22 Thread Mike Kravetz
On 11/22/2017 04:00 AM, Johannes Weiner wrote: > On Mon, Nov 20, 2017 at 11:39:30AM -0800, Mike Kravetz wrote: >> If the call __alloc_contig_migrate_range() in alloc_contig_range >> returns -EBUSY, processing continues so that test_pages_isolated() >> is called where

Re: [PATCH] hugetlbfs: change put_page/unlock_page order in hugetlbfs_fallocate()

2017-11-28 Thread Mike Kravetz
ed, if only to prevent future breakage or someone copy-pasting this >> code. >> >> Fixes: 70c3547e36f5c ("hugetlbfs: add hugetlbfs_fallocate()") >> >> cc: Eric Biggers <ebigge...@gmail.com> >> cc: Mike Kravetz <mike.krav...@oracle.com> >>

Re: [PATCH RFC 1/2] mm, hugetlb: unify core page allocation accounting and initialization

2017-11-29 Thread Mike Kravetz
On 11/28/2017 10:57 PM, Michal Hocko wrote: > On Tue 28-11-17 13:34:53, Mike Kravetz wrote: >> On 11/28/2017 06:12 AM, Michal Hocko wrote: > [...] >>> +/* >>> + * Allocates a fresh page to the hugetlb allocator pool in the node >>> interleaved &

Re: [PATCH RFC 2/2] mm, hugetlb: do not rely on overcommit limit during migration

2017-11-28 Thread Mike Kravetz
e)) { > + SetPageHugeTemporary(hpage); > + ClearPageHugeTemporary(new_hpage); > + } > } > > unlock_page(hpage); > I'm still trying to wrap my head around all the different scenarios. In general, this new code only 'kicks in' if there is not a free pre-allocated huge page for migration. Right? So, if there are free huge pages they are 'consumed' during migration and the number of available pre-allocated huge pages is reduced? Or, is that not exactly how it works? Or does it depend on the purpose of the migration? The only reason I ask is because this new method of allocating a surplus page (if successful) results in no decrease of available huge pages. Perhaps all migrations should attempt to allocate surplus pages and not impact the pre-allocated number of available huge pages. Or, perhaps I am just confused. :) -- Mike Kravetz

Re: [PATCH RFC 2/2] mm, hugetlb: do not rely on overcommit limit during migration

2017-11-30 Thread Mike Kravetz
On 11/29/2017 11:57 PM, Michal Hocko wrote: > On Wed 29-11-17 11:52:53, Mike Kravetz wrote: >> On 11/29/2017 01:22 AM, Michal Hocko wrote: >>> What about this on top. I haven't tested this yet though. >> >> Yes, this would work. >> >> However, I th

Re: hugetlb page migration vs. overcommit

2017-11-22 Thread Mike Kravetz
ce Naoya was originally involved in huge page migration, I would welcome his comments. -- Mike Kravetz

[PATCH v2] mm/cma: fix alloc_contig_range ret code/potential leak

2017-11-22 Thread Mike Kravetz
if __alloc_contig_migrate_range returns -EBUSY. Also, clear return code in this case so that it is not accidentally used or returned to caller. Fixes: 8ef5849fa8a2 ("mm/cma: always check which page caused allocation failure") Cc: <sta...@vger.kernel.org> Signed-off-by: Mike Kr

Re: [PATCH RFC 2/2] mm, hugetlb: do not rely on overcommit limit during migration

2017-11-29 Thread Mike Kravetz
if (h->surplus_huge_pages_node[old_nid]) { > + h->surplus_huge_pages_node[old_nid]--; > + h->surplus_huge_pages_node[new_nid]++; > + } You need to take hugetlb_lock before adjusting the surplus counts. -- Mike Kravetz

Re: [RFC PATCH 1/5] mm, hugetlb: unify core page allocation accounting and initialization

2017-12-12 Thread Mike Kravetz
ge to the allocator pool. All current callers are updated to call > put_page explicitly. Later patches will add new callers which won't > need it. > > This patch shouldn't introduce any functional change. > > Signed-off-by: Michal Hocko <mho...@suse.com> Reviewed-by: Mi

Re: [RFC PATCH 2/5] mm, hugetlb: integrate giga hugetlb more naturally to the allocation path

2017-12-12 Thread Mike Kravetz
his will simplify set_max_huge_pages > which doesn't have to care about what kind of huge page we allocate. > > Signed-off-by: Michal Hocko <mho...@suse.com> I agree with the analysis. Thanks for cleaning this up. There really is no need for the separate allocation paths. Reviewed-

Re: [RFC PATCH 4/5] mm, hugetlb: get rid of surplus page accounting tricks

2017-12-14 Thread Mike Kravetz
On 12/13/2017 11:50 PM, Michal Hocko wrote: > On Wed 13-12-17 16:45:55, Mike Kravetz wrote: >> On 12/04/2017 06:01 AM, Michal Hocko wrote: >>> From: Michal Hocko <mho...@suse.com> >>> >>> alloc_surplus_huge_page increases the pool size and the num

Re: [RFC PATCH 3/5] mm, hugetlb: do not rely on overcommit limit during migration

2017-12-14 Thread Mike Kravetz
On 12/13/2017 11:40 PM, Michal Hocko wrote: > On Wed 13-12-17 15:35:33, Mike Kravetz wrote: >> On 12/04/2017 06:01 AM, Michal Hocko wrote: > [...] >>> Before migration >>> /sys/devices/system/node/node0/hugepages/hugepages-2048kB/free_hugepages:0 >>> /

Re: [RFC PATCH 5/5] mm, hugetlb: further simplify hugetlb allocation API

2017-12-14 Thread Mike Kravetz
others lose their excessive prefix underscores to make names shorter > This patch will need to be modified to take into account the incremental diff to patch 4 in this series. Other than that, the changes look good. Reviewed-by: Mike Kravetz <mike.krav...@oracle.com>

Re: [RFC PATCH 4/5] mm, hugetlb: get rid of surplus page accounting tricks

2017-12-13 Thread Mike Kravetz
ncremented the global counters already > - */ > h->nr_huge_pages_node[r_nid]++; > h->surplus_huge_pages_node[r_nid]++; > - } else { > - h->nr_huge_pages--; > - h->surplus_huge_pages--; In the case

Re: [RFC PATCH 3/5] mm, hugetlb: do not rely on overcommit limit during migration

2017-12-13 Thread Mike Kravetz
ClearPageHugeTemporary(newpage); > + > + spin_lock(&hugetlb_lock); > + if (h->surplus_huge_pages_node[old_nid]) { > + h->surplus_huge_pages_node[old_nid]--; > + h->surplus_huge_pages_node[new_nid]++; > +

Re: [PATCH] mm: show stats for non-default hugepage sizes in /proc/meminfo

2017-11-13 Thread Mike Kravetz
well. Although, in practice one does tend to use a single huge pages size. If you change the default huge page size, then those entries will be in /proc/meminfo. -- Mike Kravetz

Re: [PATCH] mm: show stats for non-default hugepage sizes in /proc/meminfo

2017-11-13 Thread Mike Kravetz
ach. The 'trick' is coming up with a name or description that is not confusing. Unfortunately, we have to leave the existing entries. So, this new entry will be greater than or equal to HugePages_Total. :( I guess Hugetlb is as good of a name as any? -- Mike Kravetz
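To make the relationship between the existing entries and the proposed total concrete, here is a small sketch that looks up fields in /proc/meminfo style text. It assumes the "Name: value [kB]" line format of /proc/meminfo and the field name Hugetlb suggested above; sketch_meminfo_value is a hypothetical helper, not kernel code.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find a line that starts with "key:" in meminfo-style text and return its
 * numeric value, or -1 if the key is absent or has no number after it.
 */
static long long sketch_meminfo_value(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long long value;

            if (sscanf(line + klen + 1, "%lld", &value) == 1)
                return value;
            return -1;
        }
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

Given such a helper, one would expect Hugetlb (in kB) to be at least HugePages_Total multiplied by Hugepagesize, since the total also covers the non-default huge page sizes.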

Re: [RFC PATCH 0/3] restructure memfd code

2017-11-20 Thread Mike Kravetz
On 11/20/2017 02:28 AM, Marc-André Lureau wrote: > Hi > > On Thu, Nov 9, 2017 at 2:41 AM, Mike Kravetz <mike.krav...@oracle.com> wrote: >> With the addition of memfd hugetlbfs support, we now have the situation >> where memfd depends on TMPFS -or- HUGETLBFS.

[PATCH 0/1] mm/cma: fix alloc_contig_range ret code/potential leak

2017-11-20 Thread Mike Kravetz
. Mike Kravetz (1): mm/cma: fix alloc_contig_range ret code/potential leak mm/page_alloc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- 2.13.6

[PATCH 1/1] mm/cma: fix alloc_contig_range ret code/potential leak

2017-11-20 Thread Mike Kravetz
(). Fixes: 8ef5849fa8a2 ("mm/cma: always check which page caused allocation failure") Cc: <sta...@vger.kernel.org> Signed-off-by: Mike Kravetz <mike.krav...@oracle.com> --- mm/page_alloc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/page_alloc.c b

Re: [PATCH v2] mm: show total hugetlb memory consumption in /proc/meminfo

2017-11-20 Thread Mike Kravetz
_hstate) > > I'm not understanding this test. Are we assuming that default_hstate > always refers to the highest-index hstate? If so why, and is that > valid? Actually default_hstate is defined as: #define default_hstate (hstates[default_hstate_idx]) defau

Re: [PATCH v2] mm: show total hugetlb memory consumption in /proc/meminfo

2017-11-21 Thread Mike Kravetz
that there is no consistency guarantee for the numbers with the default huge page size today. However, I am not really a fan of taking the lock for that guarantee. IMO, the above code is fine. This discussion reminds me that ideally there should be a per-hstate lock. My guess is that the globa

Re: [RFC PATCH 0/3] restructure memfd code

2017-11-21 Thread Mike Kravetz
On 11/21/2017 08:32 AM, Khalid Aziz wrote: > On Wed, 2017-11-08 at 17:41 -0800, Mike Kravetz wrote: >> With the addition of memfd hugetlbfs support, we now have the >> situation >> where memfd depends on TMPFS -or- HUGETLBFS. Previously, memfd was >> only >> sup

Re: [PATCH 1/6] shmem: unexport shmem_add_seals()/shmem_get_seals()

2017-11-01 Thread Mike Kravetz
On 10/31/2017 11:40 AM, Marc-André Lureau wrote: > The functions are called through shmem_fcntl() only. And no danger in removing the EXPORTs as the routines only work with shmem file structs. > > Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com> Reviewed-b

Re: [PATCH 4/6] hugetlbfs: implement memfd sealing

2017-11-01 Thread Mike Kravetz
: added similar check as shmem_setattr() & shmem_fallocate() > > Except write() operation that doesn't exist with hugetlbfs, that > should make sealing as close as it can be to shmem support. > > Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com> Looks fine to

Re: [PATCH 3/6] hugetlb: expose hugetlbfs_inode_info in header

2017-11-01 Thread Mike Kravetz
l need to be accessed by code in mm/shmem.c for file sealing operations. Move inode information definition from .c file to header for needed access. -- Mike Kravetz > > Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com> > --- > fs/hugetlbfs/inode.c| 10 -- &g

Re: [PATCH 5/6] shmem: add sealing support to hugetlb-backed memfd

2017-11-01 Thread Mike Kravetz
ile_inode(file))->seals; > +#endif > + > + return NULL; > +} > + As mentioned in patch 2, I think this code will need to be restructured so that hugetlbfs file sealing will work even if CONFIG_TMPFS is not defined. The above routine is behind #ifdef CONFIG_TMPFS. In gen

Re: [PATCH 2/6] shmem: rename functions that are memfd-related

2017-11-01 Thread Mike Kravetz
ed? I admit that having CONFIG_HUGETLBFS defined without CONFIG_TMPFS is unlikely, but I think possible. Based on the above #ifdef/#else, I think hugetlbfs seals will not work if CONFIG_TMPFS is not defined. -- Mike Kravetz > diff --git a/mm/shmem.c b/mm/shmem.c > index 37260c

Re: [PATCH 2/6] shmem: rename functions that are memfd-related

2017-11-03 Thread Mike Kravetz
d on the above #ifdef/#else, I >> think hugetlbfs seals will not work if CONFIG_TMPFS is not defined. > > Good point, memfd_create() will not exist either. > > I think this is a separate concern, and preexisting from this patch series > though. Ah yes. I should have a

Re: [PATCH 3/6] hugetlb: expose hugetlbfs_inode_info in header

2017-11-03 Thread Mike Kravetz
header for needed access. > > Ok, Does the patch get your Reviewed-by tag with that change? > > thanks > Yes, you can add Reviewed-by: Mike Kravetz <mike.krav...@oracle.com> with an updated commit message. -- Mike Kravetz

Re: [PATCH 4/6] hugetlbfs: implement memfd sealing

2017-11-03 Thread Mike Kravetz
ap or hole punch/truncate. So, we do not really need to worry about those special (a)io cases for hugetlbfs. -- Mike Kravetz > you need to make sure there are no page references > left around. For instance, on shmem any process might trigger the > kernel to GUP mapped sh

Re: [PATCH 2/6] shmem: rename functions that are memfd-related

2017-11-03 Thread Mike Kravetz
om this patch series >>> though. >> >> Ah yes. I should have addressed this when adding hugetlbfs memfd_create >> support. >> >> Of course, one 'simple' way to address this would be to make CONFIG_HUGETLBFS >> depend on CONFIG_TMPFS. Not sure what peo

Re: [PATCH 4/6] hugetlbfs: implement memfd sealing

2017-11-03 Thread Mike Kravetz
On 11/03/2017 10:41 AM, David Herrmann wrote: > Hi > > On Fri, Nov 3, 2017 at 6:12 PM, Mike Kravetz <mike.krav...@oracle.com> wrote: >> On 11/03/2017 10:03 AM, David Herrmann wrote: >>> Hi >>> >>> On Tue, Oct 31, 2017 at 7:40 PM, Marc-Andr

Re: [PATCH 4/6] hugetlbfs: implement memfd sealing

2017-11-03 Thread Mike Kravetz
On 11/03/2017 10:56 AM, Mike Kravetz wrote: > On 11/03/2017 10:41 AM, David Herrmann wrote: >> Hi >> >> On Fri, Nov 3, 2017 at 6:12 PM, Mike Kravetz <mike.krav...@oracle.com> wrote: >>> On 11/03/2017 10:03 AM, David Herrmann wrote: >>>> Hi >&
