[PATCH v3 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-22 Thread Mike Kravetz
as it can no longer happen. Remove the dead code and expand comments to explain reasoning. Similarly, checks for races with truncation in the page fault path can be simplified and removed. Cc: Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd") Signed-off-by: Mike

[PATCH v3 0/2] hugetlbfs: use i_mmap_rwsem for better synchronization

2018-12-22 Thread Mike Kravetz
Comments made by Naoya were addressed. Mike Kravetz (2): hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race fs/hugetlbfs/inode.c | 61 +++- mm/hugetl

[PATCH v3 1/2] hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization

2018-12-22 Thread Mike Kravetz
dition, callers of huge_pte_alloc continue to hold the semaphore until finished with the ptep. - i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is called. Cc: Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz

Re: [PATCH v2 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-21 Thread Mike Kravetz
On 12/21/18 12:21 PM, Kirill A. Shutemov wrote: > On Fri, Dec 21, 2018 at 10:28:25AM -0800, Mike Kravetz wrote: >> On 12/21/18 2:28 AM, Kirill A. Shutemov wrote: >>> On Tue, Dec 18, 2018 at 02:35:57PM -0800, Mike Kravetz wrote: >>>> Instead of writing the required

Re: [PATCH v2 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-21 Thread Mike Kravetz
On 12/21/18 2:28 AM, Kirill A. Shutemov wrote: > On Tue, Dec 18, 2018 at 02:35:57PM -0800, Mike Kravetz wrote: >> Instead of writing the required complicated code for this rare >> occurrence, just eliminate the race. i_mmap_rwsem is now held in read >> mode for the d

Re: [PATCH v2 1/2] hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization

2018-12-21 Thread Mike Kravetz
On 12/21/18 2:05 AM, Kirill A. Shutemov wrote: > On Tue, Dec 18, 2018 at 02:35:56PM -0800, Mike Kravetz wrote: >> While looking at BUGs associated with invalid huge page map counts, >> it was discovered and observed that a huge pte pointer could become >> 'invalid' a

[PATCH v2 2/2] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-18 Thread Mike Kravetz
as it can no longer happen. Remove the dead code and expand comments to explain reasoning. Similarly, checks for races with truncation in the page fault path can be simplified and removed. Cc: Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd") Signed-off-by: Mike

[PATCH v2 0/2] hugetlbfs: use i_mmap_rwsem for better synchronization

2018-12-18 Thread Mike Kravetz
7-1-mike.krav...@oracle.com Comments made by Naoya were addressed. Mike Kravetz (2): hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race fs/hugetlbfs/inode.c | 50 + mm/hugetlb.c

[PATCH v2 1/2] hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization

2018-12-18 Thread Mike Kravetz
dition, callers of huge_pte_alloc continue to hold the semaphore until finished with the ptep. - i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is called. Cc: Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz

Re: [PATCH 2/3] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-18 Thread Mike Kravetz
On 12/18/18 2:10 PM, Andrew Morton wrote: > On Mon, 17 Dec 2018 16:17:52 -0800 Mike Kravetz > wrote: > >> ... >> >>> As you suggested in a comment to the subsequent patch, it would be better to >>> combine the patches and remove the dead code when it

Re: [PATCH 2/3] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-17 Thread Mike Kravetz
On 12/17/18 10:42 AM, Mike Kravetz wrote: > On 12/17/18 2:25 AM, Aneesh Kumar K.V wrote: >> On 12/4/18 1:38 AM, Mike Kravetz wrote: >>> >>> Instead of writing the required complicated code for this rare >>> occurrence, just eliminate the race. i_mmap_rwsem

Re: [PATCH] mm: Remove __hugepage_set_anon_rmap()

2018-12-17 Thread Mike Kravetz
Tkhai Thanks for cleaning this up! Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH 2/3] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-17 Thread Mike Kravetz
On 12/17/18 2:25 AM, Aneesh Kumar K.V wrote: > On 12/4/18 1:38 AM, Mike Kravetz wrote: >> hugetlbfs page faults can race with truncate and hole punch operations. >> Current code in the page fault path attempts to handle this by 'backing >> out' operations if we e

[PATCH 0/3] hugetlbfs: use i_mmap_rwsem for better synchronization

2018-12-03 Thread Mike Kravetz
o separate patches addressing each issue. Hopefully, this is easier to understand/review. Mike Kravetz (3): hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race hugetlbfs: remove unnecessary code after i_mmap_rwsem synchroniz

[PATCH 3/3] hugetlbfs: remove unnecessary code after i_mmap_rwsem synchronization

2018-12-03 Thread Mike Kravetz
After expanding i_mmap_rwsem use for better shared pmd and page fault/ truncation synchronization, remove code that is no longer necessary. Cc: Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd") Signed-off-by: Mike Kravetz --- fs/hugetlbfs/in

[PATCH 2/3] hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race

2018-12-03 Thread Mike Kravetz
just eliminate the race. i_mmap_rwsem is now held in read mode for the duration of page fault processing. Hold i_mmap_rwsem longer in truncation and hole punch code to cover the call to remove_inode_hugepages. Cc: Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd") Signed-off-by: M

[PATCH 1/3] hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization

2018-12-03 Thread Mike Kravetz
dition, callers of huge_pte_alloc continue to hold the semaphore until finished with the ptep. - i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is called. Cc: Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz

Re: [PATCH] hugetlbfs: Call VM_BUG_ON_PAGE earlier in free_huge_page

2018-11-29 Thread Mike Kravetz
> >> Signed-off-by: Yongkai Wu >> Acked-by: Michal Hocko >> Acked-by: Mike Kravetz Thank you for fixing the formatting and commit message. Adding Andrew on so he can add to his tree as appropriate. Also Cc'ing Michal. >> --- >> mm/hugetlb.c | 5 +++-- >

Re: [PATCH] mm/hugetl.c: keep the page mapping info when free_huge_page() hit the VM_BUG_ON_PAGE

2018-11-13 Thread Mike Kravetz
ation to the change log (commit message) and fix the formatting of the patch. Can you tell us more about the root cause of your issue? What was the issue? How could you reproduce it? How did you solve it? -- Mike Kravetz

Re: [PATCH] hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444!

2018-11-05 Thread Mike Kravetz
On 11/5/18 1:30 PM, Andrew Morton wrote: > On Mon, 5 Nov 2018 13:23:15 -0800 Mike Kravetz > wrote: > >> This bug has been experienced several times by Oracle DB team. >> The BUG is in the routine remove_inode_hugepages() as follows: >> /* >> * If

[PATCH] hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444!

2018-11-05 Thread Mike Kravetz
iated page. This is how we end up with an elevated map count. To solve, check the dst_pte entry for huge_pte_none. If !none, this implies PMD sharing so do not copy. Signed-off-by: Mike Kravetz --- mm/hugetlb.c | 23 +++ 1 file changed, 19 insertions(+), 4 deletions(-) diff

Re: [PATCH RFC v2 1/1] hugetlbfs: use i_mmap_rwsem for pmd sharing and truncate/fault sync

2018-10-30 Thread Mike Kravetz
On 10/25/18 5:42 PM, Naoya Horiguchi wrote: > Hi Mike, > > On Tue, Oct 23, 2018 at 09:50:53PM -0700, Mike Kravetz wrote: >> Now, another task truncates the hugetlbfs file. As part of truncation, >> it unmaps everyone who has the file mapped. If a task has a shared p

[PATCH RFC v2 0/1] hugetlbfs: Use i_mmap_rwsem for pmd share and fault/trunc

2018-10-23 Thread Mike Kravetz
worse. This leads to bad things such as incorrect page map/reference counts or invalid memory references. Fix this all by modifying the usage of i_mmap_rwsem to cover fault/truncate races as well as handling of shared pmds Mike Kravetz (1): hugetlbfs: use i_mmap_rwsem for pmd sharing and tru

[PATCH RFC v2 1/1] hugetlbfs: use i_mmap_rwsem for pmd sharing and truncate/fault sync

2018-10-23 Thread Mike Kravetz
d in read mode after huge_pte_alloc, until the caller is finished with the returned ptep. Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c | 21 ++ mm/hugetlb.c | 65 +--- mm/rmap.c| 10 +++ mm/userfaultfd.c |

Re: [PATCH] hugetlbfs: dirty pages as they are added to pagecache

2018-10-23 Thread Mike Kravetz
On 10/23/18 12:43 AM, Michal Hocko wrote: > On Wed 17-10-18 21:10:22, Mike Kravetz wrote: >> Some test systems were experiencing negative huge page reserve >> counts and incorrect file block counts. This was traced to >> /proc/sys/vm/drop_caches removing clean pages f

Re: [PATCH] hugetlbfs: dirty pages as they are added to pagecache

2018-10-18 Thread Mike Kravetz
On 10/18/18 6:47 PM, Andrew Morton wrote: > On Thu, 18 Oct 2018 20:46:21 -0400 Andrea Arcangeli > wrote: > >> On Thu, Oct 18, 2018 at 04:16:40PM -0700, Mike Kravetz wrote: >>> I was not sure about this, and expected someone could come up with >>> something

Re: [PATCH] hugetlbfs: dirty pages as they are added to pagecache

2018-10-18 Thread Mike Kravetz
On 10/18/18 4:08 PM, Andrew Morton wrote: > On Wed, 17 Oct 2018 21:10:22 -0700 Mike Kravetz > wrote: > >> Some test systems were experiencing negative huge page reserve >> counts and incorrect file block counts. This was traced to >> /proc/sys/vm/drop_caches removing

[PATCH] hugetlbfs: dirty pages as they are added to pagecache

2018-10-17 Thread Mike Kravetz
tle sense to even try to drop hugetlbfs pagecache pages, so disable calls to these filesystems in drop_caches code. Fixes: 70c3547e36f5 ("hugetlbfs: add hugetlbfs_fallocate()") Cc: sta...@vger.kernel.org Signed-off-by: Mike Kravetz --- fs/drop_caches.c | 7 +++ mm/hugetlb.c | 6 ++

Re: [PATCH RFC 1/1] hugetlbfs: introduce truncation/fault mutex to avoid races

2018-10-08 Thread Mike Kravetz
On 10/8/18 1:03 AM, Kirill A. Shutemov wrote: > On Sun, Oct 07, 2018 at 04:38:48PM -0700, Mike Kravetz wrote: >> The following hugetlbfs truncate/page fault race can be recreated >> with programs doing something like the following. >> >> A hugetlbfs file is mmap(MAP_SHA

[PATCH RFC 0/1] hugetlbfs: fix truncate/fault races

2018-10-07 Thread Mike Kravetz
following patch describes the current race in detail and adds the mutex to prevent truncate/fault races. Mike Kravetz (1): hugetlbfs: introduce truncation/fault mutex to avoid races fs/hugetlbfs/inode.c| 24 include/linux/hugetlb.h | 1 + mm/hugetlb.c| 25

[PATCH RFC 1/1] hugetlbfs: introduce truncation/fault mutex to avoid races

2018-10-07 Thread Mike Kravetz
ation takes in write mode. Signed-off-by: Mike Kravetz --- fs/hugetlbfs/inode.c| 24 include/linux/hugetlb.h | 1 + mm/hugetlb.c| 25 +++-- mm/userfaultfd.c| 8 +++- 4 files changed, 47 insertions(+), 11 deletions(-) diff

Re: [PATCH V2] mm/hugetlb: Add mmap() encodings for 32MB and 512MB page sizes

2018-09-25 Thread Mike Kravetz
nitions per Mike Thanks Anshuman, Acked-by: Mike Kravetz -- Mike Kravetz > > include/uapi/asm-generic/hugetlb_encode.h | 2 ++ > include/uapi/linux/memfd.h| 2 ++ > include/uapi/linux/mman.h | 2 ++ > include/uapi/linux/shm.h

Re: [PATCH] mm/hugetlb: Add mmap() encodings for 32MB and 512MB page sizes

2018-09-24 Thread Mike Kravetz
> include/uapi/linux/mman.h | 2 ++ > 2 files changed, 4 insertions(+) Thanks Anshuman, However, I think we should also add similar definitions in: uapi/linux/memfd.h uapi/linux/shm.h -- Mike Kravetz

Re: Plumbers 2018 - Performance and Scalability Microconference

2018-09-06 Thread Mike Kravetz
>> Can we come up with a 2M base page VM or something? We have possible >> memory sizes of a couple TB now. That should give us a million or so 2M >> pages to work with. > > That sounds a good idea. Don't know whether someone has tried this. IIRC, Hugh Dickins and some others at Google tried going down this path. There was a brief discussion at LSF/MM. It is something I too would like to explore in my spare time. -- Mike Kravetz

Re: [RFC PATCH] mm/hugetlb: make hugetlb_lock irq safe

2018-09-05 Thread Mike Kravetz
On 09/05/2018 04:07 PM, Matthew Wilcox wrote: > On Wed, Sep 05, 2018 at 03:00:08PM -0700, Andrew Morton wrote: >> On Wed, 5 Sep 2018 14:35:11 -0700 Mike Kravetz >> wrote: >> >>>>so perhaps we could put some >>

Re: [RFC PATCH] mm/hugetlb: make hugetlb_lock irq safe

2018-09-05 Thread Mike Kravetz
he powerpc iommu code was added, I doubt this was taken into account. I would be afraid of someone adding put_page from hardirq context. -- Mike Kravetz > And attention will need to be paid to -stable backporting. How long > has mm_iommu_free() existed, and been doing this?

Re: [PATCH v6 1/2] mm: migration: fix migration of huge PMD shared pages

2018-09-04 Thread Mike Kravetz
ly need/want a separate patch. We could just add the notifiers to the shared pmd patch. Back porting the shared pmd patch will also require some fixup. Either would work. I'll admit I do not know what stable maintainers would prefer. -- Mike Kravetz

Re: [PATCH v6 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-30 Thread Mike Kravetz
se wrote: >>>>> On Wed, Aug 29, 2018 at 08:39:06PM +0200, Michal Hocko wrote: >>>>>> On Wed 29-08-18 14:14:25, Jerome Glisse wrote: >>>>>>> On Wed, Aug 29, 2018 at 10:24:44AM -0700, Mike Kravetz wrote: >>>>>> [...] >

Re: [PATCH v6 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-29 Thread Mike Kravetz
On 08/29/2018 02:11 PM, Jerome Glisse wrote: > On Wed, Aug 29, 2018 at 08:39:06PM +0200, Michal Hocko wrote: >> On Wed 29-08-18 14:14:25, Jerome Glisse wrote: >>> On Wed, Aug 29, 2018 at 10:24:44AM -0700, Mike Kravetz wrote: >> [...] >>>> What would be the best

Re: [PATCH v6 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-29 Thread Mike Kravetz
On 08/27/2018 06:46 AM, Jerome Glisse wrote: > On Mon, Aug 27, 2018 at 09:46:45AM +0200, Michal Hocko wrote: >> On Fri 24-08-18 11:08:24, Mike Kravetz wrote: >>> Here is an updated patch which does as you suggest above. >> [...] >>> @@ -1409,6 +1419,32 @@ static

Re: [PATCH v6 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-27 Thread Mike Kravetz
On 08/27/2018 12:46 AM, Michal Hocko wrote: > On Fri 24-08-18 11:08:24, Mike Kravetz wrote: >> On 08/24/2018 01:41 AM, Michal Hocko wrote: >>> On Thu 23-08-18 13:59:16, Mike Kravetz wrote: >>> >>> Acked-by: Michal Hocko >>> >>> One nit be

Re: [PATCH v6 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-24 Thread Mike Kravetz
On 08/24/2018 01:41 AM, Michal Hocko wrote: > On Thu 23-08-18 13:59:16, Mike Kravetz wrote: > > Acked-by: Michal Hocko > > One nit below. > > [...] >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> index 3103099f64fd..a73c5728e961 100644 >> --- a/mm/hugetl

[PATCH v6 2/2] hugetlb: take PMD sharing into account when flushing tlb/caches

2018-08-23 Thread Mike Kravetz
that this range is flushed if huge_pmd_unshare succeeds and unmaps a PUD_SIZE area. Signed-off-by: Mike Kravetz --- mm/hugetlb.c | 53 +++- 1 file changed, 44 insertions(+), 9 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index

[PATCH v6 0/2] huge_pmd_unshare migration and flushing

2018-08-23 Thread Mike Kravetz
routine as suggested by Kirill. v3-v5: Address build errors if !CONFIG_HUGETLB_PAGE and !CONFIG_ARCH_WANT_HUGE_PMD_SHARE Mike Kravetz (2): mm: migration: fix migration of huge PMD shared pages hugetlb: take PMD sharing into account when flushing tlb/caches include/linux/huge

[PATCH v6 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-23 Thread Mike Kravetz
page") Cc: sta...@vger.kernel.org Signed-off-by: Mike Kravetz --- include/linux/hugetlb.h | 14 ++ mm/hugetlb.c| 40 +-- mm/rmap.c | 42 ++--- 3 files changed, 91 insertions(+), 5 deleti

Re: [PATCH v3 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-23 Thread Mike Kravetz
On 08/23/2018 10:01 AM, Mike Kravetz wrote: > On 08/23/2018 05:48 AM, Michal Hocko wrote: >> On Tue 21-08-18 18:10:42, Mike Kravetz wrote: >> [...] >> >> OK, after burning myself when trying to be clever here it seems like >> your proposed solutio

Re: [PATCH v3 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-23 Thread Mike Kravetz
On 08/23/2018 05:48 AM, Michal Hocko wrote: > On Tue 21-08-18 18:10:42, Mike Kravetz wrote: > [...] > > OK, after burning myself when trying to be clever here it seems like > your proposed solution is indeed simpler. > >> +bool huge_pmd_sharing_possible(st

Re: [PATCH v3 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-23 Thread Mike Kravetz
On 08/23/2018 01:21 AM, Kirill A. Shutemov wrote: > On Thu, Aug 23, 2018 at 09:30:35AM +0200, Michal Hocko wrote: >> On Wed 22-08-18 09:48:16, Mike Kravetz wrote: >>> On 08/22/2018 05:28 AM, Michal Hocko wrote: >>>> On Tue 21-08-18 18:10:42, Mike Kravetz wrote: >

Re: [PATCH v3 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-22 Thread Mike Kravetz
On 08/22/2018 02:05 PM, Kirill A. Shutemov wrote: > On Tue, Aug 21, 2018 at 06:10:42PM -0700, Mike Kravetz wrote: >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> index 3103099f64fd..f085019a4724 100644 >> --- a/mm/hugetlb.c >> +++ b/mm/hugetlb.c >> @@ -4555,6 +45

Re: [PATCH v3 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-22 Thread Mike Kravetz
On 08/22/2018 05:28 AM, Michal Hocko wrote: > On Tue 21-08-18 18:10:42, Mike Kravetz wrote: > [...] >> diff --git a/mm/rmap.c b/mm/rmap.c >> index eb477809a5c0..8cf853a4b093 100644 >> --- a/mm/rmap.c >> +++ b/mm/rmap.c >> @@ -1362,11 +1362,21 @@ static boo

Re: [PATCH v2 0/2] mm: soft-offline: fix race against page allocation

2018-08-21 Thread Mike Kravetz
ration of huge PMD shared pages issue. That is sort of a Tested-by :). Just wanted to point out that it was pretty easy to hit this issue. It was easier than the issue I am working. And, the issue I am trying to address was seen in a real customer environment. So, I would not be surprised to see this issue in real customer environments as well. If you (or others) think we should go forward with these patches, I can spend some time doing a review. Already did a 'quick look' some time back. -- Mike Kravetz

Re: [PATCH v3 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-21 Thread Mike Kravetz
te to > help improve the system] > > url: > https://github.com/0day-ci/linux/commits/Mike-Kravetz/huge_pmd_unshare-migration-and-flushing/20180822-050255 > config: sparc64-allyesconfig (attached as .config) > compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2

Re: [PATCH v3 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-21 Thread Mike Kravetz
te to > help improve the system] > > url: > https://github.com/0day-ci/linux/commits/Mike-Kravetz/huge_pmd_unshare-migration-and-flushing/20180822-050255 > config: i386-tinyconfig (attached as .config) > compiler: gcc-7 (Debian 7.3.0-16) 7.3.0 > reproduce: > #

[PATCH v3 1/2] mm: migration: fix migration of huge PMD shared pages

2018-08-21 Thread Mike Kravetz
page") Cc: sta...@vger.kernel.org Signed-off-by: Mike Kravetz --- include/linux/hugetlb.h | 14 ++ mm/hugetlb.c| 40 +++ mm/rmap.c | 42 ++--- 3 files changed, 93 insertions(+), 3 deleti

[PATCH v3 0/2] huge_pmd_unshare migration and flushing

2018-08-21 Thread Mike Kravetz
advantage of the new routine huge_pmd_sharing_possible() to adjust flushing ranges in the cases where huge PMD sharing is possible. There is no copy to stable for this patch as it has not been reported as an issue and discovered only via code inspection. Mike Kravetz (2): mm: migration: fix migration of

[PATCH v3 2/2] hugetlb: take PMD sharing into account when flushing tlb/caches

2018-08-21 Thread Mike Kravetz
range is flushed if huge_pmd_unshare succeeds and unmaps a PUD_SIZE area. Signed-off-by: Mike Kravetz --- mm/hugetlb.c | 53 +++- 1 file changed, 44 insertions(+), 9 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index fd155dc52117

Re: [PATCH] mm: migration: fix migration of huge PMD shared pages

2018-08-14 Thread Mike Kravetz
On 08/14/2018 01:48 AM, Kirill A. Shutemov wrote: > On Mon, Aug 13, 2018 at 11:21:41PM +0000, Mike Kravetz wrote: >> On 08/13/2018 03:58 AM, Kirill A. Shutemov wrote: >>> On Sun, Aug 12, 2018 at 08:41:08PM -0700, Mike Kravetz wrote: >>>> I am not 100% sure on the req

[PATCH v2] mm: migration: fix migration of huge PMD shared pages

2018-08-13 Thread Mike Kravetz
dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz --- v2: Fixed build issue for !CONFIG_HUGETLB_PAGE and typos in comment include/linux/hugetlb.h | 6 ++ mm/rmap.c | 21 + 2 files changed, 27 insertions(+) diff --git a/inclu

Re: [PATCH] mm: migration: fix migration of huge PMD shared pages

2018-08-13 Thread Mike Kravetz
On 08/13/2018 03:58 AM, Kirill A. Shutemov wrote: > On Sun, Aug 12, 2018 at 08:41:08PM -0700, Mike Kravetz wrote: >> The page migration code employs try_to_unmap() to try and unmap the >> source page. This is accomplished by using rmap_walk to find all >> vmas where the

[PATCH] mm: migration: fix migration of huge PMD shared pages

2018-08-12 Thread Mike Kravetz
huge_pmd_unshare for hugetlbfs huge pages. If it is a shared mapping it will be 'unshared' which removes the page table entry and drops reference on PMD page. After this, flush caches and TLB. Signed-off-by: Mike Kravetz --- I am not 100% sure on the required flushing, so suggestions would be a

Re: [PATCH v4 06/11] hugetlb: Introduce generic version of huge_pte_none

2018-07-26 Thread Mike Kravetz
On 07/05/2018 04:07 AM, Alexandre Ghiti wrote: > arm, arm64, ia64, parisc, powerpc, sh, sparc, x86 architectures > use the same version of huge_pte_none, so move this generic > implementation into asm-generic/hugetlb.h. > Reviewed-by: Mike Kravetz -- Mike Kravetz > Signed-of

Re: [PATCH v2 1/2] mm: fix race on soft-offlining free huge pages

2018-07-17 Thread Mike Kravetz
On 07/17/2018 06:28 PM, Naoya Horiguchi wrote: > On Tue, Jul 17, 2018 at 01:10:39PM -0700, Mike Kravetz wrote: >> It seems that soft_offline_free_page can be called for in use pages. >> Certainly, that is the case in the first workflow above. With the >> suggested changes, I

Re: [PATCH v2 1/2] mm: fix race on soft-offlining free huge pages

2018-07-17 Thread Mike Kravetz
workflow above. With the suggested changes, I think this is OK for huge pages. However, it seems that setting HWPoison on an in-use non-huge page could cause issues? While looking at the code, I noticed this comment in __get_any_page() /* * When the target page is a free hugepage, just remove it * from free hugepage list. */ Did that apply to some code that was removed? It does not seem to make any sense in that routine. -- Mike Kravetz

Re: [PATCH v2] mm: hugetlb: don't zero 1GiB bootmem pages.

2018-07-11 Thread Mike Kravetz
unt of time as with 0, as opposed to the 25+ > minutes it would take before. > > Signed-off-by: Cannon Matthews Thanks, Acked-by: Mike Kravetz -- Mike Kravetz > --- > v2: removed the memset of the huge_bootmem_page area and added > INIT_LIST_HEAD instead. > > mm/hugetlb.c

Re: [PATCH] mm/hugetlb: remove gigantic page support for HIGHMEM

2018-07-11 Thread Mike Kravetz
On 07/11/2018 01:57 PM, Davidlohr Bueso wrote: > On Wed, 11 Jul 2018, Mike Kravetz wrote: > >> This reverts commit ee8f248d266e ("hugetlb: add phys addr to struct >> huge_bootmem_page") >> >> At one time powerpc used this field and supporting code.

[PATCH] mm/hugetlb: remove gigantic page support for HIGHMEM

2018-07-11 Thread Mike Kravetz
line"). There are no users of this field and supporting code, so remove it. Signed-off-by: Mike Kravetz --- include/linux/hugetlb.h | 3 --- mm/hugetlb.c| 9 + 2 files changed, 1 insertion(+), 11 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.

Re: [PATCH] mm: hugetlb: don't zero 1GiB bootmem pages.

2018-07-11 Thread Mike Kravetz
add. We do >> initialize the rest explicitly already. > > Forgot to mention that after that is addressed you can add > Acked-by: Michal Hocko Cannon, How about if you make this change suggested by Michal, and I will submit a separate patch to revert the patch which added the phys field to huge_bootmem_page structure. FWIW, Reviewed-by: Mike Kravetz -- Mike Kravetz

Re: [PATCH] mm: hugetlb: don't zero 1GiB bootmem pages.

2018-07-10 Thread Mike Kravetz
n't find any code that sets phys. Although, it is potentially used in gather_bootmem_prealloc(). It appears powerpc used this field at one time, but no longer does. Am I missing something? Not an issue with this patch, rather existing code. I'd prefer not to do the memset(

Re: [PATCH] mm: hugetlb: yield when prepping struct pages

2018-06-27 Thread Mike Kravetz
off-by: Cannon Matthews My only suggestion would be to remove the mention of 2M pages in the commit message. Thanks for adding this. Reviewed-by: Mike Kravetz -- Mike Kravetz > --- > mm/hugetlb.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c

Re: [PATCH] userfaultfd: hugetlbfs: Fix userfaultfd_huge_must_wait pte access

2018-06-26 Thread Mike Kravetz
tfd is/can be enabled for impacted architectures. Reviewed-by: Mike Kravetz -- Mike Kravetz > --- > fs/userfaultfd.c | 12 +++- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 123bf7d516fc..594d192b2331 1006

Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map

2018-05-31 Thread Mike Kravetz
ection. Unlike the case in ad55eac74f20, the access permissions on section 1 (RE) are different than section 2 (RW). If we allowed the previous MAP_FIXED behavior, we would be changing part of a read only section to read write. This is exactly what MAP_FIXED_NOREPLACE was designed to prevent. -- Mike Kravetz

Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map

2018-05-30 Thread Mike Kravetz
nmapped the overlapping page. However, this does not seem right. -- Mike Kravetz

Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map

2018-05-30 Thread Mike Kravetz
On 05/30/2018 01:02 AM, Michal Hocko wrote: > On Tue 29-05-18 15:21:14, Mike Kravetz wrote: >> Just a quick heads up. I noticed a change in libhugetlbfs testing starting >> with v4.17-rc1. >> >> V4.16 libhugetlbfs test results >> ** TEST SUM

Re: [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map

2018-05-29 Thread Mike Kravetz
libhugetlbfs tests for an unrelated issue/change and, will do some analysis to see exactly what is happening. Also, will take it upon myself to run libhugetlbfs test suite on a regular (at least weekly) basis. -- Mike Kravetz

Re: [patch] mm, hugetlb_cgroup: suppress SIGBUS when hugetlb_cgroup charge fails

2018-05-29 Thread Mike Kravetz
ssh_to_dbg # sudo ./test_mmap 4 mapping 4 huge pages address 7f62bba0 read (-) address 7f62bbc0 read (-) Connection to dbg closed by remote host. Connection to dbf closed. OOM did kick in (lots of console/log output) and killed the shell as well. -- Mike Kravetz

Re: [PATCH -V2 -mm 4/4] mm, hugetlbfs: Pass fault address to cow handler

2018-05-24 Thread Mike Kravetz
ge after the page fault under heavy cache contention. > > Signed-off-by: "Huang, Ying" Reviewed-by: Mike Kravetz -- Mike Kravetz > Cc: Mike Kravetz > Cc: Michal Hocko > Cc: David Rientjes > Cc: Andrea Arcangeli > Cc: "Kirill A. Shutemov" > Cc: Andi

Re: [PATCH -V2 -mm 3/4] mm, hugetlbfs: Rename address to haddr in hugetlb_cow()

2018-05-24 Thread Mike Kravetz
> the "address" is renamed to "haddr" in hugetlb_cow() in this patch. > Next patch will use target subpage address in hugetlb_cow() too. > > The patch is just code cleanup without any functionality changes. > > Signed-off-by: "Huang, Ying" > Suggested-

Re: [PATCH -V2 -mm 2/4] mm, huge page: Copy target sub-page last when copy huge page

2018-05-24 Thread Mike Kravetz
ry area from the begin to the end, so > cause copy on write. For each child process, other child processes > could be seen as other workloads which generate heavy cache pressure. > At the same time, the IPC (instruction per cycle) increased from 0.63 > to 0.78, and the time spent in user sp

Re: [PATCH -V2 -mm 1/4] mm, clear_huge_page: Move order algorithm into a separate function

2018-05-24 Thread Mike Kravetz
er inline support of the compilers, > the indirect call will be optimized to be the direct call. Our tests > show no performance change with the patch. > > This patch is a code cleanup without functionality change. > > Signed-off-by: "Huang, Ying" > Suggested-by:

Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg

2018-05-24 Thread Mike Kravetz
allocator. I really do not think hugetlbfs overcommit will provide any benefit over THP for your use case. Also, new user space code is required to "fall back" to normal pages in the case of hugetlbfs page allocation failure. This is not needed in the THP case. -- Mike Kravetz

Re: [PATCH v2 3/4] mm: add find_alloc_contig_pages() interface

2018-05-22 Thread Mike Kravetz
On 05/22/2018 09:41 AM, Reinette Chatre wrote: > On 5/21/2018 4:48 PM, Mike Kravetz wrote: >> On 05/21/2018 01:54 AM, Vlastimil Babka wrote: >>> On 05/04/2018 01:29 AM, Mike Kravetz wrote: >>>> +/** >>>> + * find_alloc_contig_pages() --

Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg

2018-05-22 Thread Mike Kravetz
ages are not accounted for when they are allocated as 'reserves'. It is not until these reserves are actually used that accounting limits are checked. This 'seems' to align with general allocation of huge pages within the pool. No accounting is done until they are actually allocated to a mapping/file. -- Mike Kravetz

Re: [PATCH v2 0/4] Interface for higher order contiguous allocations

2018-05-21 Thread Mike Kravetz
On 05/21/2018 05:00 AM, Vlastimil Babka wrote: > On 05/04/2018 01:29 AM, Mike Kravetz wrote: >> Vlastimil and Michal brought up the issue of allocation alignment. The >> routine will currently align to 'nr_pages' (which is the requested size >> argument). It does

Re: [PATCH v2 3/4] mm: add find_alloc_contig_pages() interface

2018-05-21 Thread Mike Kravetz
On 05/21/2018 01:54 AM, Vlastimil Babka wrote: > On 05/04/2018 01:29 AM, Mike Kravetz wrote: >> find_alloc_contig_pages() is a new interface that attempts to locate >> and allocate a contiguous range of pages. It is provided as a more > > How about dropping the 'fin

Re: [PATCH v2 2/4] mm: check for proper migrate type during isolation

2018-05-21 Thread Mike Kravetz
On 05/18/2018 03:32 AM, Vlastimil Babka wrote: > On 05/04/2018 01:29 AM, Mike Kravetz wrote: >> The routine start_isolate_page_range and alloc_contig_range have >> comments saying that migratetype must be either MIGRATE_MOVABLE or >> MIGRATE_CMA. However, this is not enforc

Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg

2018-05-21 Thread Mike Kravetz
On 05/17/2018 09:27 PM, TSUKADA Koutaro wrote: > Thanks to Mike Kravetz for comment on the previous version patch. > > The purpose of this patch-set is to make it possible to control whether or > not to charge surplus hugetlb pages obtained by overcommitting to memory > cgroup. I

[PATCH] MAINTAINERS: Change hugetlbfs maintainer and update files

2018-05-18 Thread Mike Kravetz
. hugetlb.c and hugetlb.h are not 100% hugetlbfs, but a majority of their content is hugetlbfs related. Signed-off-by: Mike Kravetz --- MAINTAINERS | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 9051a9ca24a2..c7a5eb074eb1 100644 --- a

Re: [PATCH v2 1/4] mm: change type of free_contig_range(nr_pages) to unsigned long

2018-05-18 Thread Mike Kravetz
On 05/18/2018 02:12 AM, Vlastimil Babka wrote: > On 05/04/2018 01:29 AM, Mike Kravetz wrote: >> free_contig_range() is currently defined as: >> void free_contig_range(unsigned long pfn, unsigned nr_pages); >> change to, >> void free_contig_range(unsigned long

Re: [PATCH -mm] mm, huge page: Copy to access sub-page last when copy huge page

2018-05-18 Thread Mike Kravetz
r in which sub-pages are to be copied. IIUC, you added the same algorithm for sub-page ordering to copy_huge_page() that was previously added to clear_huge_page(). Correct? If so, then perhaps a common helper could be used by both the clear and copy huge page routines. It would also make maintenance easier. -- Mike Kravetz
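A common helper of the kind suggested above could be sketched in user-space C as follows. This is only an illustration under stated assumptions: the names `for_each_subpage_target_last`, `subpage_fn`, `order_log` and `log_subpage` are hypothetical and are not taken from the patch or from the kernel. The sketch walks the sub-pages from the ends of the huge page inward, so the sub-page that will be accessed is handled last, and the per-sub-page work (clear or copy) is passed in as a callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: performs the clear or copy of one sub-page. */
typedef void (*subpage_fn)(size_t idx, void *arg);

/*
 * Visit every sub-page index in [0, nr) exactly once, taking the index
 * farthest from 'target' at each step, and visit 'target' itself last.
 * Illustrative helper only; not the kernel's implementation.
 */
static void for_each_subpage_target_last(size_t nr, size_t target,
                                         subpage_fn fn, void *arg)
{
    size_t lo, hi;

    assert(nr > 0 && target < nr);
    lo = 0;
    hi = nr - 1;

    while (lo < target || hi > target) {
        if (target - lo >= hi - target)
            fn(lo++, arg);   /* left end is at least as far away */
        else
            fn(hi--, arg);   /* right end is farther away */
    }
    fn(target, arg);         /* the accessed sub-page goes last */
}

/* Small recorder used only to observe the visiting order. */
struct order_log {
    size_t idx[64];
    size_t n;
};

static void log_subpage(size_t idx, void *arg)
{
    struct order_log *log = arg;

    log->idx[log->n++] = idx;
}
```

With this shape, a clear routine and a copy routine would differ only in the callback they pass, which is the maintenance benefit the reply points at.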

Re: [PATCH -V2 -mm] mm, hugetlbfs: Pass fault address to no page handler

2018-05-17 Thread Mike Kravetz
oad which generates heavy cache > pressure. At the same time, the cache miss rate reduced from ~36.3% > to ~25.6%, the IPC (instruction per cycle) increased from 0.3 to 0.37, > and the time spent in user space is reduced ~19.3%. > Agree with Michal that commit message looks better.

Re: [PATCH -mm] mm, hugetlb: Pass fault address to no page handler

2018-05-16 Thread Mike Kravetz
that another area to consider? That gets back to Michal's question of a specific use case or generic optimization. Unless code is simple (as in this patch), seems like we should hold off on considering additional optimizations unless there is a specific use case. I'm still OK with this change. -- Mike Kravetz

Re: [PATCH -mm] mm, hugetlb: Pass fault address to no page handler

2018-05-14 Thread Mike Kravetz
to ~25.6%, the > IPC (instruction per cycle) increased from 0.3 to 0.37, and the time > spent in user space is reduced ~19.3% Since this patch only addresses hugetlbfs huge pages, I would suggest making that more explicit in the commit message. Other than that, the changes look fine to me.

Re: [PATCH] selftests: memfd: split regular and hugetlbfs tests

2018-05-11 Thread Mike Kravetz
DONE > ok 1..2 selftests: memfd: run_fuse_test.sh [PASS] > selftests: memfd: run_hugetlbfs_test.sh > > Please run memfd with hugetlbfs test as root > not ok 1..3 selftests: memfd: run_hugetlbfs_test.sh [SKIP] > > Signed-off-by: Shuah Khan (Samsung OSG) Thanks for all your

Re: [PATCH v2 21/24] selftests: memfd: return Kselftest Skip code for skipped tests

2018-05-07 Thread Mike Kravetz
n it. > > In addition, return skip code when not enough huge pages are available to > run the test. > > Kselftest framework SKIP code is 4 and the framework prints appropriate > messages to indicate that the test is skipped. > > Signed-off-by: Shuah Khan (Samsung OSG) Tha
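The skip convention described in this snippet can be sketched in C. The only value taken from the message is the SKIP code of 4; the enum names, the function `memfd_precheck_exit_code` and its parameters are hypothetical and chosen for illustration. The real memfd tests are driven by shell scripts, so this is a sketch of the decision logic rather than the patch itself.

```c
/* Exit codes for the sketch; 4 is the Kselftest SKIP code quoted above. */
enum {
    SKETCH_EXIT_PASS = 0,
    SKETCH_EXIT_SKIP = 4
};

/*
 * Decide whether a hugetlbfs-backed test can run.
 * Returns the SKIP code when the preconditions are not met:
 *   - the test is not running as root, or
 *   - fewer huge pages are available than the test needs.
 * All parameter names are assumptions made for this example.
 */
static int memfd_precheck_exit_code(int is_root, long free_hugepages,
                                    long needed_hugepages)
{
    if (!is_root)
        return SKETCH_EXIT_SKIP;
    if (free_hugepages < needed_hugepages)
        return SKETCH_EXIT_SKIP;
    return SKETCH_EXIT_PASS;
}
```

Returning a distinct code for "could not run" lets the framework report a skip instead of counting the test as a failure.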

Re: [PATCH 21/24] selftests: memfd: return Kselftest Skip code for skipped tests

2018-05-04 Thread Mike Kravetz
ksft_skip We now KNOW that we are running as root because of the check above. We can delete this test, and rely on the later check to determine if the number of huge pages was actually increased. How about this instead (untested)? Signed-off-by: Mike Kravetz diff --git a/tools/testing/selftests/

Re: [PATCH] memcg, hugetlb: pages allocated for hugetlb's overcommit will be charged to memcg

2018-05-03 Thread Mike Kravetz
On 05/03/2018 05:09 PM, TSUKADA Koutaro wrote: > On 2018/05/03 11:33, Mike Kravetz wrote: >> On 05/01/2018 11:54 PM, TSUKADA Koutaro wrote: >>> On 2018/05/02 13:41, Mike Kravetz wrote: >>>> What is the reason for not charging pages at allocation/reserve time? I

[PATCH v2 3/4] mm: add find_alloc_contig_pages() interface

2018-05-03 Thread Mike Kravetz
is employed if possible. There is no guarantee that the routine will succeed. So, the user must be prepared for failure and have a fall back plan. Signed-off-by: Mike Kravetz --- include/linux/gfp.h | 12 + mm/page_alloc.c | 136 +++- 2

[PATCH v2 1/4] mm: change type of free_contig_range(nr_pages) to unsigned long

2018-05-03 Thread Mike Kravetz
an unsigned int. However, this should be changed to an unsigned long to be consistent with other page counts. Signed-off-by: Mike Kravetz --- include/linux/gfp.h | 2 +- mm/cma.c | 2 +- mm/hugetlb.c | 2 +- mm/page_alloc.c | 6 +++--- 4 files changed, 6 insertions(+), 6
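The patch gives consistency with other page counts as its reason. As a side illustration only, and not something the message claims, the following C sketch shows what a narrower `unsigned int` parameter does to a count that does not fit in it. The function names are hypothetical, and the result depends on the platform's type widths: where `unsigned long` is wider than `unsigned int`, the high bits are silently dropped.

```c
#include <limits.h>

/*
 * Model a callee that declares its page count as 'unsigned int':
 * the value the callee actually sees after the implicit narrowing.
 */
static unsigned long pages_seen_as_uint(unsigned long nr_pages)
{
    unsigned int narrowed = (unsigned int)nr_pages; /* high bits dropped */

    return narrowed;
}

/* Model a callee that declares its page count as 'unsigned long'. */
static unsigned long pages_seen_as_ulong(unsigned long nr_pages)
{
    return nr_pages; /* value arrives unchanged */
}
```

For small counts both forms agree; they diverge only once the count exceeds `UINT_MAX`, which is possible only where `unsigned long` is the wider type.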

[PATCH v2 2/4] mm: check for proper migrate type during isolation

2018-05-03 Thread Mike Kravetz
as there are two primary users. Contiguous range allocation which wants to enforce migration type checking. Memory offline (hotplug) which is not concerned about type checking. Signed-off-by: Mike Kravetz --- include/linux/page-isolation.h | 8 +++- mm/memory_hotplug.c | 2

[PATCH v2 4/4] mm/hugetlb: use find_alloc_contig_pages() to allocate gigantic pages

2018-05-03 Thread Mike Kravetz
Use the new find_alloc_contig_pages() interface for the allocation of gigantic pages and remove associated code in hugetlb.c. Signed-off-by: Mike Kravetz --- mm/hugetlb.c | 87 +--- 1 file changed, 6 insertions(+), 81 deletions(-) diff
