Re: [PATCH v5 00/12] MADV_FREE support
On Sat, Feb 06, 2016 at 02:32:02PM +0100, Michael Kerrisk (man-pages) wrote:
> Hello Minchan,
>
> On 02/05/2016 03:15 AM, Minchan Kim wrote:
> > On Thu, Jan 28, 2016 at 08:16:25AM +0100, Michael Kerrisk (man-pages) wrote:
> >> Hello Minchan,
> >>
> >> On 11/30/2015 07:39 AM, Minchan Kim wrote:
> >>> In v4, Andrew wanted to settle in old basic MADV_FREE and introduces
> >>> new stuffs(ie, lazyfree LRU, swapless support and lazyfreeness) later
> >>> so this version doesn't include them.
> >>>
> >>> I have been tested it on mmotm-2015-11-25-17-08 with additional
> >>> patch[1] from Kirill to prevent BUG_ON which he didn't send to
> >>> linux-mm yet as formal patch. With it, I couldn't find any
> >>> problem so far.
> >>>
> >>> Note that this version is based on THP refcount redesign so
> >>> I needed some modification on MADV_FREE because split_huge_pmd
> >>> doesn't split a THP page any more and pmd_trans_huge(pmd) is not
> >>> enough to guarantee the page is not THP page.
> >>> As well, for MADV_FREE lazy-split, THP split should respect
> >>> pmd's dirtiness rather than marking ptes of all subpages dirty
> >>> unconditionally. Please, review last patch in this patchset.
> >>
> >> Now that MADV_FREE has been merged, would you be willing to write
> >> patch to the madvise(2) man page that describes the semantics,
> >> notes limitations and restrictions, and (ideally) has some sentences
> >> describing use cases?
> >>
> >
> > Hello Michael,
> >
> > Could you review this patch?
> >
> > Thanks.
> >
> >>From 203372f901f574e991215fdff6907608ba53f932 Mon Sep 17 00:00:00 2001
> > From: Minchan Kim
> > Date: Fri, 5 Feb 2016 11:09:54 +0900
> > Subject: [PATCH] madvise.2: Add MADV_FREE
> >
> > Document the MADV_FREE flags added to madvise() in Linux 4.5
> >
> > Signed-off-by: Minchan Kim
> > ---
> >  man2/madvise.2 | 19 +++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/man2/madvise.2 b/man2/madvise.2
> > index c1df67c..4704304 100644
> > --- a/man2/madvise.2
> > +++ b/man2/madvise.2
> > @@ -143,6 +143,25 @@ flag are special memory areas that are not managed
> >  by the virtual memory subsystem.
> >  Such pages are typically created by device drivers that
> >  map the pages into user space.)
> > +.TP
> > +.B MADV_FREE " (since Linux 4.5)"
> > +Application is finished with the given range, so kernel can free
> > +resources associated with it but the freeing could be delayed until
> > +memory pressure happens or canceled by write operation by user.
> > +
> > +After a successful MADV_FREE operation, user shouldn't expect kernel
> > +keeps stale data on the page. However, subsequent write of pages
> > +in the range will succeed and then kernel cannot free those dirtied pages
> > +so user can always see just written data. If there was no subsequent
> > +write, kernel can free those clean pages any time. In such case,
> > +user can see zero-fill-on-demand pages.
> > +
> > +Note that, it works only with private anonymous pages (see
> > +.BR mmap (2)).
> > +On swapless system, freeing pages in given range happens instantly
> > +regardless of memory pressure.
> > +
> > +
> >  .\"
> >  .\" ==
> >  .\"
> >
>
> Thanks for the nice text! I reworked somewhat, trying to fill out a
> few details about how I understand things work, but I may have introduced
> errors, so I would be happy if you would check the following text:

Below looks good to me.
Thanks, Michael

>
>     MADV_FREE (since Linux 4.5)
>         The application no longer requires the pages in the
>         range specified by addr and len. The kernel can thus
>         free these pages, but the freeing could be delayed until
>         memory pressure occurs. For each of the pages that has
>         been marked to be freed but has not yet been freed, the
>         free operation will be canceled if the caller writes
>         into the page. After a successful MADV_FREE operation,
>         any stale data (i.e., dirty, unwritten pages) will be
>         lost when the kernel frees the pages. However,
>         subsequent writes to pages in the range will succeed and then
>         kernel cannot free those dirtied pages, so that the
>         caller can always see just written data. If there is no
>         subsequent write, the kernel can free the pages at any
>         time. Once pages in the range have been freed, the
>         caller will see zero-fill-on-demand pages upon
>         subsequent page references.
>
>         The MADV_FREE operation can be applied only to private
>         anonymous pages (see mmap(2)). On a swapless system,
>         freeing pages in a given range happens instantly,
>         regardless of memory pressure.
>
> Thanks,
>
> Michael
>
> --
> Michael
Re: [PATCH v5 00/12] MADV_FREE support
Hello Minchan,

On 02/05/2016 03:15 AM, Minchan Kim wrote:
> On Thu, Jan 28, 2016 at 08:16:25AM +0100, Michael Kerrisk (man-pages) wrote:
>> Hello Minchan,
>>
>> On 11/30/2015 07:39 AM, Minchan Kim wrote:
>>> In v4, Andrew wanted to settle in old basic MADV_FREE and introduces
>>> new stuffs(ie, lazyfree LRU, swapless support and lazyfreeness) later
>>> so this version doesn't include them.
>>>
>>> I have been tested it on mmotm-2015-11-25-17-08 with additional
>>> patch[1] from Kirill to prevent BUG_ON which he didn't send to
>>> linux-mm yet as formal patch. With it, I couldn't find any
>>> problem so far.
>>>
>>> Note that this version is based on THP refcount redesign so
>>> I needed some modification on MADV_FREE because split_huge_pmd
>>> doesn't split a THP page any more and pmd_trans_huge(pmd) is not
>>> enough to guarantee the page is not THP page.
>>> As well, for MADV_FREE lazy-split, THP split should respect
>>> pmd's dirtiness rather than marking ptes of all subpages dirty
>>> unconditionally. Please, review last patch in this patchset.
>>
>> Now that MADV_FREE has been merged, would you be willing to write
>> patch to the madvise(2) man page that describes the semantics,
>> notes limitations and restrictions, and (ideally) has some sentences
>> describing use cases?
>>
>
> Hello Michael,
>
> Could you review this patch?
>
> Thanks.
>
>>From 203372f901f574e991215fdff6907608ba53f932 Mon Sep 17 00:00:00 2001
> From: Minchan Kim
> Date: Fri, 5 Feb 2016 11:09:54 +0900
> Subject: [PATCH] madvise.2: Add MADV_FREE
>
> Document the MADV_FREE flags added to madvise() in Linux 4.5
>
> Signed-off-by: Minchan Kim
> ---
>  man2/madvise.2 | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/man2/madvise.2 b/man2/madvise.2
> index c1df67c..4704304 100644
> --- a/man2/madvise.2
> +++ b/man2/madvise.2
> @@ -143,6 +143,25 @@ flag are special memory areas that are not managed
>  by the virtual memory subsystem.
>  Such pages are typically created by device drivers that
>  map the pages into user space.)
> +.TP
> +.B MADV_FREE " (since Linux 4.5)"
> +Application is finished with the given range, so kernel can free
> +resources associated with it but the freeing could be delayed until
> +memory pressure happens or canceled by write operation by user.
> +
> +After a successful MADV_FREE operation, user shouldn't expect kernel
> +keeps stale data on the page. However, subsequent write of pages
> +in the range will succeed and then kernel cannot free those dirtied pages
> +so user can always see just written data. If there was no subsequent
> +write, kernel can free those clean pages any time. In such case,
> +user can see zero-fill-on-demand pages.
> +
> +Note that, it works only with private anonymous pages (see
> +.BR mmap (2)).
> +On swapless system, freeing pages in given range happens instantly
> +regardless of memory pressure.
> +
> +
>  .\"
>  .\" ==
>  .\"
>

Thanks for the nice text! I reworked somewhat, trying to fill out a
few details about how I understand things work, but I may have introduced
errors, so I would be happy if you would check the following text:

    MADV_FREE (since Linux 4.5)
        The application no longer requires the pages in the
        range specified by addr and len. The kernel can thus
        free these pages, but the freeing could be delayed until
        memory pressure occurs. For each of the pages that has
        been marked to be freed but has not yet been freed, the
        free operation will be canceled if the caller writes
        into the page. After a successful MADV_FREE operation,
        any stale data (i.e., dirty, unwritten pages) will be
        lost when the kernel frees the pages. However,
        subsequent writes to pages in the range will succeed and then
        kernel cannot free those dirtied pages, so that the
        caller can always see just written data. If there is no
        subsequent write, the kernel can free the pages at any
        time. Once pages in the range have been freed, the
        caller will see zero-fill-on-demand pages upon
        subsequent page references.

        The MADV_FREE operation can be applied only to private
        anonymous pages (see mmap(2)). On a swapless system,
        freeing pages in a given range happens instantly,
        regardless of memory pressure.

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
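
[Editor's illustration, not part of the original thread] The semantics described in the proposed man-page text can be exercised with a small C program. This is a minimal sketch under stated assumptions: the helper name madv_free_demo() is hypothetical, the fallback value 8 for MADV_FREE is an assumption for userspace headers that predate Linux 4.5, and the kernel may reject the call entirely on older systems. It maps private anonymous memory, dirties it, applies MADV_FREE, and then checks the two behaviors the text promises: a read sees either the old contents or a zero-filled page, and a subsequent write is always visible afterwards.

```c
#define _DEFAULT_SOURCE /* exposes madvise() and MAP_ANONYMOUS in glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
#define MADV_FREE 8 /* assumption: fallback for headers older than Linux 4.5 */
#endif

/*
 * Hypothetical helper, not taken from the patch set.
 * Returns 0 on success, -1 if setup fails, and -2 if the kernel
 * rejects MADV_FREE (for example, a kernel older than 4.5).
 */
int madv_free_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page * 4;

    /* MADV_FREE can be applied only to private anonymous pages. */
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memset(buf, 0xAA, len); /* dirty every page in the range */

    /* Tell the kernel the contents are no longer required. */
    if (madvise(buf, len, MADV_FREE) != 0) {
        munmap(buf, len);
        return -2;
    }

    volatile unsigned char *p = buf;

    /* Before any new write, a page holds either its old data (not yet
     * freed) or zeros (already freed and zero-filled on demand). */
    unsigned char seen = p[0];
    assert(seen == 0xAA || seen == 0x00);

    /* Writing to the page cancels the pending free for that page,
     * so the data just written must remain visible. */
    p[0] = 0x55;
    assert(p[0] == 0x55);

    munmap(buf, len);
    return 0;
}
```

Which of the two values the read observes depends on memory pressure and, per the text above, on whether the system has swap, so a caller should not rely on either outcome.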
Re: [PATCH v5 00/12] MADV_FREE support
On Thu, Jan 28, 2016 at 08:16:25AM +0100, Michael Kerrisk (man-pages) wrote:
> Hello Minchan,
>
> On 11/30/2015 07:39 AM, Minchan Kim wrote:
> > In v4, Andrew wanted to settle in old basic MADV_FREE and introduces
> > new stuffs(ie, lazyfree LRU, swapless support and lazyfreeness) later
> > so this version doesn't include them.
> >
> > I have been tested it on mmotm-2015-11-25-17-08 with additional
> > patch[1] from Kirill to prevent BUG_ON which he didn't send to
> > linux-mm yet as formal patch. With it, I couldn't find any
> > problem so far.
> >
> > Note that this version is based on THP refcount redesign so
> > I needed some modification on MADV_FREE because split_huge_pmd
> > doesn't split a THP page any more and pmd_trans_huge(pmd) is not
> > enough to guarantee the page is not THP page.
> > As well, for MADV_FREE lazy-split, THP split should respect
> > pmd's dirtiness rather than marking ptes of all subpages dirty
> > unconditionally. Please, review last patch in this patchset.
>
> Now that MADV_FREE has been merged, would you be willing to write
> patch to the madvise(2) man page that describes the semantics,
> notes limitations and restrictions, and (ideally) has some sentences
> describing use cases?
>

Hello Michael,

Could you review this patch?

Thanks.

>From 203372f901f574e991215fdff6907608ba53f932 Mon Sep 17 00:00:00 2001
From: Minchan Kim
Date: Fri, 5 Feb 2016 11:09:54 +0900
Subject: [PATCH] madvise.2: Add MADV_FREE

Document the MADV_FREE flags added to madvise() in Linux 4.5

Signed-off-by: Minchan Kim
---
 man2/madvise.2 | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/man2/madvise.2 b/man2/madvise.2
index c1df67c..4704304 100644
--- a/man2/madvise.2
+++ b/man2/madvise.2
@@ -143,6 +143,25 @@ flag are special memory areas that are not managed
 by the virtual memory subsystem.
 Such pages are typically created by device drivers that
 map the pages into user space.)
+.TP
+.B MADV_FREE " (since Linux 4.5)"
+Application is finished with the given range, so kernel can free
+resources associated with it but the freeing could be delayed until
+memory pressure happens or canceled by write operation by user.
+
+After a successful MADV_FREE operation, user shouldn't expect kernel
+keeps stale data on the page. However, subsequent write of pages
+in the range will succeed and then kernel cannot free those dirtied pages
+so user can always see just written data. If there was no subsequent
+write, kernel can free those clean pages any time. In such case,
+user can see zero-fill-on-demand pages.
+
+Note that, it works only with private anonymous pages (see
+.BR mmap (2)).
+On swapless system, freeing pages in given range happens instantly
+regardless of memory pressure.
+
+
 .\"
 .\" ==
 .\"
-- 
1.9.1
Re: [PATCH v5 00/12] MADV_FREE support
Hello Michael,

On Thu, Jan 28, 2016 at 08:16:25AM +0100, Michael Kerrisk (man-pages) wrote:
> Hello Minchan,
>
> On 11/30/2015 07:39 AM, Minchan Kim wrote:
> > In v4, Andrew wanted to settle in old basic MADV_FREE and introduces
> > new stuffs(ie, lazyfree LRU, swapless support and lazyfreeness) later
> > so this version doesn't include them.
> >
> > I have been tested it on mmotm-2015-11-25-17-08 with additional
> > patch[1] from Kirill to prevent BUG_ON which he didn't send to
> > linux-mm yet as formal patch. With it, I couldn't find any
> > problem so far.
> >
> > Note that this version is based on THP refcount redesign so
> > I needed some modification on MADV_FREE because split_huge_pmd
> > doesn't split a THP page any more and pmd_trans_huge(pmd) is not
> > enough to guarantee the page is not THP page.
> > As well, for MADV_FREE lazy-split, THP split should respect
> > pmd's dirtiness rather than marking ptes of all subpages dirty
> > unconditionally. Please, review last patch in this patchset.
>
> Now that MADV_FREE has been merged, would you be willing to write
> patch to the madvise(2) man page that describes the semantics,
> notes limitations and restrictions, and (ideally) has some sentences
> describing use cases?

I will try next week.
Thanks for the heads up.

>
> Thanks,
>
> Michael
>
>
> > mm: don't split THP page when syscall is called
> >
> > [1] https://lkml.org/lkml/2015/11/17/134
> >
> > git: git://git.kernel.org/pub/scm/linux/kernel/git/minchan/linux.git
> > branch: mm/madv_free-v4.4-rc2-mmotm-2015-11-25-17-08-v5r2
> >
> > In this stage, I don't think we need to write man page.
> > It could be done after solid policy and implementation.
> >
> > * Change from v4
> >   * drop lazyfree LRU
> >   * drop swapless support
> >   * drop lazyfreeness
> >   * rebase on recent mmotm with THP refcount redesign
> >
> > * Change from v3
> >   * some bug fix
> >   * code refactoring
> >   * lazyfree reclaim logic change
> >   * reordering patch
> >
> > * Change from v2
> >   * vm_lazyfreeness tuning knob
> >   * add new LRU list - Johannes, Shaohua
> >   * support swapless - Johannes
> >
> > * Change from v1
> >   * Don't do unnecessary TLB flush - Shaohua
> >   * Added Acked-by - Hugh, Michal
> >   * Merge deactivate_page and deactivate_file_page
> >   * Add pmd_dirty/pmd_mkclean patches for several arches
> >   * Add lazy THP split patch
> >   * Drop zhangyan...@cn.fujitsu.com - Delivery Failure
> >
> > Chen Gang (1):
> >   arch: uapi: asm: mman.h: Let MADV_FREE have same value for all
> >     architectures
> >
> > Minchan Kim (11):
> >   mm: support madvise(MADV_FREE)
> >   mm: define MADV_FREE for some arches
> >   mm: free swp_entry in madvise_free
> >   mm: move lazily freed pages to inactive list
> >   mm: mark stable page dirty in KSM
> >   x86: add pmd_[dirty|mkclean] for THP
> >   sparc: add pmd_[dirty|mkclean] for THP
> >   powerpc: add pmd_[dirty|mkclean] for THP
> >   arm: add pmd_mkclean for THP
> >   arm64: add pmd_mkclean for THP
> >   mm: don't split THP page when syscall is called
> >
> >  arch/alpha/include/uapi/asm/mman.h       |   2 +
> >  arch/arm/include/asm/pgtable-3level.h    |   1 +
> >  arch/arm64/include/asm/pgtable.h         |   1 +
> >  arch/mips/include/uapi/asm/mman.h        |   2 +
> >  arch/parisc/include/uapi/asm/mman.h      |   2 +
> >  arch/powerpc/include/asm/pgtable-ppc64.h |   2 +
> >  arch/sparc/include/asm/pgtable_64.h      |   9 ++
> >  arch/x86/include/asm/pgtable.h           |   5 +
> >  arch/xtensa/include/uapi/asm/mman.h      |   2 +
> >  include/linux/huge_mm.h                  |   3 +
> >  include/linux/rmap.h                     |   1 +
> >  include/linux/swap.h                     |   1 +
> >  include/linux/vm_event_item.h            |   1 +
> >  include/uapi/asm-generic/mman-common.h   |   1 +
> >  mm/huge_memory.c                         |  87 +-
> >  mm/ksm.c                                 |   6 +
> >  mm/madvise.c                             | 199 +++
> >  mm/rmap.c                                |   8 ++
> >  mm/swap.c                                |  44 +++
> >  mm/swap_state.c                          |   5 +-
> >  mm/vmscan.c                              |  10 +-
> >  mm/vmstat.c                              |   1 +
> >  22 files changed, 383 insertions(+), 10 deletions(-)
> >
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
Re: [PATCH v5 00/12] MADV_FREE support
Hello Minchan,

On 11/30/2015 07:39 AM, Minchan Kim wrote:
> In v4, Andrew wanted to settle on the old basic MADV_FREE first and
> introduce the new stuff (i.e., lazyfree LRU, swapless support and
> lazyfreeness) later, so this version doesn't include them.
>
> I have tested it on mmotm-2015-11-25-17-08 with an additional
> patch[1] from Kirill to prevent a BUG_ON, which he hasn't sent to
> linux-mm yet as a formal patch. With it, I couldn't find any
> problem so far.
>
> Note that this version is based on the THP refcount redesign, so
> I needed some modifications to MADV_FREE because split_huge_pmd
> doesn't split a THP page any more and pmd_trans_huge(pmd) is not
> enough to guarantee the page is not a THP page.
> As well, for MADV_FREE lazy-split, THP split should respect the
> pmd's dirtiness rather than marking ptes of all subpages dirty
> unconditionally. Please review the last patch in this patchset.

Now that MADV_FREE has been merged, would you be willing to write
a patch to the madvise(2) man page that describes the semantics,
notes limitations and restrictions, and (ideally) has some sentences
describing use cases?

Thanks,

Michael

> mm: don't split THP page when syscall is called
>
> [1] https://lkml.org/lkml/2015/11/17/134
>
> git: git://git.kernel.org/pub/scm/linux/kernel/git/minchan/linux.git
> branch: mm/madv_free-v4.4-rc2-mmotm-2015-11-25-17-08-v5r2
>
> At this stage, I don't think we need to write a man page.
> It could be done once the policy and implementation are solid.
>
> * Change from v4
>   * drop lazyfree LRU
>   * drop swapless support
>   * drop lazyfreeness
>   * rebase on recent mmotm with THP refcount redesign
>
> * Change from v3
>   * some bug fix
>   * code refactoring
>   * lazyfree reclaim logic change
>   * reordering patch
>
> * Change from v2
>   * vm_lazyfreeness tuning knob
>   * add new LRU list - Johannes, Shaohua
>   * support swapless - Johannes
>
> * Change from v1
>   * Don't do unnecessary TLB flush - Shaohua
>   * Added Acked-by - Hugh, Michal
>   * Merge deactivate_page and deactivate_file_page
>   * Add pmd_dirty/pmd_mkclean patches for several arches
>   * Add lazy THP split patch
>   * Drop zhangyan...@cn.fujitsu.com - Delivery Failure
>
> Chen Gang (1):
>   arch: uapi: asm: mman.h: Let MADV_FREE have same value for all
>     architectures
>
> Minchan Kim (11):
>   mm: support madvise(MADV_FREE)
>   mm: define MADV_FREE for some arches
>   mm: free swp_entry in madvise_free
>   mm: move lazily freed pages to inactive list
>   mm: mark stable page dirty in KSM
>   x86: add pmd_[dirty|mkclean] for THP
>   sparc: add pmd_[dirty|mkclean] for THP
>   powerpc: add pmd_[dirty|mkclean] for THP
>   arm: add pmd_mkclean for THP
>   arm64: add pmd_mkclean for THP
>   mm: don't split THP page when syscall is called
>
>  arch/alpha/include/uapi/asm/mman.h       |   2 +
>  arch/arm/include/asm/pgtable-3level.h    |   1 +
>  arch/arm64/include/asm/pgtable.h         |   1 +
>  arch/mips/include/uapi/asm/mman.h        |   2 +
>  arch/parisc/include/uapi/asm/mman.h      |   2 +
>  arch/powerpc/include/asm/pgtable-ppc64.h |   2 +
>  arch/sparc/include/asm/pgtable_64.h      |   9 ++
>  arch/x86/include/asm/pgtable.h           |   5 +
>  arch/xtensa/include/uapi/asm/mman.h      |   2 +
>  include/linux/huge_mm.h                  |   3 +
>  include/linux/rmap.h                     |   1 +
>  include/linux/swap.h                     |   1 +
>  include/linux/vm_event_item.h            |   1 +
>  include/uapi/asm-generic/mman-common.h   |   1 +
>  mm/huge_memory.c                         |  87 +-
>  mm/ksm.c                                 |   6 +
>  mm/madvise.c                             | 199 +++
>  mm/rmap.c                                |   8 ++
>  mm/swap.c                                |  44 +++
>  mm/swap_state.c                          |   5 +-
>  mm/vmscan.c                              |  10 +-
>  mm/vmstat.c                              |   1 +
>  22 files changed, 383 insertions(+), 10 deletions(-)
>

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
[PATCH v5 00/12] MADV_FREE support
In v4, Andrew wanted to settle on the old basic MADV_FREE first and
introduce the new stuff (i.e., lazyfree LRU, swapless support and
lazyfreeness) later, so this version doesn't include them.

I have tested it on mmotm-2015-11-25-17-08 with an additional
patch[1] from Kirill to prevent a BUG_ON, which he hasn't sent to
linux-mm yet as a formal patch. With it, I couldn't find any
problem so far.

Note that this version is based on the THP refcount redesign, so
I needed some modifications to MADV_FREE because split_huge_pmd
doesn't split a THP page any more and pmd_trans_huge(pmd) is not
enough to guarantee the page is not a THP page.
As well, for MADV_FREE lazy-split, THP split should respect the
pmd's dirtiness rather than marking ptes of all subpages dirty
unconditionally. Please review the last patch in this patchset.

  mm: don't split THP page when syscall is called

[1] https://lkml.org/lkml/2015/11/17/134

git: git://git.kernel.org/pub/scm/linux/kernel/git/minchan/linux.git
branch: mm/madv_free-v4.4-rc2-mmotm-2015-11-25-17-08-v5r2

At this stage, I don't think we need to write a man page.
It could be done once the policy and implementation are solid.

* Change from v4
  * drop lazyfree LRU
  * drop swapless support
  * drop lazyfreeness
  * rebase on recent mmotm with THP refcount redesign

* Change from v3
  * some bug fix
  * code refactoring
  * lazyfree reclaim logic change
  * reordering patch

* Change from v2
  * vm_lazyfreeness tuning knob
  * add new LRU list - Johannes, Shaohua
  * support swapless - Johannes

* Change from v1
  * Don't do unnecessary TLB flush - Shaohua
  * Added Acked-by - Hugh, Michal
  * Merge deactivate_page and deactivate_file_page
  * Add pmd_dirty/pmd_mkclean patches for several arches
  * Add lazy THP split patch
  * Drop zhangyan...@cn.fujitsu.com - Delivery Failure

Chen Gang (1):
  arch: uapi: asm: mman.h: Let MADV_FREE have same value for all
    architectures

Minchan Kim (11):
  mm: support madvise(MADV_FREE)
  mm: define MADV_FREE for some arches
  mm: free swp_entry in madvise_free
  mm: move lazily freed pages to inactive list
  mm: mark stable page dirty in KSM
  x86: add pmd_[dirty|mkclean] for THP
  sparc: add pmd_[dirty|mkclean] for THP
  powerpc: add pmd_[dirty|mkclean] for THP
  arm: add pmd_mkclean for THP
  arm64: add pmd_mkclean for THP
  mm: don't split THP page when syscall is called

 arch/alpha/include/uapi/asm/mman.h       |   2 +
 arch/arm/include/asm/pgtable-3level.h    |   1 +
 arch/arm64/include/asm/pgtable.h         |   1 +
 arch/mips/include/uapi/asm/mman.h        |   2 +
 arch/parisc/include/uapi/asm/mman.h      |   2 +
 arch/powerpc/include/asm/pgtable-ppc64.h |   2 +
 arch/sparc/include/asm/pgtable_64.h      |   9 ++
 arch/x86/include/asm/pgtable.h           |   5 +
 arch/xtensa/include/uapi/asm/mman.h      |   2 +
 include/linux/huge_mm.h                  |   3 +
 include/linux/rmap.h                     |   1 +
 include/linux/swap.h                     |   1 +
 include/linux/vm_event_item.h            |   1 +
 include/uapi/asm-generic/mman-common.h   |   1 +
 mm/huge_memory.c                         |  87 +-
 mm/ksm.c                                 |   6 +
 mm/madvise.c                             | 199 +++
 mm/rmap.c                                |   8 ++
 mm/swap.c                                |  44 +++
 mm/swap_state.c                          |   5 +-
 mm/vmscan.c                              |  10 +-
 mm/vmstat.c                              |   1 +
 22 files changed, 383 insertions(+), 10 deletions(-)

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
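
For readers who want to see the user-visible side of the series, here is a
minimal userspace sketch of the madvise(MADV_FREE) semantics discussed in
this thread: a write after the call cancels the lazy free for that page,
while an untouched page may later read back either its old contents or
zeroes. The helper name madv_free_demo(), the four-page buffer and the
0xAA/0x55 fill bytes are made up for illustration and are not part of the
patchset; the sketch also assumes that EINVAL from madvise() means the
running kernel lacks MADV_FREE.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS and madvise() on glibc */
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the patchset.
 * Returns 0 if the observed behaviour matches the described semantics,
 * 1 if MADV_FREE is unavailable (old headers or old kernel), -1 on error.
 */
int madv_free_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page * 4;

    /* MADV_FREE only applies to private anonymous memory. */
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memset(buf, 0xAA, len);     /* dirty every page */

#ifndef MADV_FREE
    munmap(buf, len);
    return 1;                   /* headers predate MADV_FREE */
#else
    if (madvise(buf, len, MADV_FREE) != 0) {
        /* Assumption: EINVAL here means the kernel has no MADV_FREE. */
        int rc = (errno == EINVAL) ? 1 : -1;
        munmap(buf, len);
        return rc;
    }

    /* Writing again cancels the pending free for the first page. */
    buf[0] = 0x55;
    int ok = (buf[0] == 0x55);

    /* A page we did not rewrite holds either stale data or zeroes. */
    unsigned char v = buf[page];
    ok = ok && (v == 0xAA || v == 0x00);

    munmap(buf, len);
    return ok ? 0 : -1;
#endif
}
```

A caller such as an allocator would typically issue the madvise() call when
returning a chunk to its free list and simply write to the chunk again on
reuse; whether the kernel reclaimed the pages in the meantime depends on
memory pressure.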