Re: [PATCH v3 0/4] Split page_type out from mapcount

2018-03-01 Thread Martin Schwidefsky
On Thu, 1 Mar 2018 06:50:58 -0800
Matthew Wilcox  wrote:

> On Thu, Mar 01, 2018 at 03:44:12PM +0300, Kirill A. Shutemov wrote:
> > On Thu, Mar 01, 2018 at 08:17:50AM +0100, Martin Schwidefsky wrote:  
> > > Yeah, that is a nasty bit of code. On s390 we have 2K page tables (pte)
> > > but 4K pages. If we use full pages for the pte tables we waste 2K of
> > > memory for each of the tables. So we allocate 4K and split it into two
> > > 2K pieces. Now we have to keep track of the pieces to be able to free
> > > them again.  
> > 
> > Have you considered to use slab for page table allocation instead?
> > IIRC some architectures practice this already.  
> 
> You're not allowed to do that any more.  Look at pgtable_page_ctor(),
> or rather ptlock_init().

Oh yes, I forgot about the ptl. This takes up some fields in struct page
which the slab/slub cache wants to use as well.
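
To illustrate the conflict described above, here is a minimal sketch of two
users sharing the same storage in a page descriptor. The struct and field
layout are simplified assumptions for illustration (toy_page, toy_lock and
toy_cache are hypothetical names), not the real struct page definition.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the page table lock and a slab cache. */
struct toy_lock  { int locked; };
struct toy_cache { const char *name; };

/* Simplified page descriptor: both users get the same word of storage. */
struct toy_page {
    unsigned long flags;
    union {
        struct toy_lock  *ptl;         /* wanted by the page table code */
        struct toy_cache *slab_cache;  /* wanted by the slab allocator */
    };
};

/* Returns 1 when the two fields occupy the same offset in the struct. */
static int fields_overlap(void)
{
    return offsetof(struct toy_page, ptl) ==
           offsetof(struct toy_page, slab_cache);
}
```

Because the two pointers overlap, a page cannot be both a slab-managed
object and a page table with its own lock at the same time in this model.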

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH v3 0/4] Split page_type out from mapcount

2018-03-01 Thread Matthew Wilcox
On Thu, Mar 01, 2018 at 03:44:12PM +0300, Kirill A. Shutemov wrote:
> On Thu, Mar 01, 2018 at 08:17:50AM +0100, Martin Schwidefsky wrote:
> > Yeah, that is a nasty bit of code. On s390 we have 2K page tables (pte)
> > but 4K pages. If we use full pages for the pte tables we waste 2K of
> > memory for each of the tables. So we allocate 4K and split it into two
> > 2K pieces. Now we have to keep track of the pieces to be able to free
> > them again.
> 
> Have you considered to use slab for page table allocation instead?
> IIRC some architectures practice this already.

You're not allowed to do that any more.  Look at pgtable_page_ctor(),
or rather ptlock_init().


Re: [PATCH v3 0/4] Split page_type out from mapcount

2018-03-01 Thread Martin Schwidefsky
On Thu, 1 Mar 2018 15:44:12 +0300
"Kirill A. Shutemov"  wrote:

> On Thu, Mar 01, 2018 at 08:17:50AM +0100, Martin Schwidefsky wrote:
> > On Wed, 28 Feb 2018 14:31:53 -0800
> > Matthew Wilcox  wrote:
> >   
> > > From: Matthew Wilcox 
> > > 
> > > I want to use the _mapcount field to record what a page is in use as.
> > > This can help with debugging and we can also expose that information to
> > > userspace through /proc/kpageflags to help diagnose memory usage (not
> > > included as part of this patch set).
> > > 
> > > First, we need s390 to stop using _mapcount for its own purposes;
> > > Martin, I hope you have time to look at this patch.  I must confess I
> > > don't quite understand what the different bits are used for in the upper
> > > nybble of the _mapcount, but I tried to replicate what you were doing
> > > faithfully.  
> > 
> > Yeah, that is a nasty bit of code. On s390 we have 2K page tables (pte)
> > but 4K pages. If we use full pages for the pte tables we waste 2K of
> > memory for each of the tables. So we allocate 4K and split it into two
> > 2K pieces. Now we have to keep track of the pieces to be able to free
> > them again.  
> 
> Have you considered to use slab for page table allocation instead?
> IIRC some architectures practice this already.

Well, there is a complication with KVM and the page table management for
gmaps. If mm_alloc_pgste(mm) == true then a 4K page table has to be
allocated. For the gmap I need a place to store an 8-byte value; currently
we use page->index. But the slab/slub code uses page->index for its own
purposes. This creates a conflict, but maybe doing a get_free_page for
mm_alloc_pgste(mm) == true and using a slab cache for normal page tables
might work.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH v3 0/4] Split page_type out from mapcount

2018-03-01 Thread Kirill A. Shutemov
On Thu, Mar 01, 2018 at 08:17:50AM +0100, Martin Schwidefsky wrote:
> On Wed, 28 Feb 2018 14:31:53 -0800
> Matthew Wilcox  wrote:
> 
> > From: Matthew Wilcox 
> > 
> > I want to use the _mapcount field to record what a page is in use as.
> > This can help with debugging and we can also expose that information to
> > userspace through /proc/kpageflags to help diagnose memory usage (not
> > included as part of this patch set).
> > 
> > First, we need s390 to stop using _mapcount for its own purposes;
> > Martin, I hope you have time to look at this patch.  I must confess I
> > don't quite understand what the different bits are used for in the upper
> > nybble of the _mapcount, but I tried to replicate what you were doing
> > faithfully.
> 
> Yeah, that is a nasty bit of code. On s390 we have 2K page tables (pte)
> but 4K pages. If we use full pages for the pte tables we waste 2K of
> memory for each of the tables. So we allocate 4K and split it into two
> 2K pieces. Now we have to keep track of the pieces to be able to free
> them again.

Have you considered using slab for page table allocation instead?
IIRC some architectures do this already.

-- 
 Kirill A. Shutemov


Re: [PATCH v3 0/4] Split page_type out from mapcount

2018-03-01 Thread Martin Schwidefsky
On Thu, 1 Mar 2018 08:17:50 +0100
Martin Schwidefsky  wrote:

> On Wed, 28 Feb 2018 14:31:53 -0800
> Matthew Wilcox  wrote:
> 
> > From: Matthew Wilcox 
> > 
> > I want to use the _mapcount field to record what a page is in use as.
> > This can help with debugging and we can also expose that information to
> > userspace through /proc/kpageflags to help diagnose memory usage (not
> > included as part of this patch set).
> > 
> > First, we need s390 to stop using _mapcount for its own purposes;
> > Martin, I hope you have time to look at this patch.  I must confess I
> > don't quite understand what the different bits are used for in the upper
> > nybble of the _mapcount, but I tried to replicate what you were doing
> > faithfully.  
> 
> Yeah, that is a nasty bit of code. On s390 we have 2K page tables (pte)
> but 4K pages. If we use full pages for the pte tables we waste 2K of
> memory for each of the tables. So we allocate 4K and split it into two
> 2K pieces. Now we have to keep track of the pieces to be able to free
> them again.
> 
> I try to give your patch a spin today. It should be stand-alone, no ?

Ok, that seems to work just fine. The system boots and survived some stress
without losing memory.

Acked-by: Martin Schwidefsky 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH v3 0/4] Split page_type out from mapcount

2018-02-28 Thread Martin Schwidefsky
On Wed, 28 Feb 2018 14:31:53 -0800
Matthew Wilcox  wrote:

> From: Matthew Wilcox 
> 
> I want to use the _mapcount field to record what a page is in use as.
> This can help with debugging and we can also expose that information to
> userspace through /proc/kpageflags to help diagnose memory usage (not
> included as part of this patch set).
> 
> First, we need s390 to stop using _mapcount for its own purposes;
> Martin, I hope you have time to look at this patch.  I must confess I
> don't quite understand what the different bits are used for in the upper
> nybble of the _mapcount, but I tried to replicate what you were doing
> faithfully.

Yeah, that is a nasty bit of code. On s390 we have 2K page tables (pte)
but 4K pages. If we use full pages for the pte tables we waste 2K of
memory for each of the tables. So we allocate 4K and split it into two
2K pieces. Now we have to keep track of the pieces to be able to free
them again.
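
The bookkeeping described above can be sketched in plain C as follows. This
is only an illustration under stated assumptions: struct frag_page,
frag_alloc() and frag_free() are hypothetical names, and the bitmask lives
in an ordinary struct rather than in any struct page field as the s390 code
does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAG_SIZE      2048u  /* one pte table, per the description above */
#define PAGE_BYTES     4096u  /* one backing page */
#define FRAGS_PER_PAGE (PAGE_BYTES / FRAG_SIZE)

/* Hypothetical bookkeeping for one backing page. */
struct frag_page {
    uint8_t *base;      /* start of the 4K backing page */
    unsigned int used;  /* bit i set: 2K fragment i is handed out */
};

/* Hand out a free 2K fragment, or NULL if both halves are in use. */
static void *frag_alloc(struct frag_page *fp)
{
    for (unsigned int i = 0; i < FRAGS_PER_PAGE; i++) {
        if (!(fp->used & (1u << i))) {
            fp->used |= 1u << i;
            return fp->base + (size_t)i * FRAG_SIZE;
        }
    }
    return NULL;
}

/*
 * Give a fragment back. Returns 1 when no fragment of the page is in use
 * any more, so the caller may free the whole 4K page, and 0 otherwise.
 */
static int frag_free(struct frag_page *fp, void *table)
{
    size_t off = (size_t)((uint8_t *)table - fp->base);

    /* The pointer must be one of the two 2K-aligned halves. */
    assert(off < PAGE_BYTES && off % FRAG_SIZE == 0);

    unsigned int bit = 1u << (off / FRAG_SIZE);

    assert(fp->used & bit);  /* must currently be handed out */
    fp->used &= ~bit;
    return fp->used == 0;
}
```

The point of the sketch is that the backing page can only be released once
both halves have been returned, which is why the state of each half has to
be recorded somewhere.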

I'll try to give your patch a spin today. It should be stand-alone, no?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH v3 0/4] Split page_type out from mapcount

2018-02-28 Thread Matthew Wilcox
On Wed, Feb 28, 2018 at 03:22:49PM -0800, Randy Dunlap wrote:
> On 02/28/2018 02:31 PM, Matthew Wilcox wrote:
> > From: Matthew Wilcox 
> > 
> > I want to use the _mapcount field to record what a page is in use as.
> > This can help with debugging and we can also expose that information to
> > userspace through /proc/kpageflags to help diagnose memory usage (not
> > included as part of this patch set).
> 
> Hey,
> 
> Will there be updates to tools/vm/ also, or are these a different set of
> (many) flags?

Those KPF flags are the ones I was talking about.  I haven't looked into
what it takes to assign those flags yet.


Re: [PATCH v3 0/4] Split page_type out from mapcount

2018-02-28 Thread Randy Dunlap
On 02/28/2018 02:31 PM, Matthew Wilcox wrote:
> From: Matthew Wilcox 
> 
> I want to use the _mapcount field to record what a page is in use as.
> This can help with debugging and we can also expose that information to
> userspace through /proc/kpageflags to help diagnose memory usage (not
> included as part of this patch set).

Hey,

Will there be updates to tools/vm/ also, or are these a different set of
(many) flags?

thanks,
-- 
~Randy


[PATCH v3 0/4] Split page_type out from mapcount

2018-02-28 Thread Matthew Wilcox
From: Matthew Wilcox 

I want to use the _mapcount field to record what a page is in use as.
This can help with debugging and we can also expose that information to
userspace through /proc/kpageflags to help diagnose memory usage (not
included as part of this patch set).

First, we need s390 to stop using _mapcount for its own purposes;
Martin, I hope you have time to look at this patch.  I must confess I
don't quite understand what the different bits are used for in the upper
nybble of the _mapcount, but I tried to replicate what you were doing
faithfully.
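
As an illustration of the idea in the first paragraph, here is one way a
field shared with a map count could also record a page's use. This is a
sketch under assumptions, not the code from these patches: the constants,
struct fake_page and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed marker bits; a type is only valid while these are all set. */
#define TYPE_BASE    0xf0000000u
/* Hypothetical type bits, stored inverted (cleared means "has type"). */
#define TYPE_TABLE   0x00000100u
#define TYPE_VMALLOC 0x00000200u

struct fake_page {
    union {
        int32_t  mapcount;   /* -1 means the page is not mapped */
        uint32_t page_type;  /* same storage, read as type bits */
    };
};

/* Put the page into the "unmapped, no type" state. */
static void page_reset(struct fake_page *p)
{
    p->mapcount = -1;  /* all bits set */
}

/* Mark the page with a type by clearing its bit. */
static void page_set_type(struct fake_page *p, uint32_t type)
{
    p->page_type &= ~type;
}

/* Remove a type by setting its bit again. */
static void page_clear_type(struct fake_page *p, uint32_t type)
{
    p->page_type |= type;
}

/* True only if the marker bits are intact and the type bit is cleared. */
static int page_has_type(const struct fake_page *p, uint32_t type)
{
    return (p->page_type & (TYPE_BASE | type)) == TYPE_BASE;
}
```

With this encoding a typed page still looks like a negative map count, so
it cannot be mistaken for a page with a small non-negative number of
mappings.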

Matthew Wilcox (4):
  s390: Use _refcount for pgtables
  mm: Split page_type out from _map_count
  mm: Mark pages allocated through vmalloc
  mm: Mark pages in use for page tables

 arch/s390/mm/pgalloc.c     | 21 +
 fs/proc/page.c             |  2 +-
 include/linux/mm.h         |  2 ++
 include/linux/mm_types.h   | 13 +++
 include/linux/page-flags.h | 57 ++
 kernel/crash_core.c        |  1 +
 mm/page_alloc.c            | 13 ---
 mm/vmalloc.c               |  2 ++
 scripts/tags.sh            |  6 ++---
 9 files changed, 72 insertions(+), 45 deletions(-)

-- 
2.16.1


