Re: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces

2014-06-25 Thread KOSAKI Motohiro
> I agree that reporting the amount of shared pages in that historically fashion
> might not be interesting for userspace tools resorting to sysinfo(2),
> nowadays.
>
> OTOH, our documentation implies we do return shared memory there, and FWIW,
> considering the other places we do export the "shared memory" concept to
> userspace nowadays, we are suggesting it's the amount of tmpfs/shmem, and not 
> the
> amount of shared mapped pages it historiacally represented once. What is 
> really
> confusing is having a field that supposedely/expectedely would return the 
> amount
> of shmem to userspace queries, but instead returns a hard-coded zero (0).
>
> I could easily find out that there were some user complaint/confusion on this
> semantic inconsistency in the past, as in:
> https://groups.google.com/forum/#!topic/comp.os.linux.development.system/ogWVn6XdvGA
>
> or in:
> http://marc.info/?l=net-snmp-cvs&m=132148788500667
>
> which suggests users seem to always have understood it as being shmem/tmpfs
> usage, as the /proc/meminfo field "MemShared" was tied direclty to
> sysinfo.sharedram. Historically we reported shared memory that way, and
> when it wasn't accurately meaning that anymore a 0 was hardcoded there to
> potentially not break compatibility with older tools (older than 2.4).
> In 2.6 we got rid of meminfo's "MemShared" until 2009, when you sort of
> re-introduced it re-branded as Shmem. IMO, we should leverage what we
> have in kernel now and take this change to make the exposed data consistent
> across the interfaces that export it today -- sysinfo(2) & /proc/meminfo.
>
> This is not a hard requirement, though, but rather a simple maintenance
> nitpick from code review.

Ok, ack then. But please update a patch description and repost w/
ccing linux-...@vger.kernel.org. Someone might have a specific concern
about a compatibility.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces

2014-06-25 Thread Rafael Aquini
On Wed, Jun 25, 2014 at 01:27:53PM -0700, Motohiro Kosaki wrote:
> 
> 
> > -Original Message-
> > From: Rafael Aquini [mailto:aqu...@redhat.com]
> > Sent: Wednesday, June 25, 2014 4:16 PM
> > To: Motohiro Kosaki
> > Cc: linux...@kvack.org; Andrew Morton; Rik van Riel; Mel Gorman; Johannes 
> > Weiner; Motohiro Kosaki JP; linux-
> > ker...@vger.kernel.org
> > Subject: Re: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() 
> > interfaces
> > 
> > On Wed, Jun 25, 2014 at 12:41:17PM -0700, Motohiro Kosaki wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Rafael Aquini [mailto:aqu...@redhat.com]
> > > > Sent: Wednesday, June 25, 2014 2:40 PM
> > > > To: linux...@kvack.org
> > > > Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner;
> > > > Motohiro Kosaki JP; linux-kernel@vger.kernel.org
> > > > Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo()
> > > > interfaces
> > > >
> > > > This patch leverages the addition of explicit accounting for pages
> > > > used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat"
> > > > -- in order to make the users of sysinfo(2) and si_meminfo*() friends 
> > > > aware of that vmstat entry consistently across the interfaces.
> > >
> > > Why?
> > 
> > Because we do not report consistently across the interfaces we declare 
> > exporting that data. Check sysinfo(2) manpage, for instance:
> > [...]
> >struct sysinfo {
> >long uptime; /* Seconds since boot */
> >unsigned long loads[3];  /* 1, 5, and 15 minute load 
> > averages */
> >unsigned long totalram;  /* Total usable main memory size */
> >unsigned long freeram;   /* Available memory size */
> >unsigned long sharedram; /* Amount of shared memory */ <<<<< 
> > [...]
> > 
> > userspace tools resorting to sysinfo() syscall will get a hardcoded 0 for 
> > shared memory which is reported differently from
> > /proc/meminfo.
> > 
> > Also, si_meminfo() & si_meminfo_node() are utilized within the kernel to 
> > gather statistics for /proc/meminfo & friends, and so we
> > can leverage collecting sharedmem from those calls as well, just as we do 
> > for totalram, freeram & bufferram.
> 
> But "Amount of shared memory"  didn't mean amout of shmem. It actually meant 
> amout of page of page-count>=2.
> Again, there is a possibility to change the semantics. But I don't have 
> enough userland knowledge to do. Please investigate
> and explain why your change don't break any userland. 

I agree that reporting the amount of shared pages in that historically fashion 
might not be interesting for userspace tools resorting to sysinfo(2),
nowadays.

OTOH, our documentation implies we do return shared memory there, and FWIW,
considering the other places we do export the "shared memory" concept to 
userspace nowadays, we are suggesting it's the amount of tmpfs/shmem, and not 
the
amount of shared mapped pages it historiacally represented once. What is really
confusing is having a field that supposedely/expectedely would return the amount
of shmem to userspace queries, but instead returns a hard-coded zero (0).

I could easily find out that there were some user complaint/confusion on this 
semantic inconsistency in the past, as in:
https://groups.google.com/forum/#!topic/comp.os.linux.development.system/ogWVn6XdvGA

or in:
http://marc.info/?l=net-snmp-cvs&m=132148788500667

which suggests users seem to always have understood it as being shmem/tmpfs
usage, as the /proc/meminfo field "MemShared" was tied direclty to
sysinfo.sharedram. Historically we reported shared memory that way, and
when it wasn't accurately meaning that anymore a 0 was hardcoded there to
potentially not break compatibility with older tools (older than 2.4).
In 2.6 we got rid of meminfo's "MemShared" until 2009, when you sort of
re-introduced it re-branded as Shmem. IMO, we should leverage what we 
have in kernel now and take this change to make the exposed data consistent 
across the interfaces that export it today -- sysinfo(2) & /proc/meminfo.

This is not a hard requirement, though, but rather a simple maintenance
nitpick from code review. 

Regards,
-- Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces

2014-06-25 Thread Motohiro Kosaki


> -Original Message-
> From: Rafael Aquini [mailto:aqu...@redhat.com]
> Sent: Wednesday, June 25, 2014 4:16 PM
> To: Motohiro Kosaki
> Cc: linux...@kvack.org; Andrew Morton; Rik van Riel; Mel Gorman; Johannes 
> Weiner; Motohiro Kosaki JP; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() 
> interfaces
> 
> On Wed, Jun 25, 2014 at 12:41:17PM -0700, Motohiro Kosaki wrote:
> >
> >
> > > -Original Message-
> > > From: Rafael Aquini [mailto:aqu...@redhat.com]
> > > Sent: Wednesday, June 25, 2014 2:40 PM
> > > To: linux...@kvack.org
> > > Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner;
> > > Motohiro Kosaki JP; linux-kernel@vger.kernel.org
> > > Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo()
> > > interfaces
> > >
> > > This patch leverages the addition of explicit accounting for pages
> > > used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat"
> > > -- in order to make the users of sysinfo(2) and si_meminfo*() friends 
> > > aware of that vmstat entry consistently across the interfaces.
> >
> > Why?
> 
> Because we do not report consistently across the interfaces we declare 
> exporting that data. Check sysinfo(2) manpage, for instance:
> [...]
>struct sysinfo {
>long uptime; /* Seconds since boot */
>unsigned long loads[3];  /* 1, 5, and 15 minute load averages 
> */
>unsigned long totalram;  /* Total usable main memory size */
>unsigned long freeram;   /* Available memory size */
>unsigned long sharedram; /* Amount of shared memory */ <<<<< 
> [...]
> 
> userspace tools resorting to sysinfo() syscall will get a hardcoded 0 for 
> shared memory which is reported differently from
> /proc/meminfo.
> 
> Also, si_meminfo() & si_meminfo_node() are utilized within the kernel to 
> gather statistics for /proc/meminfo & friends, and so we
> can leverage collecting sharedmem from those calls as well, just as we do for 
> totalram, freeram & bufferram.

But "Amount of shared memory"  didn't mean amout of shmem. It actually meant 
amout of page of page-count>=2.
Again, there is a possibility to change the semantics. But I don't have enough 
userland knowledge to do. Please investigate
and explain why your change don't break any userland. 






--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces

2014-06-25 Thread Motohiro Kosaki


> -Original Message-
> From: motohiro.kos...@us.fujitsu.com [mailto:motohiro.kos...@us.fujitsu.com]
> Sent: Wednesday, June 25, 2014 3:41 PM
> To: Rafael Aquini; linux...@kvack.org
> Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro Kosaki 
> JP; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() 
> interfaces
> 
> 
> 
> > -Original Message-
> > From: Rafael Aquini [mailto:aqu...@redhat.com]
> > Sent: Wednesday, June 25, 2014 2:40 PM
> > To: linux...@kvack.org
> > Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro
> > Kosaki JP; linux-kernel@vger.kernel.org
> > Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo()
> > interfaces
> >
> > This patch leverages the addition of explicit accounting for pages
> > used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat" --
> > in order to make the users of sysinfo(2) and si_meminfo*() friends aware of 
> > that vmstat entry consistently across the interfaces.
> 
> Why?
> Traditionally sysinfo.sharedram was not used for shmem. It was totally 
> strange semantics and completely outdated feature.
> So, we may reuse it for another purpose. But I'm not sure its benefit.
> 
> Why don't you use /proc/meminfo?
> I'm afraid userland programs get a confusion.

For the record. This is historical implementation at linux-2.3.12. I.e. account 
sum of page count.


void si_meminfo(struct sysinfo *val)
{
int i;

i = max_mapnr;
val->totalram = 0;
val->sharedram = 0;
val->freeram = nr_free_pages << PAGE_SHIFT;
val->bufferram = atomic_read(&buffermem);
while (i-- > 0)  {
if (PageReserved(mem_map+i))
continue;
val->totalram++;
if (!page_count(mem_map+i))
continue;
val->sharedram += page_count(mem_map+i) - 1;
}
val->totalram <<= PAGE_SHIFT;
val->sharedram <<= PAGE_SHIFT;
return;
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces

2014-06-25 Thread Rafael Aquini
On Wed, Jun 25, 2014 at 12:41:17PM -0700, Motohiro Kosaki wrote:
> 
> 
> > -Original Message-
> > From: Rafael Aquini [mailto:aqu...@redhat.com]
> > Sent: Wednesday, June 25, 2014 2:40 PM
> > To: linux...@kvack.org
> > Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro 
> > Kosaki JP; linux-kernel@vger.kernel.org
> > Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() 
> > interfaces
> > 
> > This patch leverages the addition of explicit accounting for pages used by 
> > shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem
> > vmstat" -- in order to make the users of sysinfo(2) and si_meminfo*() 
> > friends aware of that vmstat entry consistently across the
> > interfaces.
> 
> Why?

Because we do not report consistently across the interfaces we declare
exporting that data. Check sysinfo(2) manpage, for instance:
[...]
   struct sysinfo {
   long uptime; /* Seconds since boot */
   unsigned long loads[3];  /* 1, 5, and 15 minute load averages */
   unsigned long totalram;  /* Total usable main memory size */
   unsigned long freeram;   /* Available memory size */
   unsigned long sharedram; /* Amount of shared memory */ <
[...]

userspace tools resorting to sysinfo() syscall will get a hardcoded 0
for shared memory which is reported differently from /proc/meminfo.

Also, si_meminfo() & si_meminfo_node() are utilized within the kernel to
gather statistics for /proc/meminfo & friends, and so we can leverage collecting
sharedmem from those calls as well, just as we do for totalram, freeram & 
bufferram.

Regards,
-- Rafael

> Traditionally sysinfo.sharedram was not used for shmem. It was totally 
> strange semantics and completely outdated feature. 
> So, we may reuse it for another purpose. But I'm not sure its benefit. 
> 
> Why don't you use /proc/meminfo?
> I'm afraid userland programs get a confusion. 
> 
> 
> > 
> > Signed-off-by: Rafael Aquini 
> > ---
> >  drivers/base/node.c | 2 +-
> >  fs/proc/meminfo.c   | 2 +-
> >  mm/page_alloc.c | 3 ++-
> >  3 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/base/node.c b/drivers/base/node.c index 
> > 8f7ed99..c6d3ae0 100644
> > --- a/drivers/base/node.c
> > +++ b/drivers/base/node.c
> > @@ -126,7 +126,7 @@ static ssize_t node_read_meminfo(struct device *dev,
> >nid, K(node_page_state(nid, NR_FILE_PAGES)),
> >nid, K(node_page_state(nid, NR_FILE_MAPPED)),
> >nid, K(node_page_state(nid, NR_ANON_PAGES)),
> > -  nid, K(node_page_state(nid, NR_SHMEM)),
> > +  nid, K(i.sharedram),
> >nid, node_page_state(nid, NR_KERNEL_STACK) *
> > THREAD_SIZE / 1024,
> >nid, K(node_page_state(nid, NR_PAGETABLE)), diff --git 
> > a/fs/proc/meminfo.c b/fs/proc/meminfo.c index
> > 7445af0..aa1eee0 100644
> > --- a/fs/proc/meminfo.c
> > +++ b/fs/proc/meminfo.c
> > @@ -168,7 +168,7 @@ static int meminfo_proc_show(struct seq_file *m, void 
> > *v)
> > K(global_page_state(NR_WRITEBACK)),
> > K(global_page_state(NR_ANON_PAGES)),
> > K(global_page_state(NR_FILE_MAPPED)),
> > -   K(global_page_state(NR_SHMEM)),
> > +   K(i.sharedram),
> > K(global_page_state(NR_SLAB_RECLAIMABLE) +
> > global_page_state(NR_SLAB_UNRECLAIMABLE)),
> > K(global_page_state(NR_SLAB_RECLAIMABLE)),
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 20d17f8..f72ea38 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3040,7 +3040,7 @@ static inline void show_node(struct zone *zone)  void 
> > si_meminfo(struct sysinfo *val)  {
> > val->totalram = totalram_pages;
> > -   val->sharedram = 0;
> > +   val->sharedram = global_page_state(NR_SHMEM);
> > val->freeram = global_page_state(NR_FREE_PAGES);
> > val->bufferram = nr_blockdev_pages();
> > val->totalhigh = totalhigh_pages;
> > @@ -3060,6 +3060,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
> > for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++)
> > managed_pages += pgdat->node_zones[zone_type].managed_pages;
> > val->totalram = managed_pages;
> > +   val->sharedram = node_page_state(nid, NR_SHMEM);
> > val->freeram = node_page_state(nid, NR_FREE_PAGES);  #ifdef 
> > CONFIG_HIGHMEM
> > val->totalhigh = pgdat->node_zones[ZONE_HIGHMEM].managed_pages;
> > --
> > 1.9.3
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces

2014-06-25 Thread Motohiro Kosaki


> -Original Message-
> From: Rafael Aquini [mailto:aqu...@redhat.com]
> Sent: Wednesday, June 25, 2014 2:40 PM
> To: linux...@kvack.org
> Cc: Andrew Morton; Rik van Riel; Mel Gorman; Johannes Weiner; Motohiro Kosaki 
> JP; linux-kernel@vger.kernel.org
> Subject: [PATCH] mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces
> 
> This patch leverages the addition of explicit accounting for pages used by 
> shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem
> vmstat" -- in order to make the users of sysinfo(2) and si_meminfo*() friends 
> aware of that vmstat entry consistently across the
> interfaces.

Why?
Traditionally sysinfo.sharedram was not used for shmem. It was totally strange 
semantics and completely outdated feature. 
So, we may reuse it for another purpose. But I'm not sure its benefit. 

Why don't you use /proc/meminfo?
I'm afraid userland programs get a confusion. 


> 
> Signed-off-by: Rafael Aquini 
> ---
>  drivers/base/node.c | 2 +-
>  fs/proc/meminfo.c   | 2 +-
>  mm/page_alloc.c | 3 ++-
>  3 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/base/node.c b/drivers/base/node.c index 8f7ed99..c6d3ae0 
> 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -126,7 +126,7 @@ static ssize_t node_read_meminfo(struct device *dev,
>  nid, K(node_page_state(nid, NR_FILE_PAGES)),
>  nid, K(node_page_state(nid, NR_FILE_MAPPED)),
>  nid, K(node_page_state(nid, NR_ANON_PAGES)),
> -nid, K(node_page_state(nid, NR_SHMEM)),
> +nid, K(i.sharedram),
>  nid, node_page_state(nid, NR_KERNEL_STACK) *
>   THREAD_SIZE / 1024,
>  nid, K(node_page_state(nid, NR_PAGETABLE)), diff --git 
> a/fs/proc/meminfo.c b/fs/proc/meminfo.c index
> 7445af0..aa1eee0 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -168,7 +168,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>   K(global_page_state(NR_WRITEBACK)),
>   K(global_page_state(NR_ANON_PAGES)),
>   K(global_page_state(NR_FILE_MAPPED)),
> - K(global_page_state(NR_SHMEM)),
> + K(i.sharedram),
>   K(global_page_state(NR_SLAB_RECLAIMABLE) +
>   global_page_state(NR_SLAB_UNRECLAIMABLE)),
>   K(global_page_state(NR_SLAB_RECLAIMABLE)),
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 20d17f8..f72ea38 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3040,7 +3040,7 @@ static inline void show_node(struct zone *zone)  void 
> si_meminfo(struct sysinfo *val)  {
>   val->totalram = totalram_pages;
> - val->sharedram = 0;
> + val->sharedram = global_page_state(NR_SHMEM);
>   val->freeram = global_page_state(NR_FREE_PAGES);
>   val->bufferram = nr_blockdev_pages();
>   val->totalhigh = totalhigh_pages;
> @@ -3060,6 +3060,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
>   for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++)
>   managed_pages += pgdat->node_zones[zone_type].managed_pages;
>   val->totalram = managed_pages;
> + val->sharedram = node_page_state(nid, NR_SHMEM);
>   val->freeram = node_page_state(nid, NR_FREE_PAGES);  #ifdef 
> CONFIG_HIGHMEM
>   val->totalhigh = pgdat->node_zones[ZONE_HIGHMEM].managed_pages;
> --
> 1.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/