Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-05 Thread Balbir Singh
On Sun, Sep 02, 2007 at 09:53:22PM -0700, Paul Menage wrote:
> On 9/2/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
> > -   s += sprintf(s, "%lu\n", *val);
> > +   if (read_strategy)
> > +   s += read_strategy(*val, s);
> > +   else
> > +   s += sprintf(s, "%lu\n", *val);
> 
> This would be better as %llu
> 
> > +   tmp = simple_strtoul(buf, &end, 10);
> 
> and this as simple_strtoull()
>

Hi,

Here's the new patch for the UI changes. I hope I've got it right
(Jetlag can do funny things to the mind and eyes :-) )

Changelog for version 4

1. Fix parsing of unsigned long long arguments.

Changelog for version 3
1. Change memory.limit and memory.usage to memory.limit_in_bytes and
   memory.usage_in_bytes respectively
2. Remove "Bytes" from the output of the limit and usage counters
3. Remove spurious printk

Changelog for version 2

1. Back end tracking is done in bytes, round up values of the limit if the
   specified value is not a multiple of page size. Display memory.usage
   and memory.limit in bytes (Dave Hansen, Paul Menage)

Change the interface to use bytes instead of pages. Page sizes can vary
across platforms and configurations. A new strategy routine has been added
to the resource counters infrastructure to format the data as desired.

Suggested by David Rientjes, Andrew Morton and Herbert Poetzl

Tested on a UML setup with the config for memory control enabled.
---


Signed-off-by: Balbir Singh <[EMAIL PROTECTED]>
---

 Documentation/controllers/memory.txt |   28 ++---
 include/linux/res_counter.h  |   10 +++---
 kernel/res_counter.c |   30 +-
 mm/memcontrol.c  |   56 +++
 4 files changed, 94 insertions(+), 30 deletions(-)

diff -puN Documentation/controllers/memory.txt~mem-control-make-ui-more-usable 
Documentation/controllers/memory.txt
--- 
linux-2.6.23-rc4/Documentation/controllers/memory.txt~mem-control-make-ui-more-usable
   2007-09-02 11:12:03.0 +0100
+++ linux-2.6.23-rc4-balbir/Documentation/controllers/memory.txt
2007-09-02 11:17:12.0 +0100
@@ -165,11 +165,29 @@ c. Enable CONFIG_CONTAINER_MEM_CONT
 
 Since now we're in the 0 container,
 We can alter the memory limit:
-# echo -n 6000 > /containers/0/memory.limit
+# echo -n 4M > /containers/0/memory.limit_in_bytes
+
+NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
+mega or gigabytes.
+
+# cat /containers/0/memory.limit_in_bytes
+4194304 Bytes
+
+NOTE: The interface has now changed to display the usage in bytes
+instead of pages
 
 We can check the usage:
-# cat /containers/0/memory.usage
-25
+# cat /containers/0/memory.usage_in_bytes
+1216512 Bytes
+
+Setting a limit to a number that is not a multiple of page size causes
+rounding up of the value. The user must check back to see (by reading
+memory.limit_in_bytes), to check for differences between desired values and
+committed values. Currently, all accounting is done in multiples of PAGE_SIZE
+
+# echo -n 1 > memory.limit_in_bytes
+# cat memory.limit_in_bytes
+4096 Bytes
 
 The memory.failcnt field gives the number of times that the container limit was
 exceeded.
@@ -206,8 +224,8 @@ container might have some charge associa
 tasks have migrated away from it. If some pages are still left, after following
 the steps listed in sections 4.1 and 4.2, check the Swap Cache usage in
 /proc/meminfo to see if the Swap Cache usage is showing up in the
-containers memory.usage counter. A simple test of swapoff -a and swapon -a
-should free any pending Swap Cache usage.
+containers memory.usage_in_bytes counter. A simple test of swapoff -a and
+swapon -a should free any pending Swap Cache usage.
 
 4.4 Choosing what to account  -- Page Cache (unmapped) vs RSS (mapped)?
 
diff -puN include/linux/res_counter.h~mem-control-make-ui-more-usable 
include/linux/res_counter.h
--- 
linux-2.6.23-rc4/include/linux/res_counter.h~mem-control-make-ui-more-usable
2007-09-02 11:12:03.0 +0100
+++ linux-2.6.23-rc4-balbir/include/linux/res_counter.h 2007-09-02 
11:12:03.0 +0100
@@ -23,11 +23,11 @@ struct res_counter {
/*
 * the current resource consumption level
 */
-   unsigned long usage;
+   unsigned long long usage;
/*
 * the limit that usage cannot exceed
 */
-   unsigned long limit;
+   unsigned long long limit;
/*
 * the number of unsuccessful attempts to consume the resource
 */
@@ -52,9 +52,11 @@ struct res_counter {
  */
 
 ssize_t res_counter_read(struct res_counter *counter, int member,
-   const char __user *buf, size_t nbytes, loff_t *pos);
+   const char __user *buf, size_t nbytes, loff_t *pos,
+   int (*read_strategy)(unsigned long long val, char *s));
 ssize_t res_counter_write(struct res_counter *counter, int member,
-   const char __user *buf,

Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-05 Thread Balbir Singh
> 
> But val is an unsigned long long*. So printing *val with %lu will
> break (at least a warning, and maybe corruption if you had other
> parameters) on 32-bit archs.
> 

How does this look?


Changelog for version 4

1. Make all resource counters members unsigned long long
2. Use documentation comments from Dave Hansen

Changelog for version 3
1. Change memory.limit and memory.usage to memory.limit_in_bytes and
   memory.usage_in_bytes respectively
2. Remove "Bytes" from the output of the limit and usage counters
3. Remove spurious printk

Changelog for version 2

1. Back end tracking is done in bytes, round up values of the limit if the
   specified value is not a multiple of page size. Display memory.usage
   and memory.limit in bytes (Dave Hansen, Paul Menage)

Change the interface to use bytes instead of pages. Page sizes can vary
across platforms and configurations. A new strategy routine has been added
to the resource counters infrastructure to format the data as desired.

Suggested by David Rientjes, Andrew Morton and Herbert Poetzl

Tested on a UML setup with the config for memory control enabled.

Signed-off-by: Balbir Singh <[EMAIL PROTECTED]>
---

 Documentation/controllers/memory.txt |   29 -
 include/linux/res_counter.h  |   12 +++-
 kernel/res_counter.c |   34 +++---
 mm/memcontrol.c  |   35 +--
 4 files changed, 79 insertions(+), 31 deletions(-)

diff -puN Documentation/controllers/memory.txt~mem-control-make-ui-more-usable 
Documentation/controllers/memory.txt
--- 
linux-2.6.23-rc4/Documentation/controllers/memory.txt~mem-control-make-ui-more-usable
   2007-09-02 11:12:03.0 +0100
+++ linux-2.6.23-rc4-balbir/Documentation/controllers/memory.txt
2007-09-05 16:44:41.0 +0100
@@ -165,11 +165,30 @@ c. Enable CONFIG_CONTAINER_MEM_CONT
 
 Since now we're in the 0 container,
 We can alter the memory limit:
-# echo -n 6000 > /containers/0/memory.limit
+# echo -n 4M > /containers/0/memory.limit_in_bytes
+
+NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
+mega or gigabytes.
+
+# cat /containers/0/memory.limit_in_bytes
+4194304 Bytes
+
+NOTE: The interface has now changed to display the usage in bytes
+instead of pages
 
 We can check the usage:
-# cat /containers/0/memory.usage
-25
+# cat /containers/0/memory.usage_in_bytes
+1216512 Bytes
+
+A successful write to this file does not guarantee a successful set of
+this limit to the value written into the file.  This can be due to a
+number of factors, such as rounding up to page boundaries or the total
+availability of memory on the system.  The user is required to re-read
+this file after a write to guarantee the value committed by the kernel.
+
+# echo -n 1 > memory.limit_in_bytes
+# cat memory.limit_in_bytes
+4096 Bytes
 
 The memory.failcnt field gives the number of times that the container limit was
 exceeded.
@@ -206,8 +225,8 @@ container might have some charge associa
 tasks have migrated away from it. If some pages are still left, after following
 the steps listed in sections 4.1 and 4.2, check the Swap Cache usage in
 /proc/meminfo to see if the Swap Cache usage is showing up in the
-containers memory.usage counter. A simple test of swapoff -a and swapon -a
-should free any pending Swap Cache usage.
+containers memory.usage_in_bytes counter. A simple test of swapoff -a and
+swapon -a should free any pending Swap Cache usage.
 
 4.4 Choosing what to account  -- Page Cache (unmapped) vs RSS (mapped)?
 
diff -puN include/linux/res_counter.h~mem-control-make-ui-more-usable 
include/linux/res_counter.h
--- 
linux-2.6.23-rc4/include/linux/res_counter.h~mem-control-make-ui-more-usable
2007-09-02 11:12:03.0 +0100
+++ linux-2.6.23-rc4-balbir/include/linux/res_counter.h 2007-09-05 
16:12:49.0 +0100
@@ -23,15 +23,15 @@ struct res_counter {
/*
 * the current resource consumption level
 */
-   unsigned long usage;
+   unsigned long long usage;
/*
 * the limit that usage cannot exceed
 */
-   unsigned long limit;
+   unsigned long long limit;
/*
 * the number of unsuccessful attempts to consume the resource
 */
-   unsigned long failcnt;
+   unsigned long long failcnt;
/*
 * the lock to protect all of the above.
 * the routines below consider this to be IRQ-safe
@@ -52,9 +52,11 @@ struct res_counter {
  */
 
 ssize_t res_counter_read(struct res_counter *counter, int member,
-   const char __user *buf, size_t nbytes, loff_t *pos);
+   const char __user *buf, size_t nbytes, loff_t *pos,
+   int (*read_strategy)(unsigned long long val, char *s));
 ssize_t res_counter_write(struct res_counter *counter, int member,
-   const char __user *buf, size_t nbytes, loff_t *pos);
+ 

Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-04 Thread Dave Hansen
On Sun, 2007-09-02 at 16:20 +0530, Balbir Singh wrote:
> 
> +Setting a limit to a number that is not a multiple of page size causes
> +rounding up of the value. The user must check back to see (by reading
> +memory.limit_in_bytes), to check for differences between desired values and
> +committed values. Currently, all accounting is done in multiples of 
> PAGE_SIZE 

I wonder if we can say this in a bit more generic fashion.

A successful write to this file does not guarantee a successful
set of this limit to the value written into the file.  This can
be due to a number of factors, such as rounding up to page
boundaries or the total availability of memory on the system.
The user is required to re-read this file after a write to
guarantee the value committed by the kernel.

This keeps a user from saying "I page aligned the value I stuck in
there, no I don't have to check it."

-- Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-04 Thread Balbir Singh
Paul Menage wrote:
> On 9/3/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
>> Paul Menage wrote:
>>> On 9/2/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
 -   s += sprintf(s, "%lu\n", *val);
 +   if (read_strategy)
 +   s += read_strategy(*val, s);
 +   else
 +   s += sprintf(s, "%lu\n", *val);
>>> This would be better as %llu
>>>
>> Hi, Paul,
>>
>> This does not need fixing, since the other counters like failcnt are
>> still unsigned long
>>
> 
> But val is an unsigned long long*. So printing *val with %lu will
> break (at least a warning, and maybe corruption if you had other
> parameters) on 32-bit archs.
> 

Yeah... Hmm.. just wonder if all the counters should be unsigned long
long? failcnt is the only remaining unsigned long counter now.

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-04 Thread Paul Menage
On 9/3/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
> Paul Menage wrote:
> > On 9/2/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
> >> -   s += sprintf(s, "%lu\n", *val);
> >> +   if (read_strategy)
> >> +   s += read_strategy(*val, s);
> >> +   else
> >> +   s += sprintf(s, "%lu\n", *val);
> >
> > This would be better as %llu
> >
>
> Hi, Paul,
>
> This does not need fixing, since the other counters like failcnt are
> still unsigned long
>

But val is an unsigned long long*. So printing *val with %lu will
break (at least a warning, and maybe corruption if you had other
parameters) on 32-bit archs.

Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-03 Thread Balbir Singh
Paul Menage wrote:
> On 9/2/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
>> -   s += sprintf(s, "%lu\n", *val);
>> +   if (read_strategy)
>> +   s += read_strategy(*val, s);
>> +   else
>> +   s += sprintf(s, "%lu\n", *val);
> 
> This would be better as %llu
> 

Hi, Paul,

This does not need fixing, since the other counters like failcnt are
still unsigned long

>> +   tmp = simple_strtoul(buf, &end, 10);
> 
> and this as simple_strtoull()
> 


Will do, thanks!



-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-03 Thread Balbir Singh
Paul Menage wrote:
> On 9/2/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
>> -   s += sprintf(s, "%lu\n", *val);
>> +   if (read_strategy)
>> +   s += read_strategy(*val, s);
>> +   else
>> +   s += sprintf(s, "%lu\n", *val);
> 
> This would be better as %llu
> 
>> +   tmp = simple_strtoul(buf, &end, 10);
> 
> and this as simple_strtoull()
> 
> Paul
> 

Thanks for catching it, I'll fix that.

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface (v3)

2007-09-02 Thread Paul Menage
On 9/2/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
> -   s += sprintf(s, "%lu\n", *val);
> +   if (read_strategy)
> +   s += read_strategy(*val, s);
> +   else
> +   s += sprintf(s, "%lu\n", *val);

This would be better as %llu

> +   tmp = simple_strtoul(buf, &end, 10);

and this as simple_strtoull()

Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface (v2)

2007-08-31 Thread Balbir Singh
On 9/1/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Fri, 31 Aug 2007 00:22:46 +0530
> Balbir Singh <[EMAIL PROTECTED]> wrote:
>
> > +/*
> > + * Strategy routines for formating read/write data
> > + */
> > +int mem_container_read_strategy(unsigned long long val, char *buf)
> > +{
> > + return sprintf(buf, "%llu Bytes\n", val);
> > +}
>
> It's a bit cheesy to be printing the units like this.  It's better to just
> print the raw number.
>
> If you really want to remind the user what units that number is in (not a
> bad idea) then it can be encoded in the filename, like
> /proc/sys/vm/min_free_kbytes, /proc/sys/vm/dirty_expire_centisecs, etc.
>

Sounds good, I'll change the file to memory.limit_in_bytes and
memory.usage_in_bytes.

>
> > +int mem_container_write_strategy(char *buf, unsigned long long *tmp)
> > +{
> > + *tmp = memparse(buf, &buf);
> > + if (*buf != '\0')
> > + return -EINVAL;
> > +
> > + printk("tmp is %llu\n", *tmp);
>
> don't think we want that.
>

Yes, I'll redo the patch and resend.

> > + /*
> > +  * Round up the value to the closest page size
> > +  */
> > + *tmp = ((*tmp + PAGE_SIZE - 1) >> PAGE_SHIFT) << PAGE_SHIFT;
> > + return 0;
> > +}

Thanks,
Balbir
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface (v2)

2007-08-31 Thread Andrew Morton
On Fri, 31 Aug 2007 00:22:46 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:

> +/*
> + * Strategy routines for formating read/write data
> + */
> +int mem_container_read_strategy(unsigned long long val, char *buf)
> +{
> + return sprintf(buf, "%llu Bytes\n", val);
> +}

It's a bit cheesy to be printing the units like this.  It's better to just
print the raw number.

If you really want to remind the user what units that number is in (not a
bad idea) then it can be encoded in the filename, like
/proc/sys/vm/min_free_kbytes, /proc/sys/vm/dirty_expire_centisecs, etc.


> +int mem_container_write_strategy(char *buf, unsigned long long *tmp)
> +{
> + *tmp = memparse(buf, &buf);
> + if (*buf != '\0')
> + return -EINVAL;
> +
> + printk("tmp is %llu\n", *tmp);

don't think we want that.

> + /*
> +  * Round up the value to the closest page size
> +  */
> + *tmp = ((*tmp + PAGE_SIZE - 1) >> PAGE_SHIFT) << PAGE_SHIFT;
> + return 0;
> +}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface (v2)

2007-08-30 Thread Balbir Singh
Balbir Singh wrote:
> Change the interface to use kilobytes instead of pages. Page sizes can vary
> across platforms and configurations. A new strategy routine has been added
> to the resource counters infrastructure to format the data as desired.
> 

Typo, the description should be

Changelog for version 2

1. Back end tracking is done in bytes, round up values of the limit if the
   specified value is not a multiple of page size. Display memory.usage
   and memory.limit in bytes (Dave Hansen, Paul Menage)

Change the interface to use bytes instead of pages. Page sizes can vary
across platforms and configurations. A new strategy routine has been added
to the resource counters infrastructure to format the data as desired.

Suggested by David Rientjes, Andrew Morton and Herbert Poetzl

Tested on a UML setup with the config for memory control enabled.

Signed-off-by: Balbir Singh <[EMAIL PROTECTED]>


-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-30 Thread Balbir Singh
KAMEZAWA Hiroyuki wrote:
> On Thu, 30 Aug 2007 04:07:11 +0530
> Balbir Singh <[EMAIL PROTECTED]> wrote:
>> 1. Several people recommended it
>> 2. Herbert mentioned that they've moved to that interface and it
>>was working fine for them.
>>
> 
> I have no strong opinion. But how about Mega bytes ? (too big ?)
> There will be no rounding up/down problem.
> 

Here is what I am thinking, allow the user to input bytes/kilobytes/
megabytes or gigabytes. Store the data internally in kilobytes or
PFN. I prefer kilobytes (no rounding issues), but while implementing
limits we round up to the closest PFN.


-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread KAMEZAWA Hiroyuki
On Thu, 30 Aug 2007 04:07:11 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:
> 1. Several people recommended it
> 2. Herbert mentioned that they've moved to that interface and it
>was working fine for them.
> 

I have no strong opinion. But how about Mega bytes ? (too big ?)
There will be no rounding up/down problem.

-Kame.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Balbir Singh
Dave Hansen wrote:
> On Thu, 2007-08-30 at 03:57 +0530, Balbir Singh wrote:
>> True, mmap() is a good example of such an interface for developers, I
>> am not sure about system admins though.
>>
>> To quote Andrew
>> 
>> Reporting tools could run getpagesize() and do the arithmetic, but we
>> generally try to avoid exposing PAGE_SIZE, HZ, etc to userspace in this
>> manner.
>> 
> 
> Well, rounding to PAGE_SIZE exposes PAGE_SIZE as well, just in a
> non-intuitive fashion. :)
> 

Agreed, but the user might choose to ignore it altogether.

> If we're going to modify what the user specifies, we should probably at
> least mandate that writes are only a "suggestion" and users must read
> back the value to ensure what actually got committed.
> 

Agreed, excellent suggestion!

> If we're going to round in any direction, shouldn't we round up?  If a
> user specifies 4097 bytes and uses two pages, we don't want to complain
> when they hit that second page.  
> 

Absolutely, I used rounding to mean round up, truncation for rounding down.

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Balbir Singh
Dave Hansen wrote:
> On Wed, 2007-08-29 at 15:20 -0700, Paul Menage wrote:
>> I'd argue that having the user's specified limit be truncated to the
>> page size is less confusing than giving an EINVAL if it's not page
>> aligned.
> 
> Do we truncate mmap() values to the nearest page so to not confuse the
> user? ;)
> 

I think rounding to the closest page size is a better option, but
again it can be a bit confusing. I am all for using memparse() to
parse the user input as a specification of the memory limit.

The second question of how to store it internally without truncation/
rounding is something we need to agree upon. We also need to see
how to display the data back to the user.

I chose kilobytes for two reasons

1. Several people recommended it
2. Herbert mentioned that they've moved to that interface and it
   was working fine for them.

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL

PS: I am going off to the web to search for some CUI/CLI guidelines.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Dave Hansen
On Thu, 2007-08-30 at 03:57 +0530, Balbir Singh wrote:
> True, mmap() is a good example of such an interface for developers, I
> am not sure about system admins though.
> 
> To quote Andrew
> 
> Reporting tools could run getpagesize() and do the arithmetic, but we
> generally try to avoid exposing PAGE_SIZE, HZ, etc to userspace in this
> manner.
> 

Well, rounding to PAGE_SIZE exposes PAGE_SIZE as well, just in a
non-intuitive fashion. :)

If we're going to modify what the user specifies, we should probably at
least mandate that writes are only a "suggestion" and users must read
back the value to ensure what actually got committed.

If we're going to round in any direction, shouldn't we round up?  If a
user specifies 4097 bytes and uses two pages, we don't want to complain
when they hit that second page.  

-- Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Balbir Singh
Dave Hansen wrote:
> On Thu, 2007-08-30 at 03:34 +0530, Balbir Singh wrote:
>> I've thought about this before. The problem is that a user could
>> set his limit to 1 bytes, but would then see the usage and
>> limit round to the closest page boundary. This can be confusing
>> to a user. 
> 
> True, but we're lying if we allow a user to set their limit there,
> because we can't actually enforce a limit at 8,192 bytes vs 10,000.
> They're the same limit as far as the kernel is concerned.
> 
> Why not just -EINVAL if the value isn't page-aligned?  There are plenty
> of interfaces in the kernel that require userspace to know the page
> size, so this shouldn't be too difficult.

True, mmap() is a good example of such an interface for developers, I
am not sure about system admins though.

To quote Andrew

Reporting tools could run getpagesize() and do the arithmetic, but we
generally try to avoid exposing PAGE_SIZE, HZ, etc to userspace in this
manner.


-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Dave Hansen
On Wed, 2007-08-29 at 15:20 -0700, Paul Menage wrote:
> 
> I'd argue that having the user's specified limit be truncated to the
> page size is less confusing than giving an EINVAL if it's not page
> aligned.

Do we truncate mmap() values to the nearest page so to not confuse the
user? ;)

Imagine a careful application setting and accounting for limits on a
long-running system.  Might its internal accounting get sufficiently
misaligned from the kernel's after a while to cause a problem?
Truncating values like that would appear reserve significantly less
memory than desired over a long period of time.

-- Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Paul Menage
On 8/29/07, Dave Hansen <[EMAIL PROTECTED]> wrote:
> On Thu, 2007-08-30 at 03:34 +0530, Balbir Singh wrote:
> > I've thought about this before. The problem is that a user could
> > set his limit to 1 bytes, but would then see the usage and
> > limit round to the closest page boundary. This can be confusing
> > to a user.
>
> True, but we're lying if we allow a user to set their limit there,
> because we can't actually enforce a limit at 8,192 bytes vs 10,000.
> They're the same limit as far as the kernel is concerned.
>
> Why not just -EINVAL if the value isn't page-aligned?  There are plenty
> of interfaces in the kernel that require userspace to know the page
> size, so this shouldn't be too difficult.

I'd argue that having the user's specified limit be truncated to the
page size is less confusing than giving an EINVAL if it's not page
aligned.

Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Dave Hansen
On Thu, 2007-08-30 at 03:34 +0530, Balbir Singh wrote:
> I've thought about this before. The problem is that a user could
> set his limit to 1 bytes, but would then see the usage and
> limit round to the closest page boundary. This can be confusing
> to a user. 

True, but we're lying if we allow a user to set their limit there,
because we can't actually enforce a limit at 8,192 bytes vs 10,000.
They're the same limit as far as the kernel is concerned.

Why not just -EINVAL if the value isn't page-aligned?  There are plenty
of interfaces in the kernel that require userspace to know the page
size, so this shouldn't be too difficult.

-- Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Balbir Singh
Dave Hansen wrote:
> On Wed, 2007-08-29 at 16:40 +0530, Balbir Singh wrote:
>>
>> @@ -352,7 +353,7 @@ int mem_container_charge(struct page *pa
>> kfree(pc);
>> pc = race_pc;
>> atomic_inc(&pc->ref_cnt);
>> -   res_counter_uncharge(&mem->res, 1);
>> +   res_counter_uncharge(&mem->res, MEM_CONTAINER_CHARGE_KB);
>> css_put(&mem->css);
>> goto done;
>> } 
> 
> Do these changes really need to happen anywhere besides the
> user<->kernel boundary?  Why can't internal tracking be in pages?

I've thought about this before. The problem is that a user could
set his limit to 1 bytes, but would then see the usage and
limit round to the closest page boundary. This can be confusing
to a user.

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Dave Hansen
On Wed, 2007-08-29 at 16:40 +0530, Balbir Singh wrote:
> 
> 
> @@ -352,7 +353,7 @@ int mem_container_charge(struct page *pa
> kfree(pc);
> pc = race_pc;
> atomic_inc(&pc->ref_cnt);
> -   res_counter_uncharge(&mem->res, 1);
> +   res_counter_uncharge(&mem->res, MEM_CONTAINER_CHARGE_KB);
> css_put(&mem->css);
> goto done;
> } 

Do these changes really need to happen anywhere besides the
user<->kernel boundary?  Why can't internal tracking be in pages?

-- Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Paul Menage
On 8/29/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
> >
> > This seems a bit inconsistent - if you write a value to a limit file,
> > then the value that you read back is reduced by a factor of 1024?
> > Having the "(kB)" suffix isn't really a big help to automated
> > middleware.
> >
>
> Why is that? Is it because you could write 4M and see it show up
> as 4096 kilobytes? We'll that can be fixed with another variant
> of the memparse() utility.

I was thinking the other way around - you can write 1048576 (i.e. 1MB)
to the file and read back 1024. It just seems to me that it's clearer
if you write X to the file to get X back.

>
> 64 bit might be an overkill for 32 bit machines. 32 bit machines with
> PAE cannot use 32 bit values, they need 64 bits.

How is using a 64-bit value for consistency overkill?

As someone pointed out, 4TB machines probably aren't that far around
the corner (if they're not here already) so even if you use KB rather
than bytes, userspace needs to be using an int64 for this value in
case it ends up running as a 32-bit-compiled app on a 64-bit kernel
with lots of memory.

Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Balbir Singh
Paul Menage wrote:
> On 8/29/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
>> Change the interface to use kilobytes instead of pages. Page sizes can vary
>> across platforms and configurations. A new strategy routine has been added
>> to the resource counters infrastructure to format the data as desired.
>>
>> Suggested by David Rientjes, Andrew Morton and Herbert Poetzl
>>
>> Signed-off-by: Balbir Singh <[EMAIL PROTECTED]>
>> ---
>>
>>  Documentation/controllers/memory.txt |7 +++--
>>  include/linux/res_counter.h  |6 ++--
>>  kernel/res_counter.c |   24 +
>>  mm/memcontrol.c  |   47 
>> +++
>>  4 files changed, 64 insertions(+), 20 deletions(-)
>>
>> diff -puN mm/memcontrol.c~mem-control-make-ui-use-kilobytes mm/memcontrol.c
>> --- linux-2.6.23-rc3/mm/memcontrol.c~mem-control-make-ui-use-kilobytes  
>> 2007-08-28 13:20:44.0 +0530
>> +++ linux-2.6.23-rc3-balbir/mm/memcontrol.c 2007-08-29 
>> 14:36:07.0 +0530
>> @@ -32,6 +32,7 @@
>>
>>  struct container_subsys mem_container_subsys;
>>  static const int MEM_CONTAINER_RECLAIM_RETRIES = 5;
>> +static const int MEM_CONTAINER_CHARGE_KB = (PAGE_SIZE >> 10);
>>
>>  /*
>>   * The memory controller data structure. The memory controller controls both
>> @@ -312,7 +313,7 @@ int mem_container_charge(struct page *pa
>>  * If we created the page_container, we should free it on exceeding
>>  * the container limit.
>>  */
>> -   while (res_counter_charge(&mem->res, 1)) {
>> +   while (res_counter_charge(&mem->res, MEM_CONTAINER_CHARGE_KB)) {
>> if (try_to_free_mem_container_pages(mem))
>> continue;
>>
>> @@ -352,7 +353,7 @@ int mem_container_charge(struct page *pa
>> kfree(pc);
>> pc = race_pc;
>> atomic_inc(&pc->ref_cnt);
>> -   res_counter_uncharge(&mem->res, 1);
>> +   res_counter_uncharge(&mem->res, MEM_CONTAINER_CHARGE_KB);
>> css_put(&mem->css);
>> goto done;
>> }
>> @@ -417,7 +418,7 @@ void mem_container_uncharge(struct page_
>> css_put(&mem->css);
>> page_assign_page_container(page, NULL);
>> unlock_page_container(page);
>> -   res_counter_uncharge(&mem->res, 1);
>> +   res_counter_uncharge(&mem->res, MEM_CONTAINER_CHARGE_KB);
>>
>> spin_lock_irqsave(&mem->lru_lock, flags);
>> list_del_init(&pc->lru);
>> @@ -426,12 +427,37 @@ void mem_container_uncharge(struct page_
>> }
>>  }
>>
>> -static ssize_t mem_container_read(struct container *cont, struct cftype 
>> *cft,
>> -   struct file *file, char __user *userbuf, size_t 
>> nbytes,
>> -   loff_t *ppos)
>> +int mem_container_read_strategy(unsigned long val, char *buf)
>> +{
>> +   return sprintf(buf, "%lu (kB)\n", val);
>> +}
>> +
>> +int mem_container_write_strategy(char *buf, unsigned long *tmp)
>> +{
>> +   *tmp = memparse(buf, &buf);
>> +   if (*buf != '\0')
>> +   return -EINVAL;
>> +
>> +   *tmp = *tmp >> 10;  /* convert to kilobytes */
>> +   return 0;
>> +}
> 
> This seems a bit inconsistent - if you write a value to a limit file,
> then the value that you read back is reduced by a factor of 1024?
> Having the "(kB)" suffix isn't really a big help to automated
> middleware.
> 

Why is that? Is it because you could write 4M and see it show up
as 4096 kilobytes? We'll that can be fixed with another variant
of the memparse() utility.

> I'd still be in favour of just reading/writing 64-bit values
> representing bytes - simple, and unambiguous for programmatic use, and
> not really any less user-friendly than kilobytes  for manual use
> (since the numbers involved are going to be unwieldly for manual use
> whether they're in bytes or kB).
> 

64 bit might be an overkill for 32 bit machines. 32 bit machines with
PAE cannot use 32 bit values, they need 64 bits. I think KiloBytes
is an acceptable metric these days, everybody understands them.

> Paul
> ___
> Containers mailing list
> [EMAIL PROTECTED]
> https://lists.linux-foundation.org/mailman/listinfo/containers


-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [-mm PATCH] Memory controller improve user interface

2007-08-29 Thread Paul Menage
On 8/29/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
>
> Change the interface to use kilobytes instead of pages. Page sizes can vary
> across platforms and configurations. A new strategy routine has been added
> to the resource counters infrastructure to format the data as desired.
>
> Suggested by David Rientjes, Andrew Morton and Herbert Poetzl
>
> Signed-off-by: Balbir Singh <[EMAIL PROTECTED]>
> ---
>
>  Documentation/controllers/memory.txt |7 +++--
>  include/linux/res_counter.h  |6 ++--
>  kernel/res_counter.c |   24 +
>  mm/memcontrol.c  |   47 
> +++
>  4 files changed, 64 insertions(+), 20 deletions(-)
>
> diff -puN mm/memcontrol.c~mem-control-make-ui-use-kilobytes mm/memcontrol.c
> --- linux-2.6.23-rc3/mm/memcontrol.c~mem-control-make-ui-use-kilobytes  
> 2007-08-28 13:20:44.0 +0530
> +++ linux-2.6.23-rc3-balbir/mm/memcontrol.c 2007-08-29 14:36:07.0 
> +0530
> @@ -32,6 +32,7 @@
>
>  struct container_subsys mem_container_subsys;
>  static const int MEM_CONTAINER_RECLAIM_RETRIES = 5;
> +static const int MEM_CONTAINER_CHARGE_KB = (PAGE_SIZE >> 10);
>
>  /*
>   * The memory controller data structure. The memory controller controls both
> @@ -312,7 +313,7 @@ int mem_container_charge(struct page *pa
>  * If we created the page_container, we should free it on exceeding
>  * the container limit.
>  */
> -   while (res_counter_charge(&mem->res, 1)) {
> +   while (res_counter_charge(&mem->res, MEM_CONTAINER_CHARGE_KB)) {
> if (try_to_free_mem_container_pages(mem))
> continue;
>
> @@ -352,7 +353,7 @@ int mem_container_charge(struct page *pa
> kfree(pc);
> pc = race_pc;
> atomic_inc(&pc->ref_cnt);
> -   res_counter_uncharge(&mem->res, 1);
> +   res_counter_uncharge(&mem->res, MEM_CONTAINER_CHARGE_KB);
> css_put(&mem->css);
> goto done;
> }
> @@ -417,7 +418,7 @@ void mem_container_uncharge(struct page_
> css_put(&mem->css);
> page_assign_page_container(page, NULL);
> unlock_page_container(page);
> -   res_counter_uncharge(&mem->res, 1);
> +   res_counter_uncharge(&mem->res, MEM_CONTAINER_CHARGE_KB);
>
> spin_lock_irqsave(&mem->lru_lock, flags);
> list_del_init(&pc->lru);
> @@ -426,12 +427,37 @@ void mem_container_uncharge(struct page_
> }
>  }
>
> -static ssize_t mem_container_read(struct container *cont, struct cftype *cft,
> -   struct file *file, char __user *userbuf, size_t 
> nbytes,
> -   loff_t *ppos)
> +int mem_container_read_strategy(unsigned long val, char *buf)
> +{
> +   return sprintf(buf, "%lu (kB)\n", val);
> +}
> +
> +int mem_container_write_strategy(char *buf, unsigned long *tmp)
> +{
> +   *tmp = memparse(buf, &buf);
> +   if (*buf != '\0')
> +   return -EINVAL;
> +
> +   *tmp = *tmp >> 10;  /* convert to kilobytes */
> +   return 0;
> +}

This seems a bit inconsistent - if you write a value to a limit file,
then the value that you read back is reduced by a factor of 1024?
Having the "(kB)" suffix isn't really a big help to automated
middleware.

I'd still be in favour of just reading/writing 64-bit values
representing bytes - simple, and unambiguous for programmatic use, and
not really any less user-friendly than kilobytes  for manual use
(since the numbers involved are going to be unwieldly for manual use
whether they're in bytes or kB).

Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/