[Devel] Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

2007-10-04 Thread Greg KH
On Thu, Sep 27, 2007 at 01:25:48PM -0600, Eric W. Biederman wrote: > > I still need to look at the code in detail but I have some concerns > I want to inject into this conversation of future sysfs architecture. > > - If we want to carefully limit sysfs from going too wild code review > is clearl

[Devel] Re: [PATCH 0/3] Make tasks always have non-zero pids

2007-10-04 Thread sukadev
Pavel Emelianov [EMAIL PROTECTED] wrote: | Some time ago Sukadev noticed that the vmlinux size has Cedric pointed it out to me first :-) | grown 5Kb due to merged pid namespaces. One of the big | problems with it was fat inline functions. The other thing | was noticed by Matt - the checks for tas

[Devel] [PATCH] Simplify memory controller and resource counter I/O

2007-10-04 Thread Paul Menage
Simplify the memory controller and resource counter I/O routines This patch strips out some I/O boilerplate from resource counters and the memory controller. It also adds locking to the resource counter reads and writes, and forbids writes to the root memory cgroup's limit file. cgroup_write_uin

[Devel] Re: [PATCH] Simplify memory controller and resource counter I/O

2007-10-04 Thread Paul Menage
On 10/4/07, Balbir Singh <[EMAIL PROTECTED]> wrote: > > Yes, either that way or add a strategy function, that would take > the string input from the user and convert it to unsigned long long > value. I am ok with either approach. > OK, new version of the patch sent in a separate mail. Paul __

[Devel] Re: [PATCH 1/3] Signal semantics for /sbin/init

2007-10-04 Thread sukadev
| > | > One solution I was thinking of was to possibly queue pending blocked | > | > signals to a container init separately and then requeue them on the | > | > normal queue when signals are unblocked. It's definitely not an easier | > | > solution, but might be less intrusive than the "signal from

[Devel] [RFC] [-mm PATCH] Memory controller fix swap charging context in unuse_pte()

2007-10-04 Thread Balbir Singh
Found-by: Hugh Dickins <[EMAIL PROTECTED]> mem_cgroup_charge() in unuse_pte() is called under a lock, the pte_lock. That's clearly incorrect, since we pass GFP_KERNEL to mem_cgroup_charge() for allocation of page_cgroup. This patch releases the lock and reacquires the lock after the call to mem_

[Devel] Re: [PATCH] Simplify memory controller and resource counter I/O

2007-10-04 Thread Balbir Singh
Paul Menage wrote: > On 10/4/07, Balbir Singh <[EMAIL PROTECTED]> wrote: >> Paul Menage wrote: >>> On 10/4/07, Balbir Singh <[EMAIL PROTECTED]> wrote: Forbidding writing to the root resource counter is a policy decision I am unable to make up my mind about. It sounds right, but unless >>>

[Devel] Re: [PATCH] Simplify memory controller and resource counter I/O

2007-10-04 Thread Paul Menage
On 10/4/07, Balbir Singh <[EMAIL PROTECTED]> wrote: > Paul Menage wrote: > > On 10/4/07, Balbir Singh <[EMAIL PROTECTED]> wrote: > >> Forbidding writing to the root resource counter is a policy decision > >> I am unable to make up my mind about. It sounds right, but unless > >> we have a notion of

[Devel] Re: [RFC] [PATCH] memory controller statistics

2007-10-04 Thread Balbir Singh
YAMAMOTO Takashi wrote: >> hi, >> >> i implemented some statistics for your memory controller. >> >> it's tested with 2.6.23-rc2-mm2 + memory controller v7. >> i think it can be applied to 2.6.23-rc4-mm1 as well. >> >> YAMAMOTO Takashi >> >> todo: something like nr_active/inactive in /proc/vmstat. >

[Devel] Re: [PATCH] Simplify memory controller and resource counter I/O

2007-10-04 Thread Balbir Singh
Paul Menage wrote: > On 10/4/07, Balbir Singh <[EMAIL PROTECTED]> wrote: >> Forbidding writing to the root resource counter is a policy decision >> I am unable to make up my mind about. It sounds right, but unless >> we have a notion of unlimited resources, I am a bit concerned about >> taking away

[Devel] Re: [PATCH] Simplify memory controller and resource counter I/O

2007-10-04 Thread Paul Menage
On 10/4/07, Balbir Singh <[EMAIL PROTECTED]> wrote: > > Forbidding writing to the root resource counter is a policy decision > I am unable to make up my mind about. It sounds right, but unless > we have a notion of unlimited resources, I am a bit concerned about > taking away this flexibility. One

[Devel] Re: [PATCH] Simplify memory controller and resource counter I/O

2007-10-04 Thread Balbir Singh
Paul Menage wrote: > Hi Balbir, > > Any thoughts on this patch? > Hi, Paul, I remember seeing this patch, sorry for not responding earlier > Cheers, > > Paul > > On 9/25/07, Paul Menage <[EMAIL PROTECTED]> wrote: >> Simplify the memory controller and resource counter I/O routines >> >> This

[Devel] Re: [PATCH] Simplify memory controller and resource counter I/O

2007-10-04 Thread Paul Menage
Hi Balbir, Any thoughts on this patch? Cheers, Paul On 9/25/07, Paul Menage <[EMAIL PROTECTED]> wrote: > Simplify the memory controller and resource counter I/O routines > > This patch strips out some I/O boilerplate from resource counters and > the memory controller. It also adds locking to th

[Devel] Re: [PATCH 2/3] Prepare pid_nr() etc functions to work with not-NULL pids

2007-10-04 Thread Matt Mackall
On Thu, Oct 04, 2007 at 12:54:17PM +0400, Pavel Emelyanov wrote: > Matt Mackall wrote: > > On Wed, Oct 03, 2007 at 06:20:43PM +0400, Pavel Emelyanov wrote: > >> Just make the __pid_nr() etc functions that expect the argument > >> to always be not NULL. > >> > >> Signed-off-by: Pavel Emelyanov <[EMA

[Devel] Re: [PATCH 11/33] task containersv11 make cpusets a client of containers

2007-10-04 Thread Paul Jackson
Paul M wrote: > I didn't notice any performance hit on a pure allocate/free memory > benchmark relative to non-cgroup cpusets. Good. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.6

[Devel] Re: [PATCH 11/33] task containersv11 make cpusets a client of containers

2007-10-04 Thread Paul Jackson
Paul M wrote: > It's two constant-indexed dereferences *in total*, compared to a > single constant-indexed dereference in the pre-cgroup case. Ok - the C expression is longer and I didn't realize how little difference it made in the end (the executing code.) Good - thanks. --

[Devel] [PATCH] Rename is_cgroup_init()

2007-10-04 Thread sukadev
From: Sukadev Bhattiprolu <[EMAIL PROTECTED]> Subject: [PATCH] Rename is_cgroup_init() is_container_init() was accidentally renamed to is_cgroup_init() when renaming "container" to "control group". This patch restores the original name. Signed-off-by: Sukadev Bhattiprolu <[EMAIL PROTECTED]> Acked

[Devel] Re: netns : close all sockets at unshare ?

2007-10-04 Thread Cedric Le Goater
Eric W. Biederman wrote: > Daniel Lezcano <[EMAIL PROTECTED]> writes: >> Yes, it will work. >> >> Do we want to be inside a network namespace and to use a socket belonging to >> another network namespace ? If yes, then my remark is irrelevant. > > Yes we do. > Shall we close all fd sockets w

[Devel] Re: [PATCH 2/3] Introduce the res_counter_populate() function

2007-10-04 Thread Paul Menage
On 10/4/07, Pavel Emelyanov <[EMAIL PROTECTED]> wrote: > This one is responsible for initializing the RES_CFT_MAX files > properly and registering them inside the container. > > The caller must provide the cgroup, the cgroup_subsys, the > RES_CFT_MAX * sizeof(cftype) chunk of zeroed memory, the > unit

[Devel] Re: [PATCH 1/3] Typedefs the read and write functions in cftype

2007-10-04 Thread Paul Menage
On 10/4/07, Pavel Emelyanov <[EMAIL PROTECTED]> wrote: > This is just to reduce the code amount in the future. > > Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> Acked-by: Paul Menage <[EMAIL PROTECTED]> > > --- > > diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h > index 8747932.

[Devel] Re: [PATCH 11/33] task containersv11 make cpusets a client of containers

2007-10-04 Thread Paul Menage
On 10/4/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > Paul M, > > This snippet from the memory allocation hot path worries me a bit. > > Once per memory page allocation, we go through here, needing to peek inside > the current task's cpuset to see if it has changed (its 'mems_generation' > value do

Re: [Devel] [PATCH 2/3] Introduce the res_counter_populate() function

2007-10-04 Thread Paul Menage
On 10/4/07, Kir Kolyshkin <[EMAIL PROTECTED]> wrote: > > 3. Put units into a separate new files named "units" (or, well, > "measurement_units" (or even "measured_in") if you are fan of long > descriptive names). So, "cat units" will show us "bytes" or "items" or > "pages"... I'd vote for this opti

Re: [Devel] [PATCH][NETNS] Move some code into __init section when CONFIG_NET_NS=n

2007-10-04 Thread Alexey Dobriyan
On Thu, Oct 04, 2007 at 05:54:11PM +0400, Pavel Emelyanov wrote: > With the net namespaces much code left the __init section, > thus making the kernel occupy more memory than it did before. > Since we have a config option that prohibits the namespace > creation, the functions that initialize/fina

[Devel] [PATCH][NETNS] Move some code into __init section when CONFIG_NET_NS=n

2007-10-04 Thread Pavel Emelyanov
With the net namespaces much code left the __init section, thus making the kernel occupy more memory than it did before. Since we have a config option that prohibits the namespace creation, the functions that initialize/finalize some netns stuff are simply not needed and can be freed after the bo

[Devel] Re: [patch -mm 1/5] mqueue namespace : add struct mq_namespace

2007-10-04 Thread Serge E. Hallyn
Quoting Cedric Le Goater ([EMAIL PROTECTED]): > [EMAIL PROTECTED] wrote: > > Eric W. Biederman [EMAIL PROTECTED] wrote: > > | [EMAIL PROTECTED] writes: > > | > > | > Cedric Le Goater [EMAIL PROTECTED] wrote: > > | > | > I think you and Eric (and I) are disagreeing about those > > limitations. > >

[Devel] Re: [patch -mm 1/5] mqueue namespace : add struct mq_namespace

2007-10-04 Thread Cedric Le Goater
[EMAIL PROTECTED] wrote: > Eric W. Biederman [EMAIL PROTECTED] wrote: > | [EMAIL PROTECTED] writes: > | > | > Cedric Le Goater [EMAIL PROTECTED] wrote: > | > | > I think you and Eric (and I) are disagreeing about those limitations. > | > | > You take it for granted that a sibling pidns is off limi

[Devel] Re: [patch -mm 1/5] mqueue namespace : add struct mq_namespace

2007-10-04 Thread sukadev
Eric W. Biederman [EMAIL PROTECTED] wrote: | [EMAIL PROTECTED] writes: | | > Cedric Le Goater [EMAIL PROTECTED] wrote: | > | > I think you and Eric (and I) are disagreeing about those limitations. | > | > You take it for granted that a sibling pidns is off limits for signals. | > | > But the signa

[Devel] Re: [patch -mm 1/5] mqueue namespace : add struct mq_namespace

2007-10-04 Thread Eric W. Biederman
[EMAIL PROTECTED] writes: > Cedric Le Goater [EMAIL PROTECTED] wrote: > | > I think you and Eric (and I) are disagreeing about those limitations. > | > You take it for granted that a sibling pidns is off limits for signals. > | > But the signal wasn't sent using a pid, but using a file (in SIGIO >

[Devel] Re: [patch -mm 1/5] mqueue namespace : add struct mq_namespace

2007-10-04 Thread Eric W. Biederman
Cedric Le Goater <[EMAIL PROTECTED]> writes: > [ > I have big fingers this morning and I managed to send this email > while typing it ... see below for the end. I should be awake now :) > ] > >> The really challenging case to handle here is what happens if we are >> signaling to someone in

[Devel] Re: [patch -mm 1/5] mqueue namespace : add struct mq_namespace

2007-10-04 Thread Cedric Le Goater
> I think you and Eric (and I) are disagreeing about those limitations. > You take it for granted that a sibling pidns is off limits for signals. > But the signal wasn't sent using a pid, but using a file (in SIGIO > case). So since the fs was shared, the signal should be sent. An > event happene

[Devel] Re: [patch -mm 1/5] mqueue namespace : add struct mq_namespace

2007-10-04 Thread sukadev
Cedric Le Goater [EMAIL PROTECTED] wrote: | > I think you and Eric (and I) are disagreeing about those limitations. | > You take it for granted that a sibling pidns is off limits for signals. | > But the signal wasn't sent using a pid, but using a file (in SIGIO | > case). So since the fs was shar

[Devel] Re: [patch -mm 1/5] mqueue namespace : add struct mq_namespace

2007-10-04 Thread Serge E. Hallyn
Quoting Cedric Le Goater ([EMAIL PROTECTED]): > [EMAIL PROTECTED] wrote: > > Cedric Le Goater [EMAIL PROTECTED] wrote: > > | > > | >> however, we have an issue with the signal notification in __do_notify() > > | >> we could kill a process in a different pid namespace. > > | > > > | > So I took a

Re: [Devel] [PATCH 2/3] Introduce the res_counter_populate() function

2007-10-04 Thread Kir Kolyshkin
Pavel Emelyanov wrote: <...skipped...> +static char * units_names[RES_UNITS_MAX][RES_CFT_MAX] = { + [RES_UNITS_BYTES] = { + "usage_in_bytes", + "limit_in_bytes", + "failcnt", + }, + [RES_UNITS_ITEMS] = { + "usage", +
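
The table quoted above maps a (units, file) pair to a cgroup file name. A minimal, self-contained sketch of that lookup follows. It is an illustration only, not the code from the patch: RES_UNITS_BYTES, RES_UNITS_ITEMS, RES_UNITS_MAX, RES_CFT_MAX and the strings "usage_in_bytes", "limit_in_bytes", "failcnt" and "usage" come from the quoted snippet, while the RES_CFT_USAGE/LIMIT/FAILCNT indices, the helper res_file_name(), and the "limit" and "failcnt" entries of the items row are assumptions made for the example, because the preview is cut off at that point.

```c
#include <stddef.h>

/* Units a resource counter can be reported in (names from the quoted patch). */
enum res_units {
	RES_UNITS_BYTES,
	RES_UNITS_ITEMS,
	RES_UNITS_MAX,
};

/* Hypothetical per-file indices; only RES_CFT_MAX appears in the quoted text. */
enum res_cft {
	RES_CFT_USAGE,
	RES_CFT_LIMIT,
	RES_CFT_FAILCNT,
	RES_CFT_MAX,
};

/* Designated initializers keep each name tied to its index. */
static const char *const units_names[RES_UNITS_MAX][RES_CFT_MAX] = {
	[RES_UNITS_BYTES] = {
		[RES_CFT_USAGE]   = "usage_in_bytes",
		[RES_CFT_LIMIT]   = "limit_in_bytes",
		[RES_CFT_FAILCNT] = "failcnt",
	},
	[RES_UNITS_ITEMS] = {
		[RES_CFT_USAGE]   = "usage",
		[RES_CFT_LIMIT]   = "limit",   /* assumed: not visible in the preview */
		[RES_CFT_FAILCNT] = "failcnt", /* assumed: not visible in the preview */
	},
};

/* Hypothetical helper: returns the file name, or NULL if an index is out of range. */
const char *res_file_name(int units, int file)
{
	if (units < 0 || units >= RES_UNITS_MAX)
		return NULL;
	if (file < 0 || file >= RES_CFT_MAX)
		return NULL;
	return units_names[units][file];
}
```

With these assumed names, res_file_name(RES_UNITS_BYTES, RES_CFT_LIMIT) yields "limit_in_bytes", and any out-of-range index yields NULL rather than reading past the table.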

[Devel] Re: [PATCH 11/33] task containersv11 make cpusets a client of containers

2007-10-04 Thread Paul Jackson
Paul M, This snippet from the memory allocation hot path worries me a bit. Once per memory page allocation, we go through here, needing to peek inside the current task's cpuset to see if it has changed (its 'mems_generation' value doesn't match the last seen value we have stashed in the task stru

[Devel] [PATCH 3/3] Use the res_counter_populate in memory controller

2007-10-04 Thread Pavel Emelyanov
Note, that the controller code dealing with the cftype files for resource counters becomes much shorter and won't have to be changed in the future. Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> --- diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 1b8bf24..2e62d24 100644 --- a/mm/memcon

[Devel] [PATCH 1/3] Typedefs the read and write functions in cftype

2007-10-04 Thread Pavel Emelyanov
This is just to reduce the code amount in the future. Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> --- diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index 8747932..0635004 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -178,6 +178,15 @@ struct css_set {

[Devel] [PATCH 2/3] Introduce the res_counter_populate() function

2007-10-04 Thread Pavel Emelyanov
This one is responsible for initializing the RES_CFT_MAX files properly and registering them inside the container. The caller must provide the cgroup, the cgroup_subsys, the RES_CFT_MAX * sizeof(cftype) chunk of zeroed memory, the units of measure and the read and write callbacks. Right now I made n

[Devel] [PATCH 0/3] Consolidate cgroup files creation for resource counters (v2)

2007-10-04 Thread Pavel Emelyanov
Changes from previous version * made the names configurable * fixed race between res_counter_populate and reading any of these files (memset to zero could spoof the pointers) Right now we have only one controller in -mm tree - the memory one - and it initializes all its files manually. After I

[Devel] Re: [PATCH 1/3] Introduce the dummy_pid

2007-10-04 Thread Pavel Emelyanov
Randy Dunlap wrote: > On Wed, 03 Oct 2007 18:19:01 +0400 Pavel Emelyanov wrote: > >> This is a pid which is attached to tasks when they detach >> their pids. This is done in detach_pid() and transfer_pid(). >> The pid_alive() check is changed to reflect this fact. >> >> Signed-off-by: Pavel Emelya

[Devel] Re: [PATCH 2/3] Prepare pid_nr() etc functions to work with not-NULL pids

2007-10-04 Thread Pavel Emelyanov
Matt Mackall wrote: > On Wed, Oct 03, 2007 at 06:20:43PM +0400, Pavel Emelyanov wrote: >> Just make the __pid_nr() etc functions that expect the argument >> to always be not NULL. >> >> Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> > >> static inline pid_t pid_nr(struct pid *pid) >> { >>