[Devel] Re: [PATCH] memory cgroup enhancements [1/5] force_empty for memory cgroup

2007-10-16 Thread KAMEZAWA Hiroyuki
On Tue, 16 Oct 2007 22:38:18 -0700 (PDT) David Rientjes <[EMAIL PROTECTED]> wrote: > > Thanks, > > You mean make it write-only? Typically it would be as easy as only > specifying a mode of S_IWUSR so that it can only be written to, the > S_IFREG is already provided by cgroup_add_file(). > > Un

[Devel] Re: [PATCH] memory cgroup enhancements [1/5] force_empty for memory cgroup

2007-10-16 Thread David Rientjes
On Wed, 17 Oct 2007, KAMEZAWA Hiroyuki wrote: > > > +static ssize_t mem_force_empty_read(struct cgroup *cont, > > > + struct cftype *cft, > > > + struct file *file, char __user *userbuf, > > > + size_t nbytes, loff_t *ppos) >

[Devel] Re: [PATCH] memory cgroup enhancements [1/5] force_empty for memory cgroup

2007-10-16 Thread KAMEZAWA Hiroyuki
On Wed, 17 Oct 2007 10:35:58 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote: > > If the only use of this is for rmdir, why not just make it part of the > > rmdir operation on the memory cgroup if there are no tasks by default? > > > > That's a good idea, but sometimes an administrator might want t

[Devel] Re: [PATCH] memory cgroup enhancements [0/5] intro

2007-10-16 Thread KAMEZAWA Hiroyuki
On Tue, 16 Oct 2007 11:28:43 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote: > > I would prefer these patches to go in once the fixes that you've posted > > earlier have gone in (the migration fix series). I am yet to test the > > migration fix per-se, but the series seemed quite fine to me. Andrew

[Devel] Re: [PATCH] memory cgroup enhancements [1/5] force_empty for memory cgroup

2007-10-16 Thread KAMEZAWA Hiroyuki
On Tue, 16 Oct 2007 21:17:06 -0700 (PDT) David Rientjes <[EMAIL PROTECTED]> wrote: > > This is useful to invoke rmdir() against memory cgroup successfully. > > > > If there's no tasks in the cgroup, then how can there be any charges > against its memory controller? Is the memory not being uncha

[Devel] Re: [PATCH] memory cgroup enhancements [3/5] record pc is on active list

2007-10-16 Thread KAMEZAWA Hiroyuki
On Tue, 16 Oct 2007 21:17:24 -0700 (PDT) David Rientjes <[EMAIL PROTECTED]> wrote: > On Tue, 16 Oct 2007, KAMEZAWA Hiroyuki wrote: > > > Remember page_cgroup is on active_list or not in page_cgroup->flags. > > > > Against 2.6.23-mm1. > > > > Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>

[Devel] Re: [PATCH] memory cgroup enhancements [1/5] force_empty for memory cgroup

2007-10-16 Thread Balbir Singh
David Rientjes wrote: > On Tue, 16 Oct 2007, KAMEZAWA Hiroyuki wrote: > >> This patch adds an interface "memory.force_empty". >> Any write to this file will drop all charges in this cgroup if >> there is no task under. >> >> %echo 1 > /../memory.force_empty >> >> will drop all charges of memor

[Devel] Re: [PATCH] memory cgroup enhancements [3/5] record pc is on active list

2007-10-16 Thread David Rientjes
On Tue, 16 Oct 2007, KAMEZAWA Hiroyuki wrote: > Remember page_cgroup is on active_list or not in page_cgroup->flags. > > Against 2.6.23-mm1. > > Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> > Signed-off-by: YAMAMOTO Takashi <[EMAIL PROTECTED]> > > mm/memcontrol.c | 12 >

[Devel] Re: [PATCH] memory cgroup enhancements [1/5] force_empty for memory cgroup

2007-10-16 Thread David Rientjes
On Tue, 16 Oct 2007, KAMEZAWA Hiroyuki wrote: > This patch adds an interface "memory.force_empty". > Any write to this file will drop all charges in this cgroup if > there is no task under. > > %echo 1 > /../memory.force_empty > > will drop all charges of memory cgroup if cgroup's tasks is e

[Devel] Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

2007-10-16 Thread Eric W. Biederman
[EMAIL PROTECTED] writes: > | No. The "other" device namespace I would construct on machine B to > | look just like the device namespace that existed on machine A. > | Making /sys/devices/block/sda would still be 8:0. > | > | So to be very clear on machine B when talking about disk-1 I would have

[Devel] Re: [RFC] cpuset update_cgroup_cpus_allowed

2007-10-16 Thread Paul Jackson
David wrote: > Not necessarily because migration only occurs to any online cpu in the > mask, it won't attempt to migrate it to some cpu that has been downed. > ... ... one of David or I is insane ... I can't tell which one yet, perhaps both of us ;). I'm going to reply to David without all the

[Devel] Re: [PATCHSET 3/4] sysfs: divorce sysfs from kobject and driver model

2007-10-16 Thread sukadev
Eric W. Biederman [EMAIL PROTECTED] wrote: | Greg KH <[EMAIL PROTECTED]> writes: | | > On Fri, Oct 05, 2007 at 06:12:41AM -0600, Eric W. Biederman wrote: | >> Greg KH <[EMAIL PROTECTED]> writes: | >> > | >> >> Also fun is that the dev file implementation needs to be able to | >> >> report diff

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Serge E. Hallyn
Quoting Cedric Le Goater ([EMAIL PROTECTED]): > > > asmlinkage long sys_hijack(unsigned long clone_flags, int which, > >unsigned long id); > > I expect to get more explanation of the arguments in the patch > you are going to send :) There'll have to be a man page at some

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Cedric Le Goater
> asmlinkage long sys_hijack(unsigned long clone_flags, int which, > unsigned long id); I expect to get more explanation of the arguments in the patch you are going to send :) 'which' is used as a switch for 'id' : pid or fd on a open cgroup directory. right ? Thanks

[Devel] Re: [BUGFIX][RFC][PATCH][only -mm] FIX memory leak in memory cgroup vs. page migration [0/1]

2007-10-16 Thread Balbir Singh
KAMEZAWA Hiroyuki wrote: [snip] > # migrate_test mmaps 512Mfile and call system call move_pages(). and sleep. > [EMAIL PROTECTED] kamezawa]# ./migrate_test 512Mfile 1 & > [1] 4108 This step fails for me. I get an -ENOENT error from the utility you sent me. As I look through the migration code more

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Paul Menage
On 10/16/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: > > Oh good, so I can just pass in a single arg id, so > > asmlinkage long sys_hijack(unsigned long clone_flags, int which, >unsigned long id); > Sounds good. Paul

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Serge E. Hallyn
Quoting Paul Menage ([EMAIL PROTECTED]): > On 10/16/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: > > > > Currently every pid namespace's pid==1 must stick around as long as the > > pid namespace does. If you kill the pid==1, all processes in the > > container are killed. > > What about people w

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Paul Menage
On 10/16/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: > > Currently every pid namespace's pid==1 must stick around as long as the > pid namespace does. If you kill the pid==1, all processes in the > container are killed. What about people who aren't using pid namespaces? > > > > Anyway, I can

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Serge E. Hallyn
Quoting Paul Menage ([EMAIL PROTECTED]): > On 10/16/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: > > pid, but wasn't sure how best to identify the cgroup. Originally I was > > more worried about pid exiting/wraparound, but then decided that with a > > real container the container_init can't go a

[Devel] Re: [PATCH] memory cgroup enhancements [0/5] intro

2007-10-16 Thread Balbir Singh
Andrew Morton wrote: > On Tue, 16 Oct 2007 23:50:28 +0530 > Balbir Singh <[EMAIL PROTECTED]> wrote: > >>> [1/5] ... force_empty patch >>> [2/5] ... remember page is charged as page-cache patch >>> [3/5] ... remember page is on which list patch >>> [4/5] ... memory cgroup statistics patch >>> [5/5]

[Devel] Re: [RFC] cpuset update_cgroup_cpus_allowed

2007-10-16 Thread David Rientjes
On Tue, 16 Oct 2007, Paul Jackson wrote: > David wrote: > > Why can't you just add a helper function to sched.c: > > > > void set_hotcpus_allowed(struct task_struct *task, > > cpumask_t cpumask) > > { > > mutex_lock(&sched_hotcpu_mutex); > >

[Devel] Re: [PATCH] memory cgroup enhancements [0/5] intro

2007-10-16 Thread Andrew Morton
On Tue, 16 Oct 2007 23:50:28 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote: > > [1/5] ... force_empty patch > > [2/5] ... remember page is charged as page-cache patch > > [3/5] ... remember page is on which list patch > > [4/5] ... memory cgroup statistics patch > > [5/5] ... show statistics patch

[Devel] Re: [PATCH] memory cgroup enhancements [0/5] intro

2007-10-16 Thread Balbir Singh
KAMEZAWA Hiroyuki wrote: > This patch set adds > - force_empty interface, which drops all charges in memory cgroup. >This enables rmdir() against unused memory cgroup. > - the memory cgroup statistics accounting. > > Based on 2.6.23-mm1 + http://lkml.org/lkml/2007/10/12/53 > > Changes from

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Paul Menage
On 10/16/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: > pid, but wasn't sure how best to identify the cgroup. Originally I was > more worried about pid exiting/wraparound, but then decided that with a > real container the container_init can't go away until the container goes > away anyway. For

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Serge E. Hallyn
Quoting Paul Menage ([EMAIL PROTECTED]): > One thought on this - could we make the API have a "which" parameter > that indicates the type of thing being acted upon? E.g., like > sys_setpriority(), which can specify the target as a process, a pgrp > or a user. > > Right now the target would just be

[Devel] Re: [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Serge E. Hallyn
Quoting Cedric Le Goater ([EMAIL PROTECTED]): > > >> hmm, I'm wondering how this is going to work for a process which > >> would have unshared its device (pts) namespace. How are we going > >> to link the pts living in different namespaces if the stdios of the > >> hijacked process is using them

[Devel] [PATCH 7/7] Consolidate frag queues freeing

2007-10-16 Thread Pavel Emelyanov
Since we now allocate the queues in inet_fragment.c, we can safely free it in the same place. The ->destructor callback thus becomes optional for inet_frags. Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> --- diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c index 470b056..a75

[Devel] [PATCH 6/7] Remove no longer needed ->equal callback

2007-10-16 Thread Pavel Emelyanov
Since this callback is used to check for conflicts in hashtable when inserting a newly created frag queue, we can do the same by checking for matching the queue with the argument, used to create one. Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> --- diff --git a/include/net/inet_frag.h b/i

[Devel] [PATCH 5/7] Consolidate xxx_find() in fragment management

2007-10-16 Thread Pavel Emelyanov
Here we need another callback ->match to check whether the entry found in hash matches the key passed. The key used is the same as the creation argument for inet_frag_create. Yet again, this ->match is the same for netfilter and ipv6. Running a few steps forward - this callback will later replac

[Devel] [PATCH 4/7] Consolidate xxx_frag_create()

2007-10-16 Thread Pavel Emelyanov
This one uses the xxx_frag_intern() and xxx_frag_alloc() routines, which are already consolidated, so remove them from protocol code (as promised). The ->constructor callback is used to init the rest of the frag queue and it is the same for netfilter and ipv6. Signed-off-by: Pavel Emelyanov <[EMA

[Devel] [PATCH 3/7] Consolidate xxx_frag_alloc()

2007-10-16 Thread Pavel Emelyanov
Just perform the kzalloc() allocation and setup common fields in the inet_frag_queue(). Then return the result to the caller to initialize the rest. The inet_frag_alloc() may return NULL, so check the return value before doing the container_of(). This looks ugly, but the xxx_frag_alloc() will be

[Devel] [PATCH 2/7] Consolidate xxx_frag_intern

2007-10-16 Thread Pavel Emelyanov
This routine checks for the existence of a given entry in the hash table and inserts the new one if needed. The ->equal callback is used to compare two frag_queue-s together, but this one is temporary and will be removed later. The netfilter code and the ipv6 one use the same routine to compare fr

[Devel] [PATCH 1/7] Omit double hash calculations in xxx_frag_intern

2007-10-16 Thread Pavel Emelyanov
Since the hash value is already calculated in xxx_find, we can simply use it later. This is already done in netfilter code, so make the same in ipv4 and ipv6. Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> --- From 5bcb464538cf504a9f6cf3be33dbd1a5ff18b693 Mon Sep 17 00:00:00 2001 From: Pa

[Devel] [PATCH 0/7] Next step in consolidating the IP fragment management

2007-10-16 Thread Pavel Emelyanov
This set concerns the xxx_find() routine only. This one is used to obtain the fragment queue depending on some criteria, and this can be consolidated. The consolidation goes step-by-step, consolidating one sub-routine at a time, so some functions, exports and callbacks that are introduced in patc

[Devel] Re: [PATCH] Control groups: Replace "cont" with "cgrp" and other misc renaming

2007-10-16 Thread Paul Jackson
> Replace "cont" with "cgrp" and other misc renaming Acked-by: Paul Jackson <[EMAIL PROTECTED]> Builds, boots, and I approve the 'cgrp' renaming - thanks. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EM

[Devel] [PATCH] Control groups: Replace "cont" with "cgrp" and other misc renaming

2007-10-16 Thread Paul Menage
Replace "cont" with "cgrp" and other misc renaming This patch finishes some of the names that got missed in the great "task containers" -> "control groups" rename. Primarily it renames the local variable "cont" to "cgrp" in a number of places, and renames the CONT_* enum members to CGRP_*. This

[Devel] Re: CONFIG_NAMESPACE* patchset

2007-10-16 Thread Cedric Le Goater
>> do you have some time to refresh on -mm and resend this set of >> patches ? If you don't, I can give it a try. > > I can, but I think that Andrew is busy with merging the > patches into 2.6.24 window, so I'd like to wait for things > to settle down and re-send all the patches that were not >

[Devel] Re: CONFIG_NAMESPACE* patchset

2007-10-16 Thread Pavel Emelyanov
Cedric Le Goater wrote: > Pavel, > > do you have some time to refresh on -mm and resend this set of > patches ? If you don't, I can give it a try. I can, but I think that Andrew is busy with merging the patches into 2.6.24 window, so I'd like to wait for things to settle down and re-send all the

[Devel] CONFIG_NAMESPACE* patchset

2007-10-16 Thread Cedric Le Goater
Pavel, do you have some time to refresh on -mm and resend this set of patches ? If you don't, I can give it a try. I'd like to base some of my patches on top of them. Thanks, C. ___ Containers mailing list [EMAIL PROTECTED] https://lists.linux-found

[Devel] [PATCH] memory cgroup enhancements [5/5] show statistics by memory.stat file per cgroup

2007-10-16 Thread KAMEZAWA Hiroyuki
Show accounted information of memory cgroup by memory.stat file Changelog v1->v2 - dropped Charge/Uncharge entry. Signed-off-by: YAMAMOTO Takashi <[EMAIL PROTECTED]> Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> mm/memcontrol.c | 52

[Devel] [PATCH] memory cgroup enhancements [4/5] memory cgroup statistics

2007-10-16 Thread KAMEZAWA Hiroyuki
Add statistics account infrastructure for memory controller. Changelog v1 -> v2 - Removed Charge/Uncharge counter - reflected comments. - changes __move_lists() args. - changes __mem_cgroup_stat_add() name, comment and added VM_BUG_ON Changes from original: - divided into 2 patch (accoun

[Devel] [PATCH] memory cgroup enhancements [3/5] record pc is on active list

2007-10-16 Thread KAMEZAWA Hiroyuki
Remember page_cgroup is on active_list or not in page_cgroup->flags. Against 2.6.23-mm1. Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Signed-off-by: YAMAMOTO Takashi <[EMAIL PROTECTED]> mm/memcontrol.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) Index: devel-2.

[Devel] [PATCH] memory cgroup enhancements [2/5] remember charge as cache

2007-10-16 Thread KAMEZAWA Hiroyuki
Add PCGF_PAGECACHE flag to page_cgroup to remember "this page is charged as page-cache." This is very useful for implementing precise accounting in memory cgroup. Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Signed-off-by: YAMAMOTO Takashi <[EMAIL PROTECTED]> mm/memcontrol.c | 18

[Devel] [PATCH] memory cgroup enhancements [1/5] force_empty for memory cgroup

2007-10-16 Thread KAMEZAWA Hiroyuki
This patch adds an interface "memory.force_empty". Any write to this file will drop all charges in this cgroup if there is no task under. %echo 1 > /../memory.force_empty will drop all charges of memory cgroup if cgroup's tasks is empty. This is useful to invoke rmdir() against memory cgroup
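The echo command in the description amounts to a single write to the control file. A minimal userspace sketch of the same operation follows; the function name is hypothetical, and the path is left to the caller because the mount point is elided in the original message.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write "1\n" to the force_empty file at `path`,
 * as `echo 1 > path` would. Returns 0 on success, -1 on any failure
 * (file missing, not writable, or short write). */
static int trigger_force_empty(const char *path)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, "1\n", 2);
    int close_rc = close(fd);

    return (written == 2 && close_rc == 0) ? 0 : -1;
}
```

According to the description, the value written does not matter: any write triggers the drop, provided the cgroup has no tasks.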

[Devel] [PATCH] memory cgroup enhancements [0/5] intro

2007-10-16 Thread KAMEZAWA Hiroyuki
This patch set adds - force_empty interface, which drops all charges in memory cgroup. This enables rmdir() against unused memory cgroup. - the memory cgroup statistics accounting. Based on 2.6.23-mm1 + http://lkml.org/lkml/2007/10/12/53 Changes from previous version is - merged comments.

[Devel] Re: [RFC] cpuset update_cgroup_cpus_allowed

2007-10-16 Thread Paul Menage
Paul Jackson wrote: Any chance you could provide a patch that works against cgroups? Fix cpusets update_cpumask Cause writes to cpuset "cpus" file to update cpus_allowed for member tasks: - collect batches of tasks under tasklist_lock and then call set_cpus_allowed() on them outside the lo

[Devel] Re: [RFC] cpuset update_cgroup_cpus_allowed

2007-10-16 Thread Paul Jackson
David wrote: > Why can't you just add a helper function to sched.c: > > void set_hotcpus_allowed(struct task_struct *task, >cpumask_t cpumask) > { > mutex_lock(&sched_hotcpu_mutex); > set_cpus_allowed(task, cpumask); >

Re: [Devel] [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Paul Menage
One thought on this - could we make the API have a "which" parameter that indicates the type of thing being acted upon? E.g., like sys_setpriority(), which can specify the target as a process, a pgrp or a user. Right now the target would just be a process, but I'd really like the ability to be abl
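For readers unfamiliar with the setpriority() convention mentioned here, the sketch below shows how an existing call uses a "which" selector to decide how the id argument is interpreted. It uses the standard getpriority() call only; the wrapper name is hypothetical, and nothing here reflects how sys_hijack itself would be implemented.

```c
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical wrapper: fetch the nice value of the target chosen by
 * (which, who). `which` is PRIO_PROCESS, PRIO_PGRP or PRIO_USER and
 * decides whether `who` is a pid, a process group id or a uid.
 * Returns 0 and stores the value in *out, or -1 on error. */
static int nice_of(int which, id_t who, int *out)
{
    errno = 0;  /* -1 is a legal nice value, so errno disambiguates */
    int prio = getpriority(which, who);
    if (prio == -1 && errno != 0)
        return -1;

    *out = prio;
    return 0;
}
```

A call such as nice_of(PRIO_PROCESS, 0, &value) targets the calling process; the same id passed with PRIO_USER would be read as a uid instead.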

[Devel] Re: [PATCH] namespaces: introduce sys_hijack (v4)

2007-10-16 Thread Cedric Le Goater
>> hmm, I'm wondering how this is going to work for a process which >> would have unshared its device (pts) namespace. How are we going >> to link the pts living in different namespaces if the stdios of the >> hijacked process is using them ? like in the case of a shell, which >> is certainly so