[Devel] Re: [Fwd: [PATCH -RSS 2/2] Fix limit check after reclaim]

2007-06-04 Thread Pavel Emelianov
Balbir Singh wrote: > > Original Message > Subject: [PATCH -RSS 2/2] Fix limit check after reclaim > Date: Mon, 04 Jun 2007 21:03:04 +0530 > From: Balbir Singh <[EMAIL PROTECTED]> > To: Andrew Morton <[EMAIL PROTECTED]> > CC: Linux Containers <[EMAIL PROTECTED]>,Balbir Si

[Devel] Re: [RFC PATCH ext3/ext4] orphan list corruption due bad inode

2007-06-04 Thread Christoph Hellwig
On Tue, Jun 05, 2007 at 10:11:12AM +0400, Vasily Averin wrote: > >>return d_splice_alias(inode, dentry); > >> } > > Seems reasonable. So this prevents the bad inodes from getting onto the > > orphan list in the first place? > > make_bad_inode() is called from ext3_read_inode() that is called

[Devel] Re: [RFC PATCH ext3/ext4] orphan list corruption due bad inode

2007-06-04 Thread Vasily Averin
Eric Sandeen wrote: > Vasily Averin wrote: >> Bad inode can live some time, ext3_unlink can add it to orphan list, but >> ext3_delete_inode() do not deleted this inode from orphan list. As result >> we can have orphan list corruption detected in ext3_destroy_inode(). > > Ah, I see - so you have c

[Devel] Re: [PATCH ext3/ext4] orphan list check on destroy_inode

2007-06-04 Thread Vasily Averin
Eric Sandeen wrote: > Vasily Averin wrote: >> Customers claims to ext3-related errors, investigation showed that ext3 >> orphan list has been corrupted and have the reference to non-ext3 inode. >> The following debug helps to understand the reasons of this issue. > > Vasily, does your customer hav

[Devel] Re: [RFC PATCH ext3/ext4] orphan list corruption due bad inode

2007-06-04 Thread Vasily Averin
Andrew Morton wrote: > On Mon, 04 Jun 2007 09:19:10 +0400 Vasily Averin <[EMAIL PROTECTED]> wrote: >> diff --git a/fs/ext3/namei.c b/fs/ext3/namei.c >> index 9bb046d..e3ac8c3 100644 >> --- a/fs/ext3/namei.c >> +++ b/fs/ext3/namei.c >> @@ -1019,6 +1019,11 @@ static struct dentry *ext3_lookup(struct

[Devel] release_task(), procfs dependency

2007-06-04 Thread sukadev
Like we discussed earlier and Pavel/others had pointed out, proc_flush_task() in its current place in release_task() is useless with the new pid namespace code, because task_pid() for the task is already NULL before the call to proc_flush_task(). So as a simple change I tried to move proc_flush_t

[Devel] Re: [RFC PATCH ext3/ext4] orphan list corruption due bad inode

2007-06-04 Thread Andreas Dilger
On Jun 04, 2007 19:03 -0700, Andrew Morton wrote: > What caused those inodes to be bad, anyway? Memory allocation failures? This can happen if e.g. NFS has a stale file handle - it will look up the inode by inum, but ext3_read_inode() will create a bad inode due to i_nlink = 0. Cheers, Andreas

[Devel] Re: [RFC PATCH ext3/ext4] orphan list corruption due bad inode

2007-06-04 Thread Andrew Morton
On Mon, 04 Jun 2007 09:19:10 +0400 Vasily Averin <[EMAIL PROTECTED]> wrote: > After ext3 orphan list check has been added into ext3_destroy_inode() (please > see my previous patch) the following situation has been detected: > EXT3-fs warning (device sda6): ext3_unlink: Deleting nonexistent file

[Devel] Re: [PATCH ext3/ext4] orphan list check on destroy_inode

2007-06-04 Thread Andrew Morton
On Mon, 04 Jun 2007 09:18:55 +0400 Vasily Averin <[EMAIL PROTECTED]> wrote: > Customers claims to ext3-related errors, investigation showed that ext3 > orphan list has been corrupted and have the reference to non-ext3 inode. The > following debug helps to understand the reasons of this issue. >

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Paul Jackson
Serge wrote: > Odd, I thought rm -rf used to work in the past, > but i'm likely wrong. I'm pretty sure it never worked. And I've probably tested it myself, every few months, since the birth of cpusets, when I forget and type it again, and then stare dumbly at the screen wondering what all the com

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Serge E. Hallyn
Quoting Paul Menage ([EMAIL PROTECTED]): > On 6/4/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: > > > >2. I can't delete containers because of the files they contain, and > >am not allowed to delete those files by hand. > > > > You should be able to delete a container with rmdir as long as it's >

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Serge E. Hallyn
Quoting Paul Menage ([EMAIL PROTECTED]): > On 6/4/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: > >[EMAIL PROTECTED] root]# rm -rf /containers/1 > > Just use "rmdir /containers/1" here. Hmm. Ok, that works... Odd, I thought rm -rf used to work in the past, but i'm likely wrong. thanks, -serge

[Devel] Re: [ckrm-tech] [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Paul Jackson
> [EMAIL PROTECTED] root]# rm -rf /containers/1 No - not 'rm -fr'. 'rmdir' Remove the cpuset directory, not start bottom up trying to remove the files first. The poor 'rm -fr' command doesn't understand the rather odd nature of cpuset file systems, which have all files coming and going automag

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Paul Jackson
> Would it then make sense to just > default to (parent_set - sibling_exclusive_set) for a new sibling's > value? Which could well be empty, which in turn puts one back in the position of dealing with a newborn cpuset that is empty (of cpus or of memory), or else it introduces a new and odd constr

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Paul Menage
On 6/4/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: [EMAIL PROTECTED] root]# rm -rf /containers/1 Just use "rmdir /containers/1" here. Ah, I see the second time I typed 'ls /containers/1/tasks' instead of cat. When I then used cat, the file was empty, and I got an oops just like Pavel rep

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Serge E. Hallyn
Quoting Paul Menage ([EMAIL PROTECTED]): > On 6/4/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > > > >Yup - early in the life of cpusets, a created cpuset inherited the cpus > >and mems of its parent. But that broke the exclusive property big > >time. You will recall that a cpu_exclusive or mem_ex

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Paul Jackson
Paul M wrote: > Maybe we could make it a per-cpuset option whether children should > inherit mems/cpus or not? I suppose, if those needing inherited mems/cpus need it bad enough. -- I won't rest till it's the best ... Programmer, Linux Scalability

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Paul Menage
On 6/4/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote: 2. I can't delete containers because of the files they contain, and am not allowed to delete those files by hand. You should be able to delete a container with rmdir as long as it's not in use - its control files will get cleaned up automa

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Paul Menage
On 6/4/07, Paul Jackson <[EMAIL PROTECTED]> wrote: Yup - early in the life of cpusets, a created cpuset inherited the cpus and mems of its parent. But that broke the exclusive property big time. You will recall that a cpu_exclusive or mem_exclusive cpuset cannot overlap the cpus or memory, res

[Devel] [PATCH 5/6] userns strict: hook ext2

2007-06-04 Thread Serge E. Hallyn
From nobody Mon Sep 17 00:00:00 2001 From: Serge Hallyn <[EMAIL PROTECTED]> Date: Wed, 28 Mar 2007 15:06:47 -0500 Subject: [PATCH 5/6] userns strict: hook ext2 Add a user namespace pointer to the ext2 superblock and inode. Signed-off-by: Serge E. Hallyn <[EMAIL PROTECTED]> --- fs/ext2/acl.c

[Devel] [PATCH 1/6] user namespace : add the framework

2007-06-04 Thread Serge E. Hallyn
From nobody Mon Sep 17 00:00:00 2001 From: Cedric Le Goater <[EMAIL PROTECTED]> Date: Thu, 5 Apr 2007 12:51:51 -0400 Subject: [PATCH 1/6] user namespace : add the framework Add the user namespace struct and framework Basically, it will allow a process to unshare its user_struct table, resetting

[Devel] user namespace - introduction

2007-06-04 Thread Serge E. Hallyn
[ I've been sitting on this for some months, and am just dumping it so people can talk if they like, maybe even build on the patchset by adding support for more filesystems or implementing the keyring. Or tell me how much the approach sucks. ] First, I point out once more that the base user nam

[Devel] [PATCH 3/6] user ns: add an inode user_ns pointer

2007-06-04 Thread Serge E. Hallyn
From nobody Mon Sep 17 00:00:00 2001 From: Serge E. Hallyn <[EMAIL PROTECTED]> Date: Thu, 5 Apr 2007 14:02:09 -0400 Subject: [PATCH 3/6] user ns: add an inode user_ns pointer Add a user namespace pointer to each inode. One user namespace is said to own each inode. Each filesystem can fill these

[Devel] [PATCH 6/6] userns strict: hook ext3

2007-06-04 Thread Serge E. Hallyn
From nobody Mon Sep 17 00:00:00 2001 From: Serge Hallyn <[EMAIL PROTECTED]> Date: Wed, 28 Mar 2007 13:11:19 -0500 Subject: [PATCH 6/6] userns strict: hook ext3 Add a user namespace pointer to the ext3 superblock and inode. Signed-off-by: Serge E. Hallyn <[EMAIL PROTECTED]> --- fs/ext3/acl.c

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Serge E. Hallyn
Hi Paul, I've got two problems working with this patchset: 1. A task can't join a cpuset unless 'cpus' and 'mems' are set. These don't seem to automatically inherit the parent's values. So when I do mount -t container -o ns,cpuset nsproxy /containers (unshare a namespace) the

[Devel] [PATCH 4/6] user ns: hook generic_permission()

2007-06-04 Thread Serge E. Hallyn
From nobody Mon Sep 17 00:00:00 2001 From: Serge E. Hallyn <[EMAIL PROTECTED]> Date: Thu, 5 Apr 2007 17:17:23 -0400 Subject: [PATCH 4/6] user ns: hook generic_permission() Hook generic_permission() to check for user namespaces. Also define task_ino_capable() which denies a capability if the subje

[Devel] [PATCH 2/6] user namespace : add unshare

2007-06-04 Thread Serge E. Hallyn
From nobody Mon Sep 17 00:00:00 2001 From: Serge E. Hallyn <[EMAIL PROTECTED]> Date: Thu, 5 Apr 2007 13:00:47 -0400 Subject: [PATCH 2/6] user namespace : add unshare Changelog: Fix !CONFIG_USER_NS clone with CLONE_NEWUSER so it returns -EINVAL rather than 0, so that userspace knows they d

[Devel] Re: [PATCH 00/10] Containers(V10): Generic Process Containers

2007-06-04 Thread Paul Jackson
What you describe, Serge, sounds like semantics carried over from cpusets. Serge wrote: > A task can't join a cpuset unless 'cpus' and 'mems' are set. Yup - don't want to run a task in a cpuset that lacks cpu, or lacks memory. Hard to run without those. > These don't seem to automatically inher

[Devel] Re: [PATCH 1/1] containers: implement nsproxy containers subsystem

2007-06-04 Thread Serge E. Hallyn
Sorry, didn't paste in my comment at the top that this is again not at all for inclusion, and barely tested, but mainly to get comment i.e. on the way the naming is done. thanks, -serge Quoting Serge E. Hallyn ([EMAIL PROTECTED]): > >From 190ea72d213393dd1440643b2b87b5b2128dff87 Mon Sep 17 00:00:

[Devel] [PATCH 1/1] containers: implement nsproxy containers subsystem

2007-06-04 Thread Serge E. Hallyn
From 190ea72d213393dd1440643b2b87b5b2128dff87 Mon Sep 17 00:00:00 2001 From: Serge E. Hallyn <[EMAIL PROTECTED]> Date: Mon, 4 Jun 2007 14:18:52 -0400 Subject: [PATCH 1/1] containers: implement nsproxy containers subsystem When a task enters a new namespace via a clone() or unshare(), a new contai

[Devel] [PATCH -RSS] Add documentation for the RSS controller

2007-06-04 Thread Balbir Singh
Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> --- Documentation/controller/rss.txt | 165 +++ 1 file changed, 165 insertions(+) diff -puN /dev/null Documentation/controller/rss.txt --- /dev/null 2007-06-01 20:42:04.0 +0530 +++ linux-2.6.22-rc2-m

[Devel] Re: nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Cedric Le Goater
Kirill Korotaev wrote: >> the results were also very reproducible but the profiling was too noisy. >> we also changed the kernel. the previous pidns patchset was on a 2.6.21-mm2 >> and we ported it on a 2.6.22-rc1-mm1. > > If reproducible, then were they the same as Pavel posted? > >> but let me

[Devel] [PATCH -RSS 1/1] Fix reclaim failure

2007-06-04 Thread Balbir Singh
This patch fixes the problem seen when a container goes over its limit, the reclaim is unsuccessful and the application is terminated. The problem is that all pages are by default added to the active list of the RSS controller. When __isolate_lru_page() is called, it checks to see if the list that

[Devel] [PATCH -RSS 2/2] Fix limit check after reclaim

2007-06-04 Thread Balbir Singh
This patch modifies the reclaim behaviour such that before calling the container out of memory routine, it checks if as a result of the reclaim (even though pages might not be fully reclaimed), the resident set size of the container decreased before declaring the container as out of memory Signe

[Devel] Re: nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Kirill Korotaev
> the results were also very reproducible but the profiling was too noisy. > we also changed the kernel. the previous pidns patchset was on a 2.6.21-mm2 > and we ported it on a 2.6.22-rc1-mm1. If reproducible, then were they the same as Pavel posted? > but let me remove some debugging options,

Re: [Devel] Re: nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Pavel Emelianov
Cedric Le Goater wrote: > Pavel Emelianov wrote: >> Serge E. Hallyn wrote: >>> Quoting Kirill Korotaev ([EMAIL PROTECTED]): Cedric, just a small note. imho it is not correct to check performance with enabled debug in memory allocator since it can influence cache effic

Re: [Devel] Re: nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Cedric Le Goater
Pavel Emelianov wrote: > Serge E. Hallyn wrote: >> Quoting Kirill Korotaev ([EMAIL PROTECTED]): >>> Cedric, >>> >>> just a small note. >>> imho it is not correct to check performance with enabled debug in memory >>> allocator >>> since it can influence cache efficiency much. >>> In you case looks

Re: [Devel] Re: nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Pavel Emelianov
Serge E. Hallyn wrote: > Quoting Kirill Korotaev ([EMAIL PROTECTED]): >> Cedric, >> >> just a small note. >> imho it is not correct to check performance with enabled debug in memory >> allocator >> since it can influence cache efficiency much. >> In you case looks like you have DEBUG_SLAB enabled.

[Devel] Re: nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Cedric Le Goater
Kirill Korotaev wrote: > Cedric, > > just a small note. > imho it is not correct to check performance with enabled debug in memory > allocator > since it can influence cache efficiency much. > In you case looks like you have DEBUG_SLAB enabled. you're right. i'll rerun and resend. > Pavel will

[Devel] Re: nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Serge E. Hallyn
Quoting Kirill Korotaev ([EMAIL PROTECTED]): > Cedric, > > just a small note. > imho it is not correct to check performance with enabled debug in memory > allocator > since it can influence cache efficiency much. > In you case looks like you have DEBUG_SLAB enabled. Hm, good point. Cedric, did

[Devel] Re: nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Kirill Korotaev
Cedric, just a small note. imho it is not correct to check performance with enabled debug in memory allocator since it can influence cache efficiency much. In you case looks like you have DEBUG_SLAB enabled. Pavel will recheck as well what influences on this particular test. BTW, it is strange..

[Devel] [PATCH 8/8] RSS accounting hooks over the code

2007-06-04 Thread Pavel Emelianov
As described, pages are charged to their first touchers. The first toucher is determined using pages' _mapcount manipulations in rmap calls. A page is charged in two stages: 1. preparation, in which the resource availability is checked. This stage may lead to page reclamation, thus it is perfo

[Devel] [PATCH 7/8] Per-container pages reclamation

2007-06-04 Thread Pavel Emelianov
Implement try_to_free_pages_in_container() to free the pages in container that has run out of memory. The scan_control->isolate_pages() function is set to isolate_pages_in_container() that isolates the container pages only. The exported __isolate_lru_page() call makes things look simpler than in t

[Devel] [PATCH 6/8] Per container OOM killer

2007-06-04 Thread Pavel Emelianov
When a container is completely out of memory, some tasks should die. It is unfair to kill the current task, so the task with the largest RSS is chosen and killed. The code reuses the current OOM killer's select_bad_process() for task selection. Signed-off-by: Pavel Emelianov <[EMAIL PROTECTED]> --- ---

[Devel] nptl perf bench and profiling with pidns patchsets

2007-06-04 Thread Cedric Le Goater
Pavel and all, I've been profiling the different pidns patchsets to chase the perf bottlenecks in the pidns patchset. As i was not getting accurate profiling results with unixbench, I changed the benchmark to use the nptl perf benchmark ingo used when he introduced the generic pidhash back in

[Devel] [PATCH 5/8] RSS container core

2007-06-04 Thread Pavel Emelianov
The core routines for tracking the page ownership, RSS subsystem registration in the containers and the definition of the rss_container struct as container subsystem combined with the resource counter structure. To make the whole set look more consistent the calls to the reclamation code and oo

[Devel] [PATCH 4/8] Scanner changes needed to implement per-container scanner

2007-06-04 Thread Pavel Emelianov
The core change is that the isolate_lru_pages() call is replaced with a struct scan_control->isolate_pages() call. Other changes include exporting __isolate_lru_page() for the per-container isolator and handling variable-to-pointer changes in try_to_free_pages(). This makes it possible to use different i

[Devel] [PATCH 3/8] Add container pointer on mm_struct

2007-06-04 Thread Pavel Emelianov
Naturally, mm_struct determines the resource consumer in memory accounting, so each mm_struct should have a pointer to the container it belongs to. When a new task is created, its mm_struct is assigned to the container this task belongs to. include/linux/rss_container.h is added in this patch to make th

[Devel] [PATCH 2/8] Add container pointer on struct page

2007-06-04 Thread Pavel Emelianov
Each page is supposed to have an owner - the container that touched the page first. The owner stays alive during the page lifetime even if the task that touched the page dies or moves to another container. This ownership is the forerunner for the "fair" page sharing accounting, in which page has a

[Devel] [PATCH 1/8] Resource counters

2007-06-04 Thread Pavel Emelianov
Introduce generic structures and routines for resource accounting. Each resource accounting container is supposed to aggregate it, container_subsystem_state and its resource-specific members within. Signed-off-by: Pavel Emelianov <[EMAIL PROTECTED]> --- diff -upr linux-2.6.22-rc2-mm1.orig/inclu

[Devel] [PATCH 0/8] RSS controller based on process containers (v3.1)

2007-06-04 Thread Pavel Emelianov
Adds RSS accounting and control within a container. Changes from v3 - comments across the code - git-bisect safe split - lost places to move the page between active/inactive lists Ported above Paul's containers V10 with fixes from Balbir. RSS container includes the per-container RSS accountin

[Devel] Re: [PATCH -mm] Fix /proc/slab_allocators re seq_list_next() conversion

2007-06-04 Thread Pavel Emelianov
Alexey Dobriyan wrote: > Wrong pointer was used as kmem_cache pointer. > > [Here /proc/slab_allocators appears as empty file, but it's just me, probably] > > Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]> Acked-by: Pavel Emelianov <[EMAIL PROTECTED]> > --- > > mm/slab.c |2 +- > 1 fil

[Devel] [PATCH -mm] Fix /proc/slab_allocators re seq_list_next() conversion

2007-06-04 Thread Alexey Dobriyan
Wrong pointer was used as kmem_cache pointer. [Here /proc/slab_allocators appears as empty file, but it's just me, probably] Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]> --- mm/slab.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/slab.c +++ b/mm/slab.c @@ -4401,7 +440

Re: [Devel] PPC64 Kernel 2.6.18 + RPM packages

2007-06-04 Thread Christian Kaiser2
Hi Kirill, imho there is no reason for not adding the patch to the git repository. I've tested it for one week now and I'm getting no serious errors. Mit freundlichen Grüßen / Best Regards Christian Kaiser -- IBM Deutschland Entwicklung GmbH Open Systems Firmware Development mail: [EMAIL PROTECTE

Re: [Devel] PPC64 Kernel 2.6.18 + RPM packages

2007-06-04 Thread Kirill Korotaev
http://git.openvz.org/?p=linux-2.6.18-openvz;a=commit;h=cb649b7cede6764c00e256578dc3c7ad73c1b24c Thanks, Kirill Christian Kaiser2 wrote: > Hi Kirill, > > imho there is no reason for not adding the patch to the git repository. > I've tested it for one week now and I'm getting no serious errors. >