Paul Menage wrote:
> On 10/3/07, Pavel Emelyanov <[EMAIL PROTECTED]> wrote:
>> +
>> + strcpy(files[RES_USAGE].name, "usage_in_bytes");
>
> This prevents the resource counters from being used for anything that
> doesn't measure its limit/usage in bytes. E.g. number of tasks.
>
> I like the i
> hi,
>
> i implemented some statistics for your memory controller.
>
> it's tested with 2.6.23-rc2-mm2 + memory controller v7.
> i think it can be applied to 2.6.23-rc4-mm1 as well.
>
> YAMOMOTO Takshi
>
> todo: something like nr_active/inactive in /proc/vmstat.
here's the version i'm working
One more cgroup code review detail ...
The following is evidence of some more stale comments in
kernel/cpuset.c. Some routines which used to be in that file, but
which are now reimplemented in cgroups, are still named in cpuset.c
comments:
$ grep -E 'cpuset_rmdir|cpuset_exit|cpuset_fork' kernel/
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
>
> I can't say for sure, but I suspect that if cgroups had always
> been cgroups (short for control groups), then these local 'cont'
> variables would have a different name.
Oh, absolutely. I just refrained from changing them in the rename
since
> > - There are many instances of the local variable 'cont', referring
> >to a struct cgroup pointer. I presume the spelling 'cont' is a
> >holdover from the time when we called these containers.
>
> Yes, and since cgroup is short for "control group", "cont" still
> seemed like a reasona
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
>
> But now (correct me if I'm wrong here) cgroups has a per-cgroup task
> list, and the above loop has cost linear in the number of tasks
> actually in the cgroup, plus (unfortunate but necessary and tolerable)
> the cost of taking a global css_s
Andrew - please kill this patch.
Looks like Paul Menage has a better solution
that I will be trying out.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
> What was wrong with my suggestion from a couple of emails back? Adding
> the following in cpuset_attach():
>
> struct cgroup_iter it;
> struct task_struct *p;
> while ((p = cgroup_iter_next(cs->css.cgroup, &it))) {
>set_cpus_allowed(p, cs->cpus_allowed);
> }
> cgroup_iter_end(cs->css.cgr
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
>
> So far as I can tell, this patch just removes a minor optimization,
It's not minor for any subsystem that has a non-trivial attach() operation.
>
> Any further suggestions, or embarrassing (;) questions?
>
What was wrong with my suggestion
Paul M wrote:
> Given that later in cpusets.txt you say:
>
> >If hotplug functionality is used
> >to remove all the CPUs that are currently assigned to a cpuset,
> >then the kernel will automatically update the cpus_allowed of all
> >tasks attached to CPUs in that cpuset to allow all CPUs
>
> why
Eric W. Biederman wrote:
Daniel Lezcano <[EMAIL PROTECTED]> writes:
Yes, it will work.
Do we want to be inside a network namespace and to use a socket belonging to
another network namespace ? If yes, then my remark is irrelevant.
Yes we do.
Shall we close all fd sockets when doing an unshar
On 10/3/07, Paul Menage <[EMAIL PROTECTED]> wrote:
>
> What's wrong with, in update_cpumask(), doing a loop across all
> members of the cgroup and updating their cpus_allowed fields?
i.e. something like:
struct cgroup_iter it;
struct task_struct *p;
while ((p = cgroup_iter_next(cs->css.cgroup, &i
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
>
> So after changing the 'cpus' of a cpuset, you (something in
> user space) then has to rewrite every pid in that cpusets
> tasks file back to that same tasks file, in order to get the
> change to be applied to each of those tasks, and get them
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
>
> Yes, something in user space has to do it. It's part of the
> kernel-user cpuset API. If you change a cpuset's 'cpus', then
> you have to rewrite each pid in its 'tasks' file back to that
> 'tasks' file in order to get that 'cpus' change to
Pavel Emelyanov wrote:
>
> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
> index 61363ce..8ee8316 100644
> --- a/include/linux/res_counter.h
> +++ b/include/linux/res_counter.h
> @@ -58,6 +58,9 @@ ssize_t res_counter_write(struct res_cou
> const char __user *
On Wed, 03 Oct 2007 18:19:01 +0400 Pavel Emelyanov wrote:
> This is a pid which is attached to tasks when they detach
> their pids. This is done in detach_pid() and transfer_pid().
> The pid_alive() check is changed to reflect this fact.
>
> Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
>
>
Pavel Emelyanov wrote:
> Right now we have only one controller in -mm tree - the memory
> one - and it initializes all its files manually. After I developed
> the kernel memory one it turned out, that the names of resource
> counter specific files has changed. In the future, if anything in
> the re
Paul M wrote:
> > The code in kernel/cgroup.c attach_task() which skips the
> > attachment of a task to the group it is already in has to be
> > removed. Cpusets depends on reattaching a task to its current
> > cpuset, in order to trigger updating the cpus_allowed mask in the
> > task struct.
>
>
Paul M wrote:
> Are there cases when userspace is
> required to try to reattach a task to its current cpuset in order to
> get a cpu mask change to stick?
Yes, there are such cases.
If tasks are running in a cpuset, and someone changes the
'cpus' of that cpuset, the tasks already in that cpuset d
Daniel Lezcano <[EMAIL PROTECTED]> writes:
>
> Yes, it will work.
>
> Do we want to be inside a network namespace and to use a socket belonging to
> another network namespace ? If yes, then my remark is irrelevant.
Yes we do.
>>> Shall we close all fd sockets when doing an unshare ? like a close-
On Wed, Oct 03, 2007 at 06:20:43PM +0400, Pavel Emelyanov wrote:
> Just make the __pid_nr() etc functions that expect the argument
> to always be not NULL.
>
> Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
> static inline pid_t pid_nr(struct pid *pid)
> {
> pid_t nr = 0;
> if (
On 10/3/07, Pavel Emelyanov <[EMAIL PROTECTED]> wrote:
> +
> + strcpy(files[RES_USAGE].name, "usage_in_bytes");
This prevents the resource counters from being used for anything that
doesn't measure its limit/usage in bytes. E.g. number of tasks.
I like the idea of having a unified interface
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> From: Paul Jackson <[EMAIL PROTECTED]>
>
> The code in kernel/cgroup.c attach_task() which skips the
> attachment of a task to the group it is already in has to be
> removed. Cpusets depends on reattaching a task to its current
> cpuset, in ord
Eric W. Biederman wrote:
> Cedric Le Goater <[EMAIL PROTECTED]> writes:
>> my 2 cnts,
>>
>> when 'restarting' a socket bound to an abstract name, we will have
>> a EADDRINUSE if we try to rebind it to an abtract name which is
>> already in use by a socket in a another namespace ?
>
> No.
ok. i
On 10/3/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
>
> - What are these apparent 'exec notifications' that are provided to
>user space that the following mentions - I cannot find any other
>mention of them:
>
> With the ability to classify tasks differently for different
>
Cedric Le Goater <[EMAIL PROTECTED]> writes:
> my 2 cnts,
>
> when 'restarting' a socket bound to an abstract name, we will have
> a EADDRINUSE if we try to rebind it to an abtract name which is
> already in use by a socket in a another namespace ?
No.
> it seems to me that this is an identifi
There are two places that do so - the cgroups subsystem
and the autofs code.
Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
---
diff --git a/fs/autofs/root.c b/fs/autofs/root.c
index 592f640..5efff3c 100644
--- a/fs/autofs/root.c
+++ b/fs/autofs/root.c
@@ -214,7 +214,7 @@ static struct dent
Since we now have all the tasks have non-zero pids we may
call the __pid_nr() instead of pid_nr() when the pid is
get from task with task_pid() or similar call.
Besides, there are some other places where the check for
pid to be not NULL is already done, so change them too.
Signed-off-by: Pavel
Just make the __pid_nr() etc functions that expect the argument
to always be not NULL.
Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
---
diff --git a/include/linux/pid.h b/include/linux/pid.h
index e29a900..50b6899 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -135,21 +135,3
This is a pid which is attached to tasks when they detach
their pids. This is done in detach_pid() and transfer_pid().
The pid_alive() check is changed to reflect this fact.
Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
---
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4f
Some time ago Sukadev noticed that the vmlinux size has
grown 5Kb due to merged pid namespaces. One of the big
problems with it was fat inline functions. The other thing
was noticed by Matt - the checks for task's pid to be not
NULL take place and make the kernel grow due to inlining,
but these che
Denis V. Lunev wrote:
> Daniel Lezcano wrote:
>> Eric W. Biederman wrote:
>>> Daniel Lezcano <[EMAIL PROTECTED]> writes:
>>>
The following patch change autobind fonction to use the ordernum
from the network namespace instead of using the local static variable.
>>> Why do we care?
>>> Info
Note, that the controller code dealing with the cftype files
for resource counters becomes much shorter and won't have to
be changed in the future.
Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1b8bf24..d1b43bc 100644
--- a/mm/memcon
This one is responsible for initializing the RES_CFT_MAX files
properly and register them inside the container.
The caller must provide the cgroup, the cgroup_subsys, the
RES_CFT_MAX * sizeof(cftype) chunk of uninitialized memory
and the read and write callbacks.
In the future, if we add more res
This is just to reduce the code amount in the future.
Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>
---
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 8747932..0635004 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -178,6 +178,15 @@ struct css_set {
Right now we have only one controller in -mm tree - the memory
one - and it initializes all its files manually. After I developed
the kernel memory one it turned out, that the names of resource
counter specific files has changed. In the future, if anything in
the resource counters change, we'll hav
On Wednesday 03 October 2007 16:18, H. Peter Anvin wrote:
> Nick Piggin wrote:
> >> This should work because the result gets used before reading again:
> >>
> >> read_cr3(a);
> >> write_cr3(a | 1);
> >> read_cr3(a);
> >>
> >> But this might be reordered so that b gets read before the write:
> >>
>
Eric W. Biederman wrote:
Daniel Lezcano <[EMAIL PROTECTED]> writes:
Hi,
I was looking at some cornercases and trying to figure out what happens if
someone does:
1 - fd = socket(...)
2 - unshare(CLONE_NEWNET)
3 - bind(fd, ...) / listen(fd, ...)
There is here an interaction between two namespa
>
> How does the compiler know it doesn't depend on memory?
When it has no m (or equivalent like g) constrained argument
and no memory clobber.
> How do you say it depends on memory?
You add any of the above.
> You really need something as heavy as volatile?
You could do a memory clobber,
From: Paul Jackson <[EMAIL PROTECTED]>
The code in kernel/cgroup.c attach_task() which skips the
attachment of a task to the group it is already in has to be
removed. Cpusets depends on reattaching a task to its current
cpuset, in order to trigger updating the cpus_allowed mask in the
task struct
Eric W. Biederman wrote:
Daniel Lezcano <[EMAIL PROTECTED]> writes:
Eric W. Biederman wrote:
Daniel Lezcano <[EMAIL PROTECTED]> writes:
The following patch change autobind fonction to use the ordernum
from the network namespace instead of using the local static variable.
Why do we care?
Inf
Arjan,
I can experiment with any constraints if you suggest which one.
>From our experiments with gcc, it compares asm strings (sic!!!) to find matches
to be merged! Sigh...
Below are 2 programs which differ in one space in read_cr3_b() asm statement.
The first one compiles incorrectly, while 2nd
Daniel Lezcano wrote:
> Eric W. Biederman wrote:
>> Daniel Lezcano <[EMAIL PROTECTED]> writes:
>>
>>> The following patch change autobind fonction to use the ordernum
>>> from the network namespace instead of using the local static variable.
>>
>> Why do we care?
>> Information leak?
>> Some applic
Eric W. Biederman wrote:
> Daniel Lezcano <[EMAIL PROTECTED]> writes:
>
>> The following patch change autobind fonction to use the ordernum
>> from the network namespace instead of using the local static variable.
>
> Why do we care?
> Information leak?
> Some application is expecting a predictab
Cgroup (aka container) code review:
Except for the very last item below, my other comments are minor.
And the last item is pretty easy too - just more important.
Overall - nice stuff. I like this generalization of the cpuset
hierarchy. Thanks.
===
Review comments on Documentation/cgro
Eric W. Biederman wrote:
> [EMAIL PROTECTED] writes:
>
>> Cedric Le Goater [EMAIL PROTECTED] wrote:
>> |
>> | >> however, we have an issue with the signal notification in __do_notify()
>> | >> we could kill a process in a different pid namespace.
>> | >
>> | > So I took a quick look at the code
[
I have big fingers this morning and I managed to send this email
while typing it ... see below for the end. I should be awake now :)
]
> The really challenging case to handle here is what happens if we are
> signaling to someone in a sibling pid namespace. What do we set the
> parent pi
[EMAIL PROTECTED] wrote:
> Cedric Le Goater [EMAIL PROTECTED] wrote:
> |
> | >> however, we have an issue with the signal notification in __do_notify()
> | >> we could kill a process in a different pid namespace.
> | >
> | > So I took a quick look at the code as it is (before this patchset)
> | >
Christoph Lameter wrote:
> On Tue, 2 Oct 2007, Pavel Emelyanov wrote:
>
>> Christoph Lameter wrote:
>>> On Mon, 1 Oct 2007, Pavel Emelyanov wrote:
>>>
>> +
> Quick check, slub_free_notify() and slab_alloc_notify() are called
> from serialized contexts, right?
Yup.
>>> How is it se
49 matches
Mail list logo