Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys->post_create()

2012-11-12 Thread Daniel Wagner

Hi Tejun,

On 09.11.2012 18:22, Tejun Heo wrote:

Hey, Daniel.

On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:

On 08.11.2012 20:07, Tejun Heo wrote:> Subject: cgroup: add
cgroup_subsys->post_create()


Currently, there's no way for a controller to find out whether a new
cgroup finished all ->create() allocatinos successfully and is
considered "live" by cgroup.


I'd like add hierarchy support to net_prio and the first thing to
do is to get rid of get_prioidx(). It looks like it would be nice to


Ooh, I'm already working on it.  I *think* I should be able to post
the patches later today or early next week.


No problem. I didn't spend too much time in writing the patch. Getting
rid of get_prioidx() was simple. I just wanted to help out, but then
I might to slow to keep up with you :)

Though I have a question on the patch you are writing. When you disable
broken_hierarchy for the networking controllers, are you also 
considering to disable them on the root cgroup? Currently both

net_prio and net_cls will do work on the root cgroup. I think for
harmonizing the behavior or all controllers this should also be
disabled.


be able to use use_id and post_create() for this but as I read the
code this might not work because the netdev might access resources
allocated between create() and post_create(). So my question is
would it make sense to move

cgroup_create():

if (ss->use_id) {
err = alloc_css_id(ss, parent, cgrp);
if (err)
goto err_destroy;
}

part before create() or add some protection between create() and
post_create() callback in net_prio. I have a patch but I see
I could drop it completely if post_create() is there.


Glauber had about similar question about css_id and I need to think
more about it but currently I think I want to phase out css IDs.  It's
an id of the wrong thing (CSSes don't need IDs, cgroups do) and
unnecessarily duplicates its own hierarchy when the hierarchy of
cgroups already exists.  Once memcontrol moves away from walking using
css_ids, I *think* I'll try to kill it.

I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
number) so that it can be used for cgroup indexing.  Glauber, that
should solve your problem too, right?


net_prio doesn't have any particular requirements on the indexing, 
unless there is one. So a global ida would work for net_prio.


cheers,
daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys-post_create()

2012-11-12 Thread Daniel Wagner

Hi Tejun,

On 09.11.2012 18:22, Tejun Heo wrote:

Hey, Daniel.

On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:

On 08.11.2012 20:07, Tejun Heo wrote: Subject: cgroup: add
cgroup_subsys-post_create()


Currently, there's no way for a controller to find out whether a new
cgroup finished all -create() allocatinos successfully and is
considered live by cgroup.


I'd like add hierarchy support to net_prio and the first thing to
do is to get rid of get_prioidx(). It looks like it would be nice to


Ooh, I'm already working on it.  I *think* I should be able to post
the patches later today or early next week.


No problem. I didn't spend too much time in writing the patch. Getting
rid of get_prioidx() was simple. I just wanted to help out, but then
I might to slow to keep up with you :)

Though I have a question on the patch you are writing. When you disable
broken_hierarchy for the networking controllers, are you also 
considering to disable them on the root cgroup? Currently both

net_prio and net_cls will do work on the root cgroup. I think for
harmonizing the behavior or all controllers this should also be
disabled.


be able to use use_id and post_create() for this but as I read the
code this might not work because the netdev might access resources
allocated between create() and post_create(). So my question is
would it make sense to move

cgroup_create():

if (ss-use_id) {
err = alloc_css_id(ss, parent, cgrp);
if (err)
goto err_destroy;
}

part before create() or add some protection between create() and
post_create() callback in net_prio. I have a patch but I see
I could drop it completely if post_create() is there.


Glauber had about similar question about css_id and I need to think
more about it but currently I think I want to phase out css IDs.  It's
an id of the wrong thing (CSSes don't need IDs, cgroups do) and
unnecessarily duplicates its own hierarchy when the hierarchy of
cgroups already exists.  Once memcontrol moves away from walking using
css_ids, I *think* I'll try to kill it.

I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
number) so that it can be used for cgroup indexing.  Glauber, that
should solve your problem too, right?


net_prio doesn't have any particular requirements on the indexing, 
unless there is one. So a global ida would work for net_prio.


cheers,
daniel

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys->post_create()

2012-11-09 Thread Glauber Costa
On 11/09/2012 06:22 PM, Tejun Heo wrote:
> Hey, Daniel.
> 
> On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:
>> On 08.11.2012 20:07, Tejun Heo wrote:> Subject: cgroup: add
>> cgroup_subsys->post_create()
>>>
>>> Currently, there's no way for a controller to find out whether a new
>>> cgroup finished all ->create() allocatinos successfully and is
>>> considered "live" by cgroup.
>>
>> I'd like add hierarchy support to net_prio and the first thing to
>> do is to get rid of get_prioidx(). It looks like it would be nice to
> 
> Ooh, I'm already working on it.  I *think* I should be able to post
> the patches later today or early next week.
> 
>> be able to use use_id and post_create() for this but as I read the
>> code this might not work because the netdev might access resources
>> allocated between create() and post_create(). So my question is
>> would it make sense to move
>>
>> cgroup_create():
>>
>>  if (ss->use_id) {
>>  err = alloc_css_id(ss, parent, cgrp);
>>  if (err)
>>  goto err_destroy;
>>  }
>>
>> part before create() or add some protection between create() and
>> post_create() callback in net_prio. I have a patch but I see
>> I could drop it completely if post_create() is there.
> 
> Glauber had about similar question about css_id and I need to think
> more about it but currently I think I want to phase out css IDs.  It's
> an id of the wrong thing (CSSes don't need IDs, cgroups do) and
> unnecessarily duplicates its own hierarchy when the hierarchy of
> cgroups already exists.  Once memcontrol moves away from walking using
> css_ids, I *think* I'll try to kill it.

May I suggest doing something similar with what the scheduler does? I
had some code in the past that reused that code - but basically
duplicated it. If you want, I can try getting a version of that in
kernel/cgroup.c  that would serve as a general walker.

I like that walker a lot, because it happens in a sane order. memcg
basically walks in a random weird order, that makes hierarchical
computation of anything quite hard.

> 
> I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
> number) so that it can be used for cgroup indexing.  Glauber, that
> should solve your problem too, right?
> 

Actually I went with a totally orthogonal solution. I am now using per
kmem-limited ids. Because they are not tied to the cgroup creation
workflow, I can allocate whenever it is more convenient.

I ended up liking this solution because it will do better in scenarios
where most of the memcgs are not kmem limited. So it had an edge here,
and also got rid of the create/post_create problem by breaking the
dependency.

But of course, if cgroups would gain some kind of sane indexing, it
could shift the balance towards reusing it.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys->post_create()

2012-11-09 Thread Tejun Heo
Hey, Daniel.

On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:
> On 08.11.2012 20:07, Tejun Heo wrote:> Subject: cgroup: add
> cgroup_subsys->post_create()
> >
> > Currently, there's no way for a controller to find out whether a new
> > cgroup finished all ->create() allocatinos successfully and is
> > considered "live" by cgroup.
> 
> I'd like add hierarchy support to net_prio and the first thing to
> do is to get rid of get_prioidx(). It looks like it would be nice to

Ooh, I'm already working on it.  I *think* I should be able to post
the patches later today or early next week.

> be able to use use_id and post_create() for this but as I read the
> code this might not work because the netdev might access resources
> allocated between create() and post_create(). So my question is
> would it make sense to move
> 
> cgroup_create():
> 
>   if (ss->use_id) {
>   err = alloc_css_id(ss, parent, cgrp);
>   if (err)
>   goto err_destroy;
>   }
> 
> part before create() or add some protection between create() and
> post_create() callback in net_prio. I have a patch but I see
> I could drop it completely if post_create() is there.

Glauber had about similar question about css_id and I need to think
more about it but currently I think I want to phase out css IDs.  It's
an id of the wrong thing (CSSes don't need IDs, cgroups do) and
unnecessarily duplicates its own hierarchy when the hierarchy of
cgroups already exists.  Once memcontrol moves away from walking using
css_ids, I *think* I'll try to kill it.

I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
number) so that it can be used for cgroup indexing.  Glauber, that
should solve your problem too, right?

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys->post_create()

2012-11-09 Thread Daniel Wagner

Hi Tejun,

On 08.11.2012 20:07, Tejun Heo wrote:> Subject: cgroup: add 
cgroup_subsys->post_create()

>
> Currently, there's no way for a controller to find out whether a new
> cgroup finished all ->create() allocatinos successfully and is
> considered "live" by cgroup.

I'd like add hierarchy support to net_prio and the first thing to
do is to get rid of get_prioidx(). It looks like it would be nice to be 
able to use use_id and post_create() for this but as I read the code 
this might not work because the netdev might access resources allocated 
between create() and post_create(). So my question is would it make 
sense to move


cgroup_create():

if (ss->use_id) {
err = alloc_css_id(ss, parent, cgrp);
if (err)
goto err_destroy;
}

part before create() or add some protection between create() and 
post_create() callback in net_prio. I have a patch but I see

I could drop it completely if post_create() is there.

cheers,
daniel


From 84fbbdf0dc5d3460389e39a00a3ee553ee55b563 Mon Sep 17 00:00:00 2001
From: Daniel Wagner 
Date: Thu, 8 Nov 2012 17:17:21 +0100
Subject: [PATCH] cgroups: net_prio: Use IDR library to assign netprio index

get_prioidx() allocated a new id whenever it was called. put_prioidx()
gave an id back. get_pioidx() could can reallocate the id later on.
So that is exactly what IDR offers and so let's use it.

Signed-off-by: Daniel Wagner 
---
 net/core/netprio_cgroup.c | 51 
+--

 1 file changed, 9 insertions(+), 42 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 847c02b..3c1b612 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -27,10 +27,7 @@

 #include 

-#define PRIOIDX_SZ 128
-
-static unsigned long prioidx_map[PRIOIDX_SZ];
-static DEFINE_SPINLOCK(prioidx_map_lock);
+static DEFINE_IDA(netprio_ida);
 static atomic_t max_prioidx = ATOMIC_INIT(0);

 static inline struct cgroup_netprio_state *cgrp_netprio_state(struct 
cgroup *cgrp)
@@ -39,34 +36,6 @@ static inline struct cgroup_netprio_state 
*cgrp_netprio_state(struct cgroup *cgr

struct cgroup_netprio_state, css);
 }

-static int get_prioidx(u32 *prio)
-{
-   unsigned long flags;
-   u32 prioidx;
-
-   spin_lock_irqsave(_map_lock, flags);
-	prioidx = find_first_zero_bit(prioidx_map, sizeof(unsigned long) * 
PRIOIDX_SZ);

-   if (prioidx == sizeof(unsigned long) * PRIOIDX_SZ) {
-   spin_unlock_irqrestore(_map_lock, flags);
-   return -ENOSPC;
-   }
-   set_bit(prioidx, prioidx_map);
-   if (atomic_read(_prioidx) < prioidx)
-   atomic_set(_prioidx, prioidx);
-   spin_unlock_irqrestore(_map_lock, flags);
-   *prio = prioidx;
-   return 0;
-}
-
-static void put_prioidx(u32 idx)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(_map_lock, flags);
-   clear_bit(idx, prioidx_map);
-   spin_unlock_irqrestore(_map_lock, flags);
-}
-
 static int extend_netdev_table(struct net_device *dev, u32 new_len)
 {
size_t new_size = sizeof(struct netprio_map) +
@@ -120,9 +89,9 @@ static struct cgroup_subsys_state *cgrp_create(struct 
cgroup *cgrp)

if (cgrp->parent && cgrp_netprio_state(cgrp->parent)->prioidx)
goto out;

-   ret = get_prioidx(>prioidx);
-   if (ret < 0) {
-   pr_warn("No space in priority index array\n");
+   cs->prioidx = ida_simple_get(_ida, 0, 0, GFP_KERNEL);
+   if (cs->prioidx < 0) {
+   ret = cs->prioidx;
goto out;
}

@@ -146,7 +115,7 @@ static void cgrp_destroy(struct cgroup *cgrp)
map->priomap[cs->prioidx] = 0;
}
rtnl_unlock();
-   put_prioidx(cs->prioidx);
+   ida_simple_remove(_ida, cs->prioidx);
kfree(cs);
 }

@@ -284,12 +253,10 @@ struct cgroup_subsys net_prio_subsys = {
.module = THIS_MODULE,

/*
-* net_prio has artificial limit on the number of cgroups and
-* disallows nesting making it impossible to co-mount it with other
-* hierarchical subsystems.  Remove the artificially low PRIOIDX_SZ
-* limit and properly nest configuration such that children follow
-* their parents' configurations by default and are allowed to
-* override and remove the following.
+* net_prio has artificial limit on properly nest
+* configuration such that children follow their parents'
+* configurations by default and are allowed to override and
+* remove the following.
 */
.broken_hierarchy = true,
 };
--
1.7.11.7


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  

Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys->post_create()

2012-11-09 Thread Li Zefan
On 2012/11/9 3:07, Tejun Heo wrote:
> Subject: cgroup: add cgroup_subsys->post_create()
> 
> Currently, there's no way for a controller to find out whether a new
> cgroup finished all ->create() allocatinos successfully and is
> considered "live" by cgroup.
> 
> This becomes a problem later when we add generic descendants walking
> to cgroup which can be used by controllers as controllers don't have a
> synchronization point where it can synchronize against new cgroups
> appearing in such walks.
> 
> This patch adds ->post_create().  It's called after all ->create()
> succeeded and the cgroup is linked into the generic cgroup hierarchy.
> This plays the counterpart of ->pre_destroy().
> 
> When used in combination with the to-be-added generic descendant
> iterators, ->post_create() can be used to implement reliable state
> inheritance.  It will be explained with the descendant iterators.
> 
> v2: Added a paragraph about its future use w/ descendant iterators per
> Michal.
> 
> v3: Forgot to add ->post_create() invocation to cgroup_load_subsys().
> Fixed.
> 
> Signed-off-by: Tejun Heo 
> Acked-by: Michal Hocko 
> Reviewed-by: KAMEZAWA Hiroyuki 
> Cc: Glauber Costa 

Acked-by: Li Zefan 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys->post_create()

2012-11-09 Thread Li Zefan
On 2012/11/9 3:07, Tejun Heo wrote:
> Subject: cgroup: add cgroup_subsys->post_create()
> 
> Currently, there's no way for a controller to find out whether a new
> cgroup finished all ->create() allocatinos successfully and is
> considered "live" by cgroup.
> 
> This becomes a problem later when we add generic descendants walking
> to cgroup which can be used by controllers as controllers don't have a
> synchronization point where it can synchronize against new cgroups
> appearing in such walks.
> 
> This patch adds ->post_create().  It's called after all ->create()
> succeeded and the cgroup is linked into the generic cgroup hierarchy.
> This plays the counterpart of ->pre_destroy().
> 
> When used in combination with the to-be-added generic descendant
> iterators, ->post_create() can be used to implement reliable state
> inheritance.  It will be explained with the descendant iterators.
> 
> v2: Added a paragraph about its future use w/ descendant iterators per
> Michal.
> 
> v3: Forgot to add ->post_create() invocation to cgroup_load_subsys().
> Fixed.
> 
> Signed-off-by: Tejun Heo 
> Acked-by: Michal Hocko 
> Reviewed-by: KAMEZAWA Hiroyuki 
> Cc: Glauber Costa 

Li Zefan http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys-post_create()

2012-11-09 Thread Li Zefan
On 2012/11/9 3:07, Tejun Heo wrote:
 Subject: cgroup: add cgroup_subsys-post_create()
 
 Currently, there's no way for a controller to find out whether a new
 cgroup finished all -create() allocatinos successfully and is
 considered live by cgroup.
 
 This becomes a problem later when we add generic descendants walking
 to cgroup which can be used by controllers as controllers don't have a
 synchronization point where it can synchronize against new cgroups
 appearing in such walks.
 
 This patch adds -post_create().  It's called after all -create()
 succeeded and the cgroup is linked into the generic cgroup hierarchy.
 This plays the counterpart of -pre_destroy().
 
 When used in combination with the to-be-added generic descendant
 iterators, -post_create() can be used to implement reliable state
 inheritance.  It will be explained with the descendant iterators.
 
 v2: Added a paragraph about its future use w/ descendant iterators per
 Michal.
 
 v3: Forgot to add -post_create() invocation to cgroup_load_subsys().
 Fixed.
 
 Signed-off-by: Tejun Heo t...@kernel.org
 Acked-by: Michal Hocko mho...@suse.cz
 Reviewed-by: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
 Cc: Glauber Costa glom...@parallels.com

Li Zefan lize...@huawei.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys-post_create()

2012-11-09 Thread Li Zefan
On 2012/11/9 3:07, Tejun Heo wrote:
 Subject: cgroup: add cgroup_subsys-post_create()
 
 Currently, there's no way for a controller to find out whether a new
 cgroup finished all -create() allocatinos successfully and is
 considered live by cgroup.
 
 This becomes a problem later when we add generic descendants walking
 to cgroup which can be used by controllers as controllers don't have a
 synchronization point where it can synchronize against new cgroups
 appearing in such walks.
 
 This patch adds -post_create().  It's called after all -create()
 succeeded and the cgroup is linked into the generic cgroup hierarchy.
 This plays the counterpart of -pre_destroy().
 
 When used in combination with the to-be-added generic descendant
 iterators, -post_create() can be used to implement reliable state
 inheritance.  It will be explained with the descendant iterators.
 
 v2: Added a paragraph about its future use w/ descendant iterators per
 Michal.
 
 v3: Forgot to add -post_create() invocation to cgroup_load_subsys().
 Fixed.
 
 Signed-off-by: Tejun Heo t...@kernel.org
 Acked-by: Michal Hocko mho...@suse.cz
 Reviewed-by: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
 Cc: Glauber Costa glom...@parallels.com

Acked-by: Li Zefan lize...@huawei.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys-post_create()

2012-11-09 Thread Daniel Wagner

Hi Tejun,

On 08.11.2012 20:07, Tejun Heo wrote: Subject: cgroup: add 
cgroup_subsys-post_create()


 Currently, there's no way for a controller to find out whether a new
 cgroup finished all -create() allocatinos successfully and is
 considered live by cgroup.

I'd like add hierarchy support to net_prio and the first thing to
do is to get rid of get_prioidx(). It looks like it would be nice to be 
able to use use_id and post_create() for this but as I read the code 
this might not work because the netdev might access resources allocated 
between create() and post_create(). So my question is would it make 
sense to move


cgroup_create():

if (ss-use_id) {
err = alloc_css_id(ss, parent, cgrp);
if (err)
goto err_destroy;
}

part before create() or add some protection between create() and 
post_create() callback in net_prio. I have a patch but I see

I could drop it completely if post_create() is there.

cheers,
daniel


From 84fbbdf0dc5d3460389e39a00a3ee553ee55b563 Mon Sep 17 00:00:00 2001
From: Daniel Wagner daniel.wag...@bmw-carit.de
Date: Thu, 8 Nov 2012 17:17:21 +0100
Subject: [PATCH] cgroups: net_prio: Use IDR library to assign netprio index

get_prioidx() allocated a new id whenever it was called. put_prioidx()
gave an id back. get_pioidx() could can reallocate the id later on.
So that is exactly what IDR offers and so let's use it.

Signed-off-by: Daniel Wagner daniel.wag...@bmw-carit.de
---
 net/core/netprio_cgroup.c | 51 
+--

 1 file changed, 9 insertions(+), 42 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 847c02b..3c1b612 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -27,10 +27,7 @@

 #include linux/fdtable.h

-#define PRIOIDX_SZ 128
-
-static unsigned long prioidx_map[PRIOIDX_SZ];
-static DEFINE_SPINLOCK(prioidx_map_lock);
+static DEFINE_IDA(netprio_ida);
 static atomic_t max_prioidx = ATOMIC_INIT(0);

 static inline struct cgroup_netprio_state *cgrp_netprio_state(struct 
cgroup *cgrp)
@@ -39,34 +36,6 @@ static inline struct cgroup_netprio_state 
*cgrp_netprio_state(struct cgroup *cgr

struct cgroup_netprio_state, css);
 }

-static int get_prioidx(u32 *prio)
-{
-   unsigned long flags;
-   u32 prioidx;
-
-   spin_lock_irqsave(prioidx_map_lock, flags);
-	prioidx = find_first_zero_bit(prioidx_map, sizeof(unsigned long) * 
PRIOIDX_SZ);

-   if (prioidx == sizeof(unsigned long) * PRIOIDX_SZ) {
-   spin_unlock_irqrestore(prioidx_map_lock, flags);
-   return -ENOSPC;
-   }
-   set_bit(prioidx, prioidx_map);
-   if (atomic_read(max_prioidx)  prioidx)
-   atomic_set(max_prioidx, prioidx);
-   spin_unlock_irqrestore(prioidx_map_lock, flags);
-   *prio = prioidx;
-   return 0;
-}
-
-static void put_prioidx(u32 idx)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(prioidx_map_lock, flags);
-   clear_bit(idx, prioidx_map);
-   spin_unlock_irqrestore(prioidx_map_lock, flags);
-}
-
 static int extend_netdev_table(struct net_device *dev, u32 new_len)
 {
size_t new_size = sizeof(struct netprio_map) +
@@ -120,9 +89,9 @@ static struct cgroup_subsys_state *cgrp_create(struct 
cgroup *cgrp)

if (cgrp-parent  cgrp_netprio_state(cgrp-parent)-prioidx)
goto out;

-   ret = get_prioidx(cs-prioidx);
-   if (ret  0) {
-   pr_warn(No space in priority index array\n);
+   cs-prioidx = ida_simple_get(netprio_ida, 0, 0, GFP_KERNEL);
+   if (cs-prioidx  0) {
+   ret = cs-prioidx;
goto out;
}

@@ -146,7 +115,7 @@ static void cgrp_destroy(struct cgroup *cgrp)
map-priomap[cs-prioidx] = 0;
}
rtnl_unlock();
-   put_prioidx(cs-prioidx);
+   ida_simple_remove(netprio_ida, cs-prioidx);
kfree(cs);
 }

@@ -284,12 +253,10 @@ struct cgroup_subsys net_prio_subsys = {
.module = THIS_MODULE,

/*
-* net_prio has artificial limit on the number of cgroups and
-* disallows nesting making it impossible to co-mount it with other
-* hierarchical subsystems.  Remove the artificially low PRIOIDX_SZ
-* limit and properly nest configuration such that children follow
-* their parents' configurations by default and are allowed to
-* override and remove the following.
+* net_prio has artificial limit on properly nest
+* configuration such that children follow their parents'
+* configurations by default and are allowed to override and
+* remove the following.
 */
.broken_hierarchy = true,
 };
--
1.7.11.7


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo 

Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys-post_create()

2012-11-09 Thread Tejun Heo
Hey, Daniel.

On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:
 On 08.11.2012 20:07, Tejun Heo wrote: Subject: cgroup: add
 cgroup_subsys-post_create()
 
  Currently, there's no way for a controller to find out whether a new
  cgroup finished all -create() allocatinos successfully and is
  considered live by cgroup.
 
 I'd like add hierarchy support to net_prio and the first thing to
 do is to get rid of get_prioidx(). It looks like it would be nice to

Ooh, I'm already working on it.  I *think* I should be able to post
the patches later today or early next week.

 be able to use use_id and post_create() for this but as I read the
 code this might not work because the netdev might access resources
 allocated between create() and post_create(). So my question is
 would it make sense to move
 
 cgroup_create():
 
   if (ss-use_id) {
   err = alloc_css_id(ss, parent, cgrp);
   if (err)
   goto err_destroy;
   }
 
 part before create() or add some protection between create() and
 post_create() callback in net_prio. I have a patch but I see
 I could drop it completely if post_create() is there.

Glauber had about similar question about css_id and I need to think
more about it but currently I think I want to phase out css IDs.  It's
an id of the wrong thing (CSSes don't need IDs, cgroups do) and
unnecessarily duplicates its own hierarchy when the hierarchy of
cgroups already exists.  Once memcontrol moves away from walking using
css_ids, I *think* I'll try to kill it.

I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
number) so that it can be used for cgroup indexing.  Glauber, that
should solve your problem too, right?

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/9 v3] cgroup: add cgroup_subsys-post_create()

2012-11-09 Thread Glauber Costa
On 11/09/2012 06:22 PM, Tejun Heo wrote:
 Hey, Daniel.
 
 On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:
 On 08.11.2012 20:07, Tejun Heo wrote: Subject: cgroup: add
 cgroup_subsys-post_create()

 Currently, there's no way for a controller to find out whether a new
 cgroup finished all -create() allocatinos successfully and is
 considered live by cgroup.

 I'd like add hierarchy support to net_prio and the first thing to
 do is to get rid of get_prioidx(). It looks like it would be nice to
 
 Ooh, I'm already working on it.  I *think* I should be able to post
 the patches later today or early next week.
 
 be able to use use_id and post_create() for this but as I read the
 code this might not work because the netdev might access resources
 allocated between create() and post_create(). So my question is
 would it make sense to move

 cgroup_create():

  if (ss-use_id) {
  err = alloc_css_id(ss, parent, cgrp);
  if (err)
  goto err_destroy;
  }

 part before create() or add some protection between create() and
 post_create() callback in net_prio. I have a patch but I see
 I could drop it completely if post_create() is there.
 
 Glauber had about similar question about css_id and I need to think
 more about it but currently I think I want to phase out css IDs.  It's
 an id of the wrong thing (CSSes don't need IDs, cgroups do) and
 unnecessarily duplicates its own hierarchy when the hierarchy of
 cgroups already exists.  Once memcontrol moves away from walking using
 css_ids, I *think* I'll try to kill it.

May I suggest doing something similar with what the scheduler does? I
had some code in the past that reused that code - but basically
duplicated it. If you want, I can try getting a version of that in
kernel/cgroup.c  that would serve as a general walker.

I like that walker a lot, because it happens in a sane order. memcg
basically walks in a random weird order, that makes hierarchical
computation of anything quite hard.

 
 I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
 number) so that it can be used for cgroup indexing.  Glauber, that
 should solve your problem too, right?
 

Actually I went with a totally orthogonal solution. I am now using per
kmem-limited ids. Because they are not tied to the cgroup creation
workflow, I can allocate whenever it is more convenient.

I ended up liking this solution because it will do better in scenarios
where most of the memcgs are not kmem limited. So it had an edge here,
and also got rid of the create/post_create problem by breaking the
dependency.

But of course, if cgroups would gain some kind of sane indexing, it
could shift the balance towards reusing it.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/9 v3] cgroup: add cgroup_subsys->post_create()

2012-11-08 Thread Tejun Heo
Subject: cgroup: add cgroup_subsys->post_create()

Currently, there's no way for a controller to find out whether a new
cgroup finished all ->create() allocatinos successfully and is
considered "live" by cgroup.

This becomes a problem later when we add generic descendants walking
to cgroup which can be used by controllers as controllers don't have a
synchronization point where it can synchronize against new cgroups
appearing in such walks.

This patch adds ->post_create().  It's called after all ->create()
succeeded and the cgroup is linked into the generic cgroup hierarchy.
This plays the counterpart of ->pre_destroy().

When used in combination with the to-be-added generic descendant
iterators, ->post_create() can be used to implement reliable state
inheritance.  It will be explained with the descendant iterators.

v2: Added a paragraph about its future use w/ descendant iterators per
Michal.

v3: Forgot to add ->post_create() invocation to cgroup_load_subsys().
Fixed.

Signed-off-by: Tejun Heo 
Acked-by: Michal Hocko 
Reviewed-by: KAMEZAWA Hiroyuki 
Cc: Glauber Costa 
---
Oops, forgot updating cgroup_load_subsys().  Hate that it's a
different path from cgroup_init_subsys(). :(

Thanks.

 include/linux/cgroup.h |1 +
 kernel/cgroup.c|   15 +--
 2 files changed, 14 insertions(+), 2 deletions(-)

--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -438,6 +438,7 @@ int cgroup_taskset_size(struct cgroup_ta
 
 struct cgroup_subsys {
struct cgroup_subsys_state *(*create)(struct cgroup *cgrp);
+   void (*post_create)(struct cgroup *cgrp);
void (*pre_destroy)(struct cgroup *cgrp);
void (*destroy)(struct cgroup *cgrp);
int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4059,10 +4059,15 @@ static long cgroup_create(struct cgroup
if (err < 0)
goto err_remove;
 
-   /* each css holds a ref to the cgroup's dentry */
-   for_each_subsys(root, ss)
+   for_each_subsys(root, ss) {
+   /* each css holds a ref to the cgroup's dentry */
dget(dentry);
 
+   /* creation succeeded, notify subsystems */
+   if (ss->post_create)
+   ss->post_create(cgrp);
+   }
+
/* The cgroup directory was pre-locked for us */
BUG_ON(!mutex_is_locked(>dentry->d_inode->i_mutex));
 
@@ -4280,6 +4285,9 @@ static void __init cgroup_init_subsys(st
 
ss->active = 1;
 
+   if (ss->post_create)
+   ss->post_create(>root->top_cgroup);
+
/* this function shouldn't be used with modular subsystems, since they
 * need to register a subsys_id, among other things */
BUG_ON(ss->module);
@@ -4389,6 +4397,9 @@ int __init_or_module cgroup_load_subsys(
 
ss->active = 1;
 
+   if (ss->post_create)
+   ss->post_create(>root->top_cgroup);
+
/* success! */
mutex_unlock(_mutex);
return 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/9 v3] cgroup: add cgroup_subsys-post_create()

2012-11-08 Thread Tejun Heo
Subject: cgroup: add cgroup_subsys-post_create()

Currently, there's no way for a controller to find out whether a new
cgroup finished all -create() allocatinos successfully and is
considered live by cgroup.

This becomes a problem later when we add generic descendants walking
to cgroup which can be used by controllers as controllers don't have a
synchronization point where it can synchronize against new cgroups
appearing in such walks.

This patch adds -post_create().  It's called after all -create()
succeeded and the cgroup is linked into the generic cgroup hierarchy.
This plays the counterpart of -pre_destroy().

When used in combination with the to-be-added generic descendant
iterators, -post_create() can be used to implement reliable state
inheritance.  It will be explained with the descendant iterators.

v2: Added a paragraph about its future use w/ descendant iterators per
Michal.

v3: Forgot to add -post_create() invocation to cgroup_load_subsys().
Fixed.

Signed-off-by: Tejun Heo t...@kernel.org
Acked-by: Michal Hocko mho...@suse.cz
Reviewed-by: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
Cc: Glauber Costa glom...@parallels.com
---
Oops, forgot updating cgroup_load_subsys().  Hate that it's a
different path from cgroup_init_subsys(). :(

Thanks.

 include/linux/cgroup.h |1 +
 kernel/cgroup.c|   15 +--
 2 files changed, 14 insertions(+), 2 deletions(-)

--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -438,6 +438,7 @@ int cgroup_taskset_size(struct cgroup_ta
 
 struct cgroup_subsys {
struct cgroup_subsys_state *(*create)(struct cgroup *cgrp);
+   void (*post_create)(struct cgroup *cgrp);
void (*pre_destroy)(struct cgroup *cgrp);
void (*destroy)(struct cgroup *cgrp);
int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4059,10 +4059,15 @@ static long cgroup_create(struct cgroup
if (err  0)
goto err_remove;
 
-   /* each css holds a ref to the cgroup's dentry */
-   for_each_subsys(root, ss)
+   for_each_subsys(root, ss) {
+   /* each css holds a ref to the cgroup's dentry */
dget(dentry);
 
+   /* creation succeeded, notify subsystems */
+   if (ss-post_create)
+   ss-post_create(cgrp);
+   }
+
/* The cgroup directory was pre-locked for us */
BUG_ON(!mutex_is_locked(cgrp-dentry-d_inode-i_mutex));
 
@@ -4280,6 +4285,9 @@ static void __init cgroup_init_subsys(st
 
ss-active = 1;
 
+   if (ss-post_create)
+   ss-post_create(ss-root-top_cgroup);
+
/* this function shouldn't be used with modular subsystems, since they
 * need to register a subsys_id, among other things */
BUG_ON(ss-module);
@@ -4389,6 +4397,9 @@ int __init_or_module cgroup_load_subsys(
 
ss-active = 1;
 
+   if (ss-post_create)
+   ss-post_create(ss-root-top_cgroup);
+
/* success! */
mutex_unlock(cgroup_mutex);
return 0;
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/