Hello, Li.

On Mon, Nov 19, 2012 at 05:02:42PM +0800, Li Zefan wrote:
> On 2012/11/13 11:01, Tejun Heo wrote:
> > struct cgroup is made RCU-safe by synchronize_rcu() in cgroup_diput().
>
> but synchronize_rcu() is called before ss->destroy().
>
>	rcu_read_lock();
>	for_each_leaf_cfs_rq(cpu_rq(cpu), cfs_rq)
>		print_cfs_rq(m, cpu, cfs_rq);
>		  -> call cgroup_path(task_group->css.cgroup);
>	rcu_read_unlock();
>
> With this patch, if the above code race with cgroup_diput(), we might
> end up accessing a cgroup which has been freed.

Ah, okay.  So, the problem here is that sched is using ->css_free() as
a de-registration point rather than freeing and may end up walking it
after ->css_free() is complete inside RCU period.  I think the correct
solution is using ->css_offline() for that.  It's ugly to require
double RCU grace periods.

> > diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> > index 278752e..a91e7ad 100644
> > --- a/kernel/cgroup.c
> > +++ b/kernel/cgroup.c
> > @@ -893,7 +893,7 @@ static void cgroup_diput(struct dentry *dentry, struct inode *inode)
> >
> >		simple_xattrs_free(&cgrp->xattrs);
> >
> > -		kfree_rcu(cgrp, rcu_head);
> > +		kfree(cgrp);
>
> This was also added to prevent a race in group scheduling code, and I think
> the race still exists.

Care to point out which one?  I don't think the double-RCU workaround
is a good idea.  We really should sort it out by following object
lifecycle rules consistently.

Thanks.

--
tejun