On Mon, Sep 14, 2015 at 10:32:42PM -0700, Shayan Pooya wrote:
> Fixes commit fed14d45f945 ("sched/fair: Track cgroup depth")
> Hit this kernel panic mentioned in https://lkml.org/lkml/2014/2/15/217
> when running docker with kernel 3.16.

v3.16 includes the fix from that thread (and I had to look in my own
archives, because lkml.org fancies showing blank pages today :/).

> The issue has been reported other places including:
>
> https://github.com/docker/docker/issues/13940
> https://gist.github.com/burke/c60dc5b8f0ba9bfd9275
>
> The latter also has an analysis and a similar patch (which was never
> submitted to lkml).

Pretty good write up that, sad you did not Cc the guy. I got defeated by
the github web shite (again!) and could not locate an email address for
him :(

Ah.. Google to the rescue!

> Which suggests the inlined function find_matching_se and the while loop
> in it. Looking into the task that was about to get scheduled in the
> check_preempt_wakeup function:
>
> crash> p ((struct task_struct *) 0xffff8808506fd180)->se.depth
> $2 = 1
> crash> p ((struct task_struct *) 0xffff8808506fd180)->se.parent->depth
> $4 = 1

Yep, buggered.

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6e2e348..ced5534 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8035,7 +8035,6 @@ static void task_move_group_fair(struct task_struct *p, int queued)
>  	if (!queued)
>  		se->vruntime -= cfs_rq_of(se)->min_vruntime;
>  	set_task_rq(p, task_cpu(p));
> -	se->depth = se->parent ? se->parent->depth + 1 : 0;
>  	if (!queued) {
>  		cfs_rq = cfs_rq_of(se);
>  		se->vruntime += cfs_rq->min_vruntime;

So at this point I'm left wondering about that depth update we have in
switched_to_fair().

Which leads me to suggest the following (note that some of this code has
_just_ changed a lot).

Does that work for you? (not been near a compiler).

---
 kernel/sched/fair.c  | 10 +---------
 kernel/sched/sched.h |  1 +
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9176f7c588a8..fc3ef8fb6891 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8000,13 +8000,7 @@ static void attach_task_cfs_rq(struct task_struct *p)
 	struct sched_entity *se = &p->se;
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-	/*
-	 * Since the real-depth could have been changed (only FAIR
-	 * class maintain depth value), reset depth properly.
-	 */
-	se->depth = se->parent ? se->parent->depth + 1 : 0;
-#endif
+	set_task_rq(p, task_cpu(p));
 
 	/* Synchronize task with its cfs_rq */
 	attach_entity_load_avg(cfs_rq, se);
@@ -8072,8 +8066,6 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 static void task_move_group_fair(struct task_struct *p)
 {
 	detach_task_cfs_rq(p);
-	set_task_rq(p, task_cpu(p));
-
 #ifdef CONFIG_SMP
 	/* Tell se's cfs_rq has been changed -- migrated */
 	p->se.avg.last_update_time = 0;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 167ab4844ee6..dde8881f16bc 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -931,6 +931,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	p->se.cfs_rq = tg->cfs_rq[cpu];
 	p->se.parent = tg->se[cpu];
+	p->se.depth = p->se.parent ? p->se.parent->depth + 1 : 0;
 #endif
 
 #ifdef CONFIG_RT_GROUP_SCHED
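
For anyone following the thread without the scheduler source open, here is
a rough user-space model of why a stale se.depth (the crash output above
shows depth == parent->depth) can send the find_matching_se walk off the top
of the hierarchy. It is only a sketch: struct entity, set_parent(),
depth_is_consistent() and find_matching() are made-up names, "same parent"
stands in for the kernel's same-cfs_rq check, and the -1 return marks the
spot where the real code would presumably dereference a NULL parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_entity: only parent and depth. */
struct entity {
	struct entity *parent;
	int depth;
};

/*
 * Models the idea of the patch: depth is recomputed at the same place the
 * parent pointer is assigned, so the two cannot drift apart.
 */
static void set_parent(struct entity *se, struct entity *parent)
{
	se->parent = parent;
	se->depth = parent ? parent->depth + 1 : 0;
}

/* Invariant the walk below relies on: depth == number of ancestors. */
static int depth_is_consistent(const struct entity *se)
{
	int n = 0;

	for (const struct entity *p = se->parent; p; p = p->parent)
		n++;
	return se->depth == n;
}

/*
 * Simplified model of the find_matching_se walk: level both entities by
 * depth, then climb in lockstep until they share a parent.
 * Returns 0 on success, -1 where this model would walk past the root.
 */
static int find_matching(struct entity **se, struct entity **pse)
{
	int se_depth = (*se)->depth;
	int pse_depth = (*pse)->depth;

	while (se_depth > pse_depth) {
		se_depth--;
		*se = (*se)->parent;
		if (!*se)
			return -1;
	}
	while (pse_depth > se_depth) {
		pse_depth--;
		*pse = (*pse)->parent;
		if (!*pse)
			return -1;
	}
	while ((*se)->parent != (*pse)->parent) {
		*se = (*se)->parent;
		*pse = (*pse)->parent;
		if (!*se || !*pse)
			return -1;
	}
	return 0;
}
```

With a task two groups deep and another task at the root, the walk succeeds
when depth was set through set_parent(); overwrite the first task's depth
with 1 (as in the crash dump) and the lockstep loop runs one entity past the
root instead.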