> -----Original Message-----
> From: Paul Turner [mailto:[email protected]]
> Sent: Tuesday, November 12, 2013 11:10 AM
> To: Wang, Xiaoming
> Cc: Ingo Molnar; Peter Zijlstra; LKML; Liu, Chuansheng; Zhang, Dongxing
> Subject: Re: [PATCH] [sched]: pick the NULL entity caused the panic.
> 
> On Tue, Nov 12, 2013 at 8:29 AM, Wang, Xiaoming
> <[email protected]> wrote:
> > cfs_rq get its group run queue but the value of
> > cfs_rq->nr_running maybe zero, which will cause
> > the panic in pick_next_task_fair.
> > So the evaluated of cfs_rq->nr_running is needed.
> >
> > Signed-off-by: xiaoming wang <[email protected]>
> > Signed-off-by: Zhang Dongxing <[email protected]>
> > ---
> >  kernel/sched/fair.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 7c70201..2d440f9 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3708,7 +3708,7 @@ static struct task_struct
> *pick_next_task_fair(struct rq *rq)
> >                 se = pick_next_entity(cfs_rq);
> >                 set_next_entity(cfs_rq, se);
> >                 cfs_rq = group_cfs_rq(se);
> > -       } while (cfs_rq);
> > +       } while (cfs_rq && cfs_rq->nr_running);
> >
> >         p = task_of(se);
> >         if (hrtick_enabled(rq))
> 
> This can only happen when something else has already corrupted the
> rb-tree.  Breaking out here is going to cause you to instead try
> treating a group entity as a task, which will crash just as badly.
> 
> Could you describe what was being run when this crash occurred?
> 
> > --
> > 1.7.1
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to [email protected]
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
Dear Paul
        How about moving cfs_rq->nr_running into loop. What I worried is that 
cfs_rq->nr_running
may zero because cfs_rq is coming from cfs_rq = group_cfs_rq(se) again. We 
haven't known the 
reproduction exactly, panic happened only on random test and unstable. 

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2d440f9..7f2f8b6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3701,14 +3701,13 @@ static struct task_struct *pick_next_task_fair(struct 
rq *rq)
        struct cfs_rq *cfs_rq = &rq->cfs;
        struct sched_entity *se;

-       if (!cfs_rq->nr_running)
-               return NULL;
-
        do {
+               if (!cfs_rq->nr_running)
+                       return NULL;
                se = pick_next_entity(cfs_rq);
                set_next_entity(cfs_rq, se);
                cfs_rq = group_cfs_rq(se);
-       } while (cfs_rq && cfs_rq->nr_running);
+       } while (cfs_rq);

        p = task_of(se);
        if (hrtick_enabled(rq))

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to