On Tue, Mar 12, 2019 at 7:36 AM Subhra Mazumdar
<subhra.mazum...@oracle.com> wrote:
>
>
> On 3/11/19 11:34 AM, Subhra Mazumdar wrote:
> >
> > On 3/10/19 9:23 PM, Aubrey Li wrote:
> >> On Sat, Mar 9, 2019 at 3:50 AM Subhra Mazumdar
> >> <subhra.mazum...@oracle.com> wrote:
> >>> expected. Most of the performance recovery happens in patch 15 which,
> >>> unfortunately, is also the one that introduces the hard lockup.
> >>>
> >> After applied Subhra's patch, the following is triggered by enabling
> >> core sched when a cgroup is under heavy load.
> >>
> > It seems you are facing some other deadlock where printk is involved.
> > Can you drop the last patch (patch 16 sched: Debug bits...) and try?
> >
> > Thanks,
> > Subhra
> >
> Never Mind, I am seeing the same lockdep deadlock output even w/o patch
> 16. Btw the NULL fix had something missing, following works.
>

Okay, here is another one. On my system, the CPUs brought up at boot don't
match the possible CPU map, so rq->core is never initialized for the CPUs
that are not online. That causes a NULL pointer dereference panic in
online_fair_sched_group().

Here is a quick fix:

-----------------------------------------------------------------------------------------------------
@@ -10488,7 +10493,8 @@ void online_fair_sched_group(struct task_group *tg)
 	for_each_possible_cpu(i) {
 		rq = cpu_rq(i);
 		se = tg->se[i];
-
+		if (!rq->core)
+			continue;
 		raw_spin_lock_irq(rq_lockp(rq));
 		update_rq_clock(rq);
 		attach_entity_cfs_rq(se);

Thanks,
-Aubrey
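
P.S. For anyone who wants to see the failure mode outside the kernel, below
is a rough userspace sketch of the same guard. Everything in it is
hypothetical (struct toy_rq, toy_online_group(), NR_POSSIBLE_CPUS); it is not
the scheduler code. It also assumes that the crash comes from taking the lock
through rq->core, which the lock_count bump stands in for.

```c
#include <assert.h>
#include <stddef.h>

#define NR_POSSIBLE_CPUS 4	/* hypothetical size of the possible map */

/* Toy stand-in for struct rq; only the fields needed for the sketch. */
struct toy_rq {
	struct toy_rq *core;	/* stays NULL if the CPU never came online */
	int lock_count;		/* counts "lock" acquisitions made via core */
	int attached;		/* set once the group entity is attached */
};

/*
 * Walk every possible CPU, as the real loop does, but skip runqueues
 * whose core pointer was never set up. Returns how many were attached.
 */
static int toy_online_group(struct toy_rq *rqs, int nr)
{
	int i, n = 0;

	assert(rqs != NULL);

	for (i = 0; i < nr; i++) {
		struct toy_rq *rq = &rqs[i];

		/* Without this check the next line dereferences NULL. */
		if (!rq->core)
			continue;

		rq->core->lock_count++;	/* assumed analogue of locking via rq->core */
		rq->attached = 1;
		n++;
	}
	return n;
}
```

With a zero-initialized array where only CPUs 0 and 1 have core pointing at
CPU 0, the function attaches those two and leaves CPUs 2 and 3 untouched.
Whether skipping is the right long-term answer, as opposed to initializing
rq->core for every possible CPU, is a separate question.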