On Thu, Dec 21, 2017 at 1:03 AM, Jiri Pirko <j...@resnulli.us> wrote: > Thu, Dec 21, 2017 at 08:26:24AM CET, xiyou.wangc...@gmail.com wrote: >>The rcu_barrier_bh() in mini_qdisc_pair_swap() is to wait for >>flying RCU callback installed by a previous mini_qdisc_pair_swap(), >>however we miss it on the tp_head==NULL path, which leads to that >>the RCU callback still uses miniq_old->rcu after it is freed together >>with qdisc in qdisc_graft(). So just add it on that path too. >> >>Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of >>tp->q for clsact fastpath ") > > This fixes: > 752fbcc33405 ("net_sched: no need to free qdisc in RCU callback") > > Before that, the issue was not there as the qdisc struct got removed > after a grace period.
This is non-sense. You have to read the stack trace from Jakub again and tell me why you keep believing any RCU reader involved. I am pretty sure no one reported any crash between commit 752fbcc33405 and 46209401f8f6. > > >>Reported-by: Jakub Kicinski <jakub.kicin...@netronome.com> >>Tested-by: Jakub Kicinski <jakub.kicin...@netronome.com> >>Cc: Jiri Pirko <j...@mellanox.com> >>Cc: John Fastabend <john.fastab...@gmail.com> >>Signed-off-by: Cong Wang <xiyou.wangc...@gmail.com> >>--- >> net/sched/sch_generic.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >>diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c >>index cd1b200acae7..661c7144b53a 100644 >>--- a/net/sched/sch_generic.c >>+++ b/net/sched/sch_generic.c >>@@ -1040,6 +1040,8 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair >>*miniqp, >> >> if (!tp_head) { >> RCU_INIT_POINTER(*miniqp->p_miniq, NULL); >>+ /* Wait for flying RCU callback before it is freed. */ >>+ rcu_barrier_bh(); > > >> return; >> } >> >>@@ -1055,7 +1057,7 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair >>*miniqp, >> rcu_assign_pointer(*miniqp->p_miniq, miniq); >> >> if (miniq_old) >>- /* This is counterpart of the rcu barrier above. We need to >>+ /* This is counterpart of the rcu barriers above. We need to > > This is incorrect. Here we block in order to not use the same miniq > again in scenario > > miniq1 (X) > miniq2 > miniq1 (yet there are reader using X) > > This call_rcu has 0 relation to the barrier you are adding. Seriously? It is this call_rcu still flying after we free the qdisc. Did you seriously look into the stack trace from Jakub? > > > But again, we don't we just free qdisc in call_rcu and avoid the > barrier? Non-sense again. Why qdisc code should be adjusted for your miniq code? It is your own responsibility to take care of this shit. Don't spread it out of minq.