Re: RCU callback crashes

2017-12-21 Thread Jakub Kicinski
On Thu, 21 Dec 2017 13:31:01 -0800, Cong Wang wrote:
> >629 if (likely(skb)) {
> >630 qdisc_qstats_cpu_backlog_dec(qdisc, skb);
> >631 qdisc_bstats_cpu_update(qdisc, skb);
> >632 qdisc_qstats_cpu_qlen_dec(qdisc);
> >633

Re: RCU callback crashes

2017-12-21 Thread Cong Wang
On Wed, Dec 20, 2017 at 4:50 PM, Jakub Kicinski wrote:
> On Wed, 20 Dec 2017 16:41:14 -0800, Jakub Kicinski wrote:
>> Just as I hit send... :) but this looks unrelated, "Comm: sshd" -
>> so probably from the management interface.
>>
>> [ 154.604041]
>> ==

Re: RCU callback crashes

2017-12-21 Thread Cong Wang
On Thu, Dec 21, 2017 at 8:26 AM, John Fastabend wrote:
> On 12/20/2017 11:27 PM, Cong Wang wrote:
>> On Wed, Dec 20, 2017 at 4:50 PM, Jakub Kicinski wrote:
>>> On Wed, 20 Dec 2017 16:41:14 -0800, Jakub Kicinski wrote:
>>>> Just as I hit send... :) but this looks unrelated, "Comm: sshd" -
>>>> so

Re: RCU callback crashes

2017-12-21 Thread Jakub Kicinski
On Thu, 21 Dec 2017 08:26:56 -0800, John Fastabend wrote:
> @Jakub, does your test have traffic generator running or just control
> path? My theory would be a bit odd if you didn't have traffic, but
> something is kicking the dequeue so must be some traffic.
It was just control traffic, but it's t

Re: RCU callback crashes

2017-12-21 Thread John Fastabend
On 12/20/2017 11:27 PM, Cong Wang wrote:
> On Wed, Dec 20, 2017 at 4:50 PM, Jakub Kicinski wrote:
>> On Wed, 20 Dec 2017 16:41:14 -0800, Jakub Kicinski wrote:
>>> Just as I hit send... :) but this looks unrelated, "Comm: sshd" -
>>> so probably from the management interface.
>>>
>>> [ 154.604041

Re: RCU callback crashes

2017-12-20 Thread Cong Wang
On Wed, Dec 20, 2017 at 4:50 PM, Jakub Kicinski wrote:
> On Wed, 20 Dec 2017 16:41:14 -0800, Jakub Kicinski wrote:
>> Just as I hit send... :) but this looks unrelated, "Comm: sshd" -
>> so probably from the management interface.
>>
>> [ 154.604041]
>> ==

Re: RCU callback crashes

2017-12-20 Thread Cong Wang
On Wed, Dec 20, 2017 at 4:37 PM, Jakub Kicinski wrote:
> On Wed, 20 Dec 2017 16:03:49 -0800, Cong Wang wrote:
>> On Wed, Dec 20, 2017 at 10:31 AM, Cong Wang wrote:
>> > On Wed, Dec 20, 2017 at 10:17 AM, Cong Wang
>> > wrote:
>> >>
>> >> I guess it is q->miniqp which is freed in qdisc_graft() wi

Re: RCU callback crashes

2017-12-20 Thread Jakub Kicinski
On Wed, 20 Dec 2017 16:41:14 -0800, Jakub Kicinski wrote:
> Just as I hit send... :) but this looks unrelated, "Comm: sshd" -
> so probably from the management interface.
>
> [ 154.604041]
> ==
> [ 154.612245] BUG: KASAN: slab-out

Re: RCU callback crashes

2017-12-20 Thread Jakub Kicinski
On Wed, 20 Dec 2017 16:37:10 -0800, Jakub Kicinski wrote:
> On Wed, 20 Dec 2017 16:03:49 -0800, Cong Wang wrote:
> > On Wed, Dec 20, 2017 at 10:31 AM, Cong Wang
> > wrote:
> > > On Wed, Dec 20, 2017 at 10:17 AM, Cong Wang
> > > wrote:
> > >>
> > >> I guess it is q->miniqp which is freed i

Re: RCU callback crashes

2017-12-20 Thread Jakub Kicinski
On Wed, 20 Dec 2017 16:03:49 -0800, Cong Wang wrote:
> On Wed, Dec 20, 2017 at 10:31 AM, Cong Wang wrote:
> > On Wed, Dec 20, 2017 at 10:17 AM, Cong Wang
> > wrote:
> >>
> >> I guess it is q->miniqp which is freed in qdisc_graft() without properly
> >> waiting for rcu readers?
> >
> > It is

Re: RCU callback crashes

2017-12-20 Thread Jakub Kicinski
On Wed, 20 Dec 2017 16:03:49 -0800, Cong Wang wrote:
> On Wed, Dec 20, 2017 at 10:31 AM, Cong Wang wrote:
> > On Wed, Dec 20, 2017 at 10:17 AM, Cong Wang
> > wrote:
> >>
> >> I guess it is q->miniqp which is freed in qdisc_graft() without properly
> >> waiting for rcu readers?
> >
> > It is

Re: RCU callback crashes

2017-12-20 Thread Cong Wang
On Wed, Dec 20, 2017 at 10:31 AM, Cong Wang wrote:
> On Wed, Dec 20, 2017 at 10:17 AM, Cong Wang wrote:
>>
>> I guess it is q->miniqp which is freed in qdisc_graft() without properly
>> waiting for rcu readers?
>
> It is probably so, the call_rcu_bh(&miniq_old->rcu, mini_qdisc_rcu_func)
> in the

Re: RCU callback crashes

2017-12-20 Thread Cong Wang
On Wed, Dec 20, 2017 at 12:23 PM, John Fastabend wrote:
> I'm trying to see how removing that rcu grace period was safe in the
> first place. The datapath is using rcu_read critical section to protect
> the qdisc but the control path (a) doesn't use rcu grace period and (b)
> doesn't use the qidis

Re: RCU callback crashes

2017-12-20 Thread Cong Wang
On Wed, Dec 20, 2017 at 12:14 PM, John Fastabend wrote:
>
> Hi,
>
> Just sent a patch to complete qdisc_destroy from rcu callback. This
> is needed to resolve a race with the lockless qdisc patches.
>
> But I guess it should fix the miniq issue as well?
If you ever look into tools/testing/selfte

Re: RCU callback crashes

2017-12-20 Thread John Fastabend
On 12/20/2017 12:17 PM, Jakub Kicinski wrote: > On Wed, 20 Dec 2017 10:04:17 -0800, John Fastabend wrote: >> On 12/19/2017 10:34 PM, Jakub Kicinski wrote: >>> On Tue, 19 Dec 2017 22:22:27 -0800, Jakub Kicinski wrote: >> I get this: > > Could you try to run it with kasan on?

Re: RCU callback crashes

2017-12-20 Thread John Fastabend
On 12/20/2017 12:15 PM, Jiri Pirko wrote:
> Wed, Dec 20, 2017 at 08:59:22PM CET, j...@resnulli.us wrote:
>> Wed, Dec 20, 2017 at 07:17:50PM CET, xiyou.wangc...@gmail.com wrote:
>>> On Tue, Dec 19, 2017 at 10:34 PM, Jakub Kicinski wrote:
>>>> Ah, no object debug but KASAN on produces this:
>>>

Re: RCU callback crashes

2017-12-20 Thread Jiri Pirko
Wed, Dec 20, 2017 at 09:14:49PM CET, john.fastab...@gmail.com wrote:
>On 12/20/2017 11:59 AM, Jiri Pirko wrote:
>> Wed, Dec 20, 2017 at 07:17:50PM CET, xiyou.wangc...@gmail.com wrote:
>>> On Tue, Dec 19, 2017 at 10:34 PM, Jakub Kicinski wrote:
>>>> Ah, no object debug but KASAN on produces this:
>

Re: RCU callback crashes

2017-12-20 Thread Jakub Kicinski
On Wed, 20 Dec 2017 10:04:17 -0800, John Fastabend wrote:
> On 12/19/2017 10:34 PM, Jakub Kicinski wrote:
> > On Tue, 19 Dec 2017 22:22:27 -0800, Jakub Kicinski wrote:
> I get this:
> >>>
> >>> Could you try to run it with kasan on?
> >>
> >> I didn't manage to reproduce it with KA

Re: RCU callback crashes

2017-12-20 Thread Jiri Pirko
Wed, Dec 20, 2017 at 08:59:22PM CET, j...@resnulli.us wrote:
>Wed, Dec 20, 2017 at 07:17:50PM CET, xiyou.wangc...@gmail.com wrote:
>>On Tue, Dec 19, 2017 at 10:34 PM, Jakub Kicinski wrote:
>>> Ah, no object debug but KASAN on produces this:
>>>
>>
>>
>>I bet it is an ingress qdisc which is being f

Re: RCU callback crashes

2017-12-20 Thread John Fastabend
On 12/20/2017 11:59 AM, Jiri Pirko wrote:
> Wed, Dec 20, 2017 at 07:17:50PM CET, xiyou.wangc...@gmail.com wrote:
>> On Tue, Dec 19, 2017 at 10:34 PM, Jakub Kicinski wrote:
>>> Ah, no object debug but KASAN on produces this:
>>>
>>
>>
>> I bet it is an ingress qdisc which is being freed?
>>
>>
>>
>

Re: RCU callback crashes

2017-12-20 Thread Jiri Pirko
Wed, Dec 20, 2017 at 07:17:50PM CET, xiyou.wangc...@gmail.com wrote:
>On Tue, Dec 19, 2017 at 10:34 PM, Jakub Kicinski wrote:
>> Ah, no object debug but KASAN on produces this:
>>
>
>
>I bet it is an ingress qdisc which is being freed?
>
>
>
>> [ 39.268209] BUG: KASAN: use-after-free in cpu_need

Re: RCU callback crashes

2017-12-20 Thread Cong Wang
On Wed, Dec 20, 2017 at 10:17 AM, Cong Wang wrote:
>
> I guess it is q->miniqp which is freed in qdisc_graft() without properly
> waiting for rcu readers?
It is probably so, the call_rcu_bh(&miniq_old->rcu, mini_qdisc_rcu_func) in the end of mini_qdisc_pair_swap() is invoked on miniq_old->rcu, bu

Re: RCU callback crashes

2017-12-20 Thread Cong Wang
On Tue, Dec 19, 2017 at 10:34 PM, Jakub Kicinski wrote:
> Ah, no object debug but KASAN on produces this:
>
I bet it is an ingress qdisc which is being freed?
> [ 39.268209] BUG: KASAN: use-after-free in cpu_needs_another_gp+0x246/0x2b0
> [ 39.275965] Read of size 8 at addr 8803aa64f1

Re: RCU callback crashes

2017-12-20 Thread John Fastabend
On 12/19/2017 10:34 PM, Jakub Kicinski wrote:
> On Tue, 19 Dec 2017 22:22:27 -0800, Jakub Kicinski wrote:
>>>> I get this:
>>>
>>> Could you try to run it with kasan on?
>>
>> I didn't manage to reproduce it with KASAN on so far :( Even enabling
>> object debugging to get the second splat in

Re: RCU callback crashes

2017-12-19 Thread Jakub Kicinski
On Tue, 19 Dec 2017 22:22:27 -0800, Jakub Kicinski wrote:
> > >I get this:
> >
> > Could you try to run it with kasan on?
>
> I didn't manage to reproduce it with KASAN on so far :( Even enabling
> object debugging to get the second splat in my email (which is more
> useful) actually makes

Re: RCU callback crashes

2017-12-19 Thread Jakub Kicinski
On Wed, 20 Dec 2017 07:11:18 +0100, Jiri Pirko wrote:
> Wed, Dec 20, 2017 at 02:59:21AM CET, kubak...@wp.pl wrote:
> >Hi!
> >
> >If I run the netdevsim test long enough on a kernel with no debugging
>
> Just running tools/testing/selftests/bpf/test_offload.py?
Yes, like this:
while ./linux/to

Re: RCU callback crashes

2017-12-19 Thread Jiri Pirko
Wed, Dec 20, 2017 at 02:59:21AM CET, kubak...@wp.pl wrote:
>Hi!
>
>If I run the netdevsim test long enough on a kernel with no debugging
Just running tools/testing/selftests/bpf/test_offload.py?
>I get this:
Could you try to run it with kasan on?
>
>[ 1400.450124] BUG: unable to handle kernel

RCU callback crashes

2017-12-19 Thread Jakub Kicinski
Hi!
If I run the netdevsim test long enough on a kernel with no debugging
I get this:
[ 1400.450124] BUG: unable to handle kernel paging request at 00046474e552
[ 1400.458005] IP: 0x46474e552
[ 1400.461231] PGD 0 P4D 0
[ 1400.464150] Oops: 0010 [#1] PREEMPT SMP
[ 1400.468525] Modules linked