Re: [Patch net 00/13] net_sched: close the race between call_rcu() and cleanup_net()

2017-11-08 Thread Cong Wang
On Wed, Nov 8, 2017 at 6:26 PM, David Miller  wrote:
> From: Cong Wang 
> Date: Mon,  6 Nov 2017 13:47:17 -0800
>
>> This patchset tries to fix the race between call_rcu() and
>> cleanup_net() again. Without holding the netns refcnt the
>> tc_action_net_exit() in netns workqueue could be called before
>> filter destroy works in tc filter workqueue. This patchset
>> moves the netns refcnt from tc actions to tcf_exts, without
>> breaking per-netns tc actions.
>>
>> Patch 1 reverts the previous fix, patch 2 introduces two new
>> API's to help to address the bug and the rest patches switch
>> to the new API's. Please see each patch for details.
>>
>> I was not able to reproduce this bug, but now after adding
>> some delay in filter destroy work I manage to trigger the
>> crash. After this patchset, the crash is not reproducible
>> any more and the debugging printk's show the order is expected
>> too.
>>
>> Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc 
>> actions")
>> Reported-by: Lucas Bates 
>> Cc: Lucas Bates 
>> Cc: Jamal Hadi Salim 
>> Cc: Jiri Pirko 
>> Signed-off-by: Cong Wang 
>
> I have to say this was a lot of churn so late in the release
> cycle but I ended up pulling anyways.
>
> I cannot guarantee that I will be able to push this to Linus
> in time for 4.14-final.
>

Yeah, it is kinda too late. I'd suggest you to just push the revert
to Linus for 4.14 and defer the rest to 4.15, if it's doable...

Thanks!


Re: [Patch net 00/13] net_sched: close the race between call_rcu() and cleanup_net()

2017-11-08 Thread David Miller
From: Cong Wang 
Date: Mon,  6 Nov 2017 13:47:17 -0800

> This patchset tries to fix the race between call_rcu() and
> cleanup_net() again. Without holding the netns refcnt the
> tc_action_net_exit() in netns workqueue could be called before 
> filter destroy works in tc filter workqueue. This patchset
> moves the netns refcnt from tc actions to tcf_exts, without
> breaking per-netns tc actions.
> 
> Patch 1 reverts the previous fix, patch 2 introduces two new
> API's to help to address the bug and the rest patches switch
> to the new API's. Please see each patch for details.
> 
> I was not able to reproduce this bug, but now after adding
> some delay in filter destroy work I manage to trigger the
> crash. After this patchset, the crash is not reproducible
> any more and the debugging printk's show the order is expected
> too.
> 
> Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc 
> actions")
> Reported-by: Lucas Bates 
> Cc: Lucas Bates 
> Cc: Jamal Hadi Salim 
> Cc: Jiri Pirko 
> Signed-off-by: Cong Wang 

I have to say this was a lot of churn so late in the release
cycle but I ended up pulling anyways.

I cannot guarantee that I will be able to push this to Linus
in time for 4.14-final.

Thanks.


[Patch net 00/13] net_sched: close the race between call_rcu() and cleanup_net()

2017-11-06 Thread Cong Wang
This patchset tries to fix the race between call_rcu() and
cleanup_net() again. Without holding the netns refcnt the
tc_action_net_exit() in netns workqueue could be called before 
filter destroy works in tc filter workqueue. This patchset
moves the netns refcnt from tc actions to tcf_exts, without
breaking per-netns tc actions.

Patch 1 reverts the previous fix, patch 2 introduces two new
API's to help to address the bug and the rest patches switch
to the new API's. Please see each patch for details.

I was not able to reproduce this bug, but now after adding
some delay in filter destroy work I manage to trigger the
crash. After this patchset, the crash is not reproducible
any more and the debugging printk's show the order is expected
too.

Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions")
Reported-by: Lucas Bates 
Cc: Lucas Bates 
Cc: Jamal Hadi Salim 
Cc: Jiri Pirko 
Signed-off-by: Cong Wang 

Cong Wang (13):
  Revert "net_sched: hold netns refcnt for each action"
  net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
  cls_basic: use tcf_exts_get_net() before call_rcu()
  cls_bpf: use tcf_exts_get_net() before call_rcu()
  cls_cgroup: use tcf_exts_get_net() before call_rcu()
  cls_flow: use tcf_exts_get_net() before call_rcu()
  cls_flower: use tcf_exts_get_net() before call_rcu()
  cls_fw: use tcf_exts_get_net() before call_rcu()
  cls_matchall: use tcf_exts_get_net() before call_rcu()
  cls_route: use tcf_exts_get_net() before call_rcu()
  cls_rsvp: use tcf_exts_get_net() before call_rcu()
  cls_tcindex: use tcf_exts_get_net() before call_rcu()
  cls_u32: use tcf_exts_get_net() before call_rcu()

 include/net/act_api.h  |  4 +---
 include/net/pkt_cls.h  | 24 
 net/sched/act_api.c|  2 --
 net/sched/act_bpf.c|  2 +-
 net/sched/act_connmark.c   |  2 +-
 net/sched/act_csum.c   |  2 +-
 net/sched/act_gact.c   |  2 +-
 net/sched/act_ife.c|  2 +-
 net/sched/act_ipt.c|  4 ++--
 net/sched/act_mirred.c |  2 +-
 net/sched/act_nat.c|  2 +-
 net/sched/act_pedit.c  |  2 +-
 net/sched/act_police.c |  2 +-
 net/sched/act_sample.c |  2 +-
 net/sched/act_simple.c |  2 +-
 net/sched/act_skbedit.c|  2 +-
 net/sched/act_skbmod.c |  2 +-
 net/sched/act_tunnel_key.c |  2 +-
 net/sched/act_vlan.c   |  2 +-
 net/sched/cls_api.c|  1 +
 net/sched/cls_basic.c  | 20 +++-
 net/sched/cls_bpf.c|  7 ++-
 net/sched/cls_cgroup.c | 24 ++--
 net/sched/cls_flow.c   | 24 ++--
 net/sched/cls_flower.c | 16 +---
 net/sched/cls_fw.c | 17 ++---
 net/sched/cls_matchall.c   | 15 ---
 net/sched/cls_route.c  | 17 ++---
 net/sched/cls_rsvp.h   | 15 ---
 net/sched/cls_tcindex.c| 33 ++---
 net/sched/cls_u32.c|  8 +++-
 31 files changed, 198 insertions(+), 63 deletions(-)

-- 
2.13.0