On Mon, Sep 3, 2018 at 1:33 PM Vlad Buslov <vla...@mellanox.com> wrote:
>
>
> On Mon 03 Sep 2018 at 18:50, Cong Wang <xiyou.wangc...@gmail.com> wrote:
> > On Mon, Sep 3, 2018 at 12:06 AM Vlad Buslov <vla...@mellanox.com> wrote:
> >>
> >> Action API was changed to work with actions and action_idr in concurrency
> >> safe manner, however tcf_del_walker() still uses actions without taking
> >> reference to them first and deletes them directly, disregarding possible
> >> concurrent delete.
> >>
> >> Change tcf_del_walker() to use tcf_idr_delete_index() that doesn't require
> >> caller to hold reference to action and accepts action id as argument,
> >> instead of direct action pointer.
> >
> > Hmm, why doesn't tcf_del_walker() just take idrinfo->lock? At least
> > tcf_dump_walker() already does.
>
> Because tcf_del_walker() calls __tcf_idr_release(), which take
> idrinfo->lock itself (deadlock). It also calls sleeping functions like

Deadlock can be easily resolved by moving the lock out.

> tcf_action_goto_chain_fini(), so just implementing function that
> releases action without taking idrinfo->lock is not enough.

Sleeping can be resolved either by making it atomic or deferring it
to a work queue.

None of your arguments here is a blocker to locking
idrinfo->lock. You really should focus on whether it is really necessary
to lock idrinfo->lock in tcf_del_walker(), rather than on these details.

For me, if you need idrinfo->lock for the dump walker, you must need
it for the delete walker too, because deletion is a writer, which should
require stronger protection than the dumper, which is merely a reader.
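
To make the two suggestions concrete, here is a rough userspace sketch of
the pattern, not kernel code and not a proposed patch. Everything in it is
a hypothetical stand-in: struct action, struct idrinfo, action_add(),
action_cleanup() and del_walker() are invented for illustration, a pthread
mutex plays the role of idrinfo->lock, and a local list processed after
unlock plays the role of a work queue. The walker takes the lock once
("moving the lock out"), unlinks under it, and runs the part that may
sleep only after the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real net/sched structures. */
struct action {
	int index;
	int refcnt;
	struct action *next;
};

struct idrinfo {
	pthread_mutex_t lock;	/* plays the role of idrinfo->lock */
	struct action *head;
};

/* Insert an action holding one reference; returns 0 on success. */
static int action_add(struct idrinfo *info, int index)
{
	struct action *a = malloc(sizeof(*a));

	if (!a)
		return -1;
	a->index = index;
	a->refcnt = 1;
	pthread_mutex_lock(&info->lock);
	a->next = info->head;
	info->head = a;
	pthread_mutex_unlock(&info->lock);
	return 0;
}

/* Stand-in for cleanup that may sleep; must run without the lock held. */
static void action_cleanup(struct action *a)
{
	assert(a->refcnt == 0);
	free(a);
}

/*
 * Delete walker: the lock is taken once by the walker itself, so the
 * per-action release step does not need to take it again. Actions whose
 * last reference is dropped are unlinked under the lock and queued on a
 * local list; their cleanup is deferred until after unlock.
 * Returns the number of actions actually deleted.
 */
static int del_walker(struct idrinfo *info)
{
	struct action *deferred = NULL;
	struct action **pp;
	int n = 0;

	pthread_mutex_lock(&info->lock);
	pp = &info->head;
	while (*pp) {
		struct action *a = *pp;

		if (--a->refcnt == 0) {
			*pp = a->next;		/* unlink under the lock */
			a->next = deferred;	/* queue for later cleanup */
			deferred = a;
		} else {
			pp = &a->next;		/* still referenced, keep it */
		}
	}
	pthread_mutex_unlock(&info->lock);

	/* Deferred part: safe to sleep here, the lock is not held. */
	while (deferred) {
		struct action *a = deferred;

		deferred = a->next;
		action_cleanup(a);
		n++;
	}
	return n;
}
```

With three actions in the table and one of them holding an extra
reference, a first del_walker() call in this sketch deletes two and
leaves the referenced one in place; a second call removes it.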