On 7/17/20 5:37 PM, Konstantin Khorenko wrote:
> There could be a deadlock:
> 
> 1. "nft" process calls request_module() with
> nfnl_lock(NFNL_SUBSYS_NFTABLES) taken and waits for its completion:
> 
> [UN]  PID: 7391   TASK: ffff939e154bc000  COMMAND: "nft"
> wait_for_completion_killable
>  __call_usermodehelper_exec
>   __request_module
>    nf_logger_find_get
>     nft_log_init              // ops->init()
>      nf_tables_newexpr
>       nf_tables_newrule               // called with nfnl_lock taken
>        nfnetlink_rcv_batch    // takes nfnl_lock
>         nfnetlink_rcv
>          netlink_unicast
>           netlink_sendmsg
> 
> 2. "modprobe", spawned by request_module(), tries to load the module, but waits on "net_mutex":
> 
> [UN]  PID: 8913   TASK: ffff939e140ec000  COMMAND: "modprobe"
> __mutex_lock_slowpath
>  mutex_lock                   // mutex_lock(&net_mutex);
>  (owner = 0xffff939e315f2000 == PID: 158  COMMAND: "kworker/u64:1")
>    register_pernet_subsys
>     init_module
>      do_one_initcall
>       load_module
>        sys_init_module
> 
> 3. "kworker" holds net_mutex, but itself waits for nfnl_lock:
> 
> [UN]  PID: 158    TASK: ffff939e315f2000  COMMAND: "kworker/u64:1"
> __mutex_lock_slowpath
>  mutex_lock                   // nfnl_lock(NFNL_SUBSYS_NFTABLES)
>  (owner = 0xffff939e154bc000  PID: 7391  COMMAND: "nft")
>   nfnl_lock
>    nft_unregister_afinfo
>     nf_tables_inet_exit_net
>      ops_exit_list
>       cleanup_net
>        process_one_work
> 
> Reproducer:
> ===========
> [console 1]: export i=0; while true; do
>       nft add table filter; \
>       nft add chain filter input; \
>       nft add rule filter input tcp dport 23 ct state new \
>               log prefix \"Connection to port 23:\" accept; \
>       nft flush ruleset; \
>       rmmod nft_log; rmmod nf_log_ipv4; rmmod nf_log_common; \
>       i=$(($i+1)); echo $i; \
> done
> 
> [console 2]: modprobe nf_tables; \
>       export j=0; while true; do \
>               ip net add n1; \
>               ip net del n1; \
>               j=$(($j + 1)); echo $j; \
>       done
> 
> In mainline this situation is impossible: it was fixed as a side effect of
> f102d66b335a4 ("netfilter: nf_tables: use dedicated mutex to guard
> transactions"), but backporting that commit is quite heavy.
> 
> So let's just drop and reacquire the nfnl lock around request_module(), as is
> already done in similar places.
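
For readers following along, the drop/reacquire idea can be sketched in
userspace C. This is only a model, not the patch itself: the model_* names,
the pthread mutex standing in for nfnl_lock(NFNL_SUBSYS_NFTABLES), and the
logger_loaded flag are all assumptions made for illustration. The -EAGAIN
return mirrors how such call sites usually ask the caller to retry after the
lock has been dropped, which I assume is what the patch does as well.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the nfnl subsystem mutex. */
static pthread_mutex_t model_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool model_held;     /* debug aid, valid for this single-threaded sketch only */
static bool logger_loaded;  /* protected by model_mutex */

static void model_nfnl_lock(void)
{
	pthread_mutex_lock(&model_mutex);
	model_held = true;
}

static void model_nfnl_unlock(void)
{
	model_held = false;
	pthread_mutex_unlock(&model_mutex);
}

/*
 * Stand-in for request_module(): the real call sleeps until modprobe
 * finishes, and the module's init may need other locks, so it must
 * never run with the subsystem lock held.
 */
static void model_request_module(void)
{
	assert(!model_held && "module load requested with the lock held");
	model_nfnl_lock();      /* "module init" registers the logger under the lock */
	logger_loaded = true;
	model_nfnl_unlock();
}

/* Called with the lock held; returns with the lock held. */
static int model_logger_find_get(void)
{
	if (logger_loaded)
		return 0;

	model_nfnl_unlock();    /* drop the lock around the blocking call */
	model_request_module();
	model_nfnl_lock();      /* reacquire before touching shared state */

	/*
	 * Anything may have changed while the lock was dropped, so do not
	 * continue with stale state: re-check and ask the caller to retry.
	 */
	return logger_loaded ? -EAGAIN : -ENOENT;
}
```

A caller would take the lock, call model_logger_find_get(), and restart the
whole operation on -EAGAIN; the second attempt then finds the logger without
dropping the lock again.
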
> 
> https://jira.sw.ru/browse/PSBM-105534
> 
> Signed-off-by: Konstantin Khorenko <khore...@virtuozzo.com>
> 

Reviewed-by: Andrey Ryabinin <aryabi...@virtuozzo.com>
