Re: WARNING: proc registration bug in clusterip_tg_check
On Wed, 2018-02-07 at 09:43 +0100, Paolo Abeni wrote: > On Tue, 2018-02-06 at 22:42 -0800, Cong Wang wrote: > > On Tue, Feb 6, 2018 at 6:27 AM, syzbot > > wrote: > > > Hello, > > > > > > syzbot hit the following crash on net-next commit > > > 617aebe6a97efa539cc4b8a52adccd89596e6be0 (Sun Feb 4 00:25:42 2018 +) > > > Merge tag 'usercopy-v4.16-rc1' of > > > git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux > > > > > > So far this crash happened 5 times on net-next, upstream. > > > C reproducer is attached. > > > syzkaller reproducer is attached. > > > Raw console output is attached. > > > compiler: gcc (GCC) 7.1.1 20170620 > > > .config is attached. > > > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > > Reported-by: syzbot+03218bcdba6aa7644...@syzkaller.appspotmail.com > > > It will help syzbot understand when the bug is fixed. See footer for > > > details. > > > If you forward the report, please keep this part and the footer. > > > > > > x_tables: ip_tables: osf match: only valid for protocol 6 > > > x_tables: ip_tables: osf match: only valid for protocol 6 > > > x_tables: ip_tables: osf match: only valid for protocol 6 > > > [ cut here ] > > > proc_dir_entry 'ipt_CLUSTERIP/172.20.0.170' already registered > > > WARNING: CPU: 1 PID: 4152 at fs/proc/generic.c:330 > > > proc_register+0x2a4/0x370 > > > fs/proc/generic.c:329 > > > Kernel panic - not syncing: panic_on_warn set ... > > > > > > CPU: 1 PID: 4152 Comm: syzkaller851476 Not tainted 4.15.0+ #221 > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > > Google 01/01/2011 > > > Call Trace: > > > __dump_stack lib/dump_stack.c:17 [inline] > > > dump_stack+0x194/0x257 lib/dump_stack.c:53 > > > panic+0x1e4/0x41c kernel/panic.c:183 > > > __warn+0x1dc/0x200 kernel/panic.c:547 > > > report_bug+0x211/0x2d0 lib/bug.c:184 > > > fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 > > > fixup_bug arch/x86/kernel/traps.c:247 [inline] > > > do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 > > > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 > > > invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1097 > > > RIP: 0010:proc_register+0x2a4/0x370 fs/proc/generic.c:329 > > > RSP: 0018:8801cbd6ee20 EFLAGS: 00010286 > > > RAX: dc08 RBX: 8801d2181038 RCX: 815a57ae > > > RDX: RSI: 1100397add74 RDI: 1100397add49 > > > RBP: 8801cbd6ee70 R08: 1100397add0b R09: > > > R10: 8801cbd6ecd8 R11: R12: 8801b2bb1cc0 > > > R13: dc00 R14: 8801b0d8dbc8 R15: 8801b2bb1d81 > > > proc_create_data+0xf8/0x180 fs/proc/generic.c:494 > > > clusterip_config_init net/ipv4/netfilter/ipt_CLUSTERIP.c:250 [inline] > > > > I think there is probably a race condition between > > clusterip_config_entry_put() > > and clusterip_config_init(), after we release the spinlock, a new proc > > with the same IP could be created therefore triggers this warning > > > > I am not sure if it is enough to just move the proc_remove() under > > spinlock... > > I *think* we should change the order on proc fs entry creation, > because clusterip_config_init() can race with itself, > clusterip_config_init() returns NULL if the clusterip_config_init has > no pte, and currently such entry is inserted into the list with NULL > pte and the list lock itself is released before creating the PTE. I was wrong. My suggested fix does not work at all. I tried your code and it fixes the issue here. Feel free to submit with: Tested-by: Paolo Abeni Thank you, Paolo
Re: WARNING: proc registration bug in clusterip_tg_check
Paolo Abeni wrote: [ pruning CC list ] > #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master > > I can't reproduce the issue locally, so asking the syzbot to test the > tentive fix for me (and hoping I did not mess with the tag/format) I can reproduce it. CLUSTERIP has multiple other bugs that need to be fixed, I'll look into this asap.
Re: WARNING: proc registration bug in clusterip_tg_check
You dropped syzbot from CC ;) Add syzbot+03218bcdba6aa7644...@syzkaller.appspotmail.com to To or CC. On Wed, Feb 7, 2018 at 11:42 AM, syzbot wrote: >> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git >> master > > > Can't find the corresponding bug. > > > >> I can't reproduce the issue locally, so asking the syzbot to test the >> tentive fix for me (and hoping I did not mess with the tag/format) > > >> --- >> net/ipv4/netfilter/ipt_CLUSTERIP.c | 30 +++--- >> 1 file changed, 15 insertions(+), 15 deletions(-) > > >> diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c >> b/net/ipv4/netfilter/ipt_CLUSTERIP.c >> index 3a84a60f6b39..db103cd971a9 100644 >> --- a/net/ipv4/netfilter/ipt_CLUSTERIP.c >> +++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c >> @@ -230,17 +230,6 @@ clusterip_config_init(struct net *net, const struct >> ipt_clusterip_tgt_info *i, >> refcount_set(&c->refcount, 1); >> refcount_set(&c->entries, 1); > > >> - spin_lock_bh(&cn->lock); >> - if (__clusterip_config_find(net, ip)) { >> - spin_unlock_bh(&cn->lock); >> - kfree(c); >> - >> - return ERR_PTR(-EBUSY); >> - } >> - >> - list_add_rcu(&c->list, &cn->configs); >> - spin_unlock_bh(&cn->lock); >> - >> #ifdef CONFIG_PROC_FS >> { >> char buffer[16]; >> @@ -257,20 +246,31 @@ clusterip_config_init(struct net *net, const struct >> ipt_clusterip_tgt_info *i, >> } >> #endif > > >> + spin_lock_bh(&cn->lock); >> + if (__clusterip_config_find(net, ip)) { >> + spin_unlock_bh(&cn->lock); >> + err = -EBUSY; >> + goto err_remove_pte: >> + } >> + >> + list_add_rcu(&c->list, &cn->configs); >> + spin_unlock_bh(&cn->lock); >> + >> c->notifier.notifier_call = clusterip_netdev_event; >> err = register_netdevice_notifier(&c->notifier); >> if (!err) >> return c; > > >> + spin_lock_bh(&cn->lock); >> + list_del_rcu(&c->list); >> + spin_unlock_bh(&cn->lock); >> + >> +err_remove_pte: >> #ifdef CONFIG_PROC_FS >> proc_remove(c->pde); >> err: >> #endif >> - spin_lock_bh(&cn->lock); >> - list_del_rcu(&c->list); >> - spin_unlock_bh(&cn->lock); >> kfree(c); >> - >> return ERR_PTR(err); >> } > > >> -- >> 2.14.3 > > >> -- >> You received this message because you are subscribed to the Google Groups >> "syzkaller-bugs" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to syzkaller-bugs+unsubscr...@googlegroups.com. >> To view this discussion on the web visit >> https://groups.google.com/d/msgid/syzkaller-bugs/945c8517a87c671825b61223088064ea2ad0a8cb.1517999262.git.pabeni%40redhat.com. >> For more options, visit https://groups.google.com/d/optout. > > > -- > You received this message because you are subscribed to the Google Groups > "syzkaller-bugs" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to syzkaller-bugs+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/syzkaller-bugs/001a114372a68e749405649cf352%40google.com. > > For more options, visit https://groups.google.com/d/optout.
Re: WARNING: proc registration bug in clusterip_tg_check
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master Can't find the corresponding bug. I can't reproduce the issue locally, so asking the syzbot to test the tentive fix for me (and hoping I did not mess with the tag/format) --- net/ipv4/netfilter/ipt_CLUSTERIP.c | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c index 3a84a60f6b39..db103cd971a9 100644 --- a/net/ipv4/netfilter/ipt_CLUSTERIP.c +++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c @@ -230,17 +230,6 @@ clusterip_config_init(struct net *net, const struct ipt_clusterip_tgt_info *i, refcount_set(&c->refcount, 1); refcount_set(&c->entries, 1); - spin_lock_bh(&cn->lock); - if (__clusterip_config_find(net, ip)) { - spin_unlock_bh(&cn->lock); - kfree(c); - - return ERR_PTR(-EBUSY); - } - - list_add_rcu(&c->list, &cn->configs); - spin_unlock_bh(&cn->lock); - #ifdef CONFIG_PROC_FS { char buffer[16]; @@ -257,20 +246,31 @@ clusterip_config_init(struct net *net, const struct ipt_clusterip_tgt_info *i, } #endif + spin_lock_bh(&cn->lock); + if (__clusterip_config_find(net, ip)) { + spin_unlock_bh(&cn->lock); + err = -EBUSY; + goto err_remove_pte: + } + + list_add_rcu(&c->list, &cn->configs); + spin_unlock_bh(&cn->lock); + c->notifier.notifier_call = clusterip_netdev_event; err = register_netdevice_notifier(&c->notifier); if (!err) return c; + spin_lock_bh(&cn->lock); + list_del_rcu(&c->list); + spin_unlock_bh(&cn->lock); + +err_remove_pte: #ifdef CONFIG_PROC_FS proc_remove(c->pde); err: #endif - spin_lock_bh(&cn->lock); - list_del_rcu(&c->list); - spin_unlock_bh(&cn->lock); kfree(c); - return ERR_PTR(err); } -- 2.14.3 -- You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/945c8517a87c671825b61223088064ea2ad0a8cb.1517999262.git.pabeni%40redhat.com. For more options, visit https://groups.google.com/d/optout.
Re: WARNING: proc registration bug in clusterip_tg_check
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master I can't reproduce the issue locally, so asking the syzbot to test the tentive fix for me (and hoping I did not mess with the tag/format) --- net/ipv4/netfilter/ipt_CLUSTERIP.c | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c index 3a84a60f6b39..db103cd971a9 100644 --- a/net/ipv4/netfilter/ipt_CLUSTERIP.c +++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c @@ -230,17 +230,6 @@ clusterip_config_init(struct net *net, const struct ipt_clusterip_tgt_info *i, refcount_set(&c->refcount, 1); refcount_set(&c->entries, 1); - spin_lock_bh(&cn->lock); - if (__clusterip_config_find(net, ip)) { - spin_unlock_bh(&cn->lock); - kfree(c); - - return ERR_PTR(-EBUSY); - } - - list_add_rcu(&c->list, &cn->configs); - spin_unlock_bh(&cn->lock); - #ifdef CONFIG_PROC_FS { char buffer[16]; @@ -257,20 +246,31 @@ clusterip_config_init(struct net *net, const struct ipt_clusterip_tgt_info *i, } #endif + spin_lock_bh(&cn->lock); + if (__clusterip_config_find(net, ip)) { + spin_unlock_bh(&cn->lock); + err = -EBUSY; + goto err_remove_pte: + } + + list_add_rcu(&c->list, &cn->configs); + spin_unlock_bh(&cn->lock); + c->notifier.notifier_call = clusterip_netdev_event; err = register_netdevice_notifier(&c->notifier); if (!err) return c; + spin_lock_bh(&cn->lock); + list_del_rcu(&c->list); + spin_unlock_bh(&cn->lock); + +err_remove_pte: #ifdef CONFIG_PROC_FS proc_remove(c->pde); err: #endif - spin_lock_bh(&cn->lock); - list_del_rcu(&c->list); - spin_unlock_bh(&cn->lock); kfree(c); - return ERR_PTR(err); } -- 2.14.3
Re: WARNING: proc registration bug in clusterip_tg_check
On Tue, 2018-02-06 at 22:42 -0800, Cong Wang wrote: > On Tue, Feb 6, 2018 at 6:27 AM, syzbot > wrote: > > Hello, > > > > syzbot hit the following crash on net-next commit > > 617aebe6a97efa539cc4b8a52adccd89596e6be0 (Sun Feb 4 00:25:42 2018 +) > > Merge tag 'usercopy-v4.16-rc1' of > > git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux > > > > So far this crash happened 5 times on net-next, upstream. > > C reproducer is attached. > > syzkaller reproducer is attached. > > Raw console output is attached. > > compiler: gcc (GCC) 7.1.1 20170620 > > .config is attached. > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > Reported-by: syzbot+03218bcdba6aa7644...@syzkaller.appspotmail.com > > It will help syzbot understand when the bug is fixed. See footer for > > details. > > If you forward the report, please keep this part and the footer. > > > > x_tables: ip_tables: osf match: only valid for protocol 6 > > x_tables: ip_tables: osf match: only valid for protocol 6 > > x_tables: ip_tables: osf match: only valid for protocol 6 > > [ cut here ] > > proc_dir_entry 'ipt_CLUSTERIP/172.20.0.170' already registered > > WARNING: CPU: 1 PID: 4152 at fs/proc/generic.c:330 proc_register+0x2a4/0x370 > > fs/proc/generic.c:329 > > Kernel panic - not syncing: panic_on_warn set ... > > > > CPU: 1 PID: 4152 Comm: syzkaller851476 Not tainted 4.15.0+ #221 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:17 [inline] > > dump_stack+0x194/0x257 lib/dump_stack.c:53 > > panic+0x1e4/0x41c kernel/panic.c:183 > > __warn+0x1dc/0x200 kernel/panic.c:547 > > report_bug+0x211/0x2d0 lib/bug.c:184 > > fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 > > fixup_bug arch/x86/kernel/traps.c:247 [inline] > > do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 > > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 > > invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1097 > > RIP: 0010:proc_register+0x2a4/0x370 fs/proc/generic.c:329 > > RSP: 0018:8801cbd6ee20 EFLAGS: 00010286 > > RAX: dc08 RBX: 8801d2181038 RCX: 815a57ae > > RDX: RSI: 1100397add74 RDI: 1100397add49 > > RBP: 8801cbd6ee70 R08: 1100397add0b R09: > > R10: 8801cbd6ecd8 R11: R12: 8801b2bb1cc0 > > R13: dc00 R14: 8801b0d8dbc8 R15: 8801b2bb1d81 > > proc_create_data+0xf8/0x180 fs/proc/generic.c:494 > > clusterip_config_init net/ipv4/netfilter/ipt_CLUSTERIP.c:250 [inline] > > I think there is probably a race condition between > clusterip_config_entry_put() > and clusterip_config_init(), after we release the spinlock, a new proc > with the same IP could be created therefore triggers this warning > > I am not sure if it is enough to just move the proc_remove() under > spinlock... I *think* we should change the order on proc fs entry creation, because clusterip_config_init() can race with itself, clusterip_config_init() returns NULL if the clusterip_config_init has no pte, and currently such entry is inserted into the list with NULL pte and the list lock itself is released before creating the PTE. I'll try to test something the following: --- diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c index 3a84a60f6b39..d8807c44cc61 100644 --- a/net/ipv4/netfilter/ipt_CLUSTERIP.c +++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c @@ -230,17 +230,6 @@ clusterip_config_init(struct net *net, const struct ipt_clusterip_tgt_info *i, refcount_set(&c->refcount, 1); refcount_set(&c->entries, 1); - spin_lock_bh(&cn->lock); - if (__clusterip_config_find(net, ip)) { - spin_unlock_bh(&cn->lock); - kfree(c); - - return ERR_PTR(-EBUSY); - } - - list_add_rcu(&c->list, &cn->configs); - spin_unlock_bh(&cn->lock); - #ifdef CONFIG_PROC_FS { char buffer[16]; @@ -257,6 +246,18 @@ clusterip_config_init(struct net *net, const struct ipt_clusterip_tgt_info *i, } #endif + spin_lock_bh(&cn->lock); + if (__clusterip_config_find(net, ip)) { + spin_unlock_bh(&cn->lock); + kfree(c); + + proc_remove(c->pde); + return ERR_PTR(-EBUSY); + } + + list_add_rcu(&c->list, &cn->configs); + spin_unlock_bh(&cn->lock); + c->notifier.notifier_call = clusterip_netdev_event; err = register_netdevice_notifier(&c->notifier); if (!err) --- Cheers, Paolo
Re: WARNING: proc registration bug in clusterip_tg_check
On Tue, Feb 6, 2018 at 6:27 AM, syzbot wrote: > Hello, > > syzbot hit the following crash on net-next commit > 617aebe6a97efa539cc4b8a52adccd89596e6be0 (Sun Feb 4 00:25:42 2018 +) > Merge tag 'usercopy-v4.16-rc1' of > git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux > > So far this crash happened 5 times on net-next, upstream. > C reproducer is attached. > syzkaller reproducer is attached. > Raw console output is attached. > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached. > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+03218bcdba6aa7644...@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed. See footer for > details. > If you forward the report, please keep this part and the footer. > > x_tables: ip_tables: osf match: only valid for protocol 6 > x_tables: ip_tables: osf match: only valid for protocol 6 > x_tables: ip_tables: osf match: only valid for protocol 6 > [ cut here ] > proc_dir_entry 'ipt_CLUSTERIP/172.20.0.170' already registered > WARNING: CPU: 1 PID: 4152 at fs/proc/generic.c:330 proc_register+0x2a4/0x370 > fs/proc/generic.c:329 > Kernel panic - not syncing: panic_on_warn set ... > > CPU: 1 PID: 4152 Comm: syzkaller851476 Not tainted 4.15.0+ #221 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:17 [inline] > dump_stack+0x194/0x257 lib/dump_stack.c:53 > panic+0x1e4/0x41c kernel/panic.c:183 > __warn+0x1dc/0x200 kernel/panic.c:547 > report_bug+0x211/0x2d0 lib/bug.c:184 > fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 > fixup_bug arch/x86/kernel/traps.c:247 [inline] > do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 > invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1097 > RIP: 0010:proc_register+0x2a4/0x370 fs/proc/generic.c:329 > RSP: 0018:8801cbd6ee20 EFLAGS: 00010286 > RAX: dc08 RBX: 8801d2181038 RCX: 815a57ae > RDX: RSI: 1100397add74 RDI: 1100397add49 > RBP: 8801cbd6ee70 R08: 1100397add0b R09: > R10: 8801cbd6ecd8 R11: R12: 8801b2bb1cc0 > R13: dc00 R14: 8801b0d8dbc8 R15: 8801b2bb1d81 > proc_create_data+0xf8/0x180 fs/proc/generic.c:494 > clusterip_config_init net/ipv4/netfilter/ipt_CLUSTERIP.c:250 [inline] I think there is probably a race condition between clusterip_config_entry_put() and clusterip_config_init(), after we release the spinlock, a new proc with the same IP could be created therefore triggers this warning I am not sure if it is enough to just move the proc_remove() under spinlock... diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c index 3a84a60f6b39..1ff72b87a066 100644 --- a/net/ipv4/netfilter/ipt_CLUSTERIP.c +++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c @@ -107,12 +107,6 @@ clusterip_config_entry_put(struct net *net, struct clusterip_config *c) local_bh_disable(); if (refcount_dec_and_lock(&c->entries, &cn->lock)) { - list_del_rcu(&c->list); - spin_unlock(&cn->lock); - local_bh_enable(); - - unregister_netdevice_notifier(&c->notifier); - /* In case anyone still accesses the file, the open/close * functions are also incrementing the refcount on their own, * so it's safe to remove the entry even if it's in use. */ @@ -120,6 +114,12 @@ clusterip_config_entry_put(struct net *net, struct clusterip_config *c) if (cn->procdir) proc_remove(c->pde); #endif + list_del_rcu(&c->list); + spin_unlock(&cn->lock); + local_bh_enable(); + + unregister_netdevice_notifier(&c->notifier); + return; } local_bh_enable(); > clusterip_tg_check+0xf9c/0x16d0 net/ipv4/netfilter/ipt_CLUSTERIP.c:488 > xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:850 > check_target net/ipv4/netfilter/ip_tables.c:513 [inline] > find_check_entry.isra.8+0x8c8/0xcb0 net/ipv4/netfilter/ip_tables.c:554 > translate_table+0xed1/0x1610 net/ipv4/netfilter/ip_tables.c:725 > do_replace net/ipv4/netfilter/ip_tables.c:1141 [inline] > do_ipt_set_ctl+0x370/0x5f0 net/ipv4/netfilter/ip_tables.c:1675 > nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] > nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115 > ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1259 > sctp_setsockopt+0x2b6/0x61d0 net/sctp/socket.c:4104 > sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2975 > SYSC_setsockopt net/socket.c:1849 [inline] > SyS_setsockopt+0x189/0x360 net/socket.c:1828 > entry_SYSCALL_64_fastpath+0x29/0xa0 > RIP: 0033:0x446839 > RSP: 002b:7f0309d0fdb8 EFLAGS: 0246 ORIG_RAX: 0036 > RAX: ffda RBX: 00