Re: lost connection on dynamic IP
On Thu, 20 May 2021 00:28:08 +0200
Vicente Bergas wrote:

> There is a public IP assigned to the router. The IP is dynamic, so, it
> can change from time to time, but, once assigned, it is exclusive to
> the router.
> There is no carrier-grade NAT.
> I've configured the router to forward the wireguard port to the
> server, so, it is like the server is directly connected to the
> Internet.
> I think the PersistentKeepalive on the server side is not required. Is it?

I believe it is. Consider the case where the server's public IP has changed.
The server sends no keepalives. The client sends them to the server's old IP.
The whole thing breaks.

> So, what do you mean is that wireguard does a single DNS resolution at
> the beginning and further DNS resolutions need to be done elsewhere. Is
> that correct?

Yes.

-- 
With respect,
Roman
Re: [syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)
On 5/19/21 12:48 PM, Dmitry Vyukov wrote:
> On Wed, May 19, 2021 at 7:35 PM syzbot wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
>> git tree:       net
>> console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d0
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
>> dashboard link: https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+a70a6358abd2c3f95...@syzkaller.appspotmail.com
>>
>> BUG: MAX_LOCKDEP_KEYS too low!
>

include/linux/lockdep.h

#define MAX_LOCKDEP_KEYS_BITS           13
#define MAX_LOCKDEP_KEYS                (1UL << MAX_LOCKDEP_KEYS_BITS)

Documentation/locking/lockdep-design.rst:

Troubleshooting:

The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes.
Exceeding this number will trigger the following lockdep warning::

        (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))

By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
desktop systems have less than 1,000 lock classes, so this warning
normally results from lock-class leakage or failure to properly
initialize locks. These two problems are illustrated below:

>
> What config controls this? I don't see "MAX_LOCKDEP_KEYS too low" in
> any of the config descriptions...
> Here is what syzbot used:
>
> CONFIG_LOCKDEP=y
> CONFIG_LOCKDEP_BITS=16
> CONFIG_LOCKDEP_CHAINS_BITS=17
> CONFIG_LOCKDEP_STACK_TRACE_BITS=20
> CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
> CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
>
> We already bumped most of these.
> The log contains dump of the lockdep debug files, is there any offender?
>
> Also looking at the log I noticed a memory safety bug in lockdep
> implementation:

...

-- 
~Randy
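The arithmetic behind the define quoted above can be checked with a small Python sketch. It uses only the two values from the include/linux/lockdep.h excerpt; the helper name class_id_overflows is hypothetical and merely mirrors the `id >= MAX_LOCKDEP_KEYS` condition from the documentation excerpt.

```python
# Values copied from the include/linux/lockdep.h excerpt quoted above.
MAX_LOCKDEP_KEYS_BITS = 13
MAX_LOCKDEP_KEYS = 1 << MAX_LOCKDEP_KEYS_BITS  # C: (1UL << MAX_LOCKDEP_KEYS_BITS)


def class_id_overflows(class_id: int) -> bool:
    """Hypothetical helper mirroring the quoted check `id >= MAX_LOCKDEP_KEYS`."""
    return class_id >= MAX_LOCKDEP_KEYS


print(MAX_LOCKDEP_KEYS)          # 8192
print(class_id_overflows(8191))  # False: 8191 is the highest id that passes
print(class_id_overflows(8192))  # True: this id would trip the warning
```

The shift yields 8192 slots, that is ids 0 through 8191, which may be where the 8191 figure in the documentation comes from. The define as quoted uses a literal 13, so none of the CONFIG_LOCKDEP_* options listed in the thread feed into it.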
Re: [syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)
On Wed, May 19, 2021 at 9:58 PM Randy Dunlap wrote:
>
> On 5/19/21 12:48 PM, Dmitry Vyukov wrote:
> > On Wed, May 19, 2021 at 7:35 PM syzbot wrote:
> >>
> >> Hello,
> >>
> >> syzbot found the following issue on:
> >>
> >> HEAD commit:    b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
> >> git tree:       net
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d0
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f
> >>
> >> Unfortunately, I don't have any reproducer for this issue yet.
> >>
> >> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >> Reported-by: syzbot+a70a6358abd2c3f95...@syzkaller.appspotmail.com
> >>
> >> BUG: MAX_LOCKDEP_KEYS too low!
> >
>
> include/linux/lockdep.h
>
> #define MAX_LOCKDEP_KEYS_BITS           13
> #define MAX_LOCKDEP_KEYS                (1UL << MAX_LOCKDEP_KEYS_BITS)

Ouch, so it's not configurable yet :(
Unless, of course, we identify the offender that produced thousands of
lock classes in the log and fix it.

> Documentation/locking/lockdep-design.rst:
>
> Troubleshooting:
>
> The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes.
> Exceeding this number will trigger the following lockdep warning::
>
>         (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
>
> By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
> desktop systems have less than 1,000 lock classes, so this warning
> normally results from lock-class leakage or failure to properly
> initialize locks. These two problems are illustrated below:
>
> >
> > What config controls this? I don't see "MAX_LOCKDEP_KEYS too low" in
> > any of the config descriptions...
> > Here is what syzbot used:
> >
> > CONFIG_LOCKDEP=y
> > CONFIG_LOCKDEP_BITS=16
> > CONFIG_LOCKDEP_CHAINS_BITS=17
> > CONFIG_LOCKDEP_STACK_TRACE_BITS=20
> > CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
> > CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
> >
> > We already bumped most of these.
> > The log contains dump of the lockdep debug files, is there any offender?
> >
> > Also looking at the log I noticed a memory safety bug in lockdep
> > implementation:
>
> ...
>
> --
> ~Randy
Re: [syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)
On Wed, May 19, 2021 at 7:35 PM syzbot wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
> git tree:       net
> console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
> dashboard link: https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+a70a6358abd2c3f95...@syzkaller.appspotmail.com
>
> BUG: MAX_LOCKDEP_KEYS too low!

What config controls this? I don't see "MAX_LOCKDEP_KEYS too low" in
any of the config descriptions...
Here is what syzbot used:

CONFIG_LOCKDEP=y
CONFIG_LOCKDEP_BITS=16
CONFIG_LOCKDEP_CHAINS_BITS=17
CONFIG_LOCKDEP_STACK_TRACE_BITS=20
CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12

We already bumped most of these.
The log contains dump of the lockdep debug files, is there any offender?

Also looking at the log I noticed a memory safety bug in lockdep
implementation:

[ 2023.605505][ T6807] ==
[ 2023.613589][ T6807] BUG: KASAN: global-out-of-bounds in print_name+0x1b0/0x1d0
[ 2023.624553][ T6807] Read of size 8 at addr 90225cb0 by task cat/6807
[ 2023.631765][ T6807]
[ 2023.634096][ T6807] CPU: 1 PID: 6807 Comm: cat Not tainted 5.12.0-syzkaller #0
[ 2023.641488][ T6807] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 2023.651745][ T6807] Call Trace:
[ 2023.655031][ T6807]  dump_stack+0x141/0x1d7
[ 2023.659375][ T6807]  ? print_name+0x1b0/0x1d0
[ 2023.663890][ T6807]  print_address_description.constprop.0.cold+0x5/0x2f8
[ 2023.670895][ T6807]  ? print_name+0x1b0/0x1d0
[ 2023.675413][ T6807]  ? print_name+0x1b0/0x1d0
[ 2023.679948][ T6807]  kasan_report.cold+0x7c/0xd8
[ 2023.684725][ T6807]  ? print_name+0x1b0/0x1d0
[ 2023.689248][ T6807]  print_name+0x1b0/0x1d0
[ 2023.694196][ T6807]  ? lockdep_stats_show+0xa20/0xa20
[ 2023.699940][ T6807]  ? seq_file_path+0x30/0x30
[ 2023.704721][ T6807]  ? mutex_lock_io_nested+0xf70/0xf70
[ 2023.710118][ T6807]  ? lock_acquire+0x58a/0x740
[ 2023.715156][ T6807]  ? kasan_unpoison+0x3c/0x60
[ 2023.719843][ T6807]  lc_show+0x10a/0x210
[ 2023.723924][ T6807]  seq_read_iter+0xb66/0x1220
[ 2023.728617][ T6807]  proc_reg_read_iter+0x1fb/0x2d0
[ 2023.733651][ T6807]  new_sync_read+0x41e/0x6e0
[ 2023.738272][ T6807]  ? ksys_lseek+0x1b0/0x1b0
[ 2023.742784][ T6807]  ? lock_acquire+0x58a/0x740
[ 2023.747563][ T6807]  vfs_read+0x35c/0x570
[ 2023.751737][ T6807]  ksys_read+0x12d/0x250
[ 2023.756003][ T6807]  ? vfs_write+0xa30/0xa30
[ 2023.760429][ T6807]  ? syscall_enter_from_user_mode+0x27/0x70
[ 2023.766335][ T6807]  do_syscall_64+0x3a/0xb0
[ 2023.770764][ T6807]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2023.776685][ T6807] RIP: 0033:0x7f99856e2910
[ 2023.781104][ T6807] Code: b6 fe ff ff 48 8d 3d 0f be 08 00 48 83 ec 08 e8 06 db 01 00 66 0f 1f 44 00 00 83 3d f9 2d 2c 00 00 75 10 b8 00 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 de 9b 01 00 48 89 04 24
[ 2023.800719][ T6807] RSP: 002b:7ffee7328628 EFLAGS: 0246 ORIG_RAX:
[ 2023.809169][ T6807] RAX: ffda RBX: 0002 RCX: 7f99856e2910
[ 2023.817150][ T6807] RDX: 0002 RSI: 564290b2a000 RDI: 0003
[ 2023.825123][ T6807] RBP: 564290b2a000 R08: 0003 R09: 00021010
[ 2023.833107][ T6807] R10: 0002 R11: 0246 R12: 564290b2a000
[ 2023.841091][ T6807] R13: 0003 R14: 0002 R15: 1000
[ 2023.849074][ T6807]
[ 2023.851408][ T6807] The buggy address belongs to the variable:
[ 2023.857388][ T6807]  lock_classes_in_use+0x410/0x420
[ 2023.862510][ T6807]
[ 2023.864826][ T6807] Memory state around the buggy address:
[ 2023.870450][ T6807]  90225b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2023.878511][ T6807]  90225c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2023.886572][ T6807] >90225c80: 00 00 00 00 f9 f9 f9 f9 00 00 00 00 00 00 00 00
[ 2023.894628][ T6807] ^
[ 2023.900256][ T6807]  90225d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2023.908317][ T6807]  90225d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2023.916377][ T6807] ==

> turning off the locking correctness validator.
> CPU: 0 PID: 5917 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:79 [inline]
> dump_stack+0x141/0x1d7 lib/dump_stack.c:120
> register_lock_class.cold+0x14/0x19 ker
Re: lost connection on dynamic IP
On Tue, 18 May 2021 13:22:31 +0200
Vicente Bergas wrote:

> A server connected to the Internet through an ISP that provides a
> dynamic IP with NAT.

If it's NAT, then your server has no dedicated public IP? What do you update
to DNS, IP of the ISP's NAT pool (shared IP with many other customers)?

> I think the issue happens when the ISP on the server side shuts down
> the Internet connection for more than 1 hour! Then, it is restored
> with a new IP.
> inadyn detects the new IP and updates the DNS.
> At this point the Internet connection is operational again, but the
> client remains disconnected until rebooted.
>
> Is this scenario expected to work due to the "Built-in Roaming" ?

It might work, helped by PersistentKeepalive, and as long as the server and
the client don't change their IPs/ports *at the same time*.

To protect against that, or to improve resiliency in general (and assuming
there's actually no NAT at the server side after all), your client should
resolve the DNS record for the server periodically, and in case the IP
changed, call "wg set [interface] peer [key] endpoint [IP:port]".

-- 
With respect,
Roman
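The periodic re-resolution suggested above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the `wg` tool is on PATH and the script runs with enough privileges to call it; the function names, the 60-second interval, and the IPv4-only lookup are hypothetical choices, not something from the thread.

```python
import socket
import subprocess
import time


def resolve_ipv4(hostname):
    """Return the first IPv4 address the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_DGRAM)
    return infos[0][4][0]


def build_wg_command(interface, peer_key, ip, port):
    """Build the argument list for: wg set <interface> peer <key> endpoint <ip>:<port>."""
    return ["wg", "set", interface, "peer", peer_key, "endpoint", f"{ip}:{port}"]


def update_endpoint_once(interface, peer_key, hostname, port, last_ip,
                         resolve=resolve_ipv4, run=subprocess.run):
    """Resolve hostname; if the IP differs from last_ip, update the peer endpoint.

    Returns the IP now considered current (last_ip if resolution failed).
    """
    try:
        ip = resolve(hostname)
    except OSError:
        # DNS failure (socket.gaierror is an OSError): keep the old endpoint.
        return last_ip
    if ip != last_ip:
        run(build_wg_command(interface, peer_key, ip, port), check=True)
    return ip


def watch(interface, peer_key, hostname, port, interval=60):
    """Poll DNS forever, updating the endpoint whenever the address changes."""
    last_ip = None
    while True:
        last_ip = update_endpoint_once(interface, peer_key, hostname, port, last_ip)
        time.sleep(interval)
```

On the client this might be started as watch("wg0", "<server public key>", "domain.name.that.resolves.to.a.dynamic.ip", port_number), with the values taken from the client's wg0.conf. The first pass always issues one "wg set" call because no previous address is known yet.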
lost connection on dynamic IP
Hi,
I've got the following setup:

A server connected to the Internet through an ISP that provides a
dynamic IP with NAT.
The server keeps the DNS updated with https://github.com/troglobit/inadyn
A client on a tiny embedded board connects to the server by means of
its domain name.

Wireguard configurations are:

# server/etc/wireguard/wg0.conf
[Interface]
ListenPort = port_number
PrivateKey = ...
[Peer]
PublicKey = ...
PresharedKey = ...
PersistentKeepalive = 25
AllowedIPs = 10.0.0.2

# client/etc/wireguard/wg0.conf
[Interface]
PrivateKey = ...
[Peer]
PublicKey = ...
PresharedKey = ...
Endpoint = domain.name.that.resolves.to.a.dynamic.ip:port_number
PersistentKeepalive = 25
AllowedIPs = 10.0.0.1

The server almost never initiates communications towards the client.
The client sends one packet every minute towards the server.

I think the issue happens when the ISP on the server side shuts down
the Internet connection for more than 1 hour! Then, it is restored
with a new IP.
inadyn detects the new IP and updates the DNS.
At this point the Internet connection is operational again, but the
client remains disconnected until rebooted.

Is this scenario expected to work due to the "Built-in Roaming"?

Regards,
  Vicenç.
[syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)
Hello,

syzbot found the following issue on:

HEAD commit:    b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
git tree:       net
console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
dashboard link: https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a70a6358abd2c3f95...@syzkaller.appspotmail.com

BUG: MAX_LOCKDEP_KEYS too low!
turning off the locking correctness validator.
CPU: 0 PID: 5917 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 register_lock_class.cold+0x14/0x19 kernel/locking/lockdep.c:1281
 __lock_acquire+0x102/0x5230 kernel/locking/lockdep.c:4781
 lock_acquire kernel/locking/lockdep.c:5512 [inline]
 lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5477
 flush_workqueue+0x110/0x13e0 kernel/workqueue.c:2786
 drain_workqueue+0x1a5/0x3c0 kernel/workqueue.c:2951
 destroy_workqueue+0x71/0x800 kernel/workqueue.c:4382
 alloc_workqueue+0xc40/0xef0 kernel/workqueue.c:4343
 wg_newlink+0x43d/0x9e0 drivers/net/wireguard/device.c:335
 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3452
 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3500
 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5562
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:674
 sys_sendmsg+0x6e8/0x810 net/socket.c:2350
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665d9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:7fb25febe188 EFLAGS: 0246 ORIG_RAX: 002e
RAX: ffda RBX: 0056c0b0 RCX: 004665d9
RDX: RSI: 2080 RDI: 0005
RBP: 004bfcb9 R08: R09:
R10: R11: 0246 R12: 0056c0b0
R13: 7fff30a5021f R14: 7fb25febe300 R15: 00022000

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.