Re: lost connection on dynamic IP

2021-05-19 Thread Roman Mamedov
On Thu, 20 May 2021 00:28:08 +0200
Vicente Bergas  wrote:

> There is a public IP assigned to the router. The IP is dynamic, so, it
> can change from time to time, but, once assigned, it is exclusive to
> the router.
> There is no carrier-grade NAT.
> I've configured the router to forward the wireguard port to the
> server, so, it is like the server is directly connected to the
> Internet.
> I think the PersistentKeepalive on the server side is not required. Is it?

I believe it is. Consider what happens when the server's public IP changes: the
server sends no keepalives, the client keeps sending them to the server's old
IP, and the whole thing breaks.

> So, what you mean is that WireGuard does a single DNS resolution at
> the beginning and further DNS resolutions need to be done elsewhere. Is
> that correct?

Yes.

-- 
With respect,
Roman


Re: [syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)

2021-05-19 Thread Randy Dunlap
On 5/19/21 12:48 PM, Dmitry Vyukov wrote:
> On Wed, May 19, 2021 at 7:35 PM syzbot
>  wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
>> git tree:   net
>> console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d0
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
>> dashboard link: https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+a70a6358abd2c3f95...@syzkaller.appspotmail.com
>>
>> BUG: MAX_LOCKDEP_KEYS too low!
> 

include/linux/lockdep.h

#define MAX_LOCKDEP_KEYS_BITS   13
#define MAX_LOCKDEP_KEYS        (1UL << MAX_LOCKDEP_KEYS_BITS)

Documentation/locking/lockdep-design.rst:

Troubleshooting:
----------------

The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes.
Exceeding this number will trigger the following lockdep warning::

(DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))

By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
desktop systems have less than 1,000 lock classes, so this warning
normally results from lock-class leakage or failure to properly
initialize locks.  These two problems are illustrated below:
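
Sketching the arithmetic behind those numbers (nothing kernel-specific here;
the one-off gap between 8192 and the 8191 quoted in the docs presumably comes
from a reserved class ID, which is my reading rather than something the docs
state):

```shell
# MAX_LOCKDEP_KEYS_BITS is 13, so MAX_LOCKDEP_KEYS = 1 << 13 = 8192.
BITS=13
MAX_LOCKDEP_KEYS=$((1 << BITS))
echo "$MAX_LOCKDEP_KEYS"          # 8192
echo "$((MAX_LOCKDEP_KEYS - 1))"  # 8191, the figure quoted in the docs
```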

> 
> What config controls this? I don't see "MAX_LOCKDEP_KEYS too low" in
> any of the config descriptions...
> Here is what syzbot used:
> 
> CONFIG_LOCKDEP=y
> CONFIG_LOCKDEP_BITS=16
> CONFIG_LOCKDEP_CHAINS_BITS=17
> CONFIG_LOCKDEP_STACK_TRACE_BITS=20
> CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
> CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
> 
> We already bumped most of these.
> The log contains a dump of the lockdep debug files; is there any offender?
> 
> Also, looking at the log I noticed a memory safety bug in the lockdep 
> implementation:

...

-- 
~Randy



Re: [syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)

2021-05-19 Thread Dmitry Vyukov
On Wed, May 19, 2021 at 9:58 PM Randy Dunlap  wrote:
>
> On 5/19/21 12:48 PM, Dmitry Vyukov wrote:
> > On Wed, May 19, 2021 at 7:35 PM syzbot
> >  wrote:
> >>
> >> Hello,
> >>
> >> syzbot found the following issue on:
> >>
> >> HEAD commit:    b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
> >> git tree:   net
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d0
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
> >> dashboard link: 
> >> https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f
> >>
> >> Unfortunately, I don't have any reproducer for this issue yet.
> >>
> >> IMPORTANT: if you fix the issue, please add the following tag to the 
> >> commit:
> >> Reported-by: syzbot+a70a6358abd2c3f95...@syzkaller.appspotmail.com
> >>
> >> BUG: MAX_LOCKDEP_KEYS too low!
> >
>
> include/linux/lockdep.h
>
> #define MAX_LOCKDEP_KEYS_BITS   13
> #define MAX_LOCKDEP_KEYS        (1UL << MAX_LOCKDEP_KEYS_BITS)

Ouch, so it's not configurable yet :(
Unless, of course, we identify the offender that produced thousands of
lock classes in the log and fix it.


> Documentation/locking/lockdep-design.rst:
>
> Troubleshooting:
> ----------------
>
> The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes.
> Exceeding this number will trigger the following lockdep warning::
>
> (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
>
> By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
> desktop systems have less than 1,000 lock classes, so this warning
> normally results from lock-class leakage or failure to properly
> initialize locks.  These two problems are illustrated below:
>
> >
> > What config controls this? I don't see "MAX_LOCKDEP_KEYS too low" in
> > any of the config descriptions...
> > Here is what syzbot used:
> >
> > CONFIG_LOCKDEP=y
> > CONFIG_LOCKDEP_BITS=16
> > CONFIG_LOCKDEP_CHAINS_BITS=17
> > CONFIG_LOCKDEP_STACK_TRACE_BITS=20
> > CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
> > CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
> >
> > We already bumped most of these.
> > The log contains a dump of the lockdep debug files; is there any offender?
> >
> > Also, looking at the log I noticed a memory safety bug in the lockdep 
> > implementation:
>
> ...
>
> --
> ~Randy
>


Re: [syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)

2021-05-19 Thread Dmitry Vyukov
On Wed, May 19, 2021 at 7:35 PM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
> git tree:   net
> console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
> dashboard link: https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+a70a6358abd2c3f95...@syzkaller.appspotmail.com
>
> BUG: MAX_LOCKDEP_KEYS too low!


What config controls this? I don't see "MAX_LOCKDEP_KEYS too low" in
any of the config descriptions...
Here is what syzbot used:

CONFIG_LOCKDEP=y
CONFIG_LOCKDEP_BITS=16
CONFIG_LOCKDEP_CHAINS_BITS=17
CONFIG_LOCKDEP_STACK_TRACE_BITS=20
CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12

We already bumped most of these.
The log contains a dump of the lockdep debug files; is there any offender?

Also, looking at the log I noticed a memory safety bug in the lockdep implementation:

[ 2023.605505][ T6807]
==
[ 2023.613589][ T6807] BUG: KASAN: global-out-of-bounds in
print_name+0x1b0/0x1d0
[ 2023.624553][ T6807] Read of size 8 at addr 90225cb0 by task cat/6807
[ 2023.631765][ T6807]
[ 2023.634096][ T6807] CPU: 1 PID: 6807 Comm: cat Not tainted
5.12.0-syzkaller #0
[ 2023.641488][ T6807] Hardware name: Google Google Compute
Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 2023.651745][ T6807] Call Trace:
[ 2023.655031][ T6807]  dump_stack+0x141/0x1d7
[ 2023.659375][ T6807]  ? print_name+0x1b0/0x1d0
[ 2023.663890][ T6807]  print_address_description.constprop.0.cold+0x5/0x2f8
[ 2023.670895][ T6807]  ? print_name+0x1b0/0x1d0
[ 2023.675413][ T6807]  ? print_name+0x1b0/0x1d0
[ 2023.679948][ T6807]  kasan_report.cold+0x7c/0xd8
[ 2023.684725][ T6807]  ? print_name+0x1b0/0x1d0
[ 2023.689248][ T6807]  print_name+0x1b0/0x1d0
[ 2023.694196][ T6807]  ? lockdep_stats_show+0xa20/0xa20
[ 2023.699940][ T6807]  ? seq_file_path+0x30/0x30
[ 2023.704721][ T6807]  ? mutex_lock_io_nested+0xf70/0xf70
[ 2023.710118][ T6807]  ? lock_acquire+0x58a/0x740
[ 2023.715156][ T6807]  ? kasan_unpoison+0x3c/0x60
[ 2023.719843][ T6807]  lc_show+0x10a/0x210
[ 2023.723924][ T6807]  seq_read_iter+0xb66/0x1220
[ 2023.728617][ T6807]  proc_reg_read_iter+0x1fb/0x2d0
[ 2023.733651][ T6807]  new_sync_read+0x41e/0x6e0
[ 2023.738272][ T6807]  ? ksys_lseek+0x1b0/0x1b0
[ 2023.742784][ T6807]  ? lock_acquire+0x58a/0x740
[ 2023.747563][ T6807]  vfs_read+0x35c/0x570
[ 2023.751737][ T6807]  ksys_read+0x12d/0x250
[ 2023.756003][ T6807]  ? vfs_write+0xa30/0xa30
[ 2023.760429][ T6807]  ? syscall_enter_from_user_mode+0x27/0x70
[ 2023.766335][ T6807]  do_syscall_64+0x3a/0xb0
[ 2023.770764][ T6807]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2023.776685][ T6807] RIP: 0033:0x7f99856e2910
[ 2023.781104][ T6807] Code: b6 fe ff ff 48 8d 3d 0f be 08 00 48 83 ec
08 e8 06 db 01 00 66 0f 1f 44 00 00 83 3d f9 2d 2c 00 00 75 10 b8 00
00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 de 9b 01 00
48 89 04 24
[ 2023.800719][ T6807] RSP: 002b:7ffee7328628 EFLAGS: 0246
ORIG_RAX: 
[ 2023.809169][ T6807] RAX: ffda RBX: 0002
RCX: 7f99856e2910
[ 2023.817150][ T6807] RDX: 0002 RSI: 564290b2a000
RDI: 0003
[ 2023.825123][ T6807] RBP: 564290b2a000 R08: 0003
R09: 00021010
[ 2023.833107][ T6807] R10: 0002 R11: 0246
R12: 564290b2a000
[ 2023.841091][ T6807] R13: 0003 R14: 0002
R15: 1000
[ 2023.849074][ T6807]
[ 2023.851408][ T6807] The buggy address belongs to the variable:
[ 2023.857388][ T6807]  lock_classes_in_use+0x410/0x420
[ 2023.862510][ T6807]
[ 2023.864826][ T6807] Memory state around the buggy address:
[ 2023.870450][ T6807]  90225b80: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00
[ 2023.878511][ T6807]  90225c00: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00
[ 2023.886572][ T6807] >90225c80: 00 00 00 00 f9 f9 f9 f9 00
00 00 00 00 00 00 00
[ 2023.894628][ T6807]  ^
[ 2023.900256][ T6807]  90225d00: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00
[ 2023.908317][ T6807]  90225d80: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00
[ 2023.916377][ T6807]
==
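
As an aside, for anyone wanting to look for the offender themselves: the
lockdep debug files mentioned above can be inspected directly. This is a
sketch, assuming a kernel built with CONFIG_LOCKDEP (and CONFIG_DEBUG_LOCKDEP
for the stats file); the exact files available vary by kernel version, so the
reads are guarded:

```shell
# Sketch: gauging lock-class usage against MAX_LOCKDEP_KEYS.
if [ -r /proc/lockdep_stats ]; then
    # Shows counters such as "lock classes: N [max: 8192]".
    grep 'lock classes' /proc/lockdep_stats
fi
if [ -r /proc/lockdep ]; then
    # Lists the registered classes themselves; reading this file is what
    # the `cat` task in the KASAN report above was doing.
    wc -l /proc/lockdep
fi
```

A class name that repeats thousands of times in /proc/lockdep is the usual
sign of the lock-class leakage the documentation describes.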





> turning off the locking correctness validator.
> CPU: 0 PID: 5917 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
>  register_lock_class.cold+0x14/0x19 ker

Re: lost connection on dynamic IP

2021-05-19 Thread Roman Mamedov
On Tue, 18 May 2021 13:22:31 +0200
Vicente Bergas  wrote:

> A server connected to the Internet through an ISP that provides a
> dynamic IP with NAT.

If it's NAT, then your server has no dedicated public IP. What do you update
in DNS, then: the IP of the ISP's NAT pool (an IP shared with many other
customers)?

> I think the issue happens when the ISP on the server side shuts down
> the Internet connection for more than 1 hour! Then, it is restored
> with a new IP.
> inadyn detects the new IP and updates the DNS.
> At this point the Internet connection is operational again, but the
> client remains disconnected until rebooted.
>
> Is this scenario expected to work due to the "Built-in Roaming"?

It might work, helped by PersistentKeepalive, and as long as the server and the
client don't change their IPs/ports *at the same time*. To protect against
that, or to improve resiliency in general (and assuming there's actually no NAT
at the server side after all), your client should resolve the DNS record for
the server periodically, and in case the IP changed, call "wg set [interface]
peer [key] endpoint [IP:port]".
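
A minimal sketch of that periodic re-resolve loop, in shell. The interface
name, peer key, hostname, and port are all illustrative placeholders, and the
60-second interval is arbitrary:

```shell
#!/bin/sh
# Periodically re-resolve the server's DNS name and update the WireGuard
# endpoint if the IP has changed.
IFACE=wg0
PEER_KEY='PEER_PUBLIC_KEY_BASE64'
HOST=domain.name.that.resolves.to.a.dynamic.ip
PORT=51820

while sleep 60; do
    # Resolve the current IPv4 address (getent avoids needing dig/host).
    IP=$(getent ahostsv4 "$HOST" | awk 'NR==1 {print $1}')
    [ -n "$IP" ] || continue
    # Compare with the endpoint WireGuard currently uses for this peer.
    CUR=$(wg show "$IFACE" endpoints | awk -v k="$PEER_KEY" '$1==k {print $2}')
    if [ "$CUR" != "$IP:$PORT" ]; then
        wg set "$IFACE" peer "$PEER_KEY" endpoint "$IP:$PORT"
    fi
done
```

Running this from a systemd timer or cron on the client side covers the case
where both ends change IPs within the same window, which built-in roaming
alone cannot recover from.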

-- 
With respect,
Roman


lost connection on dynamic IP

2021-05-19 Thread Vicente Bergas
Hi, I've got the following setup:
A server connected to the Internet through an ISP that provides a
dynamic IP with NAT.
The server keeps the DNS updated with https://github.com/troglobit/inadyn
A client on a tiny embedded board connects to the server by means of
its domain name.
Wireguard configurations are:
# server/etc/wireguard/wg0.conf
[Interface]
ListenPort = port_number
PrivateKey = ...
[Peer]
PublicKey = ...
PresharedKey = ...
PersistentKeepalive = 25
AllowedIPs = 10.0.0.2

# client/etc/wireguard/wg0.conf
[Interface]
PrivateKey = ...
[Peer]
PublicKey = ...
PresharedKey = ...
Endpoint = domain.name.that.resolves.to.a.dynamic.ip:port_number
PersistentKeepalive = 25
AllowedIPs = 10.0.0.1

The server almost never initiates communications towards the client.
The client sends one packet every minute towards the server.

I think the issue happens when the ISP on the server side shuts down
the Internet connection for more than 1 hour! Then, it is restored
with a new IP.
inadyn detects the new IP and updates the DNS.
At this point the Internet connection is operational again, but the
client remains disconnected until rebooted.

Is this scenario expected to work due to the "Built-in Roaming"?

Regards,
  Vicenç.


[syzbot] BUG: MAX_LOCKDEP_KEYS too low! (2)

2021-05-19 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:    b81ac784 net: cdc_eem: fix URL to CDC EEM 1.0 spec
git tree:   net
console output: https://syzkaller.appspot.com/x/log.txt?x=15a257c3d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=5b86a12e0d1933b5
dashboard link: https://syzkaller.appspot.com/bug?extid=a70a6358abd2c3f9550f

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a70a6358abd2c3f95...@syzkaller.appspotmail.com

BUG: MAX_LOCKDEP_KEYS too low!
turning off the locking correctness validator.
CPU: 0 PID: 5917 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 register_lock_class.cold+0x14/0x19 kernel/locking/lockdep.c:1281
 __lock_acquire+0x102/0x5230 kernel/locking/lockdep.c:4781
 lock_acquire kernel/locking/lockdep.c:5512 [inline]
 lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5477
 flush_workqueue+0x110/0x13e0 kernel/workqueue.c:2786
 drain_workqueue+0x1a5/0x3c0 kernel/workqueue.c:2951
 destroy_workqueue+0x71/0x800 kernel/workqueue.c:4382
 alloc_workqueue+0xc40/0xef0 kernel/workqueue.c:4343
 wg_newlink+0x43d/0x9e0 drivers/net/wireguard/device.c:335
 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3452
 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3500
 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5562
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:674
 sys_sendmsg+0x6e8/0x810 net/socket.c:2350
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665d9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:7fb25febe188 EFLAGS: 0246 ORIG_RAX: 002e
RAX: ffda RBX: 0056c0b0 RCX: 004665d9
RDX:  RSI: 2080 RDI: 0005
RBP: 004bfcb9 R08:  R09: 
R10:  R11: 0246 R12: 0056c0b0
R13: 7fff30a5021f R14: 7fb25febe300 R15: 00022000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.