Re: general protection fault in j1939_netdev_notify (2)
On Mon, Dec 21, 2020 at 01:25:47PM +0100, Oliver Hartkopp wrote:
>
>
> On 20.12.20 15:37, Oleksij Rempel wrote:
> > Hello Oliver,
> >
> > On Sun, Dec 20, 2020 at 02:18:27PM +0100, Oliver Hartkopp wrote:
> > > Hello Oleksij,
> > >
> > > I assume there is some ndev->ml_priv value set - but not from a CAN
> > > netdevice.
> >
> > it is kind of CAN device :)
>
> No, it is not.
>
> Team and bonding devices copy elements like dev->type but do not take care
> about the CAN specific ml_priv.
>
> I don't know if this is the case here. I can take a look later.

ok

> > > What was the reason to fiddle with the 'priv' stuff in
> > > j1939_netdev_notify()
> > > before checking if it was a CAN device?
> > >
> > > Would this patch fix the issue then?
> >
> > No, j1939_priv_get_by_ndev() already has an internal test for
> > ARPHRD_CAN. One of this tests can be removed, to make the code clear.
> > So, we get netdev with ARPHRD_CAN and ml_priv == something.
> >
> > Right now I do not know how to fix it.
> >
> > Ideas?
>
> IMO the patch is still an improvement as it swaps the testing and reduces
> complexity.

ack. I'm ok with this patch.

> Regards,
> Oliver
>
> >
> > > diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c
> > > index bb914d8b4216..6940f98b81fb 100644
> > > --- a/net/can/j1939/main.c
> > > +++ b/net/can/j1939/main.c
> > > @@ -348,26 +348,25 @@ static int j1939_netdev_notify(struct notifier_block *nb,
> > >  			       unsigned long msg, void *data)
> > >  {
> > >  	struct net_device *ndev = netdev_notifier_info_to_dev(data);
> > >  	struct j1939_priv *priv;
> > >
> > > +	if (ndev->type != ARPHRD_CAN)
> > > +		goto notify_done;
> > > +
> > >  	priv = j1939_priv_get_by_ndev(ndev);
> > >  	if (!priv)
> > >  		goto notify_done;
> > >
> > > -	if (ndev->type != ARPHRD_CAN)
> > > -		goto notify_put;
> > > -
> > >  	switch (msg) {
> > >  	case NETDEV_DOWN:
> > >  		j1939_cancel_active_session(priv, NULL);
> > >  		j1939_sk_netdev_event_netdown(priv);
> > >  		j1939_ecu_unmap_all(priv);
> > >  		break;
> > >  	}
> > >
> > > -notify_put:
> > >  	j1939_priv_put(priv);
> > >
> > >  notify_done:
> > >  	return NOTIFY_DONE;
> > >  }
> > >
> > > If so, I can send a proper patch if you like.
> > >
> > > Best regards,
> > > Oliver
> > >
> > >
> > > On 20.12.20 06:34, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    d635a69d Merge tag 'net-next-5.11' of
> > > > git://git.kernel.org..
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1315f12350
> > > > kernel config:
> > > > https://syzkaller.appspot.com/x/.config?x=c3556e4856b17a95
> > > > dashboard link:
> > > > https://syzkaller.appspot.com/bug?extid=5138c4dd15a0401bec7b
> > > > compiler:       clang version 11.0.0
> > > > (https://github.com/llvm/llvm-project.git
> > > > ca2dcbd030eadbf0aa9b660efe864ff08af6e18b)
> > > > syz repro:
> > > > https://syzkaller.appspot.com/x/repro.syz?x=1295512350
> > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10f2f30f50
> > > >
> > > > The issue was bisected to:
> > > >
> > > > commit 497a5757ce4e8f37219a3989ac6a561eb9a8e6c7
> > > > Author: Heiner Kallweit
> > > > Date:   Sat Nov 7 20:50:56 2020 +
> > > >
> > > >     tun: switch to net core provided statistics counters
> > > >
> > > > bisection log:
> > > > https://syzkaller.appspot.com/x/bisect.txt?x=143b845b50
> > > > final oops:
> > > > https://syzkaller.appspot.com/x/report.txt?x=163b845b50
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=123b845b50
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the
> > > > commit:
> > > > Reported-by: syzbot+5138c4dd15a0401be...@syzkaller.appspotmail.com
> > > > Fixes: 497a5757ce4e ("tun: switch to net core provided statistics
> > > > counters")
> > > >
> > > > general protection fault, probably for non-canonical address
> > > > 0xe80fe8c072f1: [#1] PREEMPT SMP KASAN
> > > > KASAN: probably user-memory-access in range
> > > > [0x607f46039788-0x607f4603978f]
> > > > CPU: 1 PID: 8472 Comm: syz-executor635 Not tainted 5.10.0-syzkaller #0
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > Google 01/01/2011
> > > > RIP: 0010:j1939_ndev_to_priv net/can/j1939/main.c:219 [inline]
> > > > RIP: 0010:j1939_priv_get_by_ndev_locked net/can/j1939/main.c:231
> > > > [inline]
> > > > RIP: 0010:j1939_priv_get_by_ndev net/can/j1939/main.c:243 [inline]
> > > > RIP: 0010:j1939_netdev_notify+0x115/0x320 net/can/j1939/main.c:353
> > > > Code: 00 74 08 48 89 df e8 ba 1e 48 f9 48 8b 1b 48 85 db 0f 84 f0 00 00
> > > > 00 4c 89 64 24 08 48 81 c3 28 60 00 00 48 89 d8 48
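The remark above about team and bonding devices (dev->type is copied, the CAN specific ml_priv is not) can be modelled in user space. This is only a sketch, not kernel code: struct model_ndev, the owner tag and make_upper_copying_type() are hypothetical names invented for the illustration, and only the two ARPHRD_* values are taken from the kernel headers. It shows why a test on the type alone cannot tell whether ml_priv really belongs to CAN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirrored from include/uapi/linux/if_arp.h. */
#define ARPHRD_ETHER 1
#define ARPHRD_CAN   280

/* Hypothetical tag recording who stored ml_priv.
 * The kernel structure discussed in this thread has no such field. */
enum ml_priv_owner { OWNER_NONE, OWNER_CAN, OWNER_OTHER };

/* Hypothetical, heavily reduced stand-in for struct net_device. */
struct model_ndev {
	unsigned short type;
	void *ml_priv;
	enum ml_priv_owner owner;
};

/* The only question the guard in the patch asks. */
static bool passes_type_check(const struct model_ndev *dev)
{
	assert(dev != NULL);
	return dev->type == ARPHRD_CAN;
}

/* True when code that trusts the type check would treat a foreign
 * pointer as CAN private data. */
static bool would_misread_ml_priv(const struct model_ndev *dev)
{
	return passes_type_check(dev) &&
	       dev->ml_priv != NULL &&
	       dev->owner != OWNER_CAN;
}

/* Model of an upper device that copies the lower device's type but
 * keeps its own, unrelated private pointer. */
static struct model_ndev make_upper_copying_type(const struct model_ndev *lower,
						 void *own_state)
{
	struct model_ndev upper = {
		lower->type,
		own_state,
		own_state ? OWNER_OTHER : OWNER_NONE,
	};
	return upper;
}
```

In this model a real CAN device and an Ethernet device are both handled safely, while an upper device built with make_upper_copying_type() on top of a CAN device passes the type check and is still flagged by would_misread_ml_priv(). That matches the "ARPHRD_CAN and ml_priv == something" case described in the thread.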
Re: general protection fault in j1939_netdev_notify (2)
On 20.12.20 15:37, Oleksij Rempel wrote:
> Hello Oliver,
>
> On Sun, Dec 20, 2020 at 02:18:27PM +0100, Oliver Hartkopp wrote:
> > Hello Oleksij,
> >
> > I assume there is some ndev->ml_priv value set - but not from a CAN
> > netdevice.
>
> it is kind of CAN device :)

No, it is not.

Team and bonding devices copy elements like dev->type but do not take care
about the CAN specific ml_priv.

I don't know if this is the case here. I can take a look later.

> > What was the reason to fiddle with the 'priv' stuff in
> > j1939_netdev_notify()
> > before checking if it was a CAN device?
> >
> > Would this patch fix the issue then?
>
> No, j1939_priv_get_by_ndev() already has an internal test for
> ARPHRD_CAN. One of this tests can be removed, to make the code clear.
> So, we get netdev with ARPHRD_CAN and ml_priv == something.
>
> Right now I do not know how to fix it.
>
> Ideas?

IMO the patch is still an improvement as it swaps the testing and reduces
complexity.

Regards,
Oliver

>
> > diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c
> > index bb914d8b4216..6940f98b81fb 100644
> > --- a/net/can/j1939/main.c
> > +++ b/net/can/j1939/main.c
> > @@ -348,26 +348,25 @@ static int j1939_netdev_notify(struct notifier_block *nb,
> >  			       unsigned long msg, void *data)
> >  {
> >  	struct net_device *ndev = netdev_notifier_info_to_dev(data);
> >  	struct j1939_priv *priv;
> >
> > +	if (ndev->type != ARPHRD_CAN)
> > +		goto notify_done;
> > +
> >  	priv = j1939_priv_get_by_ndev(ndev);
> >  	if (!priv)
> >  		goto notify_done;
> >
> > -	if (ndev->type != ARPHRD_CAN)
> > -		goto notify_put;
> > -
> >  	switch (msg) {
> >  	case NETDEV_DOWN:
> >  		j1939_cancel_active_session(priv, NULL);
> >  		j1939_sk_netdev_event_netdown(priv);
> >  		j1939_ecu_unmap_all(priv);
> >  		break;
> >  	}
> >
> > -notify_put:
> >  	j1939_priv_put(priv);
> >
> >  notify_done:
> >  	return NOTIFY_DONE;
> >  }
> >
> > If so, I can send a proper patch if you like.
> >
> > Best regards,
> > Oliver
> >
> >
> > On 20.12.20 06:34, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    d635a69d Merge tag 'net-next-5.11' of git://git.kernel.org..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1315f12350
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c3556e4856b17a95
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=5138c4dd15a0401bec7b
> > > compiler:       clang version 11.0.0 (https://github.com/llvm/llvm-project.git ca2dcbd030eadbf0aa9b660efe864ff08af6e18b)
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1295512350
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10f2f30f50
> > >
> > > The issue was bisected to:
> > >
> > > commit 497a5757ce4e8f37219a3989ac6a561eb9a8e6c7
> > > Author: Heiner Kallweit
> > > Date:   Sat Nov 7 20:50:56 2020 +
> > >
> > >     tun: switch to net core provided statistics counters
> > >
> > > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=143b845b50
> > > final oops:     https://syzkaller.appspot.com/x/report.txt?x=163b845b50
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=123b845b50
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+5138c4dd15a0401be...@syzkaller.appspotmail.com
> > > Fixes: 497a5757ce4e ("tun: switch to net core provided statistics counters")
> > >
> > > general protection fault, probably for non-canonical address 0xe80fe8c072f1: [#1] PREEMPT SMP KASAN
> > > KASAN: probably user-memory-access in range [0x607f46039788-0x607f4603978f]
> > > CPU: 1 PID: 8472 Comm: syz-executor635 Not tainted 5.10.0-syzkaller #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > RIP: 0010:j1939_ndev_to_priv net/can/j1939/main.c:219 [inline]
> > > RIP: 0010:j1939_priv_get_by_ndev_locked net/can/j1939/main.c:231 [inline]
> > > RIP: 0010:j1939_priv_get_by_ndev net/can/j1939/main.c:243 [inline]
> > > RIP: 0010:j1939_netdev_notify+0x115/0x320 net/can/j1939/main.c:353
> > > Code: 00 74 08 48 89 df e8 ba 1e 48 f9 48 8b 1b 48 85 db 0f 84 f0 00 00 00 4c 89 64 24 08 48 81 c3 28 60 00 00 48 89 d8 48 c1 e8 03 <42> 80 3c 30 00 74 08 48 89 df e8 8c 1e 48 f9 4c 8b 23 4d 85 e4 0f
> > > RSP: 0018:c9e9fd68 EFLAGS: 00010202
> > > RAX: 0c0fe8c072f1 RBX: 607f46039788 RCX: 88801456d040
> > > RDX: 88801456d040 RSI: 0118 RDI: 0118
> > > RBP: 0118 R08: 8870585d R09: f520001d3fa5
> > > R10: f520001d3fa5 R11: R12: 0010
> > > R13: 11100293e848 R14: dc00 R15: 8880149f4244
> > > FS: 01d13880() GS:8880b9d0() knlGS:
> > > CS: 0010 DS: ES: CR0: 80050033
> > > CR2: 2080 CR3: 1402f000 CR4: 001506e0
> > > DR0: DR1: DR2:
> > > DR3: DR6: fffe0ff0 DR7: 0400
> > > Call Trace:
> > >  notifier_call_chain kernel/notifier.c:83 [inline]
> > >  raw_notifier_call_chain+0xe7/0x170 kernel/notifier.c:410
> > >  call_netdevice_not
Re: general protection fault in j1939_netdev_notify (2)
Hello Oliver,

On Sun, Dec 20, 2020 at 02:18:27PM +0100, Oliver Hartkopp wrote:
> Hello Oleksij,
>
> I assume there is some ndev->ml_priv value set - but not from a CAN
> netdevice.

it is kind of CAN device :)

> What was the reason to fiddle with the 'priv' stuff in j1939_netdev_notify()
> before checking if it was a CAN device?
>
> Would this patch fix the issue then?

No, j1939_priv_get_by_ndev() already has an internal test for
ARPHRD_CAN. One of this tests can be removed, to make the code clear.
So, we get netdev with ARPHRD_CAN and ml_priv == something.

Right now I do not know how to fix it.

Ideas?

> diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c
> index bb914d8b4216..6940f98b81fb 100644
> --- a/net/can/j1939/main.c
> +++ b/net/can/j1939/main.c
> @@ -348,26 +348,25 @@ static int j1939_netdev_notify(struct notifier_block *nb,
>  			       unsigned long msg, void *data)
>  {
>  	struct net_device *ndev = netdev_notifier_info_to_dev(data);
>  	struct j1939_priv *priv;
>
> +	if (ndev->type != ARPHRD_CAN)
> +		goto notify_done;
> +
>  	priv = j1939_priv_get_by_ndev(ndev);
>  	if (!priv)
>  		goto notify_done;
>
> -	if (ndev->type != ARPHRD_CAN)
> -		goto notify_put;
> -
>  	switch (msg) {
>  	case NETDEV_DOWN:
>  		j1939_cancel_active_session(priv, NULL);
>  		j1939_sk_netdev_event_netdown(priv);
>  		j1939_ecu_unmap_all(priv);
>  		break;
>  	}
>
> -notify_put:
>  	j1939_priv_put(priv);
>
>  notify_done:
>  	return NOTIFY_DONE;
>  }
>
> If so, I can send a proper patch if you like.
>
> Best regards,
> Oliver
>
>
> On 20.12.20 06:34, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    d635a69d Merge tag 'net-next-5.11' of git://git.kernel.org..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1315f12350
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c3556e4856b17a95
> > dashboard link: https://syzkaller.appspot.com/bug?extid=5138c4dd15a0401bec7b
> > compiler:       clang version 11.0.0
> > (https://github.com/llvm/llvm-project.git
> > ca2dcbd030eadbf0aa9b660efe864ff08af6e18b)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1295512350
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10f2f30f50
> >
> > The issue was bisected to:
> >
> > commit 497a5757ce4e8f37219a3989ac6a561eb9a8e6c7
> > Author: Heiner Kallweit
> > Date:   Sat Nov 7 20:50:56 2020 +
> >
> >     tun: switch to net core provided statistics counters
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=143b845b50
> > final oops:     https://syzkaller.appspot.com/x/report.txt?x=163b845b50
> > console output: https://syzkaller.appspot.com/x/log.txt?x=123b845b50
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+5138c4dd15a0401be...@syzkaller.appspotmail.com
> > Fixes: 497a5757ce4e ("tun: switch to net core provided statistics counters")
> >
> > general protection fault, probably for non-canonical address
> > 0xe80fe8c072f1: [#1] PREEMPT SMP KASAN
> > KASAN: probably user-memory-access in range
> > [0x607f46039788-0x607f4603978f]
> > CPU: 1 PID: 8472 Comm: syz-executor635 Not tainted 5.10.0-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:j1939_ndev_to_priv net/can/j1939/main.c:219 [inline]
> > RIP: 0010:j1939_priv_get_by_ndev_locked net/can/j1939/main.c:231 [inline]
> > RIP: 0010:j1939_priv_get_by_ndev net/can/j1939/main.c:243 [inline]
> > RIP: 0010:j1939_netdev_notify+0x115/0x320 net/can/j1939/main.c:353
> > Code: 00 74 08 48 89 df e8 ba 1e 48 f9 48 8b 1b 48 85 db 0f 84 f0 00 00 00
> > 4c 89 64 24 08 48 81 c3 28 60 00 00 48 89 d8 48 c1 e8 03 <42> 80 3c 30 00
> > 74 08 48 89 df e8 8c 1e 48 f9 4c 8b 23 4d 85 e4 0f
> > RSP: 0018:c9e9fd68 EFLAGS: 00010202
> > RAX: 0c0fe8c072f1 RBX: 607f46039788 RCX: 88801456d040
> > RDX: 88801456d040 RSI: 0118 RDI: 0118
> > RBP: 0118 R08: 8870585d R09: f520001d3fa5
> > R10: f520001d3fa5 R11: R12: 0010
> > R13: 11100293e848 R14: dc00 R15: 8880149f4244
> > FS: 01d13880() GS:8880b9d0() knlGS:
> > CS: 0010 DS: ES: CR0: 80050033
> > CR2: 2080 CR3: 1402f000 CR4: 001506e0
> > DR0: DR1: DR2:
> > DR3: DR6: fffe0ff0 DR7: 0400
> > Call Trace:
> >  notifier_call_chain kernel/notifier.c:83 [inline]
> >  raw_notifier_call_chain+0xe7/0x170 kernel/notifier.c:410
> >  call_netdevice_notifiers_info net/core/dev.c:2022 [inline]
> >  call_net
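The point made above, that the lookup already tests for ARPHRD_CAN and so one of the two tests is redundant, can be checked with a small user-space model of the two orderings. This is a sketch under stated assumptions: every model_* name, EVENT_DOWN and the lookup counter are hypothetical, and model_priv_get() only imitates what the thread says about j1939_priv_get_by_ndev() (an internal type test plus taking a reference).

```c
#include <assert.h>
#include <stddef.h>

/* Values mirrored from include/uapi/linux/if_arp.h. */
#define ARPHRD_ETHER 1
#define ARPHRD_CAN   280

/* Hypothetical event code standing in for NETDEV_DOWN. */
enum { EVENT_OTHER, EVENT_DOWN };

struct model_priv {
	int refs;         /* references currently held */
	int down_events;  /* how often the "down" branch ran */
};

struct model_ndev {
	unsigned short type;
	struct model_priv *priv;
	int lookups;      /* how often the lookup was entered */
};

/* Stand-in for the lookup: own type test, then take a reference. */
static struct model_priv *model_priv_get(struct model_ndev *ndev)
{
	ndev->lookups++;
	if (ndev->type != ARPHRD_CAN || ndev->priv == NULL)
		return NULL;
	ndev->priv->refs++;
	return ndev->priv;
}

static void model_priv_put(struct model_priv *priv)
{
	assert(priv->refs > 0);
	priv->refs--;
}

/* Ordering before the patch: look up first, test the type afterwards. */
static void notify_old(struct model_ndev *ndev, int msg)
{
	struct model_priv *priv = model_priv_get(ndev);

	if (!priv)
		goto notify_done;
	if (ndev->type != ARPHRD_CAN)
		goto notify_put;
	if (msg == EVENT_DOWN)
		priv->down_events++;
notify_put:
	model_priv_put(priv);
notify_done:
	return;
}

/* Ordering after the patch: test the type first, so no extra label. */
static void notify_new(struct model_ndev *ndev, int msg)
{
	struct model_priv *priv;

	if (ndev->type != ARPHRD_CAN)
		return;
	priv = model_priv_get(ndev);
	if (!priv)
		return;
	if (msg == EVENT_DOWN)
		priv->down_events++;
	model_priv_put(priv);
}
```

In the model both orderings leave the reference count balanced and handle the same events. The visible difference is that notify_new() never enters the lookup for a non-CAN device, which is the reduction in complexity the patch aims for. Neither ordering helps when the type says ARPHRD_CAN and the private pointer is foreign.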
Re: general protection fault in j1939_netdev_notify (2)
Hello Oleksij,

I assume there is some ndev->ml_priv value set - but not from a CAN
netdevice.

What was the reason to fiddle with the 'priv' stuff in j1939_netdev_notify()
before checking if it was a CAN device?

Would this patch fix the issue then?

diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c
index bb914d8b4216..6940f98b81fb 100644
--- a/net/can/j1939/main.c
+++ b/net/can/j1939/main.c
@@ -348,26 +348,25 @@ static int j1939_netdev_notify(struct notifier_block *nb,
 			       unsigned long msg, void *data)
 {
 	struct net_device *ndev = netdev_notifier_info_to_dev(data);
 	struct j1939_priv *priv;

+	if (ndev->type != ARPHRD_CAN)
+		goto notify_done;
+
 	priv = j1939_priv_get_by_ndev(ndev);
 	if (!priv)
 		goto notify_done;

-	if (ndev->type != ARPHRD_CAN)
-		goto notify_put;
-
 	switch (msg) {
 	case NETDEV_DOWN:
 		j1939_cancel_active_session(priv, NULL);
 		j1939_sk_netdev_event_netdown(priv);
 		j1939_ecu_unmap_all(priv);
 		break;
 	}

-notify_put:
 	j1939_priv_put(priv);

 notify_done:
 	return NOTIFY_DONE;
 }

If so, I can send a proper patch if you like.

Best regards,
Oliver


On 20.12.20 06:34, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    d635a69d Merge tag 'net-next-5.11' of git://git.kernel.org..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1315f12350
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c3556e4856b17a95
> dashboard link: https://syzkaller.appspot.com/bug?extid=5138c4dd15a0401bec7b
> compiler:       clang version 11.0.0 (https://github.com/llvm/llvm-project.git ca2dcbd030eadbf0aa9b660efe864ff08af6e18b)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1295512350
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10f2f30f50
>
> The issue was bisected to:
>
> commit 497a5757ce4e8f37219a3989ac6a561eb9a8e6c7
> Author: Heiner Kallweit
> Date:   Sat Nov 7 20:50:56 2020 +
>
>     tun: switch to net core provided statistics counters
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=143b845b50
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=163b845b50
> console output: https://syzkaller.appspot.com/x/log.txt?x=123b845b50
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+5138c4dd15a0401be...@syzkaller.appspotmail.com
> Fixes: 497a5757ce4e ("tun: switch to net core provided statistics counters")
>
> general protection fault, probably for non-canonical address 0xe80fe8c072f1: [#1] PREEMPT SMP KASAN
> KASAN: probably user-memory-access in range [0x607f46039788-0x607f4603978f]
> CPU: 1 PID: 8472 Comm: syz-executor635 Not tainted 5.10.0-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:j1939_ndev_to_priv net/can/j1939/main.c:219 [inline]
> RIP: 0010:j1939_priv_get_by_ndev_locked net/can/j1939/main.c:231 [inline]
> RIP: 0010:j1939_priv_get_by_ndev net/can/j1939/main.c:243 [inline]
> RIP: 0010:j1939_netdev_notify+0x115/0x320 net/can/j1939/main.c:353
> Code: 00 74 08 48 89 df e8 ba 1e 48 f9 48 8b 1b 48 85 db 0f 84 f0 00 00 00 4c 89 64 24 08 48 81 c3 28 60 00 00 48 89 d8 48 c1 e8 03 <42> 80 3c 30 00 74 08 48 89 df e8 8c 1e 48 f9 4c 8b 23 4d 85 e4 0f
> RSP: 0018:c9e9fd68 EFLAGS: 00010202
> RAX: 0c0fe8c072f1 RBX: 607f46039788 RCX: 88801456d040
> RDX: 88801456d040 RSI: 0118 RDI: 0118
> RBP: 0118 R08: 8870585d R09: f520001d3fa5
> R10: f520001d3fa5 R11: R12: 0010
> R13: 11100293e848 R14: dc00 R15: 8880149f4244
> FS: 01d13880() GS:8880b9d0() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 2080 CR3: 1402f000 CR4: 001506e0
> DR0: DR1: DR2:
> DR3: DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  notifier_call_chain kernel/notifier.c:83 [inline]
>  raw_notifier_call_chain+0xe7/0x170 kernel/notifier.c:410
>  call_netdevice_notifiers_info net/core/dev.c:2022 [inline]
>  call_netdevice_notifiers_extack net/core/dev.c:2034 [inline]
>  call_netdevice_notifiers+0xeb/0x150 net/core/dev.c:2048
>  __tun_chr_ioctl+0x2337/0x4860 drivers/net/tun.c:3093
>  vfs_ioctl fs/ioctl.c:48 [inline]
>  __do_sys_ioctl fs/ioctl.c:753 [inline]
>  __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x440359
> Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7fffd37b9c98 EFLAGS: 0246 ORIG_
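As a reading aid for the register dump: the fault is reported for a non-canonical address, while KASAN names a user-space range starting at the RBX value. The arithmetic below shows how the two relate. It is a sketch with several assumptions: the shadow offset and shift are the generic KASAN values for x86-64, the register values are taken from RAX and RBX in the dump with leading zeros (which the archive seems to have dropped) restored, and the helper names are made up for this example. Reading the bytes "48 89 d8 48 c1 e8 03 <42> 80 3c 30 00" in the Code: line as mov/shr/cmpb is my own decoding, not something stated in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Generic KASAN parameters on x86-64 (assumed to apply to this kernel). */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Shadow byte address that the instrumented code checks before it
 * touches addr: one shadow byte covers eight bytes of memory. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* With 48-bit virtual addresses, an address is canonical only when
 * bits 63..47 are all zero or all one. */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47; /* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}
```

With RBX = 0x0000607f46039788, the shift gives 0x00000c0fe8c072f1, which matches RAX in the dump. Adding the shadow offset gives 0xe000080fe8c072f1, which is not canonical, so the shadow check itself faults before the pointer is dereferenced. The pointer in RBX is a canonical user-range value, which fits the "probably user-memory-access" line in the report.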