Re: WARNING in refcount_dec
On Thu, Apr 19, 2018 at 2:55 PM, Willem de Bruijn wrote:
> On Thu, Apr 19, 2018 at 2:32 AM, DaeRyong Jeong wrote:
>> Hello.
>> We have analyzed the cause of the crash in v4.16-rc3, WARNING in
>> refcount_dec,
>> which is found by RaceFuzzer (a modified version of Syzkaller).
>>
>> Since struct packet_sock's member variables, running, has_vnet_hdr, origdev
>> and auxdata are declared as bitfields, accessing these variables can race if
>> there is no synchronization mechanism.
>
> Great catch.
>
> These fields po->{running, auxdata, origdev, has_vnet_hdr} are
> accessed without a uniform locking strategy.
>
> po->running is always accessed with po->bind_lock held (with the
> exception of reading in packet_seq_show, but that is best effort).
>
> That is the only field written to outside setsockopt. If it is moved to
> a separate word, it will no longer interfere with the others.
>
> The other fields are read lockless in the various recv and send
> functions, but only set in setsockopt. We've had enough
> locking bugs around setsockopt that I suggest we wrap all of
> those in lock_sock, like the example I gave before for
> has_vnet_hdr.

Sent http://patchwork.ozlabs.org/patch/903190/
Re: WARNING in refcount_dec
On Thu, Apr 19, 2018 at 2:32 AM, DaeRyong Jeong wrote:
> Hello.
> We have analyzed the cause of the crash in v4.16-rc3, WARNING in refcount_dec,
> which is found by RaceFuzzer (a modified version of Syzkaller).
>
> Since struct packet_sock's member variables, running, has_vnet_hdr, origdev
> and auxdata are declared as bitfields, accessing these variables can race if
> there is no synchronization mechanism.

Great catch.

These fields po->{running, auxdata, origdev, has_vnet_hdr} are
accessed without a uniform locking strategy.

po->running is always accessed with po->bind_lock held (with the
exception of reading in packet_seq_show, but that is best effort).

That is the only field written to outside setsockopt. If it is moved to
a separate word, it will no longer interfere with the others.

The other fields are read lockless in the various recv and send
functions, but only set in setsockopt. We've had enough
locking bugs around setsockopt that I suggest we wrap all of
those in lock_sock, like the example I gave before for
has_vnet_hdr.
Re: WARNING in refcount_dec
Hello.
We have analyzed the cause of the crash in v4.16-rc3, WARNING in refcount_dec,
which was found by RaceFuzzer (a modified version of Syzkaller).

Since struct packet_sock's member variables running, has_vnet_hdr, origdev
and auxdata are declared as bitfields, accesses to these variables can race if
there is no synchronization mechanism.

We think a race between the following lines in af_packet.c causes the crash.

In function __unregister_prot_hook,
        po->running = 0;

In function packet_setsockopt,
        po->has_vnet_hdr = !!val;

Analysis:

CPU0
packet_setsockopt
  po->has_vnet_hdr = !!val;

CPU1
packet_setsockopt
  packet_set_ring
    __unregister_prot_hook
      po->running = 0;

On CPU1, the value of po->running should become 0, but because of the race
it is possible that po->running keeps the value 1: CPU0 loads the shared
byte before CPU1 clears the bit and stores it back afterwards.
Consequently, the following can happen.
- When packet_set_ring calls register_prot_hook, register_prot_hook returns
  immediately without calling sock_hold(sk).
- When packet_release is called, __unregister_prot_hook is called because
  po->running == 1, and sk->sk_refcnt hits zero.

A possible interleaving between the racy C source lines is as follows (built
with gcc-7.1.0).

CPU0 (po->has_vnet_hdr = !!val)         CPU1 (po->running = 0)
movzbl 0x6e0(%r15),%eax
                                        andb   $0xfe,0x6e0(%r13)
shl    $0x3,%r12d
and    $0xfff7,%eax
or     %r12d,%eax
mov    %al,0x6e0(%r15)

Please check out the following reproducer.

C repro code : https://kiwi.cs.purdue.edu/static/race-fuzzer/repro-refcount_dec.c
kernel config: https://kiwi.cs.purdue.edu/static/race-fuzzer/kernel-config-v4.16-rc3

Since the race window is small, it may take a long time to reproduce the crash.

= About RaceFuzzer

RaceFuzzer is a customized version of Syzkaller, specifically tailored to
find race condition bugs in the Linux kernel. While we leverage many
different techniques, the notable feature of RaceFuzzer is its use of a
custom hypervisor (QEMU/KVM) to interleave the scheduling.
In particular, we modified the hypervisor to intentionally stall a per-core
execution, which is similar to supporting per-core breakpoint functionality.
This allows RaceFuzzer to force the kernel to deterministically trigger a
race condition (which may rarely happen in practice due to randomness in
scheduling).

RaceFuzzer's C repro always pinpoints two racy syscalls. Since the C repro's
scheduling synchronization has to be performed in user space, its
reproducibility is limited (reproduction may take from 1 second to 10 minutes
(or even more), depending on the bug). This is because, while RaceFuzzer
precisely interleaves the scheduling at the kernel's instruction level when
finding this bug, the C repro cannot fully utilize such a feature. Please
disregard all code related to "should_hypercall" in the C repro, as this is
only for our debugging purposes using our own hypervisor.

On Tue, Apr 3, 2018 at 1:12 PM, DaeRyong Jeong wrote:
> No. Only the first crash (WARNING in refcount_dec) is reproduced by
> the attached reproducer.
>
> The second crash (kernel bug at af_packet.c:3107) is reproduced by
> another reproducer.
> We reported it here.
> http://lkml.iu.edu/hypermail/linux/kernel/1803.3/05324.html
>
> On Sun, Apr 1, 2018 at 4:38 PM, Willem de Bruijn
> wrote:
>> On Thu, Mar 29, 2018 at 1:16 AM, Cong Wang wrote:
>>> (Cc'ing netdev and Willem)
>>>
>>> On Wed, Mar 28, 2018 at 12:03 PM, Byoungyoung Lee
>>> wrote:
>>>> Another crash patterns observed: race between (setsockopt$packet_int)
>>>> and (bind$packet).
>>>>
>>>> --
>>>> [ 357.731597] kernel BUG at
>>>> /home/blee/project/race-fuzzer/kernels/kernel_v4.16-rc3/net/packet/af_packet.c:3107!
>>>> [ 357.733382] invalid opcode: [#1] SMP KASAN
>>>> [ 357.734017] Modules linked in:
>>>> [ 357.734662] CPU: 1 PID: 3871 Comm: repro.exe Not tainted 4.16.0-rc3 #1
>>>> [ 357.735791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>>>> BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>>>> [ 357.737434] RIP: 0010:packet_do_bind+0x88d/0x950
>>>> [ 357.738121] RSP: 0018:8800b2787b08 EFLAGS: 00010293
>>>> [ 357.738906] RAX: 8800b2fdc780 RBX: 880234358cc0 RCX:
>>>> 838b244c
>>>> [ 357.739905] RDX: RSI: 838b257d RDI:
>>>> 0001
>>>> [ 357.741315] RBP: 8800b2787c10 R08: 8800b2fdc780 R09:
>>>>
>>>> [ 357.743055] R10: 0001 R11: R12:
>>>> 88023352ecc0
>>>>
Re: WARNING in refcount_dec
No. Only the first crash (WARNING in refcount_dec) is reproduced by
the attached reproducer.

The second crash (kernel bug at af_packet.c:3107) is reproduced by
another reproducer.
We reported it here.
http://lkml.iu.edu/hypermail/linux/kernel/1803.3/05324.html

On Sun, Apr 1, 2018 at 4:38 PM, Willem de Bruijn wrote:
> On Thu, Mar 29, 2018 at 1:16 AM, Cong Wang wrote:
>> (Cc'ing netdev and Willem)
>>
>> On Wed, Mar 28, 2018 at 12:03 PM, Byoungyoung Lee
>> wrote:
>>> Another crash patterns observed: race between (setsockopt$packet_int)
>>> and (bind$packet).
>>>
>>> --
>>> [ 357.731597] kernel BUG at
>>> /home/blee/project/race-fuzzer/kernels/kernel_v4.16-rc3/net/packet/af_packet.c:3107!
>>> [ 357.733382] invalid opcode: [#1] SMP KASAN
>>> [ 357.734017] Modules linked in:
>>> [ 357.734662] CPU: 1 PID: 3871 Comm: repro.exe Not tainted 4.16.0-rc3 #1
>>> [ 357.735791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>>> BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>>> [ 357.737434] RIP: 0010:packet_do_bind+0x88d/0x950
>>> [ 357.738121] RSP: 0018:8800b2787b08 EFLAGS: 00010293
>>> [ 357.738906] RAX: 8800b2fdc780 RBX: 880234358cc0 RCX:
>>> 838b244c
>>> [ 357.739905] RDX: RSI: 838b257d RDI:
>>> 0001
>>> [ 357.741315] RBP: 8800b2787c10 R08: 8800b2fdc780 R09:
>>>
>>> [ 357.743055] R10: 0001 R11: R12:
>>> 88023352ecc0
>>> [ 357.744744] R13: R14: 0001 R15:
>>> 1d00
>>> [ 357.746377] FS: 7f4b43733700() GS:8800b8b0()
>>> knlGS:
>>> [ 357.749599] CS: 0010 DS: ES: CR0: 80050033
>>> [ 357.752096] CR2: 20058000 CR3: 0002334b8000 CR4:
>>> 06e0
>>> [ 357.755045] Call Trace:
>>> [ 357.755822] ? compat_packet_setsockopt+0x100/0x100
>>> [ 357.757324] ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
>>> [ 357.758810] packet_bind+0xa2/0xe0
>>> [ 357.759640] SYSC_bind+0x279/0x2f0
>>> [ 357.760364] ? move_addr_to_kernel.part.19+0xc0/0xc0
>>> [ 357.761491] ? __handle_mm_fault+0x25d0/0x25d0
>>> [ 357.762449] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>>> [ 357.763663] ? __do_page_fault+0x417/0xba0
>>> [ 357.764569] ? vmalloc_fault+0x910/0x910
>>> [ 357.765405] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>>> [ 357.766525] ? mark_held_locks+0x25/0xb0
>>> [ 357.767336] ? SyS_socketpair+0x4a0/0x4a0
>>> [ 357.768182] SyS_bind+0x24/0x30
>>> [ 357.768851] do_syscall_64+0x209/0x5d0
>>> [ 357.769650] ? syscall_return_slowpath+0x3e0/0x3e0
>>> [ 357.770665] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>>> [ 357.771779] ? syscall_return_slowpath+0x260/0x3e0
>>> [ 357.772748] ? mark_held_locks+0x25/0xb0
>>> [ 357.773581] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>>> [ 357.774720] ? retint_user+0x18/0x18
>>> [ 357.775493] ? trace_hardirqs_off_caller+0xb5/0x120
>>> [ 357.776567] ? trace_hardirqs_off_thunk+0x1a/0x1c
>>> [ 357.777512] entry_SYSCALL_64_after_hwframe+0x42/0xb7
>>> [ 357.778508] RIP: 0033:0x4503a9
>>> [ 357.779156] RSP: 002b:7f4b43732ce8 EFLAGS: 0246 ORIG_RAX:
>>> 0031
>>> [ 357.780737] RAX: ffda RBX: RCX:
>>> 004503a9
>>> [ 357.782169] RDX: 0014 RSI: 20058000 RDI:
>>> 0003
>>> [ 357.783710] RBP: 7f4b43732d10 R08: R09:
>>>
>>> [ 357.785202] R10: R11: 0246 R12:
>>>
>>> [ 357.786664] R13: R14: 7f4b437339c0 R15:
>>> 7f4b43733700
>>> [ 357.788210] Code: c0 fd 48 c7 c2 00 c8 d9 84 be ab 02 00 00 48 c7
>>> c7 60 c8 d9 84 c6 05 e7 a2 48 02 01 e8 3f 17 af fd e9 60 fb ff ff e8
>>> 43 b3 c0 fd <0f> 0b e8 3c b3 c0 fd 48 8b bd 20 ff ff ff e8 60 1e e7 fd
>>> 4c 89
>>> [ 357.792260] RIP: packet_do_bind+0x88d/0x950 RSP: 8800b2787b08
>>> [ 357.793698] ---[ end trace 0c5a2539f0247369 ]---
>>> [ 357.794696] Kernel panic - not syncing: Fatal exception
>>> [ 357.795918] Kernel Offset: disabled
>>> [ 357.796614] Rebooting in 86400 seconds..
>>>
>>> On Wed, Mar 28, 2018 at 1:19
Re: WARNING in refcount_dec
On Thu, Mar 29, 2018 at 1:16 AM, Cong Wang wrote:
> (Cc'ing netdev and Willem)
>
> On Wed, Mar 28, 2018 at 12:03 PM, Byoungyoung Lee
> wrote:
>> Another crash patterns observed: race between (setsockopt$packet_int)
>> and (bind$packet).
>>
>> --
>> [ 357.731597] kernel BUG at
>> /home/blee/project/race-fuzzer/kernels/kernel_v4.16-rc3/net/packet/af_packet.c:3107!
>> [ 357.733382] invalid opcode: [#1] SMP KASAN
>> [ 357.734017] Modules linked in:
>> [ 357.734662] CPU: 1 PID: 3871 Comm: repro.exe Not tainted 4.16.0-rc3 #1
>> [ 357.735791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
>> [ 357.737434] RIP: 0010:packet_do_bind+0x88d/0x950
>> [ 357.738121] RSP: 0018:8800b2787b08 EFLAGS: 00010293
>> [ 357.738906] RAX: 8800b2fdc780 RBX: 880234358cc0 RCX:
>> 838b244c
>> [ 357.739905] RDX: RSI: 838b257d RDI:
>> 0001
>> [ 357.741315] RBP: 8800b2787c10 R08: 8800b2fdc780 R09:
>>
>> [ 357.743055] R10: 0001 R11: R12:
>> 88023352ecc0
>> [ 357.744744] R13: R14: 0001 R15:
>> 1d00
>> [ 357.746377] FS: 7f4b43733700() GS:8800b8b0()
>> knlGS:
>> [ 357.749599] CS: 0010 DS: ES: CR0: 80050033
>> [ 357.752096] CR2: 20058000 CR3: 0002334b8000 CR4:
>> 06e0
>> [ 357.755045] Call Trace:
>> [ 357.755822] ? compat_packet_setsockopt+0x100/0x100
>> [ 357.757324] ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
>> [ 357.758810] packet_bind+0xa2/0xe0
>> [ 357.759640] SYSC_bind+0x279/0x2f0
>> [ 357.760364] ? move_addr_to_kernel.part.19+0xc0/0xc0
>> [ 357.761491] ? __handle_mm_fault+0x25d0/0x25d0
>> [ 357.762449] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>> [ 357.763663] ? __do_page_fault+0x417/0xba0
>> [ 357.764569] ? vmalloc_fault+0x910/0x910
>> [ 357.765405] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>> [ 357.766525] ? mark_held_locks+0x25/0xb0
>> [ 357.767336] ? SyS_socketpair+0x4a0/0x4a0
>> [ 357.768182] SyS_bind+0x24/0x30
>> [ 357.768851] do_syscall_64+0x209/0x5d0
>> [ 357.769650] ? syscall_return_slowpath+0x3e0/0x3e0
>> [ 357.770665] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>> [ 357.771779] ? syscall_return_slowpath+0x260/0x3e0
>> [ 357.772748] ? mark_held_locks+0x25/0xb0
>> [ 357.773581] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
>> [ 357.774720] ? retint_user+0x18/0x18
>> [ 357.775493] ? trace_hardirqs_off_caller+0xb5/0x120
>> [ 357.776567] ? trace_hardirqs_off_thunk+0x1a/0x1c
>> [ 357.777512] entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> [ 357.778508] RIP: 0033:0x4503a9
>> [ 357.779156] RSP: 002b:7f4b43732ce8 EFLAGS: 0246 ORIG_RAX:
>> 0031
>> [ 357.780737] RAX: ffda RBX: RCX:
>> 004503a9
>> [ 357.782169] RDX: 0014 RSI: 20058000 RDI:
>> 0003
>> [ 357.783710] RBP: 7f4b43732d10 R08: R09:
>>
>> [ 357.785202] R10: R11: 0246 R12:
>>
>> [ 357.786664] R13: R14: 7f4b437339c0 R15:
>> 7f4b43733700
>> [ 357.788210] Code: c0 fd 48 c7 c2 00 c8 d9 84 be ab 02 00 00 48 c7
>> c7 60 c8 d9 84 c6 05 e7 a2 48 02 01 e8 3f 17 af fd e9 60 fb ff ff e8
>> 43 b3 c0 fd <0f> 0b e8 3c b3 c0 fd 48 8b bd 20 ff ff ff e8 60 1e e7 fd
>> 4c 89
>> [ 357.792260] RIP: packet_do_bind+0x88d/0x950 RSP: 8800b2787b08
>> [ 357.793698] ---[ end trace 0c5a2539f0247369 ]---
>> [ 357.794696] Kernel panic - not syncing: Fatal exception
>> [ 357.795918] Kernel Offset: disabled
>> [ 357.796614] Rebooting in 86400 seconds..
>>
>> On Wed, Mar 28, 2018 at 1:19 AM, Byoungyoung Lee
>> wrote:
>>> We report the crash: WARNING in refcount_dec
>>>
>>> This crash has been found in v4.16-rc3 using RaceFuzzer (a modified
>>> version of Syzkaller), which we describe more at the end of this
>>> report. Our analysis shows that the race occurs when invoking two
>>> syscalls concurrently, (setsockopt$packet_int) and
>>> (setsockopt$packet_rx_ring).
>>>
>>> C repro code :
>>> https://kiwi.cs.purdue.edu/static/race-fuzzer/repro-refcount_dec.c
>>> kernel config:
>>> https://kiwi.cs.purdu
Re: WARNING in refcount_dec
(Cc'ing netdev and Willem)

On Wed, Mar 28, 2018 at 12:03 PM, Byoungyoung Lee wrote:
> Another crash patterns observed: race between (setsockopt$packet_int)
> and (bind$packet).
>
> --
> [ 357.731597] kernel BUG at
> /home/blee/project/race-fuzzer/kernels/kernel_v4.16-rc3/net/packet/af_packet.c:3107!
> [ 357.733382] invalid opcode: [#1] SMP KASAN
> [ 357.734017] Modules linked in:
> [ 357.734662] CPU: 1 PID: 3871 Comm: repro.exe Not tainted 4.16.0-rc3 #1
> [ 357.735791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> [ 357.737434] RIP: 0010:packet_do_bind+0x88d/0x950
> [ 357.738121] RSP: 0018:8800b2787b08 EFLAGS: 00010293
> [ 357.738906] RAX: 8800b2fdc780 RBX: 880234358cc0 RCX:
> 838b244c
> [ 357.739905] RDX: RSI: 838b257d RDI:
> 0001
> [ 357.741315] RBP: 8800b2787c10 R08: 8800b2fdc780 R09:
>
> [ 357.743055] R10: 0001 R11: R12:
> 88023352ecc0
> [ 357.744744] R13: R14: 0001 R15:
> 1d00
> [ 357.746377] FS: 7f4b43733700() GS:8800b8b0()
> knlGS:
> [ 357.749599] CS: 0010 DS: ES: CR0: 80050033
> [ 357.752096] CR2: 20058000 CR3: 0002334b8000 CR4:
> 06e0
> [ 357.755045] Call Trace:
> [ 357.755822] ? compat_packet_setsockopt+0x100/0x100
> [ 357.757324] ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
> [ 357.758810] packet_bind+0xa2/0xe0
> [ 357.759640] SYSC_bind+0x279/0x2f0
> [ 357.760364] ? move_addr_to_kernel.part.19+0xc0/0xc0
> [ 357.761491] ? __handle_mm_fault+0x25d0/0x25d0
> [ 357.762449] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [ 357.763663] ? __do_page_fault+0x417/0xba0
> [ 357.764569] ? vmalloc_fault+0x910/0x910
> [ 357.765405] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [ 357.766525] ? mark_held_locks+0x25/0xb0
> [ 357.767336] ? SyS_socketpair+0x4a0/0x4a0
> [ 357.768182] SyS_bind+0x24/0x30
> [ 357.768851] do_syscall_64+0x209/0x5d0
> [ 357.769650] ? syscall_return_slowpath+0x3e0/0x3e0
> [ 357.770665] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [ 357.771779] ? syscall_return_slowpath+0x260/0x3e0
> [ 357.772748] ? mark_held_locks+0x25/0xb0
> [ 357.773581] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
> [ 357.774720] ? retint_user+0x18/0x18
> [ 357.775493] ? trace_hardirqs_off_caller+0xb5/0x120
> [ 357.776567] ? trace_hardirqs_off_thunk+0x1a/0x1c
> [ 357.777512] entry_SYSCALL_64_after_hwframe+0x42/0xb7
> [ 357.778508] RIP: 0033:0x4503a9
> [ 357.779156] RSP: 002b:7f4b43732ce8 EFLAGS: 0246 ORIG_RAX:
> 0031
> [ 357.780737] RAX: ffda RBX: RCX:
> 004503a9
> [ 357.782169] RDX: 0014 RSI: 20058000 RDI:
> 0003
> [ 357.783710] RBP: 7f4b43732d10 R08: R09:
>
> [ 357.785202] R10: R11: 0246 R12:
>
> [ 357.786664] R13: R14: 7f4b437339c0 R15:
> 7f4b43733700
> [ 357.788210] Code: c0 fd 48 c7 c2 00 c8 d9 84 be ab 02 00 00 48 c7
> c7 60 c8 d9 84 c6 05 e7 a2 48 02 01 e8 3f 17 af fd e9 60 fb ff ff e8
> 43 b3 c0 fd <0f> 0b e8 3c b3 c0 fd 48 8b bd 20 ff ff ff e8 60 1e e7 fd
> 4c 89
> [ 357.792260] RIP: packet_do_bind+0x88d/0x950 RSP: 8800b2787b08
> [ 357.793698] ---[ end trace 0c5a2539f0247369 ]---
> [ 357.794696] Kernel panic - not syncing: Fatal exception
> [ 357.795918] Kernel Offset: disabled
> [ 357.796614] Rebooting in 86400 seconds..
>
> On Wed, Mar 28, 2018 at 1:19 AM, Byoungyoung Lee
> wrote:
>> We report the crash: WARNING in refcount_dec
>>
>> This crash has been found in v4.16-rc3 using RaceFuzzer (a modified
>> version of Syzkaller), which we describe more at the end of this
>> report. Our analysis shows that the race occurs when invoking two
>> syscalls concurrently, (setsockopt$packet_int) and
>> (setsockopt$packet_rx_ring).
>>
>> C repro code :
>> https://kiwi.cs.purdue.edu/static/race-fuzzer/repro-refcount_dec.c
>> kernel config:
>> https://kiwi.cs.purdue.edu/static/race-fuzzer/kernel-config-v4.16-rc3

I tried your reproducer, no luck here.

>>
>> ---
>> [ 305.838560] refcount_t: decrement hit 0; leaking memory.
>> [ 305.839669] WARNING: CPU: 0 PID: 3867 at
>> /home/blee/project/race-fuzzer/kernels/kernel_v4.16-rc3/lib/refcount.c:228
>> refcount_dec+0x62/0x70
>> [ 305.841441] Modules