Re: [PATCH v3] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
Hi Pedro, On 6/25/24 12:55 오전, Pedro Tammela wrote: > On 24/06/2024 12:43, Yunseong Kim wrote: >> Hi Pedro, >> >> On 6/25/24 12:12 오전, Pedro Tammela wrote: >>> On 22/06/2024 01:57, ysk...@gmail.com wrote: From: Yunseong Kim In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from qdisc->dev_queue->dev ->name [ 5301.595872] KASAN: null-ptr-deref in range [0x0130-0x0137] [ 5301.595877] Mem abort info: [ 5301.595881] ESR = 0x9606 [ 5301.595885] EC = 0x25: DABT (current EL), IL = 32 bits [ 5301.595889] SET = 0, FnV = 0 [ 5301.595893] EA = 0, S1PTW = 0 [ 5301.595896] FSC = 0x06: level 2 translation fault [ 5301.595900] Data abort info: [ 5301.595903] ISV = 0, ISS = 0x0006, ISS2 = 0x [ 5301.595907] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 5301.595911] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 5301.595915] [dfff8026] address between user and kernel address ranges [ 5301.595971] Internal error: Oops: 9606 [#1] SMP Link: https://lore.kernel.org/lkml/20240229143432.273b4...@gandalf.local.home/t/ Fixes: 51270d573a8d ("tracing/net_sched: Fix tracepoints that save qdisc_dev() as a string") Cc: net...@vger.kernel.org Cc: sta...@vger.kernel.org # +v6.7.10, +v6.8 Signed-off-by: Yunseong Kim Signed-off-by: Yeoreum Yun --- include/trace/events/qdisc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h index f1b5e816e7e5..170b51fbe47a 100644 --- a/include/trace/events/qdisc.h +++ b/include/trace/events/qdisc.h @@ -81,7 +81,7 @@ TRACE_EVENT(qdisc_reset, TP_ARGS(q), TP_STRUCT__entry( - __string( dev, qdisc_dev(q)->name ) + __string(dev, qdisc_dev(q) ? qdisc_dev(q)->name : "noop_queue") __string( kind, q->ops->id ) __field( u32, parent ) __field( u32, handle ) >>> >>> You missed the __assign_str portion (see below). Also let's just say >>> "(null)" as it's the correct device name. "noop_queue" could be >>> misleading. >> >> Thanks for the code review Pedro, I agree your advice. >> >>> diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h >>> index 1f4258308b96..f54e0b4dbcf4 100644 >>> --- a/include/trace/events/qdisc.h >>> +++ b/include/trace/events/qdisc.h >>> @@ -81,14 +81,14 @@ TRACE_EVENT(qdisc_reset, >>> TP_ARGS(q), >>> >>> TP_STRUCT__entry( >>> - __string( dev, >>> qdisc_dev(q)->name ) >>> + __string( dev, qdisc_dev(q) ? >>> qdisc_dev(q)->name : "(null)" ) >>> __string( kind, >>> q->ops->id ) >>> __field( u32, >>> parent ) >>> __field( u32, >>> handle ) >>> ), >> >> It looks better to align the name with the current convention. >> >> Link: >> https://lore.kernel.org/linux-trace-kernel/2024011442.634192...@goodmis.org/ >> >>> TP_fast_assign( >>> - __assign_str(dev, qdisc_dev(q)->name); >>> + __assign_str(dev, qdisc_dev(q) ? qdisc_dev(q)->name : >>> "(null)"); >>> __assign_str(kind, q->ops->id); >>> __entry->parent = q->parent; >>> __entry->handle = q->handle; >>> >>> >> >> The second part you mentioned, Steve recently worked on it and changed >> it. >> >> Link: >> https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba...@rorschach.local.home/ > > Oh! Thanks for the double check, Pedro. >> If it hadn't, I don't think I would have been able to prevent the panic >> by just applying my patch. > > But you must be careful with the backports. > > In any case, perhaps send another patch to net-next updating the new > conventions there and use the 'old convention' for the bug fix? Right, I agree, I'll send a patch for the next version. Warm regards, Yunseong Kim
Re: [PATCH v3] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
On 24/06/2024 12:43, Yunseong Kim wrote: Hi Pedro, On 6/25/24 12:12 오전, Pedro Tammela wrote: On 22/06/2024 01:57, ysk...@gmail.com wrote: From: Yunseong Kim In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from qdisc->dev_queue->dev ->name [ 5301.595872] KASAN: null-ptr-deref in range [0x0130-0x0137] [ 5301.595877] Mem abort info: [ 5301.595881] ESR = 0x9606 [ 5301.595885] EC = 0x25: DABT (current EL), IL = 32 bits [ 5301.595889] SET = 0, FnV = 0 [ 5301.595893] EA = 0, S1PTW = 0 [ 5301.595896] FSC = 0x06: level 2 translation fault [ 5301.595900] Data abort info: [ 5301.595903] ISV = 0, ISS = 0x0006, ISS2 = 0x [ 5301.595907] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 5301.595911] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 5301.595915] [dfff8026] address between user and kernel address ranges [ 5301.595971] Internal error: Oops: 9606 [#1] SMP Link: https://lore.kernel.org/lkml/20240229143432.273b4...@gandalf.local.home/t/ Fixes: 51270d573a8d ("tracing/net_sched: Fix tracepoints that save qdisc_dev() as a string") Cc: net...@vger.kernel.org Cc: sta...@vger.kernel.org # +v6.7.10, +v6.8 Signed-off-by: Yunseong Kim Signed-off-by: Yeoreum Yun --- include/trace/events/qdisc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h index f1b5e816e7e5..170b51fbe47a 100644 --- a/include/trace/events/qdisc.h +++ b/include/trace/events/qdisc.h @@ -81,7 +81,7 @@ TRACE_EVENT(qdisc_reset, TP_ARGS(q), TP_STRUCT__entry( - __string( dev, qdisc_dev(q)->name ) + __string(dev, qdisc_dev(q) ? qdisc_dev(q)->name : "noop_queue") __string( kind, q->ops->id ) __field( u32, parent ) __field( u32, handle ) You missed the __assign_str portion (see below). Also let's just say "(null)" as it's the correct device name. "noop_queue" could be misleading. Thanks for the code review Pedro, I agree your advice. diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h index 1f4258308b96..f54e0b4dbcf4 100644 --- a/include/trace/events/qdisc.h +++ b/include/trace/events/qdisc.h @@ -81,14 +81,14 @@ TRACE_EVENT(qdisc_reset, TP_ARGS(q), TP_STRUCT__entry( - __string( dev, qdisc_dev(q)->name ) + __string( dev, qdisc_dev(q) ? qdisc_dev(q)->name : "(null)" ) __string( kind, q->ops->id ) __field( u32, parent ) __field( u32, handle ) ), It looks better to align the name with the current convention. Link: https://lore.kernel.org/linux-trace-kernel/2024011442.634192...@goodmis.org/ TP_fast_assign( - __assign_str(dev, qdisc_dev(q)->name); + __assign_str(dev, qdisc_dev(q) ? qdisc_dev(q)->name : "(null)"); __assign_str(kind, q->ops->id); __entry->parent = q->parent; __entry->handle = q->handle; The second part you mentioned, Steve recently worked on it and changed it. Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba...@rorschach.local.home/ Oh! If it hadn't, I don't think I would have been able to prevent the panic by just applying my patch. But you must be careful with the backports. In any case, perhaps send another patch to net-next updating the new conventions there and use the 'old convention' for the bug fix?
Re: [PATCH v3] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
Hi Pedro, On 6/25/24 12:12 오전, Pedro Tammela wrote: > On 22/06/2024 01:57, ysk...@gmail.com wrote: >> From: Yunseong Kim >> >> In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from >> >> qdisc->dev_queue->dev ->name >> >> [ 5301.595872] KASAN: null-ptr-deref in range >> [0x0130-0x0137] >> [ 5301.595877] Mem abort info: >> [ 5301.595881] ESR = 0x9606 >> [ 5301.595885] EC = 0x25: DABT (current EL), IL = 32 bits >> [ 5301.595889] SET = 0, FnV = 0 >> [ 5301.595893] EA = 0, S1PTW = 0 >> [ 5301.595896] FSC = 0x06: level 2 translation fault >> [ 5301.595900] Data abort info: >> [ 5301.595903] ISV = 0, ISS = 0x0006, ISS2 = 0x >> [ 5301.595907] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 >> [ 5301.595911] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 >> [ 5301.595915] [dfff8026] address between user and kernel >> address ranges >> [ 5301.595971] Internal error: Oops: 9606 [#1] SMP >> Link: >> https://lore.kernel.org/lkml/20240229143432.273b4...@gandalf.local.home/t/ >> Fixes: 51270d573a8d ("tracing/net_sched: Fix tracepoints that save >> qdisc_dev() as a string") >> Cc: net...@vger.kernel.org >> Cc: sta...@vger.kernel.org # +v6.7.10, +v6.8 >> Signed-off-by: Yunseong Kim >> Signed-off-by: Yeoreum Yun >> --- >> include/trace/events/qdisc.h | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h >> index f1b5e816e7e5..170b51fbe47a 100644 >> --- a/include/trace/events/qdisc.h >> +++ b/include/trace/events/qdisc.h >> @@ -81,7 +81,7 @@ TRACE_EVENT(qdisc_reset, >> TP_ARGS(q), >> TP_STRUCT__entry( >> - __string( dev, qdisc_dev(q)->name ) >> + __string(dev, qdisc_dev(q) ? qdisc_dev(q)->name : "noop_queue") >> __string( kind, q->ops->id ) >> __field( u32, parent ) >> __field( u32, handle ) > > You missed the __assign_str portion (see below). Also let's just say > "(null)" as it's the correct device name. "noop_queue" could be misleading. Thanks for the code review Pedro, I agree your advice. > diff --git a/include/trace/events/qdisc.h b/include/trace/events/qdisc.h > index 1f4258308b96..f54e0b4dbcf4 100644 > --- a/include/trace/events/qdisc.h > +++ b/include/trace/events/qdisc.h > @@ -81,14 +81,14 @@ TRACE_EVENT(qdisc_reset, > TP_ARGS(q), > > TP_STRUCT__entry( > - __string( dev, qdisc_dev(q)->name ) > + __string( dev, qdisc_dev(q) ? > qdisc_dev(q)->name : "(null)" ) > __string( kind, q->ops->id ) > __field( u32, parent ) > __field( u32, handle ) > ), It looks better to align the name with the current convention. Link: https://lore.kernel.org/linux-trace-kernel/2024011442.634192...@goodmis.org/ > TP_fast_assign( > - __assign_str(dev, qdisc_dev(q)->name); > + __assign_str(dev, qdisc_dev(q) ? qdisc_dev(q)->name : > "(null)"); > __assign_str(kind, q->ops->id); > __entry->parent = q->parent; > __entry->handle = q->handle; > > The second part you mentioned, Steve recently worked on it and changed it. Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba...@rorschach.local.home/ If it hadn't, I don't think I would have been able to prevent the panic by just applying my patch. Link: https://lore.kernel.org/all/e2f9da4e-919d-4a98-919d-167726ef6...@gmail.com/ Warm Regards, Yunseong Kim
Re: [PATCH v3] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
On 22/06/2024 01:57, ysk...@gmail.com wrote: From: Yunseong Kim In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from qdisc->dev_queue->dev ->name This situation simulated from bunch of veths and Bluetooth dis/reconnection. During qdisc initialization, qdisc was being set to noop_queue. In veth_init_queue, the initial tx_num was reduced back to one, causing the qdisc reset to be called with noop, which led to the kernel panic. I've attached the GitHub gist link that C converted syz-execprogram source code and 3 log of reproduced vmcore-dmesg. https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740 Yeoreum and I use two fuzzing tool simultaneously. One process with syz-executor : https://github.com/google/syzkaller $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \ -enable=none -collide=false log1 The other process with perf fuzzer: https://github.com/deater/perf_event_tests/tree/master/fuzzer $ perf_event_tests/fuzzer/perf_fuzzer I think this will happen on the kernel version. Linux kernel version +v6.7.10, +v6.8, +v6.9 and it could happen in v6.10. This occurred from 51270d573a8d. I think this patch is absolutely necessary. Previously, It was showing not intended string value of name. I've reproduced 3 time from my fedora 40 Debug Kernel with any other module or patched. version: 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug [ 5287.164555] veth0_vlan: left promiscuous mode [ 5287.164929] veth1_macvtap: left promiscuous mode [ 5287.164950] veth0_macvtap: left promiscuous mode [ 5287.164983] veth1_vlan: left promiscuous mode [ 5287.165008] veth0_vlan: left promiscuous mode [ 5287.165450] veth1_macvtap: left promiscuous mode [ 5287.165472] veth0_macvtap: left promiscuous mode [ 5287.165502] veth1_vlan: left promiscuous mode … [ 5297.598240] bridge0: port 2(bridge_slave_1) entered blocking state [ 5297.598262] bridge0: port 2(bridge_slave_1) entered forwarding state [ 5297.598296] bridge0: port 1(bridge_slave_0) entered blocking state [ 5297.598313] bridge0: port 1(bridge_slave_0) entered forwarding state [ 5297.616090] 8021q: adding VLAN 0 to HW filter on device bond0 [ 5297.620405] bridge0: port 1(bridge_slave_0) entered disabled state [ 5297.620730] bridge0: port 2(bridge_slave_1) entered disabled state [ 5297.627247] 8021q: adding VLAN 0 to HW filter on device team0 [ 5297.629636] bridge0: port 1(bridge_slave_0) entered blocking state … [ 5298.002798] bridge_slave_0: left promiscuous mode [ 5298.002869] bridge0: port 1(bridge_slave_0) entered disabled state [ 5298.309444] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface [ 5298.315206] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface [ 5298.320207] bond0 (unregistering): Released all slaves [ 5298.354296] hsr_slave_0: left promiscuous mode [ 5298.360750] hsr_slave_1: left promiscuous mode [ 5298.374889] veth1_macvtap: left promiscuous mode [ 5298.374931] veth0_macvtap: left promiscuous mode [ 5298.374988] veth1_vlan: left promiscuous mode [ 5298.375024] veth0_vlan: left promiscuous mode [ 5299.109741] team0 (unregistering): Port device team_slave_1 removed [ 5299.185870] team0 (unregistering): Port device team_slave_0 removed … [ 5300.155443] Bluetooth: hci3: unexpected cc 0x0c03 length: 249 > 1 [ 5300.155724] Bluetooth: hci3: unexpected cc 0x1003 length: 249 > 9 [ 5300.155988] Bluetooth: hci3: unexpected cc 0x1001 length: 249 > 9 …. [ 5301.075531] team0: Port device team_slave_1 added [ 5301.085515] bridge0: port 1(bridge_slave_0) entered blocking state [ 5301.085531] bridge0: port 1(bridge_slave_0) entered disabled state [ 5301.085588] bridge_slave_0: entered allmulticast mode [ 5301.085800] bridge_slave_0: entered promiscuous mode [ 5301.095617] bridge0: port 1(bridge_slave_0) entered blocking state [ 5301.095633] bridge0: port 1(bridge_slave_0) entered disabled state … [ 5301.149734] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.173234] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.180517] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link [ 5301.193481] hsr_slave_0: entered promiscuous mode [ 5301.204425] hsr_slave_1: entered promiscuous mode [ 5301.210172] debugfs: Directory 'hsr0' with parent 'hsr' already present! [ 5301.210185] Cannot create hsr debugfs directory [ 5301.224061] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link [ 5301.246901] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.255934] team0: Port device team_slave_0 added [ 5301.256480] team0: Port device team_slave_1 added [ 5301.256948] team0: Port device team_slave_0 added … [ 5301.435928] hsr_slave_0: entered promiscuous mode [ 5301.446029] hsr_slave_1: entered promiscuous mode [ 5301.455872] debugfs: Directory 'hsr0' with parent 'hsr' already present! [ 5301.455884] Cannot
Re: [PATCH v3] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
Hi, On 6/22/24 3:12 오후, Yunseong Kim wrote: > Hi Taehee, > > On 6/22/24 2:50 오후, Taehee Yoo wrote: >> On Sat, Jun 22, 2024 at 1:58 PM wrote: >>> >>> From: Yunseong Kim >>> >> >> Hi Yunseong, >> Thanks a lot for this work! > > Thank you Taehee for reviewing our patch. It's greatly appreciated. > >>> During qdisc initialization, qdisc was being set to noop_queue. >>> In veth_init_queue, the initial tx_num was reduced back to one, >>> causing the qdisc reset to be called with noop, which led to the kernel >>> panic. >>> >>> I've attached the GitHub gist link that C converted syz-execprogram >>> source code and 3 log of reproduced vmcore-dmesg. >>> >>> https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740 >>> >>> Yeoreum and I use two fuzzing tool simultaneously. >>> >>> One process with syz-executor : https://github.com/google/syzkaller >>> >>> $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \ >>> -enable=none -collide=false log1 >>> >>> The other process with perf fuzzer: >>> https://github.com/deater/perf_event_tests/tree/master/fuzzer >>> >>> $ perf_event_tests/fuzzer/perf_fuzzer >>> >>> I think this will happen on the kernel version. >>> >>> Linux kernel version +v6.7.10, +v6.8, +v6.9 and it could happen in v6.10. >>> >>> This occurred from 51270d573a8d. I think this patch is absolutely >>> necessary. Previously, It was showing not intended string value of name. > > >> I found a simple reproducer, please use the below command to test this patch. >> >> echo 1 > /sys/kernel/debug/tracing/events/enable >> ip link add veth0 type veth peer name veth1 After applying our patch, I didn't find any message or kernel panic errors. # echo 1 > /sys/kernel/debug/tracing/events/qdisc/qdisc_reset/enable # ip link add veth0 type veth peer name veth1 Error: Unknown device type. However, without our patch applied, I tested Tahee's command line on the upstream 6.10.0-rc3 kernel using the qdisc_reset event and the ip command. $ echo 1 > /sys/kernel/debug/tracing/events/qdisc/qdisc_reset/enable $ ip link add veth0 type veth peer name veth1 This make always kernel panic. Linux version: 6.10.0-rc3 [0.00] Linux version 6.10.0-rc3-00164-g44ef20baed8e-dirty (paran@fedora) (gcc (GCC) 14.1.1 20240522 (Red Hat 14.1.1-4), GNU ld version 2.41-34.fc40) #20 SMP PREEMPT Sat Jun 15 16:51:25 KST 2024 Kernel panic message: [ 615.236484] Internal error: Oops: 9605 [#1] PREEMPT SMP [ 615.237250] Dumping ftrace buffer: [ 615.237679](ftrace buffer empty) [ 615.238097] Modules linked in: veth crct10dif_ce virtio_gpu virtio_dma_buf drm_shmem_helper drm_kms_helper zynqmp_fpga xilinx_can xilinx_spi xilinx_selectmap xilinx_core xilinx_pr_decoupler versal_fpga uvcvideo uvc videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common mc usbnet deflate zstd ubifs ubi rcar_canfd rcar_can omap_mailbox ntb_msi_test ntb_hw_epf lattice_sysconfig_spi lattice_sysconfig ice40_spi gpio_xilinx dwmac_altr_socfpga mdio_regmap stmmac_platform stmmac pcs_xpcs dfl_fme_region dfl_fme_mgr dfl_fme_br dfl_afu dfl fpga_region fpga_bridge can can_dev br_netfilter bridge stp llc atl1c ath11k_pci mhi ath11k_ahb ath11k qmi_helpers ath10k_sdio ath10k_pci ath10k_core ath mac80211 libarc4 cfg80211 drm fuse backlight ipv6 Jun 22 02:36:5[3 6k152.62-4sm98k4-0k]v kCePUr:n e1l :P IUDn:a b4le6 8t oC ohmma: nidpl eN oketr nteali nptaedg i6n.g1 0re.0q-urecs3t- 0at0 1v6i4r-tgu4a4le fa2d0dbraeeds0se-dir tyd f#f2f08 615.252376] Hardware name: linux,dummy-virt (DT) [ 615.253220] pstate: 8045 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 615.254433] pc : strnlen+0x6c/0xe0 [ 615.255096] lr : trace_event_get_offsets_qdisc_reset+0x94/0x3d0 [ 615.256088] sp : 800080b269a0 [ 615.256615] x29: 800080b269a0 x28: c070f3f98500 x27: 0001 [ 615.257831] x26: 0010 x25: c070f3f98540 x24: c070f619cf60 [ 615.259020] x23: 0128 x22: 0138 x21: dfff8000 [ 615.260241] x20: c070f631ad00 x19: 0128 x18: c070f448b800 [ 615.261454] x17: x16: 0001 x15: c070f4ba2a90 [ 615.262635] x14: 700010164d73 x13: 180e1e8d5eb3 x12: 100010164d72 [ 615.263877] x11: 700010164d72 x10: dfff8000 x9 : c070e85d6184 [ 615.265047] x8 : c070e4402070 x7 : f1f1 x6 : 1504a6d3 [ 615.266336] x5 : 28ca21122140 x4 : c070f5043ea8 x3 : [ 615.267528] x2 : 0025 x1 : x0 : [ 615.268747] Call trace: [ 615.269180] strnlen+0x6c/0xe0 [ 615.269767] trace_event_get_offsets_qdisc_reset+0x94/0x3d0 [ 615.270716] trace_event_raw_event_qdisc_reset+0xe8/0x4e8 [ 615.271667] __traceiter_qdisc_reset+0xa0/0x140 [ 615.272499] qdisc_reset+0x554/0x848 [ 615.273134] netif_set_real_num_tx_queues+0x360/0x9a8 [ 615.274050] veth_init_queues+0x110/0x220 [veth] [ 615.275110]
Re: [PATCH v3] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
Hi Taehee, On 6/22/24 2:50 오후, Taehee Yoo wrote: > On Sat, Jun 22, 2024 at 1:58 PM wrote: >> >> From: Yunseong Kim >> > > Hi Yunseong, > Thanks a lot for this work! Thank you Taehee for reviewing our patch. It's greatly appreciated. >> During qdisc initialization, qdisc was being set to noop_queue. >> In veth_init_queue, the initial tx_num was reduced back to one, >> causing the qdisc reset to be called with noop, which led to the kernel >> panic. >> >> I've attached the GitHub gist link that C converted syz-execprogram >> source code and 3 log of reproduced vmcore-dmesg. >> >> https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740 >> >> Yeoreum and I use two fuzzing tool simultaneously. >> >> One process with syz-executor : https://github.com/google/syzkaller >> >> $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \ >> -enable=none -collide=false log1 >> >> The other process with perf fuzzer: >> https://github.com/deater/perf_event_tests/tree/master/fuzzer >> >> $ perf_event_tests/fuzzer/perf_fuzzer >> >> I think this will happen on the kernel version. >> >> Linux kernel version +v6.7.10, +v6.8, +v6.9 and it could happen in v6.10. >> >> This occurred from 51270d573a8d. I think this patch is absolutely >> necessary. Previously, It was showing not intended string value of name. > I found a simple reproducer, please use the below command to test this patch. > > echo 1 > /sys/kernel/debug/tracing/events/enable > ip link add veth0 type veth peer name veth1 The perf event is activated by perf_fuzzer, and it's indeed a similar environment with veth. > In my machine, the splat looks like: > > BUG: kernel NULL pointer dereference, address: 0130 > #PF: supervisor read access in kernel mode > #PF: error_code(0x) - not-present page > PGD 0 P4D 0 > Oops: Oops: [#1] PREEMPT SMP NOPTI > CPU: 1 PID: 1207 Comm: ip Not tainted 6.10.0-rc4+ #25 > 362ec22a686962a9936425abea9a73f03b445c0c > Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 > RIP: 0010:strlen+0x0/0x20 > Code: f7 75 ec 31 c0 c3 cc cc cc cc 48 89 f8 c3 cc cc cc cc 0f 1f 84 > 00 00 00 00 00 90 90 90 90 9c > RSP: 0018:bed8435c7630 EFLAGS: 00010206 > RAX: 92d629c0 RBX: a14100185c60 RCX: > RDX: 0001 RSI: 92d62840 RDI: 0130 > RBP: 92dc4600 R08: 0fd0 R09: 0010 > R10: 92c66c98 R11: 0001 R12: 0001 > R13: R14: 0130 R15: 92d62840 > FS: 7f6a94e50b80() GS:a1485f68() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 0130 CR3: 000103414000 CR4: 007506f0 > PKRU: 5554 > Call Trace: > > ? __die+0x20/0x70 > ? page_fault_oops+0x15a/0x460 > ? trace_event_raw_event_x86_exceptions+0x5f/0xa0 > ? exc_page_fault+0x6e/0x180 > ? asm_exc_page_fault+0x22/0x30 > ? __pfx_strlen+0x10/0x10 > trace_event_raw_event_qdisc_reset+0x4d/0x180 > ? synchronize_rcu_expedited+0x215/0x240 > ? __pfx_autoremove_wake_function+0x10/0x10 > qdisc_reset+0x130/0x150 > netif_set_real_num_tx_queues+0xe3/0x1e0 > veth_init_queues+0x44/0x70 [veth 24a9dd1cd1b1b279e1b467ad46d47a753799b428] > veth_newlink+0x22b/0x440 [veth 24a9dd1cd1b1b279e1b467ad46d47a753799b428] > __rtnl_newlink+0x718/0x990 > rtnl_newlink+0x44/0x70 > rtnetlink_rcv_msg+0x159/0x410 > ? kmalloc_reserve+0x90/0xf0 > ? trace_event_raw_event_kmem_cache_alloc+0x87/0xe0 > ? __pfx_rtnetlink_rcv_msg+0x10/0x10 > netlink_rcv_skb+0x54/0x100 > netlink_unicast+0x243/0x370 > netlink_sendmsg+0x1bb/0x3e0 > sys_sendmsg+0x2bb/0x320 > ? copy_msghdr_from_user+0x6d/0xa0 > ___sys_sendmsg+0x88/0xd0 > > Thanks a lot! > Taehee Yoo I think this bug might cause inconvenience for developers working on net devices driver in a virtual machine when they use tracing. I'm appreciate your effort in reproducing it. Warm Regards, Yunseong Kim
Re: [PATCH v3] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
On Sat, Jun 22, 2024 at 1:58 PM wrote: > > From: Yunseong Kim > Hi Yunseong, Thanks a lot for this work! > In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from > > qdisc->dev_queue->dev ->name > > This situation simulated from bunch of veths and Bluetooth dis/reconnection. > > During qdisc initialization, qdisc was being set to noop_queue. > In veth_init_queue, the initial tx_num was reduced back to one, > causing the qdisc reset to be called with noop, which led to the kernel panic. > > I've attached the GitHub gist link that C converted syz-execprogram > source code and 3 log of reproduced vmcore-dmesg. > > https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740 > > Yeoreum and I use two fuzzing tool simultaneously. > > One process with syz-executor : https://github.com/google/syzkaller > > $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \ > -enable=none -collide=false log1 > > The other process with perf fuzzer: > https://github.com/deater/perf_event_tests/tree/master/fuzzer > > $ perf_event_tests/fuzzer/perf_fuzzer > > I think this will happen on the kernel version. > > Linux kernel version +v6.7.10, +v6.8, +v6.9 and it could happen in v6.10. > > This occurred from 51270d573a8d. I think this patch is absolutely > necessary. Previously, It was showing not intended string value of name. > > I've reproduced 3 time from my fedora 40 Debug Kernel with any other module > or patched. > > version: 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug > > [ 5287.164555] veth0_vlan: left promiscuous mode > [ 5287.164929] veth1_macvtap: left promiscuous mode > [ 5287.164950] veth0_macvtap: left promiscuous mode > [ 5287.164983] veth1_vlan: left promiscuous mode > [ 5287.165008] veth0_vlan: left promiscuous mode > [ 5287.165450] veth1_macvtap: left promiscuous mode > [ 5287.165472] veth0_macvtap: left promiscuous mode > [ 5287.165502] veth1_vlan: left promiscuous mode > … > [ 5297.598240] bridge0: port 2(bridge_slave_1) entered blocking state > [ 5297.598262] bridge0: port 2(bridge_slave_1) entered forwarding state > [ 5297.598296] bridge0: port 1(bridge_slave_0) entered blocking state > [ 5297.598313] bridge0: port 1(bridge_slave_0) entered forwarding state > [ 5297.616090] 8021q: adding VLAN 0 to HW filter on device bond0 > [ 5297.620405] bridge0: port 1(bridge_slave_0) entered disabled state > [ 5297.620730] bridge0: port 2(bridge_slave_1) entered disabled state > [ 5297.627247] 8021q: adding VLAN 0 to HW filter on device team0 > [ 5297.629636] bridge0: port 1(bridge_slave_0) entered blocking state > … > [ 5298.002798] bridge_slave_0: left promiscuous mode > [ 5298.002869] bridge0: port 1(bridge_slave_0) entered disabled state > [ 5298.309444] bond0 (unregistering): (slave bond_slave_0): Releasing backup > interface > [ 5298.315206] bond0 (unregistering): (slave bond_slave_1): Releasing backup > interface > [ 5298.320207] bond0 (unregistering): Released all slaves > [ 5298.354296] hsr_slave_0: left promiscuous mode > [ 5298.360750] hsr_slave_1: left promiscuous mode > [ 5298.374889] veth1_macvtap: left promiscuous mode > [ 5298.374931] veth0_macvtap: left promiscuous mode > [ 5298.374988] veth1_vlan: left promiscuous mode > [ 5298.375024] veth0_vlan: left promiscuous mode > [ 5299.109741] team0 (unregistering): Port device team_slave_1 removed > [ 5299.185870] team0 (unregistering): Port device team_slave_0 removed > … > [ 5300.155443] Bluetooth: hci3: unexpected cc 0x0c03 length: 249 > 1 > [ 5300.155724] Bluetooth: hci3: unexpected cc 0x1003 length: 249 > 9 > [ 5300.155988] Bluetooth: hci3: unexpected cc 0x1001 length: 249 > 9 > …. > [ 5301.075531] team0: Port device team_slave_1 added > [ 5301.085515] bridge0: port 1(bridge_slave_0) entered blocking state > [ 5301.085531] bridge0: port 1(bridge_slave_0) entered disabled state > [ 5301.085588] bridge_slave_0: entered allmulticast mode > [ 5301.085800] bridge_slave_0: entered promiscuous mode > [ 5301.095617] bridge0: port 1(bridge_slave_0) entered blocking state > [ 5301.095633] bridge0: port 1(bridge_slave_0) entered disabled state > … > [ 5301.149734] bond0: (slave bond_slave_0): Enslaving as an active interface > with an up link > [ 5301.173234] bond0: (slave bond_slave_0): Enslaving as an active interface > with an up link > [ 5301.180517] bond0: (slave bond_slave_1): Enslaving as an active interface > with an up link > [ 5301.193481] hsr_slave_0: entered promiscuous mode > [ 5301.204425] hsr_slave_1: entered promiscuous mode > [ 5301.210172] debugfs: Directory 'hsr0' with parent 'hsr' already present! > [ 5301.210185] Cannot create hsr debugfs directory > [ 5301.224061] bond0: (slave bond_slave_1): Enslaving as an active interface > with an up link > [ 5301.246901] bond0: (slave bond_slave_0): Enslaving as an active interface > with an up link > [ 5301.255934] team0: Port device team_slave_0 added > [ 5301.256480] team0: Port device team_slave_1 added > [ 5301.256948] team0: Port device
[PATCH v3] tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
From: Yunseong Kim In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from qdisc->dev_queue->dev ->name This situation simulated from bunch of veths and Bluetooth dis/reconnection. During qdisc initialization, qdisc was being set to noop_queue. In veth_init_queue, the initial tx_num was reduced back to one, causing the qdisc reset to be called with noop, which led to the kernel panic. I've attached the GitHub gist link that C converted syz-execprogram source code and 3 log of reproduced vmcore-dmesg. https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740 Yeoreum and I use two fuzzing tool simultaneously. One process with syz-executor : https://github.com/google/syzkaller $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \ -enable=none -collide=false log1 The other process with perf fuzzer: https://github.com/deater/perf_event_tests/tree/master/fuzzer $ perf_event_tests/fuzzer/perf_fuzzer I think this will happen on the kernel version. Linux kernel version +v6.7.10, +v6.8, +v6.9 and it could happen in v6.10. This occurred from 51270d573a8d. I think this patch is absolutely necessary. Previously, It was showing not intended string value of name. I've reproduced 3 time from my fedora 40 Debug Kernel with any other module or patched. version: 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug [ 5287.164555] veth0_vlan: left promiscuous mode [ 5287.164929] veth1_macvtap: left promiscuous mode [ 5287.164950] veth0_macvtap: left promiscuous mode [ 5287.164983] veth1_vlan: left promiscuous mode [ 5287.165008] veth0_vlan: left promiscuous mode [ 5287.165450] veth1_macvtap: left promiscuous mode [ 5287.165472] veth0_macvtap: left promiscuous mode [ 5287.165502] veth1_vlan: left promiscuous mode … [ 5297.598240] bridge0: port 2(bridge_slave_1) entered blocking state [ 5297.598262] bridge0: port 2(bridge_slave_1) entered forwarding state [ 5297.598296] bridge0: port 1(bridge_slave_0) entered blocking state [ 5297.598313] bridge0: port 1(bridge_slave_0) entered forwarding state [ 5297.616090] 8021q: adding VLAN 0 to HW filter on device bond0 [ 5297.620405] bridge0: port 1(bridge_slave_0) entered disabled state [ 5297.620730] bridge0: port 2(bridge_slave_1) entered disabled state [ 5297.627247] 8021q: adding VLAN 0 to HW filter on device team0 [ 5297.629636] bridge0: port 1(bridge_slave_0) entered blocking state … [ 5298.002798] bridge_slave_0: left promiscuous mode [ 5298.002869] bridge0: port 1(bridge_slave_0) entered disabled state [ 5298.309444] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface [ 5298.315206] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface [ 5298.320207] bond0 (unregistering): Released all slaves [ 5298.354296] hsr_slave_0: left promiscuous mode [ 5298.360750] hsr_slave_1: left promiscuous mode [ 5298.374889] veth1_macvtap: left promiscuous mode [ 5298.374931] veth0_macvtap: left promiscuous mode [ 5298.374988] veth1_vlan: left promiscuous mode [ 5298.375024] veth0_vlan: left promiscuous mode [ 5299.109741] team0 (unregistering): Port device team_slave_1 removed [ 5299.185870] team0 (unregistering): Port device team_slave_0 removed … [ 5300.155443] Bluetooth: hci3: unexpected cc 0x0c03 length: 249 > 1 [ 5300.155724] Bluetooth: hci3: unexpected cc 0x1003 length: 249 > 9 [ 5300.155988] Bluetooth: hci3: unexpected cc 0x1001 length: 249 > 9 …. [ 5301.075531] team0: Port device team_slave_1 added [ 5301.085515] bridge0: port 1(bridge_slave_0) entered blocking state [ 5301.085531] bridge0: port 1(bridge_slave_0) entered disabled state [ 5301.085588] bridge_slave_0: entered allmulticast mode [ 5301.085800] bridge_slave_0: entered promiscuous mode [ 5301.095617] bridge0: port 1(bridge_slave_0) entered blocking state [ 5301.095633] bridge0: port 1(bridge_slave_0) entered disabled state … [ 5301.149734] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.173234] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.180517] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link [ 5301.193481] hsr_slave_0: entered promiscuous mode [ 5301.204425] hsr_slave_1: entered promiscuous mode [ 5301.210172] debugfs: Directory 'hsr0' with parent 'hsr' already present! [ 5301.210185] Cannot create hsr debugfs directory [ 5301.224061] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link [ 5301.246901] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.255934] team0: Port device team_slave_0 added [ 5301.256480] team0: Port device team_slave_1 added [ 5301.256948] team0: Port device team_slave_0 added … [ 5301.435928] hsr_slave_0: entered promiscuous mode [ 5301.446029] hsr_slave_1: entered promiscuous mode [ 5301.455872] debugfs: Directory 'hsr0' with parent 'hsr' already present! [ 5301.455884] Cannot create hsr debugfs directory [ 5301.502664]