** Description changed:

  BugLink: https://bugs.launchpad.net/bugs/2073092

  [Impact]
  Hit conntrack refcount use-after-free issue:

  refcount_t: addition on 0; use-after-free.
  Call Trace:
  <IRQ>
  ? show_regs+0x6d/0x80
  ? __warn+0x89/0x160
  ? refcount_warn_saturate+0x12e/0x150
  ? report_bug+0x17e/0x1b0
  ? handle_bug+0x46/0x90
  ? exc_invalid_op+0x18/0x80
  ? asm_exc_invalid_op+0x1b/0x20
  ? refcount_warn_saturate+0x12e/0x150
  flow_offload_alloc+0xe5/0xf0 [nf_flow_table]
  tcf_ct_flow_table_process_conn+0xc2/0x1e0 [act_ct]
  tcf_ct_act+0x6c8/0xaa0 [act_ct]
  tcf_action_exec+0xbc/0x1a0
  fl_classify+0x1f8/0x200 [cls_flower]
  __tcf_classify+0x169/0x200
  tcf_classify+0xff/0x250
  sch_handle_ingress.constprop.0+0x11f/0x290
  ? srso_alias_return_thunk+0x5/0x7f
  __netif_receive_skb_core.constprop.0+0x60b/0xd70
  ? __udp4_lib_lookup+0x25f/0x2a0
  __netif_receive_skb_list_core+0xfd/0x250
  netif_receive_skb_list_internal+0x1a3/0x2d0
  ? srso_alias_return_thunk+0x5/0x7f
  ? dev_gro_receive+0x196/0x350
  napi_complete_done+0x74/0x1c0
  gro_cell_poll+0x7c/0xb0
  __napi_poll+0x33/0x1f0
  net_rx_action+0x181/0x2e0
  __do_softirq+0xdc/0x349
  ? srso_alias_return_thunk+0x5/0x7f
  ? handle_irq_event+0x52/0x80
  ? handle_edge_irq+0xda/0x250
  __irq_exit_rcu+0x75/0xa0
  irq_exit_rcu+0xe/0x20
  common_interrupt+0xa4/0xb0
  </IRQ>
  <TASK>

  [Fix]
  I enabled KASAN and got:

  BUG: KASAN: slab-use-after-free in tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
  Read of size 1 at addr ffff888c07603600 by task handler130/6469

  Call Trace:
  <IRQ>
  dump_stack_lvl+0x48/0x70
  print_address_description.constprop.0+0x33/0x3d0
  print_report+0xc0/0x2b0
  kasan_report+0xd0/0x120
  __asan_load1+0x6c/0x80
  tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
  tcf_ct_act+0x886/0x1350 [act_ct]
  tcf_action_exec+0xf8/0x1f0
  fl_classify+0x355/0x360 [cls_flower]
  __tcf_classify+0x1fd/0x330
  tcf_classify+0x21c/0x3c0
  sch_handle_ingress.constprop.0+0x2c5/0x500
  __netif_receive_skb_core.constprop.0+0xb25/0x1510
  __netif_receive_skb_list_core+0x220/0x4c0
  netif_receive_skb_list_internal+0x446/0x620
  napi_complete_done+0x157/0x3d0
  gro_cell_poll+0xcf/0x100
  __napi_poll+0x65/0x310
  net_rx_action+0x30c/0x5c0
  __do_softirq+0x14f/0x491
  __irq_exit_rcu+0x82/0xc0
  irq_exit_rcu+0xe/0x20
  common_interrupt+0xa1/0xb0
  </IRQ>

  Allocated by task 6469:
  kasan_save_stack+0x38/0x70
  kasan_set_track+0x25/0x40
  kasan_save_alloc_info+0x1e/0x40
  __kasan_krealloc+0x133/0x190
  krealloc+0xaa/0x130
  nf_ct_ext_add+0xed/0x230 [nf_conntrack]
  tcf_ct_act+0x1095/0x1350 [act_ct]
  tcf_action_exec+0xf8/0x1f0
  fl_classify+0x355/0x360 [cls_flower]
  __tcf_classify+0x1fd/0x330
  tcf_classify+0x21c/0x3c0
  sch_handle_ingress.constprop.0+0x2c5/0x500
  __netif_receive_skb_core.constprop.0+0xb25/0x1510
  __netif_receive_skb_list_core+0x220/0x4c0
  netif_receive_skb_list_internal+0x446/0x620
  napi_complete_done+0x157/0x3d0
  gro_cell_poll+0xcf/0x100
  __napi_poll+0x65/0x310
  net_rx_action+0x30c/0x5c0
  __do_softirq+0x14f/0x491

  Freed by task 6469:
  kasan_save_stack+0x38/0x70
  kasan_set_track+0x25/0x40
  kasan_save_free_info+0x2b/0x60
  ____kasan_slab_free+0x180/0x1f0
  __kasan_slab_free+0x12/0x30
  slab_free_freelist_hook+0xd2/0x1a0
  __kmem_cache_free+0x1a2/0x2f0
  kfree+0x78/0x120
  nf_conntrack_free+0x74/0x130 [nf_conntrack]
  nf_ct_destroy+0xb2/0x140 [nf_conntrack]
  __nf_ct_resolve_clash+0x529/0x5d0 [nf_conntrack]
  nf_ct_resolve_clash+0xf6/0x490 [nf_conntrack]
  __nf_conntrack_confirm+0x2c6/0x770 [nf_conntrack]
  tcf_ct_act+0x12ad/0x1350 [act_ct]
  tcf_action_exec+0xf8/0x1f0
  fl_classify+0x355/0x360 [cls_flower]
  __tcf_classify+0x1fd/0x330
  tcf_classify+0x21c/0x3c0
  sch_handle_ingress.constprop.0+0x2c5/0x500
  __netif_receive_skb_core.constprop.0+0xb25/0x1510
  __netif_receive_skb_list_core+0x220/0x4c0
  netif_receive_skb_list_internal+0x446/0x620
  napi_complete_done+0x157/0x3d0
  gro_cell_poll+0xcf/0x100
  __napi_poll+0x65/0x310
  net_rx_action+0x30c/0x5c0
  __do_softirq+0x14f/0x491

  When a clash is resolved, the duplicate conntrack is freed, but tcf_ct_act keeps using the freed conntrack instead of the correct one.

- We sent a patch to netdev to fix it and got merged:
- https://patchwork.kernel.org/project/netdevbpf/patch/20240710053747.13223-1-chengen...@canonical.com/
+ We sent a patch to upstream and got merged:
+ commit 26488172b0292bed837b95a006a3f3431d1898c3
+ Author: Chengen Du <chengen...@canonical.com>
+ Date: Wed Jul 10 13:37:47 2024 +0800
+
+ net/sched: Fix UAF when resolving a clash

  Cherry-pick this commit to fix the conntrack slab use-after-free issue.

  [Testcase]
- Built a test kernel and verified on our environment.
+ Built a test kernel and verified on our environment which is constantly hitting this issue.

  [Where problems could occur]
- This patch ensure when a clash happens and the duplicated conntrack is freed,
+ This patch ensure when a clash happens and the duplicated conntrack is freed,
  call nf_ct_get to get the correct conntrack, the freed conntrack won't be used and the rest of code path will follow the original path.
  This won't cause other issues.
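
The pattern described above (a pointer cached before confirmation goes stale once clash resolution frees the duplicate, so it has to be fetched again) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code or the merged patch: struct conn, struct packet, packet_get_conn, confirm_conn and process_packet are hypothetical stand-ins for the conntrack entry, the skb, nf_ct_get, the confirm step and tcf_ct_act respectively.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the conntrack entry and the skb. */
struct conn { int id; };
struct packet { struct conn *ct; };

/* Models nf_ct_get(): return the entry currently attached to the packet. */
static struct conn *packet_get_conn(const struct packet *pkt)
{
	return pkt->ct;
}

/*
 * Models confirmation with clash resolution: if another entry already
 * won the race, the packet's duplicate is freed and the packet is
 * re-pointed at the winner. Any pointer cached by the caller before
 * this call may now refer to freed memory.
 */
static void confirm_conn(struct packet *pkt, struct conn *winner)
{
	if (winner && winner != pkt->ct) {
		free(pkt->ct);
		pkt->ct = winner;
	}
}

/* Models the caller: returns the id of the entry it ends up using. */
int process_packet(struct packet *pkt, struct conn *winner)
{
	struct conn *ct;

	assert(pkt != NULL);
	ct = packet_get_conn(pkt);
	if (!ct)
		return -1;

	confirm_conn(pkt, winner);

	/* Do not keep using the old ct here: fetch it from the packet again. */
	ct = packet_get_conn(pkt);
	if (!ct)
		return -1; /* defensive check, an assumption of this sketch */

	return ct->id;
}
```

Calling process_packet with a packet that carries a duplicate entry and a non-NULL winner returns the winner's id; without the second packet_get_conn call the function would read the freed duplicate, which is the kind of access KASAN reports above.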
** Description changed:

  BugLink: https://bugs.launchpad.net/bugs/2073092

  [Impact]
  Hit conntrack refcount use-after-free issue:

  refcount_t: addition on 0; use-after-free.
  Call Trace:
  <IRQ>
  ? show_regs+0x6d/0x80
  ? __warn+0x89/0x160
  ? refcount_warn_saturate+0x12e/0x150
  ? report_bug+0x17e/0x1b0
  ? handle_bug+0x46/0x90
  ? exc_invalid_op+0x18/0x80
  ? asm_exc_invalid_op+0x1b/0x20
  ? refcount_warn_saturate+0x12e/0x150
  flow_offload_alloc+0xe5/0xf0 [nf_flow_table]
  tcf_ct_flow_table_process_conn+0xc2/0x1e0 [act_ct]
  tcf_ct_act+0x6c8/0xaa0 [act_ct]
  tcf_action_exec+0xbc/0x1a0
  fl_classify+0x1f8/0x200 [cls_flower]
  __tcf_classify+0x169/0x200
  tcf_classify+0xff/0x250
  sch_handle_ingress.constprop.0+0x11f/0x290
  ? srso_alias_return_thunk+0x5/0x7f
  __netif_receive_skb_core.constprop.0+0x60b/0xd70
  ? __udp4_lib_lookup+0x25f/0x2a0
  __netif_receive_skb_list_core+0xfd/0x250
  netif_receive_skb_list_internal+0x1a3/0x2d0
  ? srso_alias_return_thunk+0x5/0x7f
  ? dev_gro_receive+0x196/0x350
  napi_complete_done+0x74/0x1c0
  gro_cell_poll+0x7c/0xb0
  __napi_poll+0x33/0x1f0
  net_rx_action+0x181/0x2e0
  __do_softirq+0xdc/0x349
  ? srso_alias_return_thunk+0x5/0x7f
  ? handle_irq_event+0x52/0x80
  ? handle_edge_irq+0xda/0x250
  __irq_exit_rcu+0x75/0xa0
  irq_exit_rcu+0xe/0x20
  common_interrupt+0xa4/0xb0
  </IRQ>
  <TASK>

  [Fix]
  I enabled KASAN and got:

  BUG: KASAN: slab-use-after-free in tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
  Read of size 1 at addr ffff888c07603600 by task handler130/6469

  Call Trace:
  <IRQ>
  dump_stack_lvl+0x48/0x70
  print_address_description.constprop.0+0x33/0x3d0
  print_report+0xc0/0x2b0
  kasan_report+0xd0/0x120
  __asan_load1+0x6c/0x80
  tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
  tcf_ct_act+0x886/0x1350 [act_ct]
  tcf_action_exec+0xf8/0x1f0
  fl_classify+0x355/0x360 [cls_flower]
  __tcf_classify+0x1fd/0x330
  tcf_classify+0x21c/0x3c0
  sch_handle_ingress.constprop.0+0x2c5/0x500
  __netif_receive_skb_core.constprop.0+0xb25/0x1510
  __netif_receive_skb_list_core+0x220/0x4c0
  netif_receive_skb_list_internal+0x446/0x620
  napi_complete_done+0x157/0x3d0
  gro_cell_poll+0xcf/0x100
  __napi_poll+0x65/0x310
  net_rx_action+0x30c/0x5c0
  __do_softirq+0x14f/0x491
  __irq_exit_rcu+0x82/0xc0
  irq_exit_rcu+0xe/0x20
  common_interrupt+0xa1/0xb0
  </IRQ>

  Allocated by task 6469:
  kasan_save_stack+0x38/0x70
  kasan_set_track+0x25/0x40
  kasan_save_alloc_info+0x1e/0x40
  __kasan_krealloc+0x133/0x190
  krealloc+0xaa/0x130
  nf_ct_ext_add+0xed/0x230 [nf_conntrack]
  tcf_ct_act+0x1095/0x1350 [act_ct]
  tcf_action_exec+0xf8/0x1f0
  fl_classify+0x355/0x360 [cls_flower]
  __tcf_classify+0x1fd/0x330
  tcf_classify+0x21c/0x3c0
  sch_handle_ingress.constprop.0+0x2c5/0x500
  __netif_receive_skb_core.constprop.0+0xb25/0x1510
  __netif_receive_skb_list_core+0x220/0x4c0
  netif_receive_skb_list_internal+0x446/0x620
  napi_complete_done+0x157/0x3d0
  gro_cell_poll+0xcf/0x100
  __napi_poll+0x65/0x310
  net_rx_action+0x30c/0x5c0
  __do_softirq+0x14f/0x491

  Freed by task 6469:
  kasan_save_stack+0x38/0x70
  kasan_set_track+0x25/0x40
  kasan_save_free_info+0x2b/0x60
  ____kasan_slab_free+0x180/0x1f0
  __kasan_slab_free+0x12/0x30
  slab_free_freelist_hook+0xd2/0x1a0
  __kmem_cache_free+0x1a2/0x2f0
  kfree+0x78/0x120
  nf_conntrack_free+0x74/0x130 [nf_conntrack]
  nf_ct_destroy+0xb2/0x140 [nf_conntrack]
  __nf_ct_resolve_clash+0x529/0x5d0 [nf_conntrack]
  nf_ct_resolve_clash+0xf6/0x490 [nf_conntrack]
  __nf_conntrack_confirm+0x2c6/0x770 [nf_conntrack]
  tcf_ct_act+0x12ad/0x1350 [act_ct]
  tcf_action_exec+0xf8/0x1f0
  fl_classify+0x355/0x360 [cls_flower]
  __tcf_classify+0x1fd/0x330
  tcf_classify+0x21c/0x3c0
  sch_handle_ingress.constprop.0+0x2c5/0x500
  __netif_receive_skb_core.constprop.0+0xb25/0x1510
  __netif_receive_skb_list_core+0x220/0x4c0
  netif_receive_skb_list_internal+0x446/0x620
  napi_complete_done+0x157/0x3d0
  gro_cell_poll+0xcf/0x100
  __napi_poll+0x65/0x310
  net_rx_action+0x30c/0x5c0
  __do_softirq+0x14f/0x491

  When a clash is resolved, the duplicate conntrack is freed, but tcf_ct_act keeps using the freed conntrack instead of the correct one.

- We sent a patch to upstream and got merged:
+ We sent a patch to upstream to fix it and got merged:
  commit 26488172b0292bed837b95a006a3f3431d1898c3
  Author: Chengen Du <chengen...@canonical.com>
  Date: Wed Jul 10 13:37:47 2024 +0800

- net/sched: Fix UAF when resolving a clash
+ net/sched: Fix UAF when resolving a clash

  Cherry-pick this commit to fix the conntrack slab use-after-free issue.

  [Testcase]
  Built a test kernel and verified it in our environment, which was constantly hitting this issue.

  [Where problems could occur]
  This patch ensures that when a clash happens and the duplicate conntrack is freed, nf_ct_get is called to get the correct conntrack, so the freed conntrack is not used and the rest of the code follows the original path.
  This won't cause other issues.

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2073092

Title:
  net/sched: Fix conntrack use-after-free

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2073092/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs