Re: [PATCH 00/17] v3 net generic subsystem refcount conversions
On Mon, Jul 03, 2017 at 02:28:56AM -0700, Eric Dumazet wrote:
>On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote:
>> Changes in v3:
>> Rebased on top of the net-next tree.
>>
>> Changes in v2:
>> No changes in patches apart from rebases, but now by
>> default refcount_t = atomic_t (*) and uses all atomic standard operations
>> unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the
>> systems that are critical on performance (such as net) and cannot accept even
>> slight delay on the refcounter operations.
>>
>> This series, for core network subsystem components, replaces atomic_t reference
>> counters with the new refcount_t type and API (see include/linux/refcount.h).
>> By doing this we prevent intentional or accidental
>> underflows or overflows that can led to use-after-free vulnerabilities.
>> These patches contain only generic net pieces. Other changes will be sent
>> separately.
>>
>> The patches are fully independent and can be cherry-picked separately.
>> The big patches, such as conversions for sock structure, need a very detailed
>> look from maintainers: refcount managing is quite complex in them and while
>> it seems that they would benefit from the change, extra checking is needed.
>> The biggest corner issue is the fact that refcount_inc() does not increment
>> from zero.
>>
>> If there are no objections to the patches, please merge them via respective
>> trees.
>>
>> * The respective change is currently merged into -next as
>> "locking/refcount: Create unchecked atomic_t implementation".
>>
>> Elena Reshetova (17):
>>   net: convert inet_peer.refcnt from atomic_t to refcount_t
>>   net: convert neighbour.refcnt from atomic_t to refcount_t
>>   net: convert neigh_params.refcnt from atomic_t to refcount_t
>>   net: convert nf_bridge_info.use from atomic_t to refcount_t
>>   net: convert sk_buff.users from atomic_t to refcount_t
>>   net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
>>   net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
>>   net: convert sock.sk_refcnt from atomic_t to refcount_t
>>   net: convert ip_mc_list.refcnt from atomic_t to refcount_t
>>   net: convert in_device.refcnt from atomic_t to refcount_t
>>   net: convert netpoll_info.refcnt from atomic_t to refcount_t
>>   net: convert unix_address.refcnt from atomic_t to refcount_t
>>   net: convert fib_rule.refcnt from atomic_t to refcount_t
>>   net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
>>   net: convert net.passive from atomic_t to refcount_t
>>   net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
>>   net: convert packet_fanout.sk_ref from atomic_t to refcount_t
>
>
>Can you take a look at this please ?
>
>[ 64.601749] [ cut here ]
>[ 64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 refcount_sub_and_test+0x75/0xa0
>[ 64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
>[ 64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW 4.12.0-smp-DEV #274
>[ 64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
>[ 64.601771] task: 8837bf482040 task.stack: 8837bdc08000
>[ 64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0
>[ 64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286
>[ 64.601776] RAX: 0026 RBX: 0001 RCX:
>[ 64.601777] RDX: 0026 RSI: 0096 RDI: ed06f7b81eae
>[ 64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: fbfff4a54c25
>[ 64.601779] R10: cbc500e5 R11: a52a6128 R12: 881febcf6f24
>[ 64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: 8837d7a4ed00
>[ 64.601781] FS: 7ff5a2f6b700() GS:881fff80() knlGS:
>[ 64.601782] CS: 0010 DS: ES: CR0: 80050033
>[ 64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: 001406f0
>[ 64.601783] Call Trace:
>[ 64.601786] refcount_dec_and_test+0x11/0x20
>[ 64.601790] fib_nl_delrule+0xc39/0x1630

[snip]

I'm seeing a similar one coming from sctp:

refcount_t: underflow; use-after-free.
[ cut here ]
WARNING: CPU: 3 PID: 15570 at lib/refcount.c:186 refcount_sub_and_test.cold.13+0x18/0x21 lib/refcount.c:186
Kernel panic - not syncing: panic_on_warn set ...

CPU: 3 PID: 15570 Comm: syz-executor0 Not tainted 4.12.0-next-20170706+ #186
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x11d/0x1ef lib/dump_stack.c:52
 panic+0x1bc/0x3ad kernel/panic.c:180
 __warn.cold.6+0x2f/0x2f kernel/panic.c:541
 report_bug+0x20d/0x2d0 lib/bug.c:183
 fixup_bug+0x3f/0x90 arch/x86/kernel/traps.c:190
 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
 do_trap+0x132/0x390 arch/x86/kernel/traps.c:273
 do_error_trap+0x133/0x38
Re: [PATCH 00/17] v3 net generic subsystem refcount conversions
On Mon, 2017-07-03 at 09:57 +, Reshetova, Elena wrote:

> Thank you very much for the report! This is an underflow (dec/sub from
> zero) that is reported by WARNING.
> I guess it is unlikely that actual code underflows, so the most
> probable cause is that it attempted to do refcount_inc/add() from
> zero, but then failed.
> However in that case you should have seen another warning on
> refcount_inc() somewhere earlier. That one is actually the one I need
> to see to track the root cause.
> Could you tell me how do you arrive to the below output? Boot in what
> config/etc.
> I can try to reproduce to debug further.

I sent this fix:

https://patchwork.ozlabs.org/patch/783389/

Thanks.
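
For readers following along, the failure mode discussed above (an increment that is refused because the counter is already zero, followed by a decrement that then underflows) can be modelled in userspace. The sketch below is only an illustration under stated assumptions: struct model_refcount, model_refcount_inc() and model_refcount_dec_and_test() are hypothetical names, the code uses C11 atomics rather than kernel primitives, it leaves out saturation handling, and it is not the code in lib/refcount.c.

```c
/* Userspace model of checked refcount behaviour. NOT the kernel code. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's refcount_t. */
struct model_refcount {
	atomic_uint refs;
};

/*
 * Take a reference. A counter that already reached zero belongs to an
 * object that may have been freed, so the increment is refused and
 * reported instead of resurrecting the object.
 */
static bool model_refcount_inc(struct model_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0) {
			fprintf(stderr, "model: increment on 0; use-after-free.\n");
			return false;
		}
		/* On failure the CAS reloads 'old', and the zero check repeats. */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

	return true;
}

/*
 * Drop a reference. Returns true only when the caller dropped the last
 * reference. A decrement from zero is reported as an underflow and the
 * counter is left untouched.
 */
static bool model_refcount_dec_and_test(struct model_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0) {
			fprintf(stderr, "model: underflow; use-after-free.\n");
			return false;
		}
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));

	return old == 1;
}
```

In this model, a path that drops the last reference, then tries to take one and ignores the refusal, and finally drops "its" reference again produces two reports in order: first the refused increment, then the underflow. That ordering is why the earlier warning is the more useful one when looking for the root cause.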
RE: [PATCH 00/17] v3 net generic subsystem refcount conversions
> On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote:
> > Changes in v3:
> > Rebased on top of the net-next tree.
> >
> > Changes in v2:
> > No changes in patches apart from rebases, but now by
> > default refcount_t = atomic_t (*) and uses all atomic standard operations
> > unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the
> > systems that are critical on performance (such as net) and cannot accept
> > even slight delay on the refcounter operations.
> >
> > This series, for core network subsystem components, replaces atomic_t
> > reference counters with the new refcount_t type and API (see
> > include/linux/refcount.h).
> > By doing this we prevent intentional or accidental
> > underflows or overflows that can led to use-after-free vulnerabilities.
> > These patches contain only generic net pieces. Other changes will be sent
> > separately.
> >
> > The patches are fully independent and can be cherry-picked separately.
> > The big patches, such as conversions for sock structure, need a very
> > detailed look from maintainers: refcount managing is quite complex in
> > them and while it seems that they would benefit from the change, extra
> > checking is needed.
> > The biggest corner issue is the fact that refcount_inc() does not increment
> > from zero.
> >
> > If there are no objections to the patches, please merge them via respective
> > trees.
> >
> > * The respective change is currently merged into -next as
> > "locking/refcount: Create unchecked atomic_t implementation".
> >
> > Elena Reshetova (17):
> >   net: convert inet_peer.refcnt from atomic_t to refcount_t
> >   net: convert neighbour.refcnt from atomic_t to refcount_t
> >   net: convert neigh_params.refcnt from atomic_t to refcount_t
> >   net: convert nf_bridge_info.use from atomic_t to refcount_t
> >   net: convert sk_buff.users from atomic_t to refcount_t
> >   net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
> >   net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
> >   net: convert sock.sk_refcnt from atomic_t to refcount_t
> >   net: convert ip_mc_list.refcnt from atomic_t to refcount_t
> >   net: convert in_device.refcnt from atomic_t to refcount_t
> >   net: convert netpoll_info.refcnt from atomic_t to refcount_t
> >   net: convert unix_address.refcnt from atomic_t to refcount_t
> >   net: convert fib_rule.refcnt from atomic_t to refcount_t
> >   net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
> >   net: convert net.passive from atomic_t to refcount_t
> >   net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
> >   net: convert packet_fanout.sk_ref from atomic_t to refcount_t
> >
> Can you take a look at this please ?
>
> Thanks.

Thank you very much for the report! This is an underflow (dec/sub from zero)
that is reported by WARNING.
I guess it is unlikely that actual code underflows, so the most probable cause
is that it attempted to do refcount_inc/add() from zero, but then failed.
However in that case you should have seen another warning on refcount_inc()
somewhere earlier. That one is actually the one I need to see to track the
root cause.
Could you tell me how do you arrive to the below output? Boot in what
config/etc.
I can try to reproduce to debug further.

Best Regards,
Elena

>
> [ 64.601749] [ cut here ]
> [ 64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 refcount_sub_and_test+0x75/0xa0
> [ 64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
> [ 64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW 4.12.0-smp-DEV #274
> [ 64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
> [ 64.601771] task: 8837bf482040 task.stack: 8837bdc08000
> [ 64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0
> [ 64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286
> [ 64.601776] RAX: 0026 RBX: 0001 RCX:
> [ 64.601777] RDX: 0026 RSI: 0096 RDI: ed06f7b81eae
> [ 64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: fbfff4a54c25
> [ 64.601779] R10: cbc500e5 R11: a52a6128 R12: 881febcf6f24
> [ 64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: 8837d7a4ed00
> [ 64.601781] FS: 7ff5a2f6b700() GS:881fff80() knlGS:
> [ 64.601782] CS: 0010 DS: ES: CR0: 80050033
> [ 64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: 001406f0
> [ 64.601783] Call Trace:
> [ 64.601786] refcount_dec_and_test+0x11/0x20
> [ 64.601790] fib_nl_delrule+0xc39/0x1630
> [ 64.601793] ? is_bpf_text_address+0xe/0x20
> [ 64.601795] ? fib_nl_newrule+0x25e0/0x25e0
> [ 64.601798] ? depot_save_stack+0x133/0x470
> [ 64.601801] ? ns_capable+0x13/0x20
> [ 64.601803] ? __netlink_ns_capable+
Re: [PATCH 00/17] v3 net generic subsystem refcount conversions
On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote:
> Changes in v3:
> Rebased on top of the net-next tree.
>
> Changes in v2:
> No changes in patches apart from rebases, but now by
> default refcount_t = atomic_t (*) and uses all atomic standard operations
> unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the
> systems that are critical on performance (such as net) and cannot accept even
> slight delay on the refcounter operations.
>
> This series, for core network subsystem components, replaces atomic_t reference
> counters with the new refcount_t type and API (see include/linux/refcount.h).
> By doing this we prevent intentional or accidental
> underflows or overflows that can led to use-after-free vulnerabilities.
> These patches contain only generic net pieces. Other changes will be sent
> separately.
>
> The patches are fully independent and can be cherry-picked separately.
> The big patches, such as conversions for sock structure, need a very detailed
> look from maintainers: refcount managing is quite complex in them and while
> it seems that they would benefit from the change, extra checking is needed.
> The biggest corner issue is the fact that refcount_inc() does not increment
> from zero.
>
> If there are no objections to the patches, please merge them via respective
> trees.
>
> * The respective change is currently merged into -next as
> "locking/refcount: Create unchecked atomic_t implementation".
>
> Elena Reshetova (17):
>   net: convert inet_peer.refcnt from atomic_t to refcount_t
>   net: convert neighbour.refcnt from atomic_t to refcount_t
>   net: convert neigh_params.refcnt from atomic_t to refcount_t
>   net: convert nf_bridge_info.use from atomic_t to refcount_t
>   net: convert sk_buff.users from atomic_t to refcount_t
>   net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
>   net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
>   net: convert sock.sk_refcnt from atomic_t to refcount_t
>   net: convert ip_mc_list.refcnt from atomic_t to refcount_t
>   net: convert in_device.refcnt from atomic_t to refcount_t
>   net: convert netpoll_info.refcnt from atomic_t to refcount_t
>   net: convert unix_address.refcnt from atomic_t to refcount_t
>   net: convert fib_rule.refcnt from atomic_t to refcount_t
>   net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
>   net: convert net.passive from atomic_t to refcount_t
>   net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
>   net: convert packet_fanout.sk_ref from atomic_t to refcount_t

Can you take a look at this please ?

Thanks.

[ 64.601749] [ cut here ]
[ 64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 refcount_sub_and_test+0x75/0xa0
[ 64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
[ 64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW 4.12.0-smp-DEV #274
[ 64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
[ 64.601771] task: 8837bf482040 task.stack: 8837bdc08000
[ 64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0
[ 64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286
[ 64.601776] RAX: 0026 RBX: 0001 RCX:
[ 64.601777] RDX: 0026 RSI: 0096 RDI: ed06f7b81eae
[ 64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: fbfff4a54c25
[ 64.601779] R10: cbc500e5 R11: a52a6128 R12: 881febcf6f24
[ 64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: 8837d7a4ed00
[ 64.601781] FS: 7ff5a2f6b700() GS:881fff80() knlGS:
[ 64.601782] CS: 0010 DS: ES: CR0: 80050033
[ 64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: 001406f0
[ 64.601783] Call Trace:
[ 64.601786] refcount_dec_and_test+0x11/0x20
[ 64.601790] fib_nl_delrule+0xc39/0x1630
[ 64.601793] ? is_bpf_text_address+0xe/0x20
[ 64.601795] ? fib_nl_newrule+0x25e0/0x25e0
[ 64.601798] ? depot_save_stack+0x133/0x470
[ 64.601801] ? ns_capable+0x13/0x20
[ 64.601803] ? __netlink_ns_capable+0xcc/0x100
[ 64.601806] rtnetlink_rcv_msg+0x23a/0x6a0
[ 64.601808] ? rtnl_newlink+0x1630/0x1630
[ 64.601811] ? memset+0x31/0x40
[ 64.601813] netlink_rcv_skb+0x2d7/0x440
[ 64.601815] ? rtnl_newlink+0x1630/0x1630
[ 64.601816] ? netlink_ack+0xaf0/0xaf0
[ 64.601818] ? kasan_unpoison_shadow+0x35/0x50
[ 64.601820] ? __kmalloc_node_track_caller+0x4c/0x70
[ 64.601821] rtnetlink_rcv+0x28/0x30
[ 64.601823] netlink_unicast+0x422/0x610
[ 64.601824] ? netlink_attachskb+0x650/0x650
[ 64.601826] netlink_sendmsg+0x7b7/0xb60
[ 64.601828] ? netlink_unicast+0x610/0x610
[ 64.601830] ? netlink_unicast+0x610/0x610
[ 64.601832] sock_sendmsg+0xba/0xf0
[ 64.601834] ___sys_sendmsg+0x6a9/0x8c0
[ 64.601835] ? copy_msghdr_from_user+0x52