RE: [PATCH 00/17] v3 net generic subsystem refcount conversions

2017-07-10 Thread Reshetova, Elena
> On Mon, Jul 03, 2017 at 02:28:56AM -0700, Eric Dumazet wrote:
> >On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote:
> >> Changes in v3:
> >> Rebased on top of the net-next tree.
> >>
> >> Changes in v2:
> >> No changes in patches apart from rebases, but now by
> >> default refcount_t = atomic_t (*) and uses all atomic standard operations
> >> unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the
> >> systems that are critical on performance (such as net) and cannot accept
> >> even slight delay on the refcounter operations.
> >>
> >> This series, for core network subsystem components, replaces atomic_t
> >> reference counters with the new refcount_t type and API (see
> >> include/linux/refcount.h).
> >> By doing this we prevent intentional or accidental
> >> underflows or overflows that can lead to use-after-free vulnerabilities.
> >> These patches contain only generic net pieces. Other changes will be sent
> >> separately.
> >>
> >> The patches are fully independent and can be cherry-picked separately.
> >> The big patches, such as conversions for sock structure, need a very
> >> detailed look from maintainers: refcount managing is quite complex in
> >> them and while it seems that they would benefit from the change, extra
> >> checking is needed.
> >> The biggest corner issue is the fact that refcount_inc() does not increment
> >> from zero.
> >>
> >> If there are no objections to the patches, please merge them via respective
> >> trees.
> >>
> >> * The respective change is currently merged into -next as
> >>   "locking/refcount: Create unchecked atomic_t implementation".
> >>
> >> Elena Reshetova (17):
> >>   net: convert inet_peer.refcnt from atomic_t to refcount_t
> >>   net: convert neighbour.refcnt from atomic_t to refcount_t
> >>   net: convert neigh_params.refcnt from atomic_t to refcount_t
> >>   net: convert nf_bridge_info.use from atomic_t to refcount_t
> >>   net: convert sk_buff.users from atomic_t to refcount_t
> >>   net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
> >>   net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
> >>   net: convert sock.sk_refcnt from atomic_t to refcount_t
> >>   net: convert ip_mc_list.refcnt from atomic_t to refcount_t
> >>   net: convert in_device.refcnt from atomic_t to refcount_t
> >>   net: convert netpoll_info.refcnt from atomic_t to refcount_t
> >>   net: convert unix_address.refcnt from atomic_t to refcount_t
> >>   net: convert fib_rule.refcnt from atomic_t to refcount_t
> >>   net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
> >>   net: convert net.passive from atomic_t to refcount_t
> >>   net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
> >>   net: convert packet_fanout.sk_ref from atomic_t to refcount_t
> >
> >
> >Can you take a look at this please ?
> >
> >[   64.601749] [ cut here ]
> >[   64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 refcount_sub_and_test+0x75/0xa0
> >[   64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
> >[   64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW   4.12.0-smp-DEV #274
> >[   64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
> >[   64.601771] task: 8837bf482040 task.stack: 8837bdc08000
> >[   64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0
> >[   64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286
> >[   64.601776] RAX: 0026 RBX: 0001 RCX:
> >[   64.601777] RDX: 0026 RSI: 0096 RDI: ed06f7b81eae
> >[   64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: fbfff4a54c25
> >[   64.601779] R10: cbc500e5 R11: a52a6128 R12: 881febcf6f24
> >[   64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: 8837d7a4ed00
> >[   64.601781] FS:  7ff5a2f6b700() GS:881fff80() knlGS:
> >[   64.601782] CS:  0010 DS:  ES:  CR0: 80050033
> >[   64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: 001406f0
> >[   64.601783] Call Trace:
> >[   64.601786]  refcount_dec_and_test+0x11/0x20
> >[   64.601790]  fib_nl_delrule+0xc39/0x1630
> [snip]
> 
> I'm seeing a similar one coming from sctp:
> 
> refcount_t: underflow; use-after-free.
> [ cut here ]
> WARNING: CPU: 3 PID: 15570 at lib/refcount.c:186 refcount_sub_and_test.cold.13+0x18/0x21 lib/refcount.c:186
> Kernel panic - not syncing: panic_on_warn set ...
> 
> CPU: 3 PID: 15570 Comm: syz-executor0 Not tainted 4.12.0-next-20170706+ #186
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x11d/0x1ef lib/dump_stack.c:52
>  panic+0x1bc/0x3ad kernel/panic.c:180
>  __warn.cold.6+0x2f/0x2f kernel/panic.c:541
>  report_bug+0x20d/0x2

Re: [PATCH 00/17] v3 net generic subsystem refcount conversions

2017-07-08 Thread Levin, Alexander (Sasha Levin)
On Mon, Jul 03, 2017 at 02:28:56AM -0700, Eric Dumazet wrote:
>On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote:
>> Changes in v3:
>> Rebased on top of the net-next tree.
>>
>> Changes in v2:
>> No changes in patches apart from rebases, but now by
>> default refcount_t = atomic_t (*) and uses all atomic standard operations
>> unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the
>> systems that are critical on performance (such as net) and cannot accept even
>> slight delay on the refcounter operations.
>>
>> This series, for core network subsystem components, replaces atomic_t
>> reference counters with the new refcount_t type and API (see
>> include/linux/refcount.h).
>> By doing this we prevent intentional or accidental
>> underflows or overflows that can lead to use-after-free vulnerabilities.
>> These patches contain only generic net pieces. Other changes will be sent
>> separately.
>>
>> The patches are fully independent and can be cherry-picked separately.
>> The big patches, such as conversions for sock structure, need a very detailed
>> look from maintainers: refcount managing is quite complex in them and while
>> it seems that they would benefit from the change, extra checking is needed.
>> The biggest corner issue is the fact that refcount_inc() does not increment
>> from zero.
>>
>> If there are no objections to the patches, please merge them via respective
>> trees.
>>
>> * The respective change is currently merged into -next as
>>   "locking/refcount: Create unchecked atomic_t implementation".
>>
>> Elena Reshetova (17):
>>   net: convert inet_peer.refcnt from atomic_t to refcount_t
>>   net: convert neighbour.refcnt from atomic_t to refcount_t
>>   net: convert neigh_params.refcnt from atomic_t to refcount_t
>>   net: convert nf_bridge_info.use from atomic_t to refcount_t
>>   net: convert sk_buff.users from atomic_t to refcount_t
>>   net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
>>   net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
>>   net: convert sock.sk_refcnt from atomic_t to refcount_t
>>   net: convert ip_mc_list.refcnt from atomic_t to refcount_t
>>   net: convert in_device.refcnt from atomic_t to refcount_t
>>   net: convert netpoll_info.refcnt from atomic_t to refcount_t
>>   net: convert unix_address.refcnt from atomic_t to refcount_t
>>   net: convert fib_rule.refcnt from atomic_t to refcount_t
>>   net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
>>   net: convert net.passive from atomic_t to refcount_t
>>   net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
>>   net: convert packet_fanout.sk_ref from atomic_t to refcount_t
>
>
>Can you take a look at this please ?
>
>[   64.601749] [ cut here ]
>[   64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 refcount_sub_and_test+0x75/0xa0
>[   64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
>[   64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW   4.12.0-smp-DEV #274
>[   64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
>[   64.601771] task: 8837bf482040 task.stack: 8837bdc08000
>[   64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0
>[   64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286
>[   64.601776] RAX: 0026 RBX: 0001 RCX:
>[   64.601777] RDX: 0026 RSI: 0096 RDI: ed06f7b81eae
>[   64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: fbfff4a54c25
>[   64.601779] R10: cbc500e5 R11: a52a6128 R12: 881febcf6f24
>[   64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: 8837d7a4ed00
>[   64.601781] FS:  7ff5a2f6b700() GS:881fff80() knlGS:
>[   64.601782] CS:  0010 DS:  ES:  CR0: 80050033
>[   64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: 001406f0
>[   64.601783] Call Trace:
>[   64.601786]  refcount_dec_and_test+0x11/0x20
>[   64.601790]  fib_nl_delrule+0xc39/0x1630
[snip]

I'm seeing a similar one coming from sctp:

refcount_t: underflow; use-after-free.
[ cut here ]
WARNING: CPU: 3 PID: 15570 at lib/refcount.c:186 refcount_sub_and_test.cold.13+0x18/0x21 lib/refcount.c:186
Kernel panic - not syncing: panic_on_warn set ...

CPU: 3 PID: 15570 Comm: syz-executor0 Not tainted 4.12.0-next-20170706+ #186
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x11d/0x1ef lib/dump_stack.c:52
 panic+0x1bc/0x3ad kernel/panic.c:180
 __warn.cold.6+0x2f/0x2f kernel/panic.c:541
 report_bug+0x20d/0x2d0 lib/bug.c:183
 fixup_bug+0x3f/0x90 arch/x86/kernel/traps.c:190
 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
 do_trap+0x132/0x390 arch/x86/kernel/traps.c:273
 do_error_trap+0x133/0x38

Re: [PATCH 00/17] v3 net generic subsystem refcount conversions

2017-07-03 Thread Eric Dumazet
On Mon, 2017-07-03 at 09:57 +, Reshetova, Elena wrote:

> Thank you very much for the report! This is an underflow (dec/sub from
> zero) that is reported by the WARNING.
> I guess it is unlikely that the actual code underflows, so the most
> probable cause is that it attempted to do refcount_inc/add() from
> zero, which failed.
> However, in that case you should have seen another warning on
> refcount_inc() somewhere earlier. That one is actually the one I need
> to see to track down the root cause.
> Could you tell me how you arrived at the output below? What config did
> you boot with, etc.?
> I can try to reproduce it to debug further.

I sent this fix : 

https://patchwork.ozlabs.org/patch/783389/

Thanks.





RE: [PATCH 00/17] v3 net generic subsystem refcount conversions

2017-07-03 Thread Reshetova, Elena



> On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote:
> > Changes in v3:
> > Rebased on top of the net-next tree.
> >
> > Changes in v2:
> > No changes in patches apart from rebases, but now by
> > default refcount_t = atomic_t (*) and uses all atomic standard operations
> > unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the
> > systems that are critical on performance (such as net) and cannot accept
> > even slight delay on the refcounter operations.
> >
> > This series, for core network subsystem components, replaces atomic_t
> > reference counters with the new refcount_t type and API (see
> > include/linux/refcount.h).
> > By doing this we prevent intentional or accidental
> > underflows or overflows that can lead to use-after-free vulnerabilities.
> > These patches contain only generic net pieces. Other changes will be sent
> > separately.
> >
> > The patches are fully independent and can be cherry-picked separately.
> > The big patches, such as conversions for sock structure, need a very
> > detailed look from maintainers: refcount managing is quite complex in
> > them and while it seems that they would benefit from the change, extra
> > checking is needed.
> > The biggest corner issue is the fact that refcount_inc() does not increment
> > from zero.
> >
> > If there are no objections to the patches, please merge them via respective
> > trees.
> >
> > * The respective change is currently merged into -next as
> >   "locking/refcount: Create unchecked atomic_t implementation".
> >
> > Elena Reshetova (17):
> >   net: convert inet_peer.refcnt from atomic_t to refcount_t
> >   net: convert neighbour.refcnt from atomic_t to refcount_t
> >   net: convert neigh_params.refcnt from atomic_t to refcount_t
> >   net: convert nf_bridge_info.use from atomic_t to refcount_t
> >   net: convert sk_buff.users from atomic_t to refcount_t
> >   net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
> >   net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
> >   net: convert sock.sk_refcnt from atomic_t to refcount_t
> >   net: convert ip_mc_list.refcnt from atomic_t to refcount_t
> >   net: convert in_device.refcnt from atomic_t to refcount_t
> >   net: convert netpoll_info.refcnt from atomic_t to refcount_t
> >   net: convert unix_address.refcnt from atomic_t to refcount_t
> >   net: convert fib_rule.refcnt from atomic_t to refcount_t
> >   net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
> >   net: convert net.passive from atomic_t to refcount_t
> >   net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
> >   net: convert packet_fanout.sk_ref from atomic_t to refcount_t
> 
> 
> Can you take a look at this please ?
> 
> Thanks.

Thank you very much for the report! This is an underflow (dec/sub from zero)
that is reported by the WARNING.
I guess it is unlikely that the actual code underflows, so the most probable
cause is that it attempted to do refcount_inc/add() from zero, which failed.
However, in that case you should have seen another warning on refcount_inc()
somewhere earlier. That one is actually the one I need to see to track down the
root cause.
Could you tell me how you arrived at the output below? What config did you
boot with, etc.?
I can try to reproduce it to debug further.
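
To illustrate the sequence described above (an increment from zero that is
refused, followed later by a decrement that underflows), here is a minimal
userspace sketch. It is only a model under stated assumptions: the names
model_refcount, model_warn and the model_refcount_*() helpers are hypothetical,
it uses C11 atomics instead of the kernel primitives, it omits the saturation
handling, and it is not the code in lib/refcount.c.

```c
/* Userspace model of checked refcount semantics; NOT the kernel's lib/refcount.c. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's refcount_t. */
struct model_refcount {
	atomic_uint refs;
};

static unsigned int model_warnings;	/* number of WARN-like events seen */

static void model_warn(const char *msg)
{
	model_warnings++;
	fprintf(stderr, "model refcount: %s\n", msg);
}

/* Take a reference unless the counter is already zero. */
static bool model_refcount_inc_not_zero(struct model_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0)
			return false;	/* object may already be freed */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));
	return true;
}

/* Like the above, but warn and leave the counter at zero on failure. */
static void model_refcount_inc(struct model_refcount *r)
{
	if (!model_refcount_inc_not_zero(r))
		model_warn("increment on 0; use-after-free");
}

/* Drop n references; return true only if the counter reached zero. */
static bool model_refcount_sub_and_test(struct model_refcount *r, unsigned int n)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (n > old) {
			model_warn("underflow; use-after-free");
			return false;	/* counter is left unchanged */
		}
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old - n));
	return old == n;
}
```

In this model, a counter that has dropped to zero stays at zero when
model_refcount_inc() is called on it, so the matching put later reports an
underflow. That is why the earlier increment warning is the more useful one
for finding the root cause.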

Best Regards,
Elena

> 
> [   64.601749] [ cut here ]
> [   64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 refcount_sub_and_test+0x75/0xa0
> [   64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
> [   64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW   4.12.0-smp-DEV #274
> [   64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
> [   64.601771] task: 8837bf482040 task.stack: 8837bdc08000
> [   64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0
> [   64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286
> [   64.601776] RAX: 0026 RBX: 0001 RCX:
> [   64.601777] RDX: 0026 RSI: 0096 RDI: ed06f7b81eae
> [   64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: fbfff4a54c25
> [   64.601779] R10: cbc500e5 R11: a52a6128 R12: 881febcf6f24
> [   64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: 8837d7a4ed00
> [   64.601781] FS:  7ff5a2f6b700() GS:881fff80() knlGS:
> [   64.601782] CS:  0010 DS:  ES:  CR0: 80050033
> [   64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: 001406f0
> [   64.601783] Call Trace:
> [   64.601786]  refcount_dec_and_test+0x11/0x20
> [   64.601790]  fib_nl_delrule+0xc39/0x1630
> [   64.601793]  ? is_bpf_text_address+0xe/0x20
> [   64.601795]  ? fib_nl_newrule+0x25e0/0x25e0
> [   64.601798]  ? depot_save_stack+0x133/0x470
> [   64.601801]  ? ns_capable+0x13/0x20
> [   64.601803]  ? __netlink_ns_capable+

Re: [PATCH 00/17] v3 net generic subsystem refcount conversions

2017-07-03 Thread Eric Dumazet
On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote:
> Changes in v3:
> Rebased on top of the net-next tree.
> 
> Changes in v2:
> No changes in patches apart from rebases, but now by
> default refcount_t = atomic_t (*) and uses all atomic standard operations
> unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the
> systems that are critical on performance (such as net) and cannot accept even
> slight delay on the refcounter operations.
> 
> This series, for core network subsystem components, replaces atomic_t
> reference counters with the new refcount_t type and API (see
> include/linux/refcount.h).
> By doing this we prevent intentional or accidental
> underflows or overflows that can lead to use-after-free vulnerabilities.
> These patches contain only generic net pieces. Other changes will be sent
> separately.
> 
> The patches are fully independent and can be cherry-picked separately.
> The big patches, such as conversions for sock structure, need a very detailed
> look from maintainers: refcount managing is quite complex in them and while
> it seems that they would benefit from the change, extra checking is needed.
> The biggest corner issue is the fact that refcount_inc() does not increment
> from zero.
> 
> If there are no objections to the patches, please merge them via respective
> trees.
> 
> * The respective change is currently merged into -next as
>   "locking/refcount: Create unchecked atomic_t implementation".
> 
> Elena Reshetova (17):
>   net: convert inet_peer.refcnt from atomic_t to refcount_t
>   net: convert neighbour.refcnt from atomic_t to refcount_t
>   net: convert neigh_params.refcnt from atomic_t to refcount_t
>   net: convert nf_bridge_info.use from atomic_t to refcount_t
>   net: convert sk_buff.users from atomic_t to refcount_t
>   net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
>   net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
>   net: convert sock.sk_refcnt from atomic_t to refcount_t
>   net: convert ip_mc_list.refcnt from atomic_t to refcount_t
>   net: convert in_device.refcnt from atomic_t to refcount_t
>   net: convert netpoll_info.refcnt from atomic_t to refcount_t
>   net: convert unix_address.refcnt from atomic_t to refcount_t
>   net: convert fib_rule.refcnt from atomic_t to refcount_t
>   net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
>   net: convert net.passive from atomic_t to refcount_t
>   net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
>   net: convert packet_fanout.sk_ref from atomic_t to refcount_t


Can you take a look at this please ?

Thanks.

[   64.601749] [ cut here ]
[   64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 refcount_sub_and_test+0x75/0xa0
[   64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
[   64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW   4.12.0-smp-DEV #274
[   64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
[   64.601771] task: 8837bf482040 task.stack: 8837bdc08000
[   64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0
[   64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286
[   64.601776] RAX: 0026 RBX: 0001 RCX: 
[   64.601777] RDX: 0026 RSI: 0096 RDI: ed06f7b81eae
[   64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: fbfff4a54c25
[   64.601779] R10: cbc500e5 R11: a52a6128 R12: 881febcf6f24
[   64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: 8837d7a4ed00
[   64.601781] FS:  7ff5a2f6b700() GS:881fff80() knlGS:
[   64.601782] CS:  0010 DS:  ES:  CR0: 80050033
[   64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: 001406f0
[   64.601783] Call Trace:
[   64.601786]  refcount_dec_and_test+0x11/0x20
[   64.601790]  fib_nl_delrule+0xc39/0x1630
[   64.601793]  ? is_bpf_text_address+0xe/0x20
[   64.601795]  ? fib_nl_newrule+0x25e0/0x25e0
[   64.601798]  ? depot_save_stack+0x133/0x470
[   64.601801]  ? ns_capable+0x13/0x20
[   64.601803]  ? __netlink_ns_capable+0xcc/0x100
[   64.601806]  rtnetlink_rcv_msg+0x23a/0x6a0
[   64.601808]  ? rtnl_newlink+0x1630/0x1630
[   64.601811]  ? memset+0x31/0x40
[   64.601813]  netlink_rcv_skb+0x2d7/0x440
[   64.601815]  ? rtnl_newlink+0x1630/0x1630
[   64.601816]  ? netlink_ack+0xaf0/0xaf0
[   64.601818]  ? kasan_unpoison_shadow+0x35/0x50
[   64.601820]  ? __kmalloc_node_track_caller+0x4c/0x70
[   64.601821]  rtnetlink_rcv+0x28/0x30
[   64.601823]  netlink_unicast+0x422/0x610
[   64.601824]  ? netlink_attachskb+0x650/0x650
[   64.601826]  netlink_sendmsg+0x7b7/0xb60
[   64.601828]  ? netlink_unicast+0x610/0x610
[   64.601830]  ? netlink_unicast+0x610/0x610
[   64.601832]  sock_sendmsg+0xba/0xf0
[   64.601834]  ___sys_sendmsg+0x6a9/0x8c0
[   64.601835]  ? copy_msghdr_from_user+0x52