Re: [PATCH net v2 2/2] can: fix ref count warning if socket was closed before skb was cloned

2021-02-23 Thread Eric Dumazet
On Tue, Feb 23, 2021 at 6:53 AM Oleksij Rempel  wrote:
>
> There are two ref count variables controlling the free()ing of a socket:
> - struct sock::sk_refcnt - which is changed by sock_hold()/sock_put()
> - struct sock::sk_wmem_alloc - which accounts the memory allocated by
>   the skbs in the send path.
>
> When the socket is closed, struct sock::sk_refcnt eventually reaches 0
> and sk_free() is called, which in turn calls
> refcount_dec_and_test(&sk->sk_wmem_alloc). If sk_wmem_alloc reaches 0,
> the socket is actually free()ed.
>
> If TX skbs are still in flight when the socket is closed, struct
> sock::sk_refcnt reaches 0 while sk_wmem_alloc keeps the socket
> allocated. In the TX path the CAN stack then clones an "echo" skb and
> calls sock_hold() on the original socket to reference it. Since
> sk_refcnt is already 0, this triggers the following warning:

Why not simply fix can_skb_set_owner() instead of adding yet another helper?

diff --git a/include/linux/can/skb.h b/include/linux/can/skb.h
index 685f34cfba20741d372d340fe7df1084767b2850..655f33aa99e330b8ffc804b0f3a1d61aa9b00b0b 100644
--- a/include/linux/can/skb.h
+++ b/include/linux/can/skb.h
@@ -65,8 +65,7 @@ static inline void can_skb_reserve(struct sk_buff *skb)

 static inline void can_skb_set_owner(struct sk_buff *skb, struct sock *sk)
 {
-	if (sk) {
-		sock_hold(sk);
+	if (sk && refcount_inc_not_zero(&sk->sk_refcnt)) {
 		skb->destructor = sock_efree;
 		skb->sk = sk;
 	}

IMO, CAN seems to use sock_hold() even for TX packets.

But TX packets usually hold their reference on the socket through
sk->sk_wmem_alloc; look at skb_set_owner_w() for reference.

This might be the reason why you catch a zero sk_refcnt while packets
are still in flight?
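
A minimal userspace sketch of the inc-not-zero idea in the diff above,
assuming C11 atomics as a stand-in for the kernel's refcount_t. The names
demo_sock, demo_skb, demo_inc_not_zero() and demo_set_owner() are
hypothetical and exist only for illustration; the real helpers are
refcount_inc_not_zero() and can_skb_set_owner().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not kernel structures. */
struct demo_sock {
	atomic_int refcnt;	/* models sk_refcnt */
};

struct demo_skb {
	struct demo_sock *sk;	/* models skb->sk */
};

/* Take a reference only if the counter has not already dropped to 0. */
static bool demo_inc_not_zero(atomic_int *cnt)
{
	int old = atomic_load(cnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(cnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Attach an owner only when a reference could actually be taken. */
static bool demo_set_owner(struct demo_skb *skb, struct demo_sock *sk)
{
	assert(skb != NULL);

	if (sk && demo_inc_not_zero(&sk->refcnt)) {
		skb->sk = sk;
		return true;
	}
	return false;
}
```

With a counter of 1 the owner is set and the counter becomes 2. With a
counter of 0 the skb is left without an owner and the counter is not
touched, so no "addition on 0" can happen.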



> | WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134
> | refcount_t: addition on 0; use-after-free.
> | Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E)
> | CPU: 0 PID: 280 Comm: test_can.sh Tainted: GE 5.11.0-04577-gf8ff6603c617 #203
> | Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> | Backtrace:
> | [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7: r6:600f0113 r5: r4:81441220
> | [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8)
> | [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:0019 r8:80f4a8c2 r7:83e4150c r6: r5:0009 r4:80528f90
> | [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:0009 r6:80528f90 r5:0019 r4:80f4a8c2
> | [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8: r7: r6:82b44000 r5:834e5600 r4:83f4d540
> | [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50)
> | [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c)
> | [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:0010 r8:83f48610 r7:0fdc r6:0c08 r5:82b44000 r4:834e5600
> | [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7: r6:834e5600 r5:82b44000 r4:82ab1f00
> | [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8: r7:82ab1f00 r6:82b44000 r5: r4:834e5600
> | [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9: r8: r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400
> | [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534)
>
> To fix this problem, we have to take into account that the socket
> is technically still there but should not be used (by any new skbs)
> any more. The function skb_clone_sk_optional() (introduced in the
> previous patch) takes care of this. It will only clone the skb if the
> sk is set and the refcount has not reached 0.
>
> Cc: Oliver Hartkopp 
> Cc: Andre Naujoks 
> Cc: Eric Dumazet 
> Fixes: 0ae89beb283a ("can: add destructor for self generated skbs")
> Signed-off-by: Oleksij Rempel 
> ---
>  include/linux/can/skb.h   | 3 +--
>  net/can/af_can.c  | 6 +++---
>  net/can/j1939/main.c  | 3 +--
>  net/can/j1939/socket.c| 3 +--
>  net/can/j1939/transport.c | 4 +---
>  5 files changed, 7 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/can/skb.h b/include/linux/can/skb.h
> index 685f34cfba20..bc1af38697a2 100644
> --- a/include/linux/can/skb.h
> +++ b/include/linux/can/skb.h
> @@ -79,13 +79,12 @@ static inline struct sk_buff *can_create_echo_skb(struct sk_buff *skb)
>  {
>  	struct sk_buff *nskb;
>
> -	nskb = skb_clone(skb, GFP_ATOMIC);
> +	nskb = skb_clone_sk_optional(skb);
>  	if (unlikely(!nskb)) {
>  		kfree_skb(skb);
>  		return NULL;
>  	}
>
> -	can_skb_set_owner(n

[PATCH net v2 2/2] can: fix ref count warning if socket was closed before skb was cloned

2021-02-22 Thread Oleksij Rempel
There are two ref count variables controlling the free()ing of a socket:
- struct sock::sk_refcnt - which is changed by sock_hold()/sock_put()
- struct sock::sk_wmem_alloc - which accounts the memory allocated by
  the skbs in the send path.

When the socket is closed, struct sock::sk_refcnt eventually reaches 0
and sk_free() is called, which in turn calls
refcount_dec_and_test(&sk->sk_wmem_alloc). If sk_wmem_alloc reaches 0,
the socket is actually free()ed.
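
The interplay of the two counters can be sketched as a small
single-threaded userspace model. This is only an illustration under
assumptions: the model_* names are hypothetical, plain ints replace the
kernel's atomic refcount_t, and sk_wmem_alloc is assumed to start at 1 so
that sk_free() has something to drop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; field names loosely mirror struct sock. */
struct model_sock {
	int refcnt;	/* models sk_refcnt */
	int wmem_alloc;	/* models sk_wmem_alloc, starts at 1 */
	bool freed;	/* set once the socket memory would be released */
};

static void model_sock_init(struct model_sock *sk)
{
	sk->refcnt = 1;
	sk->wmem_alloc = 1;
	sk->freed = false;
}

static void model_release(struct model_sock *sk)
{
	assert(!sk->freed);	/* a double free would be a model bug */
	sk->freed = true;
}

/* A TX skb is charged to the socket (cf. skb_set_owner_w()). */
static void model_tx_charge(struct model_sock *sk, int truesize)
{
	sk->wmem_alloc += truesize;
}

/* The TX skb is destroyed; the last uncharge releases the socket. */
static void model_tx_uncharge(struct model_sock *sk, int truesize)
{
	sk->wmem_alloc -= truesize;
	if (sk->wmem_alloc == 0)
		model_release(sk);
}

/* Models sk_free(): drop the initial wmem_alloc unit. */
static void model_sk_free(struct model_sock *sk)
{
	if (--sk->wmem_alloc == 0)
		model_release(sk);
}

/* Models sock_put(): the last reference leads to sk_free(). */
static void model_sock_put(struct model_sock *sk)
{
	if (--sk->refcnt == 0)
		model_sk_free(sk);
}
```

Charging one TX skb and then dropping the last reference leaves refcnt
at 0 with the socket still allocated; only the later uncharge releases
it. That window is where a sock_hold() would add to a zero counter.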

If TX skbs are still in flight when the socket is closed, struct
sock::sk_refcnt reaches 0 while sk_wmem_alloc keeps the socket
allocated. In the TX path the CAN stack then clones an "echo" skb and
calls sock_hold() on the original socket to reference it. Since
sk_refcnt is already 0, this triggers the following warning:

| WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134
| refcount_t: addition on 0; use-after-free.
| Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E)
| CPU: 0 PID: 280 Comm: test_can.sh Tainted: GE 5.11.0-04577-gf8ff6603c617 #203
| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
| Backtrace:
| [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7: r6:600f0113 r5: r4:81441220
| [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8)
| [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:0019 r8:80f4a8c2 r7:83e4150c r6: r5:0009 r4:80528f90
| [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:0009 r6:80528f90 r5:0019 r4:80f4a8c2
| [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8: r7: r6:82b44000 r5:834e5600 r4:83f4d540
| [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50)
| [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c)
| [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:0010 r8:83f48610 r7:0fdc r6:0c08 r5:82b44000 r4:834e5600
| [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7: r6:834e5600 r5:82b44000 r4:82ab1f00
| [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8: r7:82ab1f00 r6:82b44000 r5: r4:834e5600
| [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9: r8: r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400
| [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534)

To fix this problem, we have to take into account that the socket
is technically still there but should not be used (by any new skbs)
any more. The function skb_clone_sk_optional() (introduced in the
previous patch) takes care of this. It will only clone the skb if the
sk is set and the refcount has not reached 0.

Cc: Oliver Hartkopp 
Cc: Andre Naujoks 
Cc: Eric Dumazet 
Fixes: 0ae89beb283a ("can: add destructor for self generated skbs")
Signed-off-by: Oleksij Rempel 
---
 include/linux/can/skb.h   | 3 +--
 net/can/af_can.c  | 6 +++---
 net/can/j1939/main.c  | 3 +--
 net/can/j1939/socket.c| 3 +--
 net/can/j1939/transport.c | 4 +---
 5 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/include/linux/can/skb.h b/include/linux/can/skb.h
index 685f34cfba20..bc1af38697a2 100644
--- a/include/linux/can/skb.h
+++ b/include/linux/can/skb.h
@@ -79,13 +79,12 @@ static inline struct sk_buff *can_create_echo_skb(struct sk_buff *skb)
 {
 	struct sk_buff *nskb;
 
-	nskb = skb_clone(skb, GFP_ATOMIC);
+	nskb = skb_clone_sk_optional(skb);
 	if (unlikely(!nskb)) {
 		kfree_skb(skb);
 		return NULL;
 	}
 
-	can_skb_set_owner(nskb, skb->sk);
 	consume_skb(skb);
 	return nskb;
 }
diff --git a/net/can/af_can.c b/net/can/af_can.c
index cce2af10eb3e..9e1bd60e7e1b 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -251,20 +251,20 @@ int can_send(struct sk_buff *skb, int loop)
 		 * its own. Example: can_raw sockopt CAN_RAW_RECV_OWN_MSGS
 		 * Therefore we have to ensure that skb->sk remains the
 		 * reference to the originating sock by restoring skb->sk
-		 * after each skb_clone() or skb_orphan() usage.
+		 * after each skb_clone() or skb_orphan() usage -
+		 * skb_clone_sk_optional() takes care of that.
 		 */
 
 		if (!(skb->dev->flags & IFF_ECHO)) {
 			/* If the interface is not capable to do loopback
 			 * itself, we do it here.
 			 */
-			newskb = skb_clone(skb, GFP_ATOMIC);
+			newskb = skb_clone_sk_optional(skb);
 			if (!newskb) {
 				kfree_skb(skb);