I'm concerned about whether it's really safe for us to test the socket owner instead of sk_lock.slock. We have identified that the deadlock occurs because tipc_sk_respond() is called under the socket lock.
Just for the case of responding to CONN_MANAGER messages, testing the socket owner is safe. But the solution seems unsafe when a TIPC_ERR_OVERLOAD message is sent: if TIPC_ERR_OVERLOAD is sent out in BH context while the socket's owner is set, the message will be sent out directly in BH context through tipc_node_xmit_skb() instead of being inserted into sk->sk_write_queue.

Regards,
Ying

-----Original Message-----
From: Jon Maloy [mailto:[email protected]]
Sent: Tuesday, June 14, 2016 1:51 AM
To: [email protected]; [email protected]; Xue, Ying; [email protected]; [email protected]
Cc: [email protected]; [email protected]
Subject: [PATCH net v3 1/1] tipc: fix socket timer deadlock

We sometimes observe a 'deadly embrace' type deadlock occurring between mutually connected sockets on the same node. This happens when the one-hour peer supervision timers happen to expire simultaneously in both sockets. The scenario is as follows:

CPU 1:                               CPU 2:
--------                             --------
tipc_sk_timeout(sk1)                 tipc_sk_timeout(sk2)
  lock(sk1.slock)                      lock(sk2.slock)
  msg_create(probe)                    msg_create(probe)
  unlock(sk1.slock)                    unlock(sk2.slock)
  tipc_node_xmit_skb()                 tipc_node_xmit_skb()
    tipc_node_xmit()                     tipc_node_xmit()
      tipc_sk_rcv(sk2)                     tipc_sk_rcv(sk1)
        lock(sk2.slock)                      lock(sk1.slock)
        filter_rcv()                         filter_rcv()
          tipc_sk_proto_rcv()                  tipc_sk_proto_rcv()
            msg_create(probe_rsp)                msg_create(probe_rsp)
            tipc_sk_respond()                    tipc_sk_respond()
              tipc_node_xmit_skb()                 tipc_node_xmit_skb()
                tipc_node_xmit()                     tipc_node_xmit()
                  tipc_sk_rcv(sk1)                     tipc_sk_rcv(sk2)
                    lock(sk1.slock)                      lock(sk2.slock)
                    ===> DEADLOCK                        ===> DEADLOCK

Further analysis reveals that there are at least three different locations in the socket code where tipc_sk_respond() is called within the context of the socket lock, with ensuing risk of similar deadlocks.

We solve this by ensuring that messages created by tipc_sk_respond() are only sent directly if the sk_lock.owned mutex is held. Otherwise they are queued up in the socket write queue and sent after the socket lock has been released.
v2: - Test on the mutex sk_lock.owned instead of sk_lock.slock in
      tipc_sk_respond(). This is safer, since sk_lock.slock may
      occasionally and briefly be held (by concurrent user contexts)
      even if we are in user context.

v3: - By lowering the socket timeout to 36 ms instead of 3,600,000 ms
      and setting up 1000 connections I could easily reproduce the
      deadlock and verify that my solution works.
    - When killing one of the processes I sometimes got a kernel crash
      in the loop emptying the socket write queue. Realizing that there
      may be concurrent processes emptying the write queue, I had to
      add a test that the dequeuing actually returned a buffer. This
      solved the problem.
    - I tried Ying's suggestion of unconditionally adding all
      CONN_MANAGER messages to the backlog queue, and it didn't work.
      This is because we will often add the message to the backlog when
      the socket is *not* owned, so there will be nothing triggering
      execution of backlog_rcv() within an acceptable time. Apart from
      that, my solution solves the problem at all three locations where
      this deadlock may happen, as already stated above.
Reported-by: GUNA <[email protected]>
Signed-off-by: Jon Maloy <[email protected]>
---
 net/tipc/socket.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 88bfcd7..e8ed3a8 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -278,7 +278,11 @@ static void tipc_sk_respond(struct sock *sk, struct sk_buff *skb, int err)
 
 	dnode = msg_destnode(buf_msg(skb));
 	selector = msg_origport(buf_msg(skb));
-	tipc_node_xmit_skb(sock_net(sk), skb, dnode, selector);
+
+	if (sock_owned_by_user(sk))
+		tipc_node_xmit_skb(sock_net(sk), skb, dnode, selector);
+	else
+		skb_queue_tail(&sk->sk_write_queue, skb);
 }
 
 /**
@@ -1830,6 +1834,14 @@ void tipc_sk_rcv(struct net *net, struct sk_buff_head *inputq)
 			tipc_sk_enqueue(inputq, sk, dport);
 			spin_unlock_bh(&sk->sk_lock.slock);
 		}
+		/* Send pending response/rejected messages, if any */
+		while (!skb_queue_empty(&sk->sk_write_queue)) {
+			skb = skb_dequeue(&sk->sk_write_queue);
+			if (!skb)
+				break;
+			dnode = msg_destnode(buf_msg(skb));
+			tipc_node_xmit_skb(net, skb, dnode, dport);
+		}
 		sock_put(sk);
 		continue;
 	}
-- 
1.9.1

_______________________________________________
tipc-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
