Eric Dumazet <eduma...@google.com> writes:
> On Tue, Nov 24, 2015 at 5:10 PM, Rainer Weikusat
> <rweiku...@mobileactivedefense.com> wrote:
[...]

>> It's also easy to verify: Swap the unix_state_lock and
>> other->sk_data_ready and see if the issue still occurs. Right now (this
>> may change after I had some sleep as it's pretty late for me), I don't
>> think there's another local fix: The ->sk_data_ready accesses a
>> pointer after the lock taken by the code which will clear and
>> then later free it was released.
>
> It seems that :
>
> int sock_wake_async(struct socket *sock, int how, int band)
>
> should really be changed to
>
> int sock_wake_async(struct socket_wq *wq, int how, int band)
>
> So that RCU rules (already present) apply safely.
>
> sk->sk_socket is inherently racy (that is : racy without using
> sk_callback_lock rwlock )

The comment above sock_wake_async states that

/* This function may be called only under socket lock or callback_lock
 * or rcu_lock */

In this case, it's called via sk_wake_async (include/net/sock.h), which
is - in turn - called via sock_def_readable (the 'default' data-ready
routine, net/core/sock.c), which looks like this:

static void sock_def_readable(struct sock *sk)
{
	struct socket_wq *wq;

	rcu_read_lock();
	wq = rcu_dereference(sk->sk_wq);
	if (wq_has_sleeper(wq))
		wake_up_interruptible_sync_poll(&wq->wait, POLLIN | POLLPRI |
						POLLRDNORM | POLLRDBAND);
	sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
	rcu_read_unlock();
}

and should thus satisfy the constraint documented by the comment (I
didn't verify whether the comment is actually correct, though).

Further - sorry about that - I think changing code in "half of the
network stack" in order to avoid calling, with an already acquired lock
held, a routine which will only ever do something if someone is using
signal-driven I/O is a terrifying idea. Because of this, I propose the
following alternate patch, which should also solve the problem by
ensuring that the ->sk_data_ready activity happens before
unix_release_sock/sock_release get a chance to clear or free anything
which will be needed.

In case this demonstrably causes other issues, a more complicated
alternate idea (still restricting itself to changes to the af_unix
code) would be to move the socket_wq structure to a dummy struct socket
allocated by unix_release_sock and freed by the destructor.

---
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4e95bdf..5c87ea6 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1754,8 +1754,8 @@ restart_locked:
 	skb_queue_tail(&other->sk_receive_queue, skb);
 	if (max_level > unix_sk(other)->recursion_level)
 		unix_sk(other)->recursion_level = max_level;
-	unix_state_unlock(other);
 	other->sk_data_ready(other);
+	unix_state_unlock(other);
 	sock_put(other);
 	scm_destroy(&scm);
 	return len;
@@ -1860,8 +1860,8 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 		skb_queue_tail(&other->sk_receive_queue, skb);
 		if (max_level > unix_sk(other)->recursion_level)
 			unix_sk(other)->recursion_level = max_level;
-		unix_state_unlock(other);
 		other->sk_data_ready(other);
+		unix_state_unlock(other);
 		sent += size;
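As an aside - this sketch is mine and purely illustrative, not part of
the patch: sk_wake_async only ever has work to do if someone enabled
signal-driven I/O on the receiving socket, so userspace exercising the
racy path would have to look roughly like the program below (it uses
only the standard socket/fcntl/signal API; the structure, loop counts
and constants are arbitrary choices of mine). The window is tiny, so
this is not a reliable reproducer; running it on a kernel with slab
poisoning or KASAN enabled should improve the odds of actually catching
the use-after-free.

#define _GNU_SOURCE		/* O_ASYNC via <fcntl.h> on glibc */
#include <fcntl.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void on_sigio(int sig)
{
	(void)sig;	/* delivery of SIGIO is the interesting part */
}

int main(void)
{
	signal(SIGPIPE, SIG_IGN);	/* writes to a dead peer: EPIPE, no signal */
	signal(SIGIO, on_sigio);

	for (;;) {
		int sv[2];
		pid_t pid;

		if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
			return 1;

		pid = fork();
		if (pid == 0) {
			/* Receiver: request SIGIO on incoming data so that
			 * sock_def_readable -> sk_wake_async actually does
			 * something, then close immediately so that
			 * unix_release_sock can run while the sender may
			 * still be inside other->sk_data_ready(). */
			close(sv[1]);
			fcntl(sv[0], F_SETOWN, (int)getpid());
			fcntl(sv[0], F_SETFL, O_ASYNC | O_NONBLOCK);
			close(sv[0]);
			_exit(0);
		}

		/* Sender: hammer the peer's data-ready path. */
		close(sv[0]);
		for (int i = 0; i < 128; i++)
			if (write(sv[1], "x", 1) < 0)
				break;
		close(sv[1]);
		waitpid(pid, NULL, 0);
	}
}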