Hannes Frederic Sowa <han...@stressinduktion.org> writes:
> Jason Baron <jba...@akamai.com> writes:
>
>> The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait
>> queue associated with the socket s that we are poll'ing against, but also calls
>> sock_poll_wait() for a remote peer socket p, if it is connected. Thus,
>> if we call poll()/select()/epoll() for the socket s, there are then
>> a couple of code paths in which the remote peer socket p and its associated
>> peer_wait queue can be freed before poll()/select()/epoll() have a chance
>> to remove themselves from the remote peer socket.
>>
>> The way that remote peer socket can be freed are:
>>
>> 1. If s calls connect() to a connect to a new socket other than p, it will
>> drop its reference on p, and thus a close() on p will free it.
>>
>> 2. If we call close on p(), then a subsequent sendmsg() from s, will drop
>> the final reference to p, allowing it to be freed.
>>
>> Address this issue, by reverting unix_dgram_poll() to only register with
>> the wait queue associated with s and register a callback with the remote peer
>> socket on connect() that will wake up the wait queue associated with s. If
>> scenarios 1 or 2 occur above we then simply remove the callback from the
>> remote peer. This then presents the expected semantics to poll()/select()/
>> epoll().
>>
>> I've implemented this for sock-type, SOCK_RAW, SOCK_DGRAM, and SOCK_SEQPACKET
>> but not for SOCK_STREAM, since SOCK_STREAM does not use unix_dgram_poll().
>>
>> Introduced in commit ec0d215f9420 ("af_unix: fix 'poll for write'/connected
>> DGRAM sockets").
>>
>> Tested-by: Mathias Krause <mini...@googlemail.com>
>> Signed-off-by: Jason Baron <jba...@akamai.com>
>
> While I think this approach works, I haven't seen where the current code
> leaks a reference.
It doesn't "leak a reference" (strictly). It may register a wait queue
belonging to the peer socket of the polled socket with whatever invoked
the poll routine. The inherent problem with that is that the lifetime of
the peer socket is not necessarily the same as the lifetime of the polled
socket. If the polled socket is disconnected from its peer while still
being polled (or registered for being polled), the former peer may be
freed even though the polling code (of whatever provenance) still
references the peer_wait member of the unix socket structure for that
peer.

As pointed out in the original mail, two ways for this to happen are
calling connect on the polled socket or causing a unix_dgram_sendmsg call
on it after the peer socket was closed.
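
For illustration only, the two sequences can be sketched from userspace
roughly as follows. This is not taken from the patch or from any test
posted in the thread. The socket names s, p and q follow the original
mail; the helper names and the abstract addresses are made up for this
example, and Python is used only for brevity. On a kernel with the fix
applied it should run harmlessly. On a kernel without it, these are the
steps that leave the poller referencing the former peer's peer_wait, so
it is best tried only on a disposable machine.

```python
import errno
import os
import select
import socket


def bound_dgram(name: bytes) -> socket.socket:
    """Return an AF_UNIX datagram socket bound to an abstract address.

    The address names are hypothetical and only need to be unique.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(b"\0" + name)
    return sock


def reconnect_while_polled() -> int:
    """Scenario 1: s is connected to q while still registered with epoll.

    Returns the event mask epoll reports for s afterwards (0 if none).
    """
    tag = str(os.getpid()).encode()
    s = bound_dgram(b"peerwait-demo1-s-" + tag)
    p = bound_dgram(b"peerwait-demo1-p-" + tag)
    q = bound_dgram(b"peerwait-demo1-q-" + tag)
    ep = select.epoll()
    try:
        s.connect(p.getsockname())
        # Registering for EPOLLOUT runs the poll routine of s. In the old
        # code this is where the peer's wait queue could get registered too.
        ep.register(s.fileno(), select.EPOLLOUT)
        # s drops its reference on p by connecting to a different socket.
        s.connect(q.getsockname())
        # Closing p can now free it while s is still registered with epoll.
        p.close()
        events = ep.poll(0)
        return events[0][1] if events else 0
    finally:
        ep.close()
        for sock in (s, p, q):
            sock.close()


def send_after_peer_close() -> int:
    """Scenario 2: p is closed, then s sends on the connected socket.

    Returns the errno of the failed send (0 if the send succeeded).
    """
    tag = str(os.getpid()).encode()
    s = bound_dgram(b"peerwait-demo2-s-" + tag)
    p = bound_dgram(b"peerwait-demo2-p-" + tag)
    ep = select.epoll()
    try:
        s.connect(p.getsockname())
        ep.register(s.fileno(), select.EPOLLOUT)
        p.close()
        try:
            # The send path notices the dead peer and disconnects s from it.
            s.send(b"x")
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        ep.close()
        s.close()


if __name__ == "__main__":
    print("scenario 1 event mask:", reconnect_while_polled())
    print("scenario 2 errno:", send_after_peer_close())
```

In both functions p is deliberately never connected back to s, since the
case under discussion is a peer that is not itself connected to the
polled socket.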