On 14.05.2015 07:21, Herbert Xu wrote:
On Thu, May 14, 2015 at 12:16:28PM +0800, Herbert Xu wrote:
On Wed, May 13, 2015 at 09:13:38PM -0700, Eric Dumazet wrote:

So it looks like we lost an skb or something....

OK that sounds reasonable.  So my plan is to disable dynamic
rehashing and then hunt down this lookup bug.

Oh wait this isn't even a lookup failure since that should return
ECONNREFUSED.  Could it be that this hang is a separate bug that's
not related to rhashtable?

The hang in getaddrinfo is a bug in libc: the function make_request in
sysdeps/unix/sysv/linux/check_pf.c ignores NLMSG_ERROR
(as well as messages with nlmh->nlmsg_pid != pid).

It hangs forever on any error or netlink pid collision.
I've also seen ECONNREFUSED in the message buffer when attaching to the
hung process with gdb.
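
To illustrate the kind of handling that is missing, here is a minimal
sketch of a reply scanner that treats NLMSG_ERROR as terminal instead of
skipping it. This is not the glibc code: scan_netlink_buf, the SCAN_*
values and the EPROTO fallback are hypothetical names and choices made
for the example. Only struct nlmsghdr, struct nlmsgerr and the NLMSG_*
macros come from <linux/netlink.h>.

```c
#include <errno.h>
#include <stdint.h>
#include <linux/netlink.h>

/* Hypothetical result codes for this sketch. */
enum { SCAN_ERROR = -1, SCAN_MORE = 0, SCAN_DONE = 1 };

/*
 * Walk one buffer returned by recvmsg().  Messages that do not match
 * our pid/seq are skipped, but NLMSG_ERROR addressed to us ends the
 * scan with the errno stored in *err instead of being ignored.
 */
static int scan_netlink_buf(void *buf, int len, uint32_t pid,
                            uint32_t seq, int *err)
{
    struct nlmsghdr *nlmh;

    for (nlmh = buf; NLMSG_OK(nlmh, len); nlmh = NLMSG_NEXT(nlmh, len)) {
        if (nlmh->nlmsg_pid != pid || nlmh->nlmsg_seq != seq)
            continue;               /* reply meant for someone else */

        if (nlmh->nlmsg_type == NLMSG_ERROR) {
            struct nlmsgerr *e = NLMSG_DATA(nlmh);

            if (nlmh->nlmsg_len <
                (uint32_t)NLMSG_LENGTH(sizeof(struct nlmsgerr))) {
                *err = EPROTO;      /* truncated error message */
                return SCAN_ERROR;
            }
            if (e->error == 0)
                continue;           /* error 0 is an ACK, not a failure */
            *err = -e->error;       /* kernel sends a negative errno */
            return SCAN_ERROR;
        }

        if (nlmh->nlmsg_type == NLMSG_DONE)
            return SCAN_DONE;

        /* ...a real caller would process the payload here... */
    }
    return SCAN_MORE;               /* caller should receive again */
}
```

A caller that keeps getting SCAN_MORE still needs its own bound (for
example a receive timeout), otherwise a pid collision can leave it
waiting forever for a reply that never arrives.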


I've found a race in v3.18 in __netlink_lookup: rhashtable_hashfn
computes the hash using one table, and the following
rhashtable_lookup_compare dereferences ht->tbl once again and could see
a different table.
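
The pattern can be modelled in user space roughly as below. This is a
simplified illustration, not the kernel code and not the patch itself:
struct bucket_table, struct hashtable, bucket_of, probe and lookup are
hypothetical names, the table has one key per bucket with no chaining,
and a C11 atomic load stands in for the single RCU dereference.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct bucket_table {
    size_t size;             /* number of buckets, a power of two */
    const uint32_t *slots;   /* one key per bucket, 0 means empty */
};

struct hashtable {
    _Atomic(struct bucket_table *) tbl;   /* replaced by a resize */
};

/* The bucket index depends on the size of the table it is computed for. */
static size_t bucket_of(const struct bucket_table *t, uint32_t key)
{
    return key & (t->size - 1);
}

/*
 * Hash with one table, search in another.  Passing two different
 * tables models the race: the index was computed for the old size
 * but is used against the new table.
 */
static int probe(const struct bucket_table *hash_tbl,
                 const struct bucket_table *probe_tbl, uint32_t key)
{
    size_t idx = bucket_of(hash_tbl, key);

    return idx < probe_tbl->size && probe_tbl->slots[idx] == key;
}

/* Safe variant: load the table pointer once and use that snapshot for both steps. */
static int lookup(struct hashtable *ht, uint32_t key)
{
    struct bucket_table *t =
        atomic_load_explicit(&ht->tbl, memory_order_acquire);

    return probe(t, t, key);
}
```

With key 5, a 4-bucket table gives index 1 and an 8-bucket table gives
index 5, so a lookup that hashes against the old table and searches the
new one misses an entry that is present.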

patch follows...


If that was the case then we simply need to get rid of dynamic
rehashing.

Cheers,

