On Mon, 2017-11-06 at 14:48 +0800, Liu Yu wrote: > On Mon, Nov 6, 2017 at 1:27 PM, Eric Dumazet <eric.duma...@gmail.com> wrote: > > On Mon, 2017-11-06 at 10:28 +0800, Liu Yu wrote: > >> From: Liu Yu <allanyu...@tencent.com> > >> > >> When a mount of processes connect to the same port at the same address > >> simultaneously, they are likely getting the same bhash and therefore > >> conflict with each other. > >> > >> The more the cpu number, the worse in this case. > >> > >> Use spin_trylock instead for this scene, which seems doesn't matter > >> for common case. > >> > >> Signed-off-by: Liu Yu <allanyu...@tencent.com> > >> --- > >> net/ipv4/inet_hashtables.c | 6 +++++- > >> 1 files changed, 5 insertions(+), 1 deletions(-) > >> > >> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c > >> index e7d15fb..cc11ec7 100644 > >> --- a/net/ipv4/inet_hashtables.c > >> +++ b/net/ipv4/inet_hashtables.c > >> @@ -581,13 +581,17 @@ int __inet_hash_connect(struct > >> inet_timewait_death_row *death_row, > >> other_parity_scan: > >> port = low + offset; > >> for (i = 0; i < remaining; i += 2, port += 2) { > >> + int ret; > >> + > >> if (unlikely(port >= high)) > >> port -= remaining; > >> if (inet_is_local_reserved_port(net, port)) > >> continue; > >> head = &hinfo->bhash[inet_bhashfn(net, port, > >> hinfo->bhash_size)]; > >> - spin_lock_bh(&head->lock); > >> + ret = spin_trylock(&head->lock); > >> + if (unlikely(!ret)) > >> + continue; > >> > >> /* Does not bother with rcv_saddr checks, because > >> * the established check is already unique enough. > > > > This is broken. > > > > I am pretty sure you have not really tested this patch properly. > > > > Chances are very high that a connect() will miss slots and wont succeed, > > when table is almost full. > > Thanks for your comments! > > Can you explain how connect() miss slots when table is almost full?
Every time your spin_trylock() is failing, you will not examin one port. Now go back to the loop : for (i = 0; i < remaining; i += 2, port += 2) { Since we no longer look all the ports, we are going to fail to find a 4-tuple. Really a spin_trylock() is a not a good idea. Find something else if you really have contention at connect() > > > > > Performance is nice, but we actually need to allocate a 4-tuple in a > > more deterministic fashion. > > > > So, what's the 4th element would you suggest? What 4th element I am suggesting ?