Benjamin LaHaise wrote:
Hello again,

This patch introduces the use of RCU for the IPv4 established connections hashtable, as well as the timewait table, since they are closely intertwined. This removes 4 atomic operations per packet from the tcp_v4_rcv codepath, which helps quite a bit when the other performance barriers in the system are removed. Eliminating the rwlock cache bouncing should also help on SMP systems.

By itself, this improves local netperf performance on a P4/HT by ~260Mbit/s on average. With smaller packets (say, Ethernet-sized) the difference should be larger.


On second thought, do you think we still need one rwlock per hash chain?

TCP established hash table entries: 1048576 (order: 12, 16777216 bytes)

On this x86_64 machine, we 'waste' 8 MB of RAM on those rwlocks, that is, half of each 16-byte bucket (16777216 bytes / 1048576 entries).

With RCU, we touch these rwlocks only on TCP connection creation/deletion, so maybe we could reduce them to a single rwlock, or to a hashed array of 2^N rwlocks (with 2^N depending on NR_CPUS), like in net/ipv4/route.c?

Eric