This patch series enhances the IPv4 multipath code, adding support for hash-based multipath.

The multipath algorithm is a per-route attribute (RTA_MP_ALGO) with some degree of binary compatibility with the old implementation (2.6.12 - 2.6.22), but without source level compatibility since attributes have different names:

RT_MP_ALG_L3_HASH:
L3 hash-based distribution. This was IP_MP_ALG_NONE, which with the route cache behaved somewhat like L3-based distribution. This is now the default.

RT_MP_ALG_PER_PACKET:
Per-packet distribution. Was IP_MP_ALG_RR. Uses round-robin.

RT_MP_ALG_DRR, RT_MP_ALG_RANDOM, RT_MP_ALG_WRANDOM:
Unsupported values, but reserved because they existed in 2.6.12 - 2.6.22.

RT_MP_ALG_L4_HASH:
L4 hash-based distribution. This is new.

The traditional modulo approach was replaced by a threshold-based approach, described in RFC 2992. This reduces disruption in case of link failures or route changes.

To better support anycast environments where PMTU usually breaks with multipath, certain ICMP packets are hashed using the header within the payload, ensuring that ICMP packets are routed over the same path as the flow they belong to.

As a side effect, the multipath spinlock was removed and the code got faster. I measured ip_mkroute_input (excl. __mkroute_input) on a Xeon X3350 (2.66GHz) with two paths and L3 hashing:

1 thread:
Before: ~199.8 cycles(tsc)
After:   ~75.2 cycles(tsc)

4 threads:
Before: ~393.9 cycles(tsc)
After:   ~77.8 cycles(tsc)

If this patch is accepted, a follow-up patch to iproute2 will also be submitted.
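
For readers unfamiliar with RFC 2992, the threshold-based selection mentioned above can be sketched in user space roughly as follows. This is an illustration only and not the code in the patches: struct nexthop, nexthops_rebalance() and nexthop_select() are hypothetical names, and the 32-bit hash space and 16-bit weights are assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a next hop (not the kernel's struct fib_nh). */
struct nexthop {
	uint16_t weight;      /* relative share of traffic; 0 means unusable */
	uint32_t upper_bound; /* inclusive upper edge of this hop's hash region */
};

/*
 * Split the 32-bit hash space into contiguous regions whose sizes are
 * proportional to the weights. Call again whenever a weight changes.
 * Assumes the sum of all weights is below 2^32.
 */
static void nexthops_rebalance(struct nexthop *nh, size_t n)
{
	uint64_t total = 0, cum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += nh[i].weight;
	if (total == 0)
		return; /* no usable hop; bounds are left untouched */

	for (i = 0; i < n; i++) {
		cum += nh[i].weight;
		/* The last usable hop ends exactly at UINT32_MAX. */
		nh[i].upper_bound = (uint32_t)(((cum << 32) / total) - 1);
	}
}

/*
 * Return the index of the first usable hop whose region contains the
 * hash, or -1 if no hop is usable.
 */
static int nexthop_select(const struct nexthop *nh, size_t n, uint32_t hash)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (nh[i].weight && hash <= nh[i].upper_bound)
			return (int)i;
	}
	return -1;
}
```

When a hop's weight is set to 0 and the table is rebalanced, only the region boundaries move, so a flow whose hash still falls inside the same hop's region keeps its path. A hash % n scheme typically remaps a larger share of flows when n changes.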

Best regards,
Peter Nørlund

Peter Nørlund (3):
  ipv4: Lock-less per-packet multipath
  ipv4: L3 and L4 hash-based multipath routing
  ipv4: ICMP packet inspection for multipath

 include/net/ip_fib.h           |   10 ++-
 include/net/route.h            |    5 +
 include/uapi/linux/rtnetlink.h |   14 ++++
 net/ipv4/Kconfig               |    1
 net/ipv4/fib_frontend.c        |    4 +
 net/ipv4/fib_semantics.c       |  146 +++++++++++++++++++++++++---------------
 net/ipv4/icmp.c                |   29 +++++++-
 net/ipv4/route.c               |  108 +++++++++++++++++++++++++++---
 net/ipv4/xfrm4_policy.c        |    2 -
 9 files changed, 246 insertions(+), 73 deletions(-)