This patch series enhances the IPv4 multipath code, adding support for
hash-based multipath.

The multipath algorithm is a per-route attribute (RTA_MP_ALGO) with some
degree of binary compatibility with the old implementation (2.6.12 - 2.6.22),
but without source level compatibility since attributes have different names:

RT_MP_ALG_L3_HASH:
L3 hash-based distribution. This was IP_MP_ALG_NONE, which with the route
cache behaved somewhat like L3-based distribution. This is now the default.

RT_MP_ALG_PER_PACKET:
Per-packet distribution. Was IP_MP_ALG_RR. Uses round-robin.

RT_MP_ALG_DRR, RT_MP_ALG_RANDOM, RT_MP_ALG_WRANDOM:
Unsupported values, but reserved because they existed in 2.6.12 - 2.6.22.

RT_MP_ALG_L4_HASH:
L4 hash-based distribution. This is new.

The traditional modulo approach was replaced by a threshold-based approach,
described in RFC 2992. This reduces disruption in case of link failures or
route changes.

To better support anycast environments where PMTU usually breaks with
multipath, certain ICMP packets are hashed using the header within the
payload, ensuring that ICMP packets are routed over the same path as the
flow they belong to.

As a side effect, the multipath spinlock was removed and the code got faster.
I measured ip_mkroute_input (excl. __mkroute_input) on a Xeon X3350 (2.66GHz)
with two paths and L3 hashing:

1 thread:
Before: ~199.8 cycles(tsc)
After:   ~75.2 cycles(tsc)

4 threads:
Before: ~393.9 cycles(tsc)
After:   ~77.8 cycles(tsc)

If this patch is accepted, a follow-up patch to iproute2 will also be
submitted.

Best regards,
 Peter Nørlund

Peter Nørlund (3):
      ipv4: Lock-less per-packet multipath
      ipv4: L3 and L4 hash-based multipath routing
      ipv4: ICMP packet inspection for multipath

 include/net/ip_fib.h           |   10 ++-
 include/net/route.h            |    5 +
 include/uapi/linux/rtnetlink.h |   14 ++++
 net/ipv4/Kconfig               |    1 
 net/ipv4/fib_frontend.c        |    4 +
 net/ipv4/fib_semantics.c       |  146 +++++++++++++++++++++++++---------------
 net/ipv4/icmp.c                |   29 +++++++-
 net/ipv4/route.c               |  108 +++++++++++++++++++++++++++---
 net/ipv4/xfrm4_policy.c        |    2 -
 9 files changed, 246 insertions(+), 73 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to