Re: [Bridge] [PATCH net-next] net: bridge: use rhashtable for fdbs
On 12/12/17 20:02, Stephen Hemminger wrote:
> On Tue, 12 Dec 2017 16:02:50 +0200
> Nikolay Aleksandrov wrote:
>
>> +	memcpy(__entry->addr, f->key.addr.addr, ETH_ALEN);
>
> Maybe use ether_addr_copy() here?

This is an unrelated cleanup; the code in question was already like that. I can post a separate patch to turn these into ether_addr_copy().
Re: [Bridge] [PATCH net-next] net: bridge: use rhashtable for fdbs
On 12/12/17 20:07, Stephen Hemminger wrote:
> On Tue, 12 Dec 2017 16:02:50 +0200
> Nikolay Aleksandrov wrote:
>
>> Before this patch the bridge used a fixed 256 element hash table which
>> was fine for small use cases (in my tests it starts to degrade
>> above 1000 entries), but it wasn't enough for medium or large
>> scale deployments. Modern setups have thousands of participants in a
>> single bridge, even only enabling vlans and adding a few thousand vlan
>> entries will cause a few thousand fdbs to be automatically inserted per
>> participating port. So we need to scale the fdb table considerably to
>> cope with modern workloads, and this patch converts it to use a
>> rhashtable for its operations thus improving the bridge scalability.
>> Tests show the following results (10 runs each), at up to 1000 entries
>> rhashtable is ~3% slower, at 2000 rhashtable is 30% faster, at 3000 it
>> is 2 times faster and at 30000 it is 50 times faster.
>> Obviously this happens because of the properties of the two constructs
>> and is expected, rhashtable keeps pretty much a constant time even with
>> 10 million entries (tested), while the fixed hash table struggles
>> considerably even above 10000.
>> As a side effect this also reduces the net_bridge struct size from 3248
>> bytes to 1344 bytes. Also note that the key struct is 8 bytes.
>>
>> Signed-off-by: Nikolay Aleksandrov
>> ---
>
> Thanks for doing this, it was on my list of things that never get done.
>
> Some downsides:
> * size of the FDB entry gets larger.

It does not; due to SMP alignment of the write-heavy members we had a large hole between cache lines 1 and 2, the new 8 bytes fit perfectly and there are still bytes left to use.

> * you lost the ability to salt the hash (and rekey) which is important
>   for DDoS attacks

The hash is always salted (a property of rhashtable) and in fact this is better, because now the salt is generated for each rhashtable separately rather than having one global salt for all bridge devices.

> * being slower for small (<10 entries) also matters and is a common
>   use case for containers.

I think they're pretty comparable in speed; the difference is negligible IMO.
Re: [Bridge] [PATCH net-next] net: bridge: use rhashtable for fdbs
On Tue, 12 Dec 2017 16:02:50 +0200
Nikolay Aleksandrov wrote:

> Before this patch the bridge used a fixed 256 element hash table which
> was fine for small use cases (in my tests it starts to degrade
> above 1000 entries), but it wasn't enough for medium or large
> scale deployments. Modern setups have thousands of participants in a
> single bridge, even only enabling vlans and adding a few thousand vlan
> entries will cause a few thousand fdbs to be automatically inserted per
> participating port. So we need to scale the fdb table considerably to
> cope with modern workloads, and this patch converts it to use a
> rhashtable for its operations thus improving the bridge scalability.
> Tests show the following results (10 runs each), at up to 1000 entries
> rhashtable is ~3% slower, at 2000 rhashtable is 30% faster, at 3000 it
> is 2 times faster and at 30000 it is 50 times faster.
> Obviously this happens because of the properties of the two constructs
> and is expected, rhashtable keeps pretty much a constant time even with
> 10 million entries (tested), while the fixed hash table struggles
> considerably even above 10000.
> As a side effect this also reduces the net_bridge struct size from 3248
> bytes to 1344 bytes. Also note that the key struct is 8 bytes.
>
> Signed-off-by: Nikolay Aleksandrov
> ---

Thanks for doing this, it was on my list of things that never get done.

Some downsides:
* size of the FDB entry gets larger.
* you lost the ability to salt the hash (and rekey) which is important
  for DDoS attacks
* being slower for small (<10 entries) also matters and is a common
  use case for containers.
Re: [Bridge] [PATCH net-next] net: bridge: use rhashtable for fdbs
On Tue, 12 Dec 2017 16:02:50 +0200
Nikolay Aleksandrov wrote:

> +	memcpy(__entry->addr, f->key.addr.addr, ETH_ALEN);

Maybe use ether_addr_copy() here?
[Bridge] [PATCH net-next] net: bridge: use rhashtable for fdbs
Before this patch the bridge used a fixed 256 element hash table which
was fine for small use cases (in my tests it starts to degrade
above 1000 entries), but it wasn't enough for medium or large
scale deployments. Modern setups have thousands of participants in a
single bridge, even only enabling vlans and adding a few thousand vlan
entries will cause a few thousand fdbs to be automatically inserted per
participating port. So we need to scale the fdb table considerably to
cope with modern workloads, and this patch converts it to use a
rhashtable for its operations thus improving the bridge scalability.
Tests show the following results (10 runs each), at up to 1000 entries
rhashtable is ~3% slower, at 2000 rhashtable is 30% faster, at 3000 it
is 2 times faster and at 30000 it is 50 times faster.
Obviously this happens because of the properties of the two constructs
and is expected, rhashtable keeps pretty much a constant time even with
10 million entries (tested), while the fixed hash table struggles
considerably even above 10000.
As a side effect this also reduces the net_bridge struct size from 3248
bytes to 1344 bytes. Also note that the key struct is 8 bytes.

Signed-off-by: Nikolay Aleksandrov
---
After this I'll post patches for the per-port fdb limit option. Later
we can get rid of hash_lock altogether though that requires much more
careful changes.

 include/trace/events/bridge.h |   4 +-
 net/bridge/br_device.c        |  10 ++
 net/bridge/br_fdb.c           | 392 ++++++++++++++++++--------------------
 net/bridge/br_private.h       |  16 +-
 net/bridge/br_switchdev.c     |   8 +-
 5 files changed, 211 insertions(+), 219 deletions(-)

diff --git a/include/trace/events/bridge.h b/include/trace/events/bridge.h
index 1bee3e7fdf32..8ea966448b58 100644
--- a/include/trace/events/bridge.h
+++ b/include/trace/events/bridge.h
@@ -82,8 +82,8 @@ TRACE_EVENT(fdb_delete,
 	TP_fast_assign(
 		__assign_str(br_dev, br->dev->name);
 		__assign_str(dev, f->dst ? f->dst->dev->name : "null");
-		memcpy(__entry->addr, f->addr.addr, ETH_ALEN);
-		__entry->vid = f->vlan_id;
+		memcpy(__entry->addr, f->key.addr.addr, ETH_ALEN);
+		__entry->vid = f->key.vlan_id;
 	),
 
 	TP_printk("br_dev %s dev %s addr %02x:%02x:%02x:%02x:%02x:%02x vid %u",
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index af5b8c87f590..1285ca30ab0a 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -125,9 +125,16 @@ static int br_dev_init(struct net_device *dev)
 	if (!br->stats)
 		return -ENOMEM;
 
+	err = br_fdb_hash_init(br);
+	if (err) {
+		free_percpu(br->stats);
+		return err;
+	}
+
 	err = br_vlan_init(br);
 	if (err) {
 		free_percpu(br->stats);
+		br_fdb_hash_fini(br);
 		return err;
 	}
 
@@ -135,6 +142,7 @@ static int br_dev_init(struct net_device *dev)
 	if (err) {
 		free_percpu(br->stats);
 		br_vlan_flush(br);
+		br_fdb_hash_fini(br);
 	}
 
 	br_set_lockdep_class(dev);
@@ -148,6 +156,7 @@ static void br_dev_uninit(struct net_device *dev)
 	br_multicast_dev_del(br);
 	br_multicast_uninit_stats(br);
 	br_vlan_flush(br);
+	br_fdb_hash_fini(br);
 	free_percpu(br->stats);
 }
 
@@ -416,6 +425,7 @@ void br_dev_setup(struct net_device *dev)
 	br->dev = dev;
 	spin_lock_init(&br->lock);
 	INIT_LIST_HEAD(&br->port_list);
+	INIT_HLIST_HEAD(&br->fdb_list);
 	spin_lock_init(&br->hash_lock);
 
 	br->bridge_id.prio[0] = 0x80;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 4ea5c8bbe286..dc87fbc9a23b 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -28,14 +28,20 @@
 #include <trace/events/bridge.h>
 #include "br_private.h"
 
+static const struct rhashtable_params br_fdb_rht_params = {
+	.head_offset = offsetof(struct net_bridge_fdb_entry, rhnode),
+	.key_offset = offsetof(struct net_bridge_fdb_entry, key),
+	.key_len = sizeof(struct net_bridge_fdb_key),
+	.automatic_shrinking = true,
+	.locks_mul = 1,
+};
+
 static struct kmem_cache *br_fdb_cache __read_mostly;
 static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 		      const unsigned char *addr, u16 vid);
 static void fdb_notify(struct net_bridge *br,
 		       const struct net_bridge_fdb_entry *, int);
 
-static u32 fdb_salt __read_mostly;
-
 int __init br_fdb_init(void)
 {
 	br_fdb_cache = kmem_cache_create("bridge_fdb_cache",
@@ -45,7 +51,6 @@ int __init br_fdb_init(void)
 	if (!br_fdb_cache)
 		return -ENOMEM;
 
-	get_random_bytes(&fdb_salt, sizeof(fdb_salt));
 	return 0;
 }
 
@@ -54,6 +59,15 @@ void br_fdb_fini(void)
 	kmem_cache_destroy(br_fdb_cache);
 }
 
+int br_fdb_hash_init(struct