Re: [ovs-dev] [PATCH v5] Avoid dp_hash recirculation for balance-tcp bond selection mode
Thanks Ilya for comments. I will address them in the next patch-set. Warm Regards, Vishal Ajmera ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5] Avoid dp_hash recirculation for balance-tcp bond selection mode
> > Hi Vishal, > > I quickly tested your patch on two servers with 2 ixgbe cards each linked via > a > juniper switch with LACP. > With testpmd using a single core the switching rate raised from ~2.0 Mpps to > ~2.4+ Mpps, so I read at least a +20% gain. > > Please add an example command on how to enable it in the commit message, > e.g. > > ovs-vsctl set port bond0 other_config:opt-bond-tcp=true Thanks Matteo for testing the patch and sharing results. I will add example in the commit message for next patch-set. Warm Regards, Vishal Ajmera ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5] Avoid dp_hash recirculation for balance-tcp bond selection mode
On Thu, Aug 8, 2019 at 10:57 AM Vishal Deep Ajmera wrote: > > Problem: > > In OVS-DPDK, flows with output over a bond interface of type “balance-tcp” > (using a hash on TCP/UDP 5-tuple) get translated by the ofproto layer into > "HASH" and "RECIRC" datapath actions. After recirculation, the packet is > forwarded to the bond member port based on 8-bits of the datapath hash > value computed through dp_hash. This causes performance degradation in the > following ways: > > 1. The recirculation of the packet implies another lookup of the packet’s > flow key in the exact match cache (EMC) and potentially Megaflow classifier > (DPCLS). This is the biggest cost factor. > > 2. The recirculated packets have a new “RSS” hash and compete with the > original packets for the scarce number of EMC slots. This implies more > EMC misses and potentially EMC thrashing causing costly DPCLS lookups. > > 3. The 256 extra megaflow entries per bond for dp_hash bond selection put > additional load on the revalidation threads. > > Owing to this performance degradation, deployments stick to “balance-slb” > bond mode even though it does not do active-active load balancing for > VXLAN- and GRE-tunnelled traffic because all tunnel packet have the same > source MAC address. > > Proposed optimization: > -- > This proposal introduces a new load-balancing output action instead of > recirculation. > > Maintain one table per-bond (could just be an array of uint16's) and > program it the same way internal flows are created today for each possible > hash value(256 entries) from ofproto layer. Use this table to load-balance > flows as part of output action processing. > > Currently xlate_normal() -> output_normal() -> bond_update_post_recirc_rules() > -> bond_may_recirc() and compose_output_action__() generate > “dp_hash(hash_l4(0))” and “recirc()” actions. In this case the > RecircID identifies the bond. For the recirculated packets the ofproto layer > installs megaflow entries that match on RecircID and masked dp_hash and send > them to the corresponding output port. > > Instead, we will now generate actions as > "hash(l4(0)),lb_output(bond,)" > > This combines hash computation (only if needed, else re-use RSS hash) and > inline load-balancing over the bond. This action is used *only* for > balance-tcp > bonds in OVS-DPDK datapath (the OVS kernel datapath remains unchanged). > > Example: > > Current scheme: > --- > With 1 IP-UDP flow: > > flow-dump from pmd on cpu core: 2 > recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), > packets:2828969, bytes:181054016, used:0.000s, > actions:hash(hash_l4(0)),recirc(0x1) > > recirc_id(0x1),dp_hash(0x113683bd/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:2828937, bytes:181051968, used:0.000s, actions:2 > > With 8 IP-UDP flows (with random UDP src port): (all hitting same DPCL): > > flow-dump from pmd on cpu core: 2 > recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), > packets:2674009, bytes:171136576, used:0.000s, > actions:hash(hash_l4(0)),recirc(0x1) > > recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:377395, bytes:24153280, used:0.000s, actions:2 > recirc_id(0x1),dp_hash(0xb236c260/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:333486, bytes:21343104, used:0.000s, actions:1 > recirc_id(0x1),dp_hash(0x7d89eb18/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:348461, bytes:22301504, used:0.000s, actions:1 > recirc_id(0x1),dp_hash(0xa78d75df/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:633353, bytes:40534592, used:0.000s, actions:2 > recirc_id(0x1),dp_hash(0xb58d846f/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:319901, bytes:20473664, used:0.001s, actions:2 > recirc_id(0x1),dp_hash(0x24534406/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:334985, bytes:21439040, used:0.001s, actions:1 > recirc_id(0x1),dp_hash(0x3cf32550/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:326404, bytes:20889856, used:0.001s, actions:1 > > New scheme: > --- > We can do with a single flow entry (for any number of new flows): > > in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), > packets:2674009, bytes:171136576, used:0.000s, > actions:hash(l4(0)),lb_output(bond,1) > > A new CLI has been added to dump the per-PMD bond cache as given below. > > “sudo ovs-appctl dpif-netdev/pmd-bond-show” > > root@ubuntu-190:performance_scripts # sudo ovs-appctl > dpif-netdev/pmd-bond-show > pmd thread numa_id 0 core_id 4: > Bond cache: >
Re: [ovs-dev] [PATCH v5] Avoid dp_hash recirculation for balance-tcp bond selection mode
Hi. Not a full review. Few comments inline. Best regards, Ilya Maximets. On 08.08.2019 11:56, Vishal Deep Ajmera wrote: > Problem: > > In OVS-DPDK, flows with output over a bond interface of type “balance-tcp” > (using a hash on TCP/UDP 5-tuple) get translated by the ofproto layer into > "HASH" and "RECIRC" datapath actions. After recirculation, the packet is > forwarded to the bond member port based on 8-bits of the datapath hash > value computed through dp_hash. This causes performance degradation in the > following ways: > > 1. The recirculation of the packet implies another lookup of the packet’s > flow key in the exact match cache (EMC) and potentially Megaflow classifier > (DPCLS). This is the biggest cost factor. > > 2. The recirculated packets have a new “RSS” hash and compete with the > original packets for the scarce number of EMC slots. This implies more > EMC misses and potentially EMC thrashing causing costly DPCLS lookups. > > 3. The 256 extra megaflow entries per bond for dp_hash bond selection put > additional load on the revalidation threads. > > Owing to this performance degradation, deployments stick to “balance-slb” > bond mode even though it does not do active-active load balancing for > VXLAN- and GRE-tunnelled traffic because all tunnel packet have the same > source MAC address. > > Proposed optimization: > -- > This proposal introduces a new load-balancing output action instead of > recirculation. > > Maintain one table per-bond (could just be an array of uint16's) and > program it the same way internal flows are created today for each possible > hash value(256 entries) from ofproto layer. Use this table to load-balance > flows as part of output action processing. > > Currently xlate_normal() -> output_normal() -> bond_update_post_recirc_rules() > -> bond_may_recirc() and compose_output_action__() generate > “dp_hash(hash_l4(0))” and “recirc()” actions. In this case the > RecircID identifies the bond. For the recirculated packets the ofproto layer > installs megaflow entries that match on RecircID and masked dp_hash and send > them to the corresponding output port. > > Instead, we will now generate actions as > "hash(l4(0)),lb_output(bond,)" > > This combines hash computation (only if needed, else re-use RSS hash) and > inline load-balancing over the bond. This action is used *only* for > balance-tcp > bonds in OVS-DPDK datapath (the OVS kernel datapath remains unchanged). > > Example: > > Current scheme: > --- > With 1 IP-UDP flow: > > flow-dump from pmd on cpu core: 2 > recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), > packets:2828969, bytes:181054016, used:0.000s, > actions:hash(hash_l4(0)),recirc(0x1) > > recirc_id(0x1),dp_hash(0x113683bd/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:2828937, bytes:181051968, used:0.000s, actions:2 > > With 8 IP-UDP flows (with random UDP src port): (all hitting same DPCL): > > flow-dump from pmd on cpu core: 2 > recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), > packets:2674009, bytes:171136576, used:0.000s, > actions:hash(hash_l4(0)),recirc(0x1) > > recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:377395, bytes:24153280, used:0.000s, actions:2 > recirc_id(0x1),dp_hash(0xb236c260/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:333486, bytes:21343104, used:0.000s, actions:1 > recirc_id(0x1),dp_hash(0x7d89eb18/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:348461, bytes:22301504, used:0.000s, actions:1 > recirc_id(0x1),dp_hash(0xa78d75df/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:633353, bytes:40534592, used:0.000s, actions:2 > recirc_id(0x1),dp_hash(0xb58d846f/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:319901, bytes:20473664, used:0.001s, actions:2 > recirc_id(0x1),dp_hash(0x24534406/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:334985, bytes:21439040, used:0.001s, actions:1 > recirc_id(0x1),dp_hash(0x3cf32550/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), > packets:326404, bytes:20889856, used:0.001s, actions:1 > > New scheme: > --- > We can do with a single flow entry (for any number of new flows): > > in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no), > packets:2674009, bytes:171136576, used:0.000s, > actions:hash(l4(0)),lb_output(bond,1) > > A new CLI has been added to dump the per-PMD bond cache as given below. > > “sudo ovs-appctl dpif-netdev/pmd-bond-show” > > root@ubuntu-190:performance_scripts # sudo ovs-appctl > d
Re: [ovs-dev] [PATCH v5] Avoid dp_hash recirculation for balance-tcp bond selection mode
> > Hi, > > why not a static function in the header file? So it gets inlined. > > Regards, > -- > Matteo Croce > per aspera ad upstream Thanks Matteo for looking into this patch-set. Yes I agree. I will address your suggestion in the next revision. Warm Regards, Vishal ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5] Avoid dp_hash recirculation for balance-tcp bond selection mode
On Thu, Aug 8, 2019 at 10:57 AM Vishal Deep Ajmera wrote: > --- a/ofproto/bond.c > +++ b/ofproto/bond.c > @@ -1939,3 +1959,9 @@ bond_get_changed_active_slave(const char *name, struct > eth_addr *mac, > > return false; > } > + > +bool > +bond_get_cache_mode(const struct bond *bond) > +{ > +return bond->use_bond_cache; > +} Hi, why not a static function in the header file? So it gets inlined. Regards, -- Matteo Croce per aspera ad upstream ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5] Avoid dp_hash recirculation for balance-tcp bond selection mode
Bleep bloop. Greetings Vishal Deep Ajmera, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Line is 81 characters long (recommended limit is 79) #219 FILE: lib/dpif-netdev.c:732: * Note: This flag is decoupled from 'reload' WARNING: Line is 81 characters long (recommended limit is 79) #220 FILE: lib/dpif-netdev.c:733: * flag otherwise full pmd reload will become Lines checked: 1668, Warnings: 2, Errors: 0 Please check this out. If you feel there has been an error, please email acon...@bytheb.org Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev