Re: [ovs-dev] group dp_hash method works incorrectly when using snat
     CACHE_LINE_SIZE, cacheline0,
     uint32_t recirc_id;      /* Recirculation id carried with the
                                 recirculating packets. 0 for packets
                                 received from the wire. */
-    uint32_t dp_hash;        /* hash value computed by the recirculation
-                                action. */
+    uint32_t dp_hash;        /* hash value computed by dp_hash action. */
     uint32_t skb_priority;   /* Packet priority for QoS. */
     uint32_t pkt_mark;       /* Packet mark. */
     uint8_t ct_state;        /* Connection state. */

The list of places where we need to invalidate the dp_hash is probably not exhaustive.

BR, Jan

> -----Original Message-----
> From: ovs-dev-boun...@openvswitch.org <ovs-dev-boun...@openvswitch.org> On Behalf Of ychen
> Sent: Monday, 30 September, 2019 06:09
> To: d...@openvswitch.org
> Subject: [ovs-dev] group dp_hash method works incorrectly when using snat
>
> Hi,
>    We found that when the same TCP session uses snat with a dp_hash group
> as the output action, the SYN packet and the other packets behave
> differently: the SYN packet outputs to one group bucket, and the other
> packets output to another group bucket.
[ovs-dev] group dp_hash method works incorrectly when using snat
Hi,
   We found that when the same TCP session uses snat with a dp_hash group as the output action, the SYN packet and the other packets behave differently: the SYN packet outputs to one group bucket, and the other packets output to another group bucket.

Here are the OVS flows:

table=0,in_port=DOWN_PORT,tun_id=vni,ip,actions=ct(nat,zone=ZID,table=1)
table=1,ip,ct_state=+new,actions=ct(commit,nat(src=SNAT_PUB_IP),zone=ZID,table=2)
table=1,ip,ct_state=-new,actions=goto_table(table=2)
table=2,ip,actions=group:1
group=1,type=select,selection_method=dp_hash,bucket=actions=output:UP_PORT1,bucket=actions=output:UP_PORT2

Here are the datapath flows:

tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(src=192.168.100.16/255.255.255.240,frag=no), packets:5, bytes:455, used:2.978s, flags:FP., actions:meter(248),meter(249),ct(zone=1298,nat),recirc(0x176)

flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),ct_state(+new-inv),ct_zone(0x512),recirc_id(0x176),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:meter(250),ct(commit,zone=1298,nat(src=172.16.1.152:1024-65535)),recirc(0x177)
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),ct_state(-new-inv),ct_zone(0x512),recirc_id(0x176),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no), packets:4, bytes:389, used:3.002s, flags:FP., actions:set(eth(src=fa:25:fa:c2:52:71,dst=xx:xx:xx:xx:xx:xx)),set(ipv4(ttl=63)),hash(hash_l4(0)),recirc(0x178)

flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x178),dp_hash(0x8a6c9809/0xf),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:4, bytes:389, used:3.025s, flags:FP., actions:2
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x178),dp_hash(0xbab97b2e/0xf),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:3

flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x177),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no), packets:0, bytes:0, used:never, actions:set(eth(src=fa:25:fa:c2:52:71,dst=xx:xx:xx:xx:xx:xx)),set(ipv4(ttl=63)),hash(hash_l4(0)),recirc(0x178)

From the above datapath flows we can draw the following conclusions:
1. The first SYN packet matches ct_state=+new and recirculates 3 times.
2. The other packets match ct_state=-new and recirculate only 2 times.
3. Packets matching +new and packets matching -new get different dp_hash values, and hence may output to different ports.
   (TCP packets of the same session going out of different ports may increase the risk of reordering.)

We researched the OVS code and found the following:

static inline uint32_t
dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
                                const struct miniflow *mf)
{
    uint32_t hash, recirc_depth;

    if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
        hash = dp_packet_get_rss_hash(packet);
    } else {
        hash = miniflow_hash_5tuple(mf, 0);
        dp_packet_set_rss_hash(packet, hash);
    }

    /* The RSS hash must account for the recirculation depth to avoid
     * collisions in the exact match cache */
    recirc_depth = *recirc_depth_get_unsafe();
    if (OVS_UNLIKELY(recirc_depth)) {
        /* => this changes the RSS hash, and this function is called
         * before the EMC lookup. */
        hash = hash_finish(hash, recirc_depth);
        dp_packet_set_rss_hash(packet, hash);
    }

    return hash;
}

So is there any method to fix this problem? We tried changing the OVS flow to:

table=1,ip,ct_state=-new,actions=ct(commit,table=2)

and the problem disappears, but then packets matching ct_state=-new also need to recirculate 3 times, which may decrease performance.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev