Re: [ovs-dev] group dp_hash method works incorrectly when using snat

2019-09-30 Thread Jan Scheurich via dev
CACHE_LINE_SIZE, cacheline0,
 uint32_t recirc_id; /* Recirculation id carried with the
recirculating packets. 0 for packets
received from the wire. */
-uint32_t dp_hash;   /* hash value computed by the recirculation
-   action. */
+uint32_t dp_hash;   /* hash value computed by dp_hash action. */
 uint32_t skb_priority;  /* Packet priority for QoS. */
 uint32_t pkt_mark;  /* Packet mark. */
 uint8_t  ct_state;  /* Connection state. */

The list of places where we need to invalidate the dp_hash is probably not 
exhaustive.

BR, Jan



> -Original Message-
> From: ovs-dev-boun...@openvswitch.org  boun...@openvswitch.org> On Behalf Of ychen
> Sent: Monday, 30 September, 2019 06:09
> To: d...@openvswitch.org
> Subject: [ovs-dev] group dp_hash method works incorrectly when using snat
> 
> Hi,
>We found that when the same TCP session  using snat with dp_hash group
> as output actionj,
>SYN packet and the other packets behaves different, SYN packet outputs
> to one group bucket, and the other packets outputs to another group bucket.
> 
> 
>Here is the ovs flows:
> 
> table=0,in_port=DOWN_PORT,tun_id=vni,ip,actions=ct(nat,zone=ZID,table=1
> )
> 
> table=1,ip,ct_state=+new,ct(commit,nat,src=SNAT_PUB_IP,zone=ZID,table=2
> )
>table=1,ip,ct_state=-new,actions=goto_table(table=2)
>table=2,ip,actions=group:1
> 
> group=1,type=select,selection_method=dp_hash,bucket=actions=output:UP_
> PORT1,bucket=actions=output:UP_PORT2
> 
> 
>   Here is the datapath flow:
>   tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-
> df+csum+key)),recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth_type(0x080
> 0),ipv4(src=192.168.100.16/255.255.255.240,frag=no), packets:5, bytes:455,
> used:2.978s, flags:FP.,
> actions:meter(248),meter(249),ct(zone=1298,nat),recirc(0x176)
> flow-dump from pmd on cpu core: 6
> tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-
> df+csum+key)),ct_state(+new-
> inv),ct_zone(0x512),recirc_id(0x176),in_port(7),packet_type(ns=0,id=0),eth_t
> ype(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never,
> actions:meter(250),ct(commit,zone=1298,nat(src=172.16.1.152:1024-
> 65535)),recirc(0x177)
> tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-
> df+csum+key)),ct_state(-new-
> inv),ct_zone(0x512),recirc_id(0x176),in_port(7),packet_type(ns=0,id=0),eth(sr
> c=02:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,fra
> g=no), packets:4, bytes:389, used:3.002s, flags:FP.,
> actions:set(eth(src=fa:25:fa:c2:52:71,dst=xx:xx:xx:xx:xx:xx)),set(ipv4(ttl=63)),
> hash(hash_l4(0)),recirc(0x178)
> flow-dump from pmd on cpu core: 6
> tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-
> df+csum+key)),recirc_id(0x178),dp_hash(0x8a6c9809/0xf),in_port(7),packet_t
> ype(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:4, bytes:389,
> used:3.025s, flags:FP., actions:2
> tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-
> df+csum+key)),recirc_id(0x178),dp_hash(0xbab97b2e/0xf),in_port(7),packet_
> type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0,
> used:never, actions:3 flow-dump from pmd on cpu core: 6
> tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-
> df+csum+key)),recirc_id(0x177),in_port(7),packet_type(ns=0,id=0),eth(src=02:
> 00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no)
> , packets:0, bytes:0, used:never,
> actions:set(eth(src=fa:25:fa:c2:52:71,dst=xx:xx:xx:xx:xx:xx)),set(ipv4(ttl=63)),
> hash(hash_l4(0)),recirc(0x178)
> 
> 
> from the above datapath flow, we can get the conclusion:
>  1. the first SYN packet match ct_state=+new, and recirculates 3 times  2.
> other packets match ct_state=-new, and recirculates only 2 times  3. packet's
> match +new and packets match -new have different dp_hash value, hence
> may output to different port
>(same session TCP packets output to different port may increase the
> disorder risk)
> 
> 
> we researched ovs code, and found the following:
>  dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
> const struct miniflow *mf) {
> uint32_t hash, recirc_depth;
> 
> 
> if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
> hash = dp_packet_get_rss_hash(packet);
> } else {
> hash = miniflow_hash_5tuple(mf, 0);
> dp_packet_set_rss_hash(packet, hash);
> }
> 
> 
> /* The RSS hash must account for the recirculation depth to avoid
>  * collisions in the exact match cache */
> recirc_depth = *recirc_depth_get_unsafe();
> if (OVS_UNLIKELY(recir

[ovs-dev] group dp_hash method works incorrectly when using snat

2019-09-29 Thread ychen
Hi,
   We found that when the same TCP session  using snat with dp_hash group as 
output actionj, 
   SYN packet and the other packets behaves different, SYN packet outputs to 
one group bucket, and the other packets outputs to another group bucket.


   Here is the ovs flows:
   table=0,in_port=DOWN_PORT,tun_id=vni,ip,actions=ct(nat,zone=ZID,table=1)
   table=1,ip,ct_state=+new,ct(commit,nat,src=SNAT_PUB_IP,zone=ZID,table=2)
   table=1,ip,ct_state=-new,actions=goto_table(table=2)
   table=2,ip,actions=group:1
   
group=1,type=select,selection_method=dp_hash,bucket=actions=output:UP_PORT1,bucket=actions=output:UP_PORT2


  Here is the datapath flow:
  
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(src=192.168.100.16/255.255.255.240,frag=no),
 packets:5, bytes:455, used:2.978s, flags:FP., 
actions:meter(248),meter(249),ct(zone=1298,nat),recirc(0x176)
flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),ct_state(+new-inv),ct_zone(0x512),recirc_id(0x176),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:0, bytes:0, used:never, 
actions:meter(250),ct(commit,zone=1298,nat(src=172.16.1.152:1024-65535)),recirc(0x177)
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),ct_state(-new-inv),ct_zone(0x512),recirc_id(0x176),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no),
 packets:4, bytes:389, used:3.002s, flags:FP., 
actions:set(eth(src=fa:25:fa:c2:52:71,dst=xx:xx:xx:xx:xx:xx)),set(ipv4(ttl=63)),hash(hash_l4(0)),recirc(0x178)
flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x178),dp_hash(0x8a6c9809/0xf),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:4, bytes:389, used:3.025s, flags:FP., actions:2
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x178),dp_hash(0xbab97b2e/0xf),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:0, bytes:0, used:never, actions:3
flow-dump from pmd on cpu core: 6
tunnel(tun_id=0x1435,src=10.185.2.87,dst=10.185.2.93,flags(-df+csum+key)),recirc_id(0x177),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:00:00:00,dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no),
 packets:0, bytes:0, used:never, 
actions:set(eth(src=fa:25:fa:c2:52:71,dst=xx:xx:xx:xx:xx:xx)),set(ipv4(ttl=63)),hash(hash_l4(0)),recirc(0x178)


from the above datapath flow, we can get the conclusion:
 1. the first SYN packet match ct_state=+new, and recirculates 3 times
 2. other packets match ct_state=-new, and recirculates only 2 times
 3. packet's match +new and packets match -new have different dp_hash value, 
hence may output to different port
   (same session TCP packets output to different port may increase the disorder 
risk) 


we researched ovs code, and found the following:
 dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
const struct miniflow *mf)
{
uint32_t hash, recirc_depth;


if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
hash = dp_packet_get_rss_hash(packet);
} else {
hash = miniflow_hash_5tuple(mf, 0);
dp_packet_set_rss_hash(packet, hash);
}


/* The RSS hash must account for the recirculation depth to avoid
 * collisions in the exact match cache */
recirc_depth = *recirc_depth_get_unsafe();
if (OVS_UNLIKELY(recirc_depth)) {
hash = hash_finish(hash, recirc_depth);=> this code changes the RSS 
hash, and this function is called before EMC lookup
dp_packet_set_rss_hash(packet, hash);
}
return hash;
}


so is there any method to fix this problem? 
we tried change the ovs flow with :
 table=1,ip,ct_state=-new,actions=ct(commit, table=2)
and problem dispeer, but in this time ,packets match ct_state=-new also need 
recirc 3 times which may decrease performance.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev