[ovs-discuss] Questions about flow put failed(File exist)

2020-07-20 Thread Yutao (Simon, Cloud Infrastructure Service Product Dept.)
Hi all,



The ovs ran into a problem in our  environment: Flow continuous to upcall , 
because of flow_put_on_pmd failed.



Steps to trigger this problem:

Step1

Add two openflow rules:

#1. ovs-ofctl add-flow br-forward 
"table=0,in_port=1,dl_dst:3E:5B:51:81:5A:2F,ip,nw_dst=123.1.1.1/32,priority=10 
actions=output:2 "

#2. ovs-ofctl add-flow br-forward 
"table=0,in_port=1,dl_dst:3E:5B:51:81:5A:2F,ip,nw_src=200.175.100.200,priority=20
 actions=output:2"



Step2

Two traffic flow from #1 port:

#1. SIP:8.93.223.217   DIP:114.166.27.239

#2. SIP:128.93.223.217 DIP:114.166.27.239



It will generate two megaflow in datapath:

#1. 
recirc_id(0),in_port(2),eth(dst=3e:5b:51:81:5a:2f),eth_type(0x8100),encap(eth_type(0x0800),ipv4(src=8.93.223.217/128.0.0.0,dst=114.166.27.239/255.0.0.0,frag=no)),

packets:12825613, bytes:3232054476, used:0.000s, 
actions:ext_action(action=route,args(table_index=0;policy=ip_src))

#2. 
recirc_id(0),in_port(2),eth(dst=3e:5b:51:81:5a:2f),eth_type(0x8100),encap(eth_type(0x0800),ipv4(src=128.93.223.217/192.0.0.0,dst=114.166.27.239/255.0.0.0,frag=no)),

packets:12825613, bytes:3232054476, used:0.000s, 
actions:ext_action(action=route,args(table_index=0;policy=ip_src))



Step3

Stop #2 traffic flow in Step 2(SIP:128.93.223.217), wait the #2 megaflow aged;



Step4

Delete the #2 openflow rule in Step1, then start the #2 
traffic(SIP:128.93.223.217) again。

Problem occurs---the #2 traffic flow continuous to upcall , because of 
flow_put_on_pmd failed, error is “File exist”.



Root cause analysis:

1.   In Step3, we delete the #2 openflow rule then wait #2 megaflow aged. 
Now, there is only one megaflow in datapath:

recirc_id(0),in_port(2),eth(dst=3e:5b:51:81:5a:2f),eth_type(0x8100),encap(eth_type(0x0800),ipv4(src=8.93.223.217/128.0.0.0,dst=114.166.27.239/255.0.0.0,frag=no)),

packets:12825613, bytes:3232054476, used:0.000s, 
actions:ext_action(action=route,args(table_index=0;policy=ip_src))



2.   In Step 4 , start the #2 traffic flow again. The flow lookup cls in 
pmd thread, but it will not match, then packet will upcall to handler.

3.   In handler thread, the flow will match  
"table=0,in_port=1,dl_dst:3E:5B:51:81:5A:2F,ip,nw_dst=123.1.1.1/32,priority=10 
actions=output:2 ". So the src_ip in mask will be 0.0.0.0(No openflow rule to 
match src_ip).

4.   Before put megaflow to pmd, handler will check if the megaflow is 
exist in cls. The key to lookup in cls will be masked in dpif_netdev_flow_put 
(now,src_ip in key is 0.0.0.0). Unfortunately, It will match the #2 megaflow, 
File exist, Put faild. Flow continus to upcall



The key to lookup cls in fast_path_processing(-->dp_netdev_pmd_lookup_flow) is 
different from the one in flow_put_on_pmd(-->dp_netdev_pmd_lookup_flow).



Is there any useful patch to fix this bug? How about the following two 
modifications:



1.   Use the same key to lookup cls in handler thread

---

lib/dpif-netdev.c | 10 ++

1 file changed, 6 insertions(+), 4 deletions(-)



diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c

index e456cc9..facb9f6 100644

--- a/lib/dpif-netdev.c

+++ b/lib/dpif-netdev.c

@@ -3524,10 +3524,12 @@ dpif_netdev_flow_put(struct dpif *dpif, const struct 
dpif_flow_put *put)

 }



 /* Must produce a netdev_flow_key for lookup.

- * Use the same method as employed to create the key when adding

- * the flow to the dplcs to make sure they match. */

-netdev_flow_mask_init(, );

-netdev_flow_key_init_masked(, , );

+ * Must generate key by flow to guarantee that key is the same as the 
key in

+ * fast_path_processing().

+ * ps: key.hash will not be used in lookup. */

+miniflow_map_init(, >flow);

+miniflow_init(, >flow);

+key.len = netdev_flow_key_size(miniflow_n_values());



 if (put->pmd_id == PMD_ID_NULL) {

 if (cmap_count(>poll_threads) == 0) {

--



2.   Modify the revalidate_ukey__ to delete the megaflow when mask changes



From c7bda1680832a18946b8459f6b937d62b02d0280 Mon Sep 17 00:00:00 2001

From: f00448292 mailto:fuzhan...@huawei.com>>

Date: Mon, 20 Jul 2020 14:44:34 +0800

Subject: [PATCH] NEW



Signed-off-by: f00448292 mailto:fuzhan...@huawei.com>>

---

ofproto/ofproto-dpif-upcall.c | 2 +-

1 file changed, 1 insertion(+), 1 deletion(-)



diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c

index 8dfa05b..8a8c233 100644

--- a/ofproto/ofproto-dpif-upcall.c

+++ b/ofproto/ofproto-dpif-upcall.c

@@ -2209,7 +2209,7 @@ revalidate_ukey__(struct udpif *udpif, const struct 
udpif_key *ukey,

  * tells that the datapath flow is now too generic and must be narrowed

  * down.  Note that we do not know if the datapath has ignored any of the

  * wildcarded bits, so we may be overly conservative here. */

-if (flow_wildcards_has_extra(_mask, ctx.wc)) {

+if (flow_wildcards_equal(_mask, ctx.wc)) {

 goto exit;

 }



--

Looking forward to your reply.



[ovs-discuss] Questions about flow put failed(File exist)

2020-07-14 Thread Yutao (Simon, Cloud Infrastructure Service Product Dept.)
Hi all,



The ovs ran into a problem in our  environment: Flow continuous to upcall , 
because of flow_put_on_pmd failed.



Steps to trigger this problem:

Step1

Add two openflow rules:

#1. ovs-ofctl add-flow br-forward 
"table=0,in_port=1,dl_dst:3E:5B:51:81:5A:2F,ip,nw_dst=123.1.1.1/32,priority=10 
actions=output:2 "

#2. ovs-ofctl add-flow br-forward 
"table=0,in_port=1,dl_dst:3E:5B:51:81:5A:2F,ip,nw_src=200.175.100.200,priority=20
 actions=output:2"



Step2

Two traffic flow from #1 port:

#1. SIP:8.93.223.217   DIP:114.166.27.239

#2. SIP:128.93.223.217 DIP:114.166.27.239



It will generate two megaflow in datapath:

#1. 
recirc_id(0),in_port(2),eth(dst=3e:5b:51:81:5a:2f),eth_type(0x8100),encap(eth_type(0x0800),ipv4(src=8.93.223.217/128.0.0.0,dst=114.166.27.239/255.0.0.0,frag=no)),

packets:12825613, bytes:3232054476, used:0.000s, 
actions:ext_action(action=route,args(table_index=0;policy=ip_src))

#2. 
recirc_id(0),in_port(2),eth(dst=3e:5b:51:81:5a:2f),eth_type(0x8100),encap(eth_type(0x0800),ipv4(src=128.93.223.217/192.0.0.0,dst=114.166.27.239/255.0.0.0,frag=no)),

packets:12825613, bytes:3232054476, used:0.000s, 
actions:ext_action(action=route,args(table_index=0;policy=ip_src))



Step3

Stop #2 traffic flow in Step 2(SIP:128.93.223.217), wait the #2 megaflow aged;



Step4

Delete the #2 openflow rule in Step1, then start the #2 
traffic(SIP:128.93.223.217) again。

Problem occurs---the #2 traffic flow continuous to upcall , because of 
flow_put_on_pmd failed, error is “File exist”.



Root cause analysis:

1.   In Step3, we delete the #2 openflow rule then wait #2 megaflow aged. 
Now, there is only one megaflow in datapath:

recirc_id(0),in_port(2),eth(dst=3e:5b:51:81:5a:2f),eth_type(0x8100),encap(eth_type(0x0800),ipv4(src=8.93.223.217/128.0.0.0,dst=114.166.27.239/255.0.0.0,frag=no)),

packets:12825613, bytes:3232054476, used:0.000s, 
actions:ext_action(action=route,args(table_index=0;policy=ip_src))



2.   In Step 4 , start the #2 traffic flow again. The flow lookup cls in 
pmd thread, but it will not match, then packet will upcall to handler.

3.   In handler thread, the flow will match  
"table=0,in_port=1,dl_dst:3E:5B:51:81:5A:2F,ip,nw_dst=123.1.1.1/32,priority=10 
actions=output:2 ". So the src_ip in mask will be 0.0.0.0(No openflow rule to 
match src_ip).

4.   Before put megaflow to pmd, handler will check if the megaflow is 
exist in cls. The key to lookup in cls will be masked in dpif_netdev_flow_put 
(now,src_ip in key is 0.0.0.0). Unfortunately, It will match the #2 megaflow, 
File exist, Put faild. Flow continus to upcall



The key to lookup cls in fast_path_processing(-->dp_netdev_pmd_lookup_flow) is 
different from the one in flow_put_on_pmd(-->dp_netdev_pmd_lookup_flow).



Is there any useful patch to fix this bug?  How about this:

---

lib/dpif-netdev.c | 10 ++

1 file changed, 6 insertions(+), 4 deletions(-)



diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c

index e456cc9..facb9f6 100644

--- a/lib/dpif-netdev.c

+++ b/lib/dpif-netdev.c

@@ -3524,10 +3524,12 @@ dpif_netdev_flow_put(struct dpif *dpif, const struct 
dpif_flow_put *put)

 }



 /* Must produce a netdev_flow_key for lookup.

- * Use the same method as employed to create the key when adding

- * the flow to the dplcs to make sure they match. */

-netdev_flow_mask_init(, );

-netdev_flow_key_init_masked(, , );

+ * Must generate key by flow to guarantee that key is the same as the 
key in

+ * fast_path_processing().

+ * ps: key.hash will not be used in lookup. */

+miniflow_map_init(, >flow);

+miniflow_init(, >flow);

+key.len = netdev_flow_key_size(miniflow_n_values());



 if (put->pmd_id == PMD_ID_NULL) {

 if (cmap_count(>poll_threads) == 0) {

--



Looking forward to your reply.



Thanks.

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] Questions about qinq packet get a wrong hash from 'skb_get_hash'

2020-04-03 Thread Yutao (Simon, Cloud Infrastructure Service Product Dept.)
Hi all,

We find a problem in our our environment with ovs+linux-kernel-- The same 
flow's packets will get different hash from 'skb_get_hash', if we push a vxlan 
header to these packets, the outter udp's sprot will be different.

The call stack and corresponding skb info:
ovs_dp_process_packet --skb->data:a0c25fb502f2(mac header), 
skb->network_header:a0c25fb50304(ip header), skb->protocol:0x8100, 
skb->protocol:0x8100,, is_vlan_tag_present:true
ovs_vport_receive
netdev_port_receive
netdev_frame_hook skb->data:a0c25fb50300(inner vlan header), 
skb->network_header:a0c25fb50300, skb->protocol:0x8100, 
skb->protocol:0x8100, is_vlan_tag_present:true
__netif_receive_skb_core
__netif_receive_skb
netif_receive_skb_internal
napi_gro_receive
recv_one_pkt
hinic_rx_poll
hinic_poll
net_rx_action
__do_softirq
call_softirq

Problem Description:
When ovs received a qinq packet, the kernel has untagged the outer vlan and 
save vlan_id in skb-> vlan_tci, vlan_protocol in skb->vlan_proto. Now 
skb->protocol=0x8100,skb->vlan_protocol=0x8100 skb->.

In 'netdev_frame_hook': skb->data point to packet's inner vlan, 
skb->network_header point to inner vlan too.

In 'ovs_vport_receive': After 'ovs_flow_key_extract', the skb->data will point 
to mac header, skb->network_header will point to ip header. Then 
'ovs_dp_process_packet' will call 'skb_get_hash'. 'skb_get_hash' will call 
'__skb_flow_dissect' finally.
[cid:image001.jpg@01D60995.BC93D3B0]

In '__skb_flow_dissect'( "flow_dissector.c" line 1161 of newest kernel code) . 
Because the network header is pointing to ip, when skb enter the vlan case in 
the second time, it will extract vlan info in ip header, then 
'__skb_flow_dissect' will return a error hash:

case htons(ETH_P_8021Q): {
   const struct vlan_hdr *vlan = NULL;
   struct vlan_hdr _vlan;
   __be16 saved_vlan_tpid = proto;

   if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX &&
   skb && skb_vlan_tag_present(skb)) {
proto = skb->protocol;
   } else {
vlan = __skb_header_pointer(skb, nhoff, 
sizeof(_vlan),
data, hlen, &_vlan);
if (!vlan) {
 fdret = FLOW_DISSECT_RET_OUT_BAD;
 break;
}

proto = vlan->h_vlan_encapsulated_proto;
nhoff += sizeof(*vlan);
   }



Is there any useful patch to fix this bug? Or some suggestion?
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss