On 10/1/2021 5:43 PM, Eelco Chaudron wrote:

On 1 Oct 2021, at 11:35, Eelco Chaudron wrote:

Hi Chris,

I just started to review this patchset, but as some of the v14 discussions have 
not finished, I’ll copy them over to v15. This way, all the open items are 
contained in a single thread/revision. I would suggest clear up the open items 
before you send a new revision.

This is only a visual review, as I did not test the code, assuming we need a 
new revision anyway with the test cases integrated.

Cheers,

Eelco

Forgot about this issue (see below), was this fixed in v15? I did not see any 
reply to the email chain?
I still can't reproduce the issue locally, so I didn't reply the email explicitly. But I revised the code in v15. I think the change should fix it. In function dpif_sflow_attr_equal() of v14, only ufid is compared. In v15, all the fields are compared. That will prevent the gid mapping to be reused wrongly when the flow updates the action for the same ufid. Since this issue is a corner case, in dpif_sflow_attr_hash(), only ufid is hashed to save CPU cycles.
Hopefully that will fix the issue. 😁


Here is some debug info, it seems related to an update to an existing handle, 
which does not update the tc part:

  Here the flow gets added:

  2021-09-09T14:01:12.278Z|00001|netdev_offload_tc(handler1)|ERR|EC_DBG: Enter 
b13915aa-c7a8-4574-8dcf-439f03929b8e.

  2021-09-09T14:01:12.278Z|00002|netdev_offload_tc(handler1)|ERR|EC_DEBUG: port 
-2147483646

  2021-09-09T14:01:12.285Z|00003|netdev_offload_tc(handler1)|ERR|EC_DBG: EXIT 
ok b13915aa-c7a8-4574-8dcf-439f03929b8e.

  Here the revalidator updates it with new port information:

  2021-09-09T14:01:12.537Z|00001|netdev_offload_tc(revalidator13)|ERR|EC_DBG: 
Enter b13915aa-c7a8-4574-8dcf-439f03929b8e.

  2021-09-09T14:01:12.537Z|00002|netdev_offload_tc(revalidator13)|ERR|EC_DEBUG: 
port 3

  2021-09-09T14:01:12.537Z|00004|netdev_offload_tc(revalidator13)|DBG|updating 
old handle: 1 prio: 2

  2021-09-09T14:01:12.553Z|00005|netdev_offload_tc(revalidator13)|ERR|EC_DBG: 
EXIT ok b13915aa-c7a8-4574-8dcf-439f03929b8e.

  Here is a dp dump showing the wrong/old port:

  $ ovs-appctl dpctl/dump-flows -m system@ovs-system type=tc

  ufid:b13915aa-c7a8-4574-8dcf-439f03929b8e, 
skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(vnet4),packet_type(ns=0/0,id=0/0),eth(src=52:54:00:88:51:38,dst=04:f4:bc:28:57:01),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no),
 packets:180, bytes:15120, used:0.230s, dp:tc, 
actions:sample(sample=10.0%,actions(userspace(pid=2771969963,sFlow(vid=0,pcp=0,output=2147483650),actions))),enp5s0f0

  Guess this is the area you need to look at del_filter_and_ufid_mapping():

  2389     if (get_ufid_tc_mapping(ufid, &id) == 0) {

  2390         VLOG_DBG_RL(&rl, "updating old handle: %d prio: %d",

  2391                     id.handle, id.prio);

  2392         info->tc_modify_flow_deleted = !del_filter_and_ufid_mapping(&id, 
ufid);

  2393     }

  There might also be a problem in the sgid mapping as it’s also using the uuid 
as the hash, and there could be a collision during the update phase.

  I’m working on some escalations so, please take a look and see if you can fix 
this.

  I need to restart OVS, and then send a ping, and I see the problem once, I 
need to restart OVS each time to see it.

  I’ll wait for v15 to see how you fixed it ;)

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to