On 8/8/23 09:20, Peng He wrote: > > > Ilya Maximets <i.maxim...@ovn.org <mailto:i.maxim...@ovn.org>> 于2023年8月8日周二 > 07:30写道: > > On 7/31/23 06:55, Peng He wrote: > > OVS allows overlapping megaflows, as long as the actions of these > > megaflows are equal. However, the current implementation of action > > modification relies on flow_lookup instead of ufid, this could result > > in looking up a wrong megaflow and make the ukeys and megaflows > inconsistent > > > > Just like the test case in the patch, at first we have a rule with the > > prefix: > > > > 10.1.2.0/24 <http://10.1.2.0/24> > > > > and we will get a megaflow with prefixes 10.1.2.2/24 > <http://10.1.2.2/24> when a packet with IP > > 10.1.2.2 is received. > > > > Then suppose we change the rule into 10.1.0.0/16 <http://10.1.0.0/16>. > OVS prefers to keep the > > 10.1.2.2/24 <http://10.1.2.2/24> megaflow and just changes its action > instead of extending > > the prefix into 10.1.2.2/16 <http://10.1.2.2/16>. > > > > then suppose we have a 10.1.0.2 packet, since it misses the megaflow, > > this time, we will have an overlapping megaflow with the right prefix: > > 10.1.0.2/16 <http://10.1.0.2/16> > > > > now we have two megaflows: > > 10.1.2.2/24 <http://10.1.2.2/24> > > 10.1.0.2/16 <http://10.1.0.2/16> > > > > last, suppose we have changed the ruleset again. The revalidator this > > time still decides to change the actions of both megaflows instead of > > deleting them. > > > > The dpif_netdev_flow_put will search the megaflow to modify with > unmasked > > keys, however it might lookup the wrong megaflow as the key 10.1.2.2 > matches > > both 10.1.2.2/24 <http://10.1.2.2/24> and 10.1.0.2/16 > <http://10.1.0.2/16>! > > > > This patch changes the megaflow lookup code in modification path into > > relying the ufid to find the correct megaflow instead of key lookup. > > > > Signed-off-by: Peng He <hepeng.0...@bytedance.com > <mailto:hepeng.0...@bytedance.com>> > > --- > > Thanks for the update! This version looks logically correct to me. > See some small comments below. > > Best regards, Ilya Maximets. > > > lib/dpif-netdev.c | 49 ++++++++++++++++++++++++++++++++--------------- > > tests/pmd.at <http://pmd.at> | 48 > ++++++++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 82 insertions(+), 15 deletions(-) > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > > index 70b953ae6..8c88a5dc0 100644 > > --- a/lib/dpif-netdev.c > > +++ b/lib/dpif-netdev.c > > @@ -4191,7 +4191,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd, > > const struct dpif_flow_put *put, > > struct dpif_flow_stats *stats) > > { > > - struct dp_netdev_flow *netdev_flow; > > + struct dp_netdev_flow *netdev_flow = NULL; > > int error = 0; > > > > if (stats) { > > @@ -4199,16 +4199,39 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread > *pmd, > > } > > > > ovs_mutex_lock(&pmd->flow_mutex); > > - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > > - if (!netdev_flow) { > > - if (put->flags & DPIF_FP_CREATE) { > > - dp_netdev_flow_add(pmd, match, ufid, put->actions, > > - put->actions_len, ODPP_NONE); > > + if (put->ufid) { > > + netdev_flow = dp_netdev_pmd_find_flow(pmd, put->ufid, > > + put->key, put->key_len); > > + } else { > > + /* Use key instead of the locally generated ufid > > + * to search netdev_flow. > > + */ > > Nit: Usually, /* and */ in the comment are on their own lines or > both on the lines with the text, i.e. > > /* > * Line 1 > * Line 2 > */ > > Or > > /* Line 1 > * Line 2 */ > > Rarely the style is mixed. In the case above, I'd suggest to move > the closing '*/' to the previous line. > > > + netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > > + } > > + > > + if (put->flags & DPIF_FP_CREATE) { > > + if (!netdev_flow) { > > + /* If ufid is provided, use provided ufid, otherwise > > + * use locally generated ufid. > > + */ > > + dp_netdev_flow_add(pmd, match, put->ufid ? put->ufid : > ufid, > > This looks abit strange, because ufid equals put->ufid, if provided. > I'd remove the comment and just use 'ufid' here without a ternary > operator. > > Or maybe a cleaner solution is to remove 'ufid = *put->ufid;' assignment > from dpif_netdev_flow_put() and re-name the argument of flow_put_on_pmd() > function from 'ufid' to 'local_ufid' or something like that. In this > case we should still remove the comment, but keep the ternary operator. > Might be easier to read this way. What do you think? > > > I miss ufid = put->ufid here. If so, I guess just use ufid here is fine. > > > > > > + put->actions, put->actions_len, > ODPP_NONE); > > } else { > > - error = ENOENT; > > + error = EEXIST; > > } > > - } else { > > - if (put->flags & DPIF_FP_MODIFY) { > > + goto exit; > > + } > > + > > + if (put->flags & DPIF_FP_MODIFY) { > > + if (!netdev_flow) { > > + error = ENOENT; > > + } else { > > + if (!flow_equal(&match->flow, &netdev_flow->flow)) { > > Should this be (!put->ufid && !flow_equal(&match->flow, > &netdev_flow->flow)) ? > i.e. is it necessary to compare flows if found by external uuid? > > Yes, no need to check against the flow if ufid is provided. > will fix it in the next version. > > Will fix all the issues you mentioned below. > > > > > + /* Overlapping flow. */ > > + error = EINVAL; > > + goto exit; > > + } > > + > > struct dp_netdev_actions *new_actions; > > struct dp_netdev_actions *old_actions; > > > > @@ -4239,15 +4262,11 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread > *pmd, > > * counter, and subtracting it before outputting the > stats */ > > error = EOPNOTSUPP; > > } > > - > > ovsrcu_postpone(dp_netdev_actions_free, old_actions); > > - } else if (put->flags & DPIF_FP_CREATE) { > > - error = EEXIST; > > - } else { > > - /* Overlapping flow. */ > > - error = EINVAL; > > } > > } > > + > > +exit: > > ovs_mutex_unlock(&pmd->flow_mutex); > > return error; > > } > > diff --git a/tests/pmd.at <http://pmd.at> b/tests/pmd.at <http://pmd.at> > > index 48f3d432d..f399294d2 100644 > > --- a/tests/pmd.at <http://pmd.at> > > +++ b/tests/pmd.at <http://pmd.at> > > @@ -1300,3 +1300,51 @@ OVS_WAIT_UNTIL([tail -n +$LINENUM > ovs-vswitchd.log | grep "PMD load based sleeps > > > > OVS_VSWITCHD_STOP > > AT_CLEANUP > > + > > +AT_SETUP([PMD - revalidator wrongly modify userspace megaflows]) > > + > > +OVS_VSWITCHD_START( > > +[add-port br0 p1 \ > > + -- set interface p1 type=dummy-pmd \ > > + -- set bridge br0 datapath-type=dummy \ > > It's probably better to change order of above two lines. > > > + -- add-port br0 p2 \ > > + -- set interface p2 type=dummy-pmd -- > > '--' at the end is not needed. > > > +]) > > + > > +dnl Add one openflow rule and generate a megaflow. > > +AT_CHECK([ovs-ofctl add-flow br0 > 'table=0,in_port=p1,ip,nw_dst=10.1.2.0/24,actions=p2' > <http://10.1.2.0/24,actions=p2'>]) > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > + > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//'], > [0], [ > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no > <http://10.1.2.2/255.255.255.0,frag=no>), packets:0, bytes:0, used:never, > actions:2 > > +]) > > For some reason, this check always fails on FreeBSD: > > ./pmd.at:1349 <http://pmd.at:1349>: ovs-appctl dpctl/dump-flows | sed > 's/.*core: [0-9]*//' > --- - 2023-08-07 22:43:09.208141000 +0000 > +++ /tmp/cirrus-ci-build/tests/testsuite.dir/at-groups/1088/stdout > 2023-08-07 22:43:09.207372000 +0000 > @@ -1,3 +1 @@ > > > -recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no > <http://10.1.2.2/255.255.255.0,frag=no>), packets:0, bytes:0, used:never, > actions:2 > - > --- > > The flow dump is empty, presumably the PMD thread didn't fully > process the packet yet at the time of a dump. > > > So, what's your suggestion? set a large max_idle and sleep like a 1s?
Sleeps are not great. We should use OVS_WAIT_UNTIL_EQUAL, I guess. > > > > + > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > +dnl Replace openflow rules, trigger the revalidation. > > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 > <http://10.1.0.0/16> actions=ct(commit)' | dnl > > +ovs-ofctl --bundle replace-flows br0 -]) > > Indent this line to the level of 'echo' above. > > > +AT_CHECK([ovs-appctl revalidator/wait]) > > + > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | > strip_xout_keep_actions], [0], [ > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no > <http://10.1.0.2/255.255.0.0,frag=no>), packets:0, bytes:0, used:never, > actions:ct(commit) > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no > <http://10.1.2.2/255.255.255.0,frag=no>), packets:0, bytes:0, used:0.0s, > actions:ct(commit) > > +]) > > + > > +dnl Hold the prefix 10.1.2.2/24 <http://10.1.2.2/24> by another 10s. > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)']) > > +dnl Send more 10.1.0.2 to make 10.1.0.0/16 <http://10.1.0.0/16> tuple > preprend 10.1.2.0/24 <http://10.1.2.0/24> tuple in the pvector of subtables. > > +for i in `seq 0 256`;do > > A space after ';'. Also, it's better to use $(...) instead of `...`. > > > +AT_CHECK([ovs-appctl netdev-dummy/receive p1 > 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)']) > > Indent this 2 spaces to the right. > > > +done > > + > > +AT_CHECK([echo 'table=0,in_port=p1,ip,nw_dst=10.1.0.0/16 > <http://10.1.0.0/16> actions=p2' | dnl > > +ovs-ofctl --bundle replace-flows br0 -]) > > Indentation here again. > > Should, probably, wait here for revalidation as well? > > > + > > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*core: [[0-9]]*//' | > strip_xout_keep_actions], [0], [ > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.0.2/255.255.0.0,frag=no > <http://10.1.0.2/255.255.0.0,frag=no>), packets:0, bytes:0, used:0.0s, > actions:2 > > > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(dst=10.1.2.2/255.255.255.0,frag=no > <http://10.1.2.2/255.255.255.0,frag=no>), packets:0, bytes:0, used:0.0s, > actions:2 > > +]) > > + > > +OVS_VSWITCHD_STOP > > +AT_CLEANUP > > > > -- > hepeng _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev