Ilya Maximets <i.maxim...@ovn.org> 于2023年7月28日周五 00:59写道:
> On 7/27/23 04:31, Peng He wrote: > > > > > > Eelco Chaudron <echau...@redhat.com <mailto:echau...@redhat.com>> > 于2023年7月6日周四 15:52写道: > > > > > > > > On 6 Jul 2023, at 4:32, Peng He wrote: > > > > > Eelco Chaudron <echau...@redhat.com <mailto:echau...@redhat.com>> > 于2023年7月5日周三 22:16写道: > > > > > >> > > >> > > >> On 1 Jul 2023, at 4:43, Peng He wrote: > > >> > > >>> OVS allows overlapping megaflows, as long as the actions of these > > >>> megaflows are equal. However, the current implementation of > action > > >>> modification relies on flow_lookup instead of ufid, this could > result > > >>> in looking up a wrong megaflow and make the ukeys and megaflows > > >> inconsistent > > >>> > > >>> Just like the test case in the patch, at first we have a rule > with the > > >>> prefix: > > >>> > > >>> 10.1.2.0/24 <http://10.1.2.0/24> > > >>> > > >>> and we will get a megaflow with prefixes 10.1.2.2/24 < > http://10.1.2.2/24> when a packet with > > >> IP > > >>> 10.1.2.2 is received. > > >>> > > >>> Then suppose we change the rule into 10.1.0.0/16 < > http://10.1.0.0/16>. OVS prefers to keep > > >> the > > >>> 10.1.2.2/24 <http://10.1.2.2/24> megaflow and just changes its > action instead of extending > > >>> the prefix into 10.1.2.2/16 <http://10.1.2.2/16>. > > >>> > > >>> then suppose we have a 10.1.0.2 packet, since it misses the > megaflow, > > >>> this time, we will have an overlapping megaflow with the right > prefix: > > >>> 10.1.0.2/16 <http://10.1.0.2/16> > > >>> > > >>> now we have two megaflows: > > >>> 10.1.2.2/24 <http://10.1.2.2/24> > > >>> 10.1.0.2/16 <http://10.1.0.2/16> > > >>> > > >>> last, suppose we have changed the ruleset again. The revalidator > this > > >>> time still decides to change the actions of both megaflows > instead of > > >>> deleting them. > > >>> > > >>> The dpif_netdev_flow_put will search the megaflow to modify with > unmasked > > >>> keys, however it might lookup the wrong megaflow as the key > 10.1.2.2 > > >> matches > > >>> both 10.1.2.2/24 <http://10.1.2.2/24> and 10.1.0.2/16 < > http://10.1.0.2/16>! > > >>> > > >>> This patch changes the megaflow lookup code in modification path > into > > >>> relying the ufid to find the correct megaflow instead of key > lookup. > > >> > > >> Thanks for fixing Ilya’s comments! I’ve also copied in some of > the v3 > > >> discussion, so we can wrap it up here. > > >> > > >> Acked-by: Eelco Chaudron <echau...@redhat.com <mailto: > echau...@redhat.com>> > > >> > > >> //Eelco > > >> > > >>> Signed-off-by: Peng He <hepeng.0...@bytedance.com <mailto: > hepeng.0...@bytedance.com>> > > >>> --- > > >>> lib/dpif-netdev.c | 62 > ++++++++++++++++++++++++++--------------------- > > >>> tests/pmd.at <http://pmd.at> | 48 > ++++++++++++++++++++++++++++++++++++ > > >>> 2 files changed, 82 insertions(+), 28 deletions(-) > > >>> > > >>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > > >>> index 70b953ae6..b90ed1542 100644 > > >>> --- a/lib/dpif-netdev.c > > >>> +++ b/lib/dpif-netdev.c > > >>> @@ -4198,36 +4198,43 @@ flow_put_on_pmd(struct > dp_netdev_pmd_thread *pmd, > > >>> memset(stats, 0, sizeof *stats); > > >>> } > > >>> > > >>> - ovs_mutex_lock(&pmd->flow_mutex); > > >>> - netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL); > > >>> - if (!netdev_flow) { > > >>> - if (put->flags & DPIF_FP_CREATE) { > > >>> + if (put->flags & DPIF_FP_CREATE) { > > >> > > >> > > >> EC> Should this not be: > > >> EC> > > >> EC> if (put->flags & DPIF_FP_CREATE && !(put->flags & > DPIF_FP_MODIFY)) > > >> EC> > > >> EC> I know this will break the fix, but I’m wondering what the > possible > > >> EC> combinations are. > > >> EC> Quickly looking at the code the following ones seem to exist: > > >> EC> > > >> EC> DPIF_FP_CREATE -> Create a flow, if one exists return EEXIST > > >> EC> DPIF_FP_MODIFY -> Modify a flow, if it does not exist return > ENONT > > >> EC> DPIF_FP_CREATE | DPIF_FP_MODIFY -> If it exists modify it, > if it does > > >> EC> not exists create it. > > >> EC> > > >> EC> Could the last combination not potentially fail the lookup? > > >> EC> > > >> EC> Or are we assuming only standalone DPIF_FP_MODIFY’s are the > problem? In > > >> EC> theory, it could also be the combination. > > >> EC> > > >> > > >> PH> sorry, I was wrong. Such a combination does exist in > > >> PH>the dpif_probe_feature, and it probes the datapath by trying > to put > > >> flows: > > >> PH> > > >> PH> error = dpif_flow_put(dpif, DPIF_FP_CREATE | > DPIF_FP_MODIFY | > > >> PH>DPIF_FP_PROBE, > > >> PH> key->data, key->size, NULL, 0, > > >> PH> nl_actions, nl_actions_size, > > >> PH> ufid, NON_PMD_CORE_ID, NULL); > > >> > > >> PH> so we CANNOT change the code into: > > >> > > >> PH> if (put->flags & DPIF_FP_CREATE && !(put->flags & > DPIF_FP_MODIFY)) > > >> > > >> PH> as the issue the patch tries to fix only exist in MODIFY > alone path. > > >> PH> If CREATE bit is set, we need to go through the CREATE path > no matter > > >> if > > >> PH> MODIFY exist or not. > > >> > > >> Ok, if this is only used by the probe function we should be fine. > I did > > >> quickly search the code and it seems to be the way. However, if > it’s ever > > >> used by any part of ovs with both flags set, it might fail the > lookup and > > >> we run into the same problem. > > >> > > > > > > By "the same problem", you mean the problem listed in this patch > or this > > > fix might lead to a potential failure in other part of OVS? > > > > If you have a request with both DPIF_FP_MODIFY and DPIF_FP_CREATE > set, the lookup could still return the wrong entry. But if this is not used > it should be fine for now. > > > > > > So do we need to check such combination here, and just return -ENOSUPP? > > I'm not sure that's a good solution for a problem in the current version > of the fix. In general, this patch seems to not comply with the documented > behavior of the flow_put callback. The intended behavior is described in > the comment for a struct dpif_flow_put in lib/dpif.h. Both flags being > present is a valid case that should be handled. Empty set of flags can > also be a valid case of some sort. It will always fail, but it should > provide different error codes depending on the flow existence in the > datapath. > > Is there anything major that prevents us from using UFID always? > Userspace datapath seems to always have it and it will generate one if not > provided. The internal generation might be a problem, I guess, due to a > risk of it being non-unique. So, we probably should not use a locally > generated ufid for a lookup, only for dp_netdev_flow_add. > > So, the logic might be: > > if put->ufid: /* Actually provided UFID from upper layers. */ > Lookup by UFID > else: > Lookup by KEY > In general, I think this solution is better. One thing I am not clear is below: if use KEY, how can we guarantee that we have found the right megaflow? I think the only way to do this is to generate again the ufid using KEY, and then lookup the flow_table using this ufid. The upper layer can decide to use ufid or not, t hrough upcall/enable/disable-ufid, but the userspace datapath should always support ufid. This results in a logic that no matter what the upper layer provide, the datapath always use ufid to locate the megaflow. Why not use ufid at the first place? I do not know. > Not Found: > DPIF_FP_CREATE is SET: Add the flow > DPIF_FP_CREATE is not SET: ENOENT > Found: > DPIF_FP_MODIFY is SET: Modify the flow > DPIF_FP_MODIFY is not SET: EEXIST > This way, if flows are added manually it will be user's responsibility > to specify or not specify UFIDs, the dpctl/add-flow is a debug interface > after all, so should not be a problem. And the ovs-vswitchd itself will > always provide UFID and it checks that ukeys have different UFIDs, unless > manually disabled, again, for testing or debugging purposes. > > Such logic is not that different from what we have today and should cover > the issue and all the combinations of flags. > > Or am I missing something? > > > A few other comments regarding the implementation: > > - The change reduces the critical section of the flow mutex down to the > DPIF_FP_CREATE case. We can't actually do that. The main reason will > be a possible race condition on the ovsrcu_set() during the actions update > and a subsequent potential actions leak. RCU set operations require > external synchronization. There could be other race conditions as well, > because multiple threads can work on the same flow at the same time. > Yes, this is indeed a problem. Although I am not sure that is it possible that two revalidator threads operate a same ukey at the same time? But better control this in the lower layer to make sure. > > - The '/* Overlapping flow. */' comment is actually not correct. It's > not really the problem of this particular patch as it wasn't correct > before either, but we should probably just remove it as you're touching > this code anyways. > Ok, will delete it in the next version. > > Best regards, Ilya Maximets. > -- hepeng _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev