Hi guys,

I am afraid that commit is too long ago that would remember any details that 
caused us to change the code in beb75a40fdc2 ("userspace: Switching of L3 
packets in L2 pipeline"). What I vaguely remember was that I couldn’t 
comprehend the original code and it was not working correctly in some of the 
cases we needed/tested. But perhaps the changes we introduced also had corner 
cases we didn't consider. 

Question though: 

> > The datapath supports installing wider flows, and OVS relies on this
> > behavior. For example if ipv4(src=1.1.1.1/192.0.0.0,
> > dst=1.1.1.2/192.0.0.0) exists, a wider flow (smaller mask) of
> > ipv4(src=192.1.1.1/128.0.0.0,dst=192.1.1.2/128.0.0.0) is allowed to be
> > added.

That sounds strange to me. I always believed the datapath only supports 
non-overlapping flows, i.e. a packet can match at most one datapath flow. Only 
with that pre-requisite the dpcls classifier can work without priorities. Have 
I been wrong in this? What would be the semantics of adding a wider flow to the 
datapath? To my knowledge there is no guarantee that the dpcls subtables are 
visited in any specific order that would honor the mask width. And the first 
match will win.

Please clarify this. And in which sense OVS relies on this behavior?

BR, Jan

> -----Original Message-----
> From: Ilya Maximets <i.maxim...@ovn.org>
> Sent: Tuesday, 18 October 2022 21:40
> To: Eelco Chaudron <echau...@redhat.com>; d...@openvswitch.org
> Cc: i.maxim...@ovn.org; Jan Scheurich <jan.scheur...@ericsson.com>
> Subject: Re: [ovs-dev] [PATCH] dpif-netdev: Use unmasked key when adding
> datapath flows.
> 
> On 10/18/22 18:42, Eelco Chaudron wrote:
> > The datapath supports installing wider flows, and OVS relies on this
> > behavior. For example if ipv4(src=1.1.1.1/192.0.0.0,
> > dst=1.1.1.2/192.0.0.0) exists, a wider flow (smaller mask) of
> > ipv4(src=192.1.1.1/128.0.0.0,dst=192.1.1.2/128.0.0.0) is allowed to be
> > added.
> >
> > However, if we try to add a wildcard rule, the installation fails:
> >
> > # ovs-appctl dpctl/add-flow system@myDP "in_port(1),eth_type(0x0800), \
> >   ipv4(src=1.1.1.1/192.0.0.0,dst=1.1.1.2/192.0.0.0,frag=no)" 2 #
> > ovs-appctl dpctl/add-flow system@myDP "in_port(1),eth_type(0x0800), \
> >   ipv4(src=192.1.1.1/0.0.0.0,dst=49.1.1.2/0.0.0.0,frag=no)" 2
> > ovs-vswitchd: updating flow table (File exists)
> >
> > The reason is that the key used to determine if the flow is already
> > present in the system uses the original key ANDed with the mask.
> > This results in the IP address not being part of the (miniflow) key,
> > i.e., being substituted with an all-zero value. When doing the actual
> > lookup, this results in the key wrongfully matching the first flow,
> > and therefore the flow does not get installed. The solution is to use
> > the unmasked key for the existence check, the same way this is handled
> > in the userspace datapath.
> >
> > Signed-off-by: Eelco Chaudron <echau...@redhat.com>
> > ---
> >  lib/dpif-netdev.c    |   33 +++++++++++++++++++++++++++++----
> >  tests/dpif-netdev.at |   14 ++++++++++++++
> >  2 files changed, 43 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> > a45b46014..daa00aa2f 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -3321,6 +3321,28 @@ netdev_flow_key_init_masked(struct
> netdev_flow_key *dst,
> >                              (dst_u64 - miniflow_get_values(&dst->mf))
> > * 8);  }
> >
> > +/* Initializes 'dst' as a copy of 'flow'. */ static inline void
> > +netdev_flow_key_init(struct netdev_flow_key *key,
> > +                     const struct flow *flow) {
> > +    uint64_t *dst = miniflow_values(&key->mf);
> > +    uint32_t hash = 0;
> > +    uint64_t value;
> > +
> > +    miniflow_map_init(&key->mf, flow);
> > +    miniflow_init(&key->mf, flow);
> > +
> > +    size_t n = dst - miniflow_get_values(&key->mf);
> > +
> > +    FLOW_FOR_EACH_IN_MAPS (value, flow, key->mf.map) {
> > +        hash = hash_add64(hash, value);
> > +    }
> > +
> > +    key->hash = hash_finish(hash, n * 8);
> > +    key->len = netdev_flow_key_size(n); }
> > +
> >  static inline void
> >  emc_change_entry(struct emc_entry *ce, struct dp_netdev_flow *flow,
> >                   const struct netdev_flow_key *key) @@ -4195,7
> > +4217,7 @@ static int  dpif_netdev_flow_put(struct dpif *dpif, const
> > struct dpif_flow_put *put)  {
> >      struct dp_netdev *dp = get_dp_netdev(dpif);
> > -    struct netdev_flow_key key, mask;
> > +    struct netdev_flow_key key;
> >      struct dp_netdev_pmd_thread *pmd;
> >      struct match match;
> >      ovs_u128 ufid;
> > @@ -4244,9 +4266,12 @@ dpif_netdev_flow_put(struct dpif *dpif, const
> > struct dpif_flow_put *put)
> >
> >      /* Must produce a netdev_flow_key for lookup.
> >       * Use the same method as employed to create the key when adding
> > -     * the flow to the dplcs to make sure they match. */
> > -    netdev_flow_mask_init(&mask, &match);
> > -    netdev_flow_key_init_masked(&key, &match.flow, &mask);
> > +     * the flow to the dplcs to make sure they match.
> > +     * We need to put in the unmasked key as flow_put_on_pmd() will first
> try
> > +     * to see if an entry exists doing a packet type lookup. As masked-out
> > +     * fields are interpreted as zeros, they could falsely match a wider IP
> > +     * address mask. Installation of the flow will use the match variable. 
> > */
> > +    netdev_flow_key_init(&key, &match.flow);
> 
> 
> Hi, Eelco.  Thanks for the patch.
> It is indeed an issue that we can lookup the incorrect flow due to new mask
> being wider, but I'm wondering if that will cause any other issues for L3
> traffic as you seem to mainly revert part of this previous commit:
>   beb75a40fdc2 ("userspace: Switching of L3 packets in L2 pipeline") The
> commit message was tlking about something not being properly masked...
> 
> CC: Jan, maybe you remember some details about that?
> 
> Also, the "Use the same method" doesn't make a lot of sense with the
> change.
> 
> Best regards, Ilya Maximets.
> 
> >
> >      if (put->pmd_id == PMD_ID_NULL) {
> >          if (cmap_count(&dp->poll_threads) == 0) { diff --git
> > a/tests/dpif-netdev.at b/tests/dpif-netdev.at index
> > 3179e1645..32054c52e 100644
> > --- a/tests/dpif-netdev.at
> > +++ b/tests/dpif-netdev.at
> > @@ -636,6 +636,20 @@ OVS_VSWITCHD_STOP(["/flow: in_port is not an
> > exact match/d  /failed to put/d"])  AT_CLEANUP
> >
> > +AT_SETUP([dpif-netdev - check dpctl/add-flow wider ip match])
> > +OVS_VSWITCHD_START(
> > +  [add-port br0 p1 \
> > +   -- set interface p1 type=dummy
> options:pstream=punix:$OVS_RUNDIR/p0.sock \
> > +   -- set bridge br0 datapath-type=dummy])
> > +
> > +AT_CHECK([ovs-appctl revalidator/pause]) AT_CHECK([ovs-appctl
> > +dpctl/add-flow
> > +"in_port(1),eth_type(0x0800),ipv4(src=0.0.0.0/192.0.0.0,dst=0.0.0.0/1
> > +92.0.0.0,frag=no)" "3"]) AT_CHECK([ovs-appctl dpctl/add-flow
> > +"in_port(1),eth_type(0x0800),ipv4(src=192.1.1.1/0.0.0.0,dst=49.1.1.1/
> > +0.0.0.0,frag=no)" "3"]) AT_CHECK([ovs-appctl revalidator/resume])
> > +
> > +OVS_VSWITCHD_STOP
> > +AT_CLEANUP
> > +
> >  # SEND_UDP_PKTS([p_name], [p_ofport])  #  # Sends 128 packets to port
> > 'p_name' with different UDP destination ports.
> >
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af-
> 45444
> > 5555731-d4c0eb6f205e38a3&q=1&e=8e5a091e-ba71-48db-af86-
> f51651461be7&u=
> > https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev
> >
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to