Re: [ovs-dev] [ovs-dev v4] dpif-netdev: fix dpif_netdev_flow_put

2023-07-26 Thread Peng He
Eelco Chaudron  于2023年7月6日周四 15:52写道:

>
>
> On 6 Jul 2023, at 4:32, Peng He wrote:
>
> > Eelco Chaudron  于2023年7月5日周三 22:16写道:
> >
> >>
> >>
> >> On 1 Jul 2023, at 4:43, Peng He wrote:
> >>
> >>> OVS allows overlapping megaflows, as long as the actions of these
> >>> megaflows are equal. However, the current implementation of action
> >>> modification relies on flow_lookup instead of ufid, this could result
> >>> in looking up a wrong megaflow and make the ukeys and megaflows
> >> inconsistent
> >>>
> >>> Just like the test case in the patch, at first we have a rule with the
> >>> prefix:
> >>>
> >>> 10.1.2.0/24
> >>>
> >>> and we will get a megaflow with prefixes 10.1.2.2/24 when a packet
> with
> >> IP
> >>> 10.1.2.2 is received.
> >>>
> >>> Then suppose we change the rule into 10.1.0.0/16. OVS prefers to keep
> >> the
> >>> 10.1.2.2/24 megaflow and just changes its action instead of extending
> >>> the prefix into 10.1.2.2/16.
> >>>
> >>> then suppose we have a 10.1.0.2 packet, since it misses the megaflow,
> >>> this time, we will have an overlapping megaflow with the right prefix:
> >>> 10.1.0.2/16
> >>>
> >>> now we have two megaflows:
> >>> 10.1.2.2/24
> >>> 10.1.0.2/16
> >>>
> >>> last, suppose we have changed the ruleset again. The revalidator this
> >>> time still decides to change the actions of both megaflows instead of
> >>> deleting them.
> >>>
> >>> The dpif_netdev_flow_put will search the megaflow to modify with
> unmasked
> >>> keys, however it might lookup the wrong megaflow as the key 10.1.2.2
> >> matches
> >>> both 10.1.2.2/24 and 10.1.0.2/16!
> >>>
> >>> This patch changes the megaflow lookup code in modification path into
> >>> relying the ufid to find the correct megaflow instead of key lookup.
> >>
> >> Thanks for fixing Ilya’s comments! I’ve also copied in some of the v3
> >> discussion, so we can wrap it up here.
> >>
> >> Acked-by: Eelco Chaudron 
> >>
> >> //Eelco
> >>
> >>> Signed-off-by: Peng He 
> >>> ---
> >>>  lib/dpif-netdev.c | 62 ++-
> >>>  tests/pmd.at  | 48 
> >>>  2 files changed, 82 insertions(+), 28 deletions(-)
> >>>
> >>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> >>> index 70b953ae6..b90ed1542 100644
> >>> --- a/lib/dpif-netdev.c
> >>> +++ b/lib/dpif-netdev.c
> >>> @@ -4198,36 +4198,43 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread
> *pmd,
> >>>  memset(stats, 0, sizeof *stats);
> >>>  }
> >>>
> >>> -ovs_mutex_lock(>flow_mutex);
> >>> -netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, NULL);
> >>> -if (!netdev_flow) {
> >>> -if (put->flags & DPIF_FP_CREATE) {
> >>> +if (put->flags & DPIF_FP_CREATE) {
> >>
> >>
> >> EC> Should this not be:
> >> EC>
> >> EC> if (put->flags & DPIF_FP_CREATE && !(put->flags & DPIF_FP_MODIFY))
> >> EC>
> >> EC> I know this will break the fix, but I’m wondering what the possible
> >> EC> combinations are.
> >> EC> Quickly looking at the code the following ones seem to exist:
> >> EC>
> >> EC> DPIF_FP_CREATE  -> Create a flow, if one exists return EEXIST
> >> EC> DPIF_FP_MODIFY  -> Modify a flow, if it does not exist return ENONT
> >> EC> DPIF_FP_CREATE | DPIF_FP_MODIFY   -> If it exists modify it, if it
> does
> >> EC> not exists create it.
> >> EC>
> >> EC> Could the last combination not potentially fail the lookup?
> >> EC>
> >> EC> Or are we assuming only standalone DPIF_FP_MODIFY’s are the
> problem? In
> >> EC> theory, it could also be the combination.
> >> EC>
> >>
> >> PH> sorry, I was wrong. Such a combination does exist in
> >> PH>the dpif_probe_feature, and it probes the datapath by trying to put
> >> flows:
> >> PH>
> >> PH> error = dpif_flow_put(dpif, DPIF_FP_CREATE | DPIF_FP_MODIFY |
> >> PH>DPIF_FP_PROBE,
> >> PH>   key->data, key->size, NULL, 0,
> >> PH>   nl_actions, nl_actions_size,
> >> PH>   ufid, NON_PMD_CORE_ID, NULL);
> >>
> >> PH> so we CANNOT change the code into:
> >>
> >> PH> if (put->flags & DPIF_FP_CREATE && !(put->flags & DPIF_FP_MODIFY))
> >>
> >> PH> as the issue the patch tries to fix only exist in MODIFY alone path.
> >> PH> If CREATE bit is set, we need to go through the CREATE path no
> matter
> >> if
> >> PH> MODIFY exist or not.
> >>
> >> Ok, if this is only used by the probe function we should be fine. I did
> >> quickly search the code and it seems to be the way. However, if it’s
> ever
> >> used by any part of ovs with both flags set, it might fail the lookup
> and
> >> we run into the same problem.
> >>
> >
> > By "the same problem", you mean the problem listed in this patch or this
> > fix might lead to a potential failure in other part of OVS?
>
> If you have a request with both DPIF_FP_MODIFY and DPIF_FP_CREATE set, the
> lookup could still return the wrong entry. But if this is not used it
> should be fine for now.
>

So do we need to check such combination here, and just return 

Re: [ovs-dev] [ovs-dev v12] ofproto-dpif-upcall: fix push_dp_ops

2023-07-26 Thread Peng He
Simon Horman  于2023年7月27日周四 01:04写道:

> On Thu, Jul 06, 2023 at 04:59:36PM +0800, Peng He wrote:
>
> ...
>
> >  +dnl Replace OpenFlow rules, trigger revalidation and wait for it to
> > >>> complete.
> >  +AT_CHECK([echo 'table=0,in_port=p1,ip actions=ct(commit)' |
> ovs-ofctl
> > >>> --bundle replace-flows br0 -])
> >  +AT_CHECK([ovs-appctl revalidator/wait])
> >  +
> >  +dnl Inconsistent ukey should be deleted.
> >  +AT_CHECK([ovs-appctl upcall/show | grep keys | grep -q -v 0], [1])
> >  +
> >  +dnl Check the log for the flow modification error.
> >  +AT_CHECK([grep -q -E ".*failed to put.*$" ovs-vswitchd.log])
> >  +
> >  +dnl Remove warning logs to let test suite pass.
> >  +OVS_VSWITCHD_STOP(["dnl
> >  +/.*failed to put.*$/d
> >  +/.*failed to flow_del.*$/d"])
> > >>>
> > >>> You missed the indentation suggested by Ilya:
> > >>>
> > >>>
> > >> Which kind of the email client are you using...
> > >> The indentation here is really easy to miss
> > >
> > > From the User-Agent data, he seems to be using Thunderbird/102.10.0
> but the emails are plain text, so the problem is probably with your email
> client converting them to “html” like format. If you can configure a fixed
> width font for text emails you might spot these thinks easier.
> >
> >
> > I change to use another email client now, thanks !
>
> Hi Peng He, all,
>
> Am I correct in assuming that there will be a v13?
>

Hi, if need, I can submit a v13, the current version is just missing some
indentation.

"The changes look good, with one little nit (see below), but those can be
applied during commit."
I thought it would be changed by the maintainer during merging...




-- 
hepeng
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovs-tcpdump: Bugfix-of-ovs-tcpdump

2023-07-26 Thread Simon Jones
Hi Aaron,

Thanks for your suggestion. This is my opinion as below.


Simon Jones


Aaron Conole  于2023年7月20日周四 23:41写道:

> Hi Simon,
>
> Thanks for the contribution!
>
> Simon Jones  writes:
>
> > From: simon 
> >
> > Fix bug of ovs-tcpdump, which will cause megaflow action wrong.
> >
> > As use ovs-tcpdump will add mipxxx NIC, and this NIC has IPv6 address by
> > default.
>
> This is true only on systems where the ipv6 autoconf sysctl is set to
> true, I think.  I think it is probably worth including that detail.
>
> > For vxlan topology, mipxxx will be treated as tunnel port, and will got
> > error actions.
> >
> > For detail discuss, refer:
> >
> https://github.com/batmancn/github-work/blob/main/ovs-bugfix-ovs-tcpdump.md
>
> Why not include some of these details in the commit message?  At least,
> we can say that ipv6 packets going down the mirror port will cause
> erroneous behavior in most scenarios, and we should just stop ipv6
> address there.
>

Ok, how about change like this:

Fix bug of ovs-tcpdump, which will cause megaflow action wrong.

Details:
As use ovs-tcpdump will add mipxxx NIC, and this NIC has IPv6 address by
default.
For vxlan topology, mipxxx, which has IPv6 address, will be treated as
tunnel port, and will got error actions.
So we should just stop ipv6 address there.
For more detail discuss, refer:
https://github.com/batmancn/github-work/blob/main/ovs-bugfix-ovs-tcpdump.md

What patch do:
So we need to clear all IP address on mipxxx NIC.
As there is no IPv4 address, but has IPv6 by default, which is tested on
ubuntu and centos.
So my patch is to clear IPv6 address on mipxxx NIC.


>
> Actually, I wonder if there should be any other steps we might need to
> take incase zeroconf is setup for ipv4 on the system, but hopefully that
> is managed separately.
>

I have tested on ubuntu20.04 and centos stream 8, there is no IPv4 address
by default.
I think for now this patch works.


>
> > Signed-off-by: simon 
> > ---
>
> Acked-by: Aaron Conole 
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovs-tcpdump: Bugfix-of-ovs-tcpdump

2023-07-26 Thread Simon Jones
ok ~
Many thanks for suggestion, I will fix these ~


Simon Jones


Simon Horman  于2023年7月27日周四 01:09写道:

> On Thu, Jul 20, 2023 at 11:38:41AM -0400, Aaron Conole wrote:
> > Hi Simon,
> >
> > Thanks for the contribution!
> >
> > Simon Jones  writes:
> >
> > > From: simon 
> > >
> > > Fix bug of ovs-tcpdump, which will cause megaflow action wrong.
> > >
> > > As use ovs-tcpdump will add mipxxx NIC, and this NIC has IPv6 address
> by
> > > default.
> >
> > This is true only on systems where the ipv6 autoconf sysctl is set to
> > true, I think.  I think it is probably worth including that detail.
> >
> > > For vxlan topology, mipxxx will be treated as tunnel port, and will got
> > > error actions.
> > >
> > > For detail discuss, refer:
> > >
> https://github.com/batmancn/github-work/blob/main/ovs-bugfix-ovs-tcpdump.md
> >
> > Why not include some of these details in the commit message?  At least,
> > we can say that ipv6 packets going down the mirror port will cause
> > erroneous behavior in most scenarios, and we should just stop ipv6
> > address there.
> >
> > Actually, I wonder if there should be any other steps we might need to
> > take incase zeroconf is setup for ipv4 on the system, but hopefully that
> > is managed separately.
>
> Hi Simon,
>
> could I ask you to prepare a v3 taking into account Arron's suggestion.
>
> > > Signed-off-by: simon 
>
> And updating the above to include your first and last name,
> as appears in the from address of your emails.
>
> > > ---
> >
> > Acked-by: Aaron Conole 
>
> You can include this in v3, immediately after your Signed-off-by line.
>
> Thanks!
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC ovn-heater] OVN scale testing with OpenStack workloads - questionnaire

2023-07-26 Thread Robin Jarry
Hi all,

as discussed yesterday during the community meeting and today on IRC,
I have created a public form that we can use to aggregate the results.

https://forms.gle/bmZARvrxgfrJvzkn6

No account is required to fill in the form and the results are
anonymous.

Please hold off before providing any responses. It is planned to do
a final review of all questions during the OVN IRC meeting tomorrow
(Thursday, July 26 2023, 17:15 UTC) before making it official.

https://www.ovn.org/en/#irc-meetings

All results are publicly available in various formats:

Web page:
https://docs.google.com/spreadsheets/d/e/2PACX-1vSR9TrcGLg1WO327Uy1ihAeJaACfnJSgBl_JcTez0mjYdFZpgF9NAUPe6fod0nyI5_fatJ7_GSre_I3/pubhtml?gid=1526075983=true
CSV:
https://docs.google.com/spreadsheets/d/e/2PACX-1vSR9TrcGLg1WO327Uy1ihAeJaACfnJSgBl_JcTez0mjYdFZpgF9NAUPe6fod0nyI5_fatJ7_GSre_I3/pub?gid=1526075983=true=csv
TSV:
https://docs.google.com/spreadsheets/d/e/2PACX-1vSR9TrcGLg1WO327Uy1ihAeJaACfnJSgBl_JcTez0mjYdFZpgF9NAUPe6fod0nyI5_fatJ7_GSre_I3/pub?gid=1526075983=true=tsv

Let me know if you have any remarks, corrections or questions regarding
this form. It was based on the document created by Dumitru:

https://docs.google.com/document/d/151saO5a5PmCt7cIZQ7DkgvU755od76BlN2bKjXc1n08/edit?usp=sharing

Cheers,

-- 
Robin

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread Numan Siddique
On Wed, Jul 26, 2023 at 6:21 PM Dumitru Ceara  wrote:
>
> On 7/26/23 14:42, Ales Musil wrote:
> > There are two scenarios that could cause unwanted
> > CT zone flush:
> >
> > 1) The SB DB is destroyed and recreated. The new
> > database will end up with different UUIDs for
> > various components.
> >
> > 2) Upgrade of existing SB DB to ovn-ic.
> > The components are the same as before, but scattered
> > between multiple SB DBs. This again leads to different
> > UUIDs in SB DB.
> >
> > The CT zone name was based on datapath UUID which
> > causes flush when the UUID changes. Even if
> > the datapath is the same.
> >
> > To prevent the unwanted flush migrate from UUID
> > to component name (LR/LS name). This allows
> > the CT zones to be stable across the before mentioned
> > scenarios.
> >
> > For the migration to be "flush less" itself we need
> > to make sure to start the restoration process only
> > after controller is connected to the SB DB e.g. the
> > restoration can happen only during engine run and not
> > init as it was done previously.
> >
> > Reported-at: https://bugzilla.redhat.com/2224199
> > Tested-by: Surya Seetharaman 
> > Signed-off-by: Ales Musil 
> > ---
> > v2: Address comments from Dumitru.
> > Add TODO for the external_ids tracking.
> > Simplify the id retrieval in physical.c
> > ---
> >  controller/ovn-controller.c | 106 
> >  controller/physical.c   |  17 ++--
> >  lib/ovn-util.c  |   4 +-
> >  lib/ovn-util.h  |   2 +-
> >  tests/ovn-controller.at |  16 ++--
> >  tests/ovn.at| 156 
> >  6 files changed, 267 insertions(+), 34 deletions(-)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 236974f4f..d34dba75c 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -744,8 +744,17 @@ update_ct_zones(const struct sset *local_lports,
> >  HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
> >  /* XXX Add method to limit zone assignment to logical router
> >   * datapaths with NAT */
> > -char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, 
> > "dnat");
> > -char *snat = alloc_nat_zone_key(>datapath->header_.uuid, 
> > "snat");
> > +const char *name = smap_get(>datapath->external_ids, "name");
> > +if (!name) {
> > +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_ERR_RL(, "Missing name for datapath '"UUID_FMT"' "
> > +"skipping zone assignment.",
> > +UUID_ARGS(>datapath->header_.uuid));
> > +continue;
> > +}
> > +
> > +char *dnat = alloc_nat_zone_key(name, "dnat");
> > +char *snat = alloc_nat_zone_key(name, "snat");
> >  sset_add(_users, dnat);
> >  sset_add(_users, snat);
> >
> > @@ -882,9 +891,66 @@ struct ed_type_ct_zones {
> >  bool recomputed;
> >  };
> >
> > +/* Replaces a UUID prefix from 'uuid_zone' (if any) with the
> > + * corresponding Datapath_Binding.external_ids.name.  Returns it
> > + * as a new string that that will be owned by the caller. */
>
> Typo: "that that".  This is my fault.. I had suggested it, sorry.  But
> maybe it can be fixed up at apply time.
>
> > +static char *
> > +ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table *dp_table,
> > +   const char *uuid_zone)
> > +{
> > +struct uuid uuid;
> > +if (!uuid_from_string_prefix(, uuid_zone)) {
> > +return NULL;
> > +}
> > +
> > +const struct sbrec_datapath_binding *dp =
> > +sbrec_datapath_binding_table_get_for_uuid(dp_table, );
> > +if (!dp) {
> > +return NULL;
> > +}
> > +
> > +const char *entity_name = smap_get(>external_ids, "name");
> > +if (!entity_name) {
> > +return NULL;
> > +}
> > +
> > +return xasprintf("%s%s", entity_name, uuid_zone + UUID_LEN);
> > +}
> > +
> > +static void
> > +ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table,
> > +struct ed_type_ct_zones *ct_zones_data, const char *name,
> > +int zone)
> > +{
> > +VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, name);
> > +
> > +char *new_name = ct_zone_name_from_uuid(dp_table, name);
> > +const char *current_name = name;
> > +if (new_name) {
> > +VLOG_DBG("ct zone %"PRId32" replace uuid name '%s' with '%s'",
> > +  zone, name, new_name);
> > +
> > +/* Make sure we remove the uuid one in the next OvS DB commit 
> > without
> > + * flush. */
> > +add_pending_ct_zone_entry(_zones_data->pending, 
> > CT_ZONE_DB_QUEUED,
> > +  zone, false, name);
> > +/* Store the zone in OvS DB with name instead of uuid without 
> > flush.
> > + * */
> > +

Re: [ovs-dev] [PATCH ovn 4/5] controller: Add MAC cache I-P node

2023-07-26 Thread Mark Michelson

On 7/25/23 03:59, Ales Musil wrote:



On Thu, Jul 20, 2023 at 10:58 PM Mark Michelson > wrote:


Hi Ales,

I have a couple of comments in-line.


Hi Mark,

thank you for the review.


On 7/10/23 07:05, Ales Musil wrote:
 > The node currently keeps track of MAC bindings
 > that have aging enabled. The purpose of this node
 > is to have structures that can be used by the
 > thread to update the timestamp when the MAC
 > binding is used. The I-P is generic enough
 > to be used by FDB timestamp refresh in very similar
 > way to the MAC binding.
 >
 > This is a preparation for the MAC binding refresh mechanism.
 >
 > Signed-off-by: Ales Musil mailto:amu...@redhat.com>>
 > ---
 >   controller/automake.mk       |   4 +-
 >   controller/mac_cache.c      | 254

 >   controller/mac_cache.h      |  74 +++
 >   controller/ovn-controller.c | 220 +++
 >   4 files changed, 551 insertions(+), 1 deletion(-)
 >   create mode 100644 controller/mac_cache.c
 >   create mode 100644 controller/mac_cache.h
 >
 > diff --git a/controller/automake.mk 
b/controller/automake.mk 
 > index 334672b4d..562290359 100644
 > --- a/controller/automake.mk 
 > +++ b/controller/automake.mk 
 > @@ -43,7 +43,9 @@ controller_ovn_controller_SOURCES = \
 >       controller/vif-plug.h \
 >       controller/vif-plug.c \
 >       controller/mirror.h \
 > -     controller/mirror.c
 > +     controller/mirror.c \
 > +     controller/mac_cache.h \
 > +     controller/mac_cache.c
 >
 >   controller_ovn_controller_LDADD = lib/libovn.la
 $(OVS_LIBDIR)/libopenvswitch.la

 >   man_MANS += controller/ovn-controller.8
 > diff --git a/controller/mac_cache.c b/controller/mac_cache.c
 > new file mode 100644
 > index 0..4663499a1
 > --- /dev/null
 > +++ b/controller/mac_cache.c
 > @@ -0,0 +1,254 @@
 > +/* Copyright (c) 2023, Red Hat, Inc.
 > + *
 > + * Licensed under the Apache License, Version 2.0 (the "License");
 > + * you may not use this file except in compliance with the License.
 > + * You may obtain a copy of the License at:
 > + *
 > + * http://www.apache.org/licenses/LICENSE-2.0

 > + *
 > + * Unless required by applicable law or agreed to in writing,
software
 > + * distributed under the License is distributed on an "AS IS" BASIS,
 > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
or implied.
 > + * See the License for the specific language governing
permissions and
 > + * limitations under the License.
 > + */
 > +
 > +#include 
 > +#include 
 > +
 > +#include "lport.h"
 > +#include "mac_cache.h"
 > +#include "openvswitch/hmap.h"
 > +#include "openvswitch/vlog.h"
 > +#include "ovn/logical-fields.h"
 > +#include "ovn-sb-idl.h"
 > +
 > +VLOG_DEFINE_THIS_MODULE(mac_cache);
 > +
 > +static uint32_t
 > +mac_cache_mb_data_hash(const struct mac_cache_mb_data *mb_data);
 > +static inline bool
 > +mac_cache_mb_data_equals(const struct mac_cache_mb_data *a,
 > +                          const struct mac_cache_mb_data *b);
 > +static struct mac_cache_mac_binding *
 > +mac_cache_mac_binding_find_by_mb_data(struct mac_cache_data *data,
 > +                                      const struct
mac_cache_mb_data *mb_data);
 > +static bool
 > +mac_cache_mb_data_from_sbrec(struct mac_cache_mb_data *data,
 > +                              const struct sbrec_mac_binding *mb,
 > +                              struct ovsdb_idl_index
*sbrec_pb_by_name);
 > +static struct mac_cache_threshold *
 > +mac_cache_threshold_find(struct hmap *thresholds, const struct
uuid *uuid);
 > +static void
 > +mac_cache_threshold_remove(struct hmap *thresholds,
 > +                           struct mac_cache_threshold *threshold);
 > +
 > +bool
 > +mac_cache_threshold_add(struct mac_cache_data *data,
 > +                        const struct sbrec_datapath_binding *dp)
 > +{
 > +    struct mac_cache_threshold *threshold =
 > +            mac_cache_threshold_find(>mb_thresholds,
>header_.uuid);
 > +    if (threshold) {
 > +        return true;
 > +    }
 > +
 > +    uint64_t mb_threshold = smap_get_uint(>external_ids,
 > + 
"mac_binding_age_threshold", 0);

 > +    if (!mb_threshold) {
 > +        return false;
 > +    }
 > +
 > +    threshold = xmalloc(sizeof *threshold);
 > +    

Re: [ovs-dev] [PATCH v2] ovs-tcpdump: Bugfix-of-ovs-tcpdump

2023-07-26 Thread Simon Horman
On Thu, Jul 20, 2023 at 11:38:41AM -0400, Aaron Conole wrote:
> Hi Simon,
> 
> Thanks for the contribution!
> 
> Simon Jones  writes:
> 
> > From: simon 
> >
> > Fix bug of ovs-tcpdump, which will cause megaflow action wrong.
> >
> > As use ovs-tcpdump will add mipxxx NIC, and this NIC has IPv6 address by
> > default.
> 
> This is true only on systems where the ipv6 autoconf sysctl is set to
> true, I think.  I think it is probably worth including that detail.
> 
> > For vxlan topology, mipxxx will be treated as tunnel port, and will got
> > error actions.
> >
> > For detail discuss, refer:
> > https://github.com/batmancn/github-work/blob/main/ovs-bugfix-ovs-tcpdump.md
> 
> Why not include some of these details in the commit message?  At least,
> we can say that ipv6 packets going down the mirror port will cause
> erroneous behavior in most scenarios, and we should just stop ipv6
> address there.
> 
> Actually, I wonder if there should be any other steps we might need to
> take incase zeroconf is setup for ipv4 on the system, but hopefully that
> is managed separately.

Hi Simon,

could I ask you to prepare a v3 taking into account Arron's suggestion.

> > Signed-off-by: simon 

And updating the above to include your first and last name,
as appears in the from address of your emails.

> > ---
> 
> Acked-by: Aaron Conole 

You can include this in v3, immediately after your Signed-off-by line.

Thanks!
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovs-dev v12] ofproto-dpif-upcall: fix push_dp_ops

2023-07-26 Thread Simon Horman
On Thu, Jul 06, 2023 at 04:59:36PM +0800, Peng He wrote:

...

>  +dnl Replace OpenFlow rules, trigger revalidation and wait for it to
> >>> complete.
>  +AT_CHECK([echo 'table=0,in_port=p1,ip actions=ct(commit)' | ovs-ofctl
> >>> --bundle replace-flows br0 -])
>  +AT_CHECK([ovs-appctl revalidator/wait])
>  +
>  +dnl Inconsistent ukey should be deleted.
>  +AT_CHECK([ovs-appctl upcall/show | grep keys | grep -q -v 0], [1])
>  +
>  +dnl Check the log for the flow modification error.
>  +AT_CHECK([grep -q -E ".*failed to put.*$" ovs-vswitchd.log])
>  +
>  +dnl Remove warning logs to let test suite pass.
>  +OVS_VSWITCHD_STOP(["dnl
>  +/.*failed to put.*$/d
>  +/.*failed to flow_del.*$/d"])
> >>> 
> >>> You missed the indentation suggested by Ilya:
> >>> 
> >>> 
> >> Which kind of the email client are you using...
> >> The indentation here is really easy to miss
> > 
> > From the User-Agent data, he seems to be using Thunderbird/102.10.0 but the 
> > emails are plain text, so the problem is probably with your email client 
> > converting them to “html” like format. If you can configure a fixed width 
> > font for text emails you might spot these thinks easier.
> 
> 
> I change to use another email client now, thanks !

Hi Peng He, all,

Am I correct in assuming that there will be a v13?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] binding: handle ovs ofport update

2023-07-26 Thread Mohammad Heib
Currently when ovs interface ofport is updated
after setting external_ids:iface_id, the ovn-controller
will see this change but will not do much if it handles
this change incrementally.

This behavior leads to a mismatch between the ovs openflow
flows in table=0 (inaccurate in_port) and the ofport number
that the packet was received at which will lead to packets
drop in table=0.

This patch will resolve the above issue by triggering
flows recompute during the I-P processing only if the
affected port are associated with lport and has flows
that need to be updated.

Reported-at: https://issues.redhat.com/browse/FD-3063
Signed-off-by: Mohammad Heib 
---
 controller/binding.c| 35 +++
 tests/ovn-controller.at | 39 +++
 2 files changed, 74 insertions(+)

diff --git a/controller/binding.c b/controller/binding.c
index 9aa3fc6c4..d28548d93 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -1514,6 +1514,35 @@ release_lport(const struct sbrec_port_binding *pb,
 return true;
 }
 
+/*
+ * This function will update the tracked_dp_bindings
+ * whenever an ofport on a specific ovs port.
+ * This update will trigger flow recomputation during
+ * the incremental processing run which updates the local
+ * flows in_port filed.
+ *
+ * This function will trigger flow recomputation only for
+ * affected local_bindings that have port_binding associated
+ * with it, otherwise, no flows are installed and we don't
+ * have to update any in_port field.
+ */
+static void
+handle_ovs_ofport_update(const char *iface_id,
+ struct binding_ctx_out *b_ctx_out)
+{
+struct shash *local_bindings = _ctx_out->lbinding_data->bindings;
+struct local_binding *lbinding = local_binding_find(local_bindings,
+iface_id);
+struct binding_lport *b_lport =
+local_binding_get_primary_or_localport_lport(lbinding);
+
+if (b_lport) {
+tracked_datapath_lport_add(b_lport->pb, TRACKED_RESOURCE_UPDATED,
+   b_ctx_out->tracked_dp_bindings);
+b_ctx_out->local_lports_changed = true;
+}
+}
+
 static bool
 is_lbinding_set(struct local_binding *lbinding)
 {
@@ -2681,6 +2710,12 @@ binding_handle_ovs_interface_changes(struct 
binding_ctx_in *b_ctx_in,
 if (!handled) {
 break;
 }
+
+if (!b_ctx_out->local_lports_changed
+ && ovsrec_interface_is_updated(iface_rec,
+OVSREC_INTERFACE_COL_OFPORT)) {
+handle_ovs_ofport_update(iface_id, b_ctx_out);
+}
 }
 }
 
diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index 28c13234c..1461cade6 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -2608,3 +2608,42 @@ AT_CHECK([ovn-sbctl get chassis $chassis_id 
other_config:unsupported], [1], [ign
 OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn-controller - ovs iface change ofport])
+AT_KEYWORDS([ovn])
+ovn_start
+
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-int hv1-vif1 -- \
+set interface hv1-vif1 external-ids:iface-id=sw0-p1 \
+options:tx_pcap=hv1/vif1-tx.pcap \
+options:rxq_pcap=hv1/vif1-rx.pcap \
+ofport-request=1
+
+
+ovn-nbctl ls-add sw0
+
+check ovn-nbctl lsp-add sw0 sw0-p1
+check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3 1000::3"
+
+wait_for_ports_up
+ovn-nbctl --wait=hv sync
+
+OVS_WAIT_UNTIL([test 1 = `as hv1 ovs-ofctl dump-flows br-int table=0 | grep 
priority=100 | grep in_port=1 | grep "resubmit(,8)" | grep -c n_packets=0`])
+
+# update the ovs interface ofport from 1 to 24
+check as hv1 ovs-vsctl set Interface hv1-vif1 ofport-request=24
+OVS_WAIT_UNTIL([test x`as hv1 ovs-vsctl get Interface hv1-vif1 ofport` = x24])
+
+OVS_WAIT_UNTIL([test 1 = `as hv1 ovs-ofctl dump-flows br-int table=0 | grep 
priority=100 | grep in_port=24 | grep "resubmit(,8)" | grep -c n_packets=0`])
+OVS_WAIT_UNTIL([test 0 = `as hv1 ovs-ofctl dump-flows br-int table=0 | grep 
priority=100 | grep in_port=1 | grep "resubmit(,8)" | grep -c n_packets=0`])
+
+OVN_CLEANUP([hv1])
+AT_CLEANUP
+])
-- 
2.34.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] QoS: Properly set qos when ovs db is read only

2023-07-26 Thread Xavier Simonart
QoS was not configured in OVS db when db was read only: the configuration
was just ignored and not done later when OVS db became writable.
It was sometimes set later, if/when a recompute happened.
This is now fixed: when OVS db is read only, the ports on which qos
must be applied and stored and qos will be applied when OVS db becomes writable.
To avoid race conditions between delayed qos and new qos changes (e.g. a qos
configuration delayed in one loop as ovs is ro, followed in next loop, when ovs
becomes rw, by another qos on the same port), all qos changes are done at the
same time.

This issue was identified by some random failures in system test
"egress qos".

Signed-off-by: Xavier Simonart 
---
 controller/binding.c| 131 
 controller/binding.h|   8 +++
 controller/ovn-controller.c |   7 ++
 tests/ovn.at|   1 -
 tests/system-ovn.at |  22 ++
 5 files changed, 124 insertions(+), 45 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 9aa3fc6c4..0e13624c1 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -55,8 +55,13 @@ struct claimed_port {
 long long int last_claimed;
 };
 
+struct qos_port {
+bool added;
+};
+
 static struct shash _claimed_ports = SHASH_INITIALIZER(&_claimed_ports);
 static struct sset _postponed_ports = SSET_INITIALIZER(&_postponed_ports);
+static struct shash _qos_ports = SHASH_INITIALIZER(&_qos_ports);
 
 static void
 remove_additional_chassis(const struct sbrec_port_binding *pb,
@@ -218,6 +223,17 @@ get_qos_egress_port_interface(struct shash 
*bridge_mappings,
 return NULL;
 }
 
+static void
+add_or_del_qos_port(const char *ovn_port, bool add)
+{
+struct qos_port *qos_port = shash_find_data(&_qos_ports, ovn_port);
+if (!qos_port) {
+qos_port = xzalloc(sizeof *qos_port);
+shash_add(&_qos_ports, ovn_port, qos_port);
+}
+qos_port->added = add;
+}
+
 /* 34359738360 == (2^32 - 1) * 8.  netdev_set_qos() doesn't support
  * 64-bit rate netlink attributes, so the maximum value is 2^32 - 1
  * bytes. The 'max-rate' config option is in bits, so multiplying by 8.
@@ -225,7 +241,7 @@ get_qos_egress_port_interface(struct shash *bridge_mappings,
  * can be unrecognized for certain NICs or reported too low for virtual
  * interfaces. */
 #define OVN_QOS_MAX_RATE34359738360ULL
-static void
+static bool
 add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 const struct ovsrec_port *port,
 unsigned long long min_rate,
@@ -239,7 +255,7 @@ add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 const struct ovsrec_qos *qos = port->qos;
 if (qos && !smap_get_bool(>external_ids, "ovn_qos", false)) {
 /* External configured QoS, do not overwrite it. */
-return;
+return false;
 }
 
 if (!qos) {
@@ -282,22 +298,18 @@ add_ovs_qos_table_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 ovsrec_queue_verify_external_ids(queue);
 ovsrec_queue_set_external_ids(queue, _ids);
 smap_destroy(_ids);
+return true;
 }
 
 static void
-remove_stale_qos_entry(struct ovsdb_idl_txn *ovs_idl_txn,
-   const struct sbrec_port_binding *pb,
+remove_stale_qos_entry( const char *logical_port,
struct ovsdb_idl_index *ovsrec_port_by_qos,
const struct ovsrec_qos_table *qos_table,
struct hmap *queue_map)
 {
-if (!ovs_idl_txn) {
-return;
-}
-
 struct qos_queue *q = find_qos_queue(
-queue_map, hash_string(pb->logical_port, 0),
-pb->logical_port);
+queue_map, hash_string(logical_port, 0),
+logical_port);
 if (!q) {
 return;
 }
@@ -338,8 +350,12 @@ remove_stale_qos_entry(struct ovsdb_idl_txn *ovs_idl_txn,
 
 static void
 configure_qos(const struct sbrec_port_binding *pb,
-  struct binding_ctx_in *b_ctx_in,
-  struct binding_ctx_out *b_ctx_out)
+  struct ovsdb_idl_txn *ovs_idl_txn,
+  struct ovsdb_idl_index *ovsrec_port_by_qos,
+  const struct ovsrec_qos_table *qos_table,
+  struct hmap *qos_map,
+  const struct ovsrec_open_vswitch_table *ovs_table,
+  const struct ovsrec_bridge_table *bridge_table)
 {
 unsigned long long min_rate = smap_get_ullong(
 >options, "qos_min_rate", 0);
@@ -351,20 +367,20 @@ configure_qos(const struct sbrec_port_binding *pb,
 
 if ((!min_rate && !max_rate && !burst) || !queue_id) {
 /* Qos is not configured for this port. */
-remove_stale_qos_entry(b_ctx_in->ovs_idl_txn, pb,
-   b_ctx_in->ovsrec_port_by_qos,
-   b_ctx_in->qos_table, b_ctx_out->qos_map);
+remove_stale_qos_entry(pb->logical_port,
+   ovsrec_port_by_qos,
+   

Re: [ovs-dev] [PATCH ovn v2] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread 0-day Robot
Bleep bloop.  Greetings Ales Musil, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Comment with 'xxx' marker
#165 FILE: controller/ovn-controller.c:1154:
/* XXX: There is a potential bug in CT zone I-P node,

Lines checked: 522, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread Dumitru Ceara
On 7/26/23 14:42, Ales Musil wrote:
> There are two scenarios that could cause unwanted
> CT zone flush:
> 
> 1) The SB DB is destroyed and recreated. The new
> database will end up with different UUIDs for
> various components.
> 
> 2) Upgrade of existing SB DB to ovn-ic.
> The components are the same as before, but scattered
> between multiple SB DBs. This again leads to different
> UUIDs in SB DB.
> 
> The CT zone name was based on datapath UUID which
> causes flush when the UUID changes. Even if
> the datapath is the same.
> 
> To prevent the unwanted flush migrate from UUID
> to component name (LR/LS name). This allows
> the CT zones to be stable across the before mentioned
> scenarios.
> 
> For the migration to be "flush less" itself we need
> to make sure to start the restoration process only
> after controller is connected to the SB DB e.g. the
> restoration can happen only during engine run and not
> init as it was done previously.
> 
> Reported-at: https://bugzilla.redhat.com/2224199
> Tested-by: Surya Seetharaman 
> Signed-off-by: Ales Musil 
> ---
> v2: Address comments from Dumitru.
> Add TODO for the external_ids tracking.
> Simplify the id retrieval in physical.c
> ---
>  controller/ovn-controller.c | 106 
>  controller/physical.c   |  17 ++--
>  lib/ovn-util.c  |   4 +-
>  lib/ovn-util.h  |   2 +-
>  tests/ovn-controller.at |  16 ++--
>  tests/ovn.at| 156 
>  6 files changed, 267 insertions(+), 34 deletions(-)
> 
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 236974f4f..d34dba75c 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -744,8 +744,17 @@ update_ct_zones(const struct sset *local_lports,
>  HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
>  /* XXX Add method to limit zone assignment to logical router
>   * datapaths with NAT */
> -char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
> -char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat");
> +const char *name = smap_get(>datapath->external_ids, "name");
> +if (!name) {
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> +VLOG_ERR_RL(, "Missing name for datapath '"UUID_FMT"' "
> +"skipping zone assignment.",
> +UUID_ARGS(>datapath->header_.uuid));
> +continue;
> +}
> +
> +char *dnat = alloc_nat_zone_key(name, "dnat");
> +char *snat = alloc_nat_zone_key(name, "snat");
>  sset_add(_users, dnat);
>  sset_add(_users, snat);
>  
> @@ -882,9 +891,66 @@ struct ed_type_ct_zones {
>  bool recomputed;
>  };
>  
> +/* Replaces a UUID prefix from 'uuid_zone' (if any) with the
> + * corresponding Datapath_Binding.external_ids.name.  Returns it
> + * as a new string that that will be owned by the caller. */

Typo: "that that".  This is my fault.. I had suggested it, sorry.  But
maybe it can be fixed up at apply time.

> +static char *
> +ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table *dp_table,
> +   const char *uuid_zone)
> +{
> +struct uuid uuid;
> +if (!uuid_from_string_prefix(, uuid_zone)) {
> +return NULL;
> +}
> +
> +const struct sbrec_datapath_binding *dp =
> +sbrec_datapath_binding_table_get_for_uuid(dp_table, );
> +if (!dp) {
> +return NULL;
> +}
> +
> +const char *entity_name = smap_get(>external_ids, "name");
> +if (!entity_name) {
> +return NULL;
> +}
> +
> +return xasprintf("%s%s", entity_name, uuid_zone + UUID_LEN);
> +}
> +
> +static void
> +ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table,
> +struct ed_type_ct_zones *ct_zones_data, const char *name,
> +int zone)
> +{
> +VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, name);
> +
> +char *new_name = ct_zone_name_from_uuid(dp_table, name);
> +const char *current_name = name;
> +if (new_name) {
> +VLOG_DBG("ct zone %"PRId32" replace uuid name '%s' with '%s'",
> +  zone, name, new_name);
> +
> +/* Make sure we remove the uuid one in the next OvS DB commit without
> + * flush. */
> +add_pending_ct_zone_entry(_zones_data->pending, CT_ZONE_DB_QUEUED,
> +  zone, false, name);
> +/* Store the zone in OvS DB with name instead of uuid without flush.
> + * */
> +add_pending_ct_zone_entry(_zones_data->pending, CT_ZONE_DB_QUEUED,
> +  zone, true, new_name);
> +current_name = new_name;
> +}
> +
> +simap_put(_zones_data->current, current_name, zone);
> +bitmap_set1(ct_zones_data->bitmap, zone);
> +
> +free(new_name);
> +}
> +
>  static void
>  

Re: [ovs-dev] [PATCH ovn] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread Ales Musil
On Wed, Jul 26, 2023 at 1:39 PM Dumitru Ceara  wrote:

> On 7/26/23 09:44, Surya Seetharaman wrote:
> > Thanks Ales for doing this critical fix!
> >
>
> +1 Thanks, Ales and Surya (ant others), for catching, debugging and
> fixing this!
>

Hi Dumitru,

thank you for the review.


>
> > During upgrades when moving clusters from non-interconnect model to
> > the interconnect model we need to rebuild the NB/SB DB's from scratch
> > and that was leading the change in UUIDs of the logical elements which
> > was causing a ct zone flush (OVNK uses zone0 which is a well-known zone
> > on the host that made matters worse since OVN started flushing zone 0 as
> > well). This led to traffic disruption in the cluster in order of
> > ~30-40seconds
> > during the upgrade.
> >
> > This fix was tested on an OpenShift cluster and verified that it does not
> > flush the zones that have not changed based on names (instead of UUIDs).
> >
> > Requesting a backport of the same to 23.06 OVN.
> >
>
> I think that's fine, in my opinion this can be easily considered a bug
> fix.  I have a few comments below.
>
> > Best Regards,
> > -SS
> >
> >
> > On Wed, Jul 26, 2023 at 8:45 AM Ales Musil  > > wrote:
> >
> > There are two scenarios that could cause unwanted
> > CT zone flush:
> >
> > 1) The SB DB is destroyed and recreated. The new
> > database will end up with different UUIDs for
> > various components.
> >
> > 2) Upgrade of existing SB DB to ovn-ic.
> > The components are the same as before, but scattered
> > between multiple SB DBs. This again leads to different
> > UUIDs in SB DB.
> >
> > The CT zone name was based on datapath UUID which
> > causes flush when the UUID changes. Even if
> > the datapath is the same.
> >
> > To prevent the unwanted flush migrate from UUID
> > to component name (LR/LS name). This allows
> > the CT zones to be stable across the before mentioned
> > scenarios.
> >
> > For the migration to be "flush less" itself we need
> > to make sure to start the restoration process only
> > after controller is connected to the SB DB e.g. the
> > restoration can happen only during engine run and not
> > init as it was done previously.
> >
> > Reported-at: https://bugzilla.redhat.com/2224199
> > 
> > Tested-by: Surya Seetharaman  > >
> > Signed-off-by: Ales Musil  amu...@redhat.com>>
> > ---
> >  controller/ovn-controller.c |  99 +++
> >  controller/physical.c   |  11 ++-
> >  lib/ovn-util.c  |   4 +-
> >  lib/ovn-util.h  |   2 +-
> >  tests/ovn-controller.at  |  16 ++--
> >  tests/ovn.at | 153
> > 
> >  6 files changed, 253 insertions(+), 32 deletions(-)
> >
> > diff --git a/controller/ovn-controller.c
> b/controller/ovn-controller.c
> > index 236974f4f..0453949bc 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -744,8 +744,17 @@ update_ct_zones(const struct sset *local_lports,
> >  HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
> >  /* XXX Add method to limit zone assignment to logical router
> >   * datapaths with NAT */
> > -char *dnat =
> > alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
> > -char *snat =
> > alloc_nat_zone_key(>datapath->header_.uuid, "snat");
> > +const char *name = smap_get(>datapath->external_ids,
> > "name");
> > +if (!name) {
> > +static struct vlog_rate_limit rl =
> > VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_ERR_RL(, "Missing name for datapath
> '"UUID_FMT"' "
> > +"skipping zone assignment.",
> > +UUID_ARGS(>datapath->header_.uuid));
> > +continue;
> > +}
> > +
> > +char *dnat = alloc_nat_zone_key(name, "dnat");
> > +char *snat = alloc_nat_zone_key(name, "snat");
> >  sset_add(_users, dnat);
> >  sset_add(_users, snat);
> >
> > @@ -882,9 +891,63 @@ struct ed_type_ct_zones {
> >  bool recomputed;
> >  };
> >
>
> To make it more explicit I'd add a comment here.  Would something like
> this make sense?
>
> /* Replaces a UUID prefix from 'uuid_zone' (if any) with the
>  * corresponding Datapath_Binding.external_ids.name.  Returns it
>  * as a new string that that will be owned by the caller. */
>

Added in v2.


>
> > +static char *
> > +ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table
> > *dp_table,
> > +   const char *uuid_zone)
> > +{
> > +struct uuid uuid;
> > +if (!uuid_from_string_prefix(, uuid_zone)) {
> > 

[ovs-dev] [PATCH ovn v2] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread Ales Musil
There are two scenarios that could cause unwanted
CT zone flush:

1) The SB DB is destroyed and recreated. The new
database will end up with different UUIDs for
various components.

2) Upgrade of existing SB DB to ovn-ic.
The components are the same as before, but scattered
between multiple SB DBs. This again leads to different
UUIDs in SB DB.

The CT zone name was based on datapath UUID which
causes flush when the UUID changes. Even if
the datapath is the same.

To prevent the unwanted flush migrate from UUID
to component name (LR/LS name). This allows
the CT zones to be stable across the before mentioned
scenarios.

For the migration to be "flush less" itself we need
to make sure to start the restoration process only
after controller is connected to the SB DB e.g. the
restoration can happen only during engine run and not
init as it was done previously.

Reported-at: https://bugzilla.redhat.com/2224199
Tested-by: Surya Seetharaman 
Signed-off-by: Ales Musil 
---
v2: Address comments from Dumitru.
Add TODO for the external_ids tracking.
Simplify the id retrieval in physical.c
---
 controller/ovn-controller.c | 106 
 controller/physical.c   |  17 ++--
 lib/ovn-util.c  |   4 +-
 lib/ovn-util.h  |   2 +-
 tests/ovn-controller.at |  16 ++--
 tests/ovn.at| 156 
 6 files changed, 267 insertions(+), 34 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 236974f4f..d34dba75c 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -744,8 +744,17 @@ update_ct_zones(const struct sset *local_lports,
 HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
 /* XXX Add method to limit zone assignment to logical router
  * datapaths with NAT */
-char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
-char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat");
+const char *name = smap_get(>datapath->external_ids, "name");
+if (!name) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_ERR_RL(, "Missing name for datapath '"UUID_FMT"' "
+"skipping zone assignment.",
+UUID_ARGS(>datapath->header_.uuid));
+continue;
+}
+
+char *dnat = alloc_nat_zone_key(name, "dnat");
+char *snat = alloc_nat_zone_key(name, "snat");
 sset_add(_users, dnat);
 sset_add(_users, snat);
 
@@ -882,9 +891,66 @@ struct ed_type_ct_zones {
 bool recomputed;
 };
 
+/* Replaces a UUID prefix from 'uuid_zone' (if any) with the
+ * corresponding Datapath_Binding.external_ids.name.  Returns it
+ * as a new string that that will be owned by the caller. */
+static char *
+ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table *dp_table,
+   const char *uuid_zone)
+{
+struct uuid uuid;
+if (!uuid_from_string_prefix(, uuid_zone)) {
+return NULL;
+}
+
+const struct sbrec_datapath_binding *dp =
+sbrec_datapath_binding_table_get_for_uuid(dp_table, );
+if (!dp) {
+return NULL;
+}
+
+const char *entity_name = smap_get(>external_ids, "name");
+if (!entity_name) {
+return NULL;
+}
+
+return xasprintf("%s%s", entity_name, uuid_zone + UUID_LEN);
+}
+
+static void
+ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table,
+struct ed_type_ct_zones *ct_zones_data, const char *name,
+int zone)
+{
+VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, name);
+
+char *new_name = ct_zone_name_from_uuid(dp_table, name);
+const char *current_name = name;
+if (new_name) {
+VLOG_DBG("ct zone %"PRId32" replace uuid name '%s' with '%s'",
+  zone, name, new_name);
+
+/* Make sure we remove the uuid one in the next OvS DB commit without
+ * flush. */
+add_pending_ct_zone_entry(_zones_data->pending, CT_ZONE_DB_QUEUED,
+  zone, false, name);
+/* Store the zone in OvS DB with name instead of uuid without flush.
+ * */
+add_pending_ct_zone_entry(_zones_data->pending, CT_ZONE_DB_QUEUED,
+  zone, true, new_name);
+current_name = new_name;
+}
+
+simap_put(_zones_data->current, current_name, zone);
+bitmap_set1(ct_zones_data->bitmap, zone);
+
+free(new_name);
+}
+
 static void
 restore_ct_zones(const struct ovsrec_bridge_table *bridge_table,
  const struct ovsrec_open_vswitch_table *ovs_table,
+ const struct sbrec_datapath_binding_table *dp_table,
  struct ed_type_ct_zones *ct_zones_data)
 {
 memset(ct_zones_data->bitmap, 0, sizeof ct_zones_data->bitmap);
@@ -895,10 +961,8 @@ restore_ct_zones(const struct ovsrec_bridge_table 
*bridge_table,
  

Re: [ovs-dev] [PATCH ovn] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread Dumitru Ceara
On 7/26/23 13:38, Dumitru Ceara wrote:
> I'd just set zone_ids.dnat and zone_ids.snat to 0 and 'return zone_ids;'
> here.  Then we'd also avoid the ternary operator below when calling
> alloc_nat_zone_key().

Or use "modern C" and initialize zone_ids:

struct zone_ids zone_ids = (struct zone_ids) { 0 };

:)

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread Dumitru Ceara
On 7/26/23 09:44, Surya Seetharaman wrote:
> Thanks Ales for doing this critical fix!
> 

+1 Thanks, Ales and Surya (ant others), for catching, debugging and
fixing this!

> During upgrades when moving clusters from non-interconnect model to
> the interconnect model we need to rebuild the NB/SB DB's from scratch
> and that was leading the change in UUIDs of the logical elements which
> was causing a ct zone flush (OVNK uses zone0 which is a well-known zone
> on the host that made matters worse since OVN started flushing zone 0 as
> well). This led to traffic disruption in the cluster in order of
> ~30-40seconds
> during the upgrade.
> 
> This fix was tested on an OpenShift cluster and verified that it does not
> flush the zones that have not changed based on names (instead of UUIDs).
> 
> Requesting a backport of the same to 23.06 OVN.
> 

I think that's fine, in my opinion this can be easily considered a bug
fix.  I have a few comments below.

> Best Regards,
> -SS
> 
> 
> On Wed, Jul 26, 2023 at 8:45 AM Ales Musil  > wrote:
> 
> There are two scenarios that could cause unwanted
> CT zone flush:
> 
> 1) The SB DB is destroyed and recreated. The new
> database will end up with different UUIDs for
> various components.
> 
> 2) Upgrade of existing SB DB to ovn-ic.
> The components are the same as before, but scattered
> between multiple SB DBs. This again leads to different
> UUIDs in SB DB.
> 
> The CT zone name was based on datapath UUID which
> causes flush when the UUID changes. Even if
> the datapath is the same.
> 
> To prevent the unwanted flush migrate from UUID
> to component name (LR/LS name). This allows
> the CT zones to be stable across the before mentioned
> scenarios.
> 
> For the migration to be "flush less" itself we need
> to make sure to start the restoration process only
> after controller is connected to the SB DB e.g. the
> restoration can happen only during engine run and not
> init as it was done previously.
> 
> Reported-at: https://bugzilla.redhat.com/2224199
> 
> Tested-by: Surya Seetharaman  >
> Signed-off-by: Ales Musil mailto:amu...@redhat.com>>
> ---
>  controller/ovn-controller.c |  99 +++
>  controller/physical.c       |  11 ++-
>  lib/ovn-util.c              |   4 +-
>  lib/ovn-util.h              |   2 +-
>  tests/ovn-controller.at      |  16 ++--
>  tests/ovn.at                 | 153
> 
>  6 files changed, 253 insertions(+), 32 deletions(-)
> 
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 236974f4f..0453949bc 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -744,8 +744,17 @@ update_ct_zones(const struct sset *local_lports,
>      HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
>          /* XXX Add method to limit zone assignment to logical router
>           * datapaths with NAT */
> -        char *dnat =
> alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
> -        char *snat =
> alloc_nat_zone_key(>datapath->header_.uuid, "snat");
> +        const char *name = smap_get(>datapath->external_ids,
> "name");
> +        if (!name) {
> +            static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5, 1);
> +            VLOG_ERR_RL(, "Missing name for datapath '"UUID_FMT"' "
> +                        "skipping zone assignment.",
> +                        UUID_ARGS(>datapath->header_.uuid));
> +            continue;
> +        }
> +
> +        char *dnat = alloc_nat_zone_key(name, "dnat");
> +        char *snat = alloc_nat_zone_key(name, "snat");
>          sset_add(_users, dnat);
>          sset_add(_users, snat);
> 
> @@ -882,9 +891,63 @@ struct ed_type_ct_zones {
>      bool recomputed;
>  };
> 

To make it more explicit I'd add a comment here.  Would something like
this make sense?

/* Replaces a UUID prefix from 'uuid_zone' (if any) with the
 * corresponding Datapath_Binding.external_ids.name.  Returns it
 * as a new string that that will be owned by the caller. */

> +static char *
> +ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table
> *dp_table,
> +                       const char *uuid_zone)
> +{
> +    struct uuid uuid;
> +    if (!uuid_from_string_prefix(, uuid_zone)) {
> +        return NULL;
> +    }
> +
> +    const struct sbrec_datapath_binding *dp =
> +            sbrec_datapath_binding_table_get_for_uuid(dp_table, );
> +    if (!dp) {
> +        return NULL;
> +    }
> +
> +    const char *entity_name = smap_get(>external_ids, "name");
> +    if 

[ovs-dev] [PATCH v2 ovn] northd: support binding remote ports in ovn-northd

2023-07-26 Thread Lorenzo Bianconi
ovn supports creating remote logical ports. An user
can set requested-chassis option for a logical switch port
to the remote chassis and ovn-northd can bind it to that chassis.
This is required for OVN IC in ovn-k8s. Right now ovn-k8s
ovnkube-controller after creating a remote logical port, sets the
chassis column of the corresponding port binding in SB DB to the
remote chassis. This process can be implemented in ovn-northd.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2217930
Signed-off-by: Lorenzo Bianconi 
---
Changes since v1:
- add NEWS entry
- do not remove ovn-ic code to bind sb to gw chassis
- simplify codebase
---
 NEWS|  2 ++
 ic/ovn-ic.c |  5 +
 northd/northd.c |  8 
 tests/ovn-ic.at |  2 ++
 tests/ovn-northd.at | 20 
 tests/ovn.at| 27 +++
 6 files changed, 64 insertions(+)

diff --git a/NEWS b/NEWS
index 8275877f9..be900c95b 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,8 @@ Post v23.06.0
   - To allow optimizing ovn-controller's monitor conditions for the regular
 VIF case, ovn-controller now unconditionally monitors all sub-ports
 (ports with parent_port set).
+  - Introduce support for binding remote ports in ovn-northd if the CMS sets
+requested-chassis option for a remote logical switch port.
 
 OVN v23.06.0 - 01 Jun 2023
 --
diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 6f31037ec..72709ce78 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -646,6 +646,11 @@ sync_remote_port(struct ic_context *ctx,
 /* Sync tunnel key from ISB to NB */
 sync_lsp_tnl_key(lsp, isb_pb->tunnel_key);
 
+/* Skip port binding if it is already requested by the CMS. */
+if (smap_get(>options, "requested-chassis")) {
+return;
+}
+
 /* Sync gateway from ISB to SB */
 if (isb_pb->gateway[0]) {
 if (!sb_pb->chassis || strcmp(sb_pb->chassis->name, isb_pb->gateway)) {
diff --git a/northd/northd.c b/northd/northd.c
index b9605862e..e5cd6b6ab 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3601,6 +3601,14 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn *ovnsb_txn,
  * ha_chassis_group cleared in the same transaction. */
 sbrec_port_binding_set_ha_chassis_group(op->sb, NULL);
 }
+
+/* ovn-northd is supposed to set port_binding for remote ports
+ * if requested chassis is marked as remote
+ */
+if (lsp_is_remote(op->nbsp)) {
+sbrec_port_binding_set_chassis(op->sb,
+   op->sb->requested_chassis);
+}
 } else {
 const char *chassis = NULL;
 if (op->peer && op->peer->od && op->peer->od->nbr) {
diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at
index ceee45092..05c9b2825 100644
--- a/tests/ovn-ic.at
+++ b/tests/ovn-ic.at
@@ -337,6 +337,7 @@ ovn-nbctl lsp-set-addresses lsp-ts1-lr1 router
 ovn-nbctl lsp-set-type lsp-ts1-lr1 router
 ovn-nbctl --wait=hv lsp-set-options lsp-ts1-lr1 router-port=lrp-lr1-ts1
 
+ovn_as az2 ovn-nbctl lsp-set-options lsp-ts1-lr1 requested-chassis=gw1
 AT_CHECK([ovn_as az2 ovn-nbctl show | uuidfilt], [0], [dnl
 switch <0> (ts1)
 port lsp-ts1-lr1
@@ -351,6 +352,7 @@ lsp-ts1-lr1,remote
 ovn-nbctl lrp-set-gateway-chassis lrp-lr1-ts1 gw1
 OVS_WAIT_UNTIL([ovn_as az2 ovn-sbctl show | grep lsp-ts1-lr1])
 
+ovn_as az2 ovn-nbctl lsp-set-options lsp-ts1-lr1 requested-chassis=""
 ovn-nbctl lrp-del-gateway-chassis lrp-lr1-ts1 gw1
 OVS_WAIT_WHILE([ovn_as az2 ovn-sbctl show | grep lsp-ts1-lr1])
 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 3e06f14c9..69569f3a7 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9688,3 +9688,23 @@ AT_CHECK([grep "lr_in_gw_redirect" R1flows |sed 
s'/table=../table=??/' |sort], [
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([Remote port binding])
+AT_KEYWORDS([remote-port-binding])
+ovn_start
+
+check ovn-sbctl chassis-add remote-ch0 geneve 127.0.0.1
+check ovn-sbctl set chassis remote-ch0 other_config:is-remote=true
+wait_row_count Chassis 1
+
+check ovn-nbctl ls-add sw0
+check ovn-nbctl lsp-add sw0 sw0-r1 -- lsp-set-type sw0-r1 remote
+check ovn-nbctl lsp-set-options sw0-r1 requested-chassis=remote-ch0
+wait_for_ports_up sw0-r1
+
+check ovn-nbctl remove logical_switch_port sw0-r1 options requested-chassis
+wait_row_count nb:Logical_Switch_Port 1 up=false name=sw0-r1
+
+AT_CLEANUP
+])
diff --git a/tests/ovn.at b/tests/ovn.at
index 24da9174e..a27e3eec2 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -26129,6 +26129,19 @@ done
 # XXX This should be more systematic.
 sleep 2
 
+# Populate requested-chassis options for remote lsps
+for az in $(seq 1 $n_az); do
+ovn_as az${az}
+for ts in $(seq 1 $n_ts); do
+for i in $(seq 1 $n_ts); do
+if [[ $i -eq ${az} ]]; then
+continue
+fi
+check ovn-nbctl lsp-set-options 

Re: [ovs-dev] [PATCH ovn] northd: support binding remote ports in ovn-northd

2023-07-26 Thread Lorenzo Bianconi
> Hi Lorenzo,

Hi Mark,

thx for the fast review.

> 
> Please add a NEWS item about this new feature.

ack, I will add it in v2.

> 
> I have more down below.
> 
> On 7/24/23 11:52, Lorenzo Bianconi wrote:
> > ovn supports creating remote logical ports. An user
> > can set requested-chassis option for a logical switch port
> > to the remote chassis and ovn-northd can bind it to that chassis.
> > This is required for OVN IC in ovn-k8s. Right now ovn-k8s
> > ovnkube-controller after creating a remote logical port, sets the
> > chassis column of the corresponding port binding in SB DB to the
> > remote chassis. This process can be authomized in ovn-northd.
> 
> I've never heard the word "authomized" before. I don't know if this is
> jargon that I have never heard before or if it is a typo :)
> 
> > 
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2217930
> > Signed-off-by: Lorenzo Bianconi 
> > ---
> >   ic/ovn-ic.c | 42 +++---
> >   northd/northd.c | 21 -
> >   tests/ovn-ic.at |  2 ++
> >   tests/ovn-northd.at | 20 
> >   tests/ovn.at| 27 +++
> >   5 files changed, 72 insertions(+), 40 deletions(-)
> > 
> > diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
> > index 6f31037ec..c12e345ed 100644
> > --- a/ic/ovn-ic.c
> > +++ b/ic/ovn-ic.c
> > @@ -506,20 +506,6 @@ get_lrp_address_for_sb_pb(struct ic_context *ctx,
> >   return peer->n_mac ? *peer->mac : NULL;
> >   }
> > -static const struct sbrec_chassis *
> > -find_sb_chassis(struct ic_context *ctx, const char *name)
> > -{
> > -const struct sbrec_chassis *key =
> > -sbrec_chassis_index_init_row(ctx->sbrec_chassis_by_name);
> > -sbrec_chassis_index_set_name(key, name);
> > -
> > -const struct sbrec_chassis *chassis =
> > -sbrec_chassis_index_find(ctx->sbrec_chassis_by_name, key);
> > -sbrec_chassis_index_destroy_row(key);
> > -
> > -return chassis;
> > -}
> > -
> >   static void
> >   sync_lsp_tnl_key(const struct nbrec_logical_switch_port *lsp,
> >int64_t isb_tnl_key)
> > @@ -622,13 +608,10 @@ sync_local_port(struct ic_context *ctx,
> >   /* For each remote port:
> >*   - Sync from ISB to NB
> > - *   - Sync gateway from ISB to SB
> >*/
> >   static void
> > -sync_remote_port(struct ic_context *ctx,
> > - const struct icsbrec_port_binding *isb_pb,
> > - const struct nbrec_logical_switch_port *lsp,
> > - const struct sbrec_port_binding *sb_pb)
> > +sync_remote_port(const struct icsbrec_port_binding *isb_pb,
> > + const struct nbrec_logical_switch_port *lsp)
> >   {
> >   /* Sync address from ISB to NB */
> >   if (isb_pb->address[0]) {
> > @@ -645,25 +628,6 @@ sync_remote_port(struct ic_context *ctx,
> >   /* Sync tunnel key from ISB to NB */
> >   sync_lsp_tnl_key(lsp, isb_pb->tunnel_key);
> > -
> > -/* Sync gateway from ISB to SB */
> > -if (isb_pb->gateway[0]) {
> > -if (!sb_pb->chassis || strcmp(sb_pb->chassis->name, 
> > isb_pb->gateway)) {
> > -const struct sbrec_chassis *chassis =
> > -find_sb_chassis(ctx, isb_pb->gateway);
> > -if (!chassis) {
> > -VLOG_DBG("Chassis %s is not found in SB, syncing from ISB "
> > - "to SB skipped for logical port %s.",
> > - isb_pb->gateway, lsp->name);
> > -return;
> > -}
> > -sbrec_port_binding_set_chassis(sb_pb, chassis);
> > -}
> > -} else {
> > -if (sb_pb->chassis) {
> > -sbrec_port_binding_set_chassis(sb_pb, NULL);
> > -}
> > -}
> 
> I don't feel great about removing this. This will break existing
> configurations that expect the port to automatically be bound to the gateway
> chassis.
> 
> I also don't understand why this needs to be removed. The removed behavior
> only works on logical switch ports of type "router". Typically
> "requested-chassis" is set on VIF ports, correct?

iiuc this code is executed for 'remote' ports (not router ones) but I would say
we can keep the code and avoid to overwrite the configuration if the CMS sets
requested-chassis options in lsp table.

> 
> >   }
> >   static void
> > @@ -813,7 +777,7 @@ port_binding_run(struct ic_context *ctx,
> >   if (!sb_pb) {
> >   continue;
> >   }
> > -sync_remote_port(ctx, isb_pb, lsp, sb_pb);
> > +sync_remote_port(isb_pb, lsp);
> >   }
> >   } else {
> >   VLOG_DBG("Ignore lsp %s on ts %s with type %s.",
> > diff --git a/northd/northd.c b/northd/northd.c
> > index b9605862e..618c79c61 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -3773,11 +3773,30 @@ ovn_port_update_sbrec(struct ovsdb_idl_txn 
> > *ovnsb_txn,
> >   

Re: [ovs-dev] [PATCH net-next 0/7] openvswitch: add drop reasons

2023-07-26 Thread Adrian Moreno



On 7/24/23 16:34, Aaron Conole wrote:

Adrian Moreno  writes:


There is currently a gap in drop visibility in the openvswitch module.
This series tries to improve this by adding a new drop reason subsystem
for OVS.

Apart from adding a new drop reasson subsystem and some common drop
reasons, this series takes Eric's preliminary work [1] on adding an
explicit drop action and integrates it into the same subsystem.

This series also adds some selftests and so it requires [2] to be
applied first.

A limitation of this series is that it does not report upcall errors.
The reason is that there could be many sources of upcall drops and the
most common one, which is the netlink buffer overflow, cannot be
reported via kfree_skb() because the skb is freed in the netlink layer
(see [3]). Therefore, using a reason for the rare events and not the
common one would be even more misleading. I'd propose we add (in a
follow up patch) a tracepoint to better report upcall errors.


I guess you meant to add RFC tag to this, since it depends on other
series that aren't accepted yet.



Yep, sorry I should have added RFC tag.


If it's okay, I will pull in your patch 5/7 when I re-post my flow
additions series, since it will need to be added there at some point
anyway.



Sure, please go ahead.


[1] https://lore.kernel.org/netdev/202306300609.tdrdzscy-...@intel.com/T/
[2] 
https://lore.kernel.org/all/9375ccbc-dd40-9998-dde5-c94e4e28f...@redhat.com/T/
[3] commit 1100248a5c5c ("openvswitch: Fix double reporting of drops in 
dropwatch")

Adrian Moreno (6):
   net: openvswitch: add datapath flow drop reason
   net: openvswitch: add meter drop reason
   net: openvswitch: add misc error drop reasons
   selftests: openvswitch: support key masks
   selftests: openvswitch: add drop reason testcase
   selftests: openvswitch: add explicit drop testcase

Eric Garver (1):
   net: openvswitch: add explicit drop action

  include/net/dropreason.h  |   6 +
  include/uapi/linux/openvswitch.h  |   2 +
  net/openvswitch/actions.c |  40 --
  net/openvswitch/conntrack.c   |   3 +-
  net/openvswitch/datapath.c|  16 +++
  net/openvswitch/drop.h|  33 +
  net/openvswitch/flow_netlink.c|   8 +-
  .../selftests/net/openvswitch/openvswitch.sh  |  92 +-
  .../selftests/net/openvswitch/ovs-dpctl.py| 115 --
  9 files changed, 267 insertions(+), 48 deletions(-)
  create mode 100644 net/openvswitch/drop.h




--
Adrián Moreno

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 4/7] net: openvswitch: add misc error drop reasons

2023-07-26 Thread Adrian Moreno




On 7/24/23 17:23, Aaron Conole wrote:

Adrian Moreno  writes:


Use drop reasons from include/net/dropreason-core.h when a reasonable
candidate exists.

Signed-off-by: Adrian Moreno 
---
  net/openvswitch/actions.c   | 17 ++---
  net/openvswitch/conntrack.c |  3 ++-
  net/openvswitch/drop.h  |  6 ++
  3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 9279bb186e9f..42fa1e7eb912 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c


Did you consider putting in a drop reason when one of the actions fails
setting err?  For example, if dec_ttl fails with some error other than
EHOSTUNREACH, it will drop into the kfree_skb() case... maybe we should
have an action_failed drop reason that can be passed there.



Another good idea! Thanks Aaron.



@@ -782,7 +782,7 @@ static int ovs_vport_output(struct net *net, struct sock 
*sk,
struct vport *vport = data->vport;
  
  	if (skb_cow_head(skb, data->l2_len) < 0) {

-   kfree_skb(skb);
+   kfree_skb_reason(skb, SKB_DROP_REASON_NOMEM);
return -ENOMEM;
}
  
@@ -853,6 +853,7 @@ static void ovs_fragment(struct net *net, struct vport *vport,

 struct sk_buff *skb, u16 mru,
 struct sw_flow_key *key)
  {
+   enum ovs_drop_reason reason;
u16 orig_network_offset = 0;
  
  	if (eth_p_mpls(skb->protocol)) {

@@ -862,6 +863,7 @@ static void ovs_fragment(struct net *net, struct vport 
*vport,
  
  	if (skb_network_offset(skb) > MAX_L2_LEN) {

OVS_NLERR(1, "L2 header too long to fragment");
+   reason = OVS_DROP_FRAG_L2_TOO_LONG;
goto err;
}
  
@@ -902,12 +904,13 @@ static void ovs_fragment(struct net *net, struct vport *vport,

WARN_ONCE(1, "Failed fragment ->%s: eth=%04x, MRU=%d, MTU=%d.",
  ovs_vport_name(vport), ntohs(key->eth.type), mru,
  vport->dev->mtu);
+   reason = OVS_DROP_FRAG_INVALID_PROTO;
goto err;
}
  
  	return;

  err:
-   kfree_skb(skb);
+   kfree_skb_reason(skb, reason);
  }
  
  static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,

@@ -934,10 +937,10 @@ static void do_output(struct datapath *dp, struct sk_buff 
*skb, int out_port,
  
  			ovs_fragment(net, vport, skb, mru, key);

} else {
-   kfree_skb(skb);
+   kfree_skb_reason(skb, SKB_DROP_REASON_PKT_TOO_BIG);
}
} else {
-   kfree_skb(skb);
+   kfree_skb_reason(skb, SKB_DROP_REASON_DEV_READY);
}
  }
  
@@ -1011,7 +1014,7 @@ static int dec_ttl_exception_handler(struct datapath *dp, struct sk_buff *skb,

return clone_execute(dp, skb, key, 0, nla_data(actions),
 nla_len(actions), true, false);
  
-	consume_skb(skb);

+   kfree_skb_reason(skb, OVS_DROP_IP_TTL);
return 0;
  }
  
@@ -1564,7 +1567,7 @@ static int clone_execute(struct datapath *dp, struct sk_buff *skb,

/* Out of per CPU action FIFO space. Drop the 'skb' and
 * log an error.
 */
-   kfree_skb(skb);
+   kfree_skb_reason(skb, OVS_DROP_DEFERRED_LIMIT);
  
  		if (net_ratelimit()) {

if (actions) { /* Sample action */
@@ -1616,7 +1619,7 @@ int ovs_execute_actions(struct datapath *dp, struct 
sk_buff *skb,
if (unlikely(level > OVS_RECURSION_LIMIT)) {
net_crit_ratelimited("ovs: recursion limit reached on datapath %s, 
probable configuration error\n",
 ovs_dp_name(dp));
-   kfree_skb(skb);
+   kfree_skb_reason(skb, OVS_DROP_RECURSION_LIMIT);
err = -ENETDOWN;
goto out;
}
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index fa955e892210..b03ebd4a8fae 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -29,6 +29,7 @@
  #include 
  
  #include "datapath.h"

+#include "drop.h"
  #include "conntrack.h"
  #include "flow.h"
  #include "flow_netlink.h"
@@ -1035,7 +1036,7 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb,
  
  	skb_push_rcsum(skb, nh_ofs);

if (err)
-   kfree_skb(skb);
+   kfree_skb_reason(skb, OVS_DROP_CONNTRACK);
return err;
  }
  
diff --git a/net/openvswitch/drop.h b/net/openvswitch/drop.h

index 2440c836727f..744b8d1b93a3 100644
--- a/net/openvswitch/drop.h
+++ b/net/openvswitch/drop.h
@@ -12,6 +12,12 @@
R(OVS_DROP_EXPLICIT_ACTION) \
R(OVS_DROP_EXPLICIT_ACTION_ERROR)   \
R(OVS_DROP_METER)   \
+   R(OVS_DROP_RECURSION_LIMIT) \
+   R(OVS_DROP_DEFERRED_LIMIT)  \

Re: [ovs-dev] [PATCH net-next 2/7] net: openvswitch: add explicit drop action

2023-07-26 Thread Adrian Moreno



On 7/24/23 16:47, Aaron Conole wrote:

Adrian Moreno  writes:


From: Eric Garver 

This adds an explicit drop action. This is used by OVS to drop packets
for which it cannot determine what to do. An explicit action in the
kernel allows passing the reason _why_ the packet is being dropped or
zero to indicate no particular error happened (i.e: OVS intentionally
dropped the packet).

Since the error codes coming from userspace mean nothing for the kernel,
we squash all of them into only two drop reasons:
- OVS_DROP_EXPLICIT_ACTION_ERROR to indicate a non-zero value was passed
- OVS_DROP_EXPLICIT_ACTION to indicate a zero value was passed (no
   error)

e.g. trace all OVS dropped skbs

  # perf trace -e skb:kfree_skb --filter="reason >= 0x3"
  [..]
  106.023 ping/2465 skb:kfree_skb(skbaddr: 0xa0e8765f2000, \
   location:0xc0d9b462, protocol: 2048, reason: 196610)

reason: 196610 --> 0x30002 (OVS_DROP_EXPLICIT_ACTION)

Signed-off-by: Eric Garver 
Signed-off-by: Adrian Moreno 
---
  include/uapi/linux/openvswitch.h | 2 ++
  net/openvswitch/actions.c| 9 +
  net/openvswitch/drop.h   | 2 ++
  net/openvswitch/flow_netlink.c   | 8 +++-
  tools/testing/selftests/net/openvswitch/ovs-dpctl.py | 3 +++
  5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index e94870e77ee9..efc82c318fa2 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -965,6 +965,7 @@ struct check_pkt_len_arg {
   * start of the packet or at the start of the l3 header depending on the value
   * of l3 tunnel flag in the tun_flags field of OVS_ACTION_ATTR_ADD_MPLS
   * argument.
+ * @OVS_ACTION_ATTR_DROP: Explicit drop action.
   *
   * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not 
all
   * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -1002,6 +1003,7 @@ enum ovs_action_attr {
OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
+   OVS_ACTION_ATTR_DROP, /* u32 error code. */
  
  	__OVS_ACTION_ATTR_MAX,	  /* Nothing past this will be accepted

   * from userspace. */
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index af676dcac2b4..194379d58b62 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -1485,6 +1485,15 @@ static int do_execute_actions(struct datapath *dp, 
struct sk_buff *skb,
return dec_ttl_exception_handler(dp, skb,
 key, a);
break;
+
+   case OVS_ACTION_ATTR_DROP: {
+   enum ovs_drop_reason reason = nla_get_u32(a)
+   ? OVS_DROP_EXPLICIT_ACTION_ERROR
+   : OVS_DROP_EXPLICIT_ACTION;
+
+   kfree_skb_reason(skb, reason);
+   return 0;
+   }
}
  
  		if (unlikely(err)) {

diff --git a/net/openvswitch/drop.h b/net/openvswitch/drop.h
index cdd10629c6be..f9e9c1610f6b 100644
--- a/net/openvswitch/drop.h
+++ b/net/openvswitch/drop.h
@@ -9,6 +9,8 @@
  
  #define OVS_DROP_REASONS(R)			\

R(OVS_DROP_FLOW)\
+   R(OVS_DROP_EXPLICIT_ACTION) \
+   R(OVS_DROP_EXPLICIT_ACTION_ERROR)   \
/* deliberate comment for trailing \ */
  
  enum ovs_drop_reason {

diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 41116361433d..244280922a14 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -38,6 +38,7 @@
  #include 
  #include 
  
+#include "drop.h"

  #include "flow_netlink.h"
  
  struct ovs_len_tbl {

@@ -61,6 +62,7 @@ static bool actions_may_change_flow(const struct nlattr 
*actions)
case OVS_ACTION_ATTR_RECIRC:
case OVS_ACTION_ATTR_TRUNC:
case OVS_ACTION_ATTR_USERSPACE:
+   case OVS_ACTION_ATTR_DROP:
break;
  
  		case OVS_ACTION_ATTR_CT:

@@ -2394,7 +2396,7 @@ static void ovs_nla_free_nested_actions(const struct 
nlattr *actions, int len)
/* Whenever new actions are added, the need to update this
 * function should be considered.
 */
-   BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 23);
+   BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 24);
  
  	if (!actions)

return;
@@ -3182,6 +3184,7 @@ static int __ovs_nla_copy_actions(struct net *net, const 
struct nlattr *attr,
[OVS_ACTION_ATTR_CHECK_PKT_LEN] = (u32)-1,
[OVS_ACTION_ATTR_ADD_MPLS] = 

Re: [ovs-dev] [PATCH ovn v2] Expose distributed gateway port information in NB DB

2023-07-26 Thread Lucas Martins
On Tue, Jul 25, 2023 at 2:28 PM Dumitru Ceara  wrote:
>
> On 7/5/23 10:36, lmart...@redhat.com wrote:
> > From: Lucas Alvares Gomes 
> >
> > In order for the CMS to know which Chassis a distributed gateway port
> > is bond to, this patch updates the ovn-northd daemon to populate the
> > Logical_Router_Port table with that information.
> >
> > To avoid changing the database schema, ovn-northd is setting a new key
> > called "hosting-chassis" in the options column from the LRP table. This
> > key value points to the name of the Chassis that is currently hosting
> > the distributed port.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2107515
> > Signed-off-by: Lucas Alvares Gomes 
> > ---
> >
>
> Hi Lucas!
>
> Thanks for the patch!  And sorry for taking long to review.
>

Thank you for the review! No problem at all, appreciate it.

> > v1 -> v2
> > * Rebased on the main branch
> > * Updated the ovnsb_db_run() and handle_port_binding_changes() functions
> >   to include the LR ports as a parameter
> >
> >  northd/en-sync-from-sb.c |  2 +-
> >  northd/northd.c  | 34 --
> >  northd/northd.h  |  3 ++-
> >  ovn-nb.xml   | 15 +++
> >  tests/ovn-northd.at  | 34 ++
> >  5 files changed, 84 insertions(+), 4 deletions(-)
> >
> > diff --git a/northd/en-sync-from-sb.c b/northd/en-sync-from-sb.c
> > index 55ece2d16..4109aebe4 100644
> > --- a/northd/en-sync-from-sb.c
> > +++ b/northd/en-sync-from-sb.c
> > @@ -60,7 +60,7 @@ en_sync_from_sb_run(struct engine_node *node, void *data 
> > OVS_UNUSED)
> >  stopwatch_start(OVNSB_DB_RUN_STOPWATCH_NAME, time_msec());
> >  ovnsb_db_run(eng_ctx->ovnnb_idl_txn, eng_ctx->ovnsb_idl_txn,
> >   sb_pb_table, sb_ha_ch_grp_table, sb_ha_ch_grp_by_name,
> > - >ls_ports);
> > + >ls_ports, >lr_ports);
> >  stopwatch_stop(OVNSB_DB_RUN_STOPWATCH_NAME, time_msec());
> >  }
> >
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 309e0abd0..58da9c086 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -17631,6 +17631,7 @@ handle_port_binding_changes(struct ovsdb_idl_txn 
> > *ovnsb_txn,
> >  const struct sbrec_ha_chassis_group_table 
> > *sb_ha_ch_grp_table,
> >  struct ovsdb_idl_index *sb_ha_ch_grp_by_name,
> >  struct hmap *ls_ports,
> > +struct hmap *lr_ports,
> >  struct shash *ha_ref_chassis_map)
> >  {
> >  const struct sbrec_port_binding *sb;
> > @@ -17649,6 +17650,34 @@ handle_port_binding_changes(struct ovsdb_idl_txn 
> > *ovnsb_txn,
> >  }
> >
> >  SBREC_PORT_BINDING_TABLE_FOR_EACH (sb, sb_pb_table) {
> > +
> > +/* Look for a chassisredirect binding and set the "active-chassis"
> > + * option in the NBDB logical_router_port table indicating on which
> > + * chassis the distributed port is bond to. */
> > +if (!strcmp(sb->type, "chassisredirect")) {
> > + const char *dist_port =
> > + smap_get(>options, "distributed-port");
>
> Isn't it easier to just use the op->l3dgw_port instead of looking it up
> again here?  We lookup 'op' just below in the port binding loop.
>

Ah great! I didn't know about that l3dgw_port, I will change the patch
and test it out.


> > + if (dist_port) {
> > + struct ovn_port *router_port =
> > + ovn_port_find(lr_ports, dist_port);
> > + if (router_port) {
> > + struct smap options;
> > + smap_clone(, _port->nbrp->options);
> > + if (sb->chassis) {
> > + smap_replace(, "hosting-chassis",
> > +  sb->chassis->name);
> > + } else {
> > + smap_remove(, "hosting-chassis");
> > + }
> > + 
> > nbrec_logical_router_port_set_options(router_port->nbrp,
> > +   );
>
> We leak 'options' here.  Also reported in CI:
>
> https://github.com/ovsrobot/ovn/actions/runs/5462364730
>
> You might want to add a 'smap_destroy()' here.
>

Oops, thanks for pointing it out! Will fix.

> Can we move this "update the options" logic into a separate
> chassis-redirect specific function?
>

Certainly, will make the code more organized. I will do it.

> > +  }
> > + }
> > + /* Continue since there are no matching logical port for
> > +  * chassisredirect bindings */
> > + continue;
> > +}
> > +
> >  struct ovn_port *op = ovn_port_find(ls_ports, sb->logical_port);
> >
> >  if (!op || !op->nbsp) {
> > @@ -17697,7 +17726,8 @@ ovnsb_db_run(struct ovsdb_idl_txn *ovnnb_txn,
> >   const struct sbrec_port_binding_table *sb_pb_table,
> > 

Re: [ovs-dev] [PATCH ovn] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread Surya Seetharaman
Thanks Ales for doing this critical fix!

During upgrades when moving clusters from non-interconnect model to
the interconnect model we need to rebuild the NB/SB DB's from scratch
and that was leading the change in UUIDs of the logical elements which
was causing a ct zone flush (OVNK uses zone0 which is a well-known zone
on the host that made matters worse since OVN started flushing zone 0 as
well). This led to traffic disruption in the cluster in order of
~30-40seconds
during the upgrade.

This fix was tested on an OpenShift cluster and verified that it does not
flush the zones that have not changed based on names (instead of UUIDs).

Requesting a backport of the same to 23.06 OVN.

Best Regards,
-SS


On Wed, Jul 26, 2023 at 8:45 AM Ales Musil  wrote:

> There are two scenarios that could cause unwanted
> CT zone flush:
>
> 1) The SB DB is destroyed and recreated. The new
> database will end up with different UUIDs for
> various components.
>
> 2) Upgrade of existing SB DB to ovn-ic.
> The components are the same as before, but scattered
> between multiple SB DBs. This again leads to different
> UUIDs in SB DB.
>
> The CT zone name was based on datapath UUID which
> causes flush when the UUID changes. Even if
> the datapath is the same.
>
> To prevent the unwanted flush migrate from UUID
> to component name (LR/LS name). This allows
> the CT zones to be stable across the before mentioned
> scenarios.
>
> For the migration to be "flush less" itself we need
> to make sure to start the restoration process only
> after controller is connected to the SB DB e.g. the
> restoration can happen only during engine run and not
> init as it was done previously.
>
> Reported-at: https://bugzilla.redhat.com/2224199
> Tested-by: Surya Seetharaman 
> Signed-off-by: Ales Musil 
> ---
>  controller/ovn-controller.c |  99 +++
>  controller/physical.c   |  11 ++-
>  lib/ovn-util.c  |   4 +-
>  lib/ovn-util.h  |   2 +-
>  tests/ovn-controller.at |  16 ++--
>  tests/ovn.at| 153 
>  6 files changed, 253 insertions(+), 32 deletions(-)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 236974f4f..0453949bc 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -744,8 +744,17 @@ update_ct_zones(const struct sset *local_lports,
>  HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
>  /* XXX Add method to limit zone assignment to logical router
>   * datapaths with NAT */
> -char *dnat = alloc_nat_zone_key(>datapath->header_.uuid,
> "dnat");
> -char *snat = alloc_nat_zone_key(>datapath->header_.uuid,
> "snat");
> +const char *name = smap_get(>datapath->external_ids, "name");
> +if (!name) {
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> +VLOG_ERR_RL(, "Missing name for datapath '"UUID_FMT"' "
> +"skipping zone assignment.",
> +UUID_ARGS(>datapath->header_.uuid));
> +continue;
> +}
> +
> +char *dnat = alloc_nat_zone_key(name, "dnat");
> +char *snat = alloc_nat_zone_key(name, "snat");
>  sset_add(_users, dnat);
>  sset_add(_users, snat);
>
> @@ -882,9 +891,63 @@ struct ed_type_ct_zones {
>  bool recomputed;
>  };
>
> +static char *
> +ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table
> *dp_table,
> +   const char *uuid_zone)
> +{
> +struct uuid uuid;
> +if (!uuid_from_string_prefix(, uuid_zone)) {
> +return NULL;
> +}
> +
> +const struct sbrec_datapath_binding *dp =
> +sbrec_datapath_binding_table_get_for_uuid(dp_table, );
> +if (!dp) {
> +return NULL;
> +}
> +
> +const char *entity_name = smap_get(>external_ids, "name");
> +if (!entity_name) {
> +return NULL;
> +}
> +
> +return xasprintf("%s%s", entity_name, uuid_zone + UUID_LEN);
> +}
> +
> +static void
> +ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table,
> +struct ed_type_ct_zones *ct_zones_data, const char *name,
> +int zone)
> +{
> +VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, name);
> +
> +char *new_name = ct_zone_name_from_uuid(dp_table, name);
> +const char *current_name = name;
> +if (new_name) {
> +VLOG_DBG("ct zone %"PRId32" replace uuid name '%s' with '%s'",
> +  zone, name, new_name);
> +
> +/* Make sure we remove the uuid one in the next OvS DB commit
> without
> + * flush. */
> +add_pending_ct_zone_entry(_zones_data->pending,
> CT_ZONE_DB_QUEUED,
> +  zone, false, name);
> +/* Store the zone in OvS DB with name instead of uuid without
> flush.
> + * */
> +add_pending_ct_zone_entry(_zones_data->pending,
> CT_ZONE_DB_QUEUED,
> 

Re: [ovs-dev] [RFC ovn-heater] OVN scale testing with OpenStack workloads - questionnaire

2023-07-26 Thread Dumitru Ceara
On 7/12/23 10:51, Dumitru Ceara wrote:
> Hi all,
> 
> During the ovn-heater community meeting organized by Frode yesterday
> (thanks again for that!) we agreed to follow up on some points.  Two of
> these are related to gathering more information that allow us to build
> realistic test scenarios (1) and define targets for them (2).
> 
> On that note we have agreed to prepare a _short_ questionnaire to be
> shared with OpenStack + OVN users (companies and other operators).
> 
> I started a draft document here:
> 
> https://docs.google.com/document/d/151saO5a5PmCt7cIZQ7DkgvU755od76BlN2bKjXc1n08
> 

Based on feedback received both through comments/suggestions in the
google doc and here on-list we seem to have reached a stable version of
the questionnaire.  I tagged the version as V1.

https://docs.google.com/document/d/151saO5a5PmCt7cIZQ7DkgvU755od76BlN2bKjXc1n08/

During yesterday's ovn-heater community meeting we agreed to take a few
minutes from the weekly OVN community meeting (Thursday, 1715 UTC [0])
and iron out any potential last issues before sharing this questionnaire
with users.

Best regards,
Dumitru

[0] https://www.ovn.org/en/#irc-meetings

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] controller: Migrate from ct zone UUID name to component name

2023-07-26 Thread Ales Musil
There are two scenarios that could cause unwanted
CT zone flush:

1) The SB DB is destroyed and recreated. The new
database will end up with different UUIDs for
various components.

2) Upgrade of existing SB DB to ovn-ic.
The components are the same as before, but scattered
between multiple SB DBs. This again leads to different
UUIDs in SB DB.

The CT zone name was based on datapath UUID which
causes flush when the UUID changes. Even if
the datapath is the same.

To prevent the unwanted flush migrate from UUID
to component name (LR/LS name). This allows
the CT zones to be stable across the before mentioned
scenarios.

For the migration to be "flush less" itself we need
to make sure to start the restoration process only
after controller is connected to the SB DB e.g. the
restoration can happen only during engine run and not
init as it was done previously.

Reported-at: https://bugzilla.redhat.com/2224199
Tested-by: Surya Seetharaman 
Signed-off-by: Ales Musil 
---
 controller/ovn-controller.c |  99 +++
 controller/physical.c   |  11 ++-
 lib/ovn-util.c  |   4 +-
 lib/ovn-util.h  |   2 +-
 tests/ovn-controller.at |  16 ++--
 tests/ovn.at| 153 
 6 files changed, 253 insertions(+), 32 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 236974f4f..0453949bc 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -744,8 +744,17 @@ update_ct_zones(const struct sset *local_lports,
 HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
 /* XXX Add method to limit zone assignment to logical router
  * datapaths with NAT */
-char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
-char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat");
+const char *name = smap_get(>datapath->external_ids, "name");
+if (!name) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_ERR_RL(, "Missing name for datapath '"UUID_FMT"' "
+"skipping zone assignment.",
+UUID_ARGS(>datapath->header_.uuid));
+continue;
+}
+
+char *dnat = alloc_nat_zone_key(name, "dnat");
+char *snat = alloc_nat_zone_key(name, "snat");
 sset_add(_users, dnat);
 sset_add(_users, snat);
 
@@ -882,9 +891,63 @@ struct ed_type_ct_zones {
 bool recomputed;
 };
 
+static char *
+ct_zone_name_from_uuid(const struct sbrec_datapath_binding_table *dp_table,
+   const char *uuid_zone)
+{
+struct uuid uuid;
+if (!uuid_from_string_prefix(, uuid_zone)) {
+return NULL;
+}
+
+const struct sbrec_datapath_binding *dp =
+sbrec_datapath_binding_table_get_for_uuid(dp_table, );
+if (!dp) {
+return NULL;
+}
+
+const char *entity_name = smap_get(>external_ids, "name");
+if (!entity_name) {
+return NULL;
+}
+
+return xasprintf("%s%s", entity_name, uuid_zone + UUID_LEN);
+}
+
+static void
+ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table,
+struct ed_type_ct_zones *ct_zones_data, const char *name,
+int zone)
+{
+VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, name);
+
+char *new_name = ct_zone_name_from_uuid(dp_table, name);
+const char *current_name = name;
+if (new_name) {
+VLOG_DBG("ct zone %"PRId32" replace uuid name '%s' with '%s'",
+  zone, name, new_name);
+
+/* Make sure we remove the uuid one in the next OvS DB commit without
+ * flush. */
+add_pending_ct_zone_entry(_zones_data->pending, CT_ZONE_DB_QUEUED,
+  zone, false, name);
+/* Store the zone in OvS DB with name instead of uuid without flush.
+ * */
+add_pending_ct_zone_entry(_zones_data->pending, CT_ZONE_DB_QUEUED,
+  zone, true, new_name);
+current_name = new_name;
+}
+
+simap_put(_zones_data->current, current_name, zone);
+bitmap_set1(ct_zones_data->bitmap, zone);
+
+free(new_name);
+}
+
 static void
 restore_ct_zones(const struct ovsrec_bridge_table *bridge_table,
  const struct ovsrec_open_vswitch_table *ovs_table,
+ const struct sbrec_datapath_binding_table *dp_table,
  struct ed_type_ct_zones *ct_zones_data)
 {
 memset(ct_zones_data->bitmap, 0, sizeof ct_zones_data->bitmap);
@@ -895,10 +958,8 @@ restore_ct_zones(const struct ovsrec_bridge_table 
*bridge_table,
 struct ct_zone_pending_entry *ctpe = pending_node->data;
 
 if (ctpe->add) {
-VLOG_DBG("restoring ct zone %"PRId32" for '%s'", ctpe->zone,
- pending_node->name);
-bitmap_set1(ct_zones_data->bitmap, ctpe->zone);
-simap_put(_zones_data->current,