Re: [ovs-dev] [PATCH ovn] MAINTAINERS: Transition myself to emeritus status.

2021-10-05 Thread William Tu
On Tue, Oct 5, 2021 at 8:05 PM William Tu  wrote:
>
> On Tue, Oct 5, 2021 at 11:33 AM Numan Siddique  wrote:
> >
> > On Tue, Oct 5, 2021 at 1:46 PM Han Zhou  wrote:
> > >
> > > On Tue, Oct 5, 2021 at 9:39 AM Ben Pfaff  wrote:
> > > >
> > > > I have not been actively maintaining OVN for some time now.  I don't
> > > expect
> > > > that to change.  I think that the responsible thing to do is to
> > > > acknowledge this properly by transitioning to emeritus status.
> > > >
> > > > Signed-off-by: Ben Pfaff 
> > > > ---
> > > >  MAINTAINERS.rst | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/MAINTAINERS.rst b/MAINTAINERS.rst
> > > > index 0d19bd622..fff2cd977 100644
> > > > --- a/MAINTAINERS.rst
> > > > +++ b/MAINTAINERS.rst
> > > > @@ -41,8 +41,6 @@ This is the current list of active OVN committers:
> > > >
> > > > * - Name
> > > >   - Email
> > > > -   * - Ben Pfaff
> > > > - - b...@ovn.org
> > > > * - Gurucharan Shetty
> > > >   - g...@ovn.org
> > > > * - Han Zhou
> > > > @@ -67,3 +65,5 @@ More information about Emeritus Committers can be 
> > > > found
> > > >
> > > > * - Name
> > > >   - Email
> > > > +   * - Ben Pfaff
> > > > + - b...@ovn.org
> > > > --
> > > > 2.31.1
> > > >
> > > > ___
> > > > dev mailing list
> > > > d...@openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >
> > > Hi Ben,
> > >
> > > The patch LGTM, although I really don't like the solution :(
> > > Thank you so much for building the amazing stuff from scratch!
> > >
> >
> > Agree, Thank you for all your amazing work!
> >
> > Thanks
> > Numan
> >
> >
> > > Acked-by: Han Zhou 
>
> Thanks Ben!
> I applied to master.
> William

and feel free to revert the commit if you change your mind :)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] MAINTAINERS: Transition myself to emeritus status.

2021-10-05 Thread William Tu
On Tue, Oct 5, 2021 at 11:33 AM Numan Siddique  wrote:
>
> On Tue, Oct 5, 2021 at 1:46 PM Han Zhou  wrote:
> >
> > On Tue, Oct 5, 2021 at 9:39 AM Ben Pfaff  wrote:
> > >
> > > I have not been actively maintaining OVN for some time now.  I don't
> > expect
> > > that to change.  I think that the responsible thing to do is to
> > > acknowledge this properly by transitioning to emeritus status.
> > >
> > > Signed-off-by: Ben Pfaff 
> > > ---
> > >  MAINTAINERS.rst | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/MAINTAINERS.rst b/MAINTAINERS.rst
> > > index 0d19bd622..fff2cd977 100644
> > > --- a/MAINTAINERS.rst
> > > +++ b/MAINTAINERS.rst
> > > @@ -41,8 +41,6 @@ This is the current list of active OVN committers:
> > >
> > > * - Name
> > >   - Email
> > > -   * - Ben Pfaff
> > > - - b...@ovn.org
> > > * - Gurucharan Shetty
> > >   - g...@ovn.org
> > > * - Han Zhou
> > > @@ -67,3 +65,5 @@ More information about Emeritus Committers can be found
> > >
> > > * - Name
> > >   - Email
> > > +   * - Ben Pfaff
> > > + - b...@ovn.org
> > > --
> > > 2.31.1
> > >
> > > ___
> > > dev mailing list
> > > d...@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> > Hi Ben,
> >
> > The patch LGTM, although I really don't like the solution :(
> > Thank you so much for building the amazing stuff from scratch!
> >
>
> Agree, Thank you for all your amazing work!
>
> Thanks
> Numan
>
>
> > Acked-by: Han Zhou 

Thanks Ben!
I applied to master.
William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 1/2] northd: Use address sets for ARP responder flows for VIPs.

2021-10-05 Thread Numan Siddique
On Fri, Oct 1, 2021 at 9:47 AM Dumitru Ceara  wrote:
>
> Otherwise the S_ROUTER_IN_IP_INPUT aggregated flows that reply to ARP
> requests targeting load balancer VIPs get completely regenerated every
> time a new VIP/LB is added.  This affects SB memory usage as RAFT log
> entries are huge.  Use an address set instead.  Updating an address set
> is incremental, because it's performed with a "mutate" operation.
>
> On a large scale ovn-kubernetes deployment with a high number of
> load balancers (services) this change reduces memory usage of
> ovsdb-servers running the OVN_Southbound cluster by 50%, from ~2GB
> RSS to ~1GB RSS.
>
> Signed-off-by: Dumitru Ceara 

Thanks Dumitru for the patches.

Both the patches LGTM.   I applied both to the main branch.

Numan

> ---
>  northd/northd.c |   61 
> +--
>  tests/ovn-northd.at |   44 +
>  2 files changed, 59 insertions(+), 46 deletions(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index cf2467fe1..5699745d6 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -11937,39 +11937,28 @@ build_lrouter_ipv4_ip_input(struct ovn_port *op,
>op->cr_port->json_key);
>  }
>
> -struct ds load_balancer_ips_v4 = DS_EMPTY_INITIALIZER;
> -
> -/* For IPv4 we can just create one rule with all required IPs. */
> -ds_put_cstr(_balancer_ips_v4, "{ ");
> -ds_put_and_free_cstr(_balancer_ips_v4,
> - sset_join(>od->lb_ips_v4, ", ", " }"));
> -
> -build_lrouter_arp_flow(op->od, op, 
> ds_cstr(_balancer_ips_v4),
> +/* Create a single ARP rule for all IPs that are used as VIPs. */
> +char *lb_ips_v4_as = xasprintf("$%s_lb_ip4", op->od->nbr->name);
> +build_lrouter_arp_flow(op->od, op, lb_ips_v4_as,
> REG_INPORT_ETH_ADDR,
> match, false, 90, NULL, lflows);
> -ds_destroy(_balancer_ips_v4);
> +free(lb_ips_v4_as);
>  }
>
>  if (sset_count(>od->lb_ips_v6)) {
>  ds_clear(match);
> -ds_clear(actions);
> -
> -struct ds load_balancer_ips_v6 = DS_EMPTY_INITIALIZER;
> -
> -ds_put_cstr(_balancer_ips_v6, "{ ");
> -ds_put_and_free_cstr(_balancer_ips_v6,
> - sset_join(>od->lb_ips_v6, ", ", " }"));
>
>  if (is_l3dgw_port(op)) {
>  ds_put_format(match, "is_chassis_resident(%s)",
>op->cr_port->json_key);
>  }
> -build_lrouter_nd_flow(op->od, op, "nd_na",
> -  ds_cstr(_balancer_ips_v6), NULL,
> +
> +/* Create a single ND rule for all IPs that are used as VIPs. */
> +char *lb_ips_v6_as = xasprintf("$%s_lb_ip6", op->od->nbr->name);
> +build_lrouter_nd_flow(op->od, op, "nd_na", lb_ips_v6_as, NULL,
>REG_INPORT_ETH_ADDR, match, false, 90,
>NULL, lflows, meter_groups);
> -
> -ds_destroy(_balancer_ips_v6);
> +free(lb_ips_v6_as);
>  }
>
>  if (!op->od->is_gw_router && !op->od->n_l3dgw_ports) {
> @@ -13541,7 +13530,7 @@ sync_address_set(struct northd_context *ctx, const 
> char *name,
>   * in OVN_Northbound, so that the address sets used in Logical_Flows in
>   * OVN_Southbound is checked against the proper set.*/
>  static void
> -sync_address_sets(struct northd_context *ctx)
> +sync_address_sets(struct northd_context *ctx, struct hmap *datapaths)
>  {
>  struct shash sb_address_sets = SHASH_INITIALIZER(_address_sets);
>
> @@ -13588,6 +13577,34 @@ sync_address_sets(struct northd_context *ctx)
>  svec_destroy(_addrs);
>  }
>
> +/* Sync router load balancer VIP generated address sets. */
> +struct ovn_datapath *od;
> +HMAP_FOR_EACH (od, key_node, datapaths) {
> +if (!od->nbr) {
> +continue;
> +}
> +
> +if (sset_count(>lb_ips_v4)) {
> +char *ipv4_addrs_name = xasprintf("%s_lb_ip4", od->nbr->name);
> +const char **ipv4_addrs = sset_array(>lb_ips_v4);
> +
> +sync_address_set(ctx, ipv4_addrs_name, ipv4_addrs,
> + sset_count(>lb_ips_v4), _address_sets);
> +free(ipv4_addrs_name);
> +free(ipv4_addrs);
> +}
> +
> +if (sset_count(>lb_ips_v6)) {
> +char *ipv6_addrs_name = xasprintf("%s_lb_ip6", od->nbr->name);
> +const char **ipv6_addrs = sset_array(>lb_ips_v6);
> +
> +sync_address_set(ctx, ipv6_addrs_name, ipv6_addrs,
> + sset_count(>lb_ips_v6), _address_sets);
> +free(ipv6_addrs_name);
> +free(ipv6_addrs);
> +}
> +}
> +
>  /* 

[ovs-dev] [PATCH ovn v6 4/4] ic: don't learn routes which have local GW

2021-10-05 Thread Vladislav Odintsov
In case we have ovn-ic-interconnected Logical_Routers
and install same ip_prefix route with GW in local AZ
in each LR in each AZ, this route would be learned in
other AZs and L3 loop is possible.
There could be next routes output:

[az1 ~]$ ovn-nbctl lr-route-list lr0
IPv4 Routes
Route Table global:
  128.0.0.0/1   169.254.1.1 dst-ip ecmp
  128.0.0.0/1 169.254.100.2 dst-ip (learned) ecmp

[az2 ~]$ ovn-nbctl lr-route-list lr0
IPv4 Routes
Route Table global:
  128.0.0.0/1   169.254.2.1 dst-ip ecmp
  128.0.0.0/1 169.254.100.1 dst-ip (learned) ecmp

So, there is a possible routing loop. Packets going
to 128.0.0.0/1 could go from AZ1 to AZ2 and on AZ2
they can be routed back.

This commit adds check for installed local (non-learned)
routes. If OVN IC route's ip_prefix, route_table are
the same with already installed non-learned NB route, such
route wouldn't be learned.

Signed-off-by: Vladislav Odintsov 
---
 ic/ovn-ic.c   | 30 --
 tests/ovn-ic.at   | 49 +++
 utilities/ovn-nbctl.c |  4 +++-
 3 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 02bd28b2b..5b37bb718 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -1197,7 +1197,25 @@ add_network_to_routes_ad(struct hmap *routes_ad, const 
char *network,
 }
 
 static bool
-route_need_learn(struct in6_addr *prefix, unsigned int plen,
+route_has_local_gw(const struct nbrec_logical_router *lr,
+   const char *route_table, const char *ip_prefix) {
+
+const struct nbrec_logical_router_static_route *route;
+for (int i = 0; i < lr->n_static_routes; i++) {
+route = lr->static_routes[i];
+if (!smap_get(>external_ids, "ic-learned-route") &&
+!strcmp(route->route_table, route_table) &&
+!strcmp(route->ip_prefix, ip_prefix)) {
+return true;
+}
+}
+return false;
+}
+
+static bool
+route_need_learn(const struct nbrec_logical_router *lr,
+ const struct icsbrec_route *isb_route,
+ struct in6_addr *prefix, unsigned int plen,
  const struct smap *nb_options)
 {
 if (!smap_get_bool(nb_options, "ic-route-learn", false)) {
@@ -1217,6 +1235,12 @@ route_need_learn(struct in6_addr *prefix, unsigned int 
plen,
 return false;
 }
 
+if (route_has_local_gw(lr, isb_route->route_table, isb_route->ip_prefix)) {
+VLOG_DBG("Skip learning %s (rtb:%s) route, as we've got one with "
+ "local GW", isb_route->ip_prefix, isb_route->route_table);
+return false;
+}
+
 return true;
 }
 
@@ -1321,9 +1345,11 @@ sync_learned_routes(struct ic_context *ctx,
  isb_route->nexthop);
 continue;
 }
-if (!route_need_learn(, plen, _global->options)) {
+if (!route_need_learn(ic_lr->lr, isb_route, , plen,
+  _global->options)) {
 continue;
 }
+
 struct ic_route_info *route_learned
 = ic_route_find(_lr->routes_learned, , plen,
 , isb_route->route_table);
diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at
index 5803f76e9..15560334d 100644
--- a/tests/ovn-ic.at
+++ b/tests/ovn-ic.at
@@ -870,3 +870,52 @@ OVN_CLEANUP_IC([az1], [az2])
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn-ic -- same routes destination])
+
+ovn_init_ic_db
+ovn-ic-nbctl ts-add ts1
+
+for i in 1 2; do
+ovn_start az$i
+ovn_as az$i
+
+# Enable route learning at AZ level
+ovn-nbctl set nb_global . options:ic-route-learn=true
+ovn-nbctl set nb_global . options:ic-route-learn-default=true
+# Enable route advertising at AZ level
+ovn-nbctl set nb_global . options:ic-route-adv=true
+ovn-nbctl set nb_global . options:ic-route-adv-default=true
+
+lr=lr1$i
+ovn-nbctl lr-add $lr
+
+lrp=lrp-$lr-ts1
+lsp=lsp-ts1-$lr
+# Create LRP and connect to TS
+ovn-nbctl lrp-add $lr $lrp aa:aa:aa:aa:aa:0$i 169.254.100.$i/24
+ovn-nbctl lsp-add ts1 $lsp \
+-- lsp-set-addresses $lsp router \
+-- lsp-set-type $lsp router \
+-- lsp-set-options $lsp router-port=$lrp
+ovn-nbctl lrp-add $lr lrp-local-subnet 00:00:00:00:00:0$i 192.168.$i.1/24
+ovn-nbctl list logical-router-static-route
+check ovn-nbctl lr-route-add $lr 10.0.0.0/24 192.168.$i.10
+check ovn-nbctl lr-route-add $lr 0.0.0.0/0 192.168.$i.11
+done
+
+AT_CHECK([ovn_as az1 ovn-nbctl lr-route-list lr11 | grep dst-ip | sort], [0], 
[dnl
+0.0.0.0/0  192.168.1.11 dst-ip
+  10.0.0.0/24  192.168.1.10 dst-ip
+   192.168.2.0/24 169.254.100.2 dst-ip (learned)
+])
+
+AT_CHECK([ovn_as az2 ovn-nbctl lr-route-list lr12 | grep dst-ip | sort], [0], 
[dnl
+

[ovs-dev] [PATCH ovn v6 3/4] ic: add support for routing tables in adv/learn routes

2021-10-05 Thread Vladislav Odintsov
Previously support for multiple routing tables was added
to northd code.
This commit expands support for multiple routing tables
by adding support of advertising and learning routes with
their routing table information.

To utilize such feature, user must:
1. create Logical Router in each AZ;
2. create IC transit switch for each routing table, that
   he/she needs;
3. connect each TS with this LR;
4. assign routing table for TS's LRP
   (ovn-nbctl lrp-set-options  route_table=<>);
5. enable routes sync (turn on learning and advertising
   routes in NB_Global table);
6. create LRPs for subnets in LR, create static routes
   with supplying route_table parameter.

Note 1: routes for directly-connected networks will be
learned to global routing table and if Logical Routers
have more than one Transit Switch, which interconnects
them, directly-connected routes will be added via each
transit switch port and configured as ECMP routes.

Note 2: static routes within route tables will be advertised
and learned only if interconnecting transit switch's LRPs
will have options:route_table same value as route's route_table
value.

Signed-off-by: Vladislav Odintsov 
Reviewed-by: Numan Siddique 
---
 NEWS|   4 +
 ic/ovn-ic.c | 540 
 ovn-ic-sb.ovsschema |   5 +-
 ovn-ic-sb.xml   |  18 ++
 tests/ovn-ic.at | 440 
 5 files changed, 811 insertions(+), 196 deletions(-)

diff --git a/NEWS b/NEWS
index 5e93f813a..dcb70f944 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@ OVN v21.09.0 - xx xxx 
   - Allow static routes without nexthops.
   - Enabled logical dp groups as a default.  CMS should disable it if not
 desired.
+  - Added support for multiple routing tables in Logical Router Static Routes
+and LRPs. OVN Interconnection supports routes' route tables as well.
+This requires to update schemas for OVN_Northdbound and OVN_IC_Southbound
+DBs.
 
 OVN v21.06.0 - 18 Jun 2021
 -
diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 303e93a4f..02bd28b2b 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -63,9 +63,11 @@ struct ic_context {
 struct ovsdb_idl_txn *ovninb_txn;
 struct ovsdb_idl_txn *ovnisb_txn;
 struct ovsdb_idl_index *nbrec_ls_by_name;
+struct ovsdb_idl_index *nbrec_lrp_by_name;
 struct ovsdb_idl_index *nbrec_port_by_name;
 struct ovsdb_idl_index *sbrec_chassis_by_name;
 struct ovsdb_idl_index *sbrec_port_binding_by_name;
+struct ovsdb_idl_index *icnbrec_transit_switch_by_name;
 struct ovsdb_idl_index *icsbrec_port_binding_by_az;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az;
@@ -773,7 +775,7 @@ port_binding_run(struct ic_context *ctx,
 icsbrec_port_binding_index_set_transit_switch(isb_pb_key, ts->name);
 
 ICSBREC_PORT_BINDING_FOR_EACH_EQUAL (isb_pb, isb_pb_key,
-ctx->icsbrec_port_binding_by_ts) {
+ ctx->icsbrec_port_binding_by_ts) {
 if (isb_pb->availability_zone == az) {
 shash_add(_pbs, isb_pb->logical_port, isb_pb);
 shash_find_and_delete(_all_local_pbs,
@@ -844,7 +846,9 @@ port_binding_run(struct ic_context *ctx,
 struct ic_router_info {
 struct hmap_node node;
 const struct nbrec_logical_router *lr; /* key of hmap */
-const struct icsbrec_port_binding *isb_pb;
+const struct icsbrec_port_binding **isb_pbs;
+size_t n_isb_pbs;
+size_t n_allocated_isb_pbs;
 struct hmap routes_learned;
 };
 
@@ -854,6 +858,7 @@ struct ic_route_info {
 struct in6_addr prefix;
 unsigned int plen;
 struct in6_addr nexthop;
+const char *route_table;
 
 /* Either nb_route or nb_lrp is set and the other one must be NULL.
  * - For a route that is learned from IC-SB, or a static route that is
@@ -875,13 +880,15 @@ ic_route_hash(const struct in6_addr *prefix, unsigned int 
plen,
 
 static struct ic_route_info *
 ic_route_find(struct hmap *routes, const struct in6_addr *prefix,
-  unsigned int plen, const struct in6_addr *nexthop)
+  unsigned int plen, const struct in6_addr *nexthop,
+  char *route_table)
 {
 struct ic_route_info *r;
 uint32_t hash = ic_route_hash(prefix, plen, nexthop);
 HMAP_FOR_EACH_WITH_HASH (r, node, hash, routes) {
 if (ipv6_addr_equals(>prefix, prefix) &&
 r->plen == plen &&
+!strcmp(r->route_table ? r->route_table : "", route_table) &&
 ipv6_addr_equals(>nexthop, nexthop)) {
 return r;
 }
@@ -926,11 +933,19 @@ add_to_routes_learned(struct hmap *routes_learned,
  , , )) {
 return false;
 }
+
+if (ic_route_find(routes_learned, , plen, ,
+  nb_route->route_table)) {
+/* Route is already added to learned in 

[ovs-dev] [PATCH ovn v6 2/4] northd, utils: support for RouteTables in LRs

2021-10-05 Thread Vladislav Odintsov
This patch extends Logical Router's routing functionality.
Now user may create multiple routing tables within a Logical Router
and assign them to Logical Router Ports.

Traffic coming from Logical Router Port with assigned route_table
is checked against global routes if any (Logical_Router_Static_Routes
whith empty route_table field), next against directly connected routes
and then Logical_Router_Static_Routes with same route_table value as
in Logical_Router_Port options:route_table field.

A new Logical Router ingress table #10 is added - IN_IP_ROUTING_PRE.
In this table packets which come from LRPs with configured
options:route_table field are checked against inport and in OVS
register 7 unique non-zero value identifying route table is written.

Then in 11th table IN_IP_ROUTING routes which have non-empty
`route_table` field are added with additional match on reg7 value
associated with appropriate route_table.

Signed-off-by: Vladislav Odintsov 
Acked-by: Numan Siddique 
---
 northd/northd.c | 159 ---
 northd/ovn-northd.8.xml |  63 --
 ovn-nb.ovsschema|   5 +-
 ovn-nb.xml  |  30 +++
 tests/ovn-ic.at |   4 +
 tests/ovn-nbctl.at  | 196 +-
 tests/ovn-northd.at |  76 ++-
 tests/ovn.at| 441 +++-
 utilities/ovn-nbctl.c   | 134 +++-
 9 files changed, 1041 insertions(+), 67 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 092eca829..6a020cb2e 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -148,15 +148,16 @@ enum ovn_stage {
 PIPELINE_STAGE(ROUTER, IN,  ECMP_STATEFUL,   7, "lr_in_ecmp_stateful") \
 PIPELINE_STAGE(ROUTER, IN,  ND_RA_OPTIONS,   8, "lr_in_nd_ra_options") \
 PIPELINE_STAGE(ROUTER, IN,  ND_RA_RESPONSE,  9, "lr_in_nd_ra_response") \
-PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  10, "lr_in_ip_routing")   \
-PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 11, "lr_in_ip_routing_ecmp") \
-PIPELINE_STAGE(ROUTER, IN,  POLICY,  12, "lr_in_policy")   \
-PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP, 13, "lr_in_policy_ecmp")  \
-PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 14, "lr_in_arp_resolve")  \
-PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN   ,  15, "lr_in_chk_pkt_len")  \
-PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS, 16, "lr_in_larger_pkts")  \
-PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 17, "lr_in_gw_redirect")  \
-PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 18, "lr_in_arp_request")  \
+PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_PRE,  10, "lr_in_ip_routing_pre")  \
+PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  11, "lr_in_ip_routing")  \
+PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 12, "lr_in_ip_routing_ecmp") \
+PIPELINE_STAGE(ROUTER, IN,  POLICY,  13, "lr_in_policy")  \
+PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP, 14, "lr_in_policy_ecmp") \
+PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 15, "lr_in_arp_resolve") \
+PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN, 16, "lr_in_chk_pkt_len") \
+PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS, 17, "lr_in_larger_pkts") \
+PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 18, "lr_in_gw_redirect") \
+PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 19, "lr_in_arp_request") \
   \
 /* Logical router egress stages. */   \
 PIPELINE_STAGE(ROUTER, OUT, UNDNAT,  0, "lr_out_undnat")\
@@ -225,6 +226,7 @@ enum ovn_stage {
 #define REG_NEXT_HOP_IPV6 "xxreg0"
 #define REG_SRC_IPV4 "reg1"
 #define REG_SRC_IPV6 "xxreg1"
+#define REG_ROUTE_TABLE_ID "reg7"
 
 #define REG_ORIG_TP_DPORT_ROUTER   "reg9[16..31]"
 
@@ -287,8 +289,9 @@ enum ovn_stage {
  * | R6  |UNUSED| X | | G | IN_IP_ROUTING)|
  * | |  | R | | 1 |   |
  * +-+--+ E | UNUSED  |   |   |
- * | R7  |UNUSED| G | |   |   |
- * | |  | 3 | |   |   |
+ * | R7  |  ROUTE_TABLE_ID  | G | |   |   |
+ * | | (>= IN_IP_ROUTING_PRE && | 3 | |   |   |
+ * | |  <= IN_IP_ROUTING)   |   | |   |   |
  * +-+--+---+-+---+---+
  * | R8  | ECMP_GROUP_ID|   | |
  * | | ECMP_MEMBER_ID   | X | |
@@ -8511,11 +8514,72 @@ cleanup:
 ds_destroy();
 }
 
+static uint32_t
+route_table_add(struct simap *route_tables, const char *route_table_name)
+{
+/* route table ids start from 1 */
+uint32_t rtb_id = simap_count(route_tables) + 1;
+
+if (rtb_id == UINT16_MAX) {
+static struct 

[ovs-dev] [PATCH ovn v6 1/4] ic: process only local port_bindings

2021-10-05 Thread Vladislav Odintsov
This commit adds a small optimization by utilizing ovsdb_index
to iterate over port_bindings.
Prior to this change each iteration checked availability_zone
and continued processing only if port_binding belons to local AZ.

Now we run against port_bindings from local AZ only and don't check
availability_zone.

Signed-off-by: Vladislav Odintsov 
Acked-by: Numan Siddique 
---
 ic/ovn-ic.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 99356253d..303e93a4f 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -68,6 +68,7 @@ struct ic_context {
 struct ovsdb_idl_index *sbrec_port_binding_by_name;
 struct ovsdb_idl_index *icsbrec_port_binding_by_az;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
+struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az;
 struct ovsdb_idl_index *icsbrec_route_by_ts;
 struct ovsdb_idl_index *icsbrec_route_by_ts_az;
 };
@@ -1386,17 +1387,14 @@ route_run(struct ic_context *ctx,
 const struct icsbrec_port_binding *isb_pb;
 const struct icsbrec_port_binding *isb_pb_key =
 icsbrec_port_binding_index_init_row(
-ctx->icsbrec_port_binding_by_ts);
+ctx->icsbrec_port_binding_by_ts_az);
 icsbrec_port_binding_index_set_transit_switch(isb_pb_key, ts->name);
+icsbrec_port_binding_index_set_availability_zone(isb_pb_key, az);
 
 /* Each port on TS maps to a logical router, which is stored in the
  * external_ids:router-id of the IC SB port_binding record. */
 ICSBREC_PORT_BINDING_FOR_EACH_EQUAL (isb_pb, isb_pb_key,
- ctx->icsbrec_port_binding_by_ts) {
-if (isb_pb->availability_zone != az) {
-continue;
-}
-
+ctx->icsbrec_port_binding_by_ts_az) {
 const char *ts_lrp_name =
 get_lrp_name_by_ts_port_name(ctx, isb_pb->logical_port);
 if (!ts_lrp_name) {
@@ -1713,6 +1711,11 @@ main(int argc, char *argv[])
 = ovsdb_idl_index_create1(ovnisb_idl_loop.idl,
   _port_binding_col_transit_switch);
 
+struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az
+= ovsdb_idl_index_create2(ovnisb_idl_loop.idl,
+  _port_binding_col_transit_switch,
+  _port_binding_col_availability_zone);
+
 struct ovsdb_idl_index *icsbrec_route_by_ts
 = ovsdb_idl_index_create1(ovnisb_idl_loop.idl,
   _route_col_transit_switch);
@@ -1763,6 +1766,7 @@ main(int argc, char *argv[])
 .sbrec_chassis_by_name = sbrec_chassis_by_name,
 .icsbrec_port_binding_by_az = icsbrec_port_binding_by_az,
 .icsbrec_port_binding_by_ts = icsbrec_port_binding_by_ts,
+.icsbrec_port_binding_by_ts_az = icsbrec_port_binding_by_ts_az,
 .icsbrec_route_by_ts = icsbrec_route_by_ts,
 .icsbrec_route_by_ts_az = icsbrec_route_by_ts_az,
 };
-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v6 0/4] Add multiple routing tables support to Logical Routers

2021-10-05 Thread Vladislav Odintsov
v5 -> v6:
  - 2 memory leak bugs were fixed in ovn-ic code, which were introduced in
previous versions of #3 patch.

v4 -> v5:
  - Addressed Numan's review comments.

v3 -> v4:
  - Minor logging typo fixes.
  - Added patch with ovn-ic routes learning bugfix.

v2 -> v3:
  - Rebased on split northd changes.
  - Replaced route_tables HMAP with SIMAP as Numan suggested.
  - This series stil doesn't have ddlog support yet.
It will take too much time for me to deal to ddlog language and specifics.
Help with ddlog implementation wanted.

v1 -> v2:
  - First patch of v1 patch series was applied, but new tests for new feature
were added with strict table number check. Update this tests to be table
number-independent.
  - Squash pathes for northd and utilities as tests don't pass without latter.
  - Add support for OVN IC routing table in routes advertisement/learning.
  - Patches `ic: remove port_binding on ts deletion`

https://patchwork.ozlabs.org/project/ovn/patch/20210824184442.35063-1-odiv...@gmail.com/
and `ic: process only local port_bindings`

https://patchwork.ozlabs.org/project/ovn/patch/20210830195707.98529-1-odiv...@gmail.com/
were already sent to list separately, but other changes are based on them
so they're included.
Once those patches are accepts, I can drop them from this series.
  - Added NEWS item.
  - Added myself to authors list.

Vladislav Odintsov (4):
  ic: process only local port_bindings
  northd,utils: support for RouteTables in LRs
  ic: add support for routing tables in adv/learn routes
  ic: don't learn routes which have local GW

 NEWS|   4 +
 ic/ovn-ic.c | 576 ++--
 northd/northd.c | 159 ---
 northd/ovn-northd.8.xml |  63 -
 ovn-ic-sb.ovsschema |   5 +-
 ovn-ic-sb.xml   |  18 ++
 ovn-nb.ovsschema|   5 +-
 ovn-nb.xml  |  30 +++
 tests/ovn-ic.at | 493 ++
 tests/ovn-nbctl.at  | 196 +-
 tests/ovn-northd.at |  76 +-
 tests/ovn.at| 441 +-
 utilities/ovn-nbctl.c   | 138 +-
 13 files changed, 1937 insertions(+), 267 deletions(-)

-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v5 3/4] ic: add support for routing tables in adv/learn routes

2021-10-05 Thread Odintsov Vladislav
I’m sorry for this spam — I was trying to replace one patch in existing patch 
series, because found and fixed a bug.
I’ll resend v6.

Regards,
Vladislav Odintsov

On 5 Oct 2021, at 21:43, Vladislav Odintsov 
mailto:odiv...@gmail.com>> wrote:

Previously support for multiple routing tables was added
to northd code.
This commit expands support for multiple routing tables
by adding support of advertising and learning routes with
their routing table information.

To utilize such feature, user must:
1. create Logical Router in each AZ;
2. create IC transit switch for each routing table, that
  he/she needs;
3. connect each TS with this LR;
4. assign routing table for TS's LRP
  (ovn-nbctl lrp-set-options  route_table=<>);
5. enable routes sync (turn on learning and advertising
  routes in NB_Global table);
6. create LRPs for subnets in LR, create static routes
  with supplying route_table parameter.

Note 1: routes for directly-connected networks will be
learned to global routing table and if Logical Routers
have more than one Transit Switch, which interconnects
them, directly-connected routes will be added via each
transit switch port and configured as ECMP routes.

Note 2: static routes within route tables will be advertised
and learned only if interconnecting transit switch's LRPs
will have options:route_table same value as route's route_table
value.

Signed-off-by: Vladislav Odintsov mailto:odiv...@gmail.com>>
Reviewed-by: Numan Siddique mailto:num...@ovn.org>>
---
NEWS|   4 +
ic/ovn-ic.c | 540 
ovn-ic-sb.ovsschema |   5 +-
ovn-ic-sb.xml   |  18 ++
tests/ovn-ic.at | 440 
5 files changed, 811 insertions(+), 196 deletions(-)

diff --git a/NEWS b/NEWS
index 5e93f813a..dcb70f944 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@ OVN v21.09.0 - xx xxx 
  - Allow static routes without nexthops.
  - Enabled logical dp groups as a default.  CMS should disable it if not
desired.
+  - Added support for multiple routing tables in Logical Router Static Routes
+and LRPs. OVN Interconnection supports routes' route tables as well.
+This requires to update schemas for OVN_Northdbound and OVN_IC_Southbound
+DBs.

OVN v21.06.0 - 18 Jun 2021
-
diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 303e93a4f..02bd28b2b 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -63,9 +63,11 @@ struct ic_context {
struct ovsdb_idl_txn *ovninb_txn;
struct ovsdb_idl_txn *ovnisb_txn;
struct ovsdb_idl_index *nbrec_ls_by_name;
+struct ovsdb_idl_index *nbrec_lrp_by_name;
struct ovsdb_idl_index *nbrec_port_by_name;
struct ovsdb_idl_index *sbrec_chassis_by_name;
struct ovsdb_idl_index *sbrec_port_binding_by_name;
+struct ovsdb_idl_index *icnbrec_transit_switch_by_name;
struct ovsdb_idl_index *icsbrec_port_binding_by_az;
struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az;
@@ -773,7 +775,7 @@ port_binding_run(struct ic_context *ctx,
icsbrec_port_binding_index_set_transit_switch(isb_pb_key, ts->name);

ICSBREC_PORT_BINDING_FOR_EACH_EQUAL (isb_pb, isb_pb_key,
-ctx->icsbrec_port_binding_by_ts) {
+ ctx->icsbrec_port_binding_by_ts) {
if (isb_pb->availability_zone == az) {
shash_add(_pbs, isb_pb->logical_port, isb_pb);
shash_find_and_delete(_all_local_pbs,
@@ -844,7 +846,9 @@ port_binding_run(struct ic_context *ctx,
struct ic_router_info {
struct hmap_node node;
const struct nbrec_logical_router *lr; /* key of hmap */
-const struct icsbrec_port_binding *isb_pb;
+const struct icsbrec_port_binding **isb_pbs;
+size_t n_isb_pbs;
+size_t n_allocated_isb_pbs;
struct hmap routes_learned;
};

@@ -854,6 +858,7 @@ struct ic_route_info {
struct in6_addr prefix;
unsigned int plen;
struct in6_addr nexthop;
+const char *route_table;

/* Either nb_route or nb_lrp is set and the other one must be NULL.
 * - For a route that is learned from IC-SB, or a static route that is
@@ -875,13 +880,15 @@ ic_route_hash(const struct in6_addr *prefix, unsigned int 
plen,

static struct ic_route_info *
ic_route_find(struct hmap *routes, const struct in6_addr *prefix,
-  unsigned int plen, const struct in6_addr *nexthop)
+  unsigned int plen, const struct in6_addr *nexthop,
+  char *route_table)
{
struct ic_route_info *r;
uint32_t hash = ic_route_hash(prefix, plen, nexthop);
HMAP_FOR_EACH_WITH_HASH (r, node, hash, routes) {
if (ipv6_addr_equals(>prefix, prefix) &&
r->plen == plen &&
+!strcmp(r->route_table ? r->route_table : "", route_table) &&
ipv6_addr_equals(>nexthop, nexthop)) {
return r;
}
@@ -926,11 +933,19 @@ 

[ovs-dev] [PATCH ovn v5 3/4] ic: add support for routing tables in adv/learn routes

2021-10-05 Thread Vladislav Odintsov
Previously support for multiple routing tables was added
to northd code.
This commit expands support for multiple routing tables
by adding support of advertising and learning routes with
their routing table information.

To utilize such feature, user must:
1. create Logical Router in each AZ;
2. create IC transit switch for each routing table, that
   he/she needs;
3. connect each TS with this LR;
4. assign routing table for TS's LRP
   (ovn-nbctl lrp-set-options  route_table=<>);
5. enable routes sync (turn on learning and advertising
   routes in NB_Global table);
6. create LRPs for subnets in LR, create static routes
   with supplying route_table parameter.

Note 1: routes for directly-connected networks will be
learned to global routing table and if Logical Routers
have more than one Transit Switch, which interconnects
them, directly-connected routes will be added via each
transit switch port and configured as ECMP routes.

Note 2: static routes within route tables will be advertised
and learned only if interconnecting transit switch's LRPs
will have options:route_table same value as route's route_table
value.

Signed-off-by: Vladislav Odintsov 
Reviewed-by: Numan Siddique 
---
 NEWS|   4 +
 ic/ovn-ic.c | 540 
 ovn-ic-sb.ovsschema |   5 +-
 ovn-ic-sb.xml   |  18 ++
 tests/ovn-ic.at | 440 
 5 files changed, 811 insertions(+), 196 deletions(-)

diff --git a/NEWS b/NEWS
index 5e93f813a..dcb70f944 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@ OVN v21.09.0 - xx xxx 
   - Allow static routes without nexthops.
   - Enabled logical dp groups as a default.  CMS should disable it if not
 desired.
+  - Added support for multiple routing tables in Logical Router Static Routes
+and LRPs. OVN Interconnection supports routes' route tables as well.
+This requires to update schemas for OVN_Northdbound and OVN_IC_Southbound
+DBs.
 
 OVN v21.06.0 - 18 Jun 2021
 -
diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 303e93a4f..02bd28b2b 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -63,9 +63,11 @@ struct ic_context {
 struct ovsdb_idl_txn *ovninb_txn;
 struct ovsdb_idl_txn *ovnisb_txn;
 struct ovsdb_idl_index *nbrec_ls_by_name;
+struct ovsdb_idl_index *nbrec_lrp_by_name;
 struct ovsdb_idl_index *nbrec_port_by_name;
 struct ovsdb_idl_index *sbrec_chassis_by_name;
 struct ovsdb_idl_index *sbrec_port_binding_by_name;
+struct ovsdb_idl_index *icnbrec_transit_switch_by_name;
 struct ovsdb_idl_index *icsbrec_port_binding_by_az;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az;
@@ -773,7 +775,7 @@ port_binding_run(struct ic_context *ctx,
 icsbrec_port_binding_index_set_transit_switch(isb_pb_key, ts->name);
 
 ICSBREC_PORT_BINDING_FOR_EACH_EQUAL (isb_pb, isb_pb_key,
-ctx->icsbrec_port_binding_by_ts) {
+ ctx->icsbrec_port_binding_by_ts) {
 if (isb_pb->availability_zone == az) {
 shash_add(_pbs, isb_pb->logical_port, isb_pb);
 shash_find_and_delete(_all_local_pbs,
@@ -844,7 +846,9 @@ port_binding_run(struct ic_context *ctx,
 struct ic_router_info {
 struct hmap_node node;
 const struct nbrec_logical_router *lr; /* key of hmap */
-const struct icsbrec_port_binding *isb_pb;
+const struct icsbrec_port_binding **isb_pbs;
+size_t n_isb_pbs;
+size_t n_allocated_isb_pbs;
 struct hmap routes_learned;
 };
 
@@ -854,6 +858,7 @@ struct ic_route_info {
 struct in6_addr prefix;
 unsigned int plen;
 struct in6_addr nexthop;
+const char *route_table;
 
 /* Either nb_route or nb_lrp is set and the other one must be NULL.
  * - For a route that is learned from IC-SB, or a static route that is
@@ -875,13 +880,15 @@ ic_route_hash(const struct in6_addr *prefix, unsigned int 
plen,
 
 static struct ic_route_info *
 ic_route_find(struct hmap *routes, const struct in6_addr *prefix,
-  unsigned int plen, const struct in6_addr *nexthop)
+  unsigned int plen, const struct in6_addr *nexthop,
+  char *route_table)
 {
 struct ic_route_info *r;
 uint32_t hash = ic_route_hash(prefix, plen, nexthop);
 HMAP_FOR_EACH_WITH_HASH (r, node, hash, routes) {
 if (ipv6_addr_equals(>prefix, prefix) &&
 r->plen == plen &&
+!strcmp(r->route_table ? r->route_table : "", route_table) &&
 ipv6_addr_equals(>nexthop, nexthop)) {
 return r;
 }
@@ -926,11 +933,19 @@ add_to_routes_learned(struct hmap *routes_learned,
  , , )) {
 return false;
 }
+
+if (ic_route_find(routes_learned, , plen, ,
+  nb_route->route_table)) {
+/* Route is already added to learned in 

[ovs-dev] [PATCH ovn v5 5/4] ic: add support for routing tables in adv/learn routes

2021-10-05 Thread Vladislav Odintsov
Previously support for multiple routing tables was added
to northd code.
This commit expands support for multiple routing tables
by adding support of advertising and learning routes with
their routing table information.

To utilize such feature, user must:
1. create Logical Router in each AZ;
2. create IC transit switch for each routing table, that
   he/she needs;
3. connect each TS with this LR;
4. assign routing table for TS's LRP
   (ovn-nbctl lrp-set-options  route_table=<>);
5. enable routes sync (turn on learning and advertising
   routes in NB_Global table);
6. create LRPs for subnets in LR, create static routes
   with supplying route_table parameter.

Note 1: routes for directly-connected networks will be
learned to global routing table and if Logical Routers
have more than one Transit Switch, which interconnects
them, directly-connected routes will be added via each
transit switch port and configured as ECMP routes.

Note 2: static routes within route tables will be advertised
and learned only if interconnecting transit switch's LRPs
will have options:route_table same value as route's route_table
value.

Signed-off-by: Vladislav Odintsov 
Reviewed-by: Numan Siddique 
---
 NEWS|   4 +
 ic/ovn-ic.c | 540 
 ovn-ic-sb.ovsschema |   5 +-
 ovn-ic-sb.xml   |  18 ++
 tests/ovn-ic.at | 440 
 5 files changed, 811 insertions(+), 196 deletions(-)

diff --git a/NEWS b/NEWS
index 5e93f813a..dcb70f944 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@ OVN v21.09.0 - xx xxx 
   - Allow static routes without nexthops.
   - Enabled logical dp groups as a default.  CMS should disable it if not
 desired.
+  - Added support for multiple routing tables in Logical Router Static Routes
+and LRPs. OVN Interconnection supports routes' route tables as well.
+This requires to update schemas for OVN_Northdbound and OVN_IC_Southbound
+DBs.
 
 OVN v21.06.0 - 18 Jun 2021
 -
diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 303e93a4f..02bd28b2b 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -63,9 +63,11 @@ struct ic_context {
 struct ovsdb_idl_txn *ovninb_txn;
 struct ovsdb_idl_txn *ovnisb_txn;
 struct ovsdb_idl_index *nbrec_ls_by_name;
+struct ovsdb_idl_index *nbrec_lrp_by_name;
 struct ovsdb_idl_index *nbrec_port_by_name;
 struct ovsdb_idl_index *sbrec_chassis_by_name;
 struct ovsdb_idl_index *sbrec_port_binding_by_name;
+struct ovsdb_idl_index *icnbrec_transit_switch_by_name;
 struct ovsdb_idl_index *icsbrec_port_binding_by_az;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az;
@@ -773,7 +775,7 @@ port_binding_run(struct ic_context *ctx,
 icsbrec_port_binding_index_set_transit_switch(isb_pb_key, ts->name);
 
 ICSBREC_PORT_BINDING_FOR_EACH_EQUAL (isb_pb, isb_pb_key,
-ctx->icsbrec_port_binding_by_ts) {
+ ctx->icsbrec_port_binding_by_ts) {
 if (isb_pb->availability_zone == az) {
 shash_add(_pbs, isb_pb->logical_port, isb_pb);
 shash_find_and_delete(_all_local_pbs,
@@ -844,7 +846,9 @@ port_binding_run(struct ic_context *ctx,
 struct ic_router_info {
 struct hmap_node node;
 const struct nbrec_logical_router *lr; /* key of hmap */
-const struct icsbrec_port_binding *isb_pb;
+const struct icsbrec_port_binding **isb_pbs;
+size_t n_isb_pbs;
+size_t n_allocated_isb_pbs;
 struct hmap routes_learned;
 };
 
@@ -854,6 +858,7 @@ struct ic_route_info {
 struct in6_addr prefix;
 unsigned int plen;
 struct in6_addr nexthop;
+const char *route_table;
 
 /* Either nb_route or nb_lrp is set and the other one must be NULL.
  * - For a route that is learned from IC-SB, or a static route that is
@@ -875,13 +880,15 @@ ic_route_hash(const struct in6_addr *prefix, unsigned int 
plen,
 
 static struct ic_route_info *
 ic_route_find(struct hmap *routes, const struct in6_addr *prefix,
-  unsigned int plen, const struct in6_addr *nexthop)
+  unsigned int plen, const struct in6_addr *nexthop,
+  char *route_table)
 {
 struct ic_route_info *r;
 uint32_t hash = ic_route_hash(prefix, plen, nexthop);
 HMAP_FOR_EACH_WITH_HASH (r, node, hash, routes) {
 if (ipv6_addr_equals(>prefix, prefix) &&
 r->plen == plen &&
+!strcmp(r->route_table ? r->route_table : "", route_table) &&
 ipv6_addr_equals(>nexthop, nexthop)) {
 return r;
 }
@@ -926,11 +933,19 @@ add_to_routes_learned(struct hmap *routes_learned,
  , , )) {
 return false;
 }
+
+if (ic_route_find(routes_learned, , plen, ,
+  nb_route->route_table)) {
+/* Route is already added to learned in 

Re: [ovs-dev] [PATCH ovn] MAINTAINERS: Transition myself to emeritus status.

2021-10-05 Thread Numan Siddique
On Tue, Oct 5, 2021 at 1:46 PM Han Zhou  wrote:
>
> On Tue, Oct 5, 2021 at 9:39 AM Ben Pfaff  wrote:
> >
> > I have not been actively maintaining OVN for some time now.  I don't
> expect
> > that to change.  I think that the responsible thing to do is to
> > acknowledge this properly by transitioning to emeritus status.
> >
> > Signed-off-by: Ben Pfaff 
> > ---
> >  MAINTAINERS.rst | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/MAINTAINERS.rst b/MAINTAINERS.rst
> > index 0d19bd622..fff2cd977 100644
> > --- a/MAINTAINERS.rst
> > +++ b/MAINTAINERS.rst
> > @@ -41,8 +41,6 @@ This is the current list of active OVN committers:
> >
> > * - Name
> >   - Email
> > -   * - Ben Pfaff
> > - - b...@ovn.org
> > * - Gurucharan Shetty
> >   - g...@ovn.org
> > * - Han Zhou
> > @@ -67,3 +65,5 @@ More information about Emeritus Committers can be found
> >
> > * - Name
> >   - Email
> > +   * - Ben Pfaff
> > + - b...@ovn.org
> > --
> > 2.31.1
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> Hi Ben,
>
> The patch LGTM, although I really don't like the solution :(
> Thank you so much for building the amazing stuff from scratch!
>

Agree, Thank you for all your amazing work!

Thanks
Numan


> Acked-by: Han Zhou 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ipf: release unhandled packets from the batch

2021-10-05 Thread Aaron Conole
Since 640d4db788ed ("ipf: Fix a use-after-free error, ...") the ipf
framework unconditionally allocates a new dp_packet to track
individual fragments.  This prevents a use-after-free.  However, an
additional issue was present - even when the packet buffer is cloned,
if the ip fragment handling code keeps it, the original buffer is
leaked during the refill loop.  Even in the original processing code,
the hardcoded dnsteal branches would always leak a packet buffer from
the refill loop.

This can be confirmed with valgrind:

==717566== 16,672 (4,480 direct, 12,192 indirect) bytes in 8 blocks are 
definitely lost in loss record 390 of 390
==717566==at 0x484086F: malloc (vg_replace_malloc.c:380)
==717566==by 0x537BFD: xmalloc__ (util.c:137)
==717566==by 0x537BFD: xmalloc (util.c:172)
==717566==by 0x46DDD4: dp_packet_new (dp-packet.c:153)
==717566==by 0x46DDD4: dp_packet_new_with_headroom (dp-packet.c:163)
==717566==by 0x550AA6: netdev_linux_batch_rxq_recv_sock.constprop.0 
(netdev-linux.c:1262)
==717566==by 0x5512AF: netdev_linux_rxq_recv (netdev-linux.c:1511)
==717566==by 0x4AB7E0: netdev_rxq_recv (netdev.c:727)
==717566==by 0x47F00D: dp_netdev_process_rxq_port (dpif-netdev.c:4699)
==717566==by 0x47FD13: dpif_netdev_run (dpif-netdev.c:5957)
==717566==by 0x4331D2: type_run (ofproto-dpif.c:370)
==717566==by 0x41DFD8: ofproto_type_run (ofproto.c:1768)
==717566==by 0x40A7FB: bridge_run__ (bridge.c:3245)
==717566==by 0x411269: bridge_run (bridge.c:3310)
==717566==by 0x406E6C: main (ovs-vswitchd.c:127)

The fix is to delete the original packet when it isn't able to be
reinserted into the packet batch.  Subsequent valgrind runs show that
the packets are not leaked from the batch any longer.

Fixes: 640d4db788ed ("ipf: Fix a use-after-free error, and remove the 
'do_not_steal' flag.")
Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
Reported-by: Wan Junjie 
Reported-at: https://github.com/openvswitch/ovs-issues/issues/226
Signed-off-by: Aaron Conole 
---
 lib/ipf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/ipf.c b/lib/ipf.c
index 665f40fefe..507db2aea2 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -943,6 +943,8 @@ ipf_extract_frags_from_batch(struct ipf *ipf, struct 
dp_packet_batch *pb,
 ovs_mutex_lock(>ipf_lock);
 if (!ipf_handle_frag(ipf, pkt, dl_type, zone, now, hash_basis)) {
 dp_packet_batch_refill(pb, pkt, pb_idx);
+} else {
+dp_packet_delete(pkt);
 }
 ovs_mutex_unlock(>ipf_lock);
 } else {
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 3/4] ic: add support for routing tables in adv/learn routes

2021-10-05 Thread Vladislav Odintsov
Previously support for multiple routing tables was added
to northd code.
This commit expands support for multiple routing tables
by adding support of advertising and learning routes with
their routing table information.

To utilize such feature, user must:
1. create Logical Router in each AZ;
2. create IC transit switch for each routing table, that
   he/she needs;
3. connect each TS with this LR;
4. assign routing table for TS's LRP
   (ovn-nbctl lrp-set-options  route_table=<>);
5. enable routes sync (turn on learning and advertising
   routes in NB_Global table);
6. create LRPs for subnets in LR, create static routes
   with supplying route_table parameter.

Note 1: routes for directly-connected networks will be
learned to global routing table and if Logical Routers
have more than one Transit Switch, which interconnects
them, directly-connected routes will be added via each
transit switch port and configured as ECMP routes.

Note 2: static routes within route tables will be advertised
and learned only if interconnecting transit switch's LRPs
will have options:route_table same value as route's route_table
value.

Signed-off-by: Vladislav Odintsov 
Reviewed-by: Numan Siddique 
---
 NEWS|   4 +
 ic/ovn-ic.c | 540 
 ovn-ic-sb.ovsschema |   5 +-
 ovn-ic-sb.xml   |  18 ++
 tests/ovn-ic.at | 440 
 5 files changed, 811 insertions(+), 196 deletions(-)

diff --git a/NEWS b/NEWS
index 5e93f813a..dcb70f944 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@ OVN v21.09.0 - xx xxx 
   - Allow static routes without nexthops.
   - Enabled logical dp groups as a default.  CMS should disable it if not
 desired.
+  - Added support for multiple routing tables in Logical Router Static Routes
+and LRPs. OVN Interconnection supports routes' route tables as well.
+This requires to update schemas for OVN_Northdbound and OVN_IC_Southbound
+DBs.
 
 OVN v21.06.0 - 18 Jun 2021
 -
diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 303e93a4f..02bd28b2b 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -63,9 +63,11 @@ struct ic_context {
 struct ovsdb_idl_txn *ovninb_txn;
 struct ovsdb_idl_txn *ovnisb_txn;
 struct ovsdb_idl_index *nbrec_ls_by_name;
+struct ovsdb_idl_index *nbrec_lrp_by_name;
 struct ovsdb_idl_index *nbrec_port_by_name;
 struct ovsdb_idl_index *sbrec_chassis_by_name;
 struct ovsdb_idl_index *sbrec_port_binding_by_name;
+struct ovsdb_idl_index *icnbrec_transit_switch_by_name;
 struct ovsdb_idl_index *icsbrec_port_binding_by_az;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az;
@@ -773,7 +775,7 @@ port_binding_run(struct ic_context *ctx,
 icsbrec_port_binding_index_set_transit_switch(isb_pb_key, ts->name);
 
 ICSBREC_PORT_BINDING_FOR_EACH_EQUAL (isb_pb, isb_pb_key,
-ctx->icsbrec_port_binding_by_ts) {
+ ctx->icsbrec_port_binding_by_ts) {
 if (isb_pb->availability_zone == az) {
 shash_add(_pbs, isb_pb->logical_port, isb_pb);
 shash_find_and_delete(_all_local_pbs,
@@ -844,7 +846,9 @@ port_binding_run(struct ic_context *ctx,
 struct ic_router_info {
 struct hmap_node node;
 const struct nbrec_logical_router *lr; /* key of hmap */
-const struct icsbrec_port_binding *isb_pb;
+const struct icsbrec_port_binding **isb_pbs;
+size_t n_isb_pbs;
+size_t n_allocated_isb_pbs;
 struct hmap routes_learned;
 };
 
@@ -854,6 +858,7 @@ struct ic_route_info {
 struct in6_addr prefix;
 unsigned int plen;
 struct in6_addr nexthop;
+const char *route_table;
 
 /* Either nb_route or nb_lrp is set and the other one must be NULL.
  * - For a route that is learned from IC-SB, or a static route that is
@@ -875,13 +880,15 @@ ic_route_hash(const struct in6_addr *prefix, unsigned int 
plen,
 
 static struct ic_route_info *
 ic_route_find(struct hmap *routes, const struct in6_addr *prefix,
-  unsigned int plen, const struct in6_addr *nexthop)
+  unsigned int plen, const struct in6_addr *nexthop,
+  char *route_table)
 {
 struct ic_route_info *r;
 uint32_t hash = ic_route_hash(prefix, plen, nexthop);
 HMAP_FOR_EACH_WITH_HASH (r, node, hash, routes) {
 if (ipv6_addr_equals(>prefix, prefix) &&
 r->plen == plen &&
+!strcmp(r->route_table ? r->route_table : "", route_table) &&
 ipv6_addr_equals(>nexthop, nexthop)) {
 return r;
 }
@@ -926,11 +933,19 @@ add_to_routes_learned(struct hmap *routes_learned,
  , , )) {
 return false;
 }
+
+if (ic_route_find(routes_learned, , plen, ,
+  nb_route->route_table)) {
+/* Route is already added to learned in 

Re: [ovs-dev] [PATCH ovn] MAINTAINERS: Transition myself to emeritus status.

2021-10-05 Thread Han Zhou
On Tue, Oct 5, 2021 at 9:39 AM Ben Pfaff  wrote:
>
> I have not been actively maintaining OVN for some time now.  I don't
expect
> that to change.  I think that the responsible thing to do is to
> acknowledge this properly by transitioning to emeritus status.
>
> Signed-off-by: Ben Pfaff 
> ---
>  MAINTAINERS.rst | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/MAINTAINERS.rst b/MAINTAINERS.rst
> index 0d19bd622..fff2cd977 100644
> --- a/MAINTAINERS.rst
> +++ b/MAINTAINERS.rst
> @@ -41,8 +41,6 @@ This is the current list of active OVN committers:
>
> * - Name
>   - Email
> -   * - Ben Pfaff
> - - b...@ovn.org
> * - Gurucharan Shetty
>   - g...@ovn.org
> * - Han Zhou
> @@ -67,3 +65,5 @@ More information about Emeritus Committers can be found
>
> * - Name
>   - Email
> +   * - Ben Pfaff
> + - b...@ovn.org
> --
> 2.31.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Hi Ben,

The patch LGTM, although I really don't like the solution :(
Thank you so much for building the amazing stuff from scratch!

Acked-by: Han Zhou 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [OVN] branch name renamed from 'master' to 'main'

2021-10-05 Thread Numan Siddique
Hello everyone,

The default branch of OVN has been renamed from 'master' to 'main'.  I
had brought this up
 for discussion in our weekly upstream OVN meeting a couple of weeks
ago and the attendees were supportive of it.

I followed the instructions from here -
https://github.com/github/renaming  for the renaming.

Request all the developers and users of OVN code base to update the
default branch locally.

Please let me know if this caused any issues so that this can be resolved soon.

Thanks
Numan
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 1/2] ovn-parallel-hmap.h: Minor fixes for hashrow_lock.

2021-10-05 Thread Han Zhou
On Tue, Oct 5, 2021 at 9:39 AM Numan Siddique  wrote:
>
> On Mon, Oct 4, 2021 at 5:34 AM Anton Ivanov
>  wrote:
> >
> >
> > On 04/10/2021 02:37, Han Zhou wrote:
> > > Although not used currently, it is better to fix:
> > > 1. The type of the mask field should be the same as hmap->mask:
size_t.
> > > 2. Calculating the index is better to use & instead of %.
> > >
> > > Signed-off-by: Han Zhou 
> > > ---
> > >   lib/ovn-parallel-hmap.h | 6 +++---
> > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/ovn-parallel-hmap.h b/lib/ovn-parallel-hmap.h
> > > index 2df132ea8..897208ef8 100644
> > > --- a/lib/ovn-parallel-hmap.h
> > > +++ b/lib/ovn-parallel-hmap.h
> > > @@ -224,7 +224,7 @@ static inline void
wait_for_work_completion(struct worker_pool *pool)
> > >*/
> > >
> > >   struct hashrow_locks {
> > > -ssize_t mask;
> > > +size_t mask;
> > >   struct ovs_mutex *row_locks;
> > >   };
> > >
> > > @@ -235,13 +235,13 @@ void ovn_update_hashrow_locks(struct hmap
*lflows, struct hashrow_locks *hrl);
> > >   /* Lock a hash row */
> > >   static inline void lock_hash_row(struct hashrow_locks *hrl,
uint32_t hash)
> > >   {
> > > -ovs_mutex_lock(>row_locks[hash % hrl->mask]);
> > > +ovs_mutex_lock(>row_locks[hash & hrl->mask]);
> > >   }
> > >
> > >   /* Unlock a hash row */
> > >   static inline void unlock_hash_row(struct hashrow_locks *hrl,
uint32_t hash)
> > >   {
> > > -ovs_mutex_unlock(>row_locks[hash % hrl->mask]);
> > > +ovs_mutex_unlock(>row_locks[hash & hrl->mask]);
> > >   }
> > >
> > >   /* Init the row locks structure */
> >
> > +1.
>
> Acked-by: Numan Siddique 
>
Thanks Anton and Numan. I applied it to the main branch.

> Numan
>
> >
> > --
> > Anton R. Ivanov
> > Cambridgegreys Limited. Registered in England. Company Number 10273661
> > https://www.cambridgegreys.com/
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 3/3] northd.c: Optimize parallel build performance with hash based locks.

2021-10-05 Thread Han Zhou
On Tue, Oct 5, 2021 at 9:37 AM Numan Siddique  wrote:
>
> On Mon, Oct 4, 2021 at 1:28 PM Han Zhou  wrote:
> >
> > On Mon, Oct 4, 2021 at 9:57 AM Anton Ivanov <
anton.iva...@cambridgegreys.com>
> > wrote:
> > >
> > >
> > > On 04/10/2021 16:58, Han Zhou wrote:
> > >
> > >
> > >
> > > On Mon, Oct 4, 2021 at 2:31 AM Anton Ivanov <
> > anton.iva...@cambridgegreys.com> wrote:
> > > >
> > > >
> > > > On 03/10/2021 23:45, Han Zhou wrote:
> > > > > The current implementation of parallel build in northd with
dp-groups
> > > > > enabled results in bad performance when the below assumption is
not
> > > > > true:
> > > > >
> > > > >   * 3. Most RW ops are not flow adds, but changes to the
> > > > >   * od groups.
> > > > >
> > > > > In fact most (if not all) of the ovn port based flows don't share
> > > > > dp-groups, so the assumption doesn't hold in the reality, and in a
> > scale
> > > > > test environment with ovn-k8s topology of 800 nodes, the parallel
> > build
> > > > > shows 5 times more time spent for one northd iteration than
without
> > > > > parallel build on a test machine with 24 cores (24 northd worker
> > > > > threads). This is because of lock contension on the global rw lock
> > > > > protecting the lflow hmap.
> > > > >
> > > > > This patch optimizes it by using an array of bash based locks
instead
> > of
> > > > > a single global lock. It is similar to the approach prior to the
> > commit
> > > > > 8c980ce6, but with two major differences:
> > > > >
> > > > > 1) It uses a fixed length mutex array instead of the dynamic
array of
> > > > > the struct hashrow_locks. It is equally efficient considering
the
> > low
> > > > > chance of contention in a large array of locks, but without
the
> > burden
> > > > > of resizing every time the hmap size changes. The uniqueness
of
> > the
> > > > > lock is guaranteed by combining the masks of both the hmap
and the
> > > > > mutex array.
> > > > >
> > > > > 2) It fixes the corrupted hmap size after each round of parallel
flow
> > > > > build. The hash based lock protects the list in each bucket,
but
> > > > > doesn't protect the hmap size. The patch uses thread-local
> > counters
> > > > > and aggregate them at the end of each iteration, which is lock
> > free.
> > > > > This approach has lower cost than alternatively using atomic
> > > > > incrementing a global counter.
> > > > >
> > > > > This patch ends up with 8 times speedup than the current parallel
> > build
> > > > > with dp-group enabled for the same scale test (which is 30% faster
> > than
> > > > > without parallel).
> > > > >
> > > > > Test command: ovn-nbctl --print-wait-time --wait=sb sync
> > > > >
> > > > > Before:
> > > > >
> > > > > no parallel:
> > > > > ovn-northd completion: 7807ms
> > > > >
> > > > > parallel:
> > > > > ovn-northd completion: 41267ms
> > > > >
> > > > > After:
> > > > >
> > > > > no parallel: (no change)
> > > > >
> > > > > parallel:
> > > > > ovn-northd completion: 5081ms
> > > > > (8x faster than before, 30% faster than no parallel)
> > > > >
> > > > > Note: all the above tests are with dp-groups enabled)
> > > > >
> > > > > Signed-off-by: Han Zhou 
> > > > > ---
> > > > > v1 -> v2: Addressed comments from Anton
> > > > >  - Fixes the hmap size afte each round of prallel flow
building.
> > > > >  - Refactored lflow_hash_lock and removed unnecessary
functions
> > to avoid
> > > > >clang warnings.
> > > > >
> > > > >   northd/northd.c | 140
> > 
> > > > >   1 file changed, 70 insertions(+), 70 deletions(-)
> > > > >
> > > > > diff --git a/northd/northd.c b/northd/northd.c
> > > > > index afd812700..88ab0c929 100644
> > > > > --- a/northd/northd.c
> > > > > +++ b/northd/northd.c
> > > > > @@ -4319,41 +4319,44 @@ ovn_dp_group_add_with_reference(struct
> > ovn_lflow *lflow_ref,
> > > > >   return true;
> > > > >   }
> > > > >
> > > > > -/* Adds a row with the specified contents to the Logical_Flow
table.
> > > > > - * Version to use with dp_groups + parallel - when locking is
> > required.
> > > > > - *
> > > > > - * Assumptions:
> > > > > - *
> > > > > - * 1. A large proportion of the operations are lookups (reads).
> > > > > - * 2. RW operations are a small proportion of overall adds.
> > > > > - * 3. Most RW ops are not flow adds, but changes to the
> > > > > - * od groups.
> > > > > +/* The lflow_hash_lock is a mutex array that protects updates to
the
> > shared
> > > > > + * lflow table across threads when parallel lflow build and
dp-group
> > are both
> > > > > + * enabled. To avoid high contention between threads, a big
array of
> > mutexes
> > > > > + * are used instead of just one. This is possible because when
> > parallel build
> > > > > + * is used we only use hmap_insert_fast() to update the hmap,
which
> > would not
> > > > > + * touch the bucket array but only the list in a single bucket.
We
> > only need to
> > > > > + * make sure that when adding lflows to 

Re: [ovs-dev] [PATCH ovn 1/2] ovn-parallel-hmap.h: Minor fixes for hashrow_lock.

2021-10-05 Thread Numan Siddique
On Mon, Oct 4, 2021 at 5:34 AM Anton Ivanov
 wrote:
>
>
> On 04/10/2021 02:37, Han Zhou wrote:
> > Although not used currently, it is better to fix:
> > 1. The type of the mask field should be the same as hmap->mask: size_t.
> > 2. Calculating the index is better to use & instead of %.
> >
> > Signed-off-by: Han Zhou 
> > ---
> >   lib/ovn-parallel-hmap.h | 6 +++---
> >   1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/ovn-parallel-hmap.h b/lib/ovn-parallel-hmap.h
> > index 2df132ea8..897208ef8 100644
> > --- a/lib/ovn-parallel-hmap.h
> > +++ b/lib/ovn-parallel-hmap.h
> > @@ -224,7 +224,7 @@ static inline void wait_for_work_completion(struct 
> > worker_pool *pool)
> >*/
> >
> >   struct hashrow_locks {
> > -ssize_t mask;
> > +size_t mask;
> >   struct ovs_mutex *row_locks;
> >   };
> >
> > @@ -235,13 +235,13 @@ void ovn_update_hashrow_locks(struct hmap *lflows, 
> > struct hashrow_locks *hrl);
> >   /* Lock a hash row */
> >   static inline void lock_hash_row(struct hashrow_locks *hrl, uint32_t hash)
> >   {
> > -ovs_mutex_lock(>row_locks[hash % hrl->mask]);
> > +ovs_mutex_lock(>row_locks[hash & hrl->mask]);
> >   }
> >
> >   /* Unlock a hash row */
> >   static inline void unlock_hash_row(struct hashrow_locks *hrl, uint32_t 
> > hash)
> >   {
> > -ovs_mutex_unlock(>row_locks[hash % hrl->mask]);
> > +ovs_mutex_unlock(>row_locks[hash & hrl->mask]);
> >   }
> >
> >   /* Init the row locks structure */
>
> +1.

Acked-by: Numan Siddique 

Numan

>
> --
> Anton R. Ivanov
> Cambridgegreys Limited. Registered in England. Company Number 10273661
> https://www.cambridgegreys.com/
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] MAINTAINERS: Transition myself to emeritus status.

2021-10-05 Thread Ben Pfaff
I have not been actively maintaining OVN for some time now.  I don't expect
that to change.  I think that the responsible thing to do is to
acknowledge this properly by transitioning to emeritus status.

Signed-off-by: Ben Pfaff 
---
 MAINTAINERS.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS.rst b/MAINTAINERS.rst
index 0d19bd622..fff2cd977 100644
--- a/MAINTAINERS.rst
+++ b/MAINTAINERS.rst
@@ -41,8 +41,6 @@ This is the current list of active OVN committers:
 
* - Name
  - Email
-   * - Ben Pfaff
- - b...@ovn.org
* - Gurucharan Shetty
  - g...@ovn.org
* - Han Zhou
@@ -67,3 +65,5 @@ More information about Emeritus Committers can be found
 
* - Name
  - Email
+   * - Ben Pfaff
+ - b...@ovn.org
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 3/3] northd.c: Optimize parallel build performance with hash based locks.

2021-10-05 Thread Numan Siddique
On Mon, Oct 4, 2021 at 1:28 PM Han Zhou  wrote:
>
> On Mon, Oct 4, 2021 at 9:57 AM Anton Ivanov 
> wrote:
> >
> >
> > On 04/10/2021 16:58, Han Zhou wrote:
> >
> >
> >
> > On Mon, Oct 4, 2021 at 2:31 AM Anton Ivanov <
> anton.iva...@cambridgegreys.com> wrote:
> > >
> > >
> > > On 03/10/2021 23:45, Han Zhou wrote:
> > > > The current implementation of parallel build in northd with dp-groups
> > > > enabled results in bad performance when the below assumption is not
> > > > true:
> > > >
> > > >   * 3. Most RW ops are not flow adds, but changes to the
> > > >   * od groups.
> > > >
> > > > In fact most (if not all) of the ovn port based flows don't share
> > > > dp-groups, so the assumption doesn't hold in the reality, and in a
> scale
> > > > test environment with ovn-k8s topology of 800 nodes, the parallel
> build
> > > > shows 5 times more time spent for one northd iteration than without
> > > > parallel build on a test machine with 24 cores (24 northd worker
> > > > threads). This is because of lock contension on the global rw lock
> > > > protecting the lflow hmap.
> > > >
> > > > This patch optimizes it by using an array of bash based locks instead
> of
> > > > a single global lock. It is similar to the approach prior to the
> commit
> > > > 8c980ce6, but with two major differences:
> > > >
> > > > 1) It uses a fixed length mutex array instead of the dynamic array of
> > > > the struct hashrow_locks. It is equally efficient considering the
> low
> > > > chance of contention in a large array of locks, but without the
> burden
> > > > of resizing every time the hmap size changes. The uniqueness of
> the
> > > > lock is guaranteed by combining the masks of both the hmap and the
> > > > mutex array.
> > > >
> > > > 2) It fixes the corrupted hmap size after each round of parallel flow
> > > > build. The hash based lock protects the list in each bucket, but
> > > > doesn't protect the hmap size. The patch uses thread-local
> counters
> > > > and aggregate them at the end of each iteration, which is lock
> free.
> > > > This approach has lower cost than alternatively using atomic
> > > > incrementing a global counter.
> > > >
> > > > This patch ends up with 8 times speedup than the current parallel
> build
> > > > with dp-group enabled for the same scale test (which is 30% faster
> than
> > > > without parallel).
> > > >
> > > > Test command: ovn-nbctl --print-wait-time --wait=sb sync
> > > >
> > > > Before:
> > > >
> > > > no parallel:
> > > > ovn-northd completion: 7807ms
> > > >
> > > > parallel:
> > > > ovn-northd completion: 41267ms
> > > >
> > > > After:
> > > >
> > > > no parallel: (no change)
> > > >
> > > > parallel:
> > > > ovn-northd completion: 5081ms
> > > > (8x faster than before, 30% faster than no parallel)
> > > >
> > > > Note: all the above tests are with dp-groups enabled)
> > > >
> > > > Signed-off-by: Han Zhou 
> > > > ---
> > > > v1 -> v2: Addressed comments from Anton
> > > >  - Fixes the hmap size afte each round of prallel flow building.
> > > >  - Refactored lflow_hash_lock and removed unnecessary functions
> to avoid
> > > >clang warnings.
> > > >
> > > >   northd/northd.c | 140
> 
> > > >   1 file changed, 70 insertions(+), 70 deletions(-)
> > > >
> > > > diff --git a/northd/northd.c b/northd/northd.c
> > > > index afd812700..88ab0c929 100644
> > > > --- a/northd/northd.c
> > > > +++ b/northd/northd.c
> > > > @@ -4319,41 +4319,44 @@ ovn_dp_group_add_with_reference(struct
> ovn_lflow *lflow_ref,
> > > >   return true;
> > > >   }
> > > >
> > > > -/* Adds a row with the specified contents to the Logical_Flow table.
> > > > - * Version to use with dp_groups + parallel - when locking is
> required.
> > > > - *
> > > > - * Assumptions:
> > > > - *
> > > > - * 1. A large proportion of the operations are lookups (reads).
> > > > - * 2. RW operations are a small proportion of overall adds.
> > > > - * 3. Most RW ops are not flow adds, but changes to the
> > > > - * od groups.
> > > > +/* The lflow_hash_lock is a mutex array that protects updates to the
> shared
> > > > + * lflow table across threads when parallel lflow build and dp-group
> are both
> > > > + * enabled. To avoid high contention between threads, a big array of
> mutexes
> > > > + * are used instead of just one. This is possible because when
> parallel build
> > > > + * is used we only use hmap_insert_fast() to update the hmap, which
> would not
> > > > + * touch the bucket array but only the list in a single bucket. We
> only need to
> > > > + * make sure that when adding lflows to the same hash bucket, the
> same lock is
> > > > + * used, so that no two threads can add to the bucket at the same
> time.  It is
> > > > + * ok that the same lock is used to protect multiple buckets, so a
> fixed sized
> > > > + * mutex array is used instead of 1-1 mapping to the hash buckets.
> This
> > > > + * simplies the implementation 

[ovs-dev] [PATCH] MAINTAINERS: Transition myself to emeritus status.

2021-10-05 Thread Ben Pfaff
It has been a long time since I've been able to devote significant time
to OVS work.  I don't expect that to change.  I think that the responsible
thing to do is to transition myself to emeritus status, with this patch.

Signed-off-by: Ben Pfaff 
---
 MAINTAINERS.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS.rst b/MAINTAINERS.rst
index 0c035d940..f4f6188c8 100644
--- a/MAINTAINERS.rst
+++ b/MAINTAINERS.rst
@@ -49,8 +49,6 @@ This is the current list of active Open vSwitch committers:
  - az...@ovn.org
* - Ansis Atteka
  - aatt...@nicira.com
-   * - Ben Pfaff
- - b...@ovn.org
* - Daniele Di Proietto
  - daniele.di.proie...@gmail.com
* - Gurucharan Shetty
@@ -89,5 +87,7 @@ More information about Emeritus Committers can be found
 
* - Name
  - Email
+   * - Ben Pfaff
+ - b...@ovn.org
* - Ethan J. Jackson
  - e...@eecs.berkeley.edu
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 0/6] MFEX Optimizations IPv6 + Hashing

2021-10-05 Thread Stokes, Ian
9/21/21 12:23, Kumar Amber wrote:
> > ---
> > v3:
> > - rebase to master.
> > v2:
> > - fix the CI build.
> > - fix check-patch for co-author.
> > ---
> >
> > The patch-set introduces AVX512 optimizations of IPv6
> > traffic profiles and hashing improvements for all AVX512
> > supported traffic profiles for IPv4 and IPv6.
> >
> > Kumar Amber (6):
> >   dpif-netdev/mfex: Add AVX512 basic ipv6 traffic profiles
> >   dpif-netdev/mfex: Add AVX512 vlan ipv6 traffic profiles
> >   dpif-netdev/mfex: Add packet hash check to autovalidator
> >   dpif-netdev/mfex: Add ipv4 profile based hashing
> >   dpif-netdev/mfex: Add ipv6 profile based hashing
> >   dpif-netdev/mfex: Avoid hashing when opt mfex called
> >
> >  NEWS  |   7 +
> >  lib/automake.mk   |   1 +
> >  lib/dpif-netdev-avx512.c  |   6 +-
> >  lib/dpif-netdev-extract-avx512.c  | 348 +-
> >  lib/dpif-netdev-private-extract.c |  63 +-
> >  lib/dpif-netdev-private-extract.h |  12 ++
> >  tests/pcap/mfex_test.pcap | Bin 416 -> 632 bytes
> >  7 files changed, 432 insertions(+), 5 deletions(-)
> >
> 
> Hi.  A few months ago I was told that it's easy for Intel to set up CI
> to test upstream patches with AVX512 features enabled.  Is there any
> progress on that front?
> 
> My point is that we should refrain from adding new features in this
> area until we have a proper CI.  Especially considering the unit test
> failure you reported yesterday, which is supposedly related to AVX512
> optimizations.
> 

Hi Ilya,

Apologies for the delay in response, I've been on PTO the past 2 weeks so only 
catching up now.

There is an ongoing effort to deploy an AVX512 CI. Aaron, Michael and myself 
have started to look at this. I think the initial target was to have something 
up and running by EOY.

> // Marking this patch-set as deferred for now.

Agree with marking this deferred for now.

However I don’t think this should be dependent on delivery of the CI as that 
could be some time away yet. I think if we are to run the tests manually on an 
AVX512 system (We could even run it on the system that has been setup and 
reserved for the CI in the Intel Lab) and post results in response/part of the 
review of the series that should help progress the series in the meantime?

Thanks
Ian

> 
> Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: do not run find_lrp_member_ip with empty nexthop

2021-10-05 Thread Numan Siddique
On Wed, Sep 22, 2021 at 6:49 AM Lorenzo Bianconi
 wrote:
>
> Do not run find_lrp_member_ip in find_static_route_outport if the route
> has been configured without a valid nexthop. This patch fixes the
> following northd warning:
>
> 2021-09-18T06:01:37.909Z|8|ovn_northd|WARN|bad ipv6 address
>
> Fixes: c00852288 ("northd: allow to configure routes with no nexthop")
> Reported-by: Jianlin Shi 
> Signed-off-by: Lorenzo Bianconi 

Thanks for the fix.

I applied to the main branch and branch-21.09.

Numan

> ---
>  northd/northd.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index cf2467fe1..884a3a6c1 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -8833,7 +8833,9 @@ find_static_route_outport(struct ovn_datapath *od, 
> struct hmap *ports,
>   route->output_port, route->ip_prefix);
>  return false;
>  }
> -lrp_addr_s = find_lrp_member_ip(out_port, route->nexthop);
> +if (strlen(route->nexthop)) {
> +lrp_addr_s = find_lrp_member_ip(out_port, route->nexthop);
> +}
>  if (!lrp_addr_s) {
>  /* There are no IP networks configured on the router's port via
>   * which 'route->nexthop' is theoretically reachable.  But since
> @@ -8863,7 +8865,9 @@ find_static_route_outport(struct ovn_datapath *od, 
> struct hmap *ports,
>  continue;
>  }
>
> -lrp_addr_s = find_lrp_member_ip(out_port, route->nexthop);
> +if (strlen(route->nexthop)) {
> +lrp_addr_s = find_lrp_member_ip(out_port, route->nexthop);
> +}
>  if (lrp_addr_s) {
>  break;
>  }
> --
> 2.31.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v5 3/4] ic: add support for routing tables in adv/learn routes

2021-10-05 Thread 0-day Robot
Bleep bloop.  Greetings Vladislav Odintsov, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 81 characters long (recommended limit is 79)
#377 FILE: ic/ovn-ic.c:1340:

nbrec_logical_router_static_route_update_external_ids_setkey(

Lines checked: 1287, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v5 2/4] northd, utils: support for RouteTables in LRs

2021-10-05 Thread 0-day Robot
Bleep bloop.  Greetings Vladislav Odintsov, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#1553 FILE: utilities/ovn-nbctl.c:332:
  lrp-set-options PORT KEY=VALUE [KEY=VALUE]...\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#1564 FILE: utilities/ovn-nbctl.c:356:
  [--policy=POLICY]\n\

WARNING: Line lacks whitespace around operator
#1566 FILE: utilities/ovn-nbctl.c:358:
  [--ecmp-symmetric-reply]\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#1567 FILE: utilities/ovn-nbctl.c:359:
  [--route-table=ROUTE_TABLE]\n\

WARNING: Line lacks whitespace around operator
#1568 FILE: utilities/ovn-nbctl.c:360:
  lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#1571 FILE: utilities/ovn-nbctl.c:362:
  [--policy=POLICY]\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#1572 FILE: utilities/ovn-nbctl.c:363:
  [--route-table=ROUTE_TABLE]\n\

WARNING: Line lacks whitespace around operator
#1573 FILE: utilities/ovn-nbctl.c:364:
  lr-route-del ROUTER [PREFIX [NEXTHOP [PORT]]]\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#1575 FILE: utilities/ovn-nbctl.c:366:
  [--route-table=ROUTE_TABLE]\n\

WARNING: Line is 82 characters long (recommended limit is 79)
#1854 FILE: utilities/ovn-nbctl.c:7067:
  
"--may-exist,--ecmp,--ecmp-symmetric-reply,--policy=,--route-table=,--bfd?",

Lines checked: 1868, Warnings: 20, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4 2/4] northd, utils: support for RouteTables in LRs

2021-10-05 Thread Vladislav Odintsov
Hi Numan,

I’ve submitted v5 here [1]. I decided to leave NEWS section unchanged since it 
mentions OVN IC changes, which is done in patch #3.
Also, I see that 21.09.0 tag was created 17 hours ago… so, now this patch 
series is postponed to 21.12 or there are options?

1: https://patchwork.ozlabs.org/project/ovn/list/?series=265501

Regards,
Vladislav Odintsov

> On 4 Oct 2021, at 19:20, Numan Siddique  wrote:
> 
> On Sun, Sep 19, 2021 at 5:24 PM Vladislav Odintsov  > wrote:
>> 
>> Signed-off-by: Vladislav Odintsov 
>> ---
>> northd/northd.c | 159 ---
>> northd/ovn-northd.8.xml |  63 --
>> ovn-nb.ovsschema|   5 +-
>> ovn-nb.xml  |  30 +++
>> tests/ovn-ic.at |   4 +
>> tests/ovn-nbctl.at  | 196 +-
>> tests/ovn-northd.at |  76 ++-
>> tests/ovn.at| 441 +++-
>> utilities/ovn-nbctl.c   | 134 +++-
>> 9 files changed, 1041 insertions(+), 67 deletions(-)
> 
> 
> The patch LGTM.
> 
> I see that patch 3 has the NEWS item about this feature.  But I think
> that can be moved to this patch.
> 
> Acked-by: Numan Siddique mailto:num...@ovn.org>>
> 
> Numan
> 
>> 
>> diff --git a/northd/northd.c b/northd/northd.c
>> index d1b87891c..c238c6241 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -152,15 +152,16 @@ enum ovn_stage {
>> PIPELINE_STAGE(ROUTER, IN,  ECMP_STATEFUL,   7, "lr_in_ecmp_stateful") \
>> PIPELINE_STAGE(ROUTER, IN,  ND_RA_OPTIONS,   8, "lr_in_nd_ra_options") \
>> PIPELINE_STAGE(ROUTER, IN,  ND_RA_RESPONSE,  9, "lr_in_nd_ra_response") \
>> -PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  10, "lr_in_ip_routing")   \
>> -PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 11, 
>> "lr_in_ip_routing_ecmp") \
>> -PIPELINE_STAGE(ROUTER, IN,  POLICY,  12, "lr_in_policy")   \
>> -PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP, 13, "lr_in_policy_ecmp")  \
>> -PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 14, "lr_in_arp_resolve")  \
>> -PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN   ,  15, "lr_in_chk_pkt_len")  \
>> -PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS, 16, "lr_in_larger_pkts")  \
>> -PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 17, "lr_in_gw_redirect")  \
>> -PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 18, "lr_in_arp_request")  \
>> +PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_PRE,  10, 
>> "lr_in_ip_routing_pre")  \
>> +PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  11, "lr_in_ip_routing")
>>   \
>> +PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 12, 
>> "lr_in_ip_routing_ecmp") \
>> +PIPELINE_STAGE(ROUTER, IN,  POLICY,  13, "lr_in_policy")
>>   \
>> +PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP, 14, "lr_in_policy_ecmp")   
>>   \
>> +PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 15, "lr_in_arp_resolve")   
>>   \
>> +PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN, 16, "lr_in_chk_pkt_len")   
>>   \
>> +PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS, 17, "lr_in_larger_pkts")   
>>   \
>> +PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 18, "lr_in_gw_redirect")   
>>   \
>> +PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 19, "lr_in_arp_request")   
>>   \
>>   \
>> /* Logical router egress stages. */   \
>> PIPELINE_STAGE(ROUTER, OUT, UNDNAT,  0, "lr_out_undnat")\
>> @@ -229,6 +230,7 @@ enum ovn_stage {
>> #define REG_NEXT_HOP_IPV6 "xxreg0"
>> #define REG_SRC_IPV4 "reg1"
>> #define REG_SRC_IPV6 "xxreg1"
>> +#define REG_ROUTE_TABLE_ID "reg7"
>> 
>> #define REG_ORIG_TP_DPORT_ROUTER   "reg9[16..31]"
>> 
>> @@ -291,8 +293,9 @@ enum ovn_stage {
>>  * | R6  |UNUSED| X | | G | 
>> IN_IP_ROUTING)|
>>  * | |  | R | | 1 |  
>>  |
>>  * +-+--+ E | UNUSED  |   |  
>>  |
>> - * | R7  |UNUSED| G | |   | 
>>   |
>> - * | |  | 3 | |   | 
>>   |
>> + * | R7  |  ROUTE_TABLE_ID  | G | |   | 
>>   |
>> + * | | (>= IN_IP_ROUTING_PRE && | 3 | |   | 
>>   |
>> + * | |  <= IN_IP_ROUTING)   |   | |   | 
>>   |
>>  * 
>> +-+--+---+-+---+---+
>>  * | R8  | ECMP_GROUP_ID|   | |
>>  * | | ECMP_MEMBER_ID   | X | |
>> @@ -8520,11 +8523,72 @@ cleanup:
>> ds_destroy();
>> }
>> 
>> +static uint32_t
>> +route_table_add(struct simap *route_tables, const char *route_table_name)
>> +{
>> +/* route table ids start from 1 */
>> +uint32_t rtb_id = simap_count(route_tables) + 1;
>> +
>> +if (rtb_id == UINT16_MAX) {
>> 

[ovs-dev] [PATCH ovn v5 3/4] ic: add support for routing tables in adv/learn routes

2021-10-05 Thread Vladislav Odintsov
Previously support for multiple routing tables was added
to northd code.
This commit expands support for multiple routing tables
by adding support of advertising and learning routes with
their routing table information.

To utilize such feature, user must:
1. create Logical Router in each AZ;
2. create IC transit switch for each routing table, that
   he/she needs;
3. connect each TS with this LR;
4. assign routing table for TS's LRP
   (ovn-nbctl lrp-set-options  route_table=<>);
5. enable routes sync (turn on learning and advertising
   routes in NB_Global table);
6. create LRPs for subnets in LR, create static routes
   with supplying route_table parameter.

Note 1: routes for directly-connected networks will be
learned to global routing table and if Logical Routers
have more than one Transit Switch, which interconnects
them, directly-connected routes will be added via each
transit switch port and configured as ECMP routes.

Note 2: static routes within route tables will be advertised
and learned only if interconnecting transit switch's LRPs
will have options:route_table same value as route's route_table
value.

Signed-off-by: Vladislav Odintsov 
Reviewed-by: Numan Siddique 
---
 NEWS|   4 +
 ic/ovn-ic.c | 533 
 ovn-ic-sb.ovsschema |   5 +-
 ovn-ic-sb.xml   |  18 ++
 tests/ovn-ic.at | 440 
 5 files changed, 808 insertions(+), 192 deletions(-)

diff --git a/NEWS b/NEWS
index 5e93f813a..dcb70f944 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@ OVN v21.09.0 - xx xxx 
   - Allow static routes without nexthops.
   - Enabled logical dp groups as a default.  CMS should disable it if not
 desired.
+  - Added support for multiple routing tables in Logical Router Static Routes
+and LRPs. OVN Interconnection supports routes' route tables as well.
+This requires to update schemas for OVN_Northdbound and OVN_IC_Southbound
+DBs.
 
 OVN v21.06.0 - 18 Jun 2021
 -
diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 303e93a4f..94f123f70 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -63,9 +63,11 @@ struct ic_context {
 struct ovsdb_idl_txn *ovninb_txn;
 struct ovsdb_idl_txn *ovnisb_txn;
 struct ovsdb_idl_index *nbrec_ls_by_name;
+struct ovsdb_idl_index *nbrec_lrp_by_name;
 struct ovsdb_idl_index *nbrec_port_by_name;
 struct ovsdb_idl_index *sbrec_chassis_by_name;
 struct ovsdb_idl_index *sbrec_port_binding_by_name;
+struct ovsdb_idl_index *icnbrec_transit_switch_by_name;
 struct ovsdb_idl_index *icsbrec_port_binding_by_az;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az;
@@ -773,7 +775,7 @@ port_binding_run(struct ic_context *ctx,
 icsbrec_port_binding_index_set_transit_switch(isb_pb_key, ts->name);
 
 ICSBREC_PORT_BINDING_FOR_EACH_EQUAL (isb_pb, isb_pb_key,
-ctx->icsbrec_port_binding_by_ts) {
+ ctx->icsbrec_port_binding_by_ts) {
 if (isb_pb->availability_zone == az) {
 shash_add(_pbs, isb_pb->logical_port, isb_pb);
 shash_find_and_delete(_all_local_pbs,
@@ -844,7 +846,9 @@ port_binding_run(struct ic_context *ctx,
 struct ic_router_info {
 struct hmap_node node;
 const struct nbrec_logical_router *lr; /* key of hmap */
-const struct icsbrec_port_binding *isb_pb;
+const struct icsbrec_port_binding **isb_pbs;
+size_t n_isb_pbs;
+size_t n_allocated_isb_pbs;
 struct hmap routes_learned;
 };
 
@@ -854,6 +858,7 @@ struct ic_route_info {
 struct in6_addr prefix;
 unsigned int plen;
 struct in6_addr nexthop;
+const char *route_table;
 
 /* Either nb_route or nb_lrp is set and the other one must be NULL.
  * - For a route that is learned from IC-SB, or a static route that is
@@ -875,13 +880,15 @@ ic_route_hash(const struct in6_addr *prefix, unsigned int 
plen,
 
 static struct ic_route_info *
 ic_route_find(struct hmap *routes, const struct in6_addr *prefix,
-  unsigned int plen, const struct in6_addr *nexthop)
+  unsigned int plen, const struct in6_addr *nexthop,
+  char *route_table)
 {
 struct ic_route_info *r;
 uint32_t hash = ic_route_hash(prefix, plen, nexthop);
 HMAP_FOR_EACH_WITH_HASH (r, node, hash, routes) {
 if (ipv6_addr_equals(>prefix, prefix) &&
 r->plen == plen &&
+!strcmp(r->route_table ? r->route_table : "", route_table) &&
 ipv6_addr_equals(>nexthop, nexthop)) {
 return r;
 }
@@ -926,11 +933,19 @@ add_to_routes_learned(struct hmap *routes_learned,
  , , )) {
 return false;
 }
+
+if (ic_route_find(routes_learned, , plen, ,
+  nb_route->route_table)) {
+/* Route is already added to learned in 

[ovs-dev] [PATCH ovn v5 4/4] ic: don't learn routes which have local GW

2021-10-05 Thread Vladislav Odintsov
In case we have ovn-ic-interconnected Logical_Routers
and install same ip_prefix route with GW in local AZ
in each LR in each AZ, this route would be learned in
other AZs and L3 loop is possible.
There could be next routes output:

[az1 ~]$ ovn-nbctl lr-route-list lr0
IPv4 Routes
Route Table global:
  128.0.0.0/1   169.254.1.1 dst-ip ecmp
  128.0.0.0/1 169.254.100.2 dst-ip (learned) ecmp

[az2 ~]$ ovn-nbctl lr-route-list lr0
IPv4 Routes
Route Table global:
  128.0.0.0/1   169.254.2.1 dst-ip ecmp
  128.0.0.0/1 169.254.100.1 dst-ip (learned) ecmp

So, there is a possible routing loop. Packets going
to 128.0.0.0/1 could go from AZ1 to AZ2 and on AZ2
they can be routed back.

This commit adds check for installed local (non-learned)
routes. If OVN IC route's ip_prefix, route_table are
the same with already installed non-learned NB route, such
route wouldn't be learned.

Signed-off-by: Vladislav Odintsov 
---
 ic/ovn-ic.c   | 30 --
 tests/ovn-ic.at   | 49 +++
 utilities/ovn-nbctl.c |  4 +++-
 3 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 94f123f70..4c904400c 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -1197,7 +1197,25 @@ add_network_to_routes_ad(struct hmap *routes_ad, const 
char *network,
 }
 
 static bool
-route_need_learn(struct in6_addr *prefix, unsigned int plen,
+route_has_local_gw(const struct nbrec_logical_router *lr,
+   const char *route_table, const char *ip_prefix) {
+
+const struct nbrec_logical_router_static_route *route;
+for (int i = 0; i < lr->n_static_routes; i++) {
+route = lr->static_routes[i];
+if (!smap_get(>external_ids, "ic-learned-route") &&
+!strcmp(route->route_table, route_table) &&
+!strcmp(route->ip_prefix, ip_prefix)) {
+return true;
+}
+}
+return false;
+}
+
+static bool
+route_need_learn(const struct nbrec_logical_router *lr,
+ const struct icsbrec_route *isb_route,
+ struct in6_addr *prefix, unsigned int plen,
  const struct smap *nb_options)
 {
 if (!smap_get_bool(nb_options, "ic-route-learn", false)) {
@@ -1217,6 +1235,12 @@ route_need_learn(struct in6_addr *prefix, unsigned int 
plen,
 return false;
 }
 
+if (route_has_local_gw(lr, isb_route->route_table, isb_route->ip_prefix)) {
+VLOG_DBG("Skip learning %s (rtb:%s) route, as we've got one with "
+ "local GW", isb_route->ip_prefix, isb_route->route_table);
+return false;
+}
+
 return true;
 }
 
@@ -1322,9 +1346,11 @@ sync_learned_routes(struct ic_context *ctx,
  isb_route->nexthop);
 continue;
 }
-if (!route_need_learn(, plen, _global->options)) {
+if (!route_need_learn(ic_lr->lr, isb_route, , plen,
+  _global->options)) {
 continue;
 }
+
 struct ic_route_info *route_learned
 = ic_route_find(_lr->routes_learned, , plen,
 , isb_route->route_table);
diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at
index 5803f76e9..15560334d 100644
--- a/tests/ovn-ic.at
+++ b/tests/ovn-ic.at
@@ -870,3 +870,52 @@ OVN_CLEANUP_IC([az1], [az2])
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn-ic -- same routes destination])
+
+ovn_init_ic_db
+ovn-ic-nbctl ts-add ts1
+
+for i in 1 2; do
+ovn_start az$i
+ovn_as az$i
+
+# Enable route learning at AZ level
+ovn-nbctl set nb_global . options:ic-route-learn=true
+ovn-nbctl set nb_global . options:ic-route-learn-default=true
+# Enable route advertising at AZ level
+ovn-nbctl set nb_global . options:ic-route-adv=true
+ovn-nbctl set nb_global . options:ic-route-adv-default=true
+
+lr=lr1$i
+ovn-nbctl lr-add $lr
+
+lrp=lrp-$lr-ts1
+lsp=lsp-ts1-$lr
+# Create LRP and connect to TS
+ovn-nbctl lrp-add $lr $lrp aa:aa:aa:aa:aa:0$i 169.254.100.$i/24
+ovn-nbctl lsp-add ts1 $lsp \
+-- lsp-set-addresses $lsp router \
+-- lsp-set-type $lsp router \
+-- lsp-set-options $lsp router-port=$lrp
+ovn-nbctl lrp-add $lr lrp-local-subnet 00:00:00:00:00:0$i 192.168.$i.1/24
+ovn-nbctl list logical-router-static-route
+check ovn-nbctl lr-route-add $lr 10.0.0.0/24 192.168.$i.10
+check ovn-nbctl lr-route-add $lr 0.0.0.0/0 192.168.$i.11
+done
+
+AT_CHECK([ovn_as az1 ovn-nbctl lr-route-list lr11 | grep dst-ip | sort], [0], 
[dnl
+0.0.0.0/0  192.168.1.11 dst-ip
+  10.0.0.0/24  192.168.1.10 dst-ip
+   192.168.2.0/24 169.254.100.2 dst-ip (learned)
+])
+
+AT_CHECK([ovn_as az2 ovn-nbctl lr-route-list lr12 | grep dst-ip | sort], [0], 
[dnl
+

[ovs-dev] [PATCH ovn v5 2/4] northd, utils: support for RouteTables in LRs

2021-10-05 Thread Vladislav Odintsov
This patch extends Logical Router's routing functionality.
Now user may create multiple routing tables within a Logical Router
and assign them to Logical Router Ports.

Traffic coming from Logical Router Port with assigned route_table
is checked against global routes if any (Logical_Router_Static_Routes
whith empty route_table field), next against directly connected routes
and then Logical_Router_Static_Routes with same route_table value as
in Logical_Router_Port options:route_table field.

A new Logical Router ingress table #10 is added - IN_IP_ROUTING_PRE.
In this table packets which come from LRPs with configured
options:route_table field are checked against inport and in OVS
register 7 unique non-zero value identifying route table is written.

Then in 11th table IN_IP_ROUTING routes which have non-empty
`route_table` field are added with additional match on reg7 value
associated with appropriate route_table.

Signed-off-by: Vladislav Odintsov 
Acked-by: Numan Siddique 
---
 northd/northd.c | 159 ---
 northd/ovn-northd.8.xml |  63 --
 ovn-nb.ovsschema|   5 +-
 ovn-nb.xml  |  30 +++
 tests/ovn-ic.at |   4 +
 tests/ovn-nbctl.at  | 196 +-
 tests/ovn-northd.at |  76 ++-
 tests/ovn.at| 441 +++-
 utilities/ovn-nbctl.c   | 134 +++-
 9 files changed, 1041 insertions(+), 67 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index bf32a077c..b77befa3b 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -148,15 +148,16 @@ enum ovn_stage {
 PIPELINE_STAGE(ROUTER, IN,  ECMP_STATEFUL,   7, "lr_in_ecmp_stateful") \
 PIPELINE_STAGE(ROUTER, IN,  ND_RA_OPTIONS,   8, "lr_in_nd_ra_options") \
 PIPELINE_STAGE(ROUTER, IN,  ND_RA_RESPONSE,  9, "lr_in_nd_ra_response") \
-PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  10, "lr_in_ip_routing")   \
-PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 11, "lr_in_ip_routing_ecmp") \
-PIPELINE_STAGE(ROUTER, IN,  POLICY,  12, "lr_in_policy")   \
-PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP, 13, "lr_in_policy_ecmp")  \
-PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 14, "lr_in_arp_resolve")  \
-PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN   ,  15, "lr_in_chk_pkt_len")  \
-PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS, 16, "lr_in_larger_pkts")  \
-PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 17, "lr_in_gw_redirect")  \
-PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 18, "lr_in_arp_request")  \
+PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_PRE,  10, "lr_in_ip_routing_pre")  \
+PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  11, "lr_in_ip_routing")  \
+PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 12, "lr_in_ip_routing_ecmp") \
+PIPELINE_STAGE(ROUTER, IN,  POLICY,  13, "lr_in_policy")  \
+PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP, 14, "lr_in_policy_ecmp") \
+PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 15, "lr_in_arp_resolve") \
+PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN, 16, "lr_in_chk_pkt_len") \
+PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS, 17, "lr_in_larger_pkts") \
+PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 18, "lr_in_gw_redirect") \
+PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 19, "lr_in_arp_request") \
   \
 /* Logical router egress stages. */   \
 PIPELINE_STAGE(ROUTER, OUT, UNDNAT,  0, "lr_out_undnat")\
@@ -225,6 +226,7 @@ enum ovn_stage {
 #define REG_NEXT_HOP_IPV6 "xxreg0"
 #define REG_SRC_IPV4 "reg1"
 #define REG_SRC_IPV6 "xxreg1"
+#define REG_ROUTE_TABLE_ID "reg7"
 
 #define REG_ORIG_TP_DPORT_ROUTER   "reg9[16..31]"
 
@@ -287,8 +289,9 @@ enum ovn_stage {
  * | R6  |UNUSED| X | | G | IN_IP_ROUTING)|
  * | |  | R | | 1 |   |
  * +-+--+ E | UNUSED  |   |   |
- * | R7  |UNUSED| G | |   |   |
- * | |  | 3 | |   |   |
+ * | R7  |  ROUTE_TABLE_ID  | G | |   |   |
+ * | | (>= IN_IP_ROUTING_PRE && | 3 | |   |   |
+ * | |  <= IN_IP_ROUTING)   |   | |   |   |
  * +-+--+---+-+---+---+
  * | R8  | ECMP_GROUP_ID|   | |
  * | | ECMP_MEMBER_ID   | X | |
@@ -8537,11 +8540,72 @@ cleanup:
 ds_destroy();
 }
 
+static uint32_t
+route_table_add(struct simap *route_tables, const char *route_table_name)
+{
+/* route table ids start from 1 */
+uint32_t rtb_id = simap_count(route_tables) + 1;
+
+if (rtb_id == UINT16_MAX) {
+static struct 

[ovs-dev] [PATCH ovn v5 1/4] ic: process only local port_bindings

2021-10-05 Thread Vladislav Odintsov
This commit adds a small optimization by utilizing ovsdb_index
to iterate over port_bindings.
Prior to this change each iteration checked availability_zone
and continued processing only if port_binding belons to local AZ.

Now we run against port_bindings from local AZ only and don't check
availability_zone.

Signed-off-by: Vladislav Odintsov 
Acked-by: Numan Siddique 
---
 ic/ovn-ic.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 99356253d..303e93a4f 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -68,6 +68,7 @@ struct ic_context {
 struct ovsdb_idl_index *sbrec_port_binding_by_name;
 struct ovsdb_idl_index *icsbrec_port_binding_by_az;
 struct ovsdb_idl_index *icsbrec_port_binding_by_ts;
+struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az;
 struct ovsdb_idl_index *icsbrec_route_by_ts;
 struct ovsdb_idl_index *icsbrec_route_by_ts_az;
 };
@@ -1386,17 +1387,14 @@ route_run(struct ic_context *ctx,
 const struct icsbrec_port_binding *isb_pb;
 const struct icsbrec_port_binding *isb_pb_key =
 icsbrec_port_binding_index_init_row(
-ctx->icsbrec_port_binding_by_ts);
+ctx->icsbrec_port_binding_by_ts_az);
 icsbrec_port_binding_index_set_transit_switch(isb_pb_key, ts->name);
+icsbrec_port_binding_index_set_availability_zone(isb_pb_key, az);
 
 /* Each port on TS maps to a logical router, which is stored in the
  * external_ids:router-id of the IC SB port_binding record. */
 ICSBREC_PORT_BINDING_FOR_EACH_EQUAL (isb_pb, isb_pb_key,
- ctx->icsbrec_port_binding_by_ts) {
-if (isb_pb->availability_zone != az) {
-continue;
-}
-
+ctx->icsbrec_port_binding_by_ts_az) {
 const char *ts_lrp_name =
 get_lrp_name_by_ts_port_name(ctx, isb_pb->logical_port);
 if (!ts_lrp_name) {
@@ -1713,6 +1711,11 @@ main(int argc, char *argv[])
 = ovsdb_idl_index_create1(ovnisb_idl_loop.idl,
   _port_binding_col_transit_switch);
 
+struct ovsdb_idl_index *icsbrec_port_binding_by_ts_az
+= ovsdb_idl_index_create2(ovnisb_idl_loop.idl,
+  _port_binding_col_transit_switch,
+  _port_binding_col_availability_zone);
+
 struct ovsdb_idl_index *icsbrec_route_by_ts
 = ovsdb_idl_index_create1(ovnisb_idl_loop.idl,
   _route_col_transit_switch);
@@ -1763,6 +1766,7 @@ main(int argc, char *argv[])
 .sbrec_chassis_by_name = sbrec_chassis_by_name,
 .icsbrec_port_binding_by_az = icsbrec_port_binding_by_az,
 .icsbrec_port_binding_by_ts = icsbrec_port_binding_by_ts,
+.icsbrec_port_binding_by_ts_az = icsbrec_port_binding_by_ts_az,
 .icsbrec_route_by_ts = icsbrec_route_by_ts,
 .icsbrec_route_by_ts_az = icsbrec_route_by_ts_az,
 };
-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v5 0/4] Add multiple routing tables support to Logical Routers

2021-10-05 Thread Vladislav Odintsov
This patch series extends Logical Router's routing functionality.
Now user may create multiple routing tables within a Logical Router
and assign them to Logical Router Ports.

Traffic coming from Logical Router Port with assigned route_table
is checked against global routes if any (Logical_Router_Static_Routes
whith empty route_table field), next against directly connected routes
and then Logical_Router_Static_Routes with same route_table value as
in Logical_Router_Port options:route_table field.

---
v4 -> v5:
  - Addressed Numan's review comments.

v3 -> v4:
  - Minor logging typo fixes.
  - Added patch with ovn-ic routes learning bugfix.

v2 -> v3:
  - Rebased on split northd changes.
  - Replaced route_tables HMAP with SIMAP as Numan suggested.
  - This series stil doesn't have ddlog support yet.
It will take too much time for me to deal to ddlog language and specifics.
Help with ddlog implementation wanted.

v1 -> v2:
  - First patch of v1 patch series was applied, but new tests for new feature
were added with strict table number check. Update this tests to be table
number-independent.
  - Squash pathes for northd and utilities as tests don't pass without latter.
  - Add support for OVN IC routing table in routes advertisement/learning.
  - Patches `ic: remove port_binding on ts deletion`

https://patchwork.ozlabs.org/project/ovn/patch/20210824184442.35063-1-odiv...@gmail.com/
and `ic: process only local port_bindings`

https://patchwork.ozlabs.org/project/ovn/patch/20210830195707.98529-1-odiv...@gmail.com/
were already sent to list separately, but other changes are based on them
so they're included.
Once those patches are accepts, I can drop them from this series.
  - Added NEWS item.
  - Added myself to authors list.

Vladislav Odintsov (4):
  ic: process only local port_bindings
  northd,utils: support for RouteTables in LRs
  ic: add support for routing tables in adv/learn routes
  ic: don't learn routes which have local GW

 NEWS|   4 +
 ic/ovn-ic.c | 569 ++--
 northd/northd.c | 159 ---
 northd/ovn-northd.8.xml |  63 -
 ovn-ic-sb.ovsschema |   5 +-
 ovn-ic-sb.xml   |  18 ++
 ovn-nb.ovsschema|   5 +-
 ovn-nb.xml  |  30 +++
 tests/ovn-ic.at | 493 ++
 tests/ovn-nbctl.at  | 196 +-
 tests/ovn-northd.at |  76 +-
 tests/ovn.at| 441 ++-
 utilities/ovn-nbctl.c   | 138 +-
 13 files changed, 1934 insertions(+), 263 deletions(-)

-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ovs-ctl: add missing description for --ovs-vswitchd-options and --ovsdb-server-options to usage()

2021-10-05 Thread remijouannet
From: Remi Jouannet 

the options commit by fce20b8b73b1 were missing a description in ovs-ctl help 
message

Fixes: fce20b8b73b1 ("ovs-ctl: Permit to specify additional options")
Signed-off-by: Remi Jouannet 
---
 utilities/ovs-ctl.in | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
index 7180079..e6e07f4 100644
--- a/utilities/ovs-ctl.in
+++ b/utilities/ovs-ctl.in
@@ -421,7 +421,9 @@ Less important options for "start", "restart" and 
"force-reload-kmod":
   --no-force-corefiles   do not force on core dumps for OVS daemons
   --no-mlockall  do not lock all of ovs-vswitchd into memory
   --ovsdb-server-priority=NICE   set ovsdb-server's niceness (default: 
$OVSDB_SERVER_PRIORITY)
+  --ovsdb-server-options=OPTIONS additional options for ovsdb-server (example: 
'-vconsole:dbg -vfile:dbg')
   --ovs-vswitchd-priority=NICE   set ovs-vswitchd's niceness (default: 
$OVS_VSWITCHD_PRIORITY)
+  --ovs-vswitchd-options=OPTIONS additional options for ovs-vswitchd (example: 
'-vconsole:dbg -vfile:dbg')
   --no-full-hostname set short hostname instead of full hostname
   --no-record-hostname   do not attempt to determine/record system
  hostname as part of start command
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6] dpdk: Support running PMD threads on any core.

2021-10-05 Thread Kevin Traynor

On 27/09/2021 15:56, David Marchand wrote:

Previously in OVS, a PMD thread running on cpu X used lcore X.
This assumption limited OVS to run PMD threads on physical cpu <
RTE_MAX_LCORE.

DPDK 20.08 introduced a new API that associates a non-EAL thread to a free
lcore. This new API does not change the thread characteristics (like CPU
affinity) and let OVS run its PMD threads on any cpu regardless of
RTE_MAX_LCORE.

The DPDK multiprocess feature is not compatible with this new API and is
disabled.

DPDK still limits the number of lcores to RTE_MAX_LCORE (128 on x86_64)
which should be enough for OVS pmd threads (hopefully).

DPDK lcore/OVS pmd threads mapping are logged at threads creation and
destruction.
A new command is added to help get DPDK point of view of the DPDK lcores:

$ ovs-appctl dpdk/lcores-list
lcore 0, socket 0, role RTE, cpuset 0
lcore 1, socket 0, role NON_EAL, cpuset 1
lcore 2, socket 0, role NON_EAL, cpuset 15



I did some testing with this patch. A few summary comments below,


Signed-off-by: David Marchand 
---
Changes since v5:
- rebased,
- commitlog tweaks,
- dropped use of global ALLOW_EXPERIMENTAL flag and pinpointed
   experimental API,

Changes since v4:
- rebased on the master branch,
- disabled DPDK mp feature,
- updated DPDK documentation and manual with the new command,
- added notes in NEWS,

Changes since v3:
- rebased on current HEAD,
- switched back to simple warning rather than abort when registering a
   thread fails,

Changes since v2:
- introduced a new api in DPDK 20.08 (still being discussed), inbox thread at
   
http://inbox.dpdk.org/dev/20200610144506.30505-1-david.march...@redhat.com/T/#t
- this current patch depends on a patch on master I sent:
   
https://patchwork.ozlabs.org/project/openvswitch/patch/20200626122738.28163-1-david.march...@redhat.com/
- dropped 'dpdk-lcore-mask' compat handling,

Changes since v1:
- rewired existing configuration 'dpdk-lcore-mask' to use --lcores,
- switched to a bitmap to track lcores,
- added a command to dump current mapping (Flavio): used an experimental
   API to get DPDK lcores cpuset since it is the most reliable/portable
   information,
- used the same code for the logs when starting DPDK/PMD threads,
- addressed Ilya comments,

---
  Documentation/howto/dpdk.rst |  5 
  NEWS |  2 ++
  lib/dpdk-stub.c  |  8 +-
  lib/dpdk-unixctl.man |  2 ++
  lib/dpdk.c   | 50 ++--
  lib/dpdk.h   |  3 ++-
  lib/dpif-netdev.c|  3 ++-
  7 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 70b64881ab..81f236d3bd 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -402,6 +402,11 @@ Supported actions for hardware offload are:
  - Clone/output (tnl_push and output) for encapsulating over a tunnel.
  - Tunnel pop, for packets received on physical ports.
  
+Multiprocess

+
+
+This DPDK feature is not supported and disabled during OVS initialization.
+
  Further Reading
  ---
  
diff --git a/NEWS b/NEWS

index 90f4b15902..dc9709641c 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,8 @@ Post-v2.16.0
 limiting behavior.
   * Add hardware offload support for matching IPv4/IPv6 frag types
 (experimental).
+ * Forbid use of DPDK multiprocess feature.
+ * Add support for running threads on cores >= RTE_MAX_LCORE.
  
  
  v2.16.0 - 16 Aug 2021

diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
index b7d577870d..5bc996b665 100644
--- a/lib/dpdk-stub.c
+++ b/lib/dpdk-stub.c
@@ -39,7 +39,13 @@ dpdk_init(const struct smap *ovs_other_config)
  }
  
  void

-dpdk_set_lcore_id(unsigned cpu OVS_UNUSED)
+dpdk_init_thread_context(unsigned cpu OVS_UNUSED)
+{
+/* Nothing */
+}
+
+void
+dpdk_uninit_thread_context(void)
  {
  /* Nothing */
  }
diff --git a/lib/dpdk-unixctl.man b/lib/dpdk-unixctl.man
index a0d1fa2ea3..cd8a178515 100644
--- a/lib/dpdk-unixctl.man
+++ b/lib/dpdk-unixctl.man
@@ -1,5 +1,7 @@
  .SS "DPDK COMMANDS"
  These commands manage DPDK components.
+.IP "\fBdpdk/lcore-list\fR"
+Lists the DPDK lcores and their cpu affinity.
  .IP "\fBdpdk/log-list\fR"
  Lists all DPDK components that emit logs and their logging levels.
  .IP "\fBdpdk/log-set\fR [\fIspec\fR]"
diff --git a/lib/dpdk.c b/lib/dpdk.c
index b2ef31cd20..686d080a3a 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -470,6 +470,15 @@ dpdk_init__(const struct smap *ovs_other_config)
  return false;
  }
  
+#pragma GCC diagnostic push

+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
+if (!rte_mp_disable()) {
+#pragma GCC diagnostic pop
+VLOG_EMER("Could not disable multiprocess, DPDK won't be available.");
+rte_eal_cleanup();
+return false;
+}
+
  if (VLOG_IS_DBG_ENABLED()) {
  size_t size;
  char *response = NULL;
@@ -489,6 +498,11 @@ 

Re: [ovs-dev] [PATCH] ovs-ctl: add --ovs-vswitchd-options and --ovsdb-server-options to usage()

2021-10-05 Thread Rémi Jouannet

Hello David,

sorry for the basic mistakes it's my first time sending a patch by 
email, thank you for your explanation and your patience


i'm gonna send a new patch soon

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 2/3] northd.c: Lock to protect against possible od->group corruption.

2021-10-05 Thread Han Zhou
On Mon, Oct 4, 2021 at 8:08 PM Numan Siddique  wrote:
>
> On Mon, Oct 4, 2021 at 5:29 AM Anton Ivanov
>  wrote:
> >
> >
> > On 03/10/2021 23:45, Han Zhou wrote:
> > > When parallel build is used, od->group can be updated by threads
outside
> > > of the function do_ovn_lflow_add_pd (for lb related flow building). So
> > > use the function ovn_dp_group_add_with_reference() to update it in
> > > function do_ovn_lflow_add() when it is not a newly created flow.
> > >
> > > Signed-off-by: Han Zhou 
> > > ---
> > >   northd/northd.c | 42 +-
> > >   1 file changed, 21 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/northd/northd.c b/northd/northd.c
> > > index 027c5b170..afd812700 100644
> > > --- a/northd/northd.c
> > > +++ b/northd/northd.c
> > > @@ -4299,6 +4299,26 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct
ovn_datapath *od,
> > >   }
> > >   }
> > >
> > > +static bool
> > > +ovn_dp_group_add_with_reference(struct ovn_lflow *lflow_ref,
> > > +struct ovn_datapath *od)
> > > +OVS_NO_THREAD_SAFETY_ANALYSIS
> > > +{
> > > +if (!use_logical_dp_groups || !lflow_ref) {
> > > +return false;
> > > +}
> > > +
> > > +if (use_parallel_build) {
> > > +ovs_mutex_lock(_ref->odg_lock);
> > > +hmapx_add(_ref->od_group, od);
> > > +ovs_mutex_unlock(_ref->odg_lock);
> > > +} else {
> > > +hmapx_add(_ref->od_group, od);
> > > +}
> > > +
> > > +return true;
> > > +}
> > > +
> > >   /* Adds a row with the specified contents to the Logical_Flow table.
> > >* Version to use with dp_groups + parallel - when locking is
required.
> > >*
> > > @@ -4351,7 +4371,7 @@ do_ovn_lflow_add(struct hmap *lflow_map, struct
ovn_datapath *od,
> > >   old_lflow = ovn_lflow_find(lflow_map, NULL, stage,
priority, match,
> > >  actions, ctrl_meter, hash);
> > >   if (old_lflow) {
> > > -hmapx_add(_lflow->od_group, od);
> > > +ovn_dp_group_add_with_reference(old_lflow, od);
> > >   return old_lflow;
> > >   }
> > >   }
> > > @@ -4466,26 +4486,6 @@ ovn_lflow_add_at(struct hmap *lflow_map,
struct ovn_datapath *od,
> > >  io_port, ctrl_meter, stage_hint,
where, hash);
> > >   }
> > >
> > > -static bool
> > > -ovn_dp_group_add_with_reference(struct ovn_lflow *lflow_ref,
> > > -struct ovn_datapath *od)
> > > -OVS_NO_THREAD_SAFETY_ANALYSIS
> > > -{
> > > -if (!use_logical_dp_groups || !lflow_ref) {
> > > -return false;
> > > -}
> > > -
> > > -if (use_parallel_build) {
> > > -ovs_mutex_lock(_ref->odg_lock);
> > > -hmapx_add(_ref->od_group, od);
> > > -ovs_mutex_unlock(_ref->odg_lock);
> > > -} else {
> > > -hmapx_add(_ref->od_group, od);
> > > -}
> > > -
> > > -return true;
> > > -}
> > > -
> > >   /* Adds a row with the specified contents to the Logical_Flow
table. */
> > >   #define ovn_lflow_add_with_hint__(LFLOW_MAP, OD, STAGE, PRIORITY,
MATCH, \
> > > ACTIONS, IN_OUT_PORT, CTRL_METER,
\
> >
> >
> > +1
> >
>
> Acked-by: Numan Siddique 

Thanks Anton and Numan for the review. I applied the patch 1 and 2 in this
series to main branch and branch-21.09.

>
> Numan
>
> > --
> > Anton R. Ivanov
> > Cambridgegreys Limited. Registered in England. Company Number 10273661
> > https://www.cambridgegreys.com/
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: Add support for NAT with multiple DGP.

2021-10-05 Thread Han Zhou
On Mon, Sep 20, 2021 at 7:30 AM Abhiram Sangana 
wrote:
>
> Currently, if multiple distributed gateway ports (DGP) are configured
> on a logical router, NAT is disabled as part of commit 15348b7b
> (northd: Multiple distributed gateway port support.)
>
> This patch updates the behavior by selectively applying NAT rules at DGPs.
> A NAT rule is applied on matching packets entering or leaving a specific
> DGP only if the external_ip of the rule belongs to the same subnet as the
> DGP.
>
> This patch also updates ovn-nbctl to accept multiple NAT rules of type
> `snat` with the same logical_ip but different external_ip for a logical
> router.
>
> Signed-off-by: Abhiram Sangana 

Thanks Abhiram for adding the NAT support. The approach looks good overall.
Please see my comments below.

> ---
>  NEWS|   1 +
>  northd/northd.c | 155 +++
>  northd/ovn-northd.8.xml |  29 ---
>  ovn-architecture.7.xml  |   6 +-
>  ovn-nb.xml  |   4 +-
>  tests/ovn-nbctl.at  |  40 -
>  tests/ovn-northd.at | 175 +---
>  utilities/ovn-nbctl.c   | 156 ---
>  8 files changed, 473 insertions(+), 93 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 8a21c029e..1fdc65bc0 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -12,6 +12,7 @@ OVN v21.09.0 - xx xxx 
>- Allow static routes without nexthops.
>- Enabled logical dp groups as a default.  CMS should disable it if not
>  desired.
> +  - Support NAT with multiple distributed gateway ports on a logical
router.

Just a reminder that now it should be moved to the "post 21.09" section.

>
>  OVN v21.06.0 - 18 Jun 2021
>  -
> diff --git a/northd/northd.c b/northd/northd.c
> index d1b87891c..21a72d238 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -593,12 +593,11 @@ struct ovn_datapath {
>
>  /* Applies to only logical router datapath.
>   * True if logical router is a gateway router. i.e options:chassis
is set.
> - * If this is true, then 'l3dgw_port' and 'l3redirect_port' will be
> - * ignored. */
> + * If this is true, then 'l3dgw_ports' will be ignored. */
>  bool is_gw_router;
>
> -/* OVN northd only needs to know about the logical router gateway
port for
> - * NAT on a distributed router.  The "distributed gateway ports" are
> +/* OVN northd only needs to know about logical router gateway ports
for
> + * NAT/LB on a distributed router.  The "distributed gateway ports"
are
>   * populated only when there is a gateway chassis or ha chassis group
>   * specified for some of the ports on the logical router. Otherwise
this
>   * will be NULL. */
> @@ -747,16 +746,6 @@ init_nat_entries(struct ovn_datapath *od)
>  return;
>  }
>
> -if (od->n_l3dgw_ports > 1) {
> -static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> -VLOG_WARN_RL(, "NAT is configured on logical router %s, which
has %"
> - PRIuSIZE" distributed gateway ports. NAT is not
supported"
> - " yet when there is more than one distributed
gateway "
> - "port on the router.",
> - od->nbr->name, od->n_l3dgw_ports);
> -return;
> -}
> -
>  od->nat_entries = xmalloc(od->nbr->n_nat * sizeof *od->nat_entries);
>
>  for (size_t i = 0; i < od->nbr->n_nat; i++) {
> @@ -2681,8 +2670,11 @@ get_nat_addresses(const struct ovn_port *op,
size_t *n, bool routable_only)
>  /* Gratuitous ARP for centralized NAT rules on distributed
gateway
>   * ports should be restricted to the gateway chassis. */
>  if (op->od->n_l3dgw_ports) {
> +const struct ovn_port *l3dgw_port = (is_l3dgw_port(op)
> + ? op
> + :
op->od->l3dgw_ports[0]);

I think we shouldn't hardcode to index 0 any more when NAT for multiple
DGPs is supported. In fact, if the code reaches here, is_l3dgw_port(op)
must be true. We should instead use ovs_assert(is_l3dgw_port(op)).

In addition, the addresses collected in this function will be used to send
GARP on the external networks. It would end up sending GARPs for all NAT
external IPs on all the external networks, which is wrong. We should check
here if the NAT external IP matches the l3dgw port's subnet, similar to the
check you added in extract_lrp_networks(). (Same checks needed when adding
support for LB IPs).

>  ds_put_format(_addresses, " is_chassis_resident(%s)",
> -  op->od->l3dgw_ports[0]->cr_port->json_key);
> +  l3dgw_port->cr_port->json_key);
>  }
>
>  addresses[n_nats++] = ds_steal_cstr(_addresses);
> @@ -3274,9 +3266,11 @@ ovn_port_update_sbrec(struct northd_context *ctx,
>  }
>
>  if (op->peer->od->n_l3dgw_ports) 

Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support

2021-10-05 Thread Nicolas Dichtel
Le 01/10/2021 à 22:42, Cpp Code a écrit :
> On Fri, Oct 1, 2021 at 12:21 AM Nicolas Dichtel
>  wrote:
>>
>> Le 30/09/2021 à 18:11, Cpp Code a écrit :
>>> On Wed, Sep 29, 2021 at 6:19 AM Jakub Kicinski  wrote:

 On Wed, 29 Sep 2021 08:19:05 +0200 Nicolas Dichtel wrote:
>> /* Insert a kernel only KEY_ATTR */
>> #define OVS_KEY_ATTR_TUNNEL_INFO__OVS_KEY_ATTR_MAX
>> #undef OVS_KEY_ATTR_MAX
>> #define OVS_KEY_ATTR_MAX__OVS_KEY_ATTR_MAX
> Following the other thread [1], this will break if a new app runs over an 
> old
> kernel.

 Good point.

> Why not simply expose this attribute to userspace and throw an error if a
> userspace app uses it?

 Does it matter if it's exposed or not? Either way the parsing policy
 for attrs coming from user space should have a reject for the value.
 (I say that not having looked at the code, so maybe I shouldn't...)
>>>
>>> To remove some confusion, there are some architectural nuances if we
>>> want to extend code without large refactor.
>>> The ovs_key_attr is defined only in kernel side. Userspace side is
>>> generated from this file. As well the code can be built without kernel
>>> modules.
>>> The code inside OVS repository and net-next is not identical, but I
>>> try to keep some consistency.
>> I didn't get why OVS_KEY_ATTR_TUNNEL_INFO cannot be exposed to userspace.
> 
> OVS_KEY_ATTR_TUNNEL_INFO is compressed version of OVS_KEY_ATTR_TUNNEL
> and for clarity purposes its not exposed to userspace as it will never
> use it.
> I would say it's a coding style as it would not brake anything if exposed.
In fact, it's the best way to keep the compatibility in the long term.
You can define it like this:
OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ip_tunnel_info, reserved for kernel use */

> 
>>
>>>
>>> JFYI This is the file responsible for generating userspace part:
>>> https://github.com/openvswitch/ovs/blob/master/build-aux/extract-odp-netlink-h
>>> This is the how corresponding file for ovs_key_attr looks inside OVS:
>>> https://github.com/openvswitch/ovs/blob/master/datapath/linux/compat/include/linux/openvswitch.h
>>> one can see there are more values than in net-next version.
>> There are still some '#ifdef __KERNEL__'. The standard 'make headers_install'
>> filters them. Why not using this standard mechanism?
> 
> Could you elaborate on this, I don't quite understand the idea!? Which
> ifdef you are referring, the one along OVS_KEY_ATTR_TUNNEL_INFO or
> some other?
My understanding is that this file is used for the userland third party, thus,
theoretically, there should be no '#ifdef __KERNEL__'. uapi headers generated
with 'make headers_install' are filtered to remove them.

> 
>>
>> In this file, there are two attributes (OVS_KEY_ATTR_PACKET_TYPE and
>> OVS_KEY_ATTR_ND_EXTENSIONS) that doesn't exist in the kernel.
>> This will also breaks if an old app runs over a new kernel. I don't see how 
>> it
>> is possible to keep the compat between {old|new} {kernel|app}.
> 
> Looks like this most likely is a bug while working on multiple
> versions of code.  Need to do add more padding.
As said above, just define the same uapi for everybody and the problem is gone
forever.


Regards,
Nicolas
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev