Re: [ovs-dev] [PATCH v15 0/7] Add offload support for sFlow

2021-09-21 Thread Chris Mi via dev

Hi Eelco,

That's ok. Please review it according to your plan. 

Thanks,
Chris

On 9/21/2021 4:12 PM, Eelco Chaudron wrote:

Hi Chris,

Just a quick update, I did see your responses to v14 and I also noticed you 
send out a v15. I planned to review it this week, but due to some other 
unforeseen stuff, I have to move it to next week (if nothing more is going to 
mess up my plan ;)

Cheers,

Eelco


On 15 Sep 2021, at 14:43, Chris Mi wrote:


This patch set adds offload support for sFlow.

Psample is a genetlink channel for packet sampling. TC action act_sample
uses psample to send sampled packets to userspace.

When offloading sample action to TC, userspace creates a unique ID to
map sFlow action and tunnel info and passes this ID to kernel instead
of the sFlow info. psample will send this ID and sampled packet to
userspace. Using the ID, userspace can recover the sFlow info and send
sampled packet to the right sFlow monitoring host.

v2-v1:
- Fix robot errors.
v3-v2:
- Remove Gerrit Change-Id.
- Add patch #9 to fix older kernels build issue.
- Add travis test result.
v4-v3:
- Fix offload issue when sampling rate is 1.
v5-v4:
- Move polling thread from ofproto to netdev-offload-tc.
v6-v5:
- Rebase.
- Add GitHub Actions test result.
v7-v6:
- Remove Gerrit Change-Id.
- Fix "ERROR: Inappropriate spacing around cast"
v8-v7
- Address Eelco Chaudron's comment for patch #11.
v9-v8
- Remove sflow_len from struct dpif_sflow_attr.
- Log a debug message for other userspace actions.
v10-v9
- Address Eelco Chaudron's comments on v9.
v11-v10
- Fix a bracing error.
v12-v11
- Add duplicate sample group id check.
v13-v12
- Remove the psample poll thread from netdev-offload-tc and reuse
   ofproto handler thread according to Ilya's new desgin.
- Add dpif-offload-provider layer according to Eli's suggestion.
v14-v13
- Fix a robot error.
v15-v14
- Address Eelco Chaudron's comments on v14.

Chris Mi (7):
   compat: Add psample and tc sample action defines for older kernels
   ovs-kmod-ctl: Load kernel module psample
   dpif-offload-provider: Introduce dpif-offload-provider layer
   netdev-offload-tc: Introduce group ID management API
   dpif-offload-netlink: Implement dpif-offload-provider API
   ofproto: Introduce API to process sFlow offload packet
   netdev-offload-tc: Add offload support for sFlow

  NEWS |   1 +
  include/linux/automake.mk|   4 +-
  include/linux/psample.h  |  62 +
  include/linux/tc_act/tc_sample.h |  25 ++
  lib/automake.mk  |   3 +
  lib/dpif-netdev.c|   1 +
  lib/dpif-netlink.c   |   2 +
  lib/dpif-offload-netlink.c   | 208 ++
  lib/dpif-offload-provider.h  |  75 +
  lib/dpif-offload.c   |  43 +++
  lib/dpif-provider.h  |   8 +-
  lib/dpif.c   |  10 +
  lib/netdev-offload-tc.c  | 459 +--
  lib/netdev-offload.h |   1 +
  lib/tc.c |  61 +++-
  lib/tc.h |  16 +-
  ofproto/ofproto-dpif-upcall.c|  63 +
  utilities/ovs-kmod-ctl.in|  14 +
  18 files changed, 1030 insertions(+), 26 deletions(-)
  create mode 100644 include/linux/psample.h
  create mode 100644 include/linux/tc_act/tc_sample.h
  create mode 100644 lib/dpif-offload-netlink.c
  create mode 100644 lib/dpif-offload-provider.h
  create mode 100644 lib/dpif-offload.c

--
2.27.0


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: Fix multicast relay when DGP are configured.

2021-09-21 Thread Numan Siddique
On Tue, Sep 21, 2021 at 1:50 PM Han Zhou  wrote:
>
> On Tue, Sep 21, 2021 at 6:50 AM Dumitru Ceara  wrote:
> >
> > IP multicast relay didn't work properly if traffic had to be forwarded
> > across a distributed gateway port in the router pipeline.  That is
> > because the multicast_group used as output logical port is expanded in
> > the egress pipeline, way after 'lr_in_gw_redirect' where unicast traffic
> > would normally be forwarded to the chassis-redirect port.
> >
> > In order to achieve the same behavior for IP multicast routed traffic we
> > now store the chassis-redirect port binding in the multicast_group on
> > which IP multicast is routed.  On the remote hypervisor, to make sure
> > traffic is delivered to the correct destination switch pipeline, we make
> > sure that ovn-controller translates chassis-redirect ports from
> > multicast groups to the logical patch ports they were created from.
> >
> > This patch also adds a test to simulate the ovn-kubernetes IP multicast
> > use case (where this issue was first observed).
> >
> > Fixes: 5d1527b11e94 ("ovn-northd: Add IGMP Relay support")
> > Reported-by: Alexander Constantinescu 
> > Reported-at: https://bugzilla.redhat.com/2006306
> > Signed-off-by: Dumitru Ceara 
> > ---
> >  controller/physical.c |  28 -
> >  northd/lrouter.dl |  14 ++-
> >  northd/multicast.dl   |  23 +++-
> >  northd/northd.c   |   7 +-
> >  northd/ovn_northd.dl  |  12 +-
> >  tests/ovn.at  | 248 ++
> >  6 files changed, 314 insertions(+), 18 deletions(-)
> >
> > diff --git a/controller/physical.c b/controller/physical.c
> > index ffb9f9952..0cfb158c8 100644
> > --- a/controller/physical.c
> > +++ b/controller/physical.c
> > @@ -1373,7 +1373,8 @@ out:
> >  }
> >
> >  static void
> > -consider_mc_group(enum mf_field_id mff_ovn_geneve,
> > +consider_mc_group(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> > +  enum mf_field_id mff_ovn_geneve,
> >const struct simap *ct_zones,
> >const struct hmap *local_datapaths,
> >struct shash *local_bindings,
> > @@ -1406,6 +1407,10 @@ consider_mc_group(enum mf_field_id mff_ovn_geneve,
> >   *  instead.  (If we put them in 'ofpacts', then the output
> >   *  would happen on every hypervisor in the multicast group,
> >   *  effectively duplicating the packet.)
> > + *
> > + *- For chassisredirect ports, add actions to 'ofpacts' to
> > + *  set the output port to be the router patch port for which
> > + *  the redirect port was added.
> >   */
> >  struct ofpbuf ofpacts;
> >  ofpbuf_init(, 0);
> > @@ -1440,6 +1445,21 @@ consider_mc_group(enum mf_field_id mff_ovn_geneve,
> > && port->chassis == chassis)) {
> >  put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, );
> >  put_resubmit(OFTABLE_CHECK_LOOPBACK, );
> > +} else if (!strcmp(port->type, "chassisredirect")
> > +   && port->chassis == chassis) {
> > +const char *distributed_port = smap_get(>options,
> > +"distributed-port");
> > +if (distributed_port) {
> > +const struct sbrec_port_binding *distributed_binding
> > += lport_lookup_by_name(sbrec_port_binding_by_name,
> > +   distributed_port);
> > +if (distributed_binding
> > +&& port->datapath == distributed_binding->datapath) {
> > +put_load(distributed_binding->tunnel_key,
> MFF_LOG_OUTPORT,
> > + 0, 32, );
> > +put_resubmit(OFTABLE_CHECK_LOOPBACK, );
> > +}
> > +}
> >  } else if (port->chassis && !get_localnet_port(
> >  local_datapaths, mc->datapath->tunnel_key)) {
> >  /* Add remote chassis only when localnet port not exist,
> > @@ -1574,7 +1594,8 @@ physical_handle_mc_group_changes(struct
> physical_ctx *p_ctx,
> >  if (!sbrec_multicast_group_is_new(mc)) {
> >  ofctrl_remove_flows(flow_table, >header_.uuid);
> >  }
> > -consider_mc_group(p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
> > +consider_mc_group(p_ctx->sbrec_port_binding_by_name,
> > +  p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
> >p_ctx->local_datapaths,
> p_ctx->local_bindings,
> >p_ctx->patch_ofports,
> >p_ctx->chassis, mc,
> > @@ -1617,7 +1638,8 @@ physical_run(struct physical_ctx *p_ctx,
> >  /* Handle output to multicast groups, in tables 32 and 33. */
> >  const struct sbrec_multicast_group *mc;
> >  SBREC_MULTICAST_GROUP_TABLE_FOR_EACH (mc, p_ctx->mc_group_table) {
> > -

Re: [ovs-dev] [PATCH ovn v2] northd: Update the probe interval in main loop.

2021-09-21 Thread Han Zhou
On Tue, Sep 21, 2021 at 11:08 AM Zhen Wang  wrote:
>
> From: zhen wang 
>
> When ovn-northd work in HA mode, ovn-northd will not update the
> probe interval in standby mode. If SB/NB raft leader and active
> ovn-northd instance got killed by system power outage, standby
> ovn-northd instance would never detect the failure.
> This patch address the problem by updating the probe value in main loop.
>
> Signed-off-by: zhen wang 

Thanks Zhen. I applied this fix to master, branch-21.09, 21.06 and 21.03.

For master and branch-21.09, I made a minor adjustment to the commit
message:

When ovn-northd work in HA mode, ovn-northd will not update the
probe interval in standby mode. This patch address the problem by
updating the probe value in main loop.

I removed the sentence that describes the HA impact because on branch-21.09
and master after the commit 520d5ceda3 that split northd.c, the behavior
changed. Although standby still won't get the probe interval updated, the
impact is different because the probe interval for standby won't be 0 (but
instead it will be the default 5s). The HA impact is valid for older
branches so I kept the message as is for them. Regardless of the commit
message, the actual fixes are essentially the same for all branches.

Thanks,
Han
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] Enforce datapath and port key constraints in vxlan mode

2021-09-21 Thread Ihar Hrachyshka
With vxlan enabled for in-cluster communication, the ranges available
for port and datapath keys are limited to 12 bits (including multigroup
keys). (The default range is 16 bit long.)

This means that OVN should avoid allocating, or allowing to request,
tunnel keys for datapaths and ports that are equal or higher than
2 << 11. This was not enforced before, and this patch adds the missing
enforcement rules.

Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Ihar Hrachyshka 
---
 northd/ovn-northd.c  | 53 +++-
 northd/ovn_northd.dl | 32 +-
 tests/ovn.at | 49 
 3 files changed, 108 insertions(+), 26 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index baaddb73e..644f04aa5 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -1370,7 +1370,8 @@ ovn_datapath_allocate_key(struct northd_context *ctx,
 }
 
 static void
-ovn_datapath_assign_requested_tnl_id(struct hmap *dp_tnlids,
+ovn_datapath_assign_requested_tnl_id(struct northd_context *ctx,
+ struct hmap *dp_tnlids,
  struct ovn_datapath *od)
 {
 const struct smap *other_config = (od->nbs
@@ -1378,6 +1379,13 @@ ovn_datapath_assign_requested_tnl_id(struct hmap 
*dp_tnlids,
: >nbr->options);
 uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key", 0);
 if (tunnel_key) {
+if (is_vxlan_mode(ctx->ovnsb_idl) && tunnel_key >= 1 << 12) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(, "Tunnel key %"PRIu32" for datapath %s is "
+ "incompatible with VXLAN", tunnel_key,
+ od->nbs ? od->nbs->name : od->nbr->name);
+return;
+}
 if (ovn_add_tnlid(dp_tnlids, tunnel_key)) {
 od->tunnel_key = tunnel_key;
 } else {
@@ -1407,10 +1415,10 @@ build_datapaths(struct northd_context *ctx, struct hmap 
*datapaths,
 struct hmap dp_tnlids = HMAP_INITIALIZER(_tnlids);
 struct ovn_datapath *od, *next;
 LIST_FOR_EACH (od, list, ) {
-ovn_datapath_assign_requested_tnl_id(_tnlids, od);
+ovn_datapath_assign_requested_tnl_id(ctx, _tnlids, od);
 }
 LIST_FOR_EACH (od, list, _only) {
-ovn_datapath_assign_requested_tnl_id(_tnlids, od);
+ovn_datapath_assign_requested_tnl_id(ctx, _tnlids, od);
 }
 
 /* Keep nonconflicting tunnel IDs that are already assigned. */
@@ -3815,27 +3823,40 @@ ovn_port_add_tnlid(struct ovn_port *op, uint32_t 
tunnel_key)
 }
 
 static void
-ovn_port_assign_requested_tnl_id(struct ovn_port *op)
+ovn_port_assign_requested_tnl_id(struct northd_context *ctx,
+ struct ovn_port *op)
 {
 const struct smap *options = (op->nbsp
   ? >nbsp->options
   : >nbrp->options);
 uint32_t tunnel_key = smap_get_int(options, "requested-tnl-key", 0);
-if (tunnel_key && !ovn_port_add_tnlid(op, tunnel_key)) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-VLOG_WARN_RL(, "Logical %s port %s requests same tunnel key "
- "%"PRIu32" as another LSP or LRP",
- op->nbsp ? "switch" : "router",
- op_get_name(op), tunnel_key);
+if (tunnel_key) {
+if (is_vxlan_mode(ctx->ovnsb_idl) &&
+tunnel_key >= OVN_VXLAN_MIN_MULTICAST) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(, "Tunnel key %"PRIu32" for port %s "
+ "is incompatible with VXLAN",
+ tunnel_key, op_get_name(op));
+return;
+}
+if (!ovn_port_add_tnlid(op, tunnel_key)) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(, "Logical %s port %s requests same tunnel key "
+ "%"PRIu32" as another LSP or LRP",
+ op->nbsp ? "switch" : "router",
+ op_get_name(op), tunnel_key);
+}
 }
 }
 
 static void
-ovn_port_allocate_key(struct hmap *ports, struct ovn_port *op)
+ovn_port_allocate_key(struct northd_context *ctx, struct hmap *ports,
+  struct ovn_port *op)
 {
 if (!op->tunnel_key) {
+uint8_t key_bits = is_vxlan_mode(ctx->ovnsb_idl)? 12 : 16;
 op->tunnel_key = ovn_allocate_tnlid(>od->port_tnlids, "port",
-1, (1u << 15) - 1,
+1, (1u << (key_bits - 1)) - 1,
 >od->port_key_hint);
 if (!op->tunnel_key) {
 if (op->sb) {
@@ -3875,10 +3896,10 @@ build_ports(struct 

[ovs-dev] [PATCH ovn] Fix basic multicast flows for vxlan (non-vtep) tunnels

2021-09-21 Thread Ihar Hrachyshka
The 15-bit port key range used for multicast groups can't be covered
by 12-bit key space available for port keys in VXLAN. To make
multicast keys work, we have to transform 16-bit multicast port keys
to 12-bit keys before fanning out packets through VXLAN tunnels.
Otherwise significant bits are not retained, and multicast / broadcast
traffic does not reach ports located on other chassis.

This patch introduces a mapping scheme between core 16-bit multicast
port keys and 12-bit key range available in VXLAN. The scheme is as
follows:

1) Before sending a packet through VXLAN tunnel, the most significant
   bit of a 16-bit port key is copied into the most significant bit of
   12-bit VXLAN key. The 11 least significant bits of a 16-bit port
   key are copied to the least significant bits of 12-bit VXLAN key.

2) When receiving a packet through VXLAN tunnel, the most significant
   bit of a VXLAN 12-bit port key is copied into the most significant
   bit of 16-bit port key used in core. The 11 least significant bits
   of a VXLAN 12-bit port key are copied into the least significant
   bits of a 16-bit port key used in core.

This change also implies that the range available for multicast port
keys is more limited and fits into 11-bit space. The same rule should
be enforced for unicast port keys, like we already do for tunnel keys
when a VXLAN encap is present in a cluster. This enforcement is
implied here but missing in master and will be implemented in a
separate patch. (The missing enforcement is an oversight of the
original patch that added support for VXLAN tunnels.)

Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Ihar Hrachyshka 
---
 controller-vtep/gateway.c |   2 +
 controller/physical.c | 101 ++
 lib/mcast-group-index.h   |  15 ++
 tests/ovn.at  |  23 +++--
 4 files changed, 116 insertions(+), 25 deletions(-)

diff --git a/controller-vtep/gateway.c b/controller-vtep/gateway.c
index e9419138b..288772dc4 100644
--- a/controller-vtep/gateway.c
+++ b/controller-vtep/gateway.c
@@ -61,6 +61,8 @@ create_chassis_rec(struct ovsdb_idl_txn *txn, const char 
*name,
 sbrec_encap_set_options(encap_rec, );
 sbrec_encap_set_chassis_name(encap_rec, name);
 sbrec_chassis_set_encaps(chassis_rec, _rec, 1);
+const struct smap oc = SMAP_CONST1(, "is-vtep", "true");
+sbrec_chassis_set_other_config(chassis_rec, );
 
 return chassis_rec;
 }
diff --git a/controller/physical.c b/controller/physical.c
index 6f2c1cea0..aa2942dd4 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -37,6 +37,7 @@
 #include "openvswitch/ofp-parse.h"
 #include "ovn-controller.h"
 #include "lib/chassis-index.h"
+#include "lib/mcast-group-index.h"
 #include "lib/ovn-sb-idl.h"
 #include "lib/ovn-util.h"
 #include "physical.h"
@@ -63,6 +64,7 @@ static void
 load_logical_ingress_metadata(const struct sbrec_port_binding *binding,
   const struct zone_ids *zone_ids,
   struct ofpbuf *ofpacts_p);
+static int64_t get_vxlan_port_key(int64_t port_key);
 
 /* UUID to identify OF flows not associated with ovsdb rows. */
 static struct uuid *hc_uuid = NULL;
@@ -160,8 +162,9 @@ put_encapsulation(enum mf_field_id mff_ovn_geneve,
 } else if (tun->type == VXLAN) {
 uint64_t vni = datapath->tunnel_key;
 if (!is_ramp_switch) {
-/* Only some bits are used for regular tunnels. */
-vni |= (uint64_t) outport << 12;
+/* Map southbound 16-bit port key to limited 12-bit range
+ * available for VXLAN, which differs for multicast groups. */
+vni |= get_vxlan_port_key(outport) << 12;
 }
 put_load(vni, MFF_TUN_ID, 0, 24, ofpacts);
 } else {
@@ -1372,6 +1375,43 @@ out:
 }
 }
 
+static int64_t
+get_vxlan_port_key(int64_t port_key)
+{
+if (port_key >= OVN_MIN_MULTICAST) {
+/* 0b1<11 least significant bits> */
+return OVN_VXLAN_MIN_MULTICAST |
+(port_key & (OVN_VXLAN_MIN_MULTICAST - 1));
+}
+return port_key;
+}
+
+static void
+fanout_to_chassis(enum mf_field_id mff_ovn_geneve,
+  struct sset *remote_chassis,
+  const struct hmap *chassis_tunnels,
+  const struct sbrec_datapath_binding *datapath,
+  uint16_t outport, bool is_ramp_switch,
+  struct ofpbuf *remote_ofpacts)
+{
+const char *chassis_name;
+const struct chassis_tunnel *prev = NULL;
+SSET_FOR_EACH (chassis_name, remote_chassis) {
+const struct chassis_tunnel *tun
+= chassis_tunnel_find(chassis_tunnels, chassis_name, NULL);
+if (!tun) {
+continue;
+}
+
+if (!prev || tun->type != prev->type) {
+put_encapsulation(mff_ovn_geneve, tun, datapath,
+  outport, is_ramp_switch, remote_ofpacts);
+prev = 

Re: [ovs-dev] [PATCH 2/2] openvswitch: Fix condition check in output_userspace() by using nla_ok()

2021-09-21 Thread Pravin Shelar
On Fri, Sep 17, 2021 at 1:25 AM Jiasheng Jiang  wrote:
>
> Just using 'rem > 0' might be unsafe, so it's better
> to use the nla_ok() instead.
> Because we can see from the nla_next() that
> '*remaining' might be smaller than 'totlen'. And nla_ok()
> will avoid it happening.
> For example, ovs_dp_process_packet() -> ovs_execute_actions()
> -> do_execute_actions() -> output_userspace(), and attr comes
> from OVS_CB(skb)->input_vport,which restores the received packet
> from the user space.
>
> Fixes: ccb1352e76cff0524e7ccb2074826a092dd13016
> ('net: Add Open vSwitch kernel components.')
> Signed-off-by: Jiasheng Jiang 

> ---
>  net/openvswitch/actions.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index c23537f..e8236dd 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -915,8 +915,7 @@ static int output_userspace(struct datapath *dp, struct 
> sk_buff *skb,
> upcall.cmd = OVS_PACKET_CMD_ACTION;
> upcall.mru = OVS_CB(skb)->mru;
>
> -   for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> -a = nla_next(a, )) {
> +   nla_for_each_nested(a, attr, rem) {
> switch (nla_type(a)) {
> case OVS_USERSPACE_ATTR_USERDATA:
> upcall.userdata = a;

These nl-attributes are built and verified at time of OVS flow
install, so the rest of checks in nla_ok, is not required.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-controller: Allow specifying tos option for tunnel interface

2021-09-21 Thread Han Zhou
On Tue, Sep 21, 2021 at 9:24 AM  wrote:
>
> From: Venugopal Iyer 
>
> Currently, OVN tunnel interface supports the csum option along
> with remote_ip and key. There are applications (e.g. RoCE) that rely
> on setting the DSCP bits and expect it to be moved to the outer/
> tunnel header as well.
>
> This commit adds an "ovn-encap-tos" external-id that can be used to
> set the tos option  on the OVS tunnel interface, using:
>
> ovs-vsctl set Open_vSwitch . external_ids:ovn-encap-tos=inherit
>
> Tested by setting the external_id (as above) and checking the geneve
> interfaces created, e.g:
>
> options : {csum="true", key=flow, remote_ip="X.X.X.X",
tos=inherit}
>
> Also, added a simple test case to make sure the tos option is carried to
the
> tunnel interface when set.
>
> Signed-off-by: venu iyer (venugop...@nvidia.com)

Thanks Venu for the patch. There are some checkpatch failures due to the
email format and line length. The patch looks good overall. Please find
some minor comments below regarding documentation and tests.

> ---
>  controller/encaps.c | 25 ++
>  controller/encaps.h |  1 +
>  controller/ovn-controller.8.xml |  7 +
>  controller/ovn-controller.c |  1 +
>  tests/ovn-controller.at | 45 +
>  5 files changed, 74 insertions(+), 5 deletions(-)
>
> diff --git a/controller/encaps.c b/controller/encaps.c
> index fc93bf1ee..da24448f5 100644
> --- a/controller/encaps.c
> +++ b/controller/encaps.c
> @@ -152,7 +152,8 @@ encaps_tunnel_id_match(const char *tunnel_id, const
char *chassis_id,
>
>  static void
>  tunnel_add(struct tunnel_ctx *tc, const struct sbrec_sb_global *sbg,
> -   const char *new_chassis_id, const struct sbrec_encap *encap)
> +   const char *new_chassis_id, const struct sbrec_encap *encap,
> +   const struct ovsrec_open_vswitch_table *ovs_table)
>  {
>  struct smap options = SMAP_INITIALIZER();
>  smap_add(, "remote_ip", encap->ip);
> @@ -202,6 +203,18 @@ tunnel_add(struct tunnel_ctx *tc, const struct
sbrec_sb_global *sbg,
>  smap_add(, "remote_name", new_chassis_id);
>  }
>
> +const struct ovsrec_open_vswitch *cfg =
> +ovsrec_open_vswitch_table_first(ovs_table);
> +/* If the tos option is configured, get it */
> +if (cfg) {
> +const char *encap_tos = smap_get_def(>external_ids,
> +   "ovn-encap-tos", "none");
> +
> +if (encap_tos && strcmp(encap_tos, "none")) {
> +smap_add(, "tos", encap_tos);
> +}
> +}
> +
>  /* If there's an existing chassis record that does not need any
change,
>   * keep it.  Otherwise, create a new record (if there was an existing
>   * record, the new record will supplant it and encaps_run() will
delete
> @@ -270,7 +283,8 @@ preferred_encap(const struct sbrec_chassis
*chassis_rec)
>   * as there are VTEP of that type (differentiated by remote_ip) on that
chassis.
>   */
>  static int
> -chassis_tunnel_add(const struct sbrec_chassis *chassis_rec, const struct
sbrec_sb_global *sbg, struct tunnel_ctx *tc)
> +chassis_tunnel_add(const struct sbrec_chassis *chassis_rec, const struct
sbrec_sb_global *sbg,
> +   const struct ovsrec_open_vswitch_table *ovs_table,
struct tunnel_ctx *tc)
>  {
>  struct sbrec_encap *encap = preferred_encap(chassis_rec);
>  int tuncnt = 0;
> @@ -286,7 +300,7 @@ chassis_tunnel_add(const struct sbrec_chassis
*chassis_rec, const struct sbrec_s
>  if (tun_type != pref_type) {
>  continue;
>  }
> -tunnel_add(tc, sbg, chassis_rec->name, chassis_rec->encaps[i]);
> +tunnel_add(tc, sbg, chassis_rec->name, chassis_rec->encaps[i],
ovs_table);
>  tuncnt++;
>  }
>  return tuncnt;
> @@ -316,11 +330,12 @@ chassis_tzones_overlap(const struct sset
*transport_zones,
>
>  void
>  encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
> -   const struct ovsrec_bridge_table *bridge_table,
> +  const struct ovsrec_bridge_table *bridge_table,
> const struct ovsrec_bridge *br_int,
> const struct sbrec_chassis_table *chassis_table,
> const struct sbrec_chassis *this_chassis,
> const struct sbrec_sb_global *sbg,
> +   const struct ovsrec_open_vswitch_table *ovs_table,
> const struct sset *transport_zones)
>  {
>  if (!ovs_idl_txn || !br_int) {
> @@ -390,7 +405,7 @@ encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
>  continue;
>  }
>
> -if (chassis_tunnel_add(chassis_rec, sbg, ) == 0) {
> +if (chassis_tunnel_add(chassis_rec, sbg, ovs_table, ) ==
0) {
>  VLOG_INFO("Creating encap for '%s' failed",
chassis_rec->name);
>  continue;
>  }
> diff --git a/controller/encaps.h b/controller/encaps.h
> index f488393c4..25d44b034 100644
> --- a/controller/encaps.h
> +++ b/controller/encaps.h
> @@ -35,6 

[ovs-dev] [PATCH ovn v2] northd: Update the probe interval in main loop.

2021-09-21 Thread Zhen Wang via dev
From: zhen wang 

When ovn-northd work in HA mode, ovn-northd will not update the
probe interval in standby mode. If SB/NB raft leader and active
ovn-northd instance got killed by system power outage, standby
ovn-northd instance would never detect the failure.
This patch address the problem by updating the probe value in main loop.

Signed-off-by: zhen wang 
---
 northd/northd.c | 25 -
 northd/ovn-northd.c | 30 ++
 2 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 621e83175..91635b93b 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -73,10 +73,6 @@ static struct eth_addr svc_monitor_mac_ea;
  * Otherwise, it will avoid using it.  The default is true. */
 static bool use_ct_inv_match = true;
 
-/* Default probe interval for NB and SB DB connections. */
-#define DEFAULT_PROBE_INTERVAL_MSEC 5000
-static int northd_probe_interval_nb = 0;
-static int northd_probe_interval_sb = 0;
 #define MAX_OVN_TAGS 4096
 
 /* Pipeline stages. */
@@ -14190,20 +14186,6 @@ build_meter_groups(struct northd_context *ctx,
 }
 }
 
-static int
-get_probe_interval(const char *db, const struct nbrec_nb_global *nb)
-{
-int default_interval = (db && !stream_or_pstream_needs_probes(db)
-? 0 : DEFAULT_PROBE_INTERVAL_MSEC);
-int interval = smap_get_int(>options,
-"northd_probe_interval", default_interval);
-
-if (interval > 0 && interval < 1000) {
-interval = 1000;
-}
-return interval;
-}
-
 static void
 ovnnb_db_run(struct northd_context *ctx,
  struct ovsdb_idl_index *sbrec_chassis_by_name,
@@ -14290,13 +14272,6 @@ ovnnb_db_run(struct northd_context *ctx,
 
 smap_destroy();
 
-/* Update the probe interval. */
-northd_probe_interval_nb = get_probe_interval(ctx->ovnnb_db, nb);
-northd_probe_interval_sb = get_probe_interval(ctx->ovnsb_db, nb);
-
-ovsdb_idl_set_probe_interval(ctx->ovnnb_idl, northd_probe_interval_nb);
-ovsdb_idl_set_probe_interval(ctx->ovnsb_idl, northd_probe_interval_sb);
-
 use_parallel_build =
 (smap_get_bool(>options, "use_parallel_build", false) &&
  can_parallelize_hashes(false));
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 42c0ad644..39aa96055 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -65,6 +65,10 @@ static const char *ssl_private_key_file;
 static const char *ssl_certificate_file;
 static const char *ssl_ca_cert_file;
 
+/* Default probe interval for NB and SB DB connections. */
+#define DEFAULT_PROBE_INTERVAL_MSEC 5000
+static int northd_probe_interval_nb = 0;
+static int northd_probe_interval_sb = 0;
 static bool use_parallel_build = true;
 
 static const char *rbac_chassis_auth[] =
@@ -576,6 +580,20 @@ update_ssl_config(void)
 }
 }
 
+static int
+get_probe_interval(const char *db, const struct nbrec_nb_global *nb)
+{
+int default_interval = (db && !stream_or_pstream_needs_probes(db)
+? 0 : DEFAULT_PROBE_INTERVAL_MSEC);
+int interval = smap_get_int(>options,
+"northd_probe_interval", default_interval);
+
+if (interval > 0 && interval < 1000) {
+interval = 1000;
+}
+return interval;
+}
+
 int
 main(int argc, char *argv[])
 {
@@ -997,6 +1015,18 @@ main(int argc, char *argv[])
 poll_immediate_wake();
 }
 
+const struct nbrec_nb_global *nb =
+nbrec_nb_global_first(ovnnb_idl_loop.idl);
+/* Update the probe interval. */
+if (nb) {
+northd_probe_interval_nb = get_probe_interval(ovnnb_db, nb);
+northd_probe_interval_sb = get_probe_interval(ovnsb_db, nb);
+}
+ovsdb_idl_set_probe_interval(ovnnb_idl_loop.idl,
+ northd_probe_interval_nb);
+ovsdb_idl_set_probe_interval(ovnsb_idl_loop.idl,
+ northd_probe_interval_sb);
+
 if (reset_ovnsb_idl_min_index) {
 VLOG_INFO("Resetting southbound database cluster state");
 ovsdb_idl_reset_min_index(ovnsb_idl_loop.idl);
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: Fix multicast relay when DGP are configured.

2021-09-21 Thread Han Zhou
On Tue, Sep 21, 2021 at 6:50 AM Dumitru Ceara  wrote:
>
> IP multicast relay didn't work properly if traffic had to be forwarded
> across a distributed gateway port in the router pipeline.  That is
> because the multicast_group used as output logical port is expanded in
> the egress pipeline, way after 'lr_in_gw_redirect' where unicast traffic
> would normally be forwarded to the chassis-redirect port.
>
> In order to achieve the same behavior for IP multicast routed traffic we
> now store the chassis-redirect port binding in the multicast_group on
> which IP multicast is routed.  On the remote hypervisor, to make sure
> traffic is delivered to the correct destination switch pipeline, we make
> sure that ovn-controller translates chassis-redirect ports from
> multicast groups to the logical patch ports they were created from.
>
> This patch also adds a test to simulate the ovn-kubernetes IP multicast
> use case (where this issue was first observed).
>
> Fixes: 5d1527b11e94 ("ovn-northd: Add IGMP Relay support")
> Reported-by: Alexander Constantinescu 
> Reported-at: https://bugzilla.redhat.com/2006306
> Signed-off-by: Dumitru Ceara 
> ---
>  controller/physical.c |  28 -
>  northd/lrouter.dl |  14 ++-
>  northd/multicast.dl   |  23 +++-
>  northd/northd.c   |   7 +-
>  northd/ovn_northd.dl  |  12 +-
>  tests/ovn.at  | 248 ++
>  6 files changed, 314 insertions(+), 18 deletions(-)
>
> diff --git a/controller/physical.c b/controller/physical.c
> index ffb9f9952..0cfb158c8 100644
> --- a/controller/physical.c
> +++ b/controller/physical.c
> @@ -1373,7 +1373,8 @@ out:
>  }
>
>  static void
> -consider_mc_group(enum mf_field_id mff_ovn_geneve,
> +consider_mc_group(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> +  enum mf_field_id mff_ovn_geneve,
>const struct simap *ct_zones,
>const struct hmap *local_datapaths,
>struct shash *local_bindings,
> @@ -1406,6 +1407,10 @@ consider_mc_group(enum mf_field_id mff_ovn_geneve,
>   *  instead.  (If we put them in 'ofpacts', then the output
>   *  would happen on every hypervisor in the multicast group,
>   *  effectively duplicating the packet.)
> + *
> + *- For chassisredirect ports, add actions to 'ofpacts' to
> + *  set the output port to be the router patch port for which
> + *  the redirect port was added.
>   */
>  struct ofpbuf ofpacts;
>  ofpbuf_init(, 0);
> @@ -1440,6 +1445,21 @@ consider_mc_group(enum mf_field_id mff_ovn_geneve,
> && port->chassis == chassis)) {
>  put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, );
>  put_resubmit(OFTABLE_CHECK_LOOPBACK, );
> +} else if (!strcmp(port->type, "chassisredirect")
> +   && port->chassis == chassis) {
> +const char *distributed_port = smap_get(>options,
> +"distributed-port");
> +if (distributed_port) {
> +const struct sbrec_port_binding *distributed_binding
> += lport_lookup_by_name(sbrec_port_binding_by_name,
> +   distributed_port);
> +if (distributed_binding
> +&& port->datapath == distributed_binding->datapath) {
> +put_load(distributed_binding->tunnel_key,
MFF_LOG_OUTPORT,
> + 0, 32, );
> +put_resubmit(OFTABLE_CHECK_LOOPBACK, );
> +}
> +}
>  } else if (port->chassis && !get_localnet_port(
>  local_datapaths, mc->datapath->tunnel_key)) {
>  /* Add remote chassis only when localnet port not exist,
> @@ -1574,7 +1594,8 @@ physical_handle_mc_group_changes(struct
physical_ctx *p_ctx,
>  if (!sbrec_multicast_group_is_new(mc)) {
>  ofctrl_remove_flows(flow_table, >header_.uuid);
>  }
> -consider_mc_group(p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
> +consider_mc_group(p_ctx->sbrec_port_binding_by_name,
> +  p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
>p_ctx->local_datapaths,
p_ctx->local_bindings,
>p_ctx->patch_ofports,
>p_ctx->chassis, mc,
> @@ -1617,7 +1638,8 @@ physical_run(struct physical_ctx *p_ctx,
>  /* Handle output to multicast groups, in tables 32 and 33. */
>  const struct sbrec_multicast_group *mc;
>  SBREC_MULTICAST_GROUP_TABLE_FOR_EACH (mc, p_ctx->mc_group_table) {
> -consider_mc_group(p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
> +consider_mc_group(p_ctx->sbrec_port_binding_by_name,
> +  p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
>p_ctx->local_datapaths, 

Re: [ovs-dev] [PATCH net-next v5] net: openvswitch: IPv6: Add IPv6 extension header support

2021-09-21 Thread Jakub Kicinski
On Mon, 20 Sep 2021 11:20:38 -0700 Toms Atteka wrote:
> This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and
> packets can be filtered using ipv6_ext flag.
> 
> Signed-off-by: Toms Atteka 

Please make sure to check the files you touch with

./scripts/kernel-doc -none

You're adding kdoc warnings by using the /** comments
which are in fact not kdoc-formatted. Please fix and 
repost.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn branch-21.06] northd: support HW VTEP with stateful datapath

2021-09-21 Thread Numan Siddique
On Sat, Sep 18, 2021 at 8:51 AM Vladislav Odintsov  wrote:
>
> A packet going from HW VTEP device to VIF port when arrives to
> hypervisor chassis should go through LS ingress pipeline to l2_lkp
> stage without any match. In l2_lkp stage an output port is
> determined and then packet passed to LS egress pipeline for futher
> processing and to VIF port delivery.
>
> Prior to this commit a packet, which was received from HW VTEP
> device was dropped in an LS ingress datapath, where stateful services
> were defined (ACLs, LBs).
>
> To fix this issue we add a special flag-bit which can be used in LS
> pipelines, to check whether the packet came from HW VTEP devices.
> In ls_in_pre_acl and ls_in_pre_lb we add new flow with priority 110
> to skip such packets.
>
> Signed-off-by: Vladislav Odintsov 
> Signed-off-by: Numan Siddique 
> (cherry picked from commit 62ca8b9620cc1168ace6905575b7d36438363aed)

The below system test case fails with this patch.  Please check this
out - 
https://github.com/numansiddique/ovn/runs/3665362661?check_suite_focus=true

## --- ##
## ovn 21.06.1 test suite. ##
## --- ##
134: ovn -- ECMP IPv6 symmetric reply -- ovn-northd -- dp-groups=no
FAILED (system-ovn.at:5539)

## - ##
## Test results. ##
## - ##


The test fails locally too.   I didn't look into the details.

Thanks
Numan

> ---
>  northd/ovn-northd.8.xml | 28 
>  northd/ovn-northd.c | 14 ++
>  northd/ovn_northd.dl| 33 +++--
>  ovs |  2 +-
>  tests/ovn-northd.at |  2 ++
>  5 files changed, 76 insertions(+), 3 deletions(-)
>
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 890775797..29eaf1864 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -262,6 +262,16 @@
>  logical ports on which port security is not enabled, these advance 
> all
>  packets that match the inport.
>
> +  
> +For logical ports of type vtep, the above logical flow
> +will also apply the action REGBIT_FROM_RAMP = 1; to
> +indicate that the packet is coming from a RAMP (controller-vtep)
> +device.  Later pipelines will use this information to skip
> +sending the packet to the conntrack.  Packets from vtep
> +logical ports should go though ingress pipeline only to determine
> +the output port and they should not be subjected to any ACL checks.
> +Egress pipeline will do the ACL checks.
> +  
>  
>
>  
> @@ -453,6 +463,15 @@
>processing.
>  
>
> +
> +  This table has a priority-110 flow with the match
> +  REGBIT_FROM_RAMP == 1 for all logical switch datapaths to
> +  resubmit traffic to the next table. REGBIT_FROM_RAMP
> +  indicates that packet was received from vtep logical ports
> +  and it can be skipped from the stateful ACL processing in the ingress
> +  pipeline.
> +
> +
>  
>This table also has a priority-110 flow with the match
>eth.dst == E for all logical switch
> @@ -512,6 +531,15 @@
>configured. We can now add a lflow to drop ct.inv packets.
>  
>
> +
> +  This table has a priority-110 flow with the match
> +  REGBIT_FROM_RAMP == 1 for all logical switch datapaths to
> +  resubmit traffic to the next table. REGBIT_FROM_RAMP
> +  indicates that packet was received from vtep logical ports
> +  and it can be skipped from the load balancer processing in the ingress
> +  pipeline.
> +
> +
>  
>This table also has a priority-110 flow with the match
>eth.dst == E for all logical switch
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index a7f6fdf6b..c2cc9b930 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -236,6 +236,7 @@ enum ovn_stage {
>  #define REGBIT_ACL_HINT_BLOCK "reg0[10]"
>  #define REGBIT_LKUP_FDB   "reg0[11]"
>  #define REGBIT_HAIRPIN_REPLY  "reg0[12]"
> +#define REGBIT_FROM_RAMP  "reg0[14]"
>
>  #define REG_ORIG_DIP_IPV4 "reg1"
>  #define REG_ORIG_DIP_IPV6 "xxreg1"
> @@ -4823,10 +4824,15 @@ build_lswitch_input_port_sec_op(
>  build_port_security_l2("eth.src", op->ps_addrs, op->n_ps_addrs,
> match);
>
> +if (!strcmp(op->nbsp->type, "vtep")) {
> +ds_put_format(actions, REGBIT_FROM_RAMP" = 1; ");
> +}
> +
>  const char *queue_id = smap_get(>sb->options, "qdisc_queue_id");
>  if (queue_id) {
>  ds_put_format(actions, "set_queue(%s); ", queue_id);
>  }
> +
>  ds_put_cstr(actions, "next;");
>  ovn_lflow_add_with_hint(lflows, op->od, S_SWITCH_IN_PORT_SEC_L2, 50,
>  ds_cstr(match), ds_cstr(actions),
> @@ -5070,6 +5076,10 @@ build_pre_acls(struct ovn_datapath *od, struct hmap 
> *port_groups,
>"nd || nd_rs || nd_ra 

[ovs-dev] [PATCH ovn] ovn-controller: Allow specifying tos option for tunnel interface

2021-09-21 Thread venugopali--- via dev
From: Venugopal Iyer 

Currently, OVN tunnel interface supports the csum option along
with remote_ip and key. There are applications (e.g. RoCE) that rely
on setting the DSCP bits and expect it to be moved to the outer/
tunnel header as well.

This commit adds an "ovn-encap-tos" external-id that can be used to
set the tos option  on the OVS tunnel interface, using:

ovs-vsctl set Open_vSwitch . external_ids:ovn-encap-tos=inherit

Tested by setting the external_id (as above) and checking the geneve
interfaces created, e.g:

options : {csum="true", key=flow, remote_ip="X.X.X.X", 
tos=inherit}

Also, added a simple test case to make sure the tos option is carried to the
tunnel interface when set.

Signed-off-by: venu iyer (venugop...@nvidia.com)
---
 controller/encaps.c | 25 ++
 controller/encaps.h |  1 +
 controller/ovn-controller.8.xml |  7 +
 controller/ovn-controller.c |  1 +
 tests/ovn-controller.at | 45 +
 5 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/controller/encaps.c b/controller/encaps.c
index fc93bf1ee..da24448f5 100644
--- a/controller/encaps.c
+++ b/controller/encaps.c
@@ -152,7 +152,8 @@ encaps_tunnel_id_match(const char *tunnel_id, const char 
*chassis_id,
 
 static void
 tunnel_add(struct tunnel_ctx *tc, const struct sbrec_sb_global *sbg,
-   const char *new_chassis_id, const struct sbrec_encap *encap)
+   const char *new_chassis_id, const struct sbrec_encap *encap,
+   const struct ovsrec_open_vswitch_table *ovs_table)
 {
 struct smap options = SMAP_INITIALIZER();
 smap_add(, "remote_ip", encap->ip);
@@ -202,6 +203,18 @@ tunnel_add(struct tunnel_ctx *tc, const struct 
sbrec_sb_global *sbg,
 smap_add(, "remote_name", new_chassis_id);
 }
 
+const struct ovsrec_open_vswitch *cfg =
+ovsrec_open_vswitch_table_first(ovs_table);
+/* If the tos option is configured, get it */
+if (cfg) {
+const char *encap_tos = smap_get_def(>external_ids,
+   "ovn-encap-tos", "none");
+
+if (encap_tos && strcmp(encap_tos, "none")) {
+smap_add(, "tos", encap_tos);
+}
+}
+
 /* If there's an existing chassis record that does not need any change,
  * keep it.  Otherwise, create a new record (if there was an existing
  * record, the new record will supplant it and encaps_run() will delete
@@ -270,7 +283,8 @@ preferred_encap(const struct sbrec_chassis *chassis_rec)
  * as there are VTEP of that type (differentiated by remote_ip) on that 
chassis.
  */
 static int
-chassis_tunnel_add(const struct sbrec_chassis *chassis_rec, const struct 
sbrec_sb_global *sbg, struct tunnel_ctx *tc)
+chassis_tunnel_add(const struct sbrec_chassis *chassis_rec, const struct 
sbrec_sb_global *sbg,
+   const struct ovsrec_open_vswitch_table *ovs_table, struct 
tunnel_ctx *tc)
 {
 struct sbrec_encap *encap = preferred_encap(chassis_rec);
 int tuncnt = 0;
@@ -286,7 +300,7 @@ chassis_tunnel_add(const struct sbrec_chassis *chassis_rec, 
const struct sbrec_s
 if (tun_type != pref_type) {
 continue;
 }
-tunnel_add(tc, sbg, chassis_rec->name, chassis_rec->encaps[i]);
+tunnel_add(tc, sbg, chassis_rec->name, chassis_rec->encaps[i], 
ovs_table);
 tuncnt++;
 }
 return tuncnt;
@@ -316,11 +330,12 @@ chassis_tzones_overlap(const struct sset *transport_zones,
 
 void
 encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
-   const struct ovsrec_bridge_table *bridge_table,
+  const struct ovsrec_bridge_table *bridge_table,
const struct ovsrec_bridge *br_int,
const struct sbrec_chassis_table *chassis_table,
const struct sbrec_chassis *this_chassis,
const struct sbrec_sb_global *sbg,
+   const struct ovsrec_open_vswitch_table *ovs_table,
const struct sset *transport_zones)
 {
 if (!ovs_idl_txn || !br_int) {
@@ -390,7 +405,7 @@ encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
 continue;
 }
 
-if (chassis_tunnel_add(chassis_rec, sbg, ) == 0) {
+if (chassis_tunnel_add(chassis_rec, sbg, ovs_table, ) == 0) {
 VLOG_INFO("Creating encap for '%s' failed", chassis_rec->name);
 continue;
 }
diff --git a/controller/encaps.h b/controller/encaps.h
index f488393c4..25d44b034 100644
--- a/controller/encaps.h
+++ b/controller/encaps.h
@@ -35,6 +35,7 @@ void encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
 const struct sbrec_chassis_table *,
 const struct sbrec_chassis *,
 const struct sbrec_sb_global *,
+const struct ovsrec_open_vswitch_table *,
 const struct sset *transport_zones);
 
 bool encaps_cleanup(struct ovsdb_idl_txn *ovs_idl_txn,
diff --git a/controller/ovn-controller.8.xml 

[ovs-dev] [PATCH v2 ovn 2/2] controller: add memory accounting for if_status_mgr module

2021-09-21 Thread Lorenzo Bianconi
Introduce memory accounting for data structures in ovn-controller
if_status_mgr module.

Signed-off-by: Lorenzo Bianconi 
---
 controller/if-status.c  | 33 +
 controller/if-status.h  |  3 +++
 controller/ovn-controller.c |  1 +
 3 files changed, 37 insertions(+)

diff --git a/controller/if-status.c b/controller/if-status.c
index 00f826c50..fa4c8bd94 100644
--- a/controller/if-status.c
+++ b/controller/if-status.c
@@ -18,6 +18,7 @@
 #include "binding.h"
 #include "if-status.h"
 #include "ofctrl-seqno.h"
+#include "simap.h"
 
 #include "lib/hmapx.h"
 #include "lib/util.h"
@@ -86,6 +87,8 @@ struct ovs_iface {
  */
 };
 
+static uint64_t ifaces_usage;
+
 /* State machine manager for all local OVS interfaces. */
 struct if_status_mgr {
 /* All local interfaces, mapping from 'iface-id' to 'struct ovs_iface'. */
@@ -336,6 +339,18 @@ if_status_mgr_run(struct if_status_mgr *mgr,
   ovs_readonly);
 }
 
+static void
+ovs_iface_account_mem(const char *iface_id, bool erase)
+{
+uint32_t size = (strlen(iface_id) + sizeof(struct ovs_iface) +
+ sizeof(struct shash_node));
+if (erase) {
+ifaces_usage -= size;
+} else {
+ifaces_usage += size;
+}
+}
+
 static struct ovs_iface *
 ovs_iface_create(struct if_status_mgr *mgr, const char *iface_id,
  enum if_state state)
@@ -346,6 +361,7 @@ ovs_iface_create(struct if_status_mgr *mgr, const char 
*iface_id,
 iface->id = xstrdup(iface_id);
 shash_add_nocopy(>ifaces, iface->id, iface);
 ovs_iface_set_state(mgr, iface, state);
+ovs_iface_account_mem(iface_id, false);
 return iface;
 }
 
@@ -359,6 +375,7 @@ ovs_iface_destroy(struct if_status_mgr *mgr, struct 
ovs_iface *iface)
 if (node) {
 shash_steal(>ifaces, node);
 }
+ovs_iface_account_mem(iface->id, true);
 free(iface->id);
 free(iface);
 }
@@ -420,3 +437,19 @@ if_status_mgr_update_bindings(struct if_status_mgr *mgr,
 local_binding_set_down(bindings, iface->id, sb_readonly, ovs_readonly);
 }
 }
+
+void
+if_status_mgr_get_memory_usage(struct if_status_mgr *mgr,
+   struct simap *usage)
+{
+uint64_t ifaces_state_usage = 0;
+for (size_t i = 0; i < ARRAY_SIZE(mgr->ifaces_per_state); i++) {
+ifaces_state_usage += sizeof(struct hmapx_node) *
+  hmapx_count(>ifaces_per_state[i]);
+}
+
+simap_increase(usage, "if_status_mgr_ifaces_usage-KB",
+   ROUND_UP(ifaces_usage, 1024) / 1024);
+simap_increase(usage, "if_status_mgr_ifaces_state_usage-KB",
+   ROUND_UP(ifaces_state_usage, 1024) / 1024);
+}
diff --git a/controller/if-status.h b/controller/if-status.h
index 51fe7c684..ff4aa760e 100644
--- a/controller/if-status.h
+++ b/controller/if-status.h
@@ -21,6 +21,7 @@
 #include "binding.h"
 
 struct if_status_mgr;
+struct simap;
 
 struct if_status_mgr *if_status_mgr_create(void);
 void if_status_mgr_clear(struct if_status_mgr *);
@@ -33,5 +34,7 @@ void if_status_mgr_delete_iface(struct if_status_mgr *, const 
char *iface_id);
 void if_status_mgr_update(struct if_status_mgr *, struct local_binding_data *);
 void if_status_mgr_run(struct if_status_mgr *mgr, struct local_binding_data *,
bool sb_readonly, bool ovs_readonly);
+void if_status_mgr_get_memory_usage(struct if_status_mgr *mgr,
+struct simap *usage);
 
 # endif /* controller/if-status.h */
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index aa7941eeb..c48ac5d40 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -3462,6 +3462,7 @@ main(int argc, char *argv[])
 
 lflow_cache_get_memory_usage(ctrl_engine_ctx.lflow_cache, );
 ofctrl_get_memory_usage();
+if_status_mgr_get_memory_usage(if_mgr, );
 memory_report();
 simap_destroy();
 }
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 ovn 1/2] controller: do not allocate iface name twice in if_status_mgr module

2021-09-21 Thread Lorenzo Bianconi
Rely on shash_add_nocopy instead of shash_add in ovs_iface_create in
order to avoid allocating iface_id twice.

Signed-off-by: Lorenzo Bianconi 
---
 controller/if-status.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/controller/if-status.c b/controller/if-status.c
index b5a4025fc..00f826c50 100644
--- a/controller/if-status.c
+++ b/controller/if-status.c
@@ -344,7 +344,7 @@ ovs_iface_create(struct if_status_mgr *mgr, const char 
*iface_id,
 
 VLOG_DBG("Interface %s create.", iface_id);
 iface->id = xstrdup(iface_id);
-shash_add(>ifaces, iface_id, iface);
+shash_add_nocopy(>ifaces, iface->id, iface);
 ovs_iface_set_state(mgr, iface, state);
 return iface;
 }
@@ -355,7 +355,10 @@ ovs_iface_destroy(struct if_status_mgr *mgr, struct 
ovs_iface *iface)
 VLOG_DBG("Interface %s destroy: state %s", iface->id,
  if_state_names[iface->state]);
 hmapx_find_and_delete(>ifaces_per_state[iface->state], iface);
-shash_find_and_delete(>ifaces, iface->id);
+struct shash_node *node = shash_find(>ifaces, iface->id);
+if (node) {
+shash_steal(>ifaces, node);
+}
 free(iface->id);
 free(iface);
 }
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 ovn 0/2] add memory accounting for if_status_mgr module

2021-09-21 Thread Lorenzo Bianconi
Changes since v1:
- add ovs_iface_account_mem utility routine
- rely on shash_add_nocopy in ovs_iface_create

Lorenzo Bianconi (2):
  controller: do not allocate iface name twice in if_status_mgr module
  controller: add memory accounting for if_status_mgr module

 controller/if-status.c  | 40 +++--
 controller/if-status.h  |  3 +++
 controller/ovn-controller.c |  1 +
 3 files changed, 42 insertions(+), 2 deletions(-)

-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovsdb-server: Log database transactions for user requested tables.

2021-09-21 Thread Michael Santana




On 9/16/21 11:37 AM, Dumitru Ceara wrote:

Add a new command, 'ovsdb-server/log-db-ops DB TABLE on|off', which
allows the user to enable/disable transaction logging for specific
databases and tables.

By default, logging is disabled.  Once enabled, logs are generated
with level INFO and are also rate limited.

If used with care, this command can be useful in analyzing production
deployment performance issues, allowing the user to pin point
bottlenecks without the need to enable wider debug logs, e.g., jsonrpc.

Signed-off-by: Dumitru Ceara 
---
A sample use case is an ovn-kubernetes scaled deployment in which
we're interesting in reducing time to bring up PODs (represented by
OVN logical switch ports).  In order to determine exactly where the
bottleneck is when provisioning PODs (CMS/ovn-nbctl/client
IDLs/ovsdb-server/ovn-controller/etc) we need timestamps of when
operations happen at various places in the stack.

Without this patch the only option for tracking when transactions
happen in the Northbound database is to enable jsonrpc debug logs in
ovsdb-server.  This generates a rather large amount of data.

Instead, now, users would be able to just enable logging for the
Logical_Switch_Port table getting more relevant and precise
information.

Very well written and explained

Everything looks good to me. I just have one small question down below


V2:
- rebased (fixed conflicts in NEWS).
---
  NEWS |  4 
  ovsdb/ovsdb-server.c | 38 +
  ovsdb/row.c  | 17 +++
  ovsdb/row.h  |  1 +
  ovsdb/table.c|  7 ++
  ovsdb/table.h|  3 +++
  ovsdb/transaction.c  | 51 
  7 files changed, 121 insertions(+)

diff --git a/NEWS b/NEWS
index 90f4b15902b8..d56329772276 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,10 @@ Post-v2.16.0
 limiting behavior.
   * Add hardware offload support for matching IPv4/IPv6 frag types
 (experimental).
+   - OVSDB:
+ * New unixctl command 'ovsdb-server/log-db-ops DB TABLE on|off".
+   If turned on, ovsdb-server will log (at level INFO and rate limited)
+   all operations that are committed to table TABLE in the DB database.
  
  
  v2.16.0 - 16 Aug 2021

diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index 0b3d2bb71432..c48645f7e255 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -115,6 +115,7 @@ static unixctl_cb_func ovsdb_server_list_remotes;
  static unixctl_cb_func ovsdb_server_add_database;
  static unixctl_cb_func ovsdb_server_remove_database;
  static unixctl_cb_func ovsdb_server_list_databases;
+static unixctl_cb_func ovsdb_server_log_db_ops;
  
  static void read_db(struct server_config *, struct db *);

  static struct ovsdb_error *open_db(struct server_config *,
@@ -443,6 +444,8 @@ main(int argc, char *argv[])
   ovsdb_server_remove_database, _config);
  unixctl_command_register("ovsdb-server/list-dbs", "", 0, 0,
   ovsdb_server_list_databases, _dbs);
+unixctl_command_register("ovsdb-server/log-db-ops", "DB TABLE on|off",
+ 3, 3, ovsdb_server_log_db_ops, _dbs);
  unixctl_command_register("ovsdb-server/perf-counters-show", "", 0, 0,
   ovsdb_server_perf_counters_show, NULL);
  unixctl_command_register("ovsdb-server/perf-counters-clear", "", 0, 0,
@@ -1769,6 +1772,41 @@ ovsdb_server_list_databases(struct unixctl_conn *conn, 
int argc OVS_UNUSED,
  ds_destroy();
  }
  
+static void

+ovsdb_server_log_db_ops(struct unixctl_conn *conn, int argc OVS_UNUSED,
+const char *argv[], void *all_dbs_)
+{
+struct shash *all_dbs = all_dbs_;
+const char *db_name = argv[1];
+const char *tbl_name = argv[2];
+const char *command = argv[3];
+bool log;
+
+if (!strcmp(command, "on")) {
+log = true;
+} else if (!strcmp(command, "off")) {
+log = false;
+} else {
+unixctl_command_reply_error(conn, "invalid argument");
+return;
+}
+
+struct db *db = shash_find_data(all_dbs, db_name);
+if (!db) {
+unixctl_command_reply_error(conn, "no such database");
+return;
+}
+
+struct ovsdb_table *table = ovsdb_get_table(db->db, tbl_name);
+if (!table) {
+unixctl_command_reply_error(conn, "no such table");
+return;
+}
+
+ovsdb_table_log_ops(table, log);
+unixctl_command_reply(conn, NULL);
+}
+
  static void
  ovsdb_server_get_sync_status(struct unixctl_conn *conn, int argc OVS_UNUSED,
   const char *argv[] OVS_UNUSED, void *config_)
diff --git a/ovsdb/row.c b/ovsdb/row.c
index 65a0546211c8..5e31716506bc 100644
--- a/ovsdb/row.c
+++ b/ovsdb/row.c
@@ -278,6 +278,23 @@ ovsdb_row_to_json(const struct ovsdb_row *row,
  }
  return json;
  }
+
+void
+ovsdb_row_to_string(const struct ovsdb_row *row, struct ds 

[ovs-dev] [OVN Patch v4 1/2] Make changes to the parallel processing API to allow pool sizing

2021-09-21 Thread anton . ivanov
From: Anton Ivanov 

1. Make pool size user defineable.
2. Expose pool destruction.
3. Make pools resizeable at runtime.
4. Split pool start and completion to allow background execution.
5. Add a simplified API for SAFE walking single hash.

Signed-off-by: Anton Ivanov 
---
 lib/ovn-parallel-hmap.c | 290 +++-
 lib/ovn-parallel-hmap.h |  77 ++-
 northd/northd.c |  72 +++---
 3 files changed, 321 insertions(+), 118 deletions(-)

diff --git a/lib/ovn-parallel-hmap.c b/lib/ovn-parallel-hmap.c
index b8c7ac786..1b3883441 100644
--- a/lib/ovn-parallel-hmap.c
+++ b/lib/ovn-parallel-hmap.c
@@ -51,7 +51,6 @@ static bool can_parallelize = false;
  * accompanied by a fence. It does not need to be atomic or be
  * accessed under a lock.
  */
-static bool workers_must_exit = false;
 
 static struct ovs_list worker_pools = OVS_LIST_INITIALIZER(_pools);
 
@@ -70,10 +69,27 @@ static void merge_hash_results(struct worker_pool *pool 
OVS_UNUSED,
void *fin_result, void *result_frags,
int index);
 
+
+static bool init_control(struct worker_control *control, int id,
+ struct worker_pool *pool);
+
+static void cleanup_control(struct worker_pool *pool, int id);
+
+static void free_controls(struct worker_pool *pool);
+
+static struct worker_control *alloc_controls(int size);
+
+static void *standard_helper_thread(void *arg);
+
+struct worker_pool *ovn_add_standard_pool(int size)
+{
+return add_worker_pool(standard_helper_thread, size);
+}
+
 bool
-ovn_stop_parallel_processing(void)
+ovn_stop_parallel_processing(struct worker_pool *pool)
 {
-return workers_must_exit;
+return pool->workers_must_exit;
 }
 
 bool
@@ -92,11 +108,67 @@ ovn_can_parallelize_hashes(bool force_parallel)
 return can_parallelize;
 }
 
+
+void
+destroy_pool(struct worker_pool *pool) {
+char sem_name[256];
+
+free_controls(pool);
+sem_close(pool->done);
+sprintf(sem_name, MAIN_SEM_NAME, sembase, pool);
+sem_unlink(sem_name);
+free(pool);
+}
+
+bool
+ovn_resize_pool(struct worker_pool *pool, int size)
+{
+int i;
+
+ovs_assert(pool != NULL);
+
+if (!size) {
+size = pool_size;
+}
+
+ovs_mutex_lock(_mutex);
+
+if (can_parallelize) {
+free_controls(pool);
+pool->size = size;
+
+/* Allocate new control structures. */
+
+pool->controls = alloc_controls(size);
+pool->workers_must_exit = false;
+
+for (i = 0; i < pool->size; i++) {
+if (! init_control(>controls[i], i, pool)) {
+goto cleanup;
+}
+}
+}
+ovs_mutex_unlock(_mutex);
+return true;
+cleanup:
+
+/* Something went wrong when opening semaphores. In this case
+ * it is better to shut off parallel procesing altogether
+ */
+
+VLOG_INFO("Failed to initialize parallel processing, error %d", errno);
+can_parallelize = false;
+free_controls(pool);
+
+ovs_mutex_unlock(_mutex);
+return false;
+}
+
+
 struct worker_pool *
-ovn_add_worker_pool(void *(*start)(void *))
+ovn_add_worker_pool(void *(*start)(void *), int size)
 {
 struct worker_pool *new_pool = NULL;
-struct worker_control *new_control;
 bool test = false;
 int i;
 char sem_name[256];
@@ -113,38 +185,29 @@ ovn_add_worker_pool(void *(*start)(void *))
 ovs_mutex_unlock(_mutex);
 }
 
+if (!size) {
+size = pool_size;
+}
+
 ovs_mutex_lock(_mutex);
 if (can_parallelize) {
 new_pool = xmalloc(sizeof(struct worker_pool));
-new_pool->size = pool_size;
-new_pool->controls = NULL;
+new_pool->size = size;
+new_pool->start = start;
 sprintf(sem_name, MAIN_SEM_NAME, sembase, new_pool);
 new_pool->done = sem_open(sem_name, O_CREAT, S_IRWXU, 0);
 if (new_pool->done == SEM_FAILED) {
 goto cleanup;
 }
 
-new_pool->controls =
-xmalloc(sizeof(struct worker_control) * new_pool->size);
+new_pool->controls = alloc_controls(size);
+new_pool->workers_must_exit = false;
 
 for (i = 0; i < new_pool->size; i++) {
-new_control = _pool->controls[i];
-new_control->id = i;
-new_control->done = new_pool->done;
-new_control->data = NULL;
-ovs_mutex_init(_control->mutex);
-new_control->finished = ATOMIC_VAR_INIT(false);
-sprintf(sem_name, WORKER_SEM_NAME, sembase, new_pool, i);
-new_control->fire = sem_open(sem_name, O_CREAT, S_IRWXU, 0);
-if (new_control->fire == SEM_FAILED) {
+if (!init_control(_pool->controls[i], i, new_pool)) {
 goto cleanup;
 }
 }
-
-for (i = 0; i < pool_size; i++) {
-new_pool->controls[i].worker =
-ovs_thread_create("worker pool helper", start, 

[ovs-dev] [OVN Patch v4 2/2] Add support for configuring parallelization via unixctl

2021-09-21 Thread anton . ivanov
From: Anton Ivanov 

libs: add configuration support to parallel-hmap.[c,h]
northd: add support for configuring parallelization to northd

Signed-off-by: Anton Ivanov 
---
 lib/ovn-parallel-hmap.c | 185 ++--
 lib/ovn-parallel-hmap.h |  63 +-
 northd/northd.c |  30 +++
 northd/northd.h |   2 -
 northd/ovn-northd.c |   5 +-
 tests/ovn-macros.at |  16 +++-
 6 files changed, 263 insertions(+), 38 deletions(-)

diff --git a/lib/ovn-parallel-hmap.c b/lib/ovn-parallel-hmap.c
index 1b3883441..6a6488a17 100644
--- a/lib/ovn-parallel-hmap.c
+++ b/lib/ovn-parallel-hmap.c
@@ -33,6 +33,7 @@
 #include "ovs-thread.h"
 #include "ovs-numa.h"
 #include "random.h"
+#include "unixctl.h"
 
 VLOG_DEFINE_THIS_MODULE(ovn_parallel_hmap);
 
@@ -46,6 +47,7 @@ VLOG_DEFINE_THIS_MODULE(ovn_parallel_hmap);
  */
 static atomic_bool initial_pool_setup = ATOMIC_VAR_INIT(false);
 static bool can_parallelize = false;
+static bool should_parallelize = false;
 
 /* This is set only in the process of exit and the set is
  * accompanied by a fence. It does not need to be atomic or be
@@ -83,7 +85,7 @@ static void *standard_helper_thread(void *arg);
 
 struct worker_pool *ovn_add_standard_pool(int size)
 {
-return add_worker_pool(standard_helper_thread, size);
+return add_worker_pool(standard_helper_thread, size, "default", true);
 }
 
 bool
@@ -92,6 +94,19 @@ ovn_stop_parallel_processing(struct worker_pool *pool)
 return pool->workers_must_exit;
 }
 
+bool
+ovn_set_parallel_processing(bool enable)
+{
+should_parallelize = enable;
+return can_parallelize;
+}
+
+bool
+ovn_get_parallel_processing(void)
+{
+return can_parallelize && should_parallelize;
+}
+
 bool
 ovn_can_parallelize_hashes(bool force_parallel)
 {
@@ -117,6 +132,7 @@ destroy_pool(struct worker_pool *pool) {
 sem_close(pool->done);
 sprintf(sem_name, MAIN_SEM_NAME, sembase, pool);
 sem_unlink(sem_name);
+free(pool->name);
 free(pool);
 }
 
@@ -127,6 +143,10 @@ ovn_resize_pool(struct worker_pool *pool, int size)
 
 ovs_assert(pool != NULL);
 
+if (!pool->is_mutable) {
+return false;
+}
+
 if (!size) {
 size = pool_size;
 }
@@ -166,7 +186,8 @@ cleanup:
 
 
 struct worker_pool *
-ovn_add_worker_pool(void *(*start)(void *), int size)
+ovn_add_worker_pool(void *(*start)(void *), int size, char *name,
+bool is_mutable)
 {
 struct worker_pool *new_pool = NULL;
 bool test = false;
@@ -194,6 +215,8 @@ ovn_add_worker_pool(void *(*start)(void *), int size)
 new_pool = xmalloc(sizeof(struct worker_pool));
 new_pool->size = size;
 new_pool->start = start;
+new_pool->is_mutable = is_mutable;
+new_pool->name = xstrdup(name);
 sprintf(sem_name, MAIN_SEM_NAME, sembase, new_pool);
 new_pool->done = sem_open(sem_name, O_CREAT, S_IRWXU, 0);
 if (new_pool->done == SEM_FAILED) {
@@ -226,6 +249,7 @@ cleanup:
 sprintf(sem_name, MAIN_SEM_NAME, sembase, new_pool);
 sem_unlink(sem_name);
 }
+free(new_pool->name);
 ovs_mutex_unlock(_mutex);
 return NULL;
 }
@@ -342,8 +366,7 @@ ovn_complete_pool_callback(struct worker_pool *pool,
 }
 } while (completed < pool->size);
 }
-
-/* Complete a thread pool which uses a callback function to process results
+/* Run a thread pool which uses a callback function to process results
  */
 void
 ovn_run_pool_callback(struct worker_pool *pool,
@@ -352,8 +375,8 @@ ovn_run_pool_callback(struct worker_pool *pool,
   void *fin_result,
   void *result_frags, int index))
 {
-ovn_start_pool(pool);
-ovn_complete_pool_callback(pool, fin_result, result_frags, helper_func);
+start_pool(pool);
+complete_pool_callback(pool, fin_result, result_frags, helper_func);
 }
 
 /* Run a thread pool - basic, does not do results processing.
@@ -401,6 +424,28 @@ ovn_fast_hmap_merge(struct hmap *dest, struct hmap *inc)
 inc->n = 0;
 }
 
+/* Run a thread pool which gathers results in an array
+ * of hashes. Merge results.
+ */
+void
+ovn_complete_pool_hash(struct worker_pool *pool,
+  struct hmap *result,
+  struct hmap *result_frags)
+{
+complete_pool_callback(pool, result, result_frags, merge_hash_results);
+}
+
+/* Run a thread pool which gathers results in an array of lists.
+ * Merge results.
+ */
+void
+ovn_complete_pool_list(struct worker_pool *pool,
+  struct ovs_list *result,
+  struct ovs_list *result_frags)
+{
+complete_pool_callback(pool, result, result_frags, merge_list_results);
+}
+
 /* Run a thread pool which gathers results in an array
  * of hashes. Merge results.
  */
@@ -514,7 +559,7 @@ static struct worker_control *alloc_controls(int size)
 
 static void
 worker_pool_hook(void *aux OVS_UNUSED) {
-static struct worker_pool 

Re: [ovs-dev] [PATCH ovn] ovn-northd: Virtual port add ND/ARP responder flows for IPv6 VIPs.

2021-09-21 Thread Dumitru Ceara
On 9/20/21 3:20 PM, mh...@redhat.com wrote:
> From: Mohammad Heib 
> 
> currently ovn-northd only handle virtual ports with VIP IPv4 and ignores
> virtual ports with VIP IPv6.
> 
> This patch adds support for virtual ports with VIP IPv6 by adding
> lflows to the lsp_in_arp_rsp logical switch pipeline.
> Those lflows handle Neighbor Solicitations and Neighbor Advertisement requests
> that target the virtual port VIPs and bind the virtual port to the desired 
> VIF.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2003091
> Fixes: 054f4c85c413 ("Add a new logical switch port type - 'virtual'")
> Signed-off-by: Mohammad Heib 
> ---

Hi Mohammad,

Thanks for the patch!

This is not a full review yet, just some stuff I noticed while glancing
over the patch.

It would be great if you could add a test case in ovn-northd.at to make
sure the logical flows are generated as expected.  This would also
ensure that we don't miss adding the northd-ddlog part (which is not
present in this patch).

>  northd/northd.c | 52 +++--
>  1 file changed, 42 insertions(+), 10 deletions(-)
> 
> diff --git a/northd/northd.c b/northd/northd.c
> index d1b87891c..d85036dcc 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -7386,16 +7386,30 @@ build_lswitch_arp_nd_responder_known_ips(struct 
> ovn_port *op,
>   *  - ARP reply from the virtual ip which belongs to a logical
>   *port of type 'virtual' and bind that port.
>   * */
> -ovs_be32 ip;
> +
> +union ip {
> +ovs_be32 ip;
> +ovs_u128 ipv6;
> +}ip;
> +

I think we should probably use in6_add and in6_addr_mapped_ipv4 instead
of coming up with a new type.

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

2021-09-21 Thread Kalahasthi, Suneetha
Hi Kevin,

Thanks for the details.
I will make the setup, test and update the results.

Regards,
Suneetha

-Original Message-
From: Kevin Traynor  
Sent: 21 September 2021 17:52
To: Kalahasthi, Suneetha ; d...@openvswitch.org
Cc: David Marchand 
Subject: Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

On 21/09/2021 13:16, Kalahasthi, Suneetha wrote:
> HI Kevin,
> 
> The setup is:
> 1. Add one virtio_user port to OVS with 3 queues ovs-vsctl add-port 
> br0 virtio_user0 -- set Interface virtio_user0 type=dpdk
> options:dpdk-devargs=net_virtio_user0,iface=tap0,path=/dev/vhost-net,q
> ueues=3
> 

Just need to add the port in step 1.

Thread 1 "ovs-vswitchd" received signal SIGSEGV, Segmentation fault.
0x00ddf230 in virtio_rx_mem_pool_buf_size ()
(gdb) bt
#0  0x00ddf230 in virtio_rx_mem_pool_buf_size ()
#1  0x00ddf301 in virtio_mtu_set ()
#2  0x0107e451 in rte_eth_dev_set_mtu ()
#3  0x012bc7bf in dpdk_eth_dev_port_config (dev=0x1503c8b00, n_rxq=1, 
n_txq=3) at lib/netdev-dpdk.c:1018
#4  0x012bce2e in dpdk_eth_dev_init (dev=0x1503c8b00) at
lib/netdev-dpdk.c:1146
#5  0x012c6729 in netdev_dpdk_reconfigure (netdev=0x1503c8b80) at 
lib/netdev-dpdk.c:5007
#6  0x011ac55b in netdev_reconfigure (netdev=0x1503c8b80) at
lib/netdev.c:2288
#7  0x0115d315 in port_reconfigure (port=0x3f41b50) at
lib/dpif-netdev.c:4789
#8  0x0115f8de in reconfigure_datapath (dp=0x3f07ac0) at
lib/dpif-netdev.c:5761
#9  0x01156b92 in do_add_port (dp=0x3f07ac0, devname=0x3f40a00 
"virtio_user0", type=0x155b2e6 "dpdk", port_no=4) at lib/dpif-netdev.c:2057
#10 0x01156d22 in dpif_netdev_port_add (dpif=0x3c53430, 
netdev=0x1503c8b80, port_nop=0x7fffd77158b0) at lib/dpif-netdev.c:2101
#11 0x0116de4a in dpif_port_add (dpif=0x3c53430, 
netdev=0x1503c8b80, port_nop=0x7fffd771590c) at lib/dpif.c:595
#12 0x010f58c5 in port_add (ofproto_=0x3f068b0,
netdev=0x1503c8b80) at ofproto/ofproto-dpif.c:3920
#13 0x010d9e9c in ofproto_port_add (ofproto=0x3f068b0, 
netdev=0x1503c8b80, ofp_portp=0x7fffd7715a74) at ofproto/ofproto.c:2067
#14 0x010c5ada in iface_do_create (br=0x3f06250, iface_cfg=0x3f76b20, 
ofp_portp=0x7fffd7715a74, netdevp=0x7fffd7715a78,
errp=0x7fffd7715a68) at vswitchd/bridge.c:2063
#15 0x010c5c6e in iface_create (br=0x3f06250, iface_cfg=0x3f76b20, 
port_cfg=0x3f41850) at vswitchd/bridge.c:2106
#16 0x010c3346 in bridge_add_ports__ (br=0x3f06250, 
wanted_ports=0x3f06330, with_requested_port=false) at vswitchd/bridge.c:1170
#17 0x010c33cd in bridge_add_ports (br=0x3f06250,
wanted_ports=0x3f06330) at vswitchd/bridge.c:1186
#18 0x010c2908 in bridge_reconfigure (ovs_cfg=0x3c59000) at
vswitchd/bridge.c:898
#19 0x010c92fe in bridge_run () at vswitchd/bridge.c:3331
#20 0x010cea43 in main (argc=4, argv=0x7fffd7715d48) at
vswitchd/ovs-vswitchd.c:127


> 2. Inject traffic
> 3. traffic should eb received at virtio_user port ?
> 
> Regards,
> Suneetha
> 
> -Original Message-
> From: Kevin Traynor 
> Sent: 21 September 2021 17:43
> To: Kalahasthi, Suneetha ; 
> d...@openvswitch.org
> Cc: David Marchand 
> Subject: Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release
> 
> On 21/09/2021 11:42, Kevin Traynor wrote:
>> On 21/09/2021 08:08, Suneetha Kalahasthi wrote:
>>> Modify ci linux build script to use the latest DPDK stable release 20.11.3.
>>> Modify Documentation to use the latest DPDK stable release 20.11.3.
>>> Update NEWS file to reflect the latest DPDK stable release 20.11.3.
>>> FAQ is updated to reflect the latest DPDK for each OVS branch.
>>>
>>
>> David has reported a crash for virtio_user devices with 20.11.3 [1]. 
>> I ran a quick test of adding a virtio_user port to OVS and there was 
>> no crash, but maybe it was not a full test or I was lucky.
>>
> 
> After talking to David, I reproduced with:
> ovs-vsctl add-port br0 virtio_user0 -- set Interface virtio_user0 
> type=dpdk
> options:dpdk-devargs=net_virtio_user0,iface=tap0,path=/dev/vhost-net,q
> ueues=3
> 
> You'd need to check if there are fixes impacting OVS in 20.11.3 that make it 
> better to take now and document this known issue. Otherwise, probably better 
> to wait until 20.11.4 with the fix for this.
> 
>> Can you check if it is ok to use 20.11.3 with this known issue?
>>
>> [1]
>> http://inbox.dpdk.org/dev/CAJFAV8yjvEvk-YQgwBb=ZAWCrn_P2NDzcugC2W-O+7
>> J
>> zoyd...@mail.gmail.com/
>>
>>> Signed-off-by: Suneetha Kalahasthi 
>>> ---
>>> .ci/linux-build.sh   | 2 +-
>>> Documentation/faq/releases.rst   | 8 
>>> Documentation/intro/install/dpdk.rst | 8 
>>> NEWS | 2 ++
>>> 4 files changed, 11 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index
>>> 863f02388..5323cb2f2 100755
>>> --- a/.ci/linux-build.sh
>>> +++ b/.ci/linux-build.sh
>>> @@ -216,7 +216,7 @@ fi
>>> 

[ovs-dev] [PATCH ovn] northd: Fix multicast relay when DGP are configured.

2021-09-21 Thread Dumitru Ceara
IP multicast relay didn't work properly if traffic had to be forwarded
across a distributed gateway port in the router pipeline.  That is
because the multicast_group used as output logical port is expanded in
the egress pipeline, way after 'lr_in_gw_redirect' where unicast traffic
would normally be forwarded to the chassis-redirect port.

In order to achieve the same behavior for IP multicast routed traffic we
now store the chassis-redirect port binding in the multicast_group on
which IP multicast is routed.  On the remote hypervisor, to make sure
traffic is delivered to the correct destination switch pipeline, we make
sure that ovn-controller translates chassis-redirect ports from
multicast groups to the logical patch ports they were created from.

This patch also adds a test to simulate the ovn-kubernetes IP multicast
use case (where this issue was first observed).

Fixes: 5d1527b11e94 ("ovn-northd: Add IGMP Relay support")
Reported-by: Alexander Constantinescu 
Reported-at: https://bugzilla.redhat.com/2006306
Signed-off-by: Dumitru Ceara 
---
 controller/physical.c |  28 -
 northd/lrouter.dl |  14 ++-
 northd/multicast.dl   |  23 +++-
 northd/northd.c   |   7 +-
 northd/ovn_northd.dl  |  12 +-
 tests/ovn.at  | 248 ++
 6 files changed, 314 insertions(+), 18 deletions(-)

diff --git a/controller/physical.c b/controller/physical.c
index ffb9f9952..0cfb158c8 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -1373,7 +1373,8 @@ out:
 }
 
 static void
-consider_mc_group(enum mf_field_id mff_ovn_geneve,
+consider_mc_group(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+  enum mf_field_id mff_ovn_geneve,
   const struct simap *ct_zones,
   const struct hmap *local_datapaths,
   struct shash *local_bindings,
@@ -1406,6 +1407,10 @@ consider_mc_group(enum mf_field_id mff_ovn_geneve,
  *  instead.  (If we put them in 'ofpacts', then the output
  *  would happen on every hypervisor in the multicast group,
  *  effectively duplicating the packet.)
+ *
+ *- For chassisredirect ports, add actions to 'ofpacts' to
+ *  set the output port to be the router patch port for which
+ *  the redirect port was added.
  */
 struct ofpbuf ofpacts;
 ofpbuf_init(, 0);
@@ -1440,6 +1445,21 @@ consider_mc_group(enum mf_field_id mff_ovn_geneve,
&& port->chassis == chassis)) {
 put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, );
 put_resubmit(OFTABLE_CHECK_LOOPBACK, );
+} else if (!strcmp(port->type, "chassisredirect")
+   && port->chassis == chassis) {
+const char *distributed_port = smap_get(>options,
+"distributed-port");
+if (distributed_port) {
+const struct sbrec_port_binding *distributed_binding
+= lport_lookup_by_name(sbrec_port_binding_by_name,
+   distributed_port);
+if (distributed_binding
+&& port->datapath == distributed_binding->datapath) {
+put_load(distributed_binding->tunnel_key, MFF_LOG_OUTPORT,
+ 0, 32, );
+put_resubmit(OFTABLE_CHECK_LOOPBACK, );
+}
+}
 } else if (port->chassis && !get_localnet_port(
 local_datapaths, mc->datapath->tunnel_key)) {
 /* Add remote chassis only when localnet port not exist,
@@ -1574,7 +1594,8 @@ physical_handle_mc_group_changes(struct physical_ctx 
*p_ctx,
 if (!sbrec_multicast_group_is_new(mc)) {
 ofctrl_remove_flows(flow_table, >header_.uuid);
 }
-consider_mc_group(p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
+consider_mc_group(p_ctx->sbrec_port_binding_by_name,
+  p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
   p_ctx->local_datapaths, p_ctx->local_bindings,
   p_ctx->patch_ofports,
   p_ctx->chassis, mc,
@@ -1617,7 +1638,8 @@ physical_run(struct physical_ctx *p_ctx,
 /* Handle output to multicast groups, in tables 32 and 33. */
 const struct sbrec_multicast_group *mc;
 SBREC_MULTICAST_GROUP_TABLE_FOR_EACH (mc, p_ctx->mc_group_table) {
-consider_mc_group(p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
+consider_mc_group(p_ctx->sbrec_port_binding_by_name,
+  p_ctx->mff_ovn_geneve, p_ctx->ct_zones,
   p_ctx->local_datapaths, p_ctx->local_bindings,
   p_ctx->patch_ofports, p_ctx->chassis,
   mc, p_ctx->chassis_tunnels,
diff --git a/northd/lrouter.dl b/northd/lrouter.dl
index 3029ba67d..0e4308eb5 100644
--- 

Re: [ovs-dev] Unit Test Failure Report to OVS ML

2021-09-21 Thread Ilya Maximets
On 9/21/21 14:51, Amber, Kumar wrote:
> Hi Ilya,
> 
> The Test-case failure is not related to AVX512 or any patches we are directly 
> failing on "master" latest of OVS with no patches on top of it.
> I am still trying to figure out or root cause the issue, we tested the master 
> on 4 different servers, and all fails on the same test-case.

This sounds very weird.  How do you build it?

> 
> Regards
> Amber
> 
>> -Original Message-
>> From: Ilya Maximets 
>> Sent: Monday, September 20, 2021 5:05 PM
>> To: Amber, Kumar ; ovs-dev@openvswitch.org;
>> i.maxim...@ovn.org; tony.vanderp...@alliedtelesis.co.nz
>> Cc: Stokes, Ian ; Van Haaren, Harry
>> 
>> Subject: Re: Unit Test Failure Report to OVS ML
>>
>> On 9/20/21 12:35, Amber, Kumar wrote:
>>> Hi all,
>>>
>>> The following commit ID with the following description added a test case for
>> "tunnel-push-pop" test-suit by the name: "tunnel_push_pop - packet_out
>> debug_slow" has been found to be failing on the latest master branch.
>>>
>>> ## --- ##
>>> ## openvswitch 2.16.90 test suite. ##
>>> ## --- ##
>>> 779: tunnel_push_pop - packet_out debug_slow FAILED
>>> (ovs-macros.at:242)
>>>
>>> ## - ##
>>> ## Test results. ##
>>> ## - ##
>>>
>>> ERROR: 1 test was run,
>>> 1 failed unexpectedly.
>>>
>>> We did some investigation, and the matching is the cause of the failure.
>>>
>>> ./ovs-macros.at:242: hard failure
>>>
>>> 779. tunnel-push-pop.at:598: 779. tunnel_push_pop - packet_out
>>> debug_slow (tunnel-push-pop.at:598): FAILED (ovs-macros.at:242)
>>>
>>> Commit patch: 7e6b41ac8d9d183655be96795b529adeb33aeb47
>>>
>>> dpif-netdev: Fix crash when PACKET_OUT is metered.
>>>
>>> When a PACKET_OUT has output port of OFPP_TABLE, and the rule table
>>> includes a meter and this causes the packet to be deleted, execute
>>> with a clone of the packet, restoring the original packet if it is
>>> changed by the execution.
>>>
>>> Add tests to verify the original issue is fixed, and that the fix
>>> doesn't break tunnel processing.
>>>
>>> Would the authors of the patch investigate why the test is failing?
>>>
>>> Regards
>>> Amber
>>
>> Hi.
>>
>> I can't reproduce the issue.  I re-run the test 10 times on 2 of my
>> systems and it works 10/10 without any issues.   And none of our CI
>> systems has issues with this test.
>>
>> The patch that added the test should not affect packet matching as it only
>> changes the execution of actions, just to avoid the crash under certain
>> conditions, and it tries to do that with least amount of side effects 
>> possible.  So,
>> this patch should not be a root cause.  Maybe the new test case just 
>> uncovered a
>> different issue in packet matching?
>>
>> The test itself was carefully crafted to catch a particular issue where 
>> packet is
>> not encapsulated, while it should be.  And the test itself seems solid.
>>
>> Does it still fail for you, if you revert code changes from the patch but 
>> keep the
>> aforementioned unit test (this test is not for the crash itself, so it 
>> should pass
>> without the change in the patch)?
>>
>> Anyway, what does "the matching is the cause of the failure" mean?
>> Are you testing with avx512 enabled?  If so, doesn't autovalidator tell you 
>> what
>> the issue is?
>>
>> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ovs-ctl: add log level option to utilities/ovs-ctl.in

2021-09-21 Thread remijouannet
From: Remi Jouannet 

Add three new options to configure log level at runtime with ovs-ctl
--vconsole, --vsyslog-level and --vfile-level

Signed-off-by: Remi Jouannet 
---
 utilities/ovs-ctl.in | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
index 7180079..bf9f466 100644
--- a/utilities/ovs-ctl.in
+++ b/utilities/ovs-ctl.in
@@ -143,7 +143,9 @@ do_start_ovsdb () {
 if test X"$SELF_CONFINEMENT" = Xno; then
 set "$@" --no-self-confinement
 fi
-set "$@" -vconsole:emer -vsyslog:err -vfile:info
+set "$@" -vconsole:"$VCONSOLE_LEVEL"
+set "$@" -vsyslog:"$VSYSLOG_LEVEL"
+set "$@" -vfile:"$VFILE_LEVEL"
 set "$@" --remote=punix:"$DB_SOCK"
 set "$@" --private-key=db:Open_vSwitch,SSL,private_key
 set "$@" --certificate=db:Open_vSwitch,SSL,certificate
@@ -211,7 +213,9 @@ do_start_forwarding () {
 
 # Start ovs-vswitchd.
 set ovs-vswitchd unix:"$DB_SOCK"
-set "$@" -vconsole:emer -vsyslog:err -vfile:info
+set "$@" -vconsole:"$VCONSOLE_LEVEL"
+set "$@" -vsyslog:"$VSYSLOG_LEVEL"
+set "$@" -vfile:"$VFILE_LEVEL"
 if test X"$MLOCKALL" != Xno; then
 set "$@" --mlockall
 fi
@@ -352,6 +356,10 @@ set_defaults () {
 DPORT=
 SPORT=
 
+VCONSOLE_LEVEL=emer
+VSYSLOG_LEVEL=err
+VFILE_LEVEL=info
+
 IKE_DAEMON=
 RESTART_IKE_DAEMON=yes
 
@@ -441,6 +449,11 @@ Options for "enable-protocol":
   --sport=PORT   source port to match (for tcp or udp protocol)
   --dport=PORT   ddestination port to match (for tcp or udp protocol)
 
+Log level options, documentation in ovs-appctl.8:
+  --vconsole-level=LEVEL console logging level (default: $VCONSOLE_LEVEL)
+  --vsyslog-level=LEVEL   syslog logging level (default: $VSYSLOG_LEVEL)
+  --vfile-level=LEVEL   file logging level (default: $VFILE_LEVEL)
+
 Option for "start-ovs-ipsec":
   --ike-daemon=IKE_DAEMON
   the IKE daemon for ipsec tunnels (either libreswan or strongswan)
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Unit Test Failure Report to OVS ML

2021-09-21 Thread Amber, Kumar
Hi Ilya,

The Test-case failure is not related to AVX512 or any patches we are directly 
failing on "master" latest of OVS with no patches on top of it.
I am still trying to figure out or root cause the issue, we tested the master 
on 4 different servers, and all fails on the same test-case.

Regards
Amber

> -Original Message-
> From: Ilya Maximets 
> Sent: Monday, September 20, 2021 5:05 PM
> To: Amber, Kumar ; ovs-dev@openvswitch.org;
> i.maxim...@ovn.org; tony.vanderp...@alliedtelesis.co.nz
> Cc: Stokes, Ian ; Van Haaren, Harry
> 
> Subject: Re: Unit Test Failure Report to OVS ML
> 
> On 9/20/21 12:35, Amber, Kumar wrote:
> > Hi all,
> >
> > The following commit ID with the following description added a test case for
> "tunnel-push-pop" test-suit by the name: "tunnel_push_pop - packet_out
> debug_slow" has been found to be failing on the latest master branch.
> >
> > ## --- ##
> > ## openvswitch 2.16.90 test suite. ##
> > ## --- ##
> > 779: tunnel_push_pop - packet_out debug_slow FAILED
> > (ovs-macros.at:242)
> >
> > ## - ##
> > ## Test results. ##
> > ## - ##
> >
> > ERROR: 1 test was run,
> > 1 failed unexpectedly.
> >
> > We did some investigation, and the matching is the cause of the failure.
> >
> > ./ovs-macros.at:242: hard failure
> >
> > 779. tunnel-push-pop.at:598: 779. tunnel_push_pop - packet_out
> > debug_slow (tunnel-push-pop.at:598): FAILED (ovs-macros.at:242)
> >
> > Commit patch: 7e6b41ac8d9d183655be96795b529adeb33aeb47
> >
> > dpif-netdev: Fix crash when PACKET_OUT is metered.
> >
> > When a PACKET_OUT has output port of OFPP_TABLE, and the rule table
> > includes a meter and this causes the packet to be deleted, execute
> > with a clone of the packet, restoring the original packet if it is
> > changed by the execution.
> >
> > Add tests to verify the original issue is fixed, and that the fix
> > doesn't break tunnel processing.
> >
> > Would the authors of the patch investigate why the test is failing?
> >
> > Regards
> > Amber
> 
> Hi.
> 
> I can't reproduce the issue.  I re-run the test 10 times on 2 of my
> systems and it works 10/10 without any issues.   And none of our CI
> systems has issues with this test.
> 
> The patch that added the test should not affect packet matching as it only
> changes the execution of actions, just to avoid the crash under certain
> conditions, and it tries to do that with least amount of side effects 
> possible.  So,
> this patch should not be a root cause.  Maybe the new test case just 
> uncovered a
> different issue in packet matching?
> 
> The test itself was carefully crafted to catch a particular issue where 
> packet is
> not encapsulated, while it should be.  And the test itself seems solid.
> 
> Does it still fail for you, if you revert code changes from the patch but 
> keep the
> aforementioned unit test (this test is not for the crash itself, so it should 
> pass
> without the change in the patch)?
> 
> Anyway, what does "the matching is the cause of the failure" mean?
> Are you testing with avx512 enabled?  If so, doesn't autovalidator tell you 
> what
> the issue is?
> 
> Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 0/6] MFEX Optimizations IPv6 + Hashing

2021-09-21 Thread Ilya Maximets
On 9/21/21 12:23, Kumar Amber wrote:
> ---
> v3:
> - rebase to master.
> v2:
> - fix the CI build.
> - fix check-patch for co-author.
> ---
> 
> The patch-set introduces AVX512 optimizations of IPv6
> traffic profiles and hashing improvements for all AVX512
> supported traffic profiles for IPv4 and IPv6.
> 
> Kumar Amber (6):
>   dpif-netdev/mfex: Add AVX512 basic ipv6 traffic profiles
>   dpif-netdev/mfex: Add AVX512 vlan ipv6 traffic profiles
>   dpif-netdev/mfex: Add packet hash check to autovalidator
>   dpif-netdev/mfex: Add ipv4 profile based hashing
>   dpif-netdev/mfex: Add ipv6 profile based hashing
>   dpif-netdev/mfex: Avoid hashing when opt mfex called
> 
>  NEWS  |   7 +
>  lib/automake.mk   |   1 +
>  lib/dpif-netdev-avx512.c  |   6 +-
>  lib/dpif-netdev-extract-avx512.c  | 348 +-
>  lib/dpif-netdev-private-extract.c |  63 +-
>  lib/dpif-netdev-private-extract.h |  12 ++
>  tests/pcap/mfex_test.pcap | Bin 416 -> 632 bytes
>  7 files changed, 432 insertions(+), 5 deletions(-)
> 

Hi.  A few months ago I was told that it's easy for Intel to set up CI
to test upstream patches with AVX512 features enabled.  Is there any
progress on that front?

My point is that we should refrain from adding new features in this
area until we have a proper CI.  Especially considering the unit test
failure you reported yesterday, which is supposedly related to AVX512
optimizations.

// Marking this patch-set as deferred for now.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

2021-09-21 Thread Kevin Traynor

On 21/09/2021 13:16, Kalahasthi, Suneetha wrote:

HI Kevin,

The setup is:
1. Add one virtio_user port to OVS with 3 queues
ovs-vsctl add-port br0 virtio_user0 -- set Interface virtio_user0 type=dpdk
options:dpdk-devargs=net_virtio_user0,iface=tap0,path=/dev/vhost-net,queues=3



Just need to add the port in step 1.

Thread 1 "ovs-vswitchd" received signal SIGSEGV, Segmentation fault.
0x00ddf230 in virtio_rx_mem_pool_buf_size ()
(gdb) bt
#0  0x00ddf230 in virtio_rx_mem_pool_buf_size ()
#1  0x00ddf301 in virtio_mtu_set ()
#2  0x0107e451 in rte_eth_dev_set_mtu ()
#3  0x012bc7bf in dpdk_eth_dev_port_config (dev=0x1503c8b00, 
n_rxq=1, n_txq=3) at lib/netdev-dpdk.c:1018
#4  0x012bce2e in dpdk_eth_dev_init (dev=0x1503c8b00) at 
lib/netdev-dpdk.c:1146
#5  0x012c6729 in netdev_dpdk_reconfigure (netdev=0x1503c8b80) 
at lib/netdev-dpdk.c:5007
#6  0x011ac55b in netdev_reconfigure (netdev=0x1503c8b80) at 
lib/netdev.c:2288
#7  0x0115d315 in port_reconfigure (port=0x3f41b50) at 
lib/dpif-netdev.c:4789
#8  0x0115f8de in reconfigure_datapath (dp=0x3f07ac0) at 
lib/dpif-netdev.c:5761
#9  0x01156b92 in do_add_port (dp=0x3f07ac0, devname=0x3f40a00 
"virtio_user0", type=0x155b2e6 "dpdk", port_no=4) at lib/dpif-netdev.c:2057
#10 0x01156d22 in dpif_netdev_port_add (dpif=0x3c53430, 
netdev=0x1503c8b80, port_nop=0x7fffd77158b0) at lib/dpif-netdev.c:2101
#11 0x0116de4a in dpif_port_add	(dpif=0x3c53430, 
netdev=0x1503c8b80, port_nop=0x7fffd771590c) at lib/dpif.c:595
#12 0x010f58c5 in port_add (ofproto_=0x3f068b0, 
netdev=0x1503c8b80) at ofproto/ofproto-dpif.c:3920
#13 0x010d9e9c in ofproto_port_add (ofproto=0x3f068b0, 
netdev=0x1503c8b80, ofp_portp=0x7fffd7715a74) at ofproto/ofproto.c:2067
#14 0x010c5ada in iface_do_create (br=0x3f06250, 
iface_cfg=0x3f76b20, ofp_portp=0x7fffd7715a74, netdevp=0x7fffd7715a78, 
errp=0x7fffd7715a68) at vswitchd/bridge.c:2063
#15 0x010c5c6e in iface_create (br=0x3f06250, 
iface_cfg=0x3f76b20, port_cfg=0x3f41850) at vswitchd/bridge.c:2106
#16 0x010c3346 in bridge_add_ports__ (br=0x3f06250, 
wanted_ports=0x3f06330, with_requested_port=false) at vswitchd/bridge.c:1170
#17 0x010c33cd in bridge_add_ports (br=0x3f06250, 
wanted_ports=0x3f06330) at vswitchd/bridge.c:1186
#18 0x010c2908 in bridge_reconfigure (ovs_cfg=0x3c59000) at 
vswitchd/bridge.c:898

#19 0x010c92fe in bridge_run () at vswitchd/bridge.c:3331
#20 0x010cea43 in main (argc=4, argv=0x7fffd7715d48) at 
vswitchd/ovs-vswitchd.c:127




2. Inject traffic
3. traffic should eb received at virtio_user port ?

Regards,
Suneetha

-Original Message-
From: Kevin Traynor 
Sent: 21 September 2021 17:43
To: Kalahasthi, Suneetha ; d...@openvswitch.org
Cc: David Marchand 
Subject: Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

On 21/09/2021 11:42, Kevin Traynor wrote:

On 21/09/2021 08:08, Suneetha Kalahasthi wrote:

Modify ci linux build script to use the latest DPDK stable release 20.11.3.
Modify Documentation to use the latest DPDK stable release 20.11.3.
Update NEWS file to reflect the latest DPDK stable release 20.11.3.
FAQ is updated to reflect the latest DPDK for each OVS branch.



David has reported a crash for virtio_user devices with 20.11.3 [1]. I
ran a quick test of adding a virtio_user port to OVS and there was no
crash, but maybe it was not a full test or I was lucky.



After talking to David, I reproduced with:
ovs-vsctl add-port br0 virtio_user0 -- set Interface virtio_user0 type=dpdk
options:dpdk-devargs=net_virtio_user0,iface=tap0,path=/dev/vhost-net,queues=3

You'd need to check if there are fixes impacting OVS in 20.11.3 that make it 
better to take now and document this known issue. Otherwise, probably better to 
wait until 20.11.4 with the fix for this.


Can you check if it is ok to use 20.11.3 with this known issue?

[1]
http://inbox.dpdk.org/dev/CAJFAV8yjvEvk-YQgwBb=ZAWCrn_P2NDzcugC2W-O+7J
zoyd...@mail.gmail.com/


Signed-off-by: Suneetha Kalahasthi 
---
.ci/linux-build.sh   | 2 +-
Documentation/faq/releases.rst   | 8 
Documentation/intro/install/dpdk.rst | 8 
NEWS | 2 ++
4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index
863f02388..5323cb2f2 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -216,7 +216,7 @@ fi

if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then

if [ -z "$DPDK_VER" ]; then
-DPDK_VER="20.11.1"
+DPDK_VER="20.11.3"
fi
install_dpdk $DPDK_VER
if [ "$CC" = "clang" ]; then
diff --git a/Documentation/faq/releases.rst
b/Documentation/faq/releases.rst index 68c9867b1..4f8d105e6 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -205,10 +205,10 @@ Q: What DPDK version does each Open vSwitch release work 

Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

2021-09-21 Thread Kalahasthi, Suneetha
HI Kevin,

The setup is:
1. Add one virtio_user port to OVS with 3 queues
ovs-vsctl add-port br0 virtio_user0 -- set Interface virtio_user0 type=dpdk
options:dpdk-devargs=net_virtio_user0,iface=tap0,path=/dev/vhost-net,queues=3

2. Inject traffic
3. traffic should eb received at virtio_user port ?

Regards,
Suneetha

-Original Message-
From: Kevin Traynor  
Sent: 21 September 2021 17:43
To: Kalahasthi, Suneetha ; d...@openvswitch.org
Cc: David Marchand 
Subject: Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

On 21/09/2021 11:42, Kevin Traynor wrote:
> On 21/09/2021 08:08, Suneetha Kalahasthi wrote:
>> Modify ci linux build script to use the latest DPDK stable release 20.11.3.
>> Modify Documentation to use the latest DPDK stable release 20.11.3.
>> Update NEWS file to reflect the latest DPDK stable release 20.11.3.
>> FAQ is updated to reflect the latest DPDK for each OVS branch.
>>
> 
> David has reported a crash for virtio_user devices with 20.11.3 [1]. I 
> ran a quick test of adding a virtio_user port to OVS and there was no 
> crash, but maybe it was not a full test or I was lucky.
> 

After talking to David, I reproduced with:
ovs-vsctl add-port br0 virtio_user0 -- set Interface virtio_user0 type=dpdk
options:dpdk-devargs=net_virtio_user0,iface=tap0,path=/dev/vhost-net,queues=3

You'd need to check if there are fixes impacting OVS in 20.11.3 that make it 
better to take now and document this known issue. Otherwise, probably better to 
wait until 20.11.4 with the fix for this.

> Can you check if it is ok to use 20.11.3 with this known issue?
> 
> [1]
> http://inbox.dpdk.org/dev/CAJFAV8yjvEvk-YQgwBb=ZAWCrn_P2NDzcugC2W-O+7J
> zoyd...@mail.gmail.com/
> 
>> Signed-off-by: Suneetha Kalahasthi 
>> ---
>>.ci/linux-build.sh   | 2 +-
>>Documentation/faq/releases.rst   | 8 
>>Documentation/intro/install/dpdk.rst | 8 
>>NEWS | 2 ++
>>4 files changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index 
>> 863f02388..5323cb2f2 100755
>> --- a/.ci/linux-build.sh
>> +++ b/.ci/linux-build.sh
>> @@ -216,7 +216,7 @@ fi
>>
>>if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then
>>if [ -z "$DPDK_VER" ]; then
>> -DPDK_VER="20.11.1"
>> +DPDK_VER="20.11.3"
>>fi
>>install_dpdk $DPDK_VER
>>if [ "$CC" = "clang" ]; then
>> diff --git a/Documentation/faq/releases.rst 
>> b/Documentation/faq/releases.rst index 68c9867b1..4f8d105e6 100644
>> --- a/Documentation/faq/releases.rst
>> +++ b/Documentation/faq/releases.rst
>> @@ -205,10 +205,10 @@ Q: What DPDK version does each Open vSwitch release 
>> work with?
>>2.10.x   17.11.10
>>2.11.x   18.11.9
>>2.12.x   18.11.9
>> -2.13.x   19.11.8
>> -2.14.x   19.11.8
>> -2.15.x   20.11.1
>> -2.16.x   20.11.1
>> +2.13.x   19.11.10
>> +2.14.x   19.11.10
>> +2.15.x   20.11.3
>> +2.16.x   20.11.3
>> 
>>
>>Q: Are all the DPDK releases that OVS versions work with maintained?
>> diff --git a/Documentation/intro/install/dpdk.rst 
>> b/Documentation/intro/install/dpdk.rst
>> index 96843af73..83c758783 100644
>> --- a/Documentation/intro/install/dpdk.rst
>> +++ b/Documentation/intro/install/dpdk.rst
>> @@ -42,7 +42,7 @@ Build requirements
>>In addition to the requirements described in :doc:`general`, building Open
>>vSwitch with DPDK will require the following:
>>
>> -- DPDK 20.11.1
>> +- DPDK 20.11.3
>>
>>- A `DPDK supported NIC`_
>>
>> @@ -73,9 +73,9 @@ Install DPDK
>>#. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
>>
>>   $ cd /usr/src/
>> -   $ wget https://fast.dpdk.org/rel/dpdk-20.11.1.tar.xz
>> -   $ tar xf dpdk-20.11.1.tar.xz
>> -   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.1
>> +   $ wget https://fast.dpdk.org/rel/dpdk-20.11.3.tar.xz
>> +   $ tar xf dpdk-20.11.3.tar.xz
>> +   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.3
>>   $ cd $DPDK_DIR
>>
>>#. Configure and install DPDK using Meson diff --git a/NEWS b/NEWS 
>> index 90f4b1590..b92445a32 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -1,6 +1,8 @@
>>Post-v2.16.0
>>-
>>   - DPDK:
>> + * OVS validated with DPDK 20.11.3. It is recommended to use this 
>> version
>> +   until further releases.
>> * EAL argument --socket-mem is no longer configured by default upon
>>   start-up.  If dpdk-socket-mem and dpdk-alloc-mem are not specified,
>>   DPDK defaults will be used.
>>
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

2021-09-21 Thread Kevin Traynor

On 21/09/2021 11:42, Kevin Traynor wrote:

On 21/09/2021 08:08, Suneetha Kalahasthi wrote:

Modify ci linux build script to use the latest DPDK stable release 20.11.3.
Modify Documentation to use the latest DPDK stable release 20.11.3.
Update NEWS file to reflect the latest DPDK stable release 20.11.3.
FAQ is updated to reflect the latest DPDK for each OVS branch.



David has reported a crash for virtio_user devices with 20.11.3 [1]. I
ran a quick test of adding a virtio_user port to OVS and there was no
crash, but maybe it was not a full test or I was lucky.



After talking to David, I reproduced with:
ovs-vsctl add-port br0 virtio_user0 -- set Interface virtio_user0 
type=dpdk 
options:dpdk-devargs=net_virtio_user0,iface=tap0,path=/dev/vhost-net,queues=3


You'd need to check if there are fixes impacting OVS in 20.11.3 that 
make it better to take now and document this known issue. Otherwise, 
probably better to wait until 20.11.4 with the fix for this.



Can you check if it is ok to use 20.11.3 with this known issue?

[1]
http://inbox.dpdk.org/dev/CAJFAV8yjvEvk-YQgwBb=zawcrn_p2ndzcugc2w-o+7jzoyd...@mail.gmail.com/


Signed-off-by: Suneetha Kalahasthi 
---
   .ci/linux-build.sh   | 2 +-
   Documentation/faq/releases.rst   | 8 
   Documentation/intro/install/dpdk.rst | 8 
   NEWS | 2 ++
   4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 863f02388..5323cb2f2 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -216,7 +216,7 @@ fi
   
   if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then

   if [ -z "$DPDK_VER" ]; then
-DPDK_VER="20.11.1"
+DPDK_VER="20.11.3"
   fi
   install_dpdk $DPDK_VER
   if [ "$CC" = "clang" ]; then
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 68c9867b1..4f8d105e6 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -205,10 +205,10 @@ Q: What DPDK version does each Open vSwitch release work 
with?
   2.10.x   17.11.10
   2.11.x   18.11.9
   2.12.x   18.11.9
-2.13.x   19.11.8
-2.14.x   19.11.8
-2.15.x   20.11.1
-2.16.x   20.11.1
+2.13.x   19.11.10
+2.14.x   19.11.10
+2.15.x   20.11.3
+2.16.x   20.11.3
    
   
   Q: Are all the DPDK releases that OVS versions work with maintained?

diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index 96843af73..83c758783 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -42,7 +42,7 @@ Build requirements
   In addition to the requirements described in :doc:`general`, building Open
   vSwitch with DPDK will require the following:
   
-- DPDK 20.11.1

+- DPDK 20.11.3
   
   - A `DPDK supported NIC`_
   
@@ -73,9 +73,9 @@ Install DPDK

   #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
   
  $ cd /usr/src/

-   $ wget https://fast.dpdk.org/rel/dpdk-20.11.1.tar.xz
-   $ tar xf dpdk-20.11.1.tar.xz
-   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.1
+   $ wget https://fast.dpdk.org/rel/dpdk-20.11.3.tar.xz
+   $ tar xf dpdk-20.11.3.tar.xz
+   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.3
  $ cd $DPDK_DIR
   
   #. Configure and install DPDK using Meson

diff --git a/NEWS b/NEWS
index 90f4b1590..b92445a32 100644
--- a/NEWS
+++ b/NEWS
@@ -1,6 +1,8 @@
   Post-v2.16.0
   -
  - DPDK:
+ * OVS validated with DPDK 20.11.3. It is recommended to use this version
+   until further releases.
* EAL argument --socket-mem is no longer configured by default upon
  start-up.  If dpdk-socket-mem and dpdk-alloc-mem are not specified,
  DPDK defaults will be used.





___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovn] problem: long tcp session instantiation with stateful ACLs

2021-09-21 Thread Dumitru Ceara
On 9/21/21 1:33 PM, Vladislav Odintsov wrote:
> Hi Dumitru,

Hi Vladislav,

> 
> are you talking about any specific _mising_ patch?

No, sorry for the confusion.  I just meant there's a bug in the OOT
module that was probably already fixed in the in-tree one so, likely,
one would have to figure out the patch that fixed it.

> 
> Regards,
> Vladislav Odintsov

Regards,
Dumitru

> 
>> On 16 Sep 2021, at 19:09, Dumitru Ceara  wrote:
>>
>> On 9/16/21 4:18 PM, Vladislav Odintsov wrote:
>>> Sorry, by OOT I meant non-inbox kmod.
>>> I’ve tried to use inbox kernel module (from kernel package) and problem 
>>> resolved.
>>>
>>> Regards,
>>> Vladislav Odintsov
>>>
 On 16 Sep 2021, at 17:17, Vladislav Odintsov  wrote:

 Hi Dumitru,

 I’ve tried to exclude OOT OVS kernel module.
 With OVN 20.06.3 + OVS 2.13.4 the problem solved.

 Could you please try with OOT kmod? For me it looks like a bug in OOT OVS 
 kernel module code.
>>
>> You're right, this seems to be a missing patch in the OOT openvswitch
>> module.  I could replicate the problem you reported with the OOT module.
>>
>> Regards,
>> Dumitru
>>

 Thanks.

 Regards,
 Vladislav Odintsov

> On 16 Sep 2021, at 11:02, Dumitru Ceara    >> wrote:
>
> On 9/16/21 2:50 AM, Vladislav Odintsov wrote:
>> Hi Dumitru,
>>
>> thanks for your reply.
>>
>> Regards,
>> Vladislav Odintsov
>>
>>> On 15 Sep 2021, at 11:24, Dumitru Ceara  wrote:
>>>
>>> Hi Vladislav,
>>>
>>> On 9/13/21 6:14 PM, Vladislav Odintsov wrote:
 Hi Numan,

 I’ve checked with OVS 2.16.0 and OVN master. The problem persists.
 Symptoms are the same.

 # grep ct_zero_snat /var/log/openvswitch/ovs-vswitchd.log
 2021-09-13T16:10:01.792Z|00019|ofproto_dpif|INFO|system@ovs-system: 
 Datapath supports ct_zero_snat
>>>
>>> This shouldn't be related to the problem we fixed with ct_zero_snat.
>>>

 Regards,
 Vladislav Odintsov

> On 13 Sep 2021, at 17:54, Numan Siddique  wrote:
>
> On Mon, Sep 13, 2021 at 8:10 AM Vladislav Odintsov  > wrote:
>>
>> Hi,
>>
>> we’ve encountered a next problem with stateful ACLs.
>>
>> Suppose, we have one logical switch (ls1) and attached to it a VIF 
>> type logical ports (lsp1, lsp2).
>> Each logical port has a linux VM besides it.
>>
>> Logical ports reside in port group (pg1) and two ACLs are created 
>> within this PG:
>> to-lport outport == @pg1 && ip4 && ip4.dst == 0.0.0.0/0 allow-related
>> from-lport outport == @pg1 && ip4 && ip4.src == 0.0.0.0/0 
>> allow-related
>>
>> When we have a high-connection rate service between VMs, the tcp 
>> source/dest ports may be reused before the connection is deleted 
>> from LSP’s-related conntrack zones on the host.
>> Let’s use curl with passing --local-port argument to have each time 
>> same source port.
>>
>> Run it from VM to another VM (172.31.0.18 -> 172.31.0.17):
>> curl --local-port 4 http://172.31.0.17/
>>
>> Check connections in client’s and server’s vif zones (client - 
>> zone=20, server - zone=1):
>> run while true script to check connections state per-second, while 
>> running new connection with same source/dest 5-tuple:
>>
>> while true; do date; grep -e 'zone=1 ' -e zone=20 
>> /proc/net/nf_conntrack; sleep 0.2; done
>>
>> Right after we’ve succesfully run curl, the connection is getting 
>> time-closed and next time-wait states:
>>
>> Mon Sep 13 14:34:39 MSK 2021
>> ipv4 2 tcp  6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 
>> sport=4 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 
>> dport=4 [ASSURED] mark=0 zone=1 use=2
>> ipv4 2 tcp  6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 
>> sport=4 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 
>> dport=4 [ASSURED] mark=0 zone=20 use=2
>> Mon Sep 13 14:34:39 MSK 2021
>> ipv4 2 tcp  6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 
>> sport=4 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 
>> dport=4 [ASSURED] mark=0 zone=1 use=2
>> ipv4 2 tcp  6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 
>> sport=4 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 
>> dport=4 [ASSURED] mark=0 zone=20 use=2
>>
>> And it remains in time-wait state for nf_conntrack_time_wait_timeout 
>> (120 seconds for centos 7).
>>
>> 

Re: [ovs-dev] [PATCH net-next] openvswitch: allow linking a VRF to an OVS bridge

2021-09-21 Thread Antoine Tenart
Hello,

Quoting Luis Tomas Bolivar (2021-09-21 13:20:08)
> 
> Follow up on this. I found the mistake I was making on the veth-pair
> addition configuration (ovs flow was setting the wrong mac address
> before sending the traffic through the veth device to the vrf). And it
> indeed works connecting the VRF to the OVS bridge by using a veth pair
> instead of directly plugin the VRF device as an OVS port.

Great! This means there is no need for this patch I believe, OVS bridge
is not different in the end.

Thanks,
Antoine
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ovs-ctl: add log level option to utilities/ovs-ctl.in

2021-09-21 Thread remijouannet
From: Remi Jouannet 

Add three new options to configure log level at runtime with ovs-ctl
--vconsole, --vsyslog-level and --vfile-level 

Signed-off-by: Remi Jouannet 
---
 utilities/ovs-ctl.in | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
index 7180079..bf9f466 100644
--- a/utilities/ovs-ctl.in
+++ b/utilities/ovs-ctl.in
@@ -143,7 +143,9 @@ do_start_ovsdb () {
 if test X"$SELF_CONFINEMENT" = Xno; then
 set "$@" --no-self-confinement
 fi
-set "$@" -vconsole:emer -vsyslog:err -vfile:info
+set "$@" -vconsole:"$VCONSOLE_LEVEL"
+set "$@" -vsyslog:"$VSYSLOG_LEVEL"
+set "$@" -vfile:"$VFILE_LEVEL"
 set "$@" --remote=punix:"$DB_SOCK"
 set "$@" --private-key=db:Open_vSwitch,SSL,private_key
 set "$@" --certificate=db:Open_vSwitch,SSL,certificate
@@ -211,7 +213,9 @@ do_start_forwarding () {
 
 # Start ovs-vswitchd.
 set ovs-vswitchd unix:"$DB_SOCK"
-set "$@" -vconsole:emer -vsyslog:err -vfile:info
+set "$@" -vconsole:"$VCONSOLE_LEVEL"
+set "$@" -vsyslog:"$VSYSLOG_LEVEL"
+set "$@" -vfile:"$VFILE_LEVEL"
 if test X"$MLOCKALL" != Xno; then
 set "$@" --mlockall
 fi
@@ -352,6 +356,10 @@ set_defaults () {
 DPORT=
 SPORT=
 
+VCONSOLE_LEVEL=emer
+VSYSLOG_LEVEL=err
+VFILE_LEVEL=info
+
 IKE_DAEMON=
 RESTART_IKE_DAEMON=yes
 
@@ -441,6 +449,11 @@ Options for "enable-protocol":
   --sport=PORT   source port to match (for tcp or udp protocol)
   --dport=PORT   ddestination port to match (for tcp or udp protocol)
 
+Log level options, documentation in ovs-appctl.8:
+  --vconsole-level=LEVEL console logging level (default: $VCONSOLE_LEVEL)
+  --vsyslog-level=LEVEL   syslog logging level (default: $VSYSLOG_LEVEL)
+  --vfile-level=LEVEL   file logging level (default: $VFILE_LEVEL)
+
 Option for "start-ovs-ipsec":
   --ike-daemon=IKE_DAEMON
   the IKE daemon for ipsec tunnels (either libreswan or strongswan)
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next] openvswitch: allow linking a VRF to an OVS bridge

2021-09-21 Thread Antoine Tenart
VRF devices are prevented from being added to upper devices since commit
1017e0987117 ("vrf: prevent adding upper devices") as they set the
IFF_NO_RX_HANDLER flag. However attaching a VRF to an OVS bridge is a
valid use case[1].

Allow a VRF device to be attached to an OVS bridge by having an OVS
specific tweak. This approach allows not to change a valid logic
elsewhere and the IFF_NO_RX_HANDLER limitation still applies for non-OVS
upper devices, even after a VRF was unlinked from an OVS bridge.

(Patch not sent as a fix as the commit introducing the limitation is not
recent).

[1] https://ltomasbo.wordpress.com/2021/06/25/openstack-networking-with-evpn/

Signed-off-by: Antoine Tenart 
---

Hi all,

I thought about other ways to fix this but did not want to add yet
another flag, nor to add specific logic outside of net/openvswitch/. A
custom netdev_rx_handler_register having priv_flags as a parameter could
also have been added, but again that seemed a bit invasive.

There might be questions about the setup in which a VRF is linked to an
OVS bridge; I cc'ed Luis Tomás who wrote the article.

Thanks,
Antoine

 net/openvswitch/vport-netdev.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 8e1a88f13622..e76b2477d384 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -75,6 +75,7 @@ static struct net_device *get_dpdev(const struct datapath *dp)
 
 struct vport *ovs_netdev_link(struct vport *vport, const char *name)
 {
+   unsigned int saved_flags;
int err;
 
vport->dev = dev_get_by_name(ovs_dp_get_net(vport->dp), name);
@@ -98,8 +99,17 @@ struct vport *ovs_netdev_link(struct vport *vport, const 
char *name)
if (err)
goto error_unlock;
 
+   /* While IFF_NO_RX_HANDLER is rightly set for l3 masters (VRF) as they
+* don't work with upper devices, they can be attached to OVS bridges.
+*/
+   saved_flags = vport->dev->priv_flags;
+   if (netif_is_l3_master(vport->dev))
+   vport->dev->priv_flags &= ~IFF_NO_RX_HANDLER;
+
err = netdev_rx_handler_register(vport->dev, netdev_frame_hook,
 vport);
+   vport->dev->priv_flags = saved_flags;
+
if (err)
goto error_master_upper_dev_unlink;
 
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovs: Only clear tstamp when changing namespaces

2021-09-21 Thread Tyler Stachecki
On Sun, Sep 19, 2021 at 7:33 PM Cong Wang  wrote:
>
> On Sun, Sep 19, 2021 at 10:59 AM Tyler J. Stachecki
>  wrote:
> >
> > As of "ovs: clear skb->tstamp in forwarding path", the
> > tstamp is now being cleared unconditionally to fix fq qdisc
> > operation with ovs vports.
> >
> > While this is mostly correct and fixes forwarding for that
> > use case, a slight adjustment is necessary to ensure that
> > the tstamp is cleared *only when the forwarding is across
> > namespaces*.
>
> Hmm? I am sure timestamp has already been cleared when
> crossing netns:
>
> void skb_scrub_packet(struct sk_buff *skb, bool xnet)
> {
> ...
> if (!xnet)
> return;
>
> ipvs_reset(skb);
> skb->mark = 0;
> skb->tstamp = 0;
> }
>
> So, what are you trying to fix?
>
> >
> > Signed-off-by: Tyler J. Stachecki 
> > ---
> >  net/openvswitch/vport.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> > index cf2ce5812489..c2d32a5c3697 100644
> > --- a/net/openvswitch/vport.c
> > +++ b/net/openvswitch/vport.c
> > @@ -507,7 +507,8 @@ void ovs_vport_send(struct vport *vport, struct sk_buff 
> > *skb, u8 mac_proto)
> > }
> >
> > skb->dev = vport->dev;
> > -   skb->tstamp = 0;
> > +   if (dev_net(skb->dev))
>
> Doesn't dev_net() always return a non-NULL pointer?
>
> If you really want to check whether it is cross-netns, you should
> use net_eq() to compare src netns with dst netns, something like:
> if (!net_eq(dev_net(vport->dev), dev_net(skb->dev))).
>
> Thanks.

Sorry if this is a no-op -- I'm admittedly not familiar with this part
of the tree.  I had added this check based on this discussion on the
OVS mailing list:
https://mail.openvswitch.org/pipermail/ovs-discuss/2021-February/050966.html

The motivation to add it was based on the fact that skb_scrub_packet
is doing it conditionally as well, but you seem to indicate that
skb_scrub_packet itself is already being done somewhere?

Cheers,
Tyler
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovs: Only clear tstamp when changing namespaces

2021-09-21 Thread Cong Wang
On Sun, Sep 19, 2021 at 10:44 PM Tyler Stachecki
 wrote:
> Sorry if this is a no-op -- I'm admittedly not familiar with this part
> of the tree.  I had added this check based on this discussion on the
> OVS mailing list:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2021-February/050966.html
>
> The motivation to add it was based on the fact that skb_scrub_packet
> is doing it conditionally as well, but you seem to indicate that
> skb_scrub_packet itself is already being done somewhere?

I mean, skb->tstamp has been cleared when crossing netns,
so: 1) you don't need to clear it again for this case; 2) clearly we
fix other cases with commit 01634047bf0d, so your patch break
it again.

Thanks.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next] openvswitch: allow linking a VRF to an OVS bridge

2021-09-21 Thread David Ahern
On 9/20/21 9:34 AM, Antoine Tenart wrote:
> There might be questions about the setup in which a VRF is linked to an
> OVS bridge; I cc'ed Luis Tomás who wrote the article.

My head just exploded. You want to make an L3 device a port of an L2
device.

Can someone explain how this is supposed to work and why it is even
needed? ie., given how an OVS bridge handles packets and the point of a
VRF device (to direct lookups to a table), the 2 are at odds in my head.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovs: Only clear tstamp when changing namespaces

2021-09-21 Thread Cong Wang
On Sun, Sep 19, 2021 at 10:59 AM Tyler J. Stachecki
 wrote:
>
> As of "ovs: clear skb->tstamp in forwarding path", the
> tstamp is now being cleared unconditionally to fix fq qdisc
> operation with ovs vports.
>
> While this is mostly correct and fixes forwarding for that
> use case, a slight adjustment is necessary to ensure that
> the tstamp is cleared *only when the forwarding is across
> namespaces*.

Hmm? I am sure timestamp has already been cleared when
crossing netns:

void skb_scrub_packet(struct sk_buff *skb, bool xnet)
{
...
if (!xnet)
return;

ipvs_reset(skb);
skb->mark = 0;
skb->tstamp = 0;
}

So, what are you trying to fix?

>
> Signed-off-by: Tyler J. Stachecki 
> ---
>  net/openvswitch/vport.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> index cf2ce5812489..c2d32a5c3697 100644
> --- a/net/openvswitch/vport.c
> +++ b/net/openvswitch/vport.c
> @@ -507,7 +507,8 @@ void ovs_vport_send(struct vport *vport, struct sk_buff 
> *skb, u8 mac_proto)
> }
>
> skb->dev = vport->dev;
> -   skb->tstamp = 0;
> +   if (dev_net(skb->dev))

Doesn't dev_net() always return a non-NULL pointer?

If you really want to check whether it is cross-netns, you should
use net_eq() to compare src netns with dst netns, something like:
if (!net_eq(dev_net(vport->dev), dev_net(skb->dev))).

Thanks.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ovs: Only clear tstamp when changing namespaces

2021-09-21 Thread Tyler J. Stachecki
As of "ovs: clear skb->tstamp in forwarding path", the
tstamp is now being cleared unconditionally to fix fq qdisc
operation with ovs vports.

While this is mostly correct and fixes forwarding for that
use case, a slight adjustment is necessary to ensure that
the tstamp is cleared *only when the forwarding is across
namespaces*.

Signed-off-by: Tyler J. Stachecki 
---
 net/openvswitch/vport.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index cf2ce5812489..c2d32a5c3697 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -507,7 +507,8 @@ void ovs_vport_send(struct vport *vport, struct sk_buff 
*skb, u8 mac_proto)
}
 
skb->dev = vport->dev;
-   skb->tstamp = 0;
+   if (dev_net(skb->dev))
+   skb->tstamp = 0;
vport->ops->send(skb);
return;
 
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovn] problem: long tcp session instantiation with stateful ACLs

2021-09-21 Thread Vladislav Odintsov
Hi Dumitru,

are you talking about any specific _mising_ patch?

Regards,
Vladislav Odintsov

> On 16 Sep 2021, at 19:09, Dumitru Ceara  wrote:
> 
> On 9/16/21 4:18 PM, Vladislav Odintsov wrote:
>> Sorry, by OOT I meant non-inbox kmod.
>> I’ve tried to use inbox kernel module (from kernel package) and problem 
>> resolved.
>> 
>> Regards,
>> Vladislav Odintsov
>> 
>>> On 16 Sep 2021, at 17:17, Vladislav Odintsov  wrote:
>>> 
>>> Hi Dumitru,
>>> 
>>> I’ve tried to exclude OOT OVS kernel module.
>>> With OVN 20.06.3 + OVS 2.13.4 the problem solved.
>>> 
>>> Could you please try with OOT kmod? For me it looks like a bug in OOT OVS 
>>> kernel module code.
> 
> You're right, this seems to be a missing patch in the OOT openvswitch
> module.  I could replicate the problem you reported with the OOT module.
> 
> Regards,
> Dumitru
> 
>>> 
>>> Thanks.
>>> 
>>> Regards,
>>> Vladislav Odintsov
>>> 
 On 16 Sep 2021, at 11:02, Dumitru Ceara >>>  >> wrote:
 
 On 9/16/21 2:50 AM, Vladislav Odintsov wrote:
> Hi Dumitru,
> 
> thanks for your reply.
> 
> Regards,
> Vladislav Odintsov
> 
>> On 15 Sep 2021, at 11:24, Dumitru Ceara  wrote:
>> 
>> Hi Vladislav,
>> 
>> On 9/13/21 6:14 PM, Vladislav Odintsov wrote:
>>> Hi Numan,
>>> 
>>> I’ve checked with OVS 2.16.0 and OVN master. The problem persists.
>>> Symptoms are the same.
>>> 
>>> # grep ct_zero_snat /var/log/openvswitch/ovs-vswitchd.log
>>> 2021-09-13T16:10:01.792Z|00019|ofproto_dpif|INFO|system@ovs-system: 
>>> Datapath supports ct_zero_snat
>> 
>> This shouldn't be related to the problem we fixed with ct_zero_snat.
>> 
>>> 
>>> Regards,
>>> Vladislav Odintsov
>>> 
 On 13 Sep 2021, at 17:54, Numan Siddique  wrote:
 
 On Mon, Sep 13, 2021 at 8:10 AM Vladislav Odintsov >>> > wrote:
> 
> Hi,
> 
> we’ve encountered a next problem with stateful ACLs.
> 
> Suppose, we have one logical switch (ls1) and attached to it a VIF 
> type logical ports (lsp1, lsp2).
> Each logical port has a linux VM besides it.
> 
> Logical ports reside in port group (pg1) and two ACLs are created 
> within this PG:
> to-lport outport == @pg1 && ip4 && ip4.dst == 0.0.0.0/0 allow-related
> from-lport outport == @pg1 && ip4 && ip4.src == 0.0.0.0/0 
> allow-related
> 
> When we have a high-connection rate service between VMs, the tcp 
> source/dest ports may be reused before the connection is deleted from 
> LSP’s-related conntrack zones on the host.
> Let’s use curl with passing --local-port argument to have each time 
> same source port.
> 
> Run it from VM to another VM (172.31.0.18 -> 172.31.0.17):
> curl --local-port 4 http://172.31.0.17/
> 
> Check connections in client’s and server’s vif zones (client - 
> zone=20, server - zone=1):
> run while true script to check connections state per-second, while 
> running new connection with same source/dest 5-tuple:
> 
> while true; do date; grep -e 'zone=1 ' -e zone=20 
> /proc/net/nf_conntrack; sleep 0.2; done
> 
> Right after we’ve succesfully run curl, the connection is getting 
> time-closed and next time-wait states:
> 
> Mon Sep 13 14:34:39 MSK 2021
> ipv4 2 tcp  6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 
> sport=4 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 
> dport=4 [ASSURED] mark=0 zone=1 use=2
> ipv4 2 tcp  6 59 CLOSE_WAIT src=172.31.0.18 dst=172.31.0.17 
> sport=4 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 
> dport=4 [ASSURED] mark=0 zone=20 use=2
> Mon Sep 13 14:34:39 MSK 2021
> ipv4 2 tcp  6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 
> sport=4 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 
> dport=4 [ASSURED] mark=0 zone=1 use=2
> ipv4 2 tcp  6 119 TIME_WAIT src=172.31.0.18 dst=172.31.0.17 
> sport=4 dport=80 src=172.31.0.17 dst=172.31.0.18 sport=80 
> dport=4 [ASSURED] mark=0 zone=20 use=2
> 
> And it remains in time-wait state for nf_conntrack_time_wait_timeout 
> (120 seconds for centos 7).
> 
> Everything is okay for now.
> While we have installed connections in TW state in zone 1 and 20, 
> lets run this curl (source port 4) again:
> 1st SYN packet is lost. It didn’t get to destination VM. In conntrack 
> we have:
> 
> Mon Sep 13 14:34:41 MSK 2021
> ipv4 2 tcp  6 118 TIME_WAIT src=172.31.0.18 

Re: [ovs-dev] [PATCH net-next] openvswitch: allow linking a VRF to an OVS bridge

2021-09-21 Thread Luis Tomas Bolivar
On Tue, Sep 21, 2021 at 10:12 AM Luis Tomas Bolivar  wrote:
>
> On Mon, Sep 20, 2021 at 5:45 PM David Ahern  wrote:
> >
> > On 9/20/21 9:34 AM, Antoine Tenart wrote:
> > > There might be questions about the setup in which a VRF is linked to an
> > > OVS bridge; I cc'ed Luis Tomás who wrote the article.
> >
> > My head just exploded. You want to make an L3 device a port of an L2
> > device.
> >
> > Can someone explain how this is supposed to work and why it is even
> > needed? ie., given how an OVS bridge handles packets and the point of a
> > VRF device (to direct lookups to a table), the 2 are at odds in my head.
> >
>
> Hi David,
>
> Thanks for your comment. And yes you are right, this probably is a bit
> of an odd setup. That said, OVS is not pure L2 as it knows about IPs
> and it is doing virtual routing too (we can say it is 2.5 xD)
>
> What we want to achieve is something similar to what is shown in slide 100
> here http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf, but 
> instead
> of connecting the VRF bridge directly to containers, we have a single ovs
> bridge (where the OpenStack VMs are connected to) where we connect the
> vrfs in different (ovs) ports (so that the traffic in the way out of OVS can 
> be
> redirected to the right VRF).
>
> The initial part is pretty much the same as in the slide 100:
> 1) creating the vrf
>- ip link add vrf-1001 type vrf table 1001
> 2) vxlan device, in our case associated to the loopback,
> for ECMP (instead of associate both nics/vlan devices to the VRF)
>- ip link add vxlan-1001 type vxlan id 1001 dstport 4789 local L_IP
> nolearning
> 3) create the linux bridge device
>- ip link add name br-1001 type bridge stp_state 0
> 4) link the 3 above
>- ip link set br-1001 master vrf-1001 (bridge to vrf)
>- ip link set vxlan-1001 master br-1001 (vxlan to bridge)
>
> Then, I'm attaching the vrf device also as an ovs bridge port, so that
> traffic (together with some ip routes in the vrf routing table) can be
> redirected
> to OVS, and the (OpenStack) virtual networking happens there
> (br-ex is the ovs bridge)
>- ovs-vsctl add-port br-ex vrf-1001
>- ip route show vrf vrf-1001
>10.0.0.0/26 via 172.24.4.146 dev br-ex
>(redirect traffic to OpenStack subnet 10.0.0.0/26 to br-ex)
>172.24.4.146 dev br-ex scope link
>
> Perhaps there is a better way of connecting this?
> I tried (without success) to create a veth device and set one end on
> the linux bridge and the other on the OVS bridge.

Follow up on this. I found the mistake I was making on the veth-pair
addition configuration (ovs flow was setting the wrong mac address
before sending the traffic through the veth device to the vrf). And it
indeed works connecting the VRF to the OVS bridge by using a veth pair
instead of directly plugin the VRF device as an OVS port.

>
> Best regards,
> Luis



-- 
LUIS TOMÁS BOLÍVAR
Principal Software Engineer
Red Hat
Madrid, Spain
ltoma...@redhat.com

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

2021-09-21 Thread Kevin Traynor

On 21/09/2021 08:08, Suneetha Kalahasthi wrote:

Modify ci linux build script to use the latest DPDK stable release 20.11.3.
Modify Documentation to use the latest DPDK stable release 20.11.3.
Update NEWS file to reflect the latest DPDK stable release 20.11.3.
FAQ is updated to reflect the latest DPDK for each OVS branch.



David has reported a crash for virtio_user devices with 20.11.3 [1]. I 
ran a quick test of adding a virtio_user port to OVS and there was no 
crash, but maybe it was not a full test or I was lucky.


Can you check if it is ok to use 20.11.3 with this known issue?

[1] 
http://inbox.dpdk.org/dev/CAJFAV8yjvEvk-YQgwBb=zawcrn_p2ndzcugc2w-o+7jzoyd...@mail.gmail.com/



Signed-off-by: Suneetha Kalahasthi 
---
  .ci/linux-build.sh   | 2 +-
  Documentation/faq/releases.rst   | 8 
  Documentation/intro/install/dpdk.rst | 8 
  NEWS | 2 ++
  4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 863f02388..5323cb2f2 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -216,7 +216,7 @@ fi
  
  if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then

  if [ -z "$DPDK_VER" ]; then
-DPDK_VER="20.11.1"
+DPDK_VER="20.11.3"
  fi
  install_dpdk $DPDK_VER
  if [ "$CC" = "clang" ]; then
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 68c9867b1..4f8d105e6 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -205,10 +205,10 @@ Q: What DPDK version does each Open vSwitch release work 
with?
  2.10.x   17.11.10
  2.11.x   18.11.9
  2.12.x   18.11.9
-2.13.x   19.11.8
-2.14.x   19.11.8
-2.15.x   20.11.1
-2.16.x   20.11.1
+2.13.x   19.11.10
+2.14.x   19.11.10
+2.15.x   20.11.3
+2.16.x   20.11.3
   
  
  Q: Are all the DPDK releases that OVS versions work with maintained?

diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index 96843af73..83c758783 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -42,7 +42,7 @@ Build requirements
  In addition to the requirements described in :doc:`general`, building Open
  vSwitch with DPDK will require the following:
  
-- DPDK 20.11.1

+- DPDK 20.11.3
  
  - A `DPDK supported NIC`_
  
@@ -73,9 +73,9 @@ Install DPDK

  #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
  
 $ cd /usr/src/

-   $ wget https://fast.dpdk.org/rel/dpdk-20.11.1.tar.xz
-   $ tar xf dpdk-20.11.1.tar.xz
-   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.1
+   $ wget https://fast.dpdk.org/rel/dpdk-20.11.3.tar.xz
+   $ tar xf dpdk-20.11.3.tar.xz
+   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.3
 $ cd $DPDK_DIR
  
  #. Configure and install DPDK using Meson

diff --git a/NEWS b/NEWS
index 90f4b1590..b92445a32 100644
--- a/NEWS
+++ b/NEWS
@@ -1,6 +1,8 @@
  Post-v2.16.0
  -
 - DPDK:
+ * OVS validated with DPDK 20.11.3. It is recommended to use this version
+   until further releases.
   * EAL argument --socket-mem is no longer configured by default upon
 start-up.  If dpdk-socket-mem and dpdk-alloc-mem are not specified,
 DPDK defaults will be used.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 6/6] dpif-netdev/mfex: Avoid hashing when opt mfex called

2021-09-21 Thread Kumar Amber
This patch avoids calculating the software hash of the packet again
if the optimized miniflow-extract hit and has already calculated the
packet hash. In cases of scalar miniflow extract, the normal hashing
calculation is performed.

Signed-off-by: Kumar Amber 
---
 lib/dpif-netdev-avx512.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c
index 544d36903..2188abfd9 100644
--- a/lib/dpif-netdev-avx512.c
+++ b/lib/dpif-netdev-avx512.c
@@ -210,15 +210,15 @@ dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread 
*pmd,
 if (!mfex_hit) {
 /* Do a scalar miniflow extract into keys. */
 miniflow_extract(packet, >mf);
+key->len = netdev_flow_key_size(miniflow_n_values(>mf));
+key->hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet,
+ >mf);
 }
 
 /* Cache TCP and byte values for all packets. */
 pkt_meta[i].bytes = dp_packet_size(packet);
 pkt_meta[i].tcp_flags = miniflow_get_tcp_flags(>mf);
 
-key->len = netdev_flow_key_size(miniflow_n_values(>mf));
-key->hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet, >mf);
-
 if (emc_enabled) {
 f = emc_lookup(>emc_cache, key);
 
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 5/6] dpif-netdev/mfex: Add ipv6 profile based hashing

2021-09-21 Thread Kumar Amber
This commit adds IPv6 profile specific hashing which
uses fixed offsets into the packet to improve hashing
perforamnce.

Hash value is autovalidated by MFEX autovalidator.

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 

---
v2:
- Fix check-patch sign-offs
---
 NEWS |  1 +
 lib/dpif-netdev-extract-avx512.c | 57 
 2 files changed, 58 insertions(+)

diff --git a/NEWS b/NEWS
index 27c2eddaf..a2099efcd 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,7 @@ Post-v2.16.0
  * Add AVX512 optimized profiles to miniflow extract for VLAN/IPv6/UDP
and VLAN/IPv6/TCP.
  * Add IPv4 profile based 5tuple hashing optimizations.
+ * Add IPv6 profile based 5tuple hashing optimizations.
 
 
 v2.16.0 - 16 Aug 2021
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index 196ec1625..a16ba06b7 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -360,6 +360,12 @@ enum MFEX_PROFILES {
 #define HASH_DT1Q_IPV4 \
 30, 34, 27, 38, 0, 0
 
+#define HASH_IPV6 \
+22, 30, 38, 46, 20, 54
+
+#define HASH_DT1Q_IPV6 \
+26, 34, 42, 50, 24, 58
+
 /* Static const instances of profiles. These are compile-time constants,
  * and are specialized into individual miniflow-extract functions.
  * NOTE: Order of the fields is significant, any change in the order must be
@@ -451,6 +457,8 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 0, UINT16_MAX, 14, 54,
 },
 .dp_pkt_min_size = 54,
+
+.hash_pkt_offs = { HASH_IPV6 },
 },
 
 [PROFILE_ETH_IPV6_TCP] = {
@@ -465,6 +473,8 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 0, UINT16_MAX, 14, 54,
 },
 .dp_pkt_min_size = 54,
+
+.hash_pkt_offs = { HASH_IPV6 },
 },
 
 [PROFILE_ETH_VLAN_IPV6_TCP] = {
@@ -481,6 +491,8 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 14, UINT16_MAX, 18, 58,
 },
 .dp_pkt_min_size = 66,
+
+.hash_pkt_offs = { HASH_DT1Q_IPV6 },
 },
 
 [PROFILE_ETH_VLAN_IPV6_UDP] = {
@@ -497,6 +509,8 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 14, UINT16_MAX, 18, 58,
 },
 .dp_pkt_min_size = 66,
+
+.hash_pkt_offs = { HASH_DT1Q_IPV6 },
 },
 };
 
@@ -576,6 +590,38 @@ mfex_5tuple_hash_ipv4(struct dp_packet *packet, const 
uint8_t *pkt,
 key->len = netdev_flow_key_size(miniflow_n_values(>mf));
 }
 
+static inline void
+mfex_5tuple_hash_ipv6(struct dp_packet *packet, const uint8_t *pkt,
+  struct netdev_flow_key *key,
+  const uint8_t *pkt_offsets)
+{
+if (!dp_packet_rss_valid(packet)) {
+uint32_t hash = 0;
+void *ipv6_src_lo = (void *) [pkt_offsets[0]];
+void *ipv6_src_hi = (void *) [pkt_offsets[1]];
+void *ipv6_dst_lo = (void *) [pkt_offsets[2]];
+void *ipv6_dst_hi = (void *) [pkt_offsets[3]];
+void *ports_l4 = (void *) [pkt_offsets[5]];
+
+/* IPv6 Src and Dst. */
+hash = hash_add64(hash, *(uint64_t *) ipv6_src_lo);
+hash = hash_add64(hash, *(uint64_t *) ipv6_src_hi);
+hash = hash_add64(hash, *(uint64_t *) ipv6_dst_lo);
+hash = hash_add64(hash, *(uint64_t *) ipv6_dst_hi);
+/* IPv6 proto. */
+hash = hash_add(hash, pkt[pkt_offsets[4]]);
+/* L4 ports. */
+hash = hash_add(hash, *(uint32_t *) ports_l4);
+hash = hash_finish(hash, 42);
+
+dp_packet_set_rss_hash(packet, hash);
+key->hash = hash;
+} else {
+key->hash = dp_packet_get_rss_hash(packet);
+}
+key->len = netdev_flow_key_size(miniflow_n_values(>mf));
+}
+
 /* Protocol specific helper functions, for calculating offsets/lenghts. */
 static int32_t
 mfex_ipv4_set_l2_pad_size(struct dp_packet *pkt, struct ip_header *nh,
@@ -769,6 +815,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 /* Process UDP header. */
 mfex_handle_ipv6_l4((void *)[54], [9]);
 
+mfex_5tuple_hash_ipv6(packet, pkt, [i],
+  profile->hash_pkt_offs);
+
 } break;
 
 case PROFILE_ETH_IPV6_TCP: {
@@ -786,6 +835,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 const struct tcp_header *tcp = (void *)[54];
 mfex_handle_tcp_flags(tcp, [9]);
 
+mfex_5tuple_hash_ipv6(packet, pkt, [i],
+  profile->hash_pkt_offs);
+
 } break;
 
 case PROFILE_ETH_VLAN_IPV6_TCP: {
@@ -806,6 +858,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 const struct tcp_header *tcp = (void *)[58];
 mfex_handle_tcp_flags(tcp, [10]);
 
+mfex_5tuple_hash_ipv6(packet, pkt, [i],
+  

[ovs-dev] [PATCH v3 4/6] dpif-netdev/mfex: Add ipv4 profile based hashing

2021-09-21 Thread Kumar Amber
This commit adds IPv4 profile specific hashing which
uses fixed offsets into the packet to improve hashing
perforamnce.

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 

---
v3:
- Fix check-patch sign-offs
---
 NEWS |  1 +
 lib/dpif-netdev-extract-avx512.c | 57 
 2 files changed, 58 insertions(+)

diff --git a/NEWS b/NEWS
index 5578d4ed1..27c2eddaf 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,7 @@ Post-v2.16.0
IPv6/TCP.
  * Add AVX512 optimized profiles to miniflow extract for VLAN/IPv6/UDP
and VLAN/IPv6/TCP.
+ * Add IPv4 profile based 5tuple hashing optimizations.
 
 
 v2.16.0 - 16 Aug 2021
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index 11bca0144..196ec1625 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -297,6 +297,9 @@ struct mfex_profile {
 uint64_t mf_bits[FLOWMAP_UNITS];
 uint16_t dp_pkt_offs[4];
 uint16_t dp_pkt_min_size;
+
+/* Constant data offsets for Hashing. */
+uint8_t hash_pkt_offs[6];
 };
 
 /* Ensure dp_pkt_offs[4] is the correct size as in struct dp_packet. */
@@ -350,6 +353,13 @@ enum MFEX_PROFILES {
 PROFILE_COUNT,
 };
 
+/* Packet offsets for 5 tuple Hash function. */
+#define HASH_IPV4 \
+26, 30, 23, 34, 0, 0
+
+#define HASH_DT1Q_IPV4 \
+30, 34, 27, 38, 0, 0
+
 /* Static const instances of profiles. These are compile-time constants,
  * and are specialized into individual miniflow-extract functions.
  * NOTE: Order of the fields is significant, any change in the order must be
@@ -369,6 +379,8 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 0, UINT16_MAX, 14, 34,
 },
 .dp_pkt_min_size = 42,
+
+.hash_pkt_offs = { HASH_IPV4 },
 },
 
 [PROFILE_ETH_IPV4_TCP] = {
@@ -383,6 +395,8 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 0, UINT16_MAX, 14, 34,
 },
 .dp_pkt_min_size = 54,
+
+.hash_pkt_offs = { HASH_IPV4 },
 },
 
 [PROFILE_ETH_VLAN_IPV4_UDP] = {
@@ -401,6 +415,8 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 14, UINT16_MAX, 18, 38,
 },
 .dp_pkt_min_size = 46,
+
+.hash_pkt_offs = { HASH_DT1Q_IPV4 },
 },
 
 [PROFILE_ETH_VLAN_IPV4_TCP] = {
@@ -419,6 +435,8 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 14, UINT16_MAX, 18, 38,
 },
 .dp_pkt_min_size = 46,
+
+.hash_pkt_offs = { HASH_DT1Q_IPV4 },
 },
 
 [PROFILE_ETH_IPV6_UDP] = {
@@ -530,6 +548,34 @@ mfex_ipv6_set_l2_pad_size(struct dp_packet *pkt,
 dp_packet_set_l2_pad_size(pkt, payload_size_ipv6 - p_len);
 }
 
+static inline void
+mfex_5tuple_hash_ipv4(struct dp_packet *packet, const uint8_t *pkt,
+  struct netdev_flow_key *key,
+  const uint8_t *pkt_offsets)
+{
+if (!dp_packet_rss_valid(packet)) {
+uint32_t hash = 0;
+void *ipv4_src = (void *) [pkt_offsets[0]];
+void *ipv4_dst = (void *) [pkt_offsets[1]];
+void *ports_l4 = (void *) [pkt_offsets[3]];
+
+/* IPv4 Src and Dst. */
+hash = hash_add(hash, *(uint32_t *) ipv4_src);
+hash = hash_add(hash, *(uint32_t *) ipv4_dst);
+/* IPv4 proto. */
+hash = hash_add(hash, pkt[pkt_offsets[2]]);
+/* L4 ports. */
+hash = hash_add(hash, *(uint32_t *) ports_l4);
+hash = hash_finish(hash, 42);
+
+dp_packet_set_rss_hash(packet, hash);
+key->hash = hash;
+} else {
+key->hash = dp_packet_get_rss_hash(packet);
+}
+key->len = netdev_flow_key_size(miniflow_n_values(>mf));
+}
+
 /* Protocol specific helper functions, for calculating offsets/lenghts. */
 static int32_t
 mfex_ipv4_set_l2_pad_size(struct dp_packet *pkt, struct ip_header *nh,
@@ -664,6 +710,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 /* Process TCP flags, and store to blocks. */
 const struct tcp_header *tcp = (void *)[38];
 mfex_handle_tcp_flags(tcp, [7]);
+
+mfex_5tuple_hash_ipv4(packet, pkt, [i],
+  profile->hash_pkt_offs);
 } break;
 
 case PROFILE_ETH_VLAN_IPV4_UDP: {
@@ -674,6 +723,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 if (mfex_ipv4_set_l2_pad_size(packet, nh, size_from_ipv4)) {
 continue;
 }
+
+mfex_5tuple_hash_ipv4(packet, pkt, [i],
+  profile->hash_pkt_offs);
 } break;
 
 case PROFILE_ETH_IPV4_TCP: {
@@ -687,6 +739,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 if (mfex_ipv4_set_l2_pad_size(packet, nh, size_from_ipv4)) {
 continue;
 }
+

[ovs-dev] [PATCH v3 3/6] dpif-netdev/mfex: Add packet hash check to autovalidator

2021-09-21 Thread Kumar Amber
This patch adds the per profile AVX512 opt hashing to autovalidator
for validating the hash values against the scalar hash.

Signed-off-by: Kumar Amber 
---
 lib/dpif-netdev-private-extract.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/lib/dpif-netdev-private-extract.c 
b/lib/dpif-netdev-private-extract.c
index b3d96075c..263629903 100644
--- a/lib/dpif-netdev-private-extract.c
+++ b/lib/dpif-netdev-private-extract.c
@@ -303,6 +303,9 @@ dpif_miniflow_extract_autovalidator(struct dp_packet_batch 
*packets,
 DP_PACKET_BATCH_FOR_EACH (i, packet, packets) {
 pkt_metadata_init(>md, in_port);
 miniflow_extract(packet, [i].mf);
+keys[i].len = netdev_flow_key_size(miniflow_n_values([i].mf));
+keys[i].hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet,
+[i].mf);
 
 /* Store known good metadata to compare with optimized metadata. */
 good_l2_5_ofs[i] = packet->l2_5_ofs;
@@ -352,6 +355,15 @@ dpif_miniflow_extract_autovalidator(struct dp_packet_batch 
*packets,
 failed = 1;
 }
 
+/* Check hashes are equal. */
+if ((keys[i].hash != test_keys[i].hash) ||
+(keys[i].len != test_keys[i].len)) {
+ds_put_format(_msg, "Good hash: %d len: %d\tTest hash:%d"
+  " len:%d\n", keys[i].hash, keys[i].len,
+  test_keys[i].hash, test_keys[i].len);
+failed = 1;
+}
+
 if (!miniflow_equal([i].mf, _keys[i].mf)) {
 uint32_t block_cnt = miniflow_n_values([i].mf);
 ds_put_format(_msg, "Autovalidation blocks failed\n"
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 2/6] dpif-netdev/mfex: Add AVX512 vlan ipv6 traffic profiles

2021-09-21 Thread Kumar Amber
Add AVX512 Ipv6 optimized profile for vlan/IPv6/UDP and
vlan/IPv6/TCP.

MFEX autovalidaton test-case already has the IPv6 support for
validating against the scalar mfex.

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 

---
v2:
- Fix check-patch sign-offs
---
 NEWS  |  2 +
 lib/dpif-netdev-extract-avx512.c  | 94 +++
 lib/dpif-netdev-private-extract.c | 23 
 lib/dpif-netdev-private-extract.h |  6 ++
 4 files changed, 125 insertions(+)

diff --git a/NEWS b/NEWS
index b2c6e8fff..5578d4ed1 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,8 @@ Post-v2.16.0
- Userspace datapath:
  * Add AVX512 optimized profiles to miniflow extract for IPv6/UDP and
IPv6/TCP.
+ * Add AVX512 optimized profiles to miniflow extract for VLAN/IPv6/UDP
+   and VLAN/IPv6/TCP.
 
 
 v2.16.0 - 16 Aug 2021
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index 3384a8dba..11bca0144 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -214,6 +214,21 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, 
__m512i idx, __m512i a)
   38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, /* IPv6 */  \
   NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* Unused */
 
+/* VLAN (Dot1Q) patterns and masks. */
+#define PATTERN_DT1Q_MASK \
+  0x00, 0x00, 0xFF, 0xFF,
+#define PATTERN_DT1Q_IPV6 \
+  0x00, 0x00, 0x86, 0xDD,
+
+#define PATTERN_DT1Q_IPV6_SHUFFLE \
+  /* Ether (2 blocks): Note that *VLAN* type is written here. */  \
+  0,  1,  2,  3,  4,  5,  6,  7, 8,  9, 10, 11, 16, 17,  0,  0,   \
+  /* VLAN (1 block): Note that the *EtherHdr->Type* is written here. */   \
+  12, 13, 14, 15, 0, 0, 0, 0, \
+  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, /* IPv6 */  \
+  42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, /* IPv6 */  \
+  NU, NU, NU, NU, NU, NU, NU, NU, /* Unused */
+
 /* Generation of K-mask bitmask values, to zero out data in result. Note that
  * these correspond 1:1 to the above "*_SHUFFLE" values, and bit used must be
  * set in this K-mask, and "NU" values must be zero in the k-mask. Each mask
@@ -228,6 +243,8 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i 
idx, __m512i a)
 #define KMASK_TCP   0x0F00ULL
 #define KMASK_IPV6  0xULL
 #define KMASK_ETHER_IPV6 0x3FFFULL
+#define KMASK_DT1Q_IPV6  0xFF0FULL
+#define KMASK_IPV6_NOHDR 0x00FFULL
 
 #define PATTERN_IPV4_UDP_KMASK \
 (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_UDP << 32))
@@ -244,6 +261,10 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, 
__m512i idx, __m512i a)
 #define PATTERN_IPV6_KMASK \
 (KMASK_ETHER_IPV6 | (KMASK_IPV6 << 16) | (KMASK_IPV6 << 32))
 
+#define PATTERN_DT1Q_IPV6_KMASK \
+(KMASK_ETHER_IPV6 | (KMASK_DT1Q_IPV6 << 16) | (KMASK_IPV6 << 32) | \
+(KMASK_IPV6_NOHDR << 48))
+
 /* This union allows initializing static data as u8, but easily loading it
  * into AVX512 registers too. The union ensures proper alignment for the zmm.
  */
@@ -324,6 +345,8 @@ enum MFEX_PROFILES {
 PROFILE_ETH_VLAN_IPV4_TCP,
 PROFILE_ETH_IPV6_UDP,
 PROFILE_ETH_IPV6_TCP,
+PROFILE_ETH_VLAN_IPV6_TCP,
+PROFILE_ETH_VLAN_IPV6_UDP,
 PROFILE_COUNT,
 };
 
@@ -426,6 +449,37 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 .dp_pkt_min_size = 54,
 },
 
+[PROFILE_ETH_VLAN_IPV6_TCP] = {
+.probe_mask.u8_data = {
+PATTERN_ETHERTYPE_MASK PATTERN_DT1Q_MASK PATTERN_IPV6_MASK },
+.probe_data.u8_data = {
+PATTERN_ETHERTYPE_DT1Q PATTERN_DT1Q_IPV6 PATTERN_IPV6_TCP },
+
+.store_shuf.u8_data = { PATTERN_DT1Q_IPV6_SHUFFLE },
+.store_kmsk = PATTERN_DT1Q_IPV6_KMASK,
+
+.mf_bits = { 0x38a0, 0x0004443c},
+.dp_pkt_offs = {
+14, UINT16_MAX, 18, 58,
+},
+.dp_pkt_min_size = 66,
+},
+
+[PROFILE_ETH_VLAN_IPV6_UDP] = {
+.probe_mask.u8_data = {
+PATTERN_ETHERTYPE_MASK PATTERN_DT1Q_MASK PATTERN_IPV6_MASK },
+.probe_data.u8_data = {
+PATTERN_ETHERTYPE_DT1Q PATTERN_DT1Q_IPV6 PATTERN_IPV6_UDP },
+
+.store_shuf.u8_data = { PATTERN_DT1Q_IPV6_SHUFFLE },
+.store_kmsk = PATTERN_DT1Q_IPV6_KMASK,
+
+.mf_bits = { 0x38a0, 0x0004043c},
+.dp_pkt_offs = {
+14, UINT16_MAX, 18, 58,
+},
+.dp_pkt_min_size = 66,
+},
 };
 
 /* IPv6 header helper function to fix TC, flow label and next header. */
@@ -676,6 +730,44 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 

[ovs-dev] [PATCH v3 1/6] dpif-netdev/mfex: Add AVX512 basic ipv6 traffic profiles

2021-09-21 Thread Kumar Amber
Add AVX512 IPv6 optimized profile for IPv6/UDP and
IPv6/TCP.

MFEX autovalidaton test-case already has the IPv6 support for
validating against the scalar mfex.

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 

---
v2:
- Fix CI build error
- Fix check-patch sign-offs
---
 NEWS  |   3 +
 lib/automake.mk   |   1 +
 lib/dpif-netdev-extract-avx512.c  | 140 +-
 lib/dpif-netdev-private-extract.c |  28 +-
 lib/dpif-netdev-private-extract.h |   6 ++
 tests/pcap/mfex_test.pcap | Bin 416 -> 632 bytes
 6 files changed, 176 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 90f4b1590..b2c6e8fff 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,9 @@ Post-v2.16.0
limiting behavior.
  * Add hardware offload support for matching IPv4/IPv6 frag types
(experimental).
+   - Userspace datapath:
+ * Add AVX512 optimized profiles to miniflow extract for IPv6/UDP and
+   IPv6/TCP.
 
 
 v2.16.0 - 16 Aug 2021
diff --git a/lib/automake.mk b/lib/automake.mk
index 46f869a33..eeb1fbadd 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -33,6 +33,7 @@ lib_libopenvswitchavx512_la_CFLAGS = \
-mavx512f \
-mavx512bw \
-mavx512dq \
+   -mavx512vl \
-mbmi \
-mbmi2 \
-fPIC \
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index ec64419e3..3384a8dba 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -49,6 +49,8 @@
 #include "dpif-netdev-private-extract.h"
 #include "dpif-netdev-private-flow.h"
 
+#define plen ip6_ctlun.ip6_un1.ip6_un1_plen
+
 /* AVX512-BW level permutex2var_epi8 emulation. */
 static inline __m512i
 __attribute__((target("avx512bw")))
@@ -137,6 +139,7 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i 
idx, __m512i a)
 #define PATTERN_ETHERTYPE_MASK PATTERN_ETHERTYPE_GEN(0xFF, 0xFF)
 #define PATTERN_ETHERTYPE_IPV4 PATTERN_ETHERTYPE_GEN(0x08, 0x00)
 #define PATTERN_ETHERTYPE_DT1Q PATTERN_ETHERTYPE_GEN(0x81, 0x00)
+#define PATTERN_ETHERTYPE_IPV6 PATTERN_ETHERTYPE_GEN(0x86, 0xDD)
 
 /* VLAN (Dot1Q) patterns and masks. */
 #define PATTERN_DT1Q_MASK   \
@@ -192,6 +195,25 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, 
__m512i idx, __m512i a)
   NU, NU, NU, NU, NU, NU, NU, NU, 38, 39, 40, 41, NU, NU, NU, NU, /* TCP */   \
   NU, NU, NU, NU, NU, NU, NU, NU, /* Unused. */
 
+/* Generator for checking IPv6 ver. */
+#define PATTERN_IPV6_GEN(VER_TRC, PROTO)  \
+  VER_TRC, /* Version: 4bits and Traffic class: 4bits. */ \
+  0, 0, 0, /* Traffic class: 4bits and Flow Label: 24bits. */ \
+  0, 0,/* Payload length 16bits. */   \
+  PROTO, 0,/* Next Header 8bits and Hop limit 8bits. */   \
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* Src IP: 128bits. */  \
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* Dst IP: 128bits. */
+
+#define PATTERN_IPV6_MASK PATTERN_IPV6_GEN(0xF0, 0xFF)
+#define PATTERN_IPV6_UDP PATTERN_IPV6_GEN(0x60, 0x11)
+#define PATTERN_IPV6_TCP PATTERN_IPV6_GEN(0x60, 0x06)
+
+#define PATTERN_IPV6_SHUFFLE  \
+   0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, NU, NU, /* Ether */ \
+  22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, /* IPv6 */  \
+  38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, /* IPv6 */  \
+  NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* Unused */
+
 /* Generation of K-mask bitmask values, to zero out data in result. Note that
  * these correspond 1:1 to the above "*_SHUFFLE" values, and bit used must be
  * set in this K-mask, and "NU" values must be zero in the k-mask. Each mask
@@ -204,6 +226,8 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i 
idx, __m512i a)
 #define KMASK_IPV4  0xF0FFULL
 #define KMASK_UDP   0x000FULL
 #define KMASK_TCP   0x0F00ULL
+#define KMASK_IPV6  0xULL
+#define KMASK_ETHER_IPV6 0x3FFFULL
 
 #define PATTERN_IPV4_UDP_KMASK \
 (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_UDP << 32))
@@ -217,6 +241,9 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i 
idx, __m512i a)
 #define PATTERN_DT1Q_IPV4_TCP_KMASK \
 (KMASK_ETHER | (KMASK_DT1Q << 16) | (KMASK_IPV4 << 24) | (KMASK_TCP << 40))
 
+#define PATTERN_IPV6_KMASK \
+(KMASK_ETHER_IPV6 | (KMASK_IPV6 << 16) | (KMASK_IPV6 << 32))
+
 /* This union allows initializing static data as u8, but easily loading it
  * into AVX512 registers too. The union ensures proper alignment for the zmm.
  */
@@ -295,6 +322,8 @@ enum MFEX_PROFILES {
 PROFILE_ETH_IPV4_TCP,
 PROFILE_ETH_VLAN_IPV4_UDP,
 PROFILE_ETH_VLAN_IPV4_TCP,
+PROFILE_ETH_IPV6_UDP,
+PROFILE_ETH_IPV6_TCP,
 

[ovs-dev] [PATCH v3 0/6] MFEX Optimizations IPv6 + Hashing

2021-09-21 Thread Kumar Amber
---
v3:
- rebase to master.
v2:
- fix the CI build.
- fix check-patch for co-author.
---

The patch-set introduces AVX512 optimizations of IPv6
traffic profiles and hashing improvements for all AVX512
supported traffic profiles for IPv4 and IPv6.

Kumar Amber (6):
  dpif-netdev/mfex: Add AVX512 basic ipv6 traffic profiles
  dpif-netdev/mfex: Add AVX512 vlan ipv6 traffic profiles
  dpif-netdev/mfex: Add packet hash check to autovalidator
  dpif-netdev/mfex: Add ipv4 profile based hashing
  dpif-netdev/mfex: Add ipv6 profile based hashing
  dpif-netdev/mfex: Avoid hashing when opt mfex called

 NEWS  |   7 +
 lib/automake.mk   |   1 +
 lib/dpif-netdev-avx512.c  |   6 +-
 lib/dpif-netdev-extract-avx512.c  | 348 +-
 lib/dpif-netdev-private-extract.c |  63 +-
 lib/dpif-netdev-private-extract.h |  12 ++
 tests/pcap/mfex_test.pcap | Bin 416 -> 632 bytes
 7 files changed, 432 insertions(+), 5 deletions(-)

-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/2] ovsdb-data: Optimize union of sets.

2021-09-21 Thread Mark Gray
On 17/09/2021 18:17, Ilya Maximets wrote:
> Current algorithm of ovsdb_datum_union looks like this:
> 
>   for-each atom in b:
>   if not bin_search(a, atom):
>   push(a, clone(atom))
>   quicksort(a)
> 
> So, the complexity looks like this:
> 
>Nb * log2(Na)   +Nb +   (Na + Nb) * log2(Na + Nb)
>Comparisonsclones   Comparisons for quicksort
>for search
> 
> ovsdb_datum_union() is heavily used in database transactions while
> new element is added to a set.  For example, if new logical switch
> port is added to a logical switch in OVN.  This is a very common
> use case where CMS adds one new port to an existing switch that
> already has, let's say, 100 ports.  For this case ovsdb-server will
> have to perform:
> 
>1 * log2(100)  + 1 clone + 101 * log2(101)
>ComparisonsComparisons for
>for search   quicksort.
>~7   1~707
>Roughly 714 comparisons of atoms and 1 clone.
> 
> Since binary search can give us position, where new atom should go
> (it's the 'low' index after the search completion) for free, the
> logic can be re-worked like this:
> 
>   copied = 0
>   for-each atom in b:
>   desired_position = bin_search(a, atom)
>   push(result, a[ copied : desired_position - 1 ])
>   copied = desired_position
>   push(result, clone(atom))
>   push(result, a[ copied : Na ])
>   swap(a, result)
> 
> Complexity of this schema:
> 
>Nb * log2(Na)   +Nb + Na
>Comparisonsclones   memory copy on push
>for search
> 
> 'swap' is just a swap of a few pointers.  'push' is not a 'clone',
> but a simple memory copy of 'union ovsdb_atom'.
> 
> In general, this schema substitutes complexity of a quicksort
> with complexity of a memory copy of Na atom structures, where we're
> not even copying strings that these atoms are pointing to.
> 
> Complexity in the example above goes down from 714 comparisons
> to 7 comparisons and memcpy of 100 * sizeof (union ovsdb_atom) bytes.
> 
> General complexity of a memory copy should always be lower than
> complexity of a quicksort, especially because these copies usually
> performed in bulk, so this new schema should work faster for any input.
> 
> All in all, this change allows to execute several times more
> transactions per second for transactions that adds new entries to sets.
> 
> Alternatively, union can be implemented as a linear merge of two
> sorted arrays, but this will result in O(Na) comparisons, which
> is more than Nb * log2(Na) in common case, since Na is usually
> far bigger than Nb.  Linear merge will also mean per-atom memory
> copies instead of copying in bulk.
> 
> 'replace' functionality of ovsdb_datum_union() had no users, so it
> just removed.  But it can easily be added back if needed in the future.
> 
> Signed-off-by: Ilya Maximets 
> ---
>  lib/db-ctl-base.c | 10 +++---
>  lib/ovsdb-data.c  | 84 ---
>  lib/ovsdb-data.h  |  6 ++--
>  lib/ovsdb-idl.c   |  8 ++---
>  ovsdb/mutation.c  |  2 +-
>  vswitchd/bridge.c |  9 +++--
>  6 files changed, 77 insertions(+), 42 deletions(-)
> 
> diff --git a/lib/db-ctl-base.c b/lib/db-ctl-base.c
> index 77cc76a9f..f69868702 100644
> --- a/lib/db-ctl-base.c
> +++ b/lib/db-ctl-base.c
> @@ -321,7 +321,7 @@ get_row_by_id(struct ctl_context *ctx,
>  const union ovsdb_atom key_atom
>  = { .string = CONST_CAST(char *, id->key) };
>  unsigned int i = ovsdb_datum_find_key(datum, _atom,
> -  OVSDB_TYPE_STRING);
> +  OVSDB_TYPE_STRING, NULL);
>  name = i == UINT_MAX ? NULL : >values[i];
>  }
>  if (!name) {
> @@ -820,7 +820,7 @@ check_condition(const struct ovsdb_idl_table_class *table,
>  }
>  
>  idx = ovsdb_datum_find_key(have_datum,
> -   _key, column->type.key.type);
> +   _key, column->type.key.type, NULL);
>  if (idx == UINT_MAX && !is_set_operator(operator)) {
>  retval = false;
>  } else {
> @@ -993,7 +993,7 @@ cmd_get(struct ctl_context *ctx)
>  }
>  
>  idx = ovsdb_datum_find_key(datum, ,
> -   column->type.key.type);
> +   column->type.key.type, NULL);
>  if (idx == UINT_MAX) {
>  if (must_exist) {
>  ctl_error(
> @@ -1375,7 +1375,7 @@ set_column(const struct ovsdb_idl_table_class *table,
>  ovsdb_atom_destroy(, column->type.value.type);
>  
>  ovsdb_datum_union(, ovsdb_idl_read(row, column),
> -  >type, false);
> +  >type);
>  ovsdb_idl_txn_verify(row, column);
>  ovsdb_idl_txn_write(row, column, );
>  } else {
> @@ -1514,7 

Re: [ovs-dev] [PATCH v2 2/2] ovsdb-data: Optimize subtraction of sets.

2021-09-21 Thread Mark Gray
On 17/09/2021 18:17, Ilya Maximets wrote:
> Current algorithm for ovsdb_datum_subtract looks like this:
> 
>   for-each atom in a:
>   if atom in b:
>   swap(atom, )
>   destroy(atom)
>   quicksort(a)
> 
> Complexity:
> 
>   Na * log2(Nb)  +  (Na - Nb) * log2(Na - Nb)
> Search  Comparisons for quicksort
> 
> It's not optimal, especially because Nb << Na in a vast majority of
> cases.
> 
> Reversing the search phase to look up atoms from 'b' in 'a', and
> closing gaps from deleted elements in 'a' by plain memory copy to
> avoid quicksort.
> 
> Resulted complexity:
> 
>   Nb * log2(Na)  +(Na - Nb)
> Search  Memory copies
> 
> Subtraction is heavily used while executing database transactions.
> For example, to remove one port from a logical switch in OVN.
> Complexity of such operation if original logical switch had 100 ports
> goes down from
> 
>  100 * log2(1)   = 100 comparisons for search and
>   99 * log2(99)  = 656 comparisons for quicksort
>--
>756 comparisons in total
> to only
> 
>1 * log2(100) = 7 comparisons for search
>+ memory copy of 99 * sizeof (union ovsdb_atom) bytes.
> 
> We could use memmove to close the gaps after removing atoms, but
> it will lead to 2 memory copies inside the call, while we can perform
> only one to the temporary 'result' and swap pointers.
> 
> Performance in cases, where sizes of 'a' and 'b' are comparable,
> should not change.  Cases with Nb >> Na should not happen in practice.

It seems like we are optimizing for the more common case in which Na >>
Nb. Not sure if there are use cases in which the other case could bubble
up as a bottleneck. Could we try to assess which case we are in by
checking the size of the sets and then select the algorithm based on
that assessment? i.e. have two algorithms?

> 
> All in all, this change allows ovsdb-server to perform several times
> more transactions, that removes elements from sets, per second.
> 
> Signed-off-by: Ilya Maximets 
> ---
>  lib/ovsdb-data.c | 52 +---
>  1 file changed, 40 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/ovsdb-data.c b/lib/ovsdb-data.c
> index 11bf95fed..b6129d6ba 100644
> --- a/lib/ovsdb-data.c
> +++ b/lib/ovsdb-data.c
> @@ -2019,26 +2019,54 @@ ovsdb_datum_subtract(struct ovsdb_datum *a, const 
> struct ovsdb_type *a_type,
>   const struct ovsdb_datum *b,
>   const struct ovsdb_type *b_type)
>  {
> -bool changed = false;
> -size_t i;
> +size_t i, ai, bi, n_idx;
> +size_t *idx;
>  
>  ovs_assert(a_type->key.type == b_type->key.type);
>  ovs_assert(a_type->value.type == b_type->value.type
> || b_type->value.type == OVSDB_TYPE_VOID);
>  
> -/* XXX The big-O of this could easily be improved. */
> -for (i = 0; i < a->n; ) {
> -unsigned int idx = ovsdb_datum_find(a, i, b, b_type);
> -if (idx != UINT_MAX) {
> -changed = true;
> -ovsdb_datum_remove_unsafe(a, i, a_type);
> -} else {
> -i++;
> +idx = xmalloc(b->n * sizeof *idx);
> +n_idx = 0;
> +for (bi = 0; bi < b->n; bi++) {
> +ai = ovsdb_datum_find(b, bi, a, b_type);
> +if (ai == UINT_MAX || (n_idx && ai <= idx[n_idx - 1])) {

If b and a are always sorted, will we ever hit the second clause (n_idx
&& ai <= idx[n_idx - 1])? For example, if the following elements are
equivalent

b0 -> a0
b1 -> a1
..
..

bi is always > bi-1 therefore ai is always greater than ai-1. If there
is still a chance that we will hit it, could you add a more detailed
comment?

> +/* No such atom in 'a' or it's already marked for removal. */
> +continue;
>  }
> +idx[n_idx++] = ai;
>  }
> -if (changed) {
> -ovsdb_datum_sort_assert(a, a_type->key.type);
> +if (!n_idx) {
> +free(idx);
> +return;
>  }
> +
> +struct ovsdb_datum result;
> +
> +ovsdb_datum_init_empty();
> +ovsdb_datum_reallocate(, a_type, a->n - n_idx);
> +
> +for (i = 0; i < n_idx; i++) {

Why don't you destroy the atoms in-place in the loop above (bi = 0; bi <
b->n; bi++). Is it because you have to ovsdb_datum_find() each time? If
so, maybe add a comment.

> +ai = idx[i];
> +
> +/* Destroying atom. */
> +ovsdb_atom_destroy(>keys[ai], a_type->key.type);
> +if (a_type->value.type != OVSDB_TYPE_VOID) {
> +ovsdb_atom_destroy(>values[ai], a_type->value.type);
> +}
> +
> +/* Copy non-removed atoms from 'a' to result. */
> +unsigned int start_idx = (i > 0) ? idx[i - 1] + 1 : 0;

It won't save many cycles but could you initialize 'start_idx' to 0
outside the loop and set it to 'idx[i] + 1' after the statement below?

> +ovsdb_datum_push_unsafe(, a, start_idx, ai - start_idx, 
> a_type);
> +}
> +/* Copying 

Re: [ovs-dev] [PATCH v1] configure: Allow opt-in to CPU ISA opts at compile time

2021-09-21 Thread Eelco Chaudron



On 13 Sep 2021, at 16:36, Van Haaren, Harry wrote:

>> -Original Message-
>> From: Eelco Chaudron 
>> Sent: Friday, September 10, 2021 3:41 PM
>> To: Van Haaren, Harry ; i.maxim...@ovn.org;
>> Stokes, Ian ; f...@sysclose.org
>> Cc: Amber, Kumar ; ovs-dev@openvswitch.org;
>> ktray...@redhat.com
>> Subject: Re: [PATCH v1] configure: Allow opt-in to CPU ISA opts at compile 
>> time
>>
>>
>>
>> On 8 Sep 2021, at 17:28, Van Haaren, Harry wrote:
>>
 -Original Message-
 From: Eelco Chaudron 
 Sent: Wednesday, September 8, 2021 9:16 AM
 To: Amber, Kumar 
 Cc: ovs-dev@openvswitch.org; ktray...@redhat.com; i.maxim...@ovn.org;
 Stokes, Ian ; f...@sysclose.org; Van Haaren, Harry
 
 Subject: Re: [PATCH v1] configure: Allow opt-in to CPU ISA opts at compile
>> time

 Not a real review of the patch, but just some comment/questions glancing
>> over
 the patch.
>>>
>>> Sure, thanks for input.
>>>
 On 3 Sep 2021, at 15:53, Kumar Amber wrote:

> This commit allows "opt-in" to CPU ISA optimized implementations of
> OVS SW datapath components at compile time. This can be useful in some
> deployments where the CPU ISA optimized implementation is to be chosen
> by default.
>>>
>>> 
>>>
> +Enabling all AVX512 options
> +---
> +
> +A user can enable all the three DPIF, Miniflow Extract and DPLCS 
> optimized
> +AVX512 options at build time, if the CPU supports the required AVX512 ISA
> +by using the following command ::
> +
> +./configure --enable-cpu-isa

 If we have different ISA architectures, i.e., i86 vs ARM, we are ok with a 
 single
 option. Have you thought about AMD adding its own AMDXXX instructions in
 addition to AVX512? How would this configuration option work? Maybe an
 optional option to prioritize one over the other.
>>>
>>> The ISA enabling efforts have been generic so far, any reference to 
>>> specific ISA
>> (e.g. AVX512)
>>> has been solely in the implementation choice - never in a general component.
>> Intention here
>>> is to stay in line with that - and "enable CPU ISA" seemed a logical string 
>>> to
>> achieve that to me..
>>>
>>> It is of course possible to provide multiple configure command lines, but I 
>>> was
>> hoping to avoid
>>> creating too many compile time flags. Typically I think projects attempt to 
>>> avoid
>> due to expanding
>>> testing & validation. A single flag would limit overhead to the minimum...
>>>
>>> Typically ISA sets have a "good - better - best" type relationship - which 
>>> could
>> lead to a general
>>> acceptance of what ISA is best. We have runtime functions to switch
>> implementation - so today
>>> the code already enables a log of runtime/dynamic updating of
>> implementation. If there's a
>>> need to expose that at compile time too, then that's easy to add - but comes
>> with a burden in
>>> testing & validation...
>>
>> The main reason to mention this is the inconsistent behavior across
>> builds/releases. With this flag being as general as it is, if someone 
>> decides to add
>> AVX1024, it now gets selected as the default isa function (assuming the 
>> target
>> was already supporting this). This is a change in behavior that happens 
>> without
>> any configure option change. The difference to any other general change is 
>> that
>> this is not a global change, but something that changes based on your target.
>
> Note that the default setting for this option is suggested as "off", meaning 
> this is an entirely
> *opt in* strategy, to allow people to deploy OVS and automatically benefit 
> from CPU ISA.
>
> To be more specific, this feature is a request from folks who intend to 
> deploy with CPU ISA
> enabled by default - it suited their CI/CD/QA tooling to have this enabled by 
> default compile
> time switch to ease validation as the CPU ISA will get picked up 
> automatically when available.
>
> Note that it is not a "change in behaviour", because functionally its 
> identical.
> (The fuzzing, autovalidation & unit tests are there to ensure it is 
> functionally identical).
> It correct that there is a change in the default *implementation* of the 
> functionality,
> which I think you meant (just clarifying the "change in behaviour" as not 
> being "functional behaviour",
> only "implementation of behaviour")

>> Anyone else has an opinion on this? I think an alternative to this, is a 
>> proper
>> configuration option, which will survive a restart.
>>
>> Thinking about it a bit more during my afternoon walk, I think we should 
>> minimize
>> (not allow) any compile-time default behavior variations. It should all be 
>> based on
>> the configuration, which if required to survive a reboot, be added to the 
>> ovsdb.

Thinking about this more, I do not feel like we should introduce compile-time 
default configuration options.
It should just be added to the ovsdb and picked up from there on 

Re: [ovs-dev] [PATCH v15 0/7] Add offload support for sFlow

2021-09-21 Thread Eelco Chaudron
Hi Chris,

Just a quick update, I did see your responses to v14 and I also noticed you 
send out a v15. I planned to review it this week, but due to some other 
unforeseen stuff, I have to move it to next week (if nothing more is going to 
mess up my plan ;)

Cheers,

Eelco


On 15 Sep 2021, at 14:43, Chris Mi wrote:

> This patch set adds offload support for sFlow.
>
> Psample is a genetlink channel for packet sampling. TC action act_sample
> uses psample to send sampled packets to userspace.
>
> When offloading sample action to TC, userspace creates a unique ID to
> map sFlow action and tunnel info and passes this ID to kernel instead
> of the sFlow info. psample will send this ID and sampled packet to
> userspace. Using the ID, userspace can recover the sFlow info and send
> sampled packet to the right sFlow monitoring host.
>
> v2-v1:
> - Fix robot errors.
> v3-v2:
> - Remove Gerrit Change-Id.
> - Add patch #9 to fix older kernels build issue.
> - Add travis test result.
> v4-v3:
> - Fix offload issue when sampling rate is 1.
> v5-v4:
> - Move polling thread from ofproto to netdev-offload-tc.
> v6-v5:
> - Rebase.
> - Add GitHub Actions test result.
> v7-v6:
> - Remove Gerrit Change-Id.
> - Fix "ERROR: Inappropriate spacing around cast"
> v8-v7
> - Address Eelco Chaudron's comment for patch #11.
> v9-v8
> - Remove sflow_len from struct dpif_sflow_attr.
> - Log a debug message for other userspace actions.
> v10-v9
> - Address Eelco Chaudron's comments on v9.
> v11-v10
> - Fix a bracing error.
> v12-v11
> - Add duplicate sample group id check.
> v13-v12
> - Remove the psample poll thread from netdev-offload-tc and reuse
>   ofproto handler thread according to Ilya's new desgin.
> - Add dpif-offload-provider layer according to Eli's suggestion.
> v14-v13
> - Fix a robot error.
> v15-v14
> - Address Eelco Chaudron's comments on v14.
>
> Chris Mi (7):
>   compat: Add psample and tc sample action defines for older kernels
>   ovs-kmod-ctl: Load kernel module psample
>   dpif-offload-provider: Introduce dpif-offload-provider layer
>   netdev-offload-tc: Introduce group ID management API
>   dpif-offload-netlink: Implement dpif-offload-provider API
>   ofproto: Introduce API to process sFlow offload packet
>   netdev-offload-tc: Add offload support for sFlow
>
>  NEWS |   1 +
>  include/linux/automake.mk|   4 +-
>  include/linux/psample.h  |  62 +
>  include/linux/tc_act/tc_sample.h |  25 ++
>  lib/automake.mk  |   3 +
>  lib/dpif-netdev.c|   1 +
>  lib/dpif-netlink.c   |   2 +
>  lib/dpif-offload-netlink.c   | 208 ++
>  lib/dpif-offload-provider.h  |  75 +
>  lib/dpif-offload.c   |  43 +++
>  lib/dpif-provider.h  |   8 +-
>  lib/dpif.c   |  10 +
>  lib/netdev-offload-tc.c  | 459 +--
>  lib/netdev-offload.h |   1 +
>  lib/tc.c |  61 +++-
>  lib/tc.h |  16 +-
>  ofproto/ofproto-dpif-upcall.c|  63 +
>  utilities/ovs-kmod-ctl.in|  14 +
>  18 files changed, 1030 insertions(+), 26 deletions(-)
>  create mode 100644 include/linux/psample.h
>  create mode 100644 include/linux/tc_act/tc_sample.h
>  create mode 100644 lib/dpif-offload-netlink.c
>  create mode 100644 lib/dpif-offload-provider.h
>  create mode 100644 lib/dpif-offload.c
>
> -- 
> 2.27.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next] openvswitch: allow linking a VRF to an OVS bridge

2021-09-21 Thread Luis Tomas Bolivar
On Mon, Sep 20, 2021 at 5:45 PM David Ahern  wrote:

> On 9/20/21 9:34 AM, Antoine Tenart wrote:
> > There might be questions about the setup in which a VRF is linked to an
> > OVS bridge; I cc'ed Luis Tomás who wrote the article.
>
> My head just exploded. You want to make an L3 device a port of an L2
> device.
>
> Can someone explain how this is supposed to work and why it is even
> needed? ie., given how an OVS bridge handles packets and the point of a
> VRF device (to direct lookups to a table), the 2 are at odds in my head.
>

Hi David,

Thanks for your comment. And yes you are right, this probably is a bit
of an odd setup. That said, OVS is not pure L2 as it knows about IPs
and it is doing virtual routing too (we can say it is 2.5 xD)

What we want to achieve is something similar to what is shown in slide 100
here http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf, but
instead
of connecting the VRF bridge directly to containers, we have a single ovs
bridge (where the OpenStack VMs are connected to) where we connect the
vrfs in different (ovs) ports (so that the traffic in the way out of OVS
can be
redirected to the right VRF).

The initial part is pretty much the same as in the slide 100:
1) creating the vrf
   - ip link add vrf-1001 type vrf table 1001
2) vxlan device, in our case associated to the loopback,
for ECMP (instead of associate both nics/vlan devices to the VRF)
   - ip link add vxlan-1001 type vxlan id 1001 dstport 4789 local L_IP
nolearning
3) create the linux bridge device
   - ip link add name br-1001 type bridge stp_state 0
4) link the 3 above
   - ip link set br-1001 master vrf-1001 (bridge to vrf)
   - ip link set vxlan-1001 master br-1001 (vxlan to bridge)

Then, I'm attaching the vrf device also as an ovs bridge port, so that
traffic (together with some ip routes in the vrf routing table) can be
redirected
to OVS, and the (OpenStack) virtual networking happens there
(br-ex is the ovs bridge)
   - ovs-vsctl add-port br-ex vrf-1001
   - ip route show vrf vrf-1001
   10.0.0.0/26 via 172.24.4.146 dev br-ex
   (redirect traffic to OpenStack subnet 10.0.0.0/26 to br-ex)
   172.24.4.146 dev br-ex scope link

Perhaps there is a better way of connecting this?
I tried (without success) to create a veth device and set one end on
the linux bridge and the other on the OVS bridge.

Best regards,
Luis
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4 9/9] plug_providers: Introduce representor plugin.

2021-09-21 Thread Han Zhou
On Fri, Sep 3, 2021 at 12:27 PM Frode Nordahl 
wrote:
>
> Add the first in-tree plug provider plugin and its dependencies.
> The representor plugin can be used with multiple NIC vendors
> supporting Open vSwitch hardware offload and the devlink-port
> infrastructure[0].
>
> It is particularly useful for use with NICs connected to multiple
> distinct CPUs where the instance runs on one host and Open
> vSwitch and OVN runs on a different host, the smartnic CPU.
>
> Extend the build system with macros from the OVS build system to
> allow checking for dependencies of the plugin as well as providing
> kernel header files that may not be available at build time.
>
> The plugin will only be built when enabled and when building on
> a Linux system.
>
> 0:
https://www.kernel.org/doc/html/latest/networking/devlink/devlink-port.html
> Signed-off-by: Frode Nordahl 
> ---
>  Documentation/automake.mk |   1 +
>  Documentation/topics/plug_providers/index.rst |   1 +
>  .../topics/plug_providers/plug-providers.rst  |   5 +
>  .../plug_providers/plug-representor.rst   |  45 ++
>  build-aux/initial-tab-whitelist   |   1 +
>  configure.ac  |   2 +
>  include/automake.mk   |   4 +
>  include/linux/automake.mk |   2 +
>  include/linux/devlink.h   | 625 ++
>  lib/automake.mk   |  11 +
>  lib/plug-provider.h   |   6 +-
>  lib/plug.c|   1 +
>  .../representor/netlink-devlink.c | 499 ++
>  .../representor/netlink-devlink.h | 115 
>  .../representor/plug-representor.c| 307 +
>  m4/ovn.m4 |  26 +
>  16 files changed, 1650 insertions(+), 1 deletion(-)
>  create mode 100644
Documentation/topics/plug_providers/plug-representor.rst
>  create mode 100644 include/linux/automake.mk
>  create mode 100644 include/linux/devlink.h
>  create mode 100644 lib/plug_providers/representor/netlink-devlink.c
>  create mode 100644 lib/plug_providers/representor/netlink-devlink.h
>  create mode 100644 lib/plug_providers/representor/plug-representor.c
>

Hi Frode,

Thanks for adding this to the series. This does provide a better
understanding of how the plug_provider interfaces are going to be used for
representor ports. However, I had no idea how complex this provider would
be when I proposed adding it to the repo. Now that I am seeing it, I am not
sure if it is a good idea. It is probably better to maintain this provider
under a separate project, primarily because of totally different focus and
dependencies. For an in-tree provider, I'd consider something that plugs
regular VIFs.

I'd also like to hear what other maintainers think. I am sorry for not
realizing this earlier, and if we finally decide to exclude this single
patch from the series, I hope this doesn't waste too much of your effort,
assuming the majority of the code would be the same when it is hosted under
a separate repo.

Thanks,
Han
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] dpdk: Use DPDK 20.11.3 release

2021-09-21 Thread Suneetha Kalahasthi
Modify ci linux build script to use the latest DPDK stable release 20.11.3.
Modify Documentation to use the latest DPDK stable release 20.11.3.
Update NEWS file to reflect the latest DPDK stable release 20.11.3.
FAQ is updated to reflect the latest DPDK for each OVS branch.

Signed-off-by: Suneetha Kalahasthi 
---
 .ci/linux-build.sh   | 2 +-
 Documentation/faq/releases.rst   | 8 
 Documentation/intro/install/dpdk.rst | 8 
 NEWS | 2 ++
 4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 863f02388..5323cb2f2 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -216,7 +216,7 @@ fi
 
 if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then
 if [ -z "$DPDK_VER" ]; then
-DPDK_VER="20.11.1"
+DPDK_VER="20.11.3"
 fi
 install_dpdk $DPDK_VER
 if [ "$CC" = "clang" ]; then
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 68c9867b1..4f8d105e6 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -205,10 +205,10 @@ Q: What DPDK version does each Open vSwitch release work 
with?
 2.10.x   17.11.10
 2.11.x   18.11.9
 2.12.x   18.11.9
-2.13.x   19.11.8
-2.14.x   19.11.8
-2.15.x   20.11.1
-2.16.x   20.11.1
+2.13.x   19.11.10
+2.14.x   19.11.10
+2.15.x   20.11.3
+2.16.x   20.11.3
  
 
 Q: Are all the DPDK releases that OVS versions work with maintained?
diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index 96843af73..83c758783 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -42,7 +42,7 @@ Build requirements
 In addition to the requirements described in :doc:`general`, building Open
 vSwitch with DPDK will require the following:
 
-- DPDK 20.11.1
+- DPDK 20.11.3
 
 - A `DPDK supported NIC`_
 
@@ -73,9 +73,9 @@ Install DPDK
 #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
 
$ cd /usr/src/
-   $ wget https://fast.dpdk.org/rel/dpdk-20.11.1.tar.xz
-   $ tar xf dpdk-20.11.1.tar.xz
-   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.1
+   $ wget https://fast.dpdk.org/rel/dpdk-20.11.3.tar.xz
+   $ tar xf dpdk-20.11.3.tar.xz
+   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.3
$ cd $DPDK_DIR
 
 #. Configure and install DPDK using Meson
diff --git a/NEWS b/NEWS
index 90f4b1590..b92445a32 100644
--- a/NEWS
+++ b/NEWS
@@ -1,6 +1,8 @@
 Post-v2.16.0
 -
- DPDK:
+ * OVS validated with DPDK 20.11.3. It is recommended to use this version
+   until further releases.
  * EAL argument --socket-mem is no longer configured by default upon
start-up.  If dpdk-socket-mem and dpdk-alloc-mem are not specified,
DPDK defaults will be used.
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.16] dpdk: Use DPDK 20.11.3 release

2021-09-21 Thread Suneetha Kalahasthi
Modify ci linux build script to use the latest DPDK stable release 20.11.3.
Modify Documentation to use the latest DPDK stable release 20.11.3.
Update NEWS file to reflect the latest DPDK stable release 20.11.3.
FAQ is updated to reflect the latest DPDK for each OVS branch.

Signed-off-by: Suneetha Kalahasthi 
---
 .ci/linux-build.sh   | 2 +-
 Documentation/faq/releases.rst   | 8 
 Documentation/intro/install/dpdk.rst | 8 
 NEWS | 3 +++
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 863f02388..5323cb2f2 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -216,7 +216,7 @@ fi
 
 if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then
 if [ -z "$DPDK_VER" ]; then
-DPDK_VER="20.11.1"
+DPDK_VER="20.11.3"
 fi
 install_dpdk $DPDK_VER
 if [ "$CC" = "clang" ]; then
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 68c9867b1..4f8d105e6 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -205,10 +205,10 @@ Q: What DPDK version does each Open vSwitch release work 
with?
 2.10.x   17.11.10
 2.11.x   18.11.9
 2.12.x   18.11.9
-2.13.x   19.11.8
-2.14.x   19.11.8
-2.15.x   20.11.1
-2.16.x   20.11.1
+2.13.x   19.11.10
+2.14.x   19.11.10
+2.15.x   20.11.3
+2.16.x   20.11.3
  
 
 Q: Are all the DPDK releases that OVS versions work with maintained?
diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index d8fa931fa..d9e2e2d1a 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -42,7 +42,7 @@ Build requirements
 In addition to the requirements described in :doc:`general`, building Open
 vSwitch with DPDK will require the following:
 
-- DPDK 20.11.1
+- DPDK 20.11.3
 
 - A `DPDK supported NIC`_
 
@@ -73,9 +73,9 @@ Install DPDK
 #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
 
$ cd /usr/src/
-   $ wget https://fast.dpdk.org/rel/dpdk-20.11.1.tar.xz
-   $ tar xf dpdk-20.11.1.tar.xz
-   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.1
+   $ wget https://fast.dpdk.org/rel/dpdk-20.11.3.tar.xz
+   $ tar xf dpdk-20.11.3.tar.xz
+   $ export DPDK_DIR=/usr/src/dpdk-stable-20.11.3
$ cd $DPDK_DIR
 
 #. Configure and install DPDK using Meson
diff --git a/NEWS b/NEWS
index f2497d5ce..fbeecccea 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,8 @@
 v2.16.1 - xx xxx 
 -
+   - DPDK:
+ * OVS validated with DPDK 20.11.3. It is recommended to use this version
+   until further releases.
 
 v2.16.0 - 16 Aug 2021
 -
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev