date:20200629

On Mon, Jun 29, 2020 at 9:29 PM Numan Siddique  wrote:

>
>
> On Mon, Jun 29, 2020 at 8:48 PM Mark Michelson 
> wrote:
>
>> On 6/22/20 1:41 PM, num...@ovn.org wrote:
>> > From: Numan Siddique 
>> >
>> > The commit 5af304e7478a ("Support selection fields in load balancer.")
>> > didn't handle the selection fields for UDP and SCTP protocol.
>> > If CMS has set the selection fields - tp_src and tp_dst for UDP or SCTP
>> > load balancers, ovn-northd was adding lflows as
>> > ct_lb(backends=, hash_fields(tp_src,tp_dst))
>> > instead of ct_lb(backends=, hash_fields(udp_src,udp_dst)) and
>> > ct_lb(backends=, hash_fields(sctp_src,sctp_dst)) respectively.
>> >
>> > Because of this, select group flows were encoded with tcp_src and
>> tcp_dst
>> > hash fields.
>> >
>> > This patch fixes this issue and refactors the load balancer code in
>> ovn-northd
>> > to better handle the protocols.
>> >
>> > Test cases are now added for UDP and SCTP protocols.
>> >
>> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1846189
>> > Signed-off-by: Numan Siddique 
>> > ---
>> >   northd/ovn-northd.c |  71 --
>> >   tests/ovn.at| 179
>> ++--
>> >   tests/system-ovn.at |  23 ++
>> >   3 files changed, 227 insertions(+), 46 deletions(-)
>> >
>> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>> > index 40aeb0a58..7887d0d2f 100644
>> > --- a/northd/ovn-northd.c
>> > +++ b/northd/ovn-northd.c
>> > @@ -3121,6 +3121,7 @@ struct ovn_lb {
>> >
>> >   const struct nbrec_load_balancer *nlb; /* May be NULL. */
>> >   char *selection_fields;
>> > +char *proto;
>> >   struct lb_vip *vips;
>> >   size_t n_vips;
>> >   };
>> > @@ -3218,6 +3219,12 @@ ovn_lb_create(struct northd_context *ctx, struct
>> hmap *lbs,
>> >   struct smap_node *node;
>> >   size_t n_vips = 0;
>> >
>> > +if (!nbrec_lb->protocol || !nbrec_lb->protocol[0]) {
>> > +lb->proto = xstrdup("tcp");
>> > +} else {
>> > +lb->proto = xstrdup(nbrec_lb->protocol);
>> > +}
>> > +
>>
>> Hm, I don't think this is quite right. Load balancers that do not
>> specify ports are L4-agnostic. This patch does not change the behavior,
>> but it sets lb->proto to "tcp" even if no port is specified. This could
>> result in confusion and potential future buggy code.
>>
>
>
> Lets say, CMS creates a load balancer without specifying protocol
>
> ovn-nbctl lb-add lb0 "10.0.0.10" "10.0.0.3,20.0.0.3"
>
> And then
>
> ovn-nbctl set load_balancer  vips:"10.0.0.12:80"="10.0.0.4:80,
> 20.0.0.4:80"
>
> In this case we should consider the protocol as TCP right ?
>
> And anyway we add the L4 port checks only if the VIP has port specified.
> So I think
> it's ok to assume that the load balancer is TCP even if no VIP of the load
> balancer has a port specified.
>
> And lets say user specifies the selection_fields as
> "ip_src,ip_dst,tp_src,tp_dst" for the below load balancer
> (with no protocol specified)
>
> ovn-nbctl lb-add lb1 "10.0.0.20" "10.0.0.30,20.0.0.30"
> ovn-nbctl set load_balancer 
> selection_fields="ip_src,ip_dst,tp_src,tp_dst".
>
> In this case even without this patch, ovs-vswitchd will consider it as TCP
> protocol.
>
> So do you think it's fair to consider the protocol as tcp for a load
> balancer (when protocol is not explicitly set)
>- If a VIP has a port specified
>- If selection_fields has "tp_src/tp_dst" set ?
>
>
Hi Mark,

I removed the refactoring part of the code to address your concern.

Please check out v2 -
https://patchwork.ozlabs.org/project/openvswitch/patch/20200630052114.2813829-1-num...@ovn.org/

Thanks
Numan



> Thanks
> Numan
>
> Thanks
> Numan
>
>
>
>
>>
>> >   SMAP_FOR_EACH (node, &nbrec_lb->vips) {
>> >   char *vip;
>> >   uint16_t port;
>> > @@ -3234,7 +3241,7 @@ ovn_lb_create(struct northd_context *ctx, struct
>> hmap *lbs,
>> >   lb->vips[n_vips].backend_ips = xstrdup(node->value);
>> >
>> >   struct nbrec_load_balancer_health_check *lb_health_check =
>> NULL;
>> > -if (nbrec_lb->protocol && !strcmp(nbrec_lb->protocol, "sctp"))
>> {
>> > +if (!strcmp(lb->proto, "sctp")) {
>> >   if (nbrec_lb->n_health_check > 0) {
>> >   static struct vlog_rate_limit rl =
>> VLOG_RATE_LIMIT_INIT(1, 1);
>> >   VLOG_WARN_RL(&rl,
>> > @@ -3309,15 +3316,11 @@ ovn_lb_create(struct northd_context *ctx,
>> struct hmap *lbs,
>> >   lb->vips[n_vips].backends[i].svc_mon_src_ip =
>> svc_mon_src_ip;
>> >
>> >   if (lb_health_check && op && svc_mon_src_ip) {
>> > -const char *protocol = nbrec_lb->protocol;
>> > -if (!protocol || !protocol[0]) {
>> > -protocol = "tcp";
>> > -}
>> >   lb->vips[n_vips].backends[i].health_check = true;
>> >   struct service_monitor_info *mon_info =
>> >   create_or_get_service_mon(ctx, monit

[ovs-dev] [PATCH ovn v2] Fix selection fields for UDP and SCTP load balancers.

2020-06-29 Thread numans

From: Numan Siddique 

The commit 5af304e7478a ("Support selection fields in load balancer.")
didn't handle the selection fields for UDP and SCTP protocol.
If CMS has set the selection fields - tp_src and tp_dst for UDP or SCTP
load balancers, ovn-northd was adding lflows as
ct_lb(backends=, hash_fields(tp_src,tp_dst))
instead of ct_lb(backends=, hash_fields(udp_src,udp_dst)) and
ct_lb(backends=, hash_fields(sctp_src,sctp_dst)) respectively.

Because of this, select group flows were encoded with tcp_src and tcp_dst
hash fields.

This patch fixes this issue.

Test cases are now added for UDP and SCTP protocols.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1846189
Signed-off-by: Numan Siddique 
---

v1 -> v2

  * Addressed the concerns raised by Mark.
Removed the refactoring of the code and the load balancer
protocol is not stored in the internal structure.

 northd/ovn-northd.c |  14 +++-
 tests/ovn.at| 179 ++--
 tests/system-ovn.at |  23 ++
 3 files changed, 207 insertions(+), 9 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 2f973bff6e..9da4b5b860 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -3351,10 +3351,22 @@ ovn_lb_create(struct northd_context *ctx, struct hmap 
*lbs,
 n_vips++;
 }
 
+char *proto = NULL;
+if (nbrec_lb->protocol && nbrec_lb->protocol[0]) {
+proto = nbrec_lb->protocol;
+}
+
 if (lb->nlb->n_selection_fields) {
 struct ds sel_fields = DS_EMPTY_INITIALIZER;
 for (size_t i = 0; i < lb->nlb->n_selection_fields; i++) {
-ds_put_format(&sel_fields, "%s,", lb->nlb->selection_fields[i]);
+char *field = lb->nlb->selection_fields[i];
+if (!strcmp(field, "tp_src") && proto) {
+ds_put_format(&sel_fields, "%s_src,", proto);
+} else if (!strcmp(field, "tp_dst") && proto) {
+ds_put_format(&sel_fields, "%s_dst,", proto);
+} else {
+ds_put_format(&sel_fields, "%s,", field);
+}
 }
 ds_chomp(&sel_fields, ',');
 lb->selection_fields = ds_steal_cstr(&sel_fields);
diff --git a/tests/ovn.at b/tests/ovn.at
index 6a9bdf1d56..1a1cfbf124 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1029,6 +1029,18 @@ ct_lb(backends=fd0f::2,fd0f::3; 
hash_fields="eth_src,eth_dst,ip_src,ip_dst,tp_sr
 encodes as group:5
 uses group: id(5), 
name(type=select,selection_method=hash,fields(eth_src,eth_dst,ip_src,ip_dst,tp_src,tp_dst),bucket=bucket_id=0,weight:100,actions=ct(nat(dst=fd0f::2),commit,table=19,zone=NXM_NX_REG13[0..15]),bucket=bucket_id=1,weight:100,actions=ct(nat(dst=fd0f::3),commit,table=19,zone=NXM_NX_REG13[0..15]))
 has prereqs ip
+ct_lb(backends=fd0f::2,fd0f::3; 
hash_fields="eth_src,eth_dst,ip_src,ip_dst,tcp_src,tcp_dst");
+encodes as group:6
+uses group: id(6), 
name(type=select,selection_method=hash,fields(eth_src,eth_dst,ip_src,ip_dst,tcp_src,tcp_dst),bucket=bucket_id=0,weight:100,actions=ct(nat(dst=fd0f::2),commit,table=19,zone=NXM_NX_REG13[0..15]),bucket=bucket_id=1,weight:100,actions=ct(nat(dst=fd0f::3),commit,table=19,zone=NXM_NX_REG13[0..15]))
+has prereqs ip
+ct_lb(backends=fd0f::2,fd0f::3; 
hash_fields="eth_src,eth_dst,ip_src,ip_dst,udp_src,udp_dst");
+encodes as group:7
+uses group: id(7), 
name(type=select,selection_method=hash,fields(eth_src,eth_dst,ip_src,ip_dst,udp_src,udp_dst),bucket=bucket_id=0,weight:100,actions=ct(nat(dst=fd0f::2),commit,table=19,zone=NXM_NX_REG13[0..15]),bucket=bucket_id=1,weight:100,actions=ct(nat(dst=fd0f::3),commit,table=19,zone=NXM_NX_REG13[0..15]))
+has prereqs ip
+ct_lb(backends=fd0f::2,fd0f::3; 
hash_fields="eth_src,eth_dst,ip_src,ip_dst,sctp_src,sctp_dst");
+encodes as group:8
+uses group: id(8), 
name(type=select,selection_method=hash,fields(eth_src,eth_dst,ip_src,ip_dst,sctp_src,sctp_dst),bucket=bucket_id=0,weight:100,actions=ct(nat(dst=fd0f::2),commit,table=19,zone=NXM_NX_REG13[0..15]),bucket=bucket_id=1,weight:100,actions=ct(nat(dst=fd0f::3),commit,table=19,zone=NXM_NX_REG13[0..15]))
+has prereqs ip
 
 # ct_next
 ct_next;
@@ -1571,13 +1583,13 @@ handle_svc_check(reg0);
 # select
 reg9[16..31] = select(1=50, 2=100, 3, );
 formats as reg9[16..31] = select(1=50, 2=100, 3=100);
-encodes as group:6
-uses group: id(6), 
name(type=select,selection_method=dp_hash,bucket=bucket_id=0,weight:50,actions=load:1->xreg4[16..31],resubmit(,19),bucket=bucket_id=1,weight:100,actions=load:2->xreg4[16..31],resubmit(,19),bucket=bucket_id=2,weight:100,actions=load:3->xreg4[16..31],resubmit(,19))
+encodes as group:9
+uses group: id(9), 
name(type=select,selection_method=dp_hash,bucket=bucket_id=0,weight:50,actions=load:1->xreg4[16..31],resubmit(,19),bucket=bucket_id=1,weight:100,actions=load:2->xreg4[16..31],resubmit(,19),bucket=bucket_id=2,weight:100,actions=load:3->xreg4[16..31],resubmit(,19))
 
 reg0 = select(1,

Re: [ovs-dev] [PATCH v2] tc: Changes to netlink message population for OVS-TC Flower Offload

2020-06-29 Thread Satish

Hi Simon,

We decided to withdraw this patch for the time being. We'll resubmit this patch
along with the linux net-next patch(with driver code) in future".

On Sat, Jun 20, 2020 at 12:17 AM dsatish  wrote:
>
> From: Satish Dhote 
>
> OVS-TC offload sends only fields that are not completely masked out
> to the TC kernel module. Some of the hardware we work with requires
> all fields extracted from the packet along with the mask to be sent
> to the driver. Hardware may optimize tables based on entire fields
> of packet and mask of fields, even though certain fields are masked
> out.
>
> OVS when offloading to ovs kernel datapath sends the entire key
> along with the mask. Proposal requests for similar behaviour in
> ovs-vswitchd in case of tc-offload. While constructing netlink
> messages, ovs-vswitchd is removing completely masked out fields.
> Sending additional masked fields should not impact existing drivers.
> TC internally uses dissectors to extract non masked fields, so it
> should not impact existing drivers in tc.
>
> Example:
> ovs-rule:ovs-ofctl add-flow br0 \
> "table=0, dl_dst=00:a1:45:23:23:11/ff:ff:ff:ff:ff:ff, actions=1"
>
> If we have above rule in ovs and an icmp packet matching destination
> MAC 00:a1:45:23:23:11 arrives, ovs prepares the key and mask with all
> relevant fields. When sending to TC it removes fields which are
> completely masked out. In this case ethertype, and dst_mac fields are
> sent as part of netlink message to tc.
>
> OVS patch also requires minor fixes in TC. Patch to TC is submitted
> at following link. 1/3 patch has the fix for the same.
> http://patchwork.ozlabs.org/project/netdev/list/?series=184493
>
> Co-authored-by: Chandra Kesava 
> Co-authored-by: Prathibha Nagooru 
> Signed-off-by: Chandra Kesava 
> Signed-off-by: Prathibha Nagooru 
> Signed-off-by: Satish Dhote 
> ---
> v2:
> - Corrected From author name.
> - Added missing co-author detail.
> - Deleted unused variable.
> ---
>  lib/tc.c | 111 +--
>  1 file changed, 42 insertions(+), 69 deletions(-)
>
> diff --git a/lib/tc.c b/lib/tc.c
> index c2ab77553..4d0232120 100644
> --- a/lib/tc.c
> +++ b/lib/tc.c
> @@ -2648,9 +2648,6 @@ nl_msg_put_masked_value(struct ofpbuf *request, 
> uint16_t type,
>  const void *mask_data, size_t len)
>  {
>  if (mask_type != TCA_FLOWER_UNSPEC) {
> -if (is_all_zeros(mask_data, len)) {
> -return;
> -}
>  nl_msg_put_unspec(request, mask_type, mask_data, len);
>  }
>  nl_msg_put_unspec(request, type, data, len);
> @@ -2705,17 +2702,16 @@ nl_msg_put_flower_tunnel(struct ofpbuf *request, 
> struct tc_flower *flower)
>  uint8_t ttl = flower->key.tunnel.ttl;
>  uint8_t tos_mask = flower->mask.tunnel.tos;
>  uint8_t ttl_mask = flower->mask.tunnel.ttl;
> -ovs_be64 id_mask = flower->mask.tunnel.id;
>
> -if (ipv4_dst_mask || ipv4_src_mask) {
> +if (ipv4_dst || ipv4_src) {
>  nl_msg_put_be32(request, TCA_FLOWER_KEY_ENC_IPV4_DST_MASK,
>  ipv4_dst_mask);
>  nl_msg_put_be32(request, TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK,
>  ipv4_src_mask);
>  nl_msg_put_be32(request, TCA_FLOWER_KEY_ENC_IPV4_DST, ipv4_dst);
>  nl_msg_put_be32(request, TCA_FLOWER_KEY_ENC_IPV4_SRC, ipv4_src);
> -} else if (ipv6_addr_is_set(ipv6_dst_mask) ||
> -   ipv6_addr_is_set(ipv6_src_mask)) {
> +} else if (ipv6_addr_is_set(ipv6_dst) ||
> +   ipv6_addr_is_set(ipv6_src)) {
>  nl_msg_put_in6_addr(request, TCA_FLOWER_KEY_ENC_IPV6_DST_MASK,
>  ipv6_dst_mask);
>  nl_msg_put_in6_addr(request, TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK,
> @@ -2723,20 +2719,18 @@ nl_msg_put_flower_tunnel(struct ofpbuf *request, 
> struct tc_flower *flower)
>  nl_msg_put_in6_addr(request, TCA_FLOWER_KEY_ENC_IPV6_DST, ipv6_dst);
>  nl_msg_put_in6_addr(request, TCA_FLOWER_KEY_ENC_IPV6_SRC, ipv6_src);
>  }
> -if (tos_mask) {
> -nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TOS, tos);
> -nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TOS_MASK, tos_mask);
> -}
> -if (ttl_mask) {
> -nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TTL, ttl);
> -nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TTL_MASK, ttl_mask);
> -}
> +
> +nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TOS, tos);
> +nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TOS_MASK, tos_mask);
> +
> +nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TTL, ttl);
> +nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TTL_MASK, ttl_mask);
> +
>  if (tp_dst) {
>  nl_msg_put_be16(request, TCA_FLOWER_KEY_ENC_UDP_DST_PORT, tp_dst);
>  }
> -if (id_mask) {
> -nl_msg_put_be32(request, TCA_FLOWER_KEY_ENC_KEY_ID, id);
> -}
> +
> +nl_msg_put_be32(request, TCA_FLOWER_KEY_ENC_KEY_ID, id);
>  nl_msg_put_flower_tunnel_opts(request, T

Re: [ovs-dev] [PATCH V3 12/12] netdev-offload-dpdk: Fix Ethernet matching for type only



On 6/29/2020 9:10 PM, Ilya Maximets wrote:

While calling from the userspace datapath, we always have a match on dl_type.
This means that it's not possible to hit the 'else' condition.
I'm worried about the usecase described there.  It's hard to track changes
in i40e_flow.c.  Can someone test this with XL710?

Yes, that's true. I think we should have it for consistency with the rest of 
the code, and also just in case.

Regarding the XL710 comment - I think it should not have been merged into OVS 
in the first place. If it is an issue with XL710, the relevant PMD should 
handle it, and not OVS. I just kept it in case I miss something here. Do you 
think it's OK to remove it?

Thinking about correctness, if dl_type match requested and we're
matching any L2 packets instead, this does not sound good.  In
this case we might perform incorrect set of actions for a packet
that should be handled differently.  Also, we might end up having
several flows with the same match and different actions in case the
only dufference was in dl_type, which is not a good thing too.
This commit addresses exactly this issue, and applies specific ETH 
matches if applicable, including dl_type.


So, from that point of view it's better to remove the 'else' branch
entirely.  Good reasoning should be described in a commit message.
I don't see how that implies removing the else entirely. If there is 
from some reason (though won't happen at least currently) no matches on 
any eth (src/dst/type), it does make sense to match on generic "ETH". 
For completeness I'd prefer to keep the else branch, but maybe drop only 
the comment about XL710. What do you think?


For the 'if' condition:  You could do 'masks.dl_type' check first,
i.e. before checking eth addresses, this way we will save a few cpu
cycles in most cases.

OK.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH RESEND ovs-dev v1 0/2] dpif-netdev: avoid ovs-vswitchd crash

2020-06-29 Thread Tonghao Zhang

On Fri, Jun 12, 2020 at 9:39 AM Tonghao Zhang  wrote:
>
> On Tue, Jun 9, 2020 at 8:54 AM  wrote:
> >
> > From: Tonghao Zhang 
> >
> > This patchset add more robust error handling.
> > Tested-at: 
> > https://travis-ci.com/github/ovn-open-virtual-networks/ovs/builds/170300796
> ping
> > Tonghao Zhang (2):
> >   dpif-netdev: Add check mark to avoid ovs-vswitchd crash.
> >   dpif-netdev: Return error code when no mark available.
> >
> >  lib/dpif-netdev.c | 8 
> >  1 file changed, 8 insertions(+)
> >
> > --
> > 2.26.1
> >
>
>
> --
> Best regards, Tonghao
Hi Ben, Ilya
any comments about that bugfix ?


-- 
Best regards, Tonghao
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Test Mail

2020-06-29 Thread svc . eng . git-patch






___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 2/5] stream: Add record/replay functionality.

For debugging purposes it is useful to be able to record all the
incoming transactions and commands and replay them locally under
debugger or with additional logging enabled.  This patch introduces
ability to record all the incoming stream data and replay it via new
stream provider named 'stream-replay'.  During the record phase all
the incoming stream data written to special replay_* files in the
application rundir.  On replay phase instead of opening real streams
application will open replay_* files and read all the incoming data
directly from them.

If enabled for ovsdb-server, for example, this allows to record all
the connections and transactions from the big setup and replay them
locally afterwards to debug the behaviour or tests performance.

To start application in recording mode there is a --stream-replay-record
cmdline option. --stream-replay is to replay previously recorded
streams.  If application depends on generation of UUIDs, it's likely
required to pass --predictable-uuids-with-seed=XXX cmdline option in
both runs in order to have same UUIDs generated.

Current version doesn't work well with time-based stream events like
inactivity probes or any other events generated internally.  This is
a point for further improvement.

Signed-off-by: Ilya Maximets 
---
 lib/automake.mk |   1 +
 lib/stream-provider.h   |   4 +
 lib/stream-replay.c | 561 
 lib/stream.c|  38 ++-
 lib/stream.h|  40 ++-
 ovsdb/ovsdb-client.c|   2 +-
 ovsdb/ovsdb-server.c|   2 +-
 tests/test-jsonrpc.c|   2 +-
 utilities/ovs-vsctl.c   |   2 +-
 vswitchd/ovs-vswitchd.c |   2 +-
 vtep/vtep-ctl.c |   2 +-
 11 files changed, 642 insertions(+), 14 deletions(-)
 create mode 100644 lib/stream-replay.c

diff --git a/lib/automake.mk b/lib/automake.mk
index 86940ccd2..bbe7d04f1 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -281,6 +281,7 @@ lib_libopenvswitch_la_SOURCES = \
lib/stream-fd.c \
lib/stream-fd.h \
lib/stream-provider.h \
+   lib/stream-replay.c \
lib/stream-ssl.h \
lib/stream-tcp.c \
lib/stream.c \
diff --git a/lib/stream-provider.h b/lib/stream-provider.h
index 75f4f059b..0ce4f6f4c 100644
--- a/lib/stream-provider.h
+++ b/lib/stream-provider.h
@@ -29,6 +29,7 @@ struct stream {
 const struct stream_class *class;
 int state;
 int error;
+FILE *replay_wfd;
 char *name;
 char *peer_id;
 };
@@ -133,6 +134,7 @@ struct pstream {
 const struct pstream_class *class;
 char *name;
 ovs_be16 bound_port;
+FILE *replay_wfd;
 };
 
 void pstream_init(struct pstream *, const struct pstream_class *, char *name);
@@ -200,5 +202,7 @@ extern const struct pstream_class pwindows_pstream_class;
 extern const struct stream_class ssl_stream_class;
 extern const struct pstream_class pssl_pstream_class;
 #endif
+extern const struct stream_class replay_stream_class;
+extern const struct pstream_class preplay_pstream_class;
 
 #endif /* stream-provider.h */
diff --git a/lib/stream-replay.c b/lib/stream-replay.c
new file mode 100644
index 0..16486f94e
--- /dev/null
+++ b/lib/stream-replay.c
@@ -0,0 +1,561 @@
+/*
+ * Copyright (c) 2020, Red Hat, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "dirs.h"
+#include "ovs-atomic.h"
+#include "util.h"
+#include "stream-provider.h"
+#include "stream.h"
+#include "openvswitch/poll-loop.h"
+#include "openvswitch/vlog.h"
+
+VLOG_DEFINE_THIS_MODULE(stream_replay);
+
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(10, 25);
+
+/* Stream replay. */
+
+static struct ovs_mutex replay_mutex = OVS_MUTEX_INITIALIZER;
+static int replay_seqno OVS_GUARDED_BY(replay_mutex) = 0;
+static atomic_int replay_state = ATOMIC_VAR_INIT(STREAM_REPLAY_NONE);
+
+void
+stream_replay_set_state(enum stream_replay_state state)
+{
+atomic_store_relaxed(&replay_state, state);
+}
+
+enum stream_replay_state
+stream_replay_get_state(void)
+{
+int state;
+
+atomic_read_relaxed(&replay_state, &state);
+return state;
+}
+
+static char *
+replay_file_name(const char *name, int seqno)
+{
+char *local_name = xstrdup(name);
+char *filename, *p, *c;
+bool skip = false;
+
+/* Replace all the numbers and special symbols with single underscore.
+ * Numbers might be PIDs or port number

[ovs-dev] [PATCH 1/5] ovsdb-server: Allow using predictable UUIDs.

In some cases for debugability it is useful to be able to generate
exactly same UUIDs in different runs of the application.

For example, this is required for the future stream record/replay
functionality of ovsdb-server.  With predictable UUIDs we could
record all incoming transactions and replay them later while
being sure that ovsdb-server will generate exactly same UUIDs
for all the data updates.

Signed-off-by: Ilya Maximets 
---
 lib/uuid.c   | 34 ++
 lib/uuid.h   |  1 +
 ovsdb/ovsdb-server.c | 14 ++
 3 files changed, 49 insertions(+)

diff --git a/lib/uuid.c b/lib/uuid.c
index 13d20ac64..6f56fc13f 100644
--- a/lib/uuid.c
+++ b/lib/uuid.c
@@ -35,6 +35,25 @@ static struct aes128 key;
 static uint64_t counter[2];
 BUILD_ASSERT_DECL(sizeof counter == 16);
 
+/* Non-random seed.
+ *
+ * For testing/debugging reasons it might be useful to have predictable UUIDs
+ * generated by uuid_random() and uuid_generate().
+ *
+ * This variable is intentionally 64bit signed integer in order to use '-1' as
+ * an indicator that it shoudl not be used.  'uuid_set_non_random_seed()' sets
+ * unsigned 32bit value that should be more than enough for testing purposes.
+ * This function should be called before uuid_init() and in the same thread
+ * in order to take effect.
+ * */
+static int64_t non_random_seed = -1;
+
+void
+uuid_set_non_random_seed(uint32_t seed)
+{
+non_random_seed = seed;
+}
+
 static void do_init(void);
 
 /*
@@ -276,6 +295,21 @@ do_init(void)
 uint8_t random_seed[16];
 struct timeval now;
 
+if (non_random_seed >= 0) {
+/* We're requested to have predictable UUIDs, so using non-random
+ * seed for the initialization. */
+sha1_init(&sha1_ctx);
+sha1_update_int(&sha1_ctx, non_random_seed);
+sha1_final(&sha1_ctx, sha1);
+
+aes128_schedule(&key, sha1);
+
+/* Seeting initial counter to the same non-random seed. */
+counter[0] = non_random_seed;
+counter[1] = 0;
+return;
+}
+
 /* Get seed data. */
 get_entropy_or_die(random_seed, sizeof random_seed);
 xgettimeofday(&now);
diff --git a/lib/uuid.h b/lib/uuid.h
index fa49354f6..4abae64ea 100644
--- a/lib/uuid.h
+++ b/lib/uuid.h
@@ -73,6 +73,7 @@ uuid_prefix(const struct uuid *uuid, int digits)
 return (uuid->parts[0] >> (32 - 4 * digits));
 }
 
+void uuid_set_non_random_seed(uint32_t seed);
 void uuid_init(void);
 void uuid_generate(struct uuid *);
 struct uuid uuid_random(void);
diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index ef4e996df..42ba0053b 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -1712,6 +1712,7 @@ parse_options(int argc, char *argv[],
 OPT_SYNC_EXCLUDE,
 OPT_ACTIVE,
 OPT_NO_DBS,
+OPT_PREDICTABLE_UUIDS,
 VLOG_OPTION_ENUMS,
 DAEMON_OPTION_ENUMS,
 SSL_OPTION_ENUMS,
@@ -1734,6 +1735,8 @@ parse_options(int argc, char *argv[],
 {"sync-exclude-tables", required_argument, NULL, OPT_SYNC_EXCLUDE},
 {"active", no_argument, NULL, OPT_ACTIVE},
 {"no-dbs", no_argument, NULL, OPT_NO_DBS},
+{"predictable-uuids-with-seed", required_argument, NULL,
+OPT_PREDICTABLE_UUIDS},
 {NULL, 0, NULL, 0},
 };
 char *short_options = ovs_cmdl_long_options_to_short_options(long_options);
@@ -1824,6 +1827,17 @@ parse_options(int argc, char *argv[],
 add_default_db = false;
 break;
 
+case OPT_PREDICTABLE_UUIDS: {
+unsigned int seed;
+
+if (str_to_uint(optarg, 10, &seed)) {
+uuid_set_non_random_seed(seed);
+uuid_init();
+} else {
+ovs_fatal(0, "Failed to parse non-random seed.");
+}
+break;
+}
 case '?':
 exit(EXIT_FAILURE);
 
-- 
2.25.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 3/5] ovsdb-server: Integrate stream replay engine.

This change adds support of stream record/replay functionality to
ovsdb-server.

Since current replay engine doesn't work well with time-based
events generated locally, it will work only with standalone databases
for now (raft heavily depends on time).

To use this functionality run:

  # record:
  
  ovsdb-server --sream-replay-record --predictable-uuids-with-seed=1234 <...>
  
  ovs-appctl -t ovsdb-server exit

  # replay:
  
  ovsdb-server --sream-replay --predictable-uuids-with-seed=1234 <...>
  At this point ovsdb-server should execute all the same commands
  and transactions.  Since the last command was 'exit' via unixctl,
  ovsdb-server will exit in the end.

Signed-off-by: Ilya Maximets 
---
 ovsdb/ovsdb-server.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index 3af09c8c1..178fa6fa5 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -1716,6 +1716,7 @@ parse_options(int argc, char *argv[],
 VLOG_OPTION_ENUMS,
 DAEMON_OPTION_ENUMS,
 SSL_OPTION_ENUMS,
+STREAM_REPLAY_OPTION_ENUMS,
 };
 
 static const struct option long_options[] = {
@@ -1731,6 +1732,7 @@ parse_options(int argc, char *argv[],
 {"bootstrap-ca-cert", required_argument, NULL, OPT_BOOTSTRAP_CA_CERT},
 {"peer-ca-cert", required_argument, NULL, OPT_PEER_CA_CERT},
 STREAM_SSL_LONG_OPTIONS,
+STREAM_REPLAY_LONG_OPTIONS,
 {"sync-from",   required_argument, NULL, OPT_SYNC_FROM},
 {"sync-exclude-tables", required_argument, NULL, OPT_SYNC_EXCLUDE},
 {"active", no_argument, NULL, OPT_ACTIVE},
@@ -1807,6 +1809,8 @@ parse_options(int argc, char *argv[],
 stream_ssl_set_peer_ca_cert_file(optarg);
 break;
 
+STREAM_REPLAY_OPTION_HANDLERS
+
 case OPT_SYNC_FROM:
 *sync_from = xstrdup(optarg);
 break;
@@ -1868,7 +1872,7 @@ usage(void)
program_name, program_name, ovs_dbdir());
 printf("\nJSON-RPC options (may be specified any number of times):\n"
"  --remote=REMOTE connect or listen to REMOTE\n");
-stream_usage("JSON-RPC", true, true, true, false);
+stream_usage("JSON-RPC", true, true, true, true);
 daemon_usage();
 vlog_usage();
 replication_usage();
-- 
2.25.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 5/5] jsonrpc: Disable inactivity probes if replay engine is active.

Current version of replay engine doesn't handle time-based internal
events that results in stream send/receive.  Disabling jsonrpc inactivity
probes for now to not block process waiting for probe being sent.

The proper solution whould be to implement correct record/replay
of time, probably, by recording time and using the time warping.

Signed-off-by: Ilya Maximets 
---
 lib/jsonrpc.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c
index ed748dbde..0e42751b0 100644
--- a/lib/jsonrpc.c
+++ b/lib/jsonrpc.c
@@ -848,7 +848,7 @@ jsonrpc_session_open_multiple(const struct svec *remotes, 
bool retry)
 reconnect_set_backoff(s->reconnect, INT_MAX, INT_MAX);
 }
 
-if (!stream_or_pstream_needs_probes(name)) {
+if (!stream_or_pstream_needs_probes(name) || stream_replay_is_active()) {
 reconnect_set_probe_interval(s->reconnect, 0);
 }
 
@@ -873,6 +873,11 @@ jsonrpc_session_open_unreliably(struct jsonrpc *jsonrpc, 
uint8_t dscp)
 reconnect_set_quiet(s->reconnect, true);
 reconnect_set_name(s->reconnect, jsonrpc_get_name(jsonrpc));
 reconnect_set_max_tries(s->reconnect, 0);
+
+if (stream_replay_is_active()) {
+reconnect_set_probe_interval(s->reconnect, 0);
+}
+
 reconnect_connected(s->reconnect, time_msec());
 s->dscp = dscp;
 s->rpc = jsonrpc;
@@ -1231,6 +1236,9 @@ void
 jsonrpc_session_set_probe_interval(struct jsonrpc_session *s,
int probe_interval)
 {
+if (stream_replay_is_active()) {
+return;
+}
 reconnect_set_probe_interval(s->reconnect, probe_interval);
 }
 
-- 
2.25.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 4/5] ovsdb-server: Don't update manager status if replay engine is active.

Current version or replay engine doesn't handle correctly internal
time-based events that ends up in stream events.  For example,
updates of a database status that happens each 2.5 seconds reults
in updates on client monitors.  Disable updates for now if replay
engine is active.  The very first update kept to store the initial
information about the server.

The proper solution whould be to record time and replay it, probably,
with time warping or in some other way.

Signed-off-by: Ilya Maximets 
---
 lib/stream.h |  6 ++
 ovsdb/ovsdb-server.c | 10 +++---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/lib/stream.h b/lib/stream.h
index 40357137e..8b88c3ceb 100644
--- a/lib/stream.h
+++ b/lib/stream.h
@@ -105,6 +105,12 @@ enum stream_replay_state {
 
 void stream_replay_set_state(enum stream_replay_state);
 enum stream_replay_state stream_replay_get_state(void);
+static inline bool
+stream_replay_is_active(void)
+{
+return stream_replay_get_state() != STREAM_REPLAY_NONE;
+}
+
 void stream_replay_open_wfd(struct stream *);
 void pstream_replay_open_wfd(struct pstream *);
 void stream_replay_close_wfd(struct stream *);
diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index 178fa6fa5..25425569e 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -252,8 +252,10 @@ main_loop(struct server_config *config,
 }
 }
 
-/* update Manager status(es) every 2.5 seconds */
-if (time_msec() >= status_timer) {
+/* update Manager status(es) every 2.5 seconds.  Don't update if we're
+ * reconding or performing replay. */
+if (status_timer == LLONG_MIN ||
+ (!stream_replay_is_active() && time_msec() >= status_timer)) {
 status_timer = time_msec() + 2500;
 update_remote_status(jsonrpc, remotes, all_dbs);
 }
@@ -279,7 +281,9 @@ main_loop(struct server_config *config,
 if (*exiting) {
 poll_immediate_wake();
 }
-poll_timer_wait_until(status_timer);
+if (!stream_replay_is_active()) {
+poll_timer_wait_until(status_timer);
+}
 poll_block();
 if (should_service_stop()) {
 *exiting = true;
-- 
2.25.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 0/5] Stream Record/Replay.

This patch set adds new stream provider and other functionality in
order to record all the incoming data on all the steams (ssl, tcp,
unixctl) of openvswitch library based applications and replay these
streams later for debugging purposes or performance tests.

For example, these changes allowed me to record the lifecycle of
a standalone ovsdb-server in a fake-multinode cluster from ovn
scale testing setup with 100 nodes.  While having only replay files,
user could reproduce transactions and connections in a big cluster on
a local PC wile being able to tweak various log levels, run everything
under debugger or tracer, measure performance difference with perf.

Current implementation doesn't work with clustered databases since
raft heavily depends on time events.  This is a point of further
improvement.

Some more details in individual commit messages.  Some documentation
updates are missing, will be in next version of the patch-set.  Code
maybe not that elegant in a couple of places, I'm going to work on
this too.

ovn-northd might be a good candidate to integrate this functionality
from the OVN side.

Ilya Maximets (5):
  ovsdb-server: Allow using predictable UUIDs.
  stream: Add record/replay functionality.
  ovsdb-server: Integrate stream replay engine.
  ovsdb-server: Don't update manager status if replay engine is active.
  jsonrpc: Disable inactivity probes if replay engine is active.

 lib/automake.mk |   1 +
 lib/jsonrpc.c   |  10 +-
 lib/stream-provider.h   |   4 +
 lib/stream-replay.c | 561 
 lib/stream.c|  38 ++-
 lib/stream.h|  46 +++-
 lib/uuid.c  |  34 +++
 lib/uuid.h  |   1 +
 ovsdb/ovsdb-client.c|   2 +-
 ovsdb/ovsdb-server.c|  30 ++-
 tests/test-jsonrpc.c|   2 +-
 utilities/ovs-vsctl.c   |   2 +-
 vswitchd/ovs-vswitchd.c |   2 +-
 vtep/vtep-ctl.c |   2 +-
 14 files changed, 717 insertions(+), 18 deletions(-)
 create mode 100644 lib/stream-replay.c

-- 
2.25.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Tell me I'm dreaming

2020-06-29 Thread 2nd Amendment

Maybe you heard the news report. . . the Democrats are taking aim at Concealed
Carry Permits like this site promotes. Currently the #1 CCW Permit available
today is in the crossfire of New Lawmakers that want to abolish our rights to
carry! Luckily, it's still very easy to Download Your Certification here.

Gun Control is back and in a way that could change your right to carry a gun
forever. . .

Luckily this Concealed Carry Certification site is still up and legal citizens
of the US can download the multi-state Permit Certification Today.

Not Available to Illegal Immigrants
Not Available to Illegal Drug Users
Not Available to Felons
Basically, every law-abiding US Resident can download it now. All you have to
do is prove you know some basic gun safety rules and its Yours if You Want it.
. . This fast-track Online Permit Certification Process was once only
available to the military until recent Pro-Gun laws changed.

Now civilians, just like active and retired military, can Fast Track their
Concealed Carry Permit Processing by Certifying Online at this Site. You don't
even have to own or fire a gun to qualify!

Remember this one thing. . .

Nothing this Simple will Last Forever. Government Regulations are Sure to Make
this Multi-State CCW Obsolete Very Soon. . .

This site might not be available tomorrow, so get yours now while there's still
time.

unsubscribe -
http://www.tampornoizle.site/11f5jL2395EpF8911zj4Z2u4m9ej20PHI5fstIr5w6bxGaEsvZ10zd.nee9bnm7n10VR_U5kTrlx/Wendell-Brewster
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] OVS needs to release new stable versions.

On Tue, Jun 30, 2020 at 12:43:48AM +0200, Ilya Maximets wrote:
> On 6/29/20 10:47 PM, Ben Pfaff wrote:
> > On Fri, Jun 26, 2020 at 12:18:37PM +0200, Ilya Maximets wrote:
> >> So, what is the proposed plan:
> >>
> >>   1. We should add missed git tags to 2.11.3 and 2.11.4 releases.
> >>
> >>  Ben could you, please, take care of this? (Alternatively, I could do 
> >> that,
> >>  but I'm not sure what with the keys to sign tags and how this key
> >>  management handled these days since the collapse of web of trust.)
> > 
> > I think that it's probably best to take myself out off the critical path
> > here.  I started out using signed tags because they were easy for me,
> > since I already had a key in Debian's web of trust.  But I don't know of
> > a reason why they need to be signed, or signed by a particular key.  I
> > think it is perfectly reasonable if you push the tags, signed or
> > unsigned.
> 
> OK.  I'll do that.

Great.

> > I don't know what you mean by "the collapse of the web of trust".  Did I
> > miss a memo?
> 
> It was a reference to Certificate Spamming Attacks happened last year [1],
> and some consequences like creating new implementations of key servers
> (keys.openpgp.org) with some design changes in compare with SKS.
> 
> [1] https://gist.github.com/rjhansen/67ab921ffb4084c865b3618d6955275f

Oh my.  I did see something about that at the time, I think.  Ugh.

> >>   3. Prepare patches for OVS stable releases for all branches starting from
> >>  branch-2.5 and apply them.  Create, sign and push tags.
> >>
> >>  Ben, I know you handled this part last time.  What is the preferred 
> >> way?
> >>  Someone else could prepare patches and send for review, I guess.
> > 
> > I think we should designate a newer "long term stable" release now that
> > OVN has released separately.
> 
> Yeah.  That's a good point.  Maybe 2.13 is a good candidate for a new LTS?

I hope so?  I'd really rather not be in the middle of that decision.

> > That aside, I have never prepared separate patches for OVS stable
> > releases.  When I apply bug fixes, I backport them as far as necessary.
> > It would be a burden to figure this out retrospectively, but it's
> > usually easy at the time of bug fix.
> 
> By "Prepare patches for OVS stable releases" I meant patches like this:
>* 2ff0b7715 2019-09-06 | Prepare for 2.10.5.
>* 5f19eaaf2 2019-09-06 | Set release date for 2.10.4. (tag: v2.10.4)

Oh!  Got it.  Justin used to do most of that work, so I didn't think
about it.  My part of the job was, typically, acking the patches and
pushing the tags (I did the latter because I was the one with the GPG
key).

I suspect that a lot of it could be automated with a script (which would
probably be more reliable than doing it manually, which I *think* Justin
probably did).

> But yes, I agree that looking through patches and backport them selectively
> is too much work and will likely require even more people.  Backporting
> patches as far as necessary is a good practice. I do that too.

Wonderful!  I was sure hoping that there was not a big backlog of
patches that might need backporting, since it's much more difficult to
do it that way in my experience.

> We could do selective backports by request in case something missed, but
> we should not revisit all the patches while preparing stable releases on
> older branches. 

Yes.

> >>   4. Update relevant docs outside of openvswitch main repository (web-site,
> >>  wiki, etc.)
> >>
> >>
> >> P.S.: We should, probably, do some periodic checks on released
> >> branches and make stable releases a bit more frequently.  For example,
> >> we could make regular stable release every few months in case we have
> >> fixes on top of the previous release.  I could keep monitoring things
> >> and ping people, but this might be better to have some plan, I think.
> > 
> > Yes, probably.  At one point Justin and I were definitely the people who
> > coordinated this, but now if it is left to us then it probably won't get
> > done, or not done often enough.
> > 
> > I really need some new people to take the leadership role here.
> > 
> 
> I could take care of this.  I mean, I could periodically look at numbers
> of patches on stable branches and prepare minor releases if needed.

That would be fantastic.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] OVS needs to release new stable versions.

On 6/29/20 10:47 PM, Ben Pfaff wrote:
> On Fri, Jun 26, 2020 at 12:18:37PM +0200, Ilya Maximets wrote:
>> So, what is the proposed plan:
>>
>>   1. We should add missed git tags to 2.11.3 and 2.11.4 releases.
>>
>>  Ben could you, please, take care of this? (Alternatively, I could do 
>> that,
>>  but I'm not sure what with the keys to sign tags and how this key
>>  management handled these days since the collapse of web of trust.)
> 
> I think that it's probably best to take myself out off the critical path
> here.  I started out using signed tags because they were easy for me,
> since I already had a key in Debian's web of trust.  But I don't know of
> a reason why they need to be signed, or signed by a particular key.  I
> think it is perfectly reasonable if you push the tags, signed or
> unsigned.

OK.  I'll do that.

> 
> I don't know what you mean by "the collapse of the web of trust".  Did I
> miss a memo?

It was a reference to Certificate Spamming Attacks happened last year [1],
and some consequences like creating new implementations of key servers
(keys.openpgp.org) with some design changes in compare with SKS.

[1] https://gist.github.com/rjhansen/67ab921ffb4084c865b3618d6955275f

> 
>>   3. Prepare patches for OVS stable releases for all branches starting from
>>  branch-2.5 and apply them.  Create, sign and push tags.
>>
>>  Ben, I know you handled this part last time.  What is the preferred way?
>>  Someone else could prepare patches and send for review, I guess.
> 
> I think we should designate a newer "long term stable" release now that
> OVN has released separately.

Yeah.  That's a good point.  Maybe 2.13 is a good candidate for a new LTS?

> 
> That aside, I have never prepared separate patches for OVS stable
> releases.  When I apply bug fixes, I backport them as far as necessary.
> It would be a burden to figure this out retrospectively, but it's
> usually easy at the time of bug fix.

By "Prepare patches for OVS stable releases" I meant patches like this:
   * 2ff0b7715 2019-09-06 | Prepare for 2.10.5.
   * 5f19eaaf2 2019-09-06 | Set release date for 2.10.4. (tag: v2.10.4)

But yes, I agree that looking through patches and backport them selectively
is too much work and will likely require even more people.  Backporting
patches as far as necessary is a good practice. I do that too.

We could do selective backports by request in case something missed, but
we should not revisit all the patches while preparing stable releases on
older branches. 

> 
>>   4. Update relevant docs outside of openvswitch main repository (web-site,
>>  wiki, etc.)
>>
>>
>> P.S.: We should, probably, do some periodic checks on released
>> branches and make stable releases a bit more frequently.  For example,
>> we could make regular stable release every few months in case we have
>> fixes on top of the previous release.  I could keep monitoring things
>> and ping people, but this might be better to have some plan, I think.
> 
> Yes, probably.  At one point Justin and I were definitely the people who
> coordinated this, but now if it is left to us then it probably won't get
> done, or not done often enough.
> 
> I really need some new people to take the leadership role here.
> 

I could take care of this.  I mean, I could periodically look at numbers
of patches on stable branches and prepare minor releases if needed.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] OVS needs to release new stable versions.

2020-06-29 Thread Salario emocional y compensación al personal

On Fri, Jun 26, 2020 at 12:18:37PM +0200, Ilya Maximets wrote:
> So, what is the proposed plan:
> 
>   1. We should add missed git tags to 2.11.3 and 2.11.4 releases.
> 
>  Ben could you, please, take care of this? (Alternatively, I could do 
> that,
>  but I'm not sure what with the keys to sign tags and how this key
>  management handled these days since the collapse of web of trust.)

I think that it's probably best to take myself out off the critical path
here.  I started out using signed tags because they were easy for me,
since I already had a key in Debian's web of trust.  But I don't know of
a reason why they need to be signed, or signed by a particular key.  I
think it is perfectly reasonable if you push the tags, signed or
unsigned.

I don't know what you mean by "the collapse of the web of trust".  Did I
miss a memo?

>   3. Prepare patches for OVS stable releases for all branches starting from
>  branch-2.5 and apply them.  Create, sign and push tags.
> 
>  Ben, I know you handled this part last time.  What is the preferred way?
>  Someone else could prepare patches and send for review, I guess.

I think we should designate a newer "long term stable" release now that
OVN has released separately.

That aside, I have never prepared separate patches for OVS stable
releases.  When I apply bug fixes, I backport them as far as necessary.
It would be a burden to figure this out retrospectively, but it's
usually easy at the time of bug fix.

>   4. Update relevant docs outside of openvswitch main repository (web-site,
>  wiki, etc.)
> 
> 
> P.S.: We should, probably, do some periodic checks on released
> branches and make stable releases a bit more frequently.  For example,
> we could make regular stable release every few months in case we have
> fixes on top of the previous release.  I could keep monitoring things
> and ping people, but this might be better to have some plan, I think.

Yes, probably.  At one point Justin and I were definitely the people who
coordinated this, but now if it is left to us then it probably won't get
done, or not done often enough.

I really need some new people to take the leadership role here.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Clima Laboral - Herramientas para mejorar el entorno

Jueves 16 de Julio | Horario de 10:00 a 14:00 hrs.  |  (hora del centro de 
México) 

- Salario emocional y otras estrategias de compensación al personal - 

El salario emocional juega un papel importante en la motivación e implicación 
de la persona respecto a la organización.
Garantizar el compromiso y permanencia en la empresa, haciendo una adecuada 
gestión del talento, depende del salario
emocional. La retribución.

Objetivos Específicos:

- El participante revisará conceptos relacionados con el clima laboral.
- El participante revisará la metodología de medición de clima laboral.
- El participante aprenderá a evaluar el clima laboral a través de instrumentos 
de medida científicamente validados.
- El participante revisará herramientas para mejorar el clima laboral y las 
relaciones con sus colaboradores.

Conocer las estrategias más innovadoras para fomentar la motivación de sus 
trabajadores y su compromiso con la
organización potenciando la motivación de los trabajadores y conociendo cuáles 
son los beneficios y resultados que
 tiene para la empresa.

Solicita información respondiendo a este correo con la palabra "SALARIO" junto 
con los siguientes datos:

Nombre:
Correo electrónico:
Número telefónico:
Correo Alterno:

Números de Atención: 

55 3016 7085 - 55 1554 6630

En caso de que haya recibido este correo sin haberlo solicitado o si desea 
dejar de recibir nuestra promoción favor de responder
con la palabra baja o enviar un correo a bajas@ innovalearn.net


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Dr Grace H Admas

2020-06-29 Thread perfect cash finance

Congratulation!!!
Dear winner,

Please kindly view the mail attachment and contact the Claim Manager with the 
required details for payment Verification/Release. If this Notification Letter 
hits your Junk/Spam folder, simply move it from your Spam/Junk folder to your 
inbox for better viewing and easy accessibility.

Sincerely,

Dr Grace H Admas

COVID-19!/Chief Executive Promo Officer


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 1/2 ovn] External IP based NAT: Add Columns and CLI

2020-06-29 Thread svc . mail . git

Hi Mark,

Thanks a lot for the feedback.

a. Address Set is used to make sure that we don't have to configure common set 
of endpoint ips again and again. In a deployment, peered physical subnets will 
be common across all the logical routers, hence using address set looked 
better. As more IPs are allowed/disallowed, we dont have to update multiple 
logical routers.
However, i am fine with moving to arbitrary list, if that is preferred, please 
let me know your thoughts.

b. Agree with better naming feedback ?. Let me know if you have some 
preference, else i will try to come up with something better in V3.

Will address rest of the comments in V3.

Regards,
Ankur

From: Mark Michelson 
Sent: Monday, June 29, 2020 11:33 AM
To: svc.mail.git ; ovs-dev@openvswitch.org 

Subject: Re: [ovs-dev] [PATCH v2 1/2 ovn] External IP based NAT: Add Columns 
and CLI

Hi Ankur,

Why is it required to use an address set here? Typically in the
northbound database, we allow for an arbitrary list of IP addresses to
be accepted, rather than requiring the use of an address set.

We already use the term "external IP" in NAT documentation, so I think
the changes in ovn-nb.xml could use some elaboration. If I could think
of a better name for this new feature, I would :). Mention that for
SNAT, it refers to the destination address, rather than the NATted
source address, and that for DNAT, it refers to the source address
rather than the pre-NATted destination address.

Also see below for a couple more comments.

On 6/28/20 9:34 PM, Ankur Sharma wrote:
> From: Ankur Sharma 
>
> This patch adds following columns to NAT table.
>
> a. allowed_external_ip:
> Represents Address Set of External IPs for which
> a NAT rule is applicable.
>
> b. disallowed_external_ip
> Represents Address Set of External IPs for which
> a NAT rule is NOT applicable.
>
> Additionally, patch adds nbctl cli to set these column values.
> ovn-nbctl [--is-allowed] lr-nat-update-ext-ip
>
> Signed-off-by: Ankur Sharma 
> ---
>   ovn-nb.ovsschema  |  14 ++-
>   ovn-nb.xml|  24 
>   tests/ovn-nbctl.at|  37 +-
>   utilities/ovn-nbctl.c | 102 
> +-
>   4 files changed, 173 insertions(+), 4 deletions(-)
>
> diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
> index da9af71..8688ae0 100644
> --- a/ovn-nb.ovsschema
> +++ b/ovn-nb.ovsschema
> @@ -1,7 +1,7 @@
>   {
>   "name": "OVN_Northbound",
> -"version": "5.24.0",
> -"cksum": "1092394564 25961",
> +"version": "5.24.1",
> +"cksum": "2114041852 26430",
>   "tables": {
>   "NB_Global": {
>   "columns": {
> @@ -400,6 +400,16 @@
>"snat",
>"dnat_and_snat"
>  ]]}}},
> +"allowed_external_ip": {"type": {
> +"key": {"type": "uuid", "refTable": "Address_Set",
> +"refType": "strong"},
> +"min": 0,
> +"max": 1}},
> +"disallowed_external_ip": {"type": {
> +"key": {"type": "uuid", "refTable": "Address_Set",
> +"refType": "strong"},
> +"min": 0,
> +"max": 1}},
>   "options": {"type": {"key": "string", "value": "string",
>"min": 0, "max": "unlimited"}},
>   "external_ids": {
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 6ac178b..d2d0b25 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -2658,6 +2658,30 @@
> 
>   
>
> +
> +  It represents Address Set of external ips that NAT rule is applicable 
> to.
> +
> +  
> +This configuration overrides the default NAT behavior of applying a
> +rule solely based on internal IP.
> +  
> +
> +
> +
> +  It represents Address Set of external ips that NAT rule is NOT
> +  applicable to.
> +  
> +This configuration overrides the default NAT behavior of applying a
> +rule solely based on internal IP.
> +  
> +
> +  
> +"allowed_external_ip" and "disallowed_external_ip" are mutually
> +exclusive to each other. If both Address Sets are set for a rule,
> +then the NAT rule is not applied.
> +  
> +
> +
>   
> Indicates if a dnat_and_snat rule should lead to connection
> tracking state or not.
> diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
> index 6d66087..613d297 100644
> --- a/tests/ovn-nbctl.at
> +++ b/tests/ovn-nbctl.at
> @@ -685,7 +685,42 @@ snat 40.0.0.3   21-65535 
> 192.168.1.6
>   AT_CHECK([ovn-nbctl lr-nat-del lr0])
>   AT_CHECK([ovn-nbctl lr-nat-list lr0], [0], [])
>   AT_CHECK

[ovs-dev] (no subject)

2020-06-29 Thread Mrs. Daborah Raymond

Dear friend,


I have a business container transaction what that some of( $13million dollars)

 I would like to discuss with you. If you are interested, please
contact my email

address (mrs.victoria.alexand...@gmail.com)

My WhatsApp number but only message (+19293737780)

Please do not reply if you are not ready
Thanks
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] CALLING FOR HELP

2020-06-29 Thread Mrs Elizabeth Edwards

Dear Friend,
Greetings!

Please forgive me for stressing you with my predicaments as I know that this 
letter may come to you as big surprise. Actually, a prophesy came to me 3 days 
ago through my pastor to reject earthly reward and riches by handing to a 
person I have not seen and known the charity project which I have planned to 
manage by myself for a greater reward in heaven awaits for me.
 
I came across your E-mail from my personal search, and after I decided to email 
you directly believing that you will be honest to assist me fulfill my final 
wish for the poor.
 
Meanwhile, I am Madam Elizabeth Edwards, 73 years, am from USA and childless. I 
am suffering from Adenocarcinoma Cancer of the lungs for the past 4 years and 
from all indication my condition is really deteriorating as my doctors have 
confirmed and courageously advised me that I may not live beyond 2 weeks from 
now for the reason that my tumor has reached a critical stage which has defiled 
all forms of medical treatment.
 
Since my days are numbered, I willingly accepted to obey my pastor’s prophecy 
by seeking your concern to help me receive the sum of Eighteen million five 
hundred thousand dollars ($18.5m) for the interest of the less privileged as 
I’ve earlier vowed to do.
 
My vow or promise for the poor includes building of well-equipped charity 
foundation hospital and a technical school for their well-being.
 If you will be honest, kind and willing to assist me handle this charity 
project as I’ve mentioned here, I will like you to provide me your personal 
data like.
 
(1) Your full name:
(2) country:
(3) phone number:
(4) Age:
(5) gender
Your opinion on how to manage this fund for the poor is welcomed too.
 
Best Regards!
Mrs. Elizabeth Edwards

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [ovs-discuss] OVS 2.12/2.13 compilation on Ubuntu Bionic

2020-06-29 Thread Gregory Rose




On 6/26/2020 4:57 AM, Maciej Jozefczyk wrote:

Hello!

I would like to kindly ask You if there is a possibility to cherry-pick
patch [1] to stable branches OVS 2.12, OVS 2.13 and release new tags for it?

Without this patch we're now unable to compile OVS 2.12 in OpenStack
Neutron stable releases CI, because it recently started to fail on Ubuntu
Bionic with an error:

2020-06-24 14:50:13.975917 | primary |
/opt/stack/new/ovs/datapath/linux/geneve.c: In function
â€˜geneve_get_v6_dstâ€™:
2020-06-24 14:50:13.975993 | primary |
/opt/stack/new/ovs/datapath/linux/geneve.c:966:15: error: â€˜const
struct ipv6_stubâ€™ has no member named â€˜ipv6_dst_lookupâ€™
2020-06-24 14:50:13.976026 | primary |   if
(ipv6_stub->ipv6_dst_lookup(geneve->net, gs6->sock->sk, &dst, fl6)) {
2020-06-24 14:50:13.976049 | primary |^
2020-06-24 14:50:14.010809 | primary | scripts/Makefile.build:285:
recipe for target '/opt/stack/new/ovs/datapath/linux/geneve.o' failed

The same happens for OVN 2.13. For now this blocks your CI pipelines.

Can I ask You to backport this patch?

Thanks,
Maciej

[1]
https://github.com/openvswitch/ovs/commit/5519e384f6a17f564fef4c5eb39e471e16c77235


___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss



Adding OVS Dev list where maybe the maintainers might see this sooner.

- Greg
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 2/2 ovn] External IP based NAT: NORTHD changes to use allowed/disallowed external ip

2020-06-29 Thread Mark Michelson


On 6/28/20 9:34 PM, Ankur Sharma wrote:

From: Ankur Sharma 

This patch has northd changes which consumes allowed/disallowed external ip
configuration per NAT rule in logical flow.

Allowed/Disallowed external ip range adds an additional match criteria in
snat/dnat/unsnat/undant logical flow rules.

For example, if an allowed_external_ip address set ("efgh")
is configured for following NAT rule.
TYPE EXTERNAL_IPLOGICAL_IP
snat 10.15.24.135   50.0.0.10

Then logical flow will look like following:
..(lr_out_snat)...match=(ip &&  && ip4.dst == $efgh), action=(ct_snat(...);)
...(lr_in_unsnat...)match=(ip && . && ip4.src == $efgh), action=(ct_snat;)

Signed-off-by: Ankur Sharma 
---
  northd/ovn-northd.c |  82 +
  tests/ovn-northd.at | 127 
  2 files changed, 209 insertions(+)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index f49e4f2..3858de9 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -8809,6 +8809,21 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
  struct in6_addr ipv6, mask_v6, v6_exact = IN6ADDR_EXACT_INIT;
  bool is_v6 = false;
  bool stateless = lrouter_nat_is_stateless(nat);
+struct nbrec_address_set *allowed_external_ip =
+  nat->allowed_external_ip;
+struct nbrec_address_set *disallowed_external_ip =
+  nat->disallowed_external_ip;
+
+if (allowed_external_ip && disallowed_external_ip) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_INFO_RL(&rl, "Both allowed: %s and disallowed: %s "
+ "external ips set for NAT, type: %s, "
+ "logical_ip: %s, external_ip: %s",
+ allowed_external_ip->name,
+ disallowed_external_ip->name,
+ nat->type, nat->logical_ip, nat->external_ip);


3 things:
1) This message should probably be a WARN instead of an INFO message.
2) You can refer to the NAT by UUID instead of type and IPs.
3) The message should say that the NAT is not being applied because of 
setting both allow and disallow IP addresses.



+continue;
+}
  
  char *error = ip_parse_masked(nat->external_ip, &ip, &mask);

  if (error || mask != OVS_BE32_MAX) {
@@ -8896,6 +8911,15 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
  ds_put_format(&match, "ip && ip%s.dst == %s",
is_v6 ? "6" : "4",
nat->external_ip);
+if (allowed_external_ip || disallowed_external_ip) {
+ds_put_format(&match, " && ip%s.src %s $%s",
+  is_v6 ? "6" : "4",
+  allowed_external_ip? "==" : "!=",
+  allowed_external_ip?
+  allowed_external_ip->name :
+  disallowed_external_ip->name);
+}


This block is repeated many times throughout this patch (with the only 
different being "src" vs "dst") and should be made into a function.



+
  if (!strcmp(nat->type, "dnat_and_snat") && stateless) {
 ds_put_format(&actions, "ip%s.dst=%s; next;",
   is_v6 ? "6" : "4", nat->logical_ip);
@@ -8925,6 +8949,15 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
od->l3redirect_port->json_key);
  }
  
+if (allowed_external_ip || disallowed_external_ip) {

+ds_put_format(&match, " && ip%s.src %s $%s",
+  is_v6 ? "6" : "4",
+  allowed_external_ip? "==" : "!=",
+  allowed_external_ip?
+  allowed_external_ip->name :
+  disallowed_external_ip->name);
+}
+
  if (!strcmp(nat->type, "dnat_and_snat") && stateless) {
  ds_put_format(&actions, "ip%s.dst=%s; next;",
is_v6 ? "6" : "4", nat->logical_ip);
@@ -8953,6 +8986,15 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
  ds_put_format(&match, "ip && ip%s.dst == %s",
is_v6 ? "6" : "4",
nat->external_ip);
+if (allowed_external_ip || disallowed_external_ip) {
+ds_put_format(&match, " && ip%s.sr

Re: [ovs-dev] [PATCH v2 1/2 ovn] External IP based NAT: Add Columns and CLI

2020-06-29 Thread Mark Michelson


Hi Ankur,

Why is it required to use an address set here? Typically in the 
northbound database, we allow for an arbitrary list of IP addresses to 
be accepted, rather than requiring the use of an address set.


We already use the term "external IP" in NAT documentation, so I think 
the changes in ovn-nb.xml could use some elaboration. If I could think 
of a better name for this new feature, I would :). Mention that for 
SNAT, it refers to the destination address, rather than the NATted 
source address, and that for DNAT, it refers to the source address 
rather than the pre-NATted destination address.


Also see below for a couple more comments.

On 6/28/20 9:34 PM, Ankur Sharma wrote:

From: Ankur Sharma 

This patch adds following columns to NAT table.

a. allowed_external_ip:
Represents Address Set of External IPs for which
a NAT rule is applicable.

b. disallowed_external_ip
Represents Address Set of External IPs for which
a NAT rule is NOT applicable.

Additionally, patch adds nbctl cli to set these column values.
ovn-nbctl [--is-allowed] lr-nat-update-ext-ip

Signed-off-by: Ankur Sharma 
---
  ovn-nb.ovsschema  |  14 ++-
  ovn-nb.xml|  24 
  tests/ovn-nbctl.at|  37 +-
  utilities/ovn-nbctl.c | 102 +-
  4 files changed, 173 insertions(+), 4 deletions(-)

diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
index da9af71..8688ae0 100644
--- a/ovn-nb.ovsschema
+++ b/ovn-nb.ovsschema
@@ -1,7 +1,7 @@
  {
  "name": "OVN_Northbound",
-"version": "5.24.0",
-"cksum": "1092394564 25961",
+"version": "5.24.1",
+"cksum": "2114041852 26430",
  "tables": {
  "NB_Global": {
  "columns": {
@@ -400,6 +400,16 @@
   "snat",
   "dnat_and_snat"
 ]]}}},
+"allowed_external_ip": {"type": {
+"key": {"type": "uuid", "refTable": "Address_Set",
+"refType": "strong"},
+"min": 0,
+"max": 1}},
+"disallowed_external_ip": {"type": {
+"key": {"type": "uuid", "refTable": "Address_Set",
+"refType": "strong"},
+"min": 0,
+"max": 1}},
  "options": {"type": {"key": "string", "value": "string",
   "min": 0, "max": "unlimited"}},
  "external_ids": {
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 6ac178b..d2d0b25 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -2658,6 +2658,30 @@

  
  
+

+  It represents Address Set of external ips that NAT rule is applicable to.
+
+  
+This configuration overrides the default NAT behavior of applying a
+rule solely based on internal IP.
+  
+
+
+
+  It represents Address Set of external ips that NAT rule is NOT
+  applicable to.
+  
+This configuration overrides the default NAT behavior of applying a
+rule solely based on internal IP.
+  
+
+  
+"allowed_external_ip" and "disallowed_external_ip" are mutually
+exclusive to each other. If both Address Sets are set for a rule,
+then the NAT rule is not applied.
+  
+
+
  
Indicates if a dnat_and_snat rule should lead to connection
tracking state or not.
diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
index 6d66087..613d297 100644
--- a/tests/ovn-nbctl.at
+++ b/tests/ovn-nbctl.at
@@ -685,7 +685,42 @@ snat 40.0.0.3   21-65535 
192.168.1.6
  AT_CHECK([ovn-nbctl lr-nat-del lr0])
  AT_CHECK([ovn-nbctl lr-nat-list lr0], [0], [])
  AT_CHECK([ovn-nbctl lr-nat-del lr0])
-AT_CHECK([ovn-nbctl lr-nat-del lr0 dnat])])
+AT_CHECK([ovn-nbctl lr-nat-del lr0 dnat])
+
+AT_CHECK([ovn-nbctl lr-nat-del lr0])
+
+ovn-nbctl create Address_Set name=allowed_range addresses=\"1.1.1.1\"
+ovn-nbctl create Address_Set name=disallowed_range addresses=\"2.2.2.2\"
+AT_CHECK([ovn-nbctl lr-nat-add lr0 snat 40.0.0.3 192.168.1.6])
+AT_CHECK([ovn-nbctl lr-nat-update-ext-ip lr0 snat 192.168.1.6 
disallowed_range])
+AT_CHECK([ovn-nbctl --is-allowed lr-nat-update-ext-ip lr0 snat 192.168.1.6 
allowed_range])
+AT_CHECK([ovn-nbctl lr-nat-update-ext-ip lr0 snat 192.168.1.6 
disallowed_range_tmp], [1], [],
+[ovn-nbctl: disallowed_range_tmp: Address Set name not found
+])
+AT_CHECK([ovn-nbctl --is-allowed lr-nat-update-ext-ip lr0 snat 192.168.1.6 
allowed_range_tmp], [1], [],
+[ovn-nbctl: allowed_range_tmp: Address Set name not found
+])
+
+AT_CHECK([ovn-nbctl lr-nat-add lr0 dnat 40.0.0.4 192.168.1.7])
+AT_CHECK([ovn-nbctl lr-nat-update-ext-ip lr0 dnat 40.0.0.4 disallowed_range])
+AT_CHECK([ovn-nbctl --is-allowed lr-nat-update-ext-ip lr0

Re: [ovs-dev] [PATCH V3 00/12] netdev datapath offload: Support IPv6 and VXLAN encap

On 6/29/20 4:56 PM, Eli Britstein wrote:
> Hi
> 
> I have rebased, changed the order of commits to do testpmd format first and 
> addressed the comments.
> 
> I also added the acked-by of Harsha (though a bit changed since v2). Thanks 
> again.
> 
> As far as I see the only open issue is if to drop the comment about XL710 or 
> not.
> 
> Please address it and if there are any other comments before I'll post v4.

I replied about XL710 workaround.
Please, take a look.  I have no other comments for v3 right now, so, please,
send v4 once agreed on what to do with XL710 issue.

Best regards, Ilya Maximets.

> 
> Thanks,
> 
> Eli
> 
> On 6/25/2020 8:37 AM, Eli Britstein wrote:
>>
>> On 6/24/2020 5:28 PM, Ilya Maximets wrote:
>>> On 6/24/20 4:16 PM, Ilya Maximets wrote:
 On 6/24/20 12:47 PM, Eli Britstein wrote:
> On 6/24/2020 1:37 PM, Ilya Maximets wrote:
>> On 6/21/20 1:19 PM, Eli Britstein wrote:
>>> This patch set includes additional offloads - IPv6 and VXLAN encap, and
>>> enhanced logging to increase debugability.
>>>
>>> Patches #1-#8:   Add support for offloads of IPv6 patterns, partial
>>>    TCP/UDP ports, set IPv6 and encap actions
>>>    (clone/output).
>>> Patch #9:    Bug fix of partial offloads.
>>> Patches #10-#11: Enhance log prints for debugability.
>>> Patch #12:   Fix Ethernet matching for type only.
>>>
>>> v2-v1:
>>> - Removed redundant out label.
>>> - Added a patch to fix dl_type match only.
>>> v3-v2:
>>> - Rebased, and more elaboration in #7 commit message.
>> Hi.
>>
>> I noticed that you didn't include Acked-by tags from Harsha.
>> Was it intentional? i.e. if there was any significant changes during
>> rebase process?
>>
>> I'd like to have these patches acked by Harsha before applying, so
>> my question if tags from v2 are valid for v3 and can I use them?
>> Harsha?  Eli?
> No. It was by mistake. Sorry. The rebase had only one minor conflict in
>
> [PATCH V3 06/12] netdev-offload: Use dpif type instead of class.
>
> The conflict was due to previous commit [1].
>
> As I will have to rebase again for sure to add testpmd prints for VLANs, 
> as [2] was merged, will it be OK I'll add the Acked-by then?
 Sure.  I think it should be OK for patches without any functional
 changes.  Current patches doesn't apply starting from the second one
 due to merged patch [2], so they need a rebase anyway.
>> OK. No functional changes from v2 (or from v1, except the additional dl_type 
>> patch). Could you please review the rest of the series and gather comments 
>> before to apply and not just rebase? Thanks.

 Regarding the testpmd related patch itself, I'm not sure about it.
 It might be hard to support in terms that we will need to track
 testpmd changes and it's also hard to tell for which testpmd version
 these cmdlines should be applicable.
>> Testpmd format doesn't change frequently, if any. Also, a specific OVS 
>> version implies specific DPDK version. In the case of changed format, we'll 
>> need to fix.

 Another point is that it's actually printing of the same information
 twice.  And also lots of extra printing code.
 Maybe, if you need these cmdlines, it's better to write a helper script
 to convert existing output to testpmd cmdlines?  Or even parse offloaded
 and non-offloaded flows from the output of dump-flows.
>>> Anoher option is to just drop current format entirely and use format
>>> of testpmd cmdlines instead.  Current format is not standardized and
>>> doesn't comply with OVS flow dumps format anyway.
>>>
>>> What do you think?
>>
>> I kept the old format for backward compatibility, as well as some testpmd 
>> require more than one command. For example, "vxlan_encap" in "flow create", 
>> but it won't work properly in real testpmd if there was no "set vxlan" to 
>> define the encap header.
>>
>> I'll try to do all in testpmd format and drop the current format.
>>
>>>
> [1] 191536574e3b ("netdev-offload: Implement terse dump support").
>
> [2] 029273855939 ("netdev-offload-dpdk: Support offload of VLAN PUSH/POP 
> actions.")
>
>> Best regards, Ilya Maximets.
>>
>>> Travis:
>>> v1: 
>>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F688413350&data=02%7C01%7Celibr%40mellanox.com%7Cdaa680f281e44d447bc708d8184ad3d9%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637286056953162106&sdata=6iuCat3IMKBtQAMILf4uam1BioABbJ%2BG2estCEaelXg%3D&reserved=0
>>> v2: 
>>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F691375847&data=02%7C01%7Celibr%40mellanox.com%7Cdaa680f281e44d447bc708d8184ad3d9%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637286056953162106&sdata=ybvwkH1KMHB

Re: [ovs-dev] [PATCH V3 12/12] netdev-offload-dpdk: Fix Ethernet matching for type only

On 6/29/20 2:21 PM, Eli Britstein wrote:
> 
> On 6/29/2020 3:38 AM, Ilya Maximets wrote:
>> On 6/21/20 1:19 PM, Eli Britstein wrote:
>>> For OVS rule of the form "eth type is 0x1234 / end", rule is offloaded
>>> in the form of "eth / end", which is incorrect. Fix it.
>>>
>>> Fixes: e8a2b5bf92bb ("netdev-dpdk: implement flow offload with rte flow")
>>> Signed-off-by: Eli Britstein 
>>> Reviewed-by: Roni Bar Yanai 
>>> ---
>>>   lib/netdev-offload-dpdk.c | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>>> index e8b8d6464..e68b6549c 100644
>>> --- a/lib/netdev-offload-dpdk.c
>>> +++ b/lib/netdev-offload-dpdk.c
>>> @@ -860,7 +860,8 @@ parse_flow_match(struct flow_patterns *patterns,
>>>     /* Eth */
>>>   if (!eth_addr_is_zero(match->wc.masks.dl_src) ||
>>> -    !eth_addr_is_zero(match->wc.masks.dl_dst)) {
>>> +    !eth_addr_is_zero(match->wc.masks.dl_dst) ||
>>> +    match->wc.masks.dl_type) {
>> While calling from the userspace datapath, we always have a match on dl_type.
>> This means that it's not possible to hit the 'else' condition.
>> I'm worried about the usecase described there.  It's hard to track changes
>> in i40e_flow.c.  Can someone test this with XL710?
> 
> Yes, that's true. I think we should have it for consistency with the rest of 
> the code, and also just in case.
> 
> Regarding the XL710 comment - I think it should not have been merged into OVS 
> in the first place. If it is an issue with XL710, the relevant PMD should 
> handle it, and not OVS. I just kept it in case I miss something here. Do you 
> think it's OK to remove it?

Thinking about correctness, if dl_type match requested and we're
matching any L2 packets instead, this does not sound good.  In
this case we might perform incorrect set of actions for a packet
that should be handled differently.  Also, we might end up having
several flows with the same match and different actions in case the
only dufference was in dl_type, which is not a good thing too.

So, from that point of view it's better to remove the 'else' branch
entirely.  Good reasoning should be described in a commit message.

For the 'if' condition:  You could do 'masks.dl_type' check first,
i.e. before checking eth addresses, this way we will save a few cpu
cycles in most cases.

> 
>>
>>>   struct rte_flow_item_eth *spec, *mask;
>>>     spec = xzalloc(sizeof *spec);
>>> @@ -876,6 +877,7 @@ parse_flow_match(struct flow_patterns *patterns,
>>>     memset(&consumed_masks->dl_dst, 0, sizeof 
>>> consumed_masks->dl_dst);
>>>   memset(&consumed_masks->dl_src, 0, sizeof consumed_masks->dl_src);
>>> +    consumed_masks->dl_type = 0;
>>>     add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
>>>   } else {
>>> @@ -888,7 +890,6 @@ parse_flow_match(struct flow_patterns *patterns,
>>>    */
>>>   add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL);
>> Actually looking at i40e_flow.c, it seems like this pattern will be rejected.
>> But I'm not sure.
>>
>>>   }
>>> -    consumed_masks->dl_type = 0;
>>>     /* VLAN */
>>>   if (match->wc.masks.vlans[0].tci && match->flow.vlans[0].tci) {
>>>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH] netdev-offload-dpdk: Set transfer attribute to zero for mark/rss offload

The offload layer doesn't initialize the 'transfer' attribute
for mark/rss offload (partial offload). It should be set to 0.

Fixes: 60e778c7533a ("netdev-offload-dpdk: Framework for actions offload.")
Signed-off-by: Sriharsha Basavapatna 
---
 lib/netdev-offload-dpdk.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 26a75f0f2..4c652fd82 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -818,7 +818,8 @@ netdev_offload_dpdk_mark_rss(struct flow_patterns *patterns,
 .group = 0,
 .priority = 0,
 .ingress = 1,
-.egress = 0
+.egress = 0,
+.transfer = 0
 };
 struct rte_flow_error error;
 struct rte_flow *flow;
-- 
2.25.0.rc2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] ovn-northd: Make it harder to specify a bad database remote.

On Mon, Jun 29, 2020 at 12:38:02PM +0530, Numan Siddique wrote:
> On Sat, Jun 27, 2020 at 1:19 AM Ben Pfaff  wrote:
> 
> > Without this change, --ovnnb-db='' produces bad results, such as an
> > assertion failure.  With it, ovn-northd uses the default database.  The
> > latter seems preferable.  Similarly for --ovnsb-db=''.
> >
> > Signed-off-by: Ben Pfaff 
> >
> 
> Acked-by: Numan Siddique 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] jsonrpc: Don't assert for 0 remotes in jsonrpc_session_open_multiple().

On Mon, Jun 29, 2020 at 12:40:01PM +0530, Numan Siddique wrote:
> On Sat, Jun 27, 2020 at 1:17 AM Ben Pfaff  wrote:
> 
> > It's pretty easy to get 0 remotes here from ovn-northd if you specify
> > --ovnnb-db='' or --ovnnb-db='   ' on the command line.  The internals
> > of jsonrpc_session aren't equipped to cope with that, so just add a
> > dummy remote instead.
> >
> > Signed-off-by: Ben Pfaff 
> >
> 
> Acked-by: Numan Siddique 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 4/4] bpf: Add reference XDP program implementation for netdev-offload-xdp

2020-06-29 Thread 0-day Robot

Bleep bloop.  Greetings Toshiaki Makita, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 83 characters long (recommended limit is 79)
ERROR: Improper whitespace around control block
#488 FILE: bpf/bpf_netlink.h:57:
#define BPF_MAP_NL_ATTR_FOR_EACH(OFF, ATTRS, START, END, IDX, MAX_ATTRS, 
MAX_LEN) \

WARNING: Comment with 'xxx' marker
#628 FILE: bpf/flowtable_afxdp.c:95:
/* XXX: This size should be uint16_t but needs to be int as kernel has a bug

ERROR: Improper whitespace around control block
#1077 FILE: bpf/flowtable_afxdp.c:544:
BPF_MAP_NL_ATTR_FOR_EACH(offset, attrs, start, end, i, XDP_MAX_ACTIONS,

Lines checked: 1144, Warnings: 2, Errors: 2


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH V3 02/12] netdev-offload-dpdk: Add IPv6 pattern matching



On 6/29/2020 10:45 AM, Eli Britstein wrote:


On 6/29/2020 1:42 AM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:

Add support for IPv6 pattern matching for offloading flows.

Signed-off-by: Eli Britstein 
Reviewed-by: Roni Bar Yanai 
---
  Documentation/howto/dpdk.rst |  2 +-
  NEWS |  1 +
  lib/netdev-offload-dpdk.c    | 76 


  3 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/Documentation/howto/dpdk.rst 
b/Documentation/howto/dpdk.rst

index be950d7ce..8a0eee80c 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -385,7 +385,7 @@ The validated NICs are:
  Supported protocols for hardware offload matches are:
    - L2: Ethernet, VLAN
-- L3: IPv4
+- L3: IPv4, IPv6
  - L4: TCP, UDP, SCTP, ICMP
    Supported actions for hardware offload are:
diff --git a/NEWS b/NEWS
index 22cacda20..07e23316c 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,7 @@ Post-v2.13.0
 - DPDK:
   * Deprecated DPDK pdump packet capture support removed.
   * Deprecated DPDK ring ports (dpdkr) are no longer supported.
+ * Add hardware offload support for matching IPv6 protocol.

Not experimental? :)

OK. BTW, when do features stop being experimental but mainstream?



 - Linux datapath:
   * Support for kernel versions up to 5.5.x.
 - AF_XDP:
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 2f1b42bf7..6b12e9ae3 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -16,6 +16,8 @@
   */
  #include 
  +#include 
+#include 
  #include 

Can we keep these in alphabetical order?

OK.


It fails travis. I'll keep it that order.

../../include/sparse/netinet/in.h:24:2: error: "Must include 
 before  for FreeBSD support"





    #include "cmap.h"
@@ -321,6 +323,42 @@ dump_flow_pattern(struct ds *s, const struct 
rte_flow_item *item)

  } else {
  ds_put_cstr(s, "  Mask = null\n");
  }
+    } else if (item->type == RTE_FLOW_ITEM_TYPE_IPV6) {
+    const struct rte_flow_item_ipv6 *ipv6_spec = item->spec;
+    const struct rte_flow_item_ipv6 *ipv6_mask = item->mask;
+
+    char src_addr_str[INET6_ADDRSTRLEN];
+    char dst_addr_str[INET6_ADDRSTRLEN];
+
+    ds_put_cstr(s, "rte flow ipv6 pattern:\n");
+    if (ipv6_spec) {
+    ipv6_string_mapped(src_addr_str,
+   (struct in6_addr 
*)&ipv6_spec->hdr.src_addr);

+    ipv6_string_mapped(dst_addr_str,
+   (struct in6_addr 
*)&ipv6_spec->hdr.dst_addr);

+
+    ds_put_format(s, "  Spec:  vtc_flow=%#"PRIx32", 
proto=%"PRIu8","

+  "  hlim=%"PRIu8",  src=%s, dst=%s\n",
+  ntohl(ipv6_spec->hdr.vtc_flow), 
ipv6_spec->hdr.proto,

+  ipv6_spec->hdr.hop_limits, src_addr_str,
+  dst_addr_str);
+    } else {
+    ds_put_cstr(s, "  Spec = null\n");
+    }
+    if (ipv6_mask) {
+    ipv6_string_mapped(src_addr_str,
+   (struct in6_addr 
*)&ipv6_mask->hdr.src_addr);

+    ipv6_string_mapped(dst_addr_str,
+   (struct in6_addr 
*)&ipv6_mask->hdr.dst_addr);

+
+    ds_put_format(s, "  Mask:  vtc_flow=%#"PRIx32", 
proto=%#"PRIx8","

+  "  hlim=%#"PRIx8",  src=%s, dst=%s\n",
+  ntohl(ipv6_mask->hdr.vtc_flow), 
ipv6_mask->hdr.proto,

+  ipv6_mask->hdr.hop_limits, src_addr_str,
+  dst_addr_str);
+    } else {
+    ds_put_cstr(s, "  Mask = null\n");
+    }
  } else {
  ds_put_format(s, "unknown rte flow pattern (%d)\n", 
item->type);

  }
@@ -658,6 +696,44 @@ parse_flow_match(struct flow_patterns *patterns,
  /* ignore mask match for now */
  consumed_masks->nw_frag = 0;
  +    /* IP v6 */
+    if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
+    struct rte_flow_item_ipv6 *spec, *mask;
+
+    spec = xzalloc(sizeof *spec);
+    mask = xzalloc(sizeof *mask);
+
+    spec->hdr.proto = match->flow.nw_proto;
+    spec->hdr.hop_limits = match->flow.nw_ttl;
+    spec->hdr.vtc_flow = htonl((uint32_t)match->flow.nw_tos <<

Please, add a space between the caset and a value.
It also might look better if the whole right side of the assignment
will be moved to the next line and joined.

OK



+ RTE_IPV6_HDR_TC_SHIFT);
+    memcpy(spec->hdr.src_addr, &match->flow.ipv6_src,
+   sizeof spec->hdr.src_addr);
+    memcpy(spec->hdr.dst_addr, &match->flow.ipv6_dst,
+   sizeof spec->hdr.dst_addr);
+
+    mask->hdr.proto = match->wc.masks.nw_proto;
+    mask->hdr.hop_limits = match->wc.masks.nw_ttl;
+    mask->hdr.vtc_flow = htonl((uint32_t)match->wc.masks.nw_tos <<
+   RTE_IPV6_HDR_TC_SHIFT);

Ditto.

Re: [ovs-dev] [PATCH v3 3/4] netdev-offload: Add xdp flow api provider

2020-06-29 Thread 0-day Robot

Bleep bloop.  Greetings Toshiaki Makita, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Comment with 'xxx' marker
#252 FILE: lib/netdev-afxdp.c:329:
/* XXX: close output_map_fd somewhere? */

ERROR: Improper whitespace around control block
#734 FILE: lib/netdev-offload-xdp.c:258:
FLOWMAP_FOR_EACH_INDEX(idx, mf->map) {

ERROR: Improper whitespace around control block
#806 FILE: lib/netdev-offload-xdp.c:330:
NL_ATTR_FOR_EACH_UNSAFE(a, left, actions, actions_len) {

ERROR: Improper whitespace around control block
#963 FILE: lib/netdev-offload-xdp.c:487:
HMAP_FOR_EACH_WITH_HASH(netdev_info, port_node, port_hash,

ERROR: Improper whitespace around control block
#1018 FILE: lib/netdev-offload-xdp.c:542:
NL_ATTR_FOR_EACH_UNSAFE(a, left, actions, actions_len) {

WARNING: Comment with 'xxx' marker
#1038 FILE: lib/netdev-offload-xdp.c:562:
/* XXX: Some NICs cannot handle XDP_REDIRECT'ed packets even with

ERROR: Improper whitespace around control block
#1056 FILE: lib/netdev-offload-xdp.c:580:
HMAP_FOR_EACH_WITH_HASH(data, node, hash, &devmap_idx_table) {

ERROR: Improper whitespace around control block
#1255 FILE: lib/netdev-offload-xdp.c:779:
FLOWMAP_FOR_EACH_INDEX(fidx, minimatch.mask->masks.map) {

ERROR: Improper whitespace around control block
#1528 FILE: lib/netdev-offload-xdp.c:1052:
HMAP_FOR_EACH_WITH_HASH(data, ufid_node, hash, &ufid_to_xdp) {

WARNING: Line lacks whitespace around operator
#1604 FILE: lib/netdev-offload-xdp.c:1128:
if (entry->count == (uint16_t)-1) {

WARNING: Line is 87 characters long (recommended limit is 79)
#1735 FILE: lib/netdev-offload-xdp.c:1259:
VLOG_ERR("\"subtbl_masks\" map has different max_entries from 
\"flow_table\"");

WARNING: Line is 83 characters long (recommended limit is 79)
#1782 FILE: lib/netdev-offload-xdp.c:1306:
VLOG_WARN("%s: Failed to get output_map fd on uninitializing xdp flow 
api",

Lines checked: 1879, Warnings: 5, Errors: 7


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 1/4] netdev-afxdp: Enable loading XDP program.

2020-06-29 Thread 0-day Robot

Bleep bloop.  Greetings Toshiaki Makita, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 80 characters long (recommended limit is 79)
#164 FILE: lib/netdev-afxdp.c:265:
xsk_load_prog(struct netdev *netdev, const char *path, struct bpf_object **pobj,

WARNING: Line is 81 characters long (recommended limit is 79)
#349 FILE: lib/netdev-afxdp.c:826:
&& nullable_string_is_equal(dev->xdp_obj_path, dev->requested_xdp_obj)) 
{

Lines checked: 427, Warnings: 2, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] Fix selection fields for UDP and SCTP load balancers.

On Mon, Jun 29, 2020 at 8:48 PM Mark Michelson  wrote:

> On 6/22/20 1:41 PM, num...@ovn.org wrote:
> > From: Numan Siddique 
> >
> > The commit 5af304e7478a ("Support selection fields in load balancer.")
> > didn't handle the selection fields for UDP and SCTP protocol.
> > If CMS has set the selection fields - tp_src and tp_dst for UDP or SCTP
> > load balancers, ovn-northd was adding lflows as
> > ct_lb(backends=, hash_fields(tp_src,tp_dst))
> > instead of ct_lb(backends=, hash_fields(udp_src,udp_dst)) and
> > ct_lb(backends=, hash_fields(sctp_src,sctp_dst)) respectively.
> >
> > Because of this, select group flows were encoded with tcp_src and tcp_dst
> > hash fields.
> >
> > This patch fixes this issue and refactors the load balancer code in
> ovn-northd
> > to better handle the protocols.
> >
> > Test cases are now added for UDP and SCTP protocols.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1846189
> > Signed-off-by: Numan Siddique 
> > ---
> >   northd/ovn-northd.c |  71 --
> >   tests/ovn.at| 179 ++--
> >   tests/system-ovn.at |  23 ++
> >   3 files changed, 227 insertions(+), 46 deletions(-)
> >
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 40aeb0a58..7887d0d2f 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -3121,6 +3121,7 @@ struct ovn_lb {
> >
> >   const struct nbrec_load_balancer *nlb; /* May be NULL. */
> >   char *selection_fields;
> > +char *proto;
> >   struct lb_vip *vips;
> >   size_t n_vips;
> >   };
> > @@ -3218,6 +3219,12 @@ ovn_lb_create(struct northd_context *ctx, struct
> hmap *lbs,
> >   struct smap_node *node;
> >   size_t n_vips = 0;
> >
> > +if (!nbrec_lb->protocol || !nbrec_lb->protocol[0]) {
> > +lb->proto = xstrdup("tcp");
> > +} else {
> > +lb->proto = xstrdup(nbrec_lb->protocol);
> > +}
> > +
>
> Hm, I don't think this is quite right. Load balancers that do not
> specify ports are L4-agnostic. This patch does not change the behavior,
> but it sets lb->proto to "tcp" even if no port is specified. This could
> result in confusion and potential future buggy code.
>


Lets say, CMS creates a load balancer without specifying protocol

ovn-nbctl lb-add lb0 "10.0.0.10" "10.0.0.3,20.0.0.3"

And then

ovn-nbctl set load_balancer  vips:"10.0.0.12:80"="10.0.0.4:80,
20.0.0.4:80"

In this case we should consider the protocol as TCP right ?

And anyway we add the L4 port checks only if the VIP has port specified. So
I think
it's ok to assume that the load balancer is TCP even if no VIP of the load
balancer has a port specified.

And lets say user specifies the selection_fields as
"ip_src,ip_dst,tp_src,tp_dst" for the below load balancer
(with no protocol specified)

ovn-nbctl lb-add lb1 "10.0.0.20" "10.0.0.30,20.0.0.30"
ovn-nbctl set load_balancer 
selection_fields="ip_src,ip_dst,tp_src,tp_dst".

In this case even without this patch, ovs-vswitchd will consider it as TCP
protocol.

So do you think it's fair to consider the protocol as tcp for a load
balancer (when protocol is not explicitly set)
   - If a VIP has a port specified
   - If selection_fields has "tp_src/tp_dst" set ?

Thanks
Numan

Thanks
Numan




>
> >   SMAP_FOR_EACH (node, &nbrec_lb->vips) {
> >   char *vip;
> >   uint16_t port;
> > @@ -3234,7 +3241,7 @@ ovn_lb_create(struct northd_context *ctx, struct
> hmap *lbs,
> >   lb->vips[n_vips].backend_ips = xstrdup(node->value);
> >
> >   struct nbrec_load_balancer_health_check *lb_health_check =
> NULL;
> > -if (nbrec_lb->protocol && !strcmp(nbrec_lb->protocol, "sctp")) {
> > +if (!strcmp(lb->proto, "sctp")) {
> >   if (nbrec_lb->n_health_check > 0) {
> >   static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(1, 1);
> >   VLOG_WARN_RL(&rl,
> > @@ -3309,15 +3316,11 @@ ovn_lb_create(struct northd_context *ctx, struct
> hmap *lbs,
> >   lb->vips[n_vips].backends[i].svc_mon_src_ip =
> svc_mon_src_ip;
> >
> >   if (lb_health_check && op && svc_mon_src_ip) {
> > -const char *protocol = nbrec_lb->protocol;
> > -if (!protocol || !protocol[0]) {
> > -protocol = "tcp";
> > -}
> >   lb->vips[n_vips].backends[i].health_check = true;
> >   struct service_monitor_info *mon_info =
> >   create_or_get_service_mon(ctx, monitor_map,
> backend_ip,
> > op->nbsp->name,
> backend_port,
> > -  protocol);
> > +  lb->proto);
> >
> >   ovs_assert(mon_info);
> >   sbrec_service_monitor_set_options(
> > @@ -3351,7 +3354,14 @@ ovn_lb_create(struct northd_context *ctx, struct
> hmap *lbs,
> >   if (lb->nl

[ovs-dev] [PATCH v3 1/4] netdev-afxdp: Enable loading XDP program.

From: William Tu 

Now netdev-afxdp always forwards all packets to userspace because
it is using libbpf's default XDP program, see 'xsk_load_xdp_prog'.
There are some cases when users want to keep packets in kernel instead
of sending to userspace, for example, management traffic such as SSH
should be processed in kernel.

The patch enables loading the user-provided XDP program by
  $ovs-vsctl -- set int afxdp-p0 options:xdp-obj=

So users can implement their filtering logic or traffic steering idea
in their XDP program, and rest of the traffic passes to AF_XDP socket
handled by OVS.

Note: kernel in AF_XDP CI test is updated to 5.5 because libbpf from 5.3
does not have newly used APIs like "bpf_program__get_type".

Signed-off-by: William Tu 
Co-Authored-by: Toshiaki Makita 
Signed-off-by: Toshiaki Makita 
---
 .travis.yml   |   2 +-
 Documentation/intro/install/afxdp.rst |  59 ++
 NEWS  |   2 +
 lib/netdev-afxdp.c| 154 --
 lib/netdev-linux-private.h|   3 +
 5 files changed, 212 insertions(+), 8 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 527240a67..26b55a3e6 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -40,7 +40,7 @@ env:
   - TESTSUITE=1 LIBS=-ljemalloc
   - KERNEL_LIST="5.5  4.20 4.19 4.18 4.17 4.16"
   - KERNEL_LIST="4.15 4.14 4.9  4.4  3.19 3.16"
-  - AFXDP=1 KERNEL=5.3
+  - AFXDP=1 KERNEL=5.5
   - M32=1 OPTS="--disable-ssl"
   - DPDK=1 OPTS="--enable-shared"
   - DPDK_SHARED=1
diff --git a/Documentation/intro/install/afxdp.rst 
b/Documentation/intro/install/afxdp.rst
index 3c8f78825..2cd02477f 100644
--- a/Documentation/intro/install/afxdp.rst
+++ b/Documentation/intro/install/afxdp.rst
@@ -300,6 +300,65 @@ Or, use OVS pmd tool::
   ovs-appctl dpif-netdev/pmd-stats-show
 
 
+Loading Custom XDP Program
+--
+By defailt, netdev-afxdp always forwards all packets to userspace because
+it is using libbpf's default XDP program. There are some cases when users
+want to keep packets in kernel instead of sending to userspace, for example,
+management traffic such as SSH should be processed in kernel. This can be
+done by loading the user-provided XDP program::
+
+  ovs-vsctl -- set int afxdp-p0 options:xdp-obj=
+
+So users can implement their filtering logic or traffic steering idea
+in their XDP program, and rest of the traffic passes to AF_XDP socket
+handled by OVS. To set it back to default, use::
+
+  ovs-vsctl remove int afxdp-p0 options xdp-obj
+
+Below is a sample C program compiled under kernel's samples/bpf/.
+
+.. code-block:: c
+
+  #include 
+  #include "bpf_helpers.h"
+
+  #if LINUX_VERSION_CODE < KERNEL_VERSION(5,3,0)
+  /* Kernel version before 5.3 needed an additional map */
+  struct bpf_map_def SEC("maps") qidconf_map = {
+  .type = BPF_MAP_TYPE_ARRAY,
+  .key_size = sizeof(int),
+  .value_size = sizeof(int),
+  .max_entries = 64,
+  };
+  #endif
+
+  /* OVS will associate map 'xsks_map' to xsk socket. */
+  struct bpf_map_def SEC("maps") xsks_map = {
+  .type = BPF_MAP_TYPE_XSKMAP,
+  .key_size = sizeof(int),
+  .value_size = sizeof(int),
+  .max_entries = 32,
+  };
+
+  SEC("xdp_sock")
+  int xdp_sock_prog(struct xdp_md *ctx)
+  {
+  int index = ctx->rx_queue_index;
+
+  /* Customized by user.
+   * For example
+   * 1) filter out all SSH traffic and return XDP_PASS
+   *for kernel to process.
+   * 2) Drop unwanted packet by returning XDP_DROP.
+   */
+
+  /* Rest of packets goes to AF_XDP. */
+  return bpf_redirect_map(&xsks_map, index, 0);
+  }
+  char _license[] SEC("license") = "GPL";
+
+
 Example Script
 --
 
diff --git a/NEWS b/NEWS
index 0116b3ea0..298bd7b1e 100644
--- a/NEWS
+++ b/NEWS
@@ -48,6 +48,8 @@ v2.13.0 - 14 Feb 2020
  generic   - former SKB
  best-effort [default] - new one, chooses the best available from
  3 above modes
+ * New option 'xdp-obj' for loading custom XDP program.  Default uses
+   the libbpf builtin XDP program.
- DPDK:
  * DPDK pdump packet capture support disabled by default. New configure
option '--enable-dpdk-pdump' to enable it.
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index 482400d8d..2b2b811e2 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -21,6 +21,7 @@
 #include "netdev-afxdp.h"
 #include "netdev-afxdp-pool.h"
 
+#include 
 #include 
 #include 
 #include 
@@ -32,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -96,7 +98,8 @@ static struct xsk_socket_info *xsk_configure(int ifindex, int 
xdp_queue_id,
  enum afxdp_mode mode,
  bool use_need_wakeup,
  bool report_socket_failures);
-static void xsk_remove_xdp_program(uint32_t ifindex, enum afxd

[ovs-dev] [PATCH v3 4/4] bpf: Add reference XDP program implementation for netdev-offload-xdp

This adds a reference program, flowtable_afxdp.o, which can be used to
offload flows to XDP through netdev-offload-xdp.
The program can be compiled with --enable-bpf switch.

Signed-off-by: Toshiaki Makita 
---
 Makefile.am   |   9 +-
 acinclude.m4  |  57 
 bpf/.gitignore|   4 +
 bpf/Makefile.am   |  61 +
 bpf/bpf_compiler.h|  25 ++
 bpf/bpf_miniflow.h| 179 +
 bpf/bpf_netlink.h |  62 +
 bpf/bpf_workaround.h  |  28 ++
 bpf/flowtable_afxdp.c | 585 ++
 configure.ac  |   2 +
 10 files changed, 1010 insertions(+), 2 deletions(-)
 create mode 100644 bpf/.gitignore
 create mode 100644 bpf/Makefile.am
 create mode 100644 bpf/bpf_compiler.h
 create mode 100644 bpf/bpf_miniflow.h
 create mode 100644 bpf/bpf_netlink.h
 create mode 100644 bpf/bpf_workaround.h
 create mode 100644 bpf/flowtable_afxdp.c

diff --git a/Makefile.am b/Makefile.am
index b279303d1..f18bfefde 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -8,6 +8,9 @@
 AUTOMAKE_OPTIONS = foreign subdir-objects
 ACLOCAL_AMFLAGS = -I m4
 SUBDIRS = datapath
+if HAVE_BPF
+SUBDIRS += bpf
+endif
 
 AM_CPPFLAGS = $(SSL_CFLAGS)
 AM_LDFLAGS = $(SSL_LDFLAGS)
@@ -198,7 +201,9 @@ ALL_LOCAL += dist-hook-git
 dist-hook-git: distfiles
@if test -e $(srcdir)/.git && (git --version) >/dev/null 2>&1; then \
  (cd datapath && $(MAKE) distfiles); \
- (cat distfiles; sed 's|^|datapath/|' datapath/distfiles) | \
+ (cd bpf && $(MAKE) distfiles); \
+ (cat distfiles; sed 's|^|datapath/|' datapath/distfiles; \
+  sed 's|^|bpf/|' bpf/distfiles) | \
LC_ALL=C sort -u > all-distfiles; \
  (cd $(srcdir) && git ls-files) | grep -v '\.gitignore$$' | \
grep -v '\.gitattributes$$' | \
@@ -234,7 +239,7 @@ config-h-check:
@cd $(srcdir); \
if test -e .git && (git --version) >/dev/null 2>&1 && \
  git --no-pager grep -L '#include ' `git ls-files | grep 
'\.c$$' | \
-   grep -vE 
'^datapath|^lib/sflow|^third-party|^datapath-windows|^python'`; \
+   grep -vE 
'^datapath|^lib/sflow|^third-party|^datapath-windows|^python|^bpf'`; \
then \
  echo "See above for list of violations of the rule that"; \
  echo "every C source file must #include ."; \
diff --git a/acinclude.m4 b/acinclude.m4
index 8847b8145..8d481eb98 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -301,6 +301,63 @@ AC_DEFUN([OVS_CHECK_LINUX_AF_XDP], [
   AM_CONDITIONAL([HAVE_AF_XDP], test "$AF_XDP_ENABLE" = true)
 ])
 
+dnl OVS_CHECK_LINUX_BPF
+dnl
+dnl Check both llvm and libbpf support
+AC_DEFUN([OVS_CHECK_LINUX_BPF], [
+  AC_ARG_ENABLE([bpf],
+[AC_HELP_STRING([--enable-bpf],
+[Compile reference eBPF programs for XDP])],
+[], [enable_bpf=no])
+  AC_MSG_CHECKING([whether BPF is enabled])
+  if test "$enable_bpf" != yes; then
+AC_MSG_RESULT([no])
+BPF_ENABLE=false
+  else
+AC_MSG_RESULT([yes])
+BPF_ENABLE=true
+
+AC_CHECK_PROG(CLANG_CHECK, clang, yes)
+AS_IF([test X"$CLANG_CHECK" != X"yes"],
+  [AC_MSG_ERROR([unable to find clang to compile BPF program])])
+
+AC_CHECK_PROG(LLC_CHECK, llc, yes)
+AS_IF([test X"$LLC_CHECK" != X"yes"],
+  [AC_MSG_ERROR([unable to find llc to compile BPF program])])
+
+AC_CHECK_HEADER([bpf/bpf_helpers.h], [],
+  [AC_MSG_ERROR([unable to find bpf/bpf_helpers.h to compile BPF 
program])],
+  [#include ])
+
+AC_CHECK_HEADER([linux/bpf.h], [],
+  [AC_MSG_ERROR([unable to find linux/bpf.h to compile BPF program])])
+
+AC_MSG_CHECKING([for LLVM bpf target support])
+if llc -march=bpf -mattr=help >/dev/null 2>&1; then
+  AC_MSG_RESULT([yes])
+else
+  AC_MSG_RESULT([no])
+  AC_MSG_ERROR([LLVM does not support bpf target])
+fi
+
+AC_MSG_CHECKING([for BTF DATASEC support])
+AC_LANG_CONFTEST(
+  [AC_LANG_SOURCE([__attribute__((section("_x"), used)) int x;])])
+if clang -g -O2 -S -target bpf -emit-llvm -c conftest.c -o conftest.ll && \
+   llc -march=bpf -filetype=obj -o conftest.o conftest.ll && \
+   readelf -p.BTF conftest.o 2>/dev/null | grep -q -w _x; then
+  AC_MSG_RESULT([yes])
+else
+  AC_MSG_RESULT([no])
+  AC_MSG_ERROR([LLVM does not support BTF DATASEC])
+fi
+
+AC_DEFINE([HAVE_BPF], [1],
+  [Define to 1 if BPF compilation is available and enabled.])
+  fi
+  AM_CONDITIONAL([HAVE_BPF], test "$BPF_ENABLE" = true)
+])
+
 dnl OVS_CHECK_DPDK
 dnl
 dnl Configure DPDK source tree
diff --git a/bpf/.gitignore b/bpf/.gitignore
new file mode 100644
index 0..ab0f2d9e4
--- /dev/null
+++ b/bpf/.gitignore
@@ -0,0 +1,4 @@
+*.ll
+/distfiles
+/Makefile
+/Makefile.in
diff --git a/bpf/Makefile.am b/bpf/Makefile.am
new file mode 100644
index 0..0335e5fcc
--- /dev/null
+++ b/bpf/Makefile.am
@@ -0,0 +1,61 @@
+AUTOMAKE_OPTIONS = foreign
+
+EXTRA_DIST = flowtable_af

[ovs-dev] [PATCH v3 2/4] netdev-offload: Add "offload-driver" other_config to specify offload driver

The following commit will introduce another offload driver using XDP.
When using afxdp netdev, both of TC and XDP will be supported, so let's
add an other_config to specify which offload driver is preferable.
When not specified and multiple offload drivers can be used, TC will be
used if netdev supports it.

Signed-off-by: Toshiaki Makita 
---
 lib/netdev-offload.c | 37 +
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
index ab97a292e..669513646 100644
--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -60,6 +60,9 @@ VLOG_DEFINE_THIS_MODULE(netdev_offload);
 
 static bool netdev_flow_api_enabled = false;
 
+#define FLOW_API_DRIVER_DEFAULT "linux_tc"
+static const char *netdev_flow_api_driver = NULL;
+
 /* Protects 'netdev_flow_apis'.  */
 static struct ovs_mutex netdev_flow_api_provider_mutex = OVS_MUTEX_INITIALIZER;
 
@@ -171,18 +174,30 @@ netdev_flow_api_equals(const struct netdev *netdev1,
 static int
 netdev_assign_flow_api(struct netdev *netdev)
 {
-struct netdev_registered_flow_api *rfa;
+struct netdev_registered_flow_api *rfa, *current_rfa = NULL;
 
 CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
+if (netdev_flow_api_driver &&
+strcmp(netdev_flow_api_driver, rfa->flow_api->type)) {
+continue;
+}
 if (!rfa->flow_api->init_flow_api(netdev)) {
-ovs_refcount_ref(&rfa->refcnt);
-ovsrcu_set(&netdev->flow_api, rfa->flow_api);
-VLOG_INFO("%s: Assigned flow API '%s'.",
-  netdev_get_name(netdev), rfa->flow_api->type);
-return 0;
+if (!current_rfa ||
+(!netdev_flow_api_driver &&
+ !strcmp(FLOW_API_DRIVER_DEFAULT, rfa->flow_api->type))) {
+current_rfa = rfa;
+}
+} else {
+VLOG_DBG("%s: flow API '%s' is not suitable.",
+ netdev_get_name(netdev), rfa->flow_api->type);
 }
-VLOG_DBG("%s: flow API '%s' is not suitable.",
- netdev_get_name(netdev), rfa->flow_api->type);
+}
+if (current_rfa) {
+ovs_refcount_ref(¤t_rfa->refcnt);
+ovsrcu_set(&netdev->flow_api, current_rfa->flow_api);
+VLOG_INFO("%s: Assigned flow API '%s'.",
+  netdev_get_name(netdev), current_rfa->flow_api->type);
+return 0;
 }
 VLOG_INFO("%s: No suitable flow API found.", netdev_get_name(netdev));
 
@@ -649,6 +664,8 @@ netdev_set_flow_api_enabled(const struct smap 
*ovs_other_config)
 static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
 
 if (ovsthread_once_start(&once)) {
+const char *offload_driver;
+
 netdev_flow_api_enabled = true;
 
 VLOG_INFO("netdev: Flow API Enabled");
@@ -662,6 +679,10 @@ netdev_set_flow_api_enabled(const struct smap 
*ovs_other_config)
 netdev_offload_rebalance_policy = true;
 }
 
+offload_driver = smap_get_def(ovs_other_config, "offload-driver",
+  NULL);
+netdev_flow_api_driver = nullable_xstrdup(offload_driver);
+
 netdev_ports_flow_init();
 
 ovsthread_once_done(&once);
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v3 0/4] XDP offload using flow API provider

This patch adds an XDP-based flow cache using the OVS netdev-offload
flow API provider.  When an OVS device with XDP offload enabled,
packets first are processed in the XDP flow cache (with parse, and
table lookup implemented in eBPF) and if hits, the action processing
are also done in the context of XDP, which has the minimum overhead.

This provider is based on top of William's recently posted patch for
custom XDP load.  When a custom XDP is loaded, the provider detects if
the program supports classifier, and if supported it starts offloading
flows to the XDP program.

The patches are derived from xdp_flow[1], which is a mechanism similar to
this but implemented in kernel.


* Motivation

While userspace datapath using netdev-afxdp or netdev-dpdk shows good
performance, there are use cases where packets better to be processed in
kernel, for example, TCP/IP connections, or container to container
connections.  Current solution is to use tap device or af_packet with
extra kernel-to/from-userspace overhead.  But with XDP, a better solution
is to steer packets earlier in the XDP program, and decides to send to
userspace datapath or stay in kernel.

One problem with current netdev-afxdp is that it forwards all packets to
userspace, The first patch from William (netdev-afxdp: Enable loading XDP
program.) only provides the interface to load XDP program, howerver users
usually don't know how to write their own XDP program.

XDP also supports HW-offload so it may be possible to offload flows to
HW through this provider in the future, although not currently.
The reason is that map-in-map is required for our program to support
classifier with subtables in XDP, but map-in-map is not offloadable.
If map-in-map becomes offloadable, HW-offload of our program will also
be doable.


* How to use

1. Install clang/llvm >= 9, libbpf >= 0.0.6 (included in kernel 5.5), and
   kernel >= 5.3.

2. make with --enable-afxdp --enable-bpf
--enable-bpf will generate XDP program "bpf/flowtable_afxdp.o".  Note that
the BPF object will not be installed anywhere by "make install" at this point. 

3. Load custom XDP program
E.g.
$ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
  options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
$ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
  options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"

4. Enable XDP_REDIRECT
If you use veth devices, make sure to load some (possibly dummy) programs
on the peers of veth devices.
Some HW NIC drivers require as many queues as cores on its system. Tweak
queues using "ethtool -L".

5. Enable hw-offload 
$ ovs-vsctl set Open_vSwitch . other_config:offload-driver=linux_xdp
$ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
This will starts offloading flows to the XDP program.

You should be able to see some maps installed, including "debug_stats".
$ bpftool map

If packets are successfully redirected by the XDP program,
debug_stats[2] will be counted.
$ bpftool map dump id 

Currently only very limited keys and output actions are supported.
For example NORMAL action entry and IP based matching work with current
key support. VLAN actions used by port tag/trunks are also supported.


* Performance

Tested 2 cases. 1) i40e to veth, 2) i40e to i40e.
Test 1 Measured drop rate at veth interface with redirect action from
physical interface (i40e 25G NIC, XXV 710) to veth. The CPU is Xeon
Silver 4114 (2.20 GHz).
   XDP_DROP
+--+  +---++---+
 pktgen -- wire --> | eth0 | -- NORMAL ACTION --> | veth0 || veth2 |
+--+  +---++---+

Test 2 uses i40e instead of veth, and measured tx packet rate at output
device.

Single-flow performance test results:

1) i40e-veth

  a) no-zerocopy in i40e

- xdp   3.7 Mpps
- afxdp 820 kpps

  b) zerocopy in i40e (veth does not have zc)

- xdp   1.8 Mpps
- afxdp 800 Kpps

2) i40e-i40e

  a) no-zerocopy

- xdp   3.0 Mpps
- afxdp 1.1 Mpps

  b) zerocopy

- xdp   1.7 Mpps
- afxdp 4.0 Mpps

** xdp is better when zc is disabled. The reason of poor performance on zc
   is that xdp_frame requires packet memory allocation and memcpy on
   XDP_REDIRECT to other devices iff zc is enabled.

** afxdp with zc is better than xdp without zc, but afxdp is using 2 cores
   in this case, one is pmd and the other is softirq. When pmd and softirq
   were running on the same core, the performance was extremely poor as
   pmd consumes cpus.
   When offloading to xdp, xdp only uses softirq while pmd is still
   consuming 100% cpu.  This means we need probably only one pmd for xdp
   even when we want to use more cores for multi-flow.
   Testing afxdp-nonpmd is TBD.


This patch set is based on top of commit dda80837 ("AUTHORS: Add Sriharsha
Basavapatna.").

[1] https://lwn.net/Articles/802653/

v3:
- Use

[ovs-dev] [PATCH v3 3/4] netdev-offload: Add xdp flow api provider

This provider offloads classifier to software XDP.

It works only when a custom XDP object is loaded by afxdp netdev.
The BPF program needs to implement classifier with array-of-maps for
subtable hashmaps and arraymap for subtable masks. The flow api
provider detects classifier support in the custom XDP program when
loading it.

The requirements for the BPF program is as below:

- A struct named "meta_info" in ".ovs_meta" section
  inform vswitchd of supported keys, actions, and the max number of
  actions.

- A map named "subtbl_masks"
  An array which implements a list. The entry contains struct
  xdp_subtables_mask provided by lib/netdev-offload-xdp.h followed by
  miniflow buf of any length for the mask.

- A map named "subtbl_masks_hd"
  A one entry int array which contains the first entry index of the list
  implemented by subtbl_masks.

- A map named "flow_table"
  An array-of-maps. Each entry points to subtable hash-map instantiated
  with the following "subtbl_template".
  The entry of each index of subtbl_masks has miniflow mask of subtable
  in the corresponding index of flow_table.

- A map named "subtbl_template"
  A template for subtable, used for entries of "flow_table".
  The key must be miniflow buf (without leading map).

- A map named "output_map"
  A devmap. Each entry represents odp port.

- A map named "xsks_map"
  A xskmap. Used for upcall.

For more details, refer to the reference BPF program in next commit.

In the future it may be possible to offload classifier to SmartNIC
through XDP, but currently it's not because map-in-map is not supported
by XDP hw-offload.

Signed-off-by: Toshiaki Makita 
---
 lib/automake.mk   |6 +-
 lib/bpf-util.c|   38 +
 lib/bpf-util.h|   22 +
 lib/netdev-afxdp.c|  205 -
 lib/netdev-afxdp.h|3 +
 lib/netdev-linux-private.h|2 +
 lib/netdev-offload-provider.h |6 +
 lib/netdev-offload-xdp.c  | 1315 +
 lib/netdev-offload-xdp.h  |   49 ++
 lib/netdev-offload.c  |6 +
 10 files changed, 1650 insertions(+), 2 deletions(-)
 create mode 100644 lib/bpf-util.c
 create mode 100644 lib/bpf-util.h
 create mode 100644 lib/netdev-offload-xdp.c
 create mode 100644 lib/netdev-offload-xdp.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 86940ccd2..1fa1371f3 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -421,10 +421,14 @@ endif
 
 if HAVE_AF_XDP
 lib_libopenvswitch_la_SOURCES += \
+   lib/bpf-util.c \
+   lib/bpf-util.h \
lib/netdev-afxdp-pool.c \
lib/netdev-afxdp-pool.h \
lib/netdev-afxdp.c \
-   lib/netdev-afxdp.h
+   lib/netdev-afxdp.h \
+   lib/netdev-offload-xdp.c \
+   lib/netdev-offload-xdp.h
 endif
 
 if DPDK_NETDEV
diff --git a/lib/bpf-util.c b/lib/bpf-util.c
new file mode 100644
index 0..324cfbe1d
--- /dev/null
+++ b/lib/bpf-util.c
@@ -0,0 +1,38 @@
+/*
+ * Copyright (c) 2020 NTT Corp.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+
+#include "bpf-util.h"
+
+#include 
+
+#include "ovs-thread.h"
+
+DEFINE_STATIC_PER_THREAD_DATA(struct { char s[128]; },
+  libbpf_strerror_buffer,
+  { "" });
+
+const char *
+ovs_libbpf_strerror(int err)
+{
+enum { BUFSIZE = sizeof libbpf_strerror_buffer_get()->s };
+char *buf = libbpf_strerror_buffer_get()->s;
+
+libbpf_strerror(err, buf, BUFSIZE);
+
+return buf;
+}
diff --git a/lib/bpf-util.h b/lib/bpf-util.h
new file mode 100644
index 0..6346935b3
--- /dev/null
+++ b/lib/bpf-util.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (c) 2020 NTT Corp.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef BPF_UTIL_H
+#define BPF_UTIL_H 1
+
+const char *ovs_libbpf_strerror(int err);
+
+#endif /* bpf-util.h */
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index 2b2b811e2..9229ae1b0 100644
--- a/lib/net

Re: [ovs-dev] [PATCH ovn] Split SB Port_Group per datapath.

On 6/29/20 9:39 AM, Numan Siddique wrote:
> 
> 
> On Fri, Jun 26, 2020 at 6:50 PM Dumitru Ceara  > wrote:
> 
> In order to avoid ovn-controller reinstalling all logical flows that
> refer a port_group when some ports are added/removed from the port group
> we now change the way ovn-northd populates the Southbound DB Port_Group
> table.
> 
> Instead of copying NB.Port_Group.name  to
> SB.Port_Group.name  we now
> create one SB.Port_Group record for every datapath that has ports
> referenced by the NB.Port_Group.ports field. In order to maintain the
> SB.Port_Group.name  uniqueness
> constraint, ovn-northd populates the field
> with the value: _ >.
> 
> In specific scenarios we see significant improvements in time to
> install/remove all logical flows to/from OVS. One such scenario, in the
> BZ referenced below has:
> 
> $ ovn-nbctl acl-list pg
>   from-lport  1001 (inport == @pg && ip) drop
>     to-lport  1001 (outport == @pg && ip) drop
> 
> Then, incrementally, creates new logical ports on different logical
> switches, binds them to OVS interfaces and adds them to the port_group.
> 
> Measuring the total time to perform the above steps 500 times (for 500
> new ports attached to 100 switches, 5 per switch) on a test setup
> we observe an improvement of 50% in time it takes to install all
> openflow rules when port_groups are split in the SB database.
> 
> Suggested-by: Numan Siddique mailto:num...@ovn.org>>
> Reported-by: Venkata Anil  >
> Reported-at: https://bugzilla.redhat.com/1818128
> Signed-off-by: Dumitru Ceara  >
> 
> 
> Thanks Dumitru for this patch.
> 
> Thi patch has turned out to be much simpler than I thought.
> 
> Can you please add a few test cases in ovn-northd.at
>  to make sure
> that the SB Port_Group rows are created as expected when
> a NB Port_Group is created and it is referenced by multiple logical
> switches.
> 
> If you could cover cases like, delete a logical switch and make sure
> that the PG
> for that logical switch in SB DB is deleted etc.
> 
> Thanks
> Numan
>  

Hi Numan,

Thanks for the review. I sent a v2 with additional tests in ovn-northd.at:

https://patchwork.ozlabs.org/project/openvswitch/patch/1593444281-19811-1-git-send-email-dce...@redhat.com/

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn v2] Split SB Port_Group per datapath.

In order to avoid ovn-controller reinstalling all logical flows that
refer a port_group when some ports are added/removed from the port group
we now change the way ovn-northd populates the Southbound DB Port_Group
table.

Instead of copying NB.Port_Group.name to SB.Port_Group.name we now
create one SB.Port_Group record for every datapath that has ports
referenced by the NB.Port_Group.ports field. In order to maintain the
SB.Port_Group.name uniqueness constraint, ovn-northd populates the field
with the value: _.

In specific scenarios we see significant improvements in time to
install/remove all logical flows to/from OVS. One such scenario, in the
BZ referenced below has:

$ ovn-nbctl acl-list pg
  from-lport  1001 (inport == @pg && ip) drop
to-lport  1001 (outport == @pg && ip) drop

Then, incrementally, creates new logical ports on different logical
switches, binds them to OVS interfaces and adds them to the port_group.

Measuring the total time to perform the above steps 500 times (for 500
new ports attached to 100 switches, 5 per switch) on a test setup
we observe an improvement of 50% in time it takes to install all
openflow rules when port_groups are split in the SB database.

Suggested-by: Numan Siddique 
Reported-by: Venkata Anil 
Reported-at: https://bugzilla.redhat.com/1818128
Signed-off-by: Dumitru Ceara 

---
v2:
- add tests in ovn-northd.at as suggested by Numan.
---
 TODO.rst  |  8 ++
 controller/lflow.c|  4 ++-
 include/ovn/expr.h|  4 ++-
 lib/actions.c |  2 +-
 lib/expr.c| 48 ---
 lib/ovn-util.h|  7 +
 northd/ovn-northd.c   | 79 ++-
 tests/ovn-northd.at   | 79 +++
 tests/test-ovn.c  | 10 +++
 utilities/ovn-trace.c |  3 +-
 10 files changed, 198 insertions(+), 46 deletions(-)

diff --git a/TODO.rst b/TODO.rst
index 6df6b84..4b2fc2d 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -140,3 +140,11 @@ OVN To-do List
 * ovn-controller: Stop copying the local OVS configuration into the
   Chassis external_ids column (same for the "is-remote" configuration from
   ovn-ic) a few releases after the 20.06 version (21.06 maybe ?).
+
+* ovn-controller: Remove backwards compatibility for Southbound DB Port_Group
+  names in expr.c a few releases after the 20.09 version. Right now
+  ovn-controller maintains backwards compatibility when connecting to a
+  SB database that doesn't store Port_Group.name as
+  . This causes an additional
+  hashtable lookup in parse_port_group() which can be avoided when we are sure
+  that the Southbound DB uses the new format.
diff --git a/controller/lflow.c b/controller/lflow.c
index c850a0d..5454361 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -566,7 +566,9 @@ consider_logical_flow(const struct sbrec_logical_flow 
*lflow,
 struct sset port_groups_ref = SSET_INITIALIZER(&port_groups_ref);
 expr = expr_parse_string(lflow->match, &symtab, l_ctx_in->addr_sets,
  l_ctx_in->port_groups,
- &addr_sets_ref, &port_groups_ref, &error);
+ &addr_sets_ref, &port_groups_ref,
+ lflow->logical_datapath->tunnel_key,
+ &error);
 const char *addr_set_name;
 SSET_FOR_EACH (addr_set_name, &addr_sets_ref) {
 lflow_resource_add(l_ctx_out->lfrr, REF_TYPE_ADDRSET, addr_set_name,
diff --git a/include/ovn/expr.h b/include/ovn/expr.h
index 21bf51c..9838251 100644
--- a/include/ovn/expr.h
+++ b/include/ovn/expr.h
@@ -391,12 +391,14 @@ struct expr *expr_parse(struct lexer *, const struct 
shash *symtab,
 const struct shash *addr_sets,
 const struct shash *port_groups,
 struct sset *addr_sets_ref,
-struct sset *port_groups_ref);
+struct sset *port_groups_ref,
+int64_t dp_id);
 struct expr *expr_parse_string(const char *, const struct shash *symtab,
const struct shash *addr_sets,
const struct shash *port_groups,
struct sset *addr_sets_ref,
struct sset *port_groups_ref,
+   int64_t dp_id,
char **errorp);
 
 struct expr *expr_clone(struct expr *);
diff --git a/lib/actions.c b/lib/actions.c
index ee7c825..fc6e191 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -242,7 +242,7 @@ add_prerequisite(struct action_context *ctx, const char 
*prerequisite)
 char *error;
 
 expr = expr_parse_string(prerequisite, ctx->pp->symtab, NULL, NULL,
- NULL, NULL, &error);
+ NULL, NULL, 0, &error);
 ovs_assert(!error);
 ctx->prereqs = expr_combine(EXPR_T_AND, ctx->prereqs,

Re: [ovs-dev] [PATCH ovn] Fix selection fields for UDP and SCTP load balancers.

2020-06-29 Thread Mark Michelson


On 6/22/20 1:41 PM, num...@ovn.org wrote:

From: Numan Siddique 

The commit 5af304e7478a ("Support selection fields in load balancer.")
didn't handle the selection fields for UDP and SCTP protocol.
If CMS has set the selection fields - tp_src and tp_dst for UDP or SCTP
load balancers, ovn-northd was adding lflows as
ct_lb(backends=, hash_fields(tp_src,tp_dst))
instead of ct_lb(backends=, hash_fields(udp_src,udp_dst)) and
ct_lb(backends=, hash_fields(sctp_src,sctp_dst)) respectively.

Because of this, select group flows were encoded with tcp_src and tcp_dst
hash fields.

This patch fixes this issue and refactors the load balancer code in ovn-northd
to better handle the protocols.

Test cases are now added for UDP and SCTP protocols.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1846189
Signed-off-by: Numan Siddique 
---
  northd/ovn-northd.c |  71 --
  tests/ovn.at| 179 ++--
  tests/system-ovn.at |  23 ++
  3 files changed, 227 insertions(+), 46 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 40aeb0a58..7887d0d2f 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -3121,6 +3121,7 @@ struct ovn_lb {
  
  const struct nbrec_load_balancer *nlb; /* May be NULL. */

  char *selection_fields;
+char *proto;
  struct lb_vip *vips;
  size_t n_vips;
  };
@@ -3218,6 +3219,12 @@ ovn_lb_create(struct northd_context *ctx, struct hmap 
*lbs,
  struct smap_node *node;
  size_t n_vips = 0;
  
+if (!nbrec_lb->protocol || !nbrec_lb->protocol[0]) {

+lb->proto = xstrdup("tcp");
+} else {
+lb->proto = xstrdup(nbrec_lb->protocol);
+}
+


Hm, I don't think this is quite right. Load balancers that do not 
specify ports are L4-agnostic. This patch does not change the behavior, 
but it sets lb->proto to "tcp" even if no port is specified. This could 
result in confusion and potential future buggy code.



  SMAP_FOR_EACH (node, &nbrec_lb->vips) {
  char *vip;
  uint16_t port;
@@ -3234,7 +3241,7 @@ ovn_lb_create(struct northd_context *ctx, struct hmap 
*lbs,
  lb->vips[n_vips].backend_ips = xstrdup(node->value);
  
  struct nbrec_load_balancer_health_check *lb_health_check = NULL;

-if (nbrec_lb->protocol && !strcmp(nbrec_lb->protocol, "sctp")) {
+if (!strcmp(lb->proto, "sctp")) {
  if (nbrec_lb->n_health_check > 0) {
  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
  VLOG_WARN_RL(&rl,
@@ -3309,15 +3316,11 @@ ovn_lb_create(struct northd_context *ctx, struct hmap 
*lbs,
  lb->vips[n_vips].backends[i].svc_mon_src_ip = svc_mon_src_ip;
  
  if (lb_health_check && op && svc_mon_src_ip) {

-const char *protocol = nbrec_lb->protocol;
-if (!protocol || !protocol[0]) {
-protocol = "tcp";
-}
  lb->vips[n_vips].backends[i].health_check = true;
  struct service_monitor_info *mon_info =
  create_or_get_service_mon(ctx, monitor_map, backend_ip,
op->nbsp->name, backend_port,
-  protocol);
+  lb->proto);
  
  ovs_assert(mon_info);

  sbrec_service_monitor_set_options(
@@ -3351,7 +3354,14 @@ ovn_lb_create(struct northd_context *ctx, struct hmap 
*lbs,
  if (lb->nlb->n_selection_fields) {
  struct ds sel_fields = DS_EMPTY_INITIALIZER;
  for (size_t i = 0; i < lb->nlb->n_selection_fields; i++) {
-ds_put_format(&sel_fields, "%s,", lb->nlb->selection_fields[i]);
+char *field = lb->nlb->selection_fields[i];
+if (!strcmp(field, "tp_src")) {
+ds_put_format(&sel_fields, "%s_src,", lb->proto);
+} else if (!strcmp(field, "tp_dst")) {
+ds_put_format(&sel_fields, "%s_dst,", lb->proto);
+} else {
+ds_put_format(&sel_fields, "%s,", field);
+}
  }
  ds_chomp(&sel_fields, ',');
  lb->selection_fields = ds_steal_cstr(&sel_fields);
@@ -3374,6 +3384,7 @@ ovn_lb_destroy(struct ovn_lb *lb)
  
  free(lb->vips[i].backends);

  }
+free(lb->proto);
  free(lb->vips);
  free(lb->selection_fields);
  }
@@ -4820,7 +4831,7 @@ ls_has_dns_records(const struct nbrec_logical_switch *nbs)
  
  static void

  build_empty_lb_event_flow(struct ovn_datapath *od, struct hmap *lflows,
-  struct lb_vip *lb_vip,
+  struct lb_vip *lb_vip, const char *proto,
struct nbrec_load_balancer *lb,
int pl, struct shash *meter_groups)
  {
@@ -4837,11 +4848,11 @@ build_empty_lb_event_flow(struct ovn_datapath *od, 
struct hmap *lflo

Re: [ovs-dev] [PATCH V3 00/12] netdev datapath offload: Support IPv6 and VXLAN encap

I have rebased, changed the order of commits to do testpmd format first
and addressed the comments.

I also added the acked-by of Harsha (though a bit changed since v2).
Thanks again.

As far as I see the only open issue is if to drop the comment about
XL710 or not.

Please address it and if there are any other comments before I'll post v4.

Thanks,

Eli

On 6/25/2020 8:37 AM, Eli Britstein wrote:

On 6/24/2020 5:28 PM, Ilya Maximets wrote:

On 6/24/20 4:16 PM, Ilya Maximets wrote:

On 6/24/20 12:47 PM, Eli Britstein wrote:

On 6/24/2020 1:37 PM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:
This patch set includes additional offloads - IPv6 and VXLAN
encap, and

enhanced logging to increase debugability.

Patches #1-#8: Add support for offloads of IPv6 patterns, partial
TCP/UDP ports, set IPv6 and encap actions
(clone/output).
Patch #9: Bug fix of partial offloads.
Patches #10-#11: Enhance log prints for debugability.
Patch #12: Fix Ethernet matching for type only.

v2-v1:
- Removed redundant out label.
- Added a patch to fix dl_type match only.
v3-v2:
- Rebased, and more elaboration in #7 commit message.

Hi.

I noticed that you didn't include Acked-by tags from Harsha.
Was it intentional? i.e. if there was any significant changes during
rebase process?

I'd like to have these patches acked by Harsha before applying, so
my question if tags from v2 are valid for v3 and can I use them?
Harsha? Eli?
No. It was by mistake. Sorry. The rebase had only one minor
conflict in

[PATCH V3 06/12] netdev-offload: Use dpif type instead of class.

The conflict was due to previous commit [1].

As I will have to rebase again for sure to add testpmd prints for
VLANs, as [2] was merged, will it be OK I'll add the Acked-by then?

Sure. I think it should be OK for patches without any functional
changes. Current patches doesn't apply starting from the second one
due to merged patch [2], so they need a rebase anyway.
OK. No functional changes from v2 (or from v1, except the additional
dl_type patch). Could you please review the rest of the series and
gather comments before to apply and not just rebase? Thanks.

Regarding the testpmd related patch itself, I'm not sure about it.
It might be hard to support in terms that we will need to track
testpmd changes and it's also hard to tell for which testpmd version
these cmdlines should be applicable.
Testpmd format doesn't change frequently, if any. Also, a specific OVS
version implies specific DPDK version. In the case of changed format,
we'll need to fix.

Another point is that it's actually printing of the same information
twice. And also lots of extra printing code.
Maybe, if you need these cmdlines, it's better to write a helper script
to convert existing output to testpmd cmdlines? Or even parse
offloaded

and non-offloaded flows from the output of dump-flows.

Anoher option is to just drop current format entirely and use format
of testpmd cmdlines instead. Current format is not standardized and
doesn't comply with OVS flow dumps format anyway.

What do you think?

I kept the old format for backward compatibility, as well as some
testpmd require more than one command. For example, "vxlan_encap" in
"flow create", but it won't work properly in real testpmd if there was
no "set vxlan" to define the encap header.

I'll try to do all in testpmd format and drop the current format.

[1] 191536574e3b ("netdev-offload: Implement terse dump support").

[2] 029273855939 ("netdev-offload-dpdk: Support offload of VLAN
PUSH/POP actions.")

Best regards, Ilya Maximets.

Travis:
v1:
https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F688413350&data=02%7C01%7Celibr%40mellanox.com%7Cdaa680f281e44d447bc708d8184ad3d9%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637286056953162106&sdata=6iuCat3IMKBtQAMILf4uam1BioABbJ%2BG2estCEaelXg%3D&reserved=0
v2:
https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F691375847&data=02%7C01%7Celibr%40mellanox.com%7Cdaa680f281e44d447bc708d8184ad3d9%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637286056953162106&sdata=ybvwkH1KMHBmUZewFiM1wnz0mPlFUNu6%2FATfLTiaBTs%3D&reserved=0
v3:
https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.org%2Fgithub%2Felibritstein%2FOVS%2Fbuilds%2F700534550&data=02%7C01%7Celibr%40mellanox.com%7Cdaa680f281e44d447bc708d8184ad3d9%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637286056953162106&sdata=LBRQrqZyYceBOuJS4A4ZbbpvLxIhz0dnsE2S46Mn5JY%3D&reserved=0

Eli Britstein (10):
netdev-offload-dpdk: Remove pre-validate of patterns function
netdev-offload-dpdk: Add IPv6 pattern matching
netdev-offload-dpdk: Support offload of set IPv6 actions
netdev-offload-dpdk: Support partial TCP/UDP port matching
netdev-offload-dpdk: Support offload of clone tnl_push

[ovs-dev] [PATCH v6] Bareudp Tunnel Support

2020-06-29 Thread Martin Varghese

From: Martin Varghese 

There are various L3 encapsulation standards using UDP being discussed to
leverage the UDP based load balancing capability of different networks.
MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.

The Bareudp tunnel provides a generic L3 encapsulation support for
tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
tunnel.

An example to create bareudp device to tunnel MPLS traffic is
given

$ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
 type=bareudp options:remote_ip=2.1.1.3
 options:local_ip=2.1.1.2 \
 options:payload_type=0x8847 options:dst_port=6635 \
 options:packet_type="legacy_l3" \
 ofport_request=$bareudp_egress_port

The bareudp device supports special handling for MPLS & IP as
they can have multiple ethertypes. MPLS procotcol can have ethertypes
ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).

The bareudp device to tunnel L3 traffic with multiple ethertypes
(MPLS & IP) can be created by passing the L3 protocol name as string in
the field payload_type. An example to create bareudp device to tunnel
MPLS unicast & multicast traffic is given below.

$ ovs-vsctl add-port  br_mpls udp_port -- set interface
udp_port \
type=bareudp options:remote_ip=2.1.1.3
options:local_ip=2.1.1.2 \
options:payload_type=mpls options:dst_port=6635 \
options:packet_type="legacy_l3"

Signed-off-by: Martin Varghese 
---
Changes in v2:
- Removed vport-bareudp module.

Changes in v3:
- Added net-next upstream commit id and message to commit message.

Changes in v4:
- Removed kernel datapath changes.

Changes in v5:
- Fixed release notes errors.
- Fixed coding errors in dpif-nelink-rtnl.c.

Changes in v6:
- Added code to enable rx metadata collection in the kernel device.
- Added version history.

 Documentation/automake.mk |  1 +
 Documentation/faq/bareudp.rst | 62 +++
 Documentation/faq/index.rst   |  1 +
 Documentation/faq/releases.rst|  1 +
 NEWS  |  4 ++
 datapath/linux/compat/include/linux/openvswitch.h | 10 
 lib/dpif-netlink-rtnl.c   | 55 
 lib/dpif-netlink.c|  5 ++
 lib/netdev-vport.c| 27 +-
 lib/netdev.h  |  1 +
 ofproto/ofproto-dpif-xlate.c  |  1 +
 tests/system-layer3-tunnels.at| 47 +
 12 files changed, 213 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/faq/bareudp.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index f85c432..ea3475f 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -88,6 +88,7 @@ DOC_SOURCE = \
Documentation/faq/terminology.rst \
Documentation/faq/vlan.rst \
Documentation/faq/vxlan.rst \
+   Documentation/faq/bareudp.rst \
Documentation/internals/index.rst \
Documentation/internals/authors.rst \
Documentation/internals/bugs.rst \
diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst
new file mode 100644
index 000..9266daa
--- /dev/null
+++ b/Documentation/faq/bareudp.rst
@@ -0,0 +1,62 @@
+..
+  Licensed under the Apache License, Version 2.0 (the "License"); you may
+  not use this file except in compliance with the License. You may obtain
+  a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+  License for the specific language governing permissions and limitations
+  under the License.
+
+  Convention for heading levels in Open vSwitch documentation:
+
+  ===  Heading 0 (reserved for the title in a document)
+  ---  Heading 1
+  ~~~  Heading 2
+  +++  Heading 3
+  '''  Heading 4
+
+  Avoid deeper levels because they do not render well.
+
+===
+Bareudp
+===
+
+Q: What is Bareudp?
+
+A: There are various L3 encapsulation standards using UDP being discussed
+   to leverage the UDP based load balancing capability of different
+   networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among
+   them.
+
+   The Bareudp tunnel provides a generic L3 encapsulation support for
+   tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
+   tunnel.
+
+   An example to create bareudp device to tunnel MPLS traffic is given
+

Re: [ovs-dev] [PATCH V3 12/12] netdev-offload-dpdk: Fix Ethernet matching for type only




On 6/29/2020 3:38 AM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:

For OVS rule of the form "eth type is 0x1234 / end", rule is offloaded
in the form of "eth / end", which is incorrect. Fix it.

Fixes: e8a2b5bf92bb ("netdev-dpdk: implement flow offload with rte flow")
Signed-off-by: Eli Britstein 
Reviewed-by: Roni Bar Yanai 
---
  lib/netdev-offload-dpdk.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index e8b8d6464..e68b6549c 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -860,7 +860,8 @@ parse_flow_match(struct flow_patterns *patterns,
  
  /* Eth */

  if (!eth_addr_is_zero(match->wc.masks.dl_src) ||
-!eth_addr_is_zero(match->wc.masks.dl_dst)) {
+!eth_addr_is_zero(match->wc.masks.dl_dst) ||
+match->wc.masks.dl_type) {

While calling from the userspace datapath, we always have a match on dl_type.
This means that it's not possible to hit the 'else' condition.
I'm worried about the usecase described there.  It's hard to track changes
in i40e_flow.c.  Can someone test this with XL710?


Yes, that's true. I think we should have it for consistency with the 
rest of the code, and also just in case.


Regarding the XL710 comment - I think it should not have been merged 
into OVS in the first place. If it is an issue with XL710, the relevant 
PMD should handle it, and not OVS. I just kept it in case I miss 
something here. Do you think it's OK to remove it?





  struct rte_flow_item_eth *spec, *mask;
  
  spec = xzalloc(sizeof *spec);

@@ -876,6 +877,7 @@ parse_flow_match(struct flow_patterns *patterns,
  
  memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks->dl_dst);

  memset(&consumed_masks->dl_src, 0, sizeof consumed_masks->dl_src);
+consumed_masks->dl_type = 0;
  
  add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);

  } else {
@@ -888,7 +890,6 @@ parse_flow_match(struct flow_patterns *patterns,
   */
  add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL);

Actually looking at i40e_flow.c, it seems like this pattern will be rejected.
But I'm not sure.


  }
-consumed_masks->dl_type = 0;
  
  /* VLAN */

  if (match->wc.masks.vlans[0].tci && match->flow.vlans[0].tci) {


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH V3 09/12] dpif-netdev: Don't use zero flow mark




On 6/29/2020 2:47 AM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:

Zero flow mark is used to indicate the HW to remove the mark. A packet
marked with zero mark is received in SW without a mark at all, so it
cannot be used as a valid mark. Change the pool range to fix it.

Fixes: 241bad15d99a ("dpif-netdev: associate flow with a mark id")
Signed-off-by: Eli Britstein 
Reviewed-by: Roni Bar Yanai 
---
  lib/dpif-netdev.c|  4 ++--
  tests/dpif-netdev.at | 18 +-
  2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 87068803e..57565802a 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2150,7 +2150,7 @@ dp_netdev_pmd_find_dpcls(struct dp_netdev_pmd_thread *pmd,
  }
  
  #define MAX_FLOW_MARK   (UINT32_MAX - 1)

-#define INVALID_FLOW_MARK   (UINT32_MAX)
+#define INVALID_FLOW_MARK   0
  
  struct megaflow_to_mark_data {

  const struct cmap_node node;
@@ -2176,7 +2176,7 @@ flow_mark_alloc(void)
  
  if (!flow_mark.pool) {

  /* Haven't initiated yet, do it here */

It might be good to add information to the comment why zero is not used here.

I thought in the commit message is enough. I'll add a comment here too.



-flow_mark.pool = id_pool_create(0, MAX_FLOW_MARK);
+flow_mark.pool = id_pool_create(1, MAX_FLOW_MARK);
  }
  
  if (id_pool_alloc_id(flow_mark.pool, &mark)) {



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH V3 08/12] netdev-offload-dpdk: Support tnl/push using vxlan encap attribute




On 6/29/2020 2:45 AM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:

For DPDK, there is the RAW_ENCAP attribute which gets raw buffer of the
encapsulation header. For specific protocol, such as vxlan, there is a
more specific attribute, VXLAN_ENCAP, which gets the parsed fields of
the outer header. In case tunnel type is vxlan, parse the header
and use the specific attribute, with fallback to RAW_ENCAP.

Signed-off-by: Eli Britstein 
Reviewed-by: Roni Bar Yanai 
---
  lib/netdev-offload-dpdk.c | 123 +-
  1 file changed, 121 insertions(+), 2 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 16df1fedd..59e3655ce 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -359,6 +359,25 @@ dump_flow_pattern(struct ds *s, const struct rte_flow_item 
*item)
  } else {
  ds_put_cstr(s, "  Mask = null\n");
  }
+} else if (item->type == RTE_FLOW_ITEM_TYPE_VXLAN) {
+const struct rte_flow_item_vxlan *vxlan_spec = item->spec;
+const struct rte_flow_item_vxlan *vxlan_mask = item->mask;
+
+ds_put_cstr(s, "rte flow vxlan pattern:\n");
+if (vxlan_spec) {
+ds_put_format(s, "  Spec: flags=0x%x, vni=%"PRIu32"\n",
+  vxlan_spec->flags,
+  ntohl(*(ovs_be32 *)vxlan_spec->vni) >> 8);
+} else {
+ds_put_cstr(s, "  Spec = null\n");
+}
+if (vxlan_mask) {
+ds_put_format(s, "  Mask: flags=0x%x, vni=0x%06"PRIx32"\n",
+  vxlan_mask->flags,
+  ntohl(*(ovs_be32 *)vxlan_mask->vni) >> 8);
+} else {
+ds_put_cstr(s, "  Mask = null\n");
+}
  } else {
  ds_put_format(s, "unknown rte flow pattern (%d)\n", item->type);
  }
@@ -486,6 +505,14 @@ dump_flow_action(struct ds *s, const struct 
rte_flow_action *actions)
  } else {
  ds_put_cstr(s, "  Raw-encap = null\n");
  }
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP) {
+const struct rte_flow_action_vxlan_encap *vxlan_encap = actions->conf;
+const struct rte_flow_item *items = vxlan_encap->definition;
+
+ds_put_cstr(s, "rte flow vxlan-encap action:\n");
+while (items && items->type != RTE_FLOW_ITEM_TYPE_END) {
+dump_flow_pattern(s, items++);
+}
  } else {
  ds_put_format(s, "unknown rte flow action (%d)\n", actions->type);
  }
@@ -1125,6 +1152,93 @@ parse_set_actions(struct flow_actions *actions,
  return 0;
  }
  
+/* Maximum number of items in struct rte_flow_action_vxlan_encap.

+ * ETH / IPv4(6) / UDP / VXLAN / END
+ */
+#define ACTION_VXLAN_ENCAP_ITEMS_NUM 5
+
+static int
+add_vxlan_encap_action(struct flow_actions *actions,
+   const void *header)
+{
+const struct eth_header *eth;
+const struct udp_header *udp;
+struct vxlan_data {
+struct rte_flow_action_vxlan_encap conf;
+struct rte_flow_item items[0];

Why not:
struct rte_flow_item items[ACTION_VXLAN_ENCAP_ITEMS_NUM];
?

OK.



+} *vxlan_data;
+BUILD_ASSERT_DECL(offsetof(struct vxlan_data, conf) == 0);
+const void *vxlan;
+const void *l3;
+const void *l4;
+int field;
+
+vxlan_data = xzalloc(sizeof *vxlan_data +
+ sizeof(struct rte_flow_item) *
+ ACTION_VXLAN_ENCAP_ITEMS_NUM);
+field = 0;
+
+eth = header;
+/* Ethernet */
+vxlan_data->items[field].type = RTE_FLOW_ITEM_TYPE_ETH;
+vxlan_data->items[field].spec = eth;
+vxlan_data->items[field].mask = &rte_flow_item_eth_mask;
+field++;
+
+l3 = eth + 1;
+/* IP */
+if (eth->eth_type == htons(ETH_TYPE_IP)) {
+/* IPv4 */
+const struct ip_header *ip = l3;
+
+vxlan_data->items[field].type = RTE_FLOW_ITEM_TYPE_IPV4;
+vxlan_data->items[field].spec = ip;
+vxlan_data->items[field].mask = &rte_flow_item_ipv4_mask;
+
+if (ip->ip_proto != IPPROTO_UDP) {
+goto err;
+}
+l4 = (ip + 1);
+} else if (eth->eth_type == htons(ETH_TYPE_IPV6)) {
+const struct ovs_16aligned_ip6_hdr *ip6 = l3;
+
+vxlan_data->items[field].type = RTE_FLOW_ITEM_TYPE_IPV6;
+vxlan_data->items[field].spec = ip6;
+vxlan_data->items[field].mask = &rte_flow_item_ipv6_mask;
+
+if (ip6->ip6_nxt != IPPROTO_UDP) {
+goto err;
+}
+l4 = (ip6 + 1);
+} else {
+goto err;
+}
+field++;
+
+udp = (const struct udp_header *)l4;
+vxlan_data->items[field].type = RTE_FLOW_ITEM_TYPE_UDP;
+vxlan_data->items[field].spec = udp;
+vxlan_data->items[field].mask = &rte_flow_item_udp_mask;
+field++;
+
+vxlan = (udp + 1);
+vxlan_data->items[field].type = RTE_FLOW_ITEM_TYPE_VXLAN;
+vxlan_data->items[field].spec = vxl

Re: [ovs-dev] [PATCH V3 07/12] netdev-offload-dpdk: Support offload of clone tnl_push/output actions




On 6/29/2020 2:12 AM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:

Tunnel encapsulation is done by tnl_push and output actions nested in a
clone action. Support offloading of such flows with
RTE_FLOW_ACTION_TYPE_RAW_ENCAP attribute.

Signed-off-by: Eli Britstein 
Reviewed-by: Oz Shlomo 
---
  Documentation/howto/dpdk.rst |  1 +
  NEWS |  2 +-
  lib/netdev-offload-dpdk.c| 57 
  3 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 1756a7149..724c837f5 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -396,6 +396,7 @@ Supported actions for hardware offload are:
  - Modification of IPv4 (mod_nw_src/mod_nw_dst/mod_nw_ttl).
  - Modification of TCP/UDP (mod_tp_src/mod_tp_dst).
  - Modification of IPv6 (set_field:->ipv6_src/ipv6_dst/mod_nw_ttl).
+- Clone/output (tnl_push and output) for encapsulating over a tunnel.
  
  Further Reading

  ---
diff --git a/NEWS b/NEWS
index 7df8b134d..e7d3ca3c8 100644
--- a/NEWS
+++ b/NEWS
@@ -11,7 +11,7 @@ Post-v2.13.0
   * Deprecated DPDK ring ports (dpdkr) are no longer supported.
   * Add hardware offload support for matching IPv6 protocol.
   * Add hardware offload support for set of IPv6 TCP/UDP ports
-   actions (experimental).
+   and tunnel push-output actions (experimental).
 - Linux datapath:
   * Support for kernel versions up to 5.5.x.
 - AF_XDP:
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 13259a669..16df1fedd 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -475,6 +475,17 @@ dump_flow_action(struct ds *s, const struct 
rte_flow_action *actions)
  } else {
  ds_put_format(s, "  Set-ipv6-%s = null\n", dirstr);
  }
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_RAW_ENCAP) {
+const struct rte_flow_action_raw_encap *raw_encap = actions->conf;
+
+ds_put_cstr(s, "rte flow raw-encap action:\n");
+if (raw_encap) {
+ds_put_format(s, "  Raw-encap: size=%ld\n", raw_encap->size);
+ds_put_format(s, "  Raw-encap: encap=\n");

s/ds_put_format/ds_put_cstr/

And why it says "Raw-encap:" twice?

I changed it as we agreed to drop this format in favor of testpmd format.



+ds_put_hex_dump(s, raw_encap->data, raw_encap->size, 0, false);
+} else {
+ds_put_cstr(s, "  Raw-encap = null\n");
+}
  } else {
  ds_put_format(s, "unknown rte flow action (%d)\n", actions->type);
  }
@@ -1114,6 +1125,44 @@ parse_set_actions(struct flow_actions *actions,
  return 0;
  }
  
+static int

+parse_clone_actions(struct netdev *netdev,
+struct flow_actions *actions,
+const struct nlattr *clone_actions,
+const size_t clone_actions_len)
+{
+const struct nlattr *ca;
+unsigned int cleft;
+
+NL_ATTR_FOR_EACH_UNSAFE (ca, cleft, clone_actions, clone_actions_len) {
+int clone_type = nl_attr_type(ca);
+
+if (clone_type == OVS_ACTION_ATTR_TUNNEL_PUSH) {
+const struct ovs_action_push_tnl *tnl_push = nl_attr_get(ca);
+struct rte_flow_action_raw_encap *raw_encap =
+xzalloc(sizeof *raw_encap);
+
+raw_encap->data = (uint8_t *)tnl_push->header;
+raw_encap->preserve = NULL;
+raw_encap->size = tnl_push->header_len;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_RAW_ENCAP,
+raw_encap);
+} else if (clone_type == OVS_ACTION_ATTR_OUTPUT) {
+if (add_output_action(netdev, actions, ca)) {
+return -1;
+}
+} else {
+VLOG_DBG_RL(&rl,
+"Unsupported nested action inside clone(), "
+"action type: %d", clone_type);
+return -1;
+}
+}
+
+return 0;
+}
+
  static int
  parse_flow_actions(struct netdev *netdev,
 struct flow_actions *actions,
@@ -1141,6 +1190,14 @@ parse_flow_actions(struct netdev *netdev,
masked)) {
  return -1;
  }
+} else if (nl_attr_type(nla) == OVS_ACTION_ATTR_CLONE) {

I believe, we already discussed this somewhere, but we need to check
that clone is actually the last action in a list since we do not perform
an actual cloning.  Otherwise, later actions will be performed on the
modified packet.
Actually we discussed about the output *NOT* to limit for the last 
action, as it was mlx vendor specific limitation. For clone I agree it's 
different, and I'll change.



+const struct nlattr *clone_actions = nl_attr_get(nla);
+size_t clone_actions_len = nl_attr_get_size(nla);
+
+if (parse_clone_actions(netdev, actions, clone_acti

Re: [ovs-dev] [PATCH ovn] chassis.c: Add comment to SB DB transaction only when needed.

On 6/29/20 9:35 AM, Numan Siddique wrote:
> 
> 
> On Thu, Jun 25, 2020 at 1:28 PM Dumitru Ceara  > wrote:
> 
> The chassis_run() function incorrectly adds a "ovn-controller:
> registering chassis" comment to every SB transaction. This should be
> done
> only when the chassis record is created or updated. If nothing
> changes in
> the chassis record we shouldn't add useless extra information to the
> transaction request.
> 
> Reported-by: Daniel Alvarez  >
> Reported-at: https://bugzilla.redhat.com/1850511
> Fixes: 5344f24ecb1a ("ovn-controller: Refactor chassis.c to abstract
> the string parsing")
> Signed-off-by: Dumitru Ceara  >
> ---
>  controller/chassis.c | 64
> +---
>  1 file changed, 41 insertions(+), 23 deletions(-)
> 
> diff --git a/controller/chassis.c b/controller/chassis.c
> index d619361..33a1e18 100644
> --- a/controller/chassis.c
> +++ b/controller/chassis.c
> @@ -489,53 +489,64 @@ chassis_get_stale_record(const struct
> sbrec_chassis_table *chassis_table,
>   * Otherwise (i.e., first time we create the record) then we check
> if there's
>   * a stale record from a previous controller run that didn't end
> gracefully
>   * and reuse it. If not then we create a new record.
> + *
> + * Sets '*chassis_rec' to point to the local chassis record.
> + * Returns true if this record was already in the database, false
> if it was
> + * just inserted.
>   */
> -static const struct sbrec_chassis *
> +static bool
>  chassis_get_record(struct ovsdb_idl_txn *ovnsb_idl_txn,
>                     struct ovsdb_idl_index *sbrec_chassis_by_name,
>                     const struct sbrec_chassis_table *chassis_table,
>                     const struct ovs_chassis_cfg *ovs_cfg,
> -                   const char *chassis_id)
> +                   const char *chassis_id,
> +                   const struct sbrec_chassis **chassis_rec)
>  {
> -    const struct sbrec_chassis *chassis_rec;
> -
>      if (chassis_info_id_inited(&chassis_state)) {
> -        chassis_rec = chassis_lookup_by_name(sbrec_chassis_by_name,
> -                                           
>  chassis_info_id(&chassis_state));
> -        if (!chassis_rec) {
> +        *chassis_rec = chassis_lookup_by_name(sbrec_chassis_by_name,
> +                                             
> chassis_info_id(&chassis_state));
> +        if (!(*chassis_rec)) {
>              VLOG_DBG("Could not find Chassis, will create it"
>                       ": stored (%s) ovs (%s)",
>                       chassis_info_id(&chassis_state), chassis_id);
>              if (ovnsb_idl_txn) {
>                  /* Recreate the chassis record.  */
> -                chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
> +                *chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
> +                return false;
>              }
>          }
>      } else {
> -        chassis_rec =
> +        *chassis_rec =
>              chassis_get_stale_record(chassis_table, ovs_cfg,
> chassis_id);
> 
> -        if (!chassis_rec && ovnsb_idl_txn) {
> -            chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
> +        if (!(*chassis_rec) && ovnsb_idl_txn) {
> +            *chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
> +            return false;
>          }
>      }
> -    return chassis_rec;
> +    return true;
>  }
> 
> -/* Update a Chassis record based on the config in the ovs config. */
> -static void
> +/* Update a Chassis record based on the config in the ovs config.
> + * Returns true if 'chassis_rec' was updated, false otherwise.
> + */
> +static bool
>  chassis_update(const struct sbrec_chassis *chassis_rec,
>                 struct ovsdb_idl_txn *ovnsb_idl_txn,
>                 const struct ovs_chassis_cfg *ovs_cfg,
>                 const char *chassis_id,
>                 const struct sset *transport_zones)
>  {
> +    bool updated = false;
> +
>      if (strcmp(chassis_id, chassis_rec->name)) {
>          sbrec_chassis_set_name(chassis_rec, chassis_id);
> +        updated = true;
>      }
> 
>      if (strcmp(ovs_cfg->hostname, chassis_rec->hostname)) {
>          sbrec_chassis_set_hostname(chassis_rec, ovs_cfg->hostname);
> +        updated = true;
>      }
> 
>      if (chassis_other_config_changed(ovs_cfg->bridge_mappings,
> @@ -561,6 +572,7 @@ chassis_update(const struct sbrec_chassis
> *chassis_rec,
>           * systems, this behavior should be removed in the future. */
>          sbrec_chassis_set_external_

[ovs-dev] [PATCH ovn v2] chassis.c: Add comment to SB DB transaction only when needed.

The chassis_run() function incorrectly adds a "ovn-controller:
registering chassis" comment to every SB transaction. This should be done
only when the chassis record is created or updated. If nothing changes in
the chassis record we shouldn't add useless extra information to the
transaction request.

Reported-by: Daniel Alvarez 
Reported-at: https://bugzilla.redhat.com/1850511
Fixes: 5344f24ecb1a ("ovn-controller: Refactor chassis.c to abstract the string 
parsing")
Signed-off-by: Dumitru Ceara 

---
v2:
- changed chassis TXN comment as suggested by Numan.
---
 controller/chassis.c | 64 +---
 1 file changed, 41 insertions(+), 23 deletions(-)

diff --git a/controller/chassis.c b/controller/chassis.c
index d619361..aa3fed0 100644
--- a/controller/chassis.c
+++ b/controller/chassis.c
@@ -489,53 +489,64 @@ chassis_get_stale_record(const struct sbrec_chassis_table 
*chassis_table,
  * Otherwise (i.e., first time we create the record) then we check if there's
  * a stale record from a previous controller run that didn't end gracefully
  * and reuse it. If not then we create a new record.
+ *
+ * Sets '*chassis_rec' to point to the local chassis record.
+ * Returns true if this record was already in the database, false if it was
+ * just inserted.
  */
-static const struct sbrec_chassis *
+static bool
 chassis_get_record(struct ovsdb_idl_txn *ovnsb_idl_txn,
struct ovsdb_idl_index *sbrec_chassis_by_name,
const struct sbrec_chassis_table *chassis_table,
const struct ovs_chassis_cfg *ovs_cfg,
-   const char *chassis_id)
+   const char *chassis_id,
+   const struct sbrec_chassis **chassis_rec)
 {
-const struct sbrec_chassis *chassis_rec;
-
 if (chassis_info_id_inited(&chassis_state)) {
-chassis_rec = chassis_lookup_by_name(sbrec_chassis_by_name,
- chassis_info_id(&chassis_state));
-if (!chassis_rec) {
+*chassis_rec = chassis_lookup_by_name(sbrec_chassis_by_name,
+  chassis_info_id(&chassis_state));
+if (!(*chassis_rec)) {
 VLOG_DBG("Could not find Chassis, will create it"
  ": stored (%s) ovs (%s)",
  chassis_info_id(&chassis_state), chassis_id);
 if (ovnsb_idl_txn) {
 /* Recreate the chassis record.  */
-chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
+*chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
+return false;
 }
 }
 } else {
-chassis_rec =
+*chassis_rec =
 chassis_get_stale_record(chassis_table, ovs_cfg, chassis_id);
 
-if (!chassis_rec && ovnsb_idl_txn) {
-chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
+if (!(*chassis_rec) && ovnsb_idl_txn) {
+*chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
+return false;
 }
 }
-return chassis_rec;
+return true;
 }
 
-/* Update a Chassis record based on the config in the ovs config. */
-static void
+/* Update a Chassis record based on the config in the ovs config.
+ * Returns true if 'chassis_rec' was updated, false otherwise.
+ */
+static bool
 chassis_update(const struct sbrec_chassis *chassis_rec,
struct ovsdb_idl_txn *ovnsb_idl_txn,
const struct ovs_chassis_cfg *ovs_cfg,
const char *chassis_id,
const struct sset *transport_zones)
 {
+bool updated = false;
+
 if (strcmp(chassis_id, chassis_rec->name)) {
 sbrec_chassis_set_name(chassis_rec, chassis_id);
+updated = true;
 }
 
 if (strcmp(ovs_cfg->hostname, chassis_rec->hostname)) {
 sbrec_chassis_set_hostname(chassis_rec, ovs_cfg->hostname);
+updated = true;
 }
 
 if (chassis_other_config_changed(ovs_cfg->bridge_mappings,
@@ -561,6 +572,7 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
  * systems, this behavior should be removed in the future. */
 sbrec_chassis_set_external_ids(chassis_rec, &other_config);
 smap_destroy(&other_config);
+updated = true;
 }
 
 update_chassis_transport_zones(transport_zones, chassis_rec);
@@ -571,7 +583,7 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
 &ovs_cfg->encap_ip_set, ovs_cfg->encap_csum,
 chassis_rec);
 if (!tunnels_changed) {
-return;
+return updated;
 }
 
 struct sbrec_encap **encaps;
@@ -583,6 +595,7 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
  ovs_cfg->encap_csum, &n_encap);
 sbrec_chassis_set_encaps(chassis_rec, encaps, n_encap);
 free(encaps);
+return true;
 }
 
 /* Returns this chassis's Chassis record, if it is a

[ovs-dev] [PATCH v4 4/5] dpif-netdev: Support flow_get() with partial-action-offload

For flows that offload partial actions in egress direction,
provide the right netdev to fetch statistics.

Signed-off-by: Sriharsha Basavapatna 
---
 lib/dpif-netdev.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b96a75d1f..e489e2d90 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3180,8 +3180,14 @@ dpif_netdev_get_flow_offload_status(const struct 
dp_netdev *dp,
 return false;
 }
 
-netdev = netdev_ports_get(netdev_flow->flow.in_port.odp_port,
-  dpif_normalize_type(dp->class->type));
+if (netdev_flow->partial_actions_offloaded &&
+netdev_flow->egress_offload_port != ODPP_NONE) {
+netdev = netdev_ports_get(netdev_flow->egress_offload_port,
+  dpif_normalize_type(dp->class->type));
+} else {
+netdev = netdev_ports_get(netdev_flow->flow.in_port.odp_port,
+  dpif_normalize_type(dp->class->type));
+}
 if (!netdev) {
 return false;
 }
-- 
2.25.0.rc2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v4 3/5] dpif-netdev: Skip encap action during datapath execution

In this patch we check if action processing (apart from OUTPUT action),
should be skipped for a given dp_netdev_flow. Specifically, we check if
the action is TNL_PUSH and if it has been offloaded to HW, then we do not
push the tunnel header in SW. The datapath only executes the OUTPUT action.
The packet will be encapsulated in HW during transmit.

Signed-off-by: Sriharsha Basavapatna 
---
 lib/dpif-netdev.c | 46 --
 1 file changed, 36 insertions(+), 10 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 7adea8c40..b96a75d1f 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -114,6 +114,7 @@ COVERAGE_DEFINE(datapath_drop_invalid_port);
 COVERAGE_DEFINE(datapath_drop_invalid_bond);
 COVERAGE_DEFINE(datapath_drop_invalid_tnl_port);
 COVERAGE_DEFINE(datapath_drop_rx_invalid_packet);
+COVERAGE_DEFINE(datapath_skip_tunnel_push);
 
 /* Protects against changes to 'dp_netdevs'. */
 static struct ovs_mutex dp_netdev_mutex = OVS_MUTEX_INITIALIZER;
@@ -545,6 +546,16 @@ struct dp_netdev_flow {
 bool dead;
 uint32_t mark;   /* Unique flow mark assigned to a flow */
 
+/* The next two members are used to support partial offloading of
+ * actions. The boolean flag tells if this flow has its actions partially
+ * offloaded. The egress port# tells if the action should be offloaded
+ * on the egress (output) port instead of the in-port for the flow. Note
+ * that we support flows with a single egress port action.
+ * (see MAX_ACTION_ATTRS for related comments).
+ */
+bool partial_actions_offloaded;
+odp_port_t  egress_offload_port;
+
 /* Statistics. */
 struct dp_netdev_flow_stats stats;
 
@@ -817,7 +828,8 @@ static void dp_netdev_execute_actions(struct 
dp_netdev_pmd_thread *pmd,
   bool should_steal,
   const struct flow *flow,
   const struct nlattr *actions,
-  size_t actions_len);
+  size_t actions_len,
+  const struct dp_netdev_flow *dp_flow);
 static void dp_netdev_input(struct dp_netdev_pmd_thread *,
 struct dp_packet_batch *, odp_port_t port_no);
 static void dp_netdev_recirculate(struct dp_netdev_pmd_thread *,
@@ -3954,7 +3966,7 @@ dpif_netdev_execute(struct dpif *dpif, struct 
dpif_execute *execute)
 dp_packet_batch_init_packet(&pp, execute->packet);
 pp.do_not_steal = true;
 dp_netdev_execute_actions(pmd, &pp, false, execute->flow,
-  execute->actions, execute->actions_len);
+  execute->actions, execute->actions_len, NULL);
 dp_netdev_pmd_flush_output_packets(pmd, true);
 
 if (pmd->core_id == NON_PMD_CORE_ID) {
@@ -6672,7 +6684,7 @@ packet_batch_per_flow_execute(struct 
packet_batch_per_flow *batch,
 actions = dp_netdev_flow_get_actions(flow);
 
 dp_netdev_execute_actions(pmd, &batch->array, true, &flow->flow,
-  actions->actions, actions->size);
+  actions->actions, actions->size, flow);
 }
 
 static inline void
@@ -6967,7 +6979,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
  * we'll send the packet up twice. */
 dp_packet_batch_init_packet(&b, packet);
 dp_netdev_execute_actions(pmd, &b, true, &match.flow,
-  actions->data, actions->size);
+  actions->data, actions->size, NULL);
 
 add_actions = put_actions->size ? put_actions : actions;
 if (OVS_LIKELY(error != ENOSPC)) {
@@ -7202,6 +7214,7 @@ dp_netdev_recirculate(struct dp_netdev_pmd_thread *pmd,
 struct dp_netdev_execute_aux {
 struct dp_netdev_pmd_thread *pmd;
 const struct flow *flow;
+const struct dp_netdev_flow *dp_flow;/* for partial action offload */
 };
 
 static void
@@ -7346,7 +7359,7 @@ dp_execute_userspace_action(struct dp_netdev_pmd_thread 
*pmd,
 if (!error || error == ENOSPC) {
 dp_packet_batch_init_packet(&b, packet);
 dp_netdev_execute_actions(pmd, &b, should_steal, flow,
-  actions->data, actions->size);
+  actions->data, actions->size, NULL);
 } else if (should_steal) {
 dp_packet_delete(packet);
 COVERAGE_INC(datapath_drop_userspace_action_error);
@@ -7455,6 +7468,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch 
*packets_,
 int type = nl_attr_type(a);
 struct tx_port *p;
 uint32_t packet_count, packets_dropped;
+struct dp_netdev_flow *dp_flow = aux->dp_flow;
 
 switch ((enum ovs_action_attr)type) {
 case OVS_ACTION_ATTR_OUTPUT:
@@ -7477,9 +7491,20 @@ dp_execute_cb(void *aux_, struct dp_packet_batch 
*packets_,
 }
 dp_packet_batch_apply_cutlen(packets_);
 packet_count =

[ovs-dev] [PATCH v4 0/5] netdev datapath: Partial action offload

Hi,

This patchset extends the "Partial HW acceleration" mode to offload a
part of the action processing to HW, instead of offloading just lookup
(MARK/RSS), for "vhost-user" ports. This is referred to as "Partial Action
Offload". This mode does not require SRIOV/switchdev configuration. In this
mode, forwarding (output) action is still performed by OVS-DPDK SW datapath.

Note:
This patchset utilizes the full-offload framework available in OVS-DPDK
to offload specific actions to HW. In this initial patchset we support
offloading of VXLAN encap action. The changes are based on the VXLAN encap
full-offload patchset currently being reviewed on ovs-dev: 
https://mail.openvswitch.org/pipermail/ovs-dev/2020-June/371854.html

Thanks,
-Harsha

**

v3-->v4:
- Removed mega-ufid mapping code from dpif-netdev (patch #5) and updated the
  existing mega-ufid mapping code in netdev-offload-dpdk to support partial
  action offload.

v2-->v3:

- Added more commit log comments in the refactoring patch (#1).
- Removed wrapper function for should_partial_offload_egress().
- Removed partial offload check for output action in parse_flow_actions().
- Changed patch sequence (skip-encap and get-stats before offload patch).
- Removed locking code (details in email), added inline comments.
- Moved mega-ufid mapping code from skip-encap (#3) to the offload patch (#5).

v1-->v2:

Fixed review comments from Eli:
- Revamped action list parsing to reject multiple clone/output actions
- Updated comments to reflect support for single clone/output action
- Removed place-holder function for ingress partial action offload
- Created a separate patch (#2) to query dpdk-vhost netdevs
- Set transfer attribute to 0 for partial action offload
- Updated data type of 'dp_flow' in 'dp_netdev_execute_aux'
- Added a mutex to synchronize between offload and datapath threads
- Avoid fall back to mark/rss when egress partial offload fails
- Drop count action for partial action offload

Other changes:
- Avoid duplicate offload requests for the same mega-ufid (from PMD threads)
- Added a coverage counter to track pkts with tnl-push partial offloaded
- Fixed dp_netdev_pmd_remove_flow() to delete partial offloaded flow

**

Sriharsha Basavapatna (5):
  dpif-netdev: Refactor dp_netdev_flow_offload_put()
  netdev-dpdk: provide a function to identify dpdk-vhost netdevs
  dpif-netdev: Skip encap action during datapath execution
  dpif-netdev: Support flow_get() with partial-action-offload
  dpif-netdev: Support partial-action-offload of VXLAN encap flow

 lib/dpif-netdev.c | 330 --
 lib/netdev-dpdk.c |   5 +
 lib/netdev-dpdk.h |   1 +
 lib/netdev-offload-dpdk.c |  78 ++---
 lib/netdev-offload.h  |   2 +
 5 files changed, 352 insertions(+), 64 deletions(-)

-- 
2.25.0.rc2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v4 5/5] dpif-netdev: Support partial-action-offload of VXLAN encap flow

In this patch, we support offloading of VXLAN_ENCAP action for a vhost-user
port (aka "partial-action-offload"). At the time of offloading the flow, we
determine if the flow can be offloaded to an egress device, if the input
port is not offload capable such as a vhost-user port. We then offload the
flow with a VXLAN_ENCAP RTE action, to the egress device. We do not add
the OUTPUT RTE action, which indicates to the PMD that is is a partial
action offload request. Note that since the action is being offloaded in
egress direction, classification is expected to be done by OVS SW datapath
and hence there's no need to offload a MARK action.

If offload succeeds, we save the information in 'dp_netdev_flow' so that
we skip execution of the corresponding action (previous patch) during SW
datapath processing.

Signed-off-by: Sriharsha Basavapatna 
---
 lib/dpif-netdev.c | 212 --
 lib/netdev-offload-dpdk.c |  78 ++
 lib/netdev-offload.h  |   2 +
 3 files changed, 262 insertions(+), 30 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e489e2d90..d289d265d 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2488,10 +2488,174 @@ dp_netdev_append_flow_offload(struct 
dp_flow_offload_item *offload)
 ovs_mutex_unlock(&dp_flow_offload.mutex);
 }
 
+static int
+partial_offload_egress_flow_del(struct dp_flow_offload_item *offload)
+{
+struct dp_netdev_pmd_thread *pmd = offload->pmd;
+struct dp_netdev_flow *flow = offload->flow;
+const char *dpif_type_str = dpif_normalize_type(pmd->dp->class->type);
+struct netdev *port;
+int ret;
+
+port = netdev_ports_get(flow->egress_offload_port, dpif_type_str);
+if (!port) {
+return -1;
+}
+
+/* Taking a global 'port_mutex' to fulfill thread safety
+ * restrictions for the netdev-offload-dpdk module. */
+ovs_mutex_lock(&pmd->dp->port_mutex);
+ret = netdev_flow_del(port, &flow->mega_ufid, NULL);
+ovs_mutex_unlock(&pmd->dp->port_mutex);
+netdev_close(port);
+
+if (ret) {
+return ret;
+}
+
+flow->egress_offload_port = NULL;
+flow->partial_actions_offloaded = false;
+
+VLOG_DBG_RL("%s: flow: %p mega_ufid: "UUID_FMT" pmd_id: %d\n", __func__,
+flow, UUID_ARGS((struct uuid *)&flow->mega_ufid),
+offload->flow->pmd_id);
+return ret;
+}
+
 static int
 dp_netdev_flow_offload_del(struct dp_flow_offload_item *offload)
 {
-return mark_to_flow_disassociate(offload->pmd, offload->flow);
+if (unlikely(offload->flow->partial_actions_offloaded &&
+offload->flow->egress_offload_port != ODPP_NONE)) {
+return partial_offload_egress_flow_del(offload);
+} else {
+return mark_to_flow_disassociate(offload->pmd, offload->flow);
+}
+}
+
+/* Structure to hold a nl_parsed OVS action */
+struct action_attr {
+int type;/* OVS action type */
+struct nlattr *action;   /* action attribute */
+};
+
+/*
+ * Maxium number of actions to be parsed while selecting a flow for partial
+ * action offload. This number is currently based on the minimum number of
+ * attributes seen with the tunnel encap action (clone, tunnel_push, output).
+ * This number includes output action to a single egress device (uplink) and
+ * supports neither multiple clone() actions nor multiple output actions.
+ * This number could change if and when we support other actions or
+ * combinations of actions for partial offload.
+ */
+#define MAX_ACTION_ATTRS3 /* Max # action attributes supported */
+
+/*
+ * This function parses the list of OVS "actions" of length "actions_len",
+ * and returns them in an array of action "attrs", of size "max_attrs".
+ * The parsed number of actions is returned in "num_attrs". If the number
+ * of actions exceeds "max_attrs", parsing is stopped and E2BIG is returned.
+ * Otherwise, returns success (0).
+ */
+static int
+parse_nlattr_actions(struct nlattr *actions, size_t actions_len,
+ struct action_attr *attrs, int max_attrs, int *num_attrs)
+{
+const struct nlattr *a;
+unsigned int left;
+int num_actions = 0;
+int n_attrs = 0;
+int rc = 0;
+int type;
+
+*num_attrs = 0;
+
+NL_ATTR_FOR_EACH (a, left, actions, actions_len) {
+type = nl_attr_type(a);
+
+if (num_actions >= max_attrs) {
+*num_attrs = num_actions;
+return E2BIG;
+}
+
+attrs[num_actions].type = type;
+attrs[num_actions].action = a;
+num_actions++;
+if (type == OVS_ACTION_ATTR_CLONE) {
+rc = parse_nlattr_actions(nl_attr_get(a), nl_attr_get_size(a),
+  &attrs[num_actions],
+  (max_attrs - num_actions), &n_attrs);
+num_actions += n_attrs;
+if (rc == E2BIG) {
+*num_attrs = num_actions;
+return rc;
+}
+

[ovs-dev] [PATCH v4 1/5] dpif-netdev: Refactor dp_netdev_flow_offload_put()

This patch refactors dp_netdev_flow_offload_put() to prepare for
changes to support partial action offload, in subsequent patches.

- Move mark allocation code into a separate wrapper function, outside of
  dp_netdev_flow_offload_put() to improve readability and to facilitate
  more changes in this function to support partial action offload.

- We need to get the in-port's netdev-type (e.g, vhost) to determine if
  the flow should be offloaded on the egress device instead. To facilitate
  such changes, netdev_ports_get() is moved ahead of mark allocation.

Signed-off-by: Sriharsha Basavapatna 
---
 lib/dpif-netdev.c | 72 +--
 1 file changed, 45 insertions(+), 27 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 285c1c52f..7adea8c40 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2482,6 +2482,43 @@ dp_netdev_flow_offload_del(struct dp_flow_offload_item 
*offload)
 return mark_to_flow_disassociate(offload->pmd, offload->flow);
 }
 
+static int
+dp_netdev_alloc_flow_mark(struct dp_netdev_flow *flow, bool modification,
+  uint32_t *markp)
+{
+uint32_t mark;
+
+if (modification) {
+mark = flow->mark;
+ovs_assert(mark != INVALID_FLOW_MARK);
+*markp = mark;
+return 0;
+}
+
+/*
+ * If a mega flow has already been offloaded (from other PMD
+ * instances), do not offload it again.
+ */
+mark = megaflow_to_mark_find(&flow->mega_ufid);
+if (mark != INVALID_FLOW_MARK) {
+VLOG_DBG("Flow has already been offloaded with mark %u\n", mark);
+if (flow->mark != INVALID_FLOW_MARK) {
+ovs_assert(flow->mark == mark);
+} else {
+mark_to_flow_associate(mark, flow);
+}
+return 1;
+}
+
+mark = flow_mark_alloc();
+if (mark == INVALID_FLOW_MARK) {
+VLOG_ERR("Failed to allocate flow mark!\n");
+}
+
+*markp = mark;
+return 0;
+}
+
 /*
  * There are two flow offload operations here: addition and modification.
  *
@@ -2510,37 +2547,18 @@ dp_netdev_flow_offload_put(struct dp_flow_offload_item 
*offload)
 return -1;
 }
 
-if (modification) {
-mark = flow->mark;
-ovs_assert(mark != INVALID_FLOW_MARK);
-} else {
-/*
- * If a mega flow has already been offloaded (from other PMD
- * instances), do not offload it again.
- */
-mark = megaflow_to_mark_find(&flow->mega_ufid);
-if (mark != INVALID_FLOW_MARK) {
-VLOG_DBG("Flow has already been offloaded with mark %u\n", mark);
-if (flow->mark != INVALID_FLOW_MARK) {
-ovs_assert(flow->mark == mark);
-} else {
-mark_to_flow_associate(mark, flow);
-}
-return 0;
-}
+port = netdev_ports_get(in_port, dpif_type_str);
+if (!port) {
+return -1;
+}
 
-mark = flow_mark_alloc();
-if (mark == INVALID_FLOW_MARK) {
-VLOG_ERR("Failed to allocate flow mark!\n");
-}
+if (dp_netdev_alloc_flow_mark(flow, modification, &mark)) {
+/* flow already offloaded */
+netdev_close(port);
+return 0;
 }
 info.flow_mark = mark;
 
-port = netdev_ports_get(in_port, dpif_type_str);
-if (!port || netdev_vport_is_vport_class(port->netdev_class)) {
-netdev_close(port);
-goto err_free;
-}
 /* Taking a global 'port_mutex' to fulfill thread safety restrictions for
  * the netdev-offload-dpdk module. */
 ovs_mutex_lock(&pmd->dp->port_mutex);
-- 
2.25.0.rc2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v4 2/5] netdev-dpdk: provide a function to identify dpdk-vhost netdevs

This patch adds a function to determine if a given netdev belongs to the
dpdk-vhost class, using the netdev_class specific data.

Signed-off-by: Sriharsha Basavapatna 
---
 lib/netdev-dpdk.c | 5 +
 lib/netdev-dpdk.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 44ebf96da..a2a9bb8e7 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -558,6 +558,11 @@ is_dpdk_class(const struct netdev_class *class)
|| class->destruct == netdev_dpdk_vhost_destruct;
 }
 
+bool is_dpdk_vhost_netdev(struct netdev *netdev)
+{
+return netdev->netdev_class->destruct == netdev_dpdk_vhost_destruct;
+}
+
 /* DPDK NIC drivers allocate RX buffers at a particular granularity, typically
  * aligned at 1k or less. If a declared mbuf size is not a multiple of this
  * value, insufficient buffers are allocated to accomodate the packet in its
diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index 848346cb4..ab3c3102e 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -37,6 +37,7 @@ void netdev_dpdk_register(void);
 void free_dpdk_buf(struct dp_packet *);
 
 bool netdev_dpdk_flow_api_supported(struct netdev *);
+bool is_dpdk_vhost_netdev(struct netdev *);
 
 int
 netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
-- 
2.25.0.rc2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [RFC v3 PATCH 3/5] dpif-netdev: Skip encap action during datapath execution

On Tue, Jun 2, 2020 at 11:47 AM Eli Britstein  wrote:
>
>
> On 6/1/2020 8:29 PM, Sriharsha Basavapatna wrote:
> > On Mon, Jun 1, 2020 at 9:18 PM Eli Britstein  wrote:
> >>
> >> On 6/1/2020 6:15 PM, Sriharsha Basavapatna wrote:
> >>> On Mon, Jun 1, 2020 at 7:58 PM Eli Britstein  wrote:
>  On 5/28/2020 11:19 AM, Sriharsha Basavapatna wrote:
> > In this patch we check if action processing (apart from OUTPUT action),
> > should be skipped for a given dp_netdev_flow. Specifically, we check if
> > the action is TNL_PUSH and if it has been offloaded to HW, then we do 
> > not
> > push the tunnel header in SW. The datapath only executes the OUTPUT 
> > action.
> > The packet will be encapsulated in HW during transmit.
> >
> > Signed-off-by: Sriharsha Basavapatna 
> > 
> > ---
> > lib/dpif-netdev.c | 46 
> > --
> > 1 file changed, 36 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index 781b233f4..3e175c25e 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -112,6 +112,7 @@ COVERAGE_DEFINE(datapath_drop_recirc_error);
> > COVERAGE_DEFINE(datapath_drop_invalid_port);
> > COVERAGE_DEFINE(datapath_drop_invalid_tnl_port);
> > COVERAGE_DEFINE(datapath_drop_rx_invalid_packet);
> > +COVERAGE_DEFINE(datapath_skip_tunnel_push);
> >
> > /* Protects against changes to 'dp_netdevs'. */
> > static struct ovs_mutex dp_netdev_mutex = OVS_MUTEX_INITIALIZER;
> > @@ -538,6 +539,16 @@ struct dp_netdev_flow {
> > bool dead;
> > uint32_t mark;   /* Unique flow mark assigned to a 
> > flow */
> >
> > +/* The next two members are used to support partial offloading of
> > + * actions. The boolean flag tells if this flow has its actions 
> > partially
> > + * offloaded. The egress port# tells if the action should be 
> > offloaded
> > + * on the egress (output) port instead of the in-port for the 
> > flow. Note
> > + * that we support flows with a single egress port action.
> > + * (see MAX_ACTION_ATTRS for related comments).
> > + */
> > +bool partial_actions_offloaded;
> > +odp_port_t  egress_offload_port;
> > +
> > /* Statistics. */
> > struct dp_netdev_flow_stats stats;
> >
> > @@ -791,7 +802,8 @@ static void dp_netdev_execute_actions(struct 
> > dp_netdev_pmd_thread *pmd,
> >   bool should_steal,
> >   const struct flow *flow,
> >   const struct nlattr *actions,
> > -  size_t actions_len);
> > +  size_t actions_len,
> > +  const struct dp_netdev_flow 
> > *dp_flow);
> > static void dp_netdev_input(struct dp_netdev_pmd_thread *,
> > struct dp_packet_batch *, odp_port_t 
> > port_no);
> > static void dp_netdev_recirculate(struct dp_netdev_pmd_thread *,
> > @@ -3828,7 +3840,7 @@ dpif_netdev_execute(struct dpif *dpif, struct 
> > dpif_execute *execute)
> > dp_packet_batch_init_packet(&pp, execute->packet);
> > pp.do_not_steal = true;
> > dp_netdev_execute_actions(pmd, &pp, false, execute->flow,
> > -  execute->actions, execute->actions_len);
> > +  execute->actions, execute->actions_len, 
> > NULL);
> > dp_netdev_pmd_flush_output_packets(pmd, true);
> >
> > if (pmd->core_id == NON_PMD_CORE_ID) {
> > @@ -6456,7 +6468,7 @@ packet_batch_per_flow_execute(struct 
> > packet_batch_per_flow *batch,
> > actions = dp_netdev_flow_get_actions(flow);
> >
> > dp_netdev_execute_actions(pmd, &batch->array, true, &flow->flow,
> > -  actions->actions, actions->size);
> > +  actions->actions, actions->size, flow);
> > }
> >
> > static inline void
> > @@ -6764,7 +6776,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread 
> > *pmd,
> >  * we'll send the packet up twice. */
> > dp_packet_batch_init_packet(&b, packet);
> > dp_netdev_execute_actions(pmd, &b, true, &match.flow,
> > -  actions->data, actions->size);
> > +  actions->data, actions->size, NULL);
> >
> > add_actions = put_actions->size ? put_actions : actions;
> > if (OVS_LIKELY(error != ENOSPC)) {
> > @@ -6999,6 +7011,7 @@ dp_netdev_recirculate(struct dp_netdev_pmd_thread 
> > *pmd,

[ovs-dev] [PATCH] lib/tc: only update the stats for non-empty counter

2020-06-29 Thread wenxu

From: wenxu 

A packet with first frag and execute act_ct action.
The packet will stole by defrag. So the stats counter
for "gact action goto chain" will always 0. The openvswitch
update each action in order. So the flower stats finally
alway be zero. The rule will be delete adter max-idle time
even there are packet executing the action.

ovs-appctl dpctl/dump-flows
recirc_id(0),in_port(1),eth_type(0x0800),ipv4(dst=11.0.0.7,frag=first), 
packets:0, bytes:0, used:5.390s, actions:ct(zone=1,nat),recirc(0x4)

filter protocol ip pref 2 flower chain 0 handle 0x2
  eth_type ipv4
  dst_ip 1.1.1.1
  ip_flags frag/firstfrag
  skip_hw
  not_in_hw
 action order 1: ct zone 1 nat pipe
  index 2 ref 1 bind 1 installed 11 sec used 1 sec
 Action statistics:
 Sent 15000 bytes 11 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 cookie e04106c2ac41769b278edaa9b5309960

 action order 2: gact action goto chain 1
  random type none pass val 0
  index 2 ref 1 bind 1 installed 11 sec used 11 sec
 Action statistics:
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 cookie e04106c2ac41769b278edaa9b5309960

Signed-off-by: wenxu 
---
 lib/tc.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/tc.c b/lib/tc.c
index c2ab775..c96d095 100644
--- a/lib/tc.c
+++ b/lib/tc.c
@@ -1726,8 +1726,10 @@ nl_parse_single_action(struct nlattr *action, struct 
tc_flower *flower,
 }
 
 bs = nl_attr_get_unspec(stats_attrs[TCA_STATS_BASIC], sizeof *bs);
-put_32aligned_u64(&stats->n_packets, bs->packets);
-put_32aligned_u64(&stats->n_bytes, bs->bytes);
+if (bs->packets) {
+put_32aligned_u64(&stats->n_packets, bs->packets);
+put_32aligned_u64(&stats->n_bytes, bs->bytes);
+}
 
 return 0;
 }
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH V3 10/12] dpif-netdev: Add mega ufid in flow add log

On 6/29/20 7:33 AM, Sriharsha Basavapatna wrote:
> On Mon, Jun 29, 2020 at 5:30 AM Ilya Maximets  wrote:
>>
>> On 6/21/20 1:19 PM, Eli Britstein wrote:
>>> As offload is done using the mega ufid of a flow, for better
>>> debugability, add it in the log message.
>>
>> Could you, please, tell me why we need this extra ufid generated
>> at all?  Why the usual flow ufid is not enough?  I'm a bit confused.
> 
> I believe this comes from the mark/rss (partial offload)
> implementation. The rationale behind this design is explained in this
> commit:
> 
> 241bad15d99a422dce71a6ec6116a2d7b9c31796
> (dpif-netdev: associate flow with a mark id)
> 
> I'm pasting the relevant section here for quick reference:
> 
> "   One tricky thing in OVS-DPDK is, the flow tables is per-PMD. For the
> case there is only one phys port but with 2 queues, there could be 2
> PMDs. In other words, even for a single mega flow (i.e. udp,tp_src=1000),
> there could be 2 different dp_netdev flows, one for each PMD. That could
> results to the same mega flow being offloaded twice in the hardware,
> worse, we may get 2 different marks and only the last one will work.
> 
> To avoid that, a megaflow_to_mark CMAP is created. An entry will be
> added for the first PMD that wants to offload a flow. For later PMDs,
> it will see such megaflow is already offloaded, then the flow will not
> be offloaded to HW twice."

OK.  Thanks for the pointer!

That is because we're mixing pmd_id into flow ufid.  I see.
There should be a more elegant solution, but let it be this way for now.

> 
> Thanks,
> -Harsha
> 
>>
>>>
>>> Signed-off-by: Eli Britstein 
>>> Reviewed-by: Roni Bar Yanai 
>>> ---
>>>  lib/dpif-netdev.c   | 7 +--
>>>  tests/dpif-netdev.at| 2 ++
>>>  tests/ofproto-macros.at | 3 ++-
>>>  3 files changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index 57565802a..da0c48ef5 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -2510,8 +2510,9 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED)
>>>  OVS_NOT_REACHED();
>>>  }
>>>
>>> -VLOG_DBG("%s to %s netdev flow\n",
>>> - ret == 0 ? "succeed" : "failed", op);
>>> +VLOG_DBG("%s to %s netdev flow "UUID_FMT"\n",
>>> + ret == 0 ? "succeed" : "failed", op,
>>> + UUID_ARGS((struct uuid *) &offload->flow->mega_ufid));
>>>  dp_netdev_free_flow_offload(offload);
>>>  ovsrcu_quiesce();
>>>  }
>>> @@ -3383,6 +3384,8 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
>>>
>>>  ds_put_cstr(&ds, "flow_add: ");
>>>  odp_format_ufid(ufid, &ds);
>>> +ds_put_cstr(&ds, " mega_");
>>> +odp_format_ufid(&flow->mega_ufid, &ds);
>>>  ds_put_cstr(&ds, " ");
>>>  odp_flow_format(key_buf.data, key_buf.size,
>>>  mask_buf.data, mask_buf.size,
>>> diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
>>> index 21f0c8d24..ec5ffc290 100644
>>> --- a/tests/dpif-netdev.at
>>> +++ b/tests/dpif-netdev.at
>>> @@ -13,6 +13,7 @@ strip_timers () {
>>>
>>>  strip_xout () {
>>>  sed '
>>> +s/mega_ufid:[-0-9a-f]* //
>>>  s/ufid:[-0-9a-f]* //
>>>  s/used:[0-9]*\.[0-9]*/used:0.0/
>>>  s/actions:.*/actions: /
>>> @@ -23,6 +24,7 @@ strip_xout () {
>>>
>>>  strip_xout_keep_actions () {
>>>  sed '
>>> +s/mega_ufid:[-0-9a-f]* //
>>>  s/ufid:[-0-9a-f]* //
>>>  s/used:[0-9]*\.[0-9]*/used:0.0/
>>>  s/packets:[0-9]*/packets:0/
>>> diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
>>> index b2b17eed3..87f9ae280 100644
>>> --- a/tests/ofproto-macros.at
>>> +++ b/tests/ofproto-macros.at
>>> @@ -131,7 +131,8 @@ strip_duration () {
>>>  # Strips 'ufid:...' from output, to make it easier to compare.
>>>  # (ufids are random.)
>>>  strip_ufid () {
>>> -sed 's/ufid:[[-0-9a-f]]* //'
>>> +sed 's/mega_ufid:[[-0-9a-f]]* //
>>> +s/ufid:[[-0-9a-f]]* //'
>>>  }
>>>  m4_divert_pop([PREPARE_TESTS])
>>>
>>>
>>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH V3 02/12] netdev-offload-dpdk: Add IPv6 pattern matching

On 6/29/20 9:45 AM, Eli Britstein wrote:
> 
> On 6/29/2020 1:42 AM, Ilya Maximets wrote:
>> On 6/21/20 1:19 PM, Eli Britstein wrote:
>>> Add support for IPv6 pattern matching for offloading flows.
>>>
>>> Signed-off-by: Eli Britstein 
>>> Reviewed-by: Roni Bar Yanai 
>>> ---
>>>   Documentation/howto/dpdk.rst |  2 +-
>>>   NEWS |  1 +
>>>   lib/netdev-offload-dpdk.c    | 76 
>>> 
>>>   3 files changed, 78 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
>>> index be950d7ce..8a0eee80c 100644
>>> --- a/Documentation/howto/dpdk.rst
>>> +++ b/Documentation/howto/dpdk.rst
>>> @@ -385,7 +385,7 @@ The validated NICs are:
>>>   Supported protocols for hardware offload matches are:
>>>     - L2: Ethernet, VLAN
>>> -- L3: IPv4
>>> +- L3: IPv4, IPv6
>>>   - L4: TCP, UDP, SCTP, ICMP
>>>     Supported actions for hardware offload are:
>>> diff --git a/NEWS b/NEWS
>>> index 22cacda20..07e23316c 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -9,6 +9,7 @@ Post-v2.13.0
>>>  - DPDK:
>>>    * Deprecated DPDK pdump packet capture support removed.
>>>    * Deprecated DPDK ring ports (dpdkr) are no longer supported.
>>> + * Add hardware offload support for matching IPv6 protocol.
>> Not experimental? :)
> OK. BTW, when do features stop being experimental but mainstream?

I don't think there is a standardized process for that.  In general feature
becomes non-experimental when we're confident enough in it and it is
thoroughly tested.

BTW, I'm not sure about marking every single bit of HW offloading as
experimental feature since the whole 'hw-offload' is already marked this way.
Also there is no way to disable parts of matching or some of actions.
So, it might not make much sense having experimental tags everywhere.
However, we will likely need to treat different offload providers
differently while considering removing experimental tag from the
'hw-offload' configuration knob.

>>
>>>  - Linux datapath:
>>>    * Support for kernel versions up to 5.5.x.
>>>  - AF_XDP:
>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>>> index 2f1b42bf7..6b12e9ae3 100644
>>> --- a/lib/netdev-offload-dpdk.c
>>> +++ b/lib/netdev-offload-dpdk.c
>>> @@ -16,6 +16,8 @@
>>>    */
>>>   #include 
>>>   +#include 
>>> +#include 
>>>   #include 
>> Can we keep these in alphabetical order?
> OK.
>>
>>>     #include "cmap.h"
>>> @@ -321,6 +323,42 @@ dump_flow_pattern(struct ds *s, const struct 
>>> rte_flow_item *item)
>>>   } else {
>>>   ds_put_cstr(s, "  Mask = null\n");
>>>   }
>>> +    } else if (item->type == RTE_FLOW_ITEM_TYPE_IPV6) {
>>> +    const struct rte_flow_item_ipv6 *ipv6_spec = item->spec;
>>> +    const struct rte_flow_item_ipv6 *ipv6_mask = item->mask;
>>> +
>>> +    char src_addr_str[INET6_ADDRSTRLEN];
>>> +    char dst_addr_str[INET6_ADDRSTRLEN];
>>> +
>>> +    ds_put_cstr(s, "rte flow ipv6 pattern:\n");
>>> +    if (ipv6_spec) {
>>> +    ipv6_string_mapped(src_addr_str,
>>> +   (struct in6_addr 
>>> *)&ipv6_spec->hdr.src_addr);
>>> +    ipv6_string_mapped(dst_addr_str,
>>> +   (struct in6_addr 
>>> *)&ipv6_spec->hdr.dst_addr);
>>> +
>>> +    ds_put_format(s, "  Spec:  vtc_flow=%#"PRIx32",  
>>> proto=%"PRIu8","
>>> +  "  hlim=%"PRIu8",  src=%s,  dst=%s\n",
>>> +  ntohl(ipv6_spec->hdr.vtc_flow), 
>>> ipv6_spec->hdr.proto,
>>> +  ipv6_spec->hdr.hop_limits, src_addr_str,
>>> +  dst_addr_str);
>>> +    } else {
>>> +    ds_put_cstr(s, "  Spec = null\n");
>>> +    }
>>> +    if (ipv6_mask) {
>>> +    ipv6_string_mapped(src_addr_str,
>>> +   (struct in6_addr 
>>> *)&ipv6_mask->hdr.src_addr);
>>> +    ipv6_string_mapped(dst_addr_str,
>>> +   (struct in6_addr 
>>> *)&ipv6_mask->hdr.dst_addr);
>>> +
>>> +    ds_put_format(s, "  Mask:  vtc_flow=%#"PRIx32",  
>>> proto=%#"PRIx8","
>>> +  "  hlim=%#"PRIx8",  src=%s,  dst=%s\n",
>>> +  ntohl(ipv6_mask->hdr.vtc_flow), 
>>> ipv6_mask->hdr.proto,
>>> +  ipv6_mask->hdr.hop_limits, src_addr_str,
>>> +  dst_addr_str);
>>> +    } else {
>>> +    ds_put_cstr(s, "  Mask = null\n");
>>> +    }
>>>   } else {
>>>   ds_put_format(s, "unknown rte flow pattern (%d)\n", item->type);
>>>   }
>>> @@ -658,6 +696,44 @@ parse_flow_match(struct flow_patterns *patterns,
>>>   /* ignore mask match for now */
>>>   consumed_masks->nw_frag = 0;
>>>   +    /* IP v6 */
>>> +    if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>>> +    struct rte_flow_item_ipv6 *spec, *mask;
>>> +
>>> +    s

Re: [ovs-dev] [PATCH V3 01/12] netdev-offload-dpdk: Remove pre-validate of patterns function

On Mon, Jun 29, 2020 at 2:27 PM Ilya Maximets  wrote:
>
> On 6/29/20 9:11 AM, Eli Britstein wrote:
> >
> > On 6/29/2020 1:01 AM, Ilya Maximets wrote:
> >> On 6/21/20 1:19 PM, Eli Britstein wrote:
> >>> The function of adding patterns by requested matches checks that it
> >>> consumed all the required matches, and err if not. This nullify the
> >>> purpose of the validation function. Future supported matches will only
> >>> change the pattern parsing code.
> >>>
> >>> Signed-off-by: Eli Britstein 
> >>> Reviewed-by: Roni Bar Yanai 
> >>> ---
> >>>   lib/netdev-offload-dpdk.c | 122 
> >>> --
> >>>   1 file changed, 43 insertions(+), 79 deletions(-)
> >>>
> >>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> >>> index f8c46bbaa..2f1b42bf7 100644
> >>> --- a/lib/netdev-offload-dpdk.c
> >>> +++ b/lib/netdev-offload-dpdk.c
> >>> @@ -559,11 +559,21 @@ free_flow_actions(struct flow_actions *actions)
> >>> static int
> >>>   parse_flow_match(struct flow_patterns *patterns,
> >>> - const struct match *match)
> >>> + struct match *match)
> >>>   {
> >>>   uint8_t *next_proto_mask = NULL;
> >>> +struct flow *consumed_masks;
> >>>   uint8_t proto = 0;
> >>>   +consumed_masks = &match->wc.masks;
> >>> +
> >>> +memset(&consumed_masks->in_port, 0, sizeof consumed_masks->in_port);
> >>> +if (match->flow.recirc_id != 0) {
> >> This is not fully correct since we may not be requested to match on it.
> >> Original validation function checks against match_zero_wc which has
> >> all wildcarded fields zeroed.
> > OK.
> >>
> >>> +return -1;
> >>> +}
> >>> +consumed_masks->recirc_id = 0;
> >>> +consumed_masks->packet_type = 0;
> >>> +
> >>>   /* Eth */
> >>>   if (!eth_addr_is_zero(match->wc.masks.dl_src) ||
> >>>   !eth_addr_is_zero(match->wc.masks.dl_dst)) {
> >>> @@ -580,6 +590,9 @@ parse_flow_match(struct flow_patterns *patterns,
> >>>   memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
> >>>   mask->type = match->wc.masks.dl_type;
> >>>   +memset(&consumed_masks->dl_dst, 0, sizeof 
> >>> consumed_masks->dl_dst);
> >>> +memset(&consumed_masks->dl_src, 0, sizeof 
> >>> consumed_masks->dl_src);
> >>> +
> >>>   add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
> >>>   } else {
> >>>   /*
> >>> @@ -591,6 +604,7 @@ parse_flow_match(struct flow_patterns *patterns,
> >>>*/
> >>>   add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL);
> >>>   }
> >>> +consumed_masks->dl_type = 0;
> >>> /* VLAN */
> >>>   if (match->wc.masks.vlans[0].tci && match->flow.vlans[0].tci) {
> >>> @@ -607,6 +621,7 @@ parse_flow_match(struct flow_patterns *patterns,
> >>> add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_VLAN, spec, 
> >>> mask);
> >>>   }
> >>> +memset(&consumed_masks->vlans[0], 0, sizeof 
> >>> consumed_masks->vlans[0]);
> >> I don't think 'qtag' is consumed here.  Also this clearing should be done
> >> inside the if condition.
> >
> > If you refer to 'qtag' field in union flow_vlan_hdr, so it is consumed as 
> > this is a union. If you refer to qinq, it was not consumed either before. I 
> > didn't change that.
>
> OK.  Sorry, missed that it's a union.
>
> >
> > Regarding the clearing, it should not be inside, but outside. See also in 
> > netdev-offload-tc.c. In case of native (untagged), 
> > match->wc.masks.vlans[0].tci is 0x and match->flow.vlans[0].tci is 0.
>
> OK.
Eli, Can you please add a comment above that line (that it's being
cleared outside to handle native vlan case)

Thanks,
-Harsha
>
> >
> >>
> >>> /* IP v4 */
> >>>   if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
> >>> @@ -627,6 +642,12 @@ parse_flow_match(struct flow_patterns *patterns,
> >>>   mask->hdr.src_addr= match->wc.masks.nw_src;
> >>>   mask->hdr.dst_addr= match->wc.masks.nw_dst;
> >>>   +consumed_masks->nw_tos = 0;
> >>> +consumed_masks->nw_ttl = 0;
> >>> +consumed_masks->nw_proto = 0;
> >>> +consumed_masks->nw_src = 0;
> >>> +consumed_masks->nw_dst = 0;
> >>> +
> >>>   add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_IPV4, spec, mask);
> >>> /* Save proto for L4 protocol setup. */
> >>> @@ -634,6 +655,8 @@ parse_flow_match(struct flow_patterns *patterns,
> >>>   mask->hdr.next_proto_id;
> >>>   next_proto_mask = &mask->hdr.next_proto_id;
> >>>   }
> >>> +/* ignore mask match for now */
> >>> +consumed_masks->nw_frag = 0;
> >> Why this ignored?  Original validation code failed if matching on 
> >> fragmentation
> >> flags requested.
> > OK.
> >>
> >>> if (proto != IPPROTO_ICMP && proto != IPPROTO_UDP  &&
> >>>   proto != IPPROTO_SCTP && proto != IPPROTO_TCP  &&
> >>> @@ -665,6 +688,10 @@ parse_flow_match(stru

Re: [ovs-dev] [PATCH V3 01/12] netdev-offload-dpdk: Remove pre-validate of patterns function

On 6/29/20 9:11 AM, Eli Britstein wrote:
> 
> On 6/29/2020 1:01 AM, Ilya Maximets wrote:
>> On 6/21/20 1:19 PM, Eli Britstein wrote:
>>> The function of adding patterns by requested matches checks that it
>>> consumed all the required matches, and err if not. This nullify the
>>> purpose of the validation function. Future supported matches will only
>>> change the pattern parsing code.
>>>
>>> Signed-off-by: Eli Britstein 
>>> Reviewed-by: Roni Bar Yanai 
>>> ---
>>>   lib/netdev-offload-dpdk.c | 122 
>>> --
>>>   1 file changed, 43 insertions(+), 79 deletions(-)
>>>
>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>>> index f8c46bbaa..2f1b42bf7 100644
>>> --- a/lib/netdev-offload-dpdk.c
>>> +++ b/lib/netdev-offload-dpdk.c
>>> @@ -559,11 +559,21 @@ free_flow_actions(struct flow_actions *actions)
>>>     static int
>>>   parse_flow_match(struct flow_patterns *patterns,
>>> - const struct match *match)
>>> + struct match *match)
>>>   {
>>>   uint8_t *next_proto_mask = NULL;
>>> +    struct flow *consumed_masks;
>>>   uint8_t proto = 0;
>>>   +    consumed_masks = &match->wc.masks;
>>> +
>>> +    memset(&consumed_masks->in_port, 0, sizeof consumed_masks->in_port);
>>> +    if (match->flow.recirc_id != 0) {
>> This is not fully correct since we may not be requested to match on it.
>> Original validation function checks against match_zero_wc which has
>> all wildcarded fields zeroed.
> OK.
>>
>>> +    return -1;
>>> +    }
>>> +    consumed_masks->recirc_id = 0;
>>> +    consumed_masks->packet_type = 0;
>>> +
>>>   /* Eth */
>>>   if (!eth_addr_is_zero(match->wc.masks.dl_src) ||
>>>   !eth_addr_is_zero(match->wc.masks.dl_dst)) {
>>> @@ -580,6 +590,9 @@ parse_flow_match(struct flow_patterns *patterns,
>>>   memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
>>>   mask->type = match->wc.masks.dl_type;
>>>   +    memset(&consumed_masks->dl_dst, 0, sizeof 
>>> consumed_masks->dl_dst);
>>> +    memset(&consumed_masks->dl_src, 0, sizeof consumed_masks->dl_src);
>>> +
>>>   add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
>>>   } else {
>>>   /*
>>> @@ -591,6 +604,7 @@ parse_flow_match(struct flow_patterns *patterns,
>>>    */
>>>   add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL);
>>>   }
>>> +    consumed_masks->dl_type = 0;
>>>     /* VLAN */
>>>   if (match->wc.masks.vlans[0].tci && match->flow.vlans[0].tci) {
>>> @@ -607,6 +621,7 @@ parse_flow_match(struct flow_patterns *patterns,
>>>     add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_VLAN, spec, mask);
>>>   }
>>> +    memset(&consumed_masks->vlans[0], 0, sizeof consumed_masks->vlans[0]);
>> I don't think 'qtag' is consumed here.  Also this clearing should be done
>> inside the if condition.
> 
> If you refer to 'qtag' field in union flow_vlan_hdr, so it is consumed as 
> this is a union. If you refer to qinq, it was not consumed either before. I 
> didn't change that.

OK.  Sorry, missed that it's a union.

> 
> Regarding the clearing, it should not be inside, but outside. See also in 
> netdev-offload-tc.c. In case of native (untagged), 
> match->wc.masks.vlans[0].tci is 0x and match->flow.vlans[0].tci is 0.

OK.

> 
>>
>>>     /* IP v4 */
>>>   if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>>> @@ -627,6 +642,12 @@ parse_flow_match(struct flow_patterns *patterns,
>>>   mask->hdr.src_addr    = match->wc.masks.nw_src;
>>>   mask->hdr.dst_addr    = match->wc.masks.nw_dst;
>>>   +    consumed_masks->nw_tos = 0;
>>> +    consumed_masks->nw_ttl = 0;
>>> +    consumed_masks->nw_proto = 0;
>>> +    consumed_masks->nw_src = 0;
>>> +    consumed_masks->nw_dst = 0;
>>> +
>>>   add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_IPV4, spec, mask);
>>>     /* Save proto for L4 protocol setup. */
>>> @@ -634,6 +655,8 @@ parse_flow_match(struct flow_patterns *patterns,
>>>   mask->hdr.next_proto_id;
>>>   next_proto_mask = &mask->hdr.next_proto_id;
>>>   }
>>> +    /* ignore mask match for now */
>>> +    consumed_masks->nw_frag = 0;
>> Why this ignored?  Original validation code failed if matching on 
>> fragmentation
>> flags requested.
> OK.
>>
>>>     if (proto != IPPROTO_ICMP && proto != IPPROTO_UDP  &&
>>>   proto != IPPROTO_SCTP && proto != IPPROTO_TCP  &&
>>> @@ -665,6 +688,10 @@ parse_flow_match(struct flow_patterns *patterns,
>>>   mask->hdr.data_off  = ntohs(match->wc.masks.tcp_flags) >> 8;
>>>   mask->hdr.tcp_flags = ntohs(match->wc.masks.tcp_flags) & 0xff;
>>>   +    consumed_masks->tp_src = 0;
>>> +    consumed_masks->tp_dst = 0;
>>> +    consumed_masks->tcp_flags = 0;
>>> +
>>>   add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_TCP, spec, mask);
>>>     /* proto == T

Re: [ovs-dev] [PATCH V3 03/12] netdev-offload-dpdk: Support offload of set IPv6 actions




On 6/29/2020 2:02 AM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:

Some trivial commit message here?

OK.



Signed-off-by: Eli Britstein 
Reviewed-by: Roni Bar Yanai 
---
  Documentation/howto/dpdk.rst |  1 +
  NEWS |  2 ++
  lib/netdev-offload-dpdk.c| 35 +++
  3 files changed, 38 insertions(+)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 8a0eee80c..1756a7149 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -395,6 +395,7 @@ Supported actions for hardware offload are:
  - Modification of Ethernet (mod_dl_src/mod_dl_dst).
  - Modification of IPv4 (mod_nw_src/mod_nw_dst/mod_nw_ttl).
  - Modification of TCP/UDP (mod_tp_src/mod_tp_dst).
+- Modification of IPv6 (set_field:->ipv6_src/ipv6_dst/mod_nw_ttl).
  
  Further Reading

  ---
diff --git a/NEWS b/NEWS
index 07e23316c..7df8b134d 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,8 @@ Post-v2.13.0
   * Deprecated DPDK pdump packet capture support removed.
   * Deprecated DPDK ring ports (dpdkr) are no longer supported.
   * Add hardware offload support for matching IPv6 protocol.
+ * Add hardware offload support for set of IPv6 TCP/UDP ports
+   actions (experimental).
 - Linux datapath:
   * Support for kernel versions up to 5.5.x.
 - AF_XDP:
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 6b12e9ae3..dd5e71e36 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -458,6 +458,23 @@ dump_flow_action(struct ds *s, const struct 
rte_flow_action *actions)
  } else {
  ds_put_format(s, "  Set-%s-tcp/udp-port = null\n", dirstr);
  }
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_SET_IPV6_SRC ||
+   actions->type == RTE_FLOW_ACTION_TYPE_SET_IPV6_DST) {
+const struct rte_flow_action_set_ipv6 *set_ipv6 = actions->conf;
+
+char *dirstr = actions->type == RTE_FLOW_ACTION_TYPE_SET_IPV6_DST
+   ? "dst" : "src";
+
+ds_put_format(s, "rte flow set-ipv6-%s action:\n", dirstr);
+if (set_ipv6) {
+char addr_str[INET6_ADDRSTRLEN];
+
+ipv6_string_mapped(addr_str,
+   (struct in6_addr *)&set_ipv6->ipv6_addr);
+ds_put_format(s, "  Set-ipv6-%s: %s\n", dirstr, addr_str);

It seems like using ipv6_format_mapped() will result in lower number of
lines and will also avoid introducing an array:

 ds_put_format(s, "  Set-ipv6-%s: ", dirstr);
 ipv6_format_mapped((struct in6_addr *) &set_ipv6->ipv6_addr, s);
 ds_put_cstr(s, "\n");

OK. I'll use ipv6_format_addr.


Same, actually might be applicable to the matching code from the previous patch.
It actually takes less number of lines even with extra ones like:
 ds_put_cstr(",  src=");
Following the suggestion about testpmd format, I'll change the order of 
commits to do the testpmd format first. For testpmd format, 
ipv6_format_mapped (or ipv6_format_addr) is not applicable, so I'll keep 
it there.




+} else {
+ds_put_format(s, "  Set-ipv6-%s = null\n", dirstr);
+}
  } else {
  ds_put_format(s, "unknown rte flow action (%d)\n", actions->type);
  }
@@ -1004,6 +1021,12 @@ BUILD_ASSERT_DECL(sizeof(struct 
rte_flow_action_set_ipv4) ==
MEMBER_SIZEOF(struct ovs_key_ipv4, ipv4_dst));
  BUILD_ASSERT_DECL(sizeof(struct rte_flow_action_set_ttl) ==
MEMBER_SIZEOF(struct ovs_key_ipv4, ipv4_ttl));
+BUILD_ASSERT_DECL(sizeof(struct rte_flow_action_set_ipv6) ==
+  MEMBER_SIZEOF(struct ovs_key_ipv6, ipv6_src));
+BUILD_ASSERT_DECL(sizeof(struct rte_flow_action_set_ipv6) ==
+  MEMBER_SIZEOF(struct ovs_key_ipv6, ipv6_dst));
+BUILD_ASSERT_DECL(sizeof(struct rte_flow_action_set_ttl) ==
+  MEMBER_SIZEOF(struct ovs_key_ipv6, ipv6_hlimit));
  BUILD_ASSERT_DECL(sizeof(struct rte_flow_action_set_tp) ==
MEMBER_SIZEOF(struct ovs_key_tcp, tcp_src));
  BUILD_ASSERT_DECL(sizeof(struct rte_flow_action_set_tp) ==
@@ -1053,6 +1076,18 @@ parse_set_actions(struct flow_actions *actions,
  VLOG_DBG_RL(&rl, "Unsupported IPv4 set action");
  return -1;
  }
+} else if (nl_attr_type(sa) == OVS_KEY_ATTR_IPV6) {
+const struct ovs_key_ipv6 *key = nl_attr_get(sa);
+const struct ovs_key_ipv6 *mask = masked ? key + 1 : NULL;
+
+add_set_flow_action(ipv6_src, RTE_FLOW_ACTION_TYPE_SET_IPV6_SRC);
+add_set_flow_action(ipv6_dst, RTE_FLOW_ACTION_TYPE_SET_IPV6_DST);
+add_set_flow_action(ipv6_hlimit, RTE_FLOW_ACTION_TYPE_SET_TTL);
+
+if (mask && !is_all_zeros(mask, sizeof *mask)) {
+VLOG_DBG_RL(&rl, "Unsupported IPv6 set action");
+return -1;
+}

Re: [ovs-dev] [PATCH V3 01/12] netdev-offload-dpdk: Remove pre-validate of patterns function




On 6/29/2020 1:01 AM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:

The function of adding patterns by requested matches checks that it
consumed all the required matches, and err if not. This nullify the
purpose of the validation function. Future supported matches will only
change the pattern parsing code.

Signed-off-by: Eli Britstein 
Reviewed-by: Roni Bar Yanai 
---
  lib/netdev-offload-dpdk.c | 122 --
  1 file changed, 43 insertions(+), 79 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index f8c46bbaa..2f1b42bf7 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -559,11 +559,21 @@ free_flow_actions(struct flow_actions *actions)
  
  static int

  parse_flow_match(struct flow_patterns *patterns,
- const struct match *match)
+ struct match *match)
  {
  uint8_t *next_proto_mask = NULL;
+struct flow *consumed_masks;
  uint8_t proto = 0;
  
+consumed_masks = &match->wc.masks;

+
+memset(&consumed_masks->in_port, 0, sizeof consumed_masks->in_port);
+if (match->flow.recirc_id != 0) {

This is not fully correct since we may not be requested to match on it.
Original validation function checks against match_zero_wc which has
all wildcarded fields zeroed.

OK.



+return -1;
+}
+consumed_masks->recirc_id = 0;
+consumed_masks->packet_type = 0;
+
  /* Eth */
  if (!eth_addr_is_zero(match->wc.masks.dl_src) ||
  !eth_addr_is_zero(match->wc.masks.dl_dst)) {
@@ -580,6 +590,9 @@ parse_flow_match(struct flow_patterns *patterns,
  memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
  mask->type = match->wc.masks.dl_type;
  
+memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks->dl_dst);

+memset(&consumed_masks->dl_src, 0, sizeof consumed_masks->dl_src);
+
  add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
  } else {
  /*
@@ -591,6 +604,7 @@ parse_flow_match(struct flow_patterns *patterns,
   */
  add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL);
  }
+consumed_masks->dl_type = 0;
  
  /* VLAN */

  if (match->wc.masks.vlans[0].tci && match->flow.vlans[0].tci) {
@@ -607,6 +621,7 @@ parse_flow_match(struct flow_patterns *patterns,
  
  add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_VLAN, spec, mask);

  }
+memset(&consumed_masks->vlans[0], 0, sizeof consumed_masks->vlans[0]);

I don't think 'qtag' is consumed here.  Also this clearing should be done
inside the if condition.


If you refer to 'qtag' field in union flow_vlan_hdr, so it is consumed 
as this is a union. If you refer to qinq, it was not consumed either 
before. I didn't change that.


Regarding the clearing, it should not be inside, but outside. See also 
in netdev-offload-tc.c. In case of native (untagged), 
match->wc.masks.vlans[0].tci is 0x and match->flow.vlans[0].tci is 0.




  
  /* IP v4 */

  if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
@@ -627,6 +642,12 @@ parse_flow_match(struct flow_patterns *patterns,
  mask->hdr.src_addr= match->wc.masks.nw_src;
  mask->hdr.dst_addr= match->wc.masks.nw_dst;
  
+consumed_masks->nw_tos = 0;

+consumed_masks->nw_ttl = 0;
+consumed_masks->nw_proto = 0;
+consumed_masks->nw_src = 0;
+consumed_masks->nw_dst = 0;
+
  add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_IPV4, spec, mask);
  
  /* Save proto for L4 protocol setup. */

@@ -634,6 +655,8 @@ parse_flow_match(struct flow_patterns *patterns,
  mask->hdr.next_proto_id;
  next_proto_mask = &mask->hdr.next_proto_id;
  }
+/* ignore mask match for now */
+consumed_masks->nw_frag = 0;

Why this ignored?  Original validation code failed if matching on fragmentation
flags requested.

OK.


  
  if (proto != IPPROTO_ICMP && proto != IPPROTO_UDP  &&

  proto != IPPROTO_SCTP && proto != IPPROTO_TCP  &&
@@ -665,6 +688,10 @@ parse_flow_match(struct flow_patterns *patterns,
  mask->hdr.data_off  = ntohs(match->wc.masks.tcp_flags) >> 8;
  mask->hdr.tcp_flags = ntohs(match->wc.masks.tcp_flags) & 0xff;
  
+consumed_masks->tp_src = 0;

+consumed_masks->tp_dst = 0;
+consumed_masks->tcp_flags = 0;
+
  add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_TCP, spec, mask);
  
  /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match. */

@@ -683,6 +710,9 @@ parse_flow_match(struct flow_patterns *patterns,
  mask->hdr.src_port = match->wc.masks.tp_src;
  mask->hdr.dst_port = match->wc.masks.tp_dst;
  
+consumed_masks->tp_src = 0;

+consumed_masks->tp_dst = 0;
+
  add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_UDP, spec, mask);
  
  /* proto == UDP and ITEM_TYPE_UDP, thus no need for proto ma

Re: [ovs-dev] [PATCH V3 02/12] netdev-offload-dpdk: Add IPv6 pattern matching




On 6/29/2020 1:42 AM, Ilya Maximets wrote:

On 6/21/20 1:19 PM, Eli Britstein wrote:

Add support for IPv6 pattern matching for offloading flows.

Signed-off-by: Eli Britstein 
Reviewed-by: Roni Bar Yanai 
---
  Documentation/howto/dpdk.rst |  2 +-
  NEWS |  1 +
  lib/netdev-offload-dpdk.c| 76 
  3 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index be950d7ce..8a0eee80c 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -385,7 +385,7 @@ The validated NICs are:
  Supported protocols for hardware offload matches are:
  
  - L2: Ethernet, VLAN

-- L3: IPv4
+- L3: IPv4, IPv6
  - L4: TCP, UDP, SCTP, ICMP
  
  Supported actions for hardware offload are:

diff --git a/NEWS b/NEWS
index 22cacda20..07e23316c 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,7 @@ Post-v2.13.0
 - DPDK:
   * Deprecated DPDK pdump packet capture support removed.
   * Deprecated DPDK ring ports (dpdkr) are no longer supported.
+ * Add hardware offload support for matching IPv6 protocol.

Not experimental? :)

OK. BTW, when do features stop being experimental but mainstream?



 - Linux datapath:
   * Support for kernel versions up to 5.5.x.
 - AF_XDP:
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 2f1b42bf7..6b12e9ae3 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -16,6 +16,8 @@
   */
  #include 
  
+#include 

+#include 
  #include 

Can we keep these in alphabetical order?

OK.


  
  #include "cmap.h"

@@ -321,6 +323,42 @@ dump_flow_pattern(struct ds *s, const struct rte_flow_item 
*item)
  } else {
  ds_put_cstr(s, "  Mask = null\n");
  }
+} else if (item->type == RTE_FLOW_ITEM_TYPE_IPV6) {
+const struct rte_flow_item_ipv6 *ipv6_spec = item->spec;
+const struct rte_flow_item_ipv6 *ipv6_mask = item->mask;
+
+char src_addr_str[INET6_ADDRSTRLEN];
+char dst_addr_str[INET6_ADDRSTRLEN];
+
+ds_put_cstr(s, "rte flow ipv6 pattern:\n");
+if (ipv6_spec) {
+ipv6_string_mapped(src_addr_str,
+   (struct in6_addr *)&ipv6_spec->hdr.src_addr);
+ipv6_string_mapped(dst_addr_str,
+   (struct in6_addr *)&ipv6_spec->hdr.dst_addr);
+
+ds_put_format(s, "  Spec:  vtc_flow=%#"PRIx32",  proto=%"PRIu8","
+  "  hlim=%"PRIu8",  src=%s,  dst=%s\n",
+  ntohl(ipv6_spec->hdr.vtc_flow), ipv6_spec->hdr.proto,
+  ipv6_spec->hdr.hop_limits, src_addr_str,
+  dst_addr_str);
+} else {
+ds_put_cstr(s, "  Spec = null\n");
+}
+if (ipv6_mask) {
+ipv6_string_mapped(src_addr_str,
+   (struct in6_addr *)&ipv6_mask->hdr.src_addr);
+ipv6_string_mapped(dst_addr_str,
+   (struct in6_addr *)&ipv6_mask->hdr.dst_addr);
+
+ds_put_format(s, "  Mask:  vtc_flow=%#"PRIx32",  proto=%#"PRIx8","
+  "  hlim=%#"PRIx8",  src=%s,  dst=%s\n",
+  ntohl(ipv6_mask->hdr.vtc_flow), ipv6_mask->hdr.proto,
+  ipv6_mask->hdr.hop_limits, src_addr_str,
+  dst_addr_str);
+} else {
+ds_put_cstr(s, "  Mask = null\n");
+}
  } else {
  ds_put_format(s, "unknown rte flow pattern (%d)\n", item->type);
  }
@@ -658,6 +696,44 @@ parse_flow_match(struct flow_patterns *patterns,
  /* ignore mask match for now */
  consumed_masks->nw_frag = 0;
  
+/* IP v6 */

+if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
+struct rte_flow_item_ipv6 *spec, *mask;
+
+spec = xzalloc(sizeof *spec);
+mask = xzalloc(sizeof *mask);
+
+spec->hdr.proto = match->flow.nw_proto;
+spec->hdr.hop_limits = match->flow.nw_ttl;
+spec->hdr.vtc_flow = htonl((uint32_t)match->flow.nw_tos <<

Please, add a space between the caset and a value.
It also might look better if the whole right side of the assignment
will be moved to the next line and joined.

OK



+   RTE_IPV6_HDR_TC_SHIFT);
+memcpy(spec->hdr.src_addr, &match->flow.ipv6_src,
+   sizeof spec->hdr.src_addr);
+memcpy(spec->hdr.dst_addr, &match->flow.ipv6_dst,
+   sizeof spec->hdr.dst_addr);
+
+mask->hdr.proto = match->wc.masks.nw_proto;
+mask->hdr.hop_limits = match->wc.masks.nw_ttl;
+mask->hdr.vtc_flow = htonl((uint32_t)match->wc.masks.nw_tos <<
+   RTE_IPV6_HDR_TC_SHIFT);

Ditto.

OK



+memcpy(mask->hdr.src_addr, &match->wc.masks.ipv6_src,
+   sizeof mask->hdr.src_addr);
+memcpy(mask->hdr.dst_addr, &match->wc

Re: [ovs-dev] [PATCH ovn] Split SB Port_Group per datapath.

On Fri, Jun 26, 2020 at 6:50 PM Dumitru Ceara  wrote:

> In order to avoid ovn-controller reinstalling all logical flows that
> refer a port_group when some ports are added/removed from the port group
> we now change the way ovn-northd populates the Southbound DB Port_Group
> table.
>
> Instead of copying NB.Port_Group.name to SB.Port_Group.name we now
> create one SB.Port_Group record for every datapath that has ports
> referenced by the NB.Port_Group.ports field. In order to maintain the
> SB.Port_Group.name uniqueness constraint, ovn-northd populates the field
> with the value: _.
>
> In specific scenarios we see significant improvements in time to
> install/remove all logical flows to/from OVS. One such scenario, in the
> BZ referenced below has:
>
> $ ovn-nbctl acl-list pg
>   from-lport  1001 (inport == @pg && ip) drop
> to-lport  1001 (outport == @pg && ip) drop
>
> Then, incrementally, creates new logical ports on different logical
> switches, binds them to OVS interfaces and adds them to the port_group.
>
> Measuring the total time to perform the above steps 500 times (for 500
> new ports attached to 100 switches, 5 per switch) on a test setup
> we observe an improvement of 50% in time it takes to install all
> openflow rules when port_groups are split in the SB database.
>
> Suggested-by: Numan Siddique 
> Reported-by: Venkata Anil 
> Reported-at: https://bugzilla.redhat.com/1818128
> Signed-off-by: Dumitru Ceara 
>

Thanks Dumitru for this patch.

Thi patch has turned out to be much simpler than I thought.

Can you please add a few test cases in ovn-northd.at to make sure
that the SB Port_Group rows are created as expected when
a NB Port_Group is created and it is referenced by multiple logical
switches.

If you could cover cases like, delete a logical switch and make sure that
the PG
for that logical switch in SB DB is deleted etc.

Thanks
Numan


> ---
>  TODO.rst  |  8 ++
>  controller/lflow.c|  4 ++-
>  include/ovn/expr.h|  4 ++-
>  lib/actions.c |  2 +-
>  lib/expr.c| 48 ---
>  lib/ovn-util.h|  7 +
>  northd/ovn-northd.c   | 79
> ++-
>  tests/test-ovn.c  | 10 +++
>  utilities/ovn-trace.c |  3 +-
>  9 files changed, 119 insertions(+), 46 deletions(-)
>
> diff --git a/TODO.rst b/TODO.rst
> index 6df6b84..4b2fc2d 100644
> --- a/TODO.rst
> +++ b/TODO.rst
> @@ -140,3 +140,11 @@ OVN To-do List
>  * ovn-controller: Stop copying the local OVS configuration into the
>Chassis external_ids column (same for the "is-remote" configuration from
>ovn-ic) a few releases after the 20.06 version (21.06 maybe ?).
> +
> +* ovn-controller: Remove backwards compatibility for Southbound DB
> Port_Group
> +  names in expr.c a few releases after the 20.09 version. Right now
> +  ovn-controller maintains backwards compatibility when connecting to a
> +  SB database that doesn't store Port_Group.name as
> +  . This causes an
> additional
> +  hashtable lookup in parse_port_group() which can be avoided when we are
> sure
> +  that the Southbound DB uses the new format.
> diff --git a/controller/lflow.c b/controller/lflow.c
> index c850a0d..5454361 100644
> --- a/controller/lflow.c
> +++ b/controller/lflow.c
> @@ -566,7 +566,9 @@ consider_logical_flow(const struct sbrec_logical_flow
> *lflow,
>  struct sset port_groups_ref = SSET_INITIALIZER(&port_groups_ref);
>  expr = expr_parse_string(lflow->match, &symtab, l_ctx_in->addr_sets,
>   l_ctx_in->port_groups,
> - &addr_sets_ref, &port_groups_ref, &error);
> + &addr_sets_ref, &port_groups_ref,
> + lflow->logical_datapath->tunnel_key,
> + &error);
>  const char *addr_set_name;
>  SSET_FOR_EACH (addr_set_name, &addr_sets_ref) {
>  lflow_resource_add(l_ctx_out->lfrr, REF_TYPE_ADDRSET,
> addr_set_name,
> diff --git a/include/ovn/expr.h b/include/ovn/expr.h
> index 21bf51c..9838251 100644
> --- a/include/ovn/expr.h
> +++ b/include/ovn/expr.h
> @@ -391,12 +391,14 @@ struct expr *expr_parse(struct lexer *, const struct
> shash *symtab,
>  const struct shash *addr_sets,
>  const struct shash *port_groups,
>  struct sset *addr_sets_ref,
> -struct sset *port_groups_ref);
> +struct sset *port_groups_ref,
> +int64_t dp_id);
>  struct expr *expr_parse_string(const char *, const struct shash *symtab,
> const struct shash *addr_sets,
> const struct shash *port_groups,
> struct sset *addr_sets_ref,
> struct sset *port_groups_ref,
> +   int64_t dp_id,
>

Re: [ovs-dev] [PATCH ovn] chassis.c: Add comment to SB DB transaction only when needed.

On Thu, Jun 25, 2020 at 1:28 PM Dumitru Ceara  wrote:

> The chassis_run() function incorrectly adds a "ovn-controller:
> registering chassis" comment to every SB transaction. This should be done
> only when the chassis record is created or updated. If nothing changes in
> the chassis record we shouldn't add useless extra information to the
> transaction request.
>
> Reported-by: Daniel Alvarez 
> Reported-at: https://bugzilla.redhat.com/1850511
> Fixes: 5344f24ecb1a ("ovn-controller: Refactor chassis.c to abstract the
> string parsing")
> Signed-off-by: Dumitru Ceara 
> ---
>  controller/chassis.c | 64
> +---
>  1 file changed, 41 insertions(+), 23 deletions(-)
>
> diff --git a/controller/chassis.c b/controller/chassis.c
> index d619361..33a1e18 100644
> --- a/controller/chassis.c
> +++ b/controller/chassis.c
> @@ -489,53 +489,64 @@ chassis_get_stale_record(const struct
> sbrec_chassis_table *chassis_table,
>   * Otherwise (i.e., first time we create the record) then we check if
> there's
>   * a stale record from a previous controller run that didn't end
> gracefully
>   * and reuse it. If not then we create a new record.
> + *
> + * Sets '*chassis_rec' to point to the local chassis record.
> + * Returns true if this record was already in the database, false if it
> was
> + * just inserted.
>   */
> -static const struct sbrec_chassis *
> +static bool
>  chassis_get_record(struct ovsdb_idl_txn *ovnsb_idl_txn,
> struct ovsdb_idl_index *sbrec_chassis_by_name,
> const struct sbrec_chassis_table *chassis_table,
> const struct ovs_chassis_cfg *ovs_cfg,
> -   const char *chassis_id)
> +   const char *chassis_id,
> +   const struct sbrec_chassis **chassis_rec)
>  {
> -const struct sbrec_chassis *chassis_rec;
> -
>  if (chassis_info_id_inited(&chassis_state)) {
> -chassis_rec = chassis_lookup_by_name(sbrec_chassis_by_name,
> -
>  chassis_info_id(&chassis_state));
> -if (!chassis_rec) {
> +*chassis_rec = chassis_lookup_by_name(sbrec_chassis_by_name,
> +
> chassis_info_id(&chassis_state));
> +if (!(*chassis_rec)) {
>  VLOG_DBG("Could not find Chassis, will create it"
>   ": stored (%s) ovs (%s)",
>   chassis_info_id(&chassis_state), chassis_id);
>  if (ovnsb_idl_txn) {
>  /* Recreate the chassis record.  */
> -chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
> +*chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
> +return false;
>  }
>  }
>  } else {
> -chassis_rec =
> +*chassis_rec =
>  chassis_get_stale_record(chassis_table, ovs_cfg, chassis_id);
>
> -if (!chassis_rec && ovnsb_idl_txn) {
> -chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
> +if (!(*chassis_rec) && ovnsb_idl_txn) {
> +*chassis_rec = sbrec_chassis_insert(ovnsb_idl_txn);
> +return false;
>  }
>  }
> -return chassis_rec;
> +return true;
>  }
>
> -/* Update a Chassis record based on the config in the ovs config. */
> -static void
> +/* Update a Chassis record based on the config in the ovs config.
> + * Returns true if 'chassis_rec' was updated, false otherwise.
> + */
> +static bool
>  chassis_update(const struct sbrec_chassis *chassis_rec,
> struct ovsdb_idl_txn *ovnsb_idl_txn,
> const struct ovs_chassis_cfg *ovs_cfg,
> const char *chassis_id,
> const struct sset *transport_zones)
>  {
> +bool updated = false;
> +
>  if (strcmp(chassis_id, chassis_rec->name)) {
>  sbrec_chassis_set_name(chassis_rec, chassis_id);
> +updated = true;
>  }
>
>  if (strcmp(ovs_cfg->hostname, chassis_rec->hostname)) {
>  sbrec_chassis_set_hostname(chassis_rec, ovs_cfg->hostname);
> +updated = true;
>  }
>
>  if (chassis_other_config_changed(ovs_cfg->bridge_mappings,
> @@ -561,6 +572,7 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
>   * systems, this behavior should be removed in the future. */
>  sbrec_chassis_set_external_ids(chassis_rec, &other_config);
>  smap_destroy(&other_config);
> +updated = true;
>  }
>
>  update_chassis_transport_zones(transport_zones, chassis_rec);
> @@ -571,7 +583,7 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
>  &ovs_cfg->encap_ip_set,
> ovs_cfg->encap_csum,
>  chassis_rec);
>  if (!tunnels_changed) {
> -return;
> +return updated;
>  }
>
>  struct sbrec_encap **encaps;
> @@ -583,6 +595,7 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
>   ovs_cfg->encap_csum, &n_encap);
>  sbrec_chas

Re: [ovs-dev] [PATCH] jsonrpc: Don't assert for 0 remotes in jsonrpc_session_open_multiple().

On Sat, Jun 27, 2020 at 1:17 AM Ben Pfaff  wrote:

> It's pretty easy to get 0 remotes here from ovn-northd if you specify
> --ovnnb-db='' or --ovnnb-db='   ' on the command line.  The internals
> of jsonrpc_session aren't equipped to cope with that, so just add a
> dummy remote instead.
>
> Signed-off-by: Ben Pfaff 
>

Acked-by: Numan Siddique 

Thanks
Numan


> ---
>  lib/jsonrpc.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c
> index 060b6669d893..10c0c09962da 100644
> --- a/lib/jsonrpc.c
> +++ b/lib/jsonrpc.c
> @@ -825,8 +825,10 @@ jsonrpc_session_open_multiple(const struct svec
> *remotes, bool retry)
>  s = xmalloc(sizeof *s);
>
>  /* Set 'n' remotes from 'names'. */
> -ovs_assert(remotes->n > 0);
>  svec_clone(&s->remotes, remotes);
> +if (!s->remotes.n) {
> +svec_add(&s->remotes, "invalid:");
> +}
>  s->next_remote = 0;
>
>  s->reconnect = reconnect_create(time_msec());
> --
> 2.26.2
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] ovn-northd: Make it harder to specify a bad database remote.