Re: [ovs-dev] [PATCH ovn] utilities: Do not send 'set-db-change-aware' for non-daemon mode tools.

2023-01-16 Thread Han Zhou
On Tue, Jan 10, 2023 at 6:09 PM Han Zhou  wrote:
>
> This is to avoid annoying error logs from NB/SB/IC DB servers.
>
> Reported-by: Girish Moodalbail 
> Reported-at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2020-July/050343.html
> Signed-off-by: Han Zhou 
> ---
> This patch depends on the pending OVSDB patch:
>
https://patchwork.ozlabs.org/project/openvswitch/patch/20230111000756.4054163-1-hz...@ovn.org/

The above dependent OVS patch was merged. I just realized that this OVN
patch needs to bump the OVS submodule to use the new API.
Once it is Acked I will include the OVS submodule version change before
merging it.

Thanks,
Han

>
>  utilities/ovn-dbctl.c| 2 ++
>  utilities/ovn-ic-nbctl.c | 4 +++-
>  utilities/ovn-ic-sbctl.c | 4 +++-
>  3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/utilities/ovn-dbctl.c b/utilities/ovn-dbctl.c
> index a850c2f31709..f0ee0ba053d3 100644
> --- a/utilities/ovn-dbctl.c
> +++ b/utilities/ovn-dbctl.c
> @@ -188,6 +188,8 @@ ovn_dbctl_main(int argc, char *argv[],
>  idl = the_idl =
ovsdb_idl_create_unconnected(dbctl_options->idl_class,
>   daemon_mode);
>  ovsdb_idl_set_shuffle_remotes(idl, shuffle_remotes);
> +/* "set_db_change_aware" is true iff in daemon mode. */
> +ovsdb_idl_set_db_change_aware(idl, daemon_mode);
>  /* "retry" is true iff in daemon mode. */
>  ovsdb_idl_set_remote(idl, db, daemon_mode);
>  ovsdb_idl_set_leader_only(idl, leader_only);
> diff --git a/utilities/ovn-ic-nbctl.c b/utilities/ovn-ic-nbctl.c
> index 63550d492fb5..f3d8039a872c 100644
> --- a/utilities/ovn-ic-nbctl.c
> +++ b/utilities/ovn-ic-nbctl.c
> @@ -112,7 +112,9 @@ main(int argc, char *argv[])
>  ctl_timeout_setup(timeout);
>
>  /* Initialize IDL. */
> -idl = the_idl = ovsdb_idl_create(db, _idl_class, true,
false);
> +idl = the_idl = ovsdb_idl_create_unconnected(_idl_class,
true);
> +ovsdb_idl_set_remote(idl, db, false);
> +ovsdb_idl_set_db_change_aware(idl, false);
>  ovsdb_idl_set_leader_only(idl, leader_only);
>  run_prerequisites(commands, n_commands, idl);
>
> diff --git a/utilities/ovn-ic-sbctl.c b/utilities/ovn-ic-sbctl.c
> index 39fbb83057b0..3060b48b9a5c 100644
> --- a/utilities/ovn-ic-sbctl.c
> +++ b/utilities/ovn-ic-sbctl.c
> @@ -111,7 +111,9 @@ main(int argc, char *argv[])
>  ctl_timeout_setup(timeout);
>
>  /* Initialize IDL. */
> -idl = the_idl = ovsdb_idl_create(db, _idl_class, true,
false);
> +idl = the_idl = ovsdb_idl_create_unconnected(_idl_class,
true);
> +ovsdb_idl_set_remote(idl, db, false);
> +ovsdb_idl_set_db_change_aware(idl, false);
>  ovsdb_idl_set_leader_only(idl, leader_only);
>  run_prerequisites(commands, n_commands, idl);
>
> --
> 2.30.2
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ci: ovn-kubernetes: Use Fedora 37 as base image.

2023-01-16 Thread Ales Musil
On Mon, Jan 16, 2023 at 5:45 PM Dumitru Ceara  wrote:

> Fedora 35 is EOL:
> https://docs.fedoraproject.org/en-US/releases/eol/
>
> Signed-off-by: Dumitru Ceara 
> ---
>  .ci/ovn-kubernetes/Dockerfile | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/.ci/ovn-kubernetes/Dockerfile b/.ci/ovn-kubernetes/Dockerfile
> index 7edf86a13a..1cb26d0d0e 100644
> --- a/.ci/ovn-kubernetes/Dockerfile
> +++ b/.ci/ovn-kubernetes/Dockerfile
> @@ -1,7 +1,7 @@
>  ARG OVNKUBE_COMMIT=master
>  ARG LIBOVSDB_COMMIT=8081fe24e48f
>
> -FROM fedora:35 AS ovnbuilder
> +FROM fedora:37 AS ovnbuilder
>
>  USER root
>
> @@ -64,7 +64,7 @@ COPY --from=ovnbuilder /tmp/ovn/ovn-sb.ovsschema
> pkg/sbdb/ovn-sb.ovsschema
>  RUN go generate ./pkg/nbdb && go generate ./pkg/sbdb && make
>
>  # Build the final image
> -FROM fedora:35
> +FROM fedora:37
>
>  # install needed dependencies
>  RUN INSTALL_PKGS=" \
> --
> 2.31.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Looks good to me, thanks.

Acked-by: Ales Musil 

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.comIM: amusil

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVS "soft freeze" for 3.1 is in effect.

2023-01-16 Thread Han Zhou
On Mon, Jan 16, 2023 at 7:41 AM Eelco Chaudron  wrote:
>
>
>
> On 16 Jan 2023, at 14:05, Ilya Maximets wrote:
>
> > On 1/12/23 20:38, Dumitru Ceara wrote:
> >> On 1/12/23 19:13, Han Zhou wrote:
> >>> On Mon, Jan 2, 2023 at 3:08 PM Ilya Maximets 
wrote:
> 
>  Hi.  As described in Documentation/internals/release-process.rst, we
are
>  in a "soft freeze" state:
> 
> During the freeze, we ask committers to refrain from applying
patches
> >>> that
> add new features unless those patches were already being publicly
> >>> discussed
> and reviewed before the freeze began.  Bug fixes are welcome at
any
> >>> time.
> Please propose and discuss exceptions on ovs-dev.
> 
>  We should branch for version 3.1 in two weeks from now, on Monday,
Jan 16.
>  With that, release should be on Thursday, Feb 16.
> 
>  Best regards, Ilya Maximets.
>  ___
>  dev mailing list
>  d...@openvswitch.org
>  https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>>
> >>> Hi Ilya,
> >>>
> >>> Here is a list of patches I'd like to bring to your attention:
> >>>
> >>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=336342
> >>> This series is new, but it provides an IDL API to avoid a problem, so
it
> >>> falls in between "bug fixes" and "new features". Please see if it can
be
> >>> included in the release.
> >>>
> >>
> >> It would be nice indeed if this could make it into 3.1.  It's contained
> >> enough and only changes behavior when explicitly requested.  On the
> >> other hand it solves the (benign) error message that often raises
questions.
> >
> > OK.  Added this to my queue.
> >
Thanks for applying.

> >>
> >>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=313042
> >>> This series was posted in Aug. We have been using it downstream since
then.
> >>> There was feedback from Adrian and he agreed with the first patch, so
> >>> hopefully at least this one can catch the release.
> >
> > There is an ongoing discussion about the issue here:
> >
https://patchwork.ozlabs.org/project/openvswitch/patch/167274555298.379205.1390528419336184506.stgit@ebuild/
> >
> > The issue is that the mechanism of removing flows without revalidation
> > should not kick in in the first place.  We need more data on where and
> > why exactly revalidator threads are getting stuck.  I'm not comfortable
> > applying workarounds without RCA.

I think my patch is different. It doesn't remove flows without
revalidation. On the contrary, it allows users to avoid the behavior of
"removing flows without revalidation" completely. And it is not related to
offloading.
Probably the word "disabled" in my commit title was misleading. I sent v2
with title modified and better documentation, hopefully avoiding confusions.

> >
> >>> For the second patch, Adrian mentioned about a patch came later that
is
> >>> related. Both patches are useful but it would be better to be aligned
with
> >>> each other for the counter names. Could you suggest how to proceed
with it?
> >
> > I would suggest asking Eelco for review on this patch.  I agree that
> > coverage counters are orthogonal to USDT probes.  If necessary, we
> > could get them after branching as there are no changes in the logic
> > and the patch is very simple.  But, yes, we need to agree on naming.
> > I hope, Eelco and Adrian can help with that.
>
> Replied to the patches. I think the first patch might need some
documentation to describe the 0 value. The second patch would be nice if it
could be aligned with Kevin’s. However, I do not think Kevin will address
his patch in the 3.1 timeframe, maybe we can take the code changes only,
and not his script, and then Han can add coverage counters based on that
patch?
>
Thanks!

> >>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=312997
> >>> This one was also posted in Aug. The first patch got an ACK, and the
second
> >>> one is still waiting for feedback. Could you comment on it, too?
> >
> > We can get this as a bug fix later, as it seems to be a bug fix more
> > than a new feature.  But I had no chance to properly review it.
> >
Thanks!

> > Best regards, Ilya Maximets.
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] ofproto-dpif-upcall.c: Add counters for reasons of DP flow deletions.

2023-01-16 Thread Han Zhou
On Mon, Jan 16, 2023 at 7:36 AM Eelco Chaudron  wrote:
>
>
>
> On 18 Oct 2022, at 18:11, Adrian Moreno wrote:
>
> > On 10/14/22 20:07, Han Zhou wrote:
> >>
> >>
> >> On Thu, Oct 13, 2022 at 11:51 PM Adrian Moreno mailto:amore...@redhat.com>> wrote:
> >>>
> >>>
> >>>
> >>> On 8/9/22 03:41, Han Zhou wrote:
>  Signed-off-by: Han Zhou mailto:hz...@ovn.org>>
>  ---
>    ofproto/ofproto-dpif-upcall.c | 13 -
>    1 file changed, 12 insertions(+), 1 deletion(-)
> 
>  diff --git a/ofproto/ofproto-dpif-upcall.c
b/ofproto/ofproto-dpif-upcall.c
>  index 23b52ef40..f26a035f4 100644
>  --- a/ofproto/ofproto-dpif-upcall.c
>  +++ b/ofproto/ofproto-dpif-upcall.c
>  @@ -58,6 +58,10 @@ COVERAGE_DEFINE(upcall_ukey_replace);
>    COVERAGE_DEFINE(revalidate_missed_dp_flow);
>    COVERAGE_DEFINE(upcall_flow_limit_hit);
>    COVERAGE_DEFINE(upcall_flow_limit_kill);
>  +COVERAGE_DEFINE(upcall_flow_del_rev);
>  +COVERAGE_DEFINE(upcall_flow_del_no_rev);
>  +COVERAGE_DEFINE(upcall_flow_del_idle_or_limit);
>  +COVERAGE_DEFINE(upcall_flow_del_purge);
> 
>    /* A thread that reads upcalls from dpif, forwards each upcall's
packet,
> * and possibly sets up a kernel flow as a cache. */
>  @@ -2334,7 +2338,12 @@ revalidate_ukey(struct udpif *udpif, struct
udpif_key *ukey,
>    }
>    result = revalidate_ukey__(udpif, ukey, push.tcp_flags,
>   odp_actions, recircs,
ukey->xcache);
>  -} /* else delete; too expensive to revalidate */
>  +if (result == UKEY_DELETE) {
>  +COVERAGE_INC(upcall_flow_del_rev);
>  +}
>  +} else { /* else delete; too expensive to revalidate */
>  +COVERAGE_INC(upcall_flow_del_no_rev);
>  +}
>    } else if (!push.n_packets || ukey->xcache
>   || !populate_xcache(udpif, ukey, push.tcp_flags)) {
>    result = UKEY_KEEP;
>  @@ -2771,6 +2780,7 @@ revalidate(struct revalidator *revalidator)
>    }
>    if (kill_them_all || (used && used < now - max_idle)) {
>    result = UKEY_DELETE;
>  +COVERAGE_INC(upcall_flow_del_idle_or_limit);
>    } else {
>    result = revalidate_ukey(udpif, ukey, ,
_actions,
> reval_seq, ,
>  @@ -2852,6 +2862,7 @@ revalidator_sweep__(struct revalidator
*revalidator, bool purge)
> 
>    if (purge) {
>    result = UKEY_DELETE;
>  +COVERAGE_INC(upcall_flow_del_purge);
>    } else if (!seq_mismatch) {
>    result = UKEY_KEEP;
>    } else {
> >>>
> >>>
> >>> This has some overlap with the work Kevin has been doing [1].
> >>>
> >>>
https://patchwork.ozlabs.org/project/openvswitch/patch/20221003200742.78047-1-ksprague0...@gmail.com/
<
https://patchwork.ozlabs.org/project/openvswitch/patch/20221003200742.78047-1-ksprague0...@gmail.com/
>
> >>>
> >>> Maybe this patch can be combined with Kevin's to provide unified
visibility
> >>> (coverage counters + USDT probes) to the revalidation results?
> >>>
> >> Thanks Adrian for reviewing! This has been there for 2 months and I
was about to send a reminder.
> >> And thanks for the pointer to the related patch. These two patches are
for similar purposes, while they provide different ways to access the
information, and I think both are useful. The coverage counters added here
try to provide stats of basic DP flow deletion categories, while USDT
probes may provide more details, so I am not sure whether these two need to
match to each other exactly. But I agree that at least for the same del
reasons these two patches should use the same names to avoid confusion.
> >> Do you have any detailed comments regarding the counters in this patch?
> >>
> >
> > I think the counters added by this patch are useful. I just think (as
you mention above) that the reason names should be unified with Kevin's
work so we have both USDT and counters reporting revalidator decisions.
>
> I was adding a lot of changes above on naming, but I think it would be
better to maybe merge his patch with yours?!
>
> However, as his patch is waiting for a v7. Maybe you can take his code
changes without the script, and add some counters @ the probe location. As
I’m fine with his patch, except for the python script.
>
> Latest Kevin patch,
https://patchwork.ozlabs.org/project/openvswitch/patch/20221021163543.508906-1-ksprague0...@gmail.com/
>
>
Thanks Eelco. To make it simpler, I'd wait for Kevin's patch to be merged
and I can rebase this one on top of it. I have provided some minor comments
on Kevin's patch, too. Please take a look.

Regards,
Han
___

Re: [ovs-dev] [PATCH 1/2] Revalidator: Allow min-revalidator-pps to be 0 (disabled).

2023-01-16 Thread Han Zhou
On Mon, Jan 16, 2023 at 7:14 AM Eelco Chaudron  wrote:
>
>
>
> On 18 Oct 2022, at 18:08, Adrian Moreno wrote:
>
> > On 10/14/22 19:49, Han Zhou wrote:
> >>
> >>
> >> On Thu, Oct 13, 2022 at 11:43 PM Adrian Moreno mailto:amore...@redhat.com>> wrote:
> >>>
> >>> Hi Han.
> >>>
> >>> On 8/9/22 03:40, Han Zhou wrote:
>  Today the minimum value for this setting is 1. This patch allows it
to
>  be 0, meaning not checking pps at all, and always do revalidation.
> 
>  This is particularly useful for environments where some of the
>  applications with long-lived connections may have very low traffic
for
>  certain period but have high rate of burst periodically. It is
desirable
>  to keep the datapath flows instead of periodically deleting them to
>  avoid burst of packet miss to userspace.
> 
>  When setting to 0, there may be more datapath flows to be
revalidated,
>  resulting in higher CPU cost of revalidator threads. This is the
>  downside but in certain cases this is still more desirable than
packet
>  misses to user space.
> 
> >>> I am trying to quantify this CPU cost. Do you have any numbers so we
understand
> >>> the effects of disabling pps-based evictions?
> >>>
> >> Hi Adrian,
> >>
> >> Thanks for reviewing. First of all, the CPU cost is added to
revalidator threads only, but saving cost for the handler threads which is
more critical to the packet processing.
> >> Now regarding the actual CPU cost of revalidator threads, the extra
cost really depends on the traffic pattern - how many long lived DP flows
are there with pps smaller than . The more such DP
flows, the more revalidtor CPU, of course. We don't see a problem when
there are 10k DP flows when setting min-revalidator-pps to 0, and this is
on a DPU with small number of less powerful ARM cores. It also depends on
whether it is a dedicated network node/DPU where it is ok for OVS to take a
significant portion of the CPU or it is a compute node where more cores are
supposed to be reserved for applications.
> >> In any case, this is just an option and totally depends on user
settings according to their use case, as explained in the commit message.
> >>
> >
> > I agree. There are many factors that come into play. My concern is that
we have so many knobs that tweak the revalidator/handler load distribution
that depend on so many external factors that it's quite complicated to
determine when to recommend their use.
> >
> > That's why I wanted to have some ballpark numbers to get an intuition
of the potential side effects.
> >
> > Having said that, I think this particular proposal is beneficial as I
also have seen drops caused by long-lived connections.
>
> I do not see anything wrong with adding this feature, however, the
documentation needs to be clear on what 0 means, i.e. does it result in
always revalidate or never revalidate?

Thanks Eelco. I thought that min-revalidate-pps=0 naturally means
revalidating all flows, because pps > 0 would include all flows. However,
probably the "disable" in my commit title was confusing, and some may think
it was trying to disable the revalidation (while I meant disable the
option).
So, I updated the title and also updated the document to avoid confusion in
v2. PTAL:
https://patchwork.ozlabs.org/project/openvswitch/patch/20230117030129.1447363-1-hz...@ovn.org/

Thanks,
Han

>
>
> //Eelco
>
>
>  Signed-off-by: Han Zhou mailto:hz...@ovn.org>>
>  ---
>    ofproto/ofproto-dpif-upcall.c | 4 
>    ofproto/ofproto.c | 2 +-
>    vswitchd/vswitch.xml  | 2 +-
>    3 files changed, 6 insertions(+), 2 deletions(-)
> 
>  diff --git a/ofproto/ofproto-dpif-upcall.c
b/ofproto/ofproto-dpif-upcall.c
>  index 57f94df54..23b52ef40 100644
>  --- a/ofproto/ofproto-dpif-upcall.c
>  +++ b/ofproto/ofproto-dpif-upcall.c
>  @@ -2079,6 +2079,10 @@ should_revalidate(const struct udpif *udpif,
uint64_t packets,
>    {
>    long long int metric, now, duration;
> 
>  +if (!ofproto_min_revalidate_pps) {
>  +return true;
>  +}
>  +
>    if (!used) {
>    /* Always revalidate the first time a flow is dumped. */
>    return true;
>  diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
>  index 3a527683c..259fad4b5 100644
>  --- a/ofproto/ofproto.c
>  +++ b/ofproto/ofproto.c
>  @@ -723,7 +723,7 @@ ofproto_set_max_revalidator(unsigned
max_revalidator)
>    void
>    ofproto_set_min_revalidate_pps(unsigned min_revalidate_pps)
>    {
>  -ofproto_min_revalidate_pps = min_revalidate_pps ?
min_revalidate_pps : 1;
>  +ofproto_min_revalidate_pps = min_revalidate_pps;
>    }
> 
>    /* If forward_bpdu is true, the NORMAL action will forward frames
with
>  diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>  index 4a9284f6b..c1296 100644
>  --- a/vswitchd/vswitch.xml
>  +++ 

[ovs-dev] [PATCH v2] Revalidator: Allow min-revalidator-pps to be 0.

2023-01-16 Thread Han Zhou
Today the minimum value for this setting is 1. This patch allows it to
be 0, meaning not checking pps at all, and always do revalidation.

This is particularly useful for environments where some of the
applications with long-lived connections may have very low traffic for
certain period but have high rate of burst periodically. It is desirable
to keep the datapath flows instead of periodically deleting them to
avoid burst of packet miss to userspace.

When setting to 0, there may be more datapath flows to be revalidated,
resulting in higher CPU cost of revalidator threads. This is the
downside but in certain cases this is still more desirable than packet
misses to user space.

Signed-off-by: Han Zhou 
---
v2: Addresses Eelco's comment - update document of the option.
 ofproto/ofproto-dpif-upcall.c | 4 
 ofproto/ofproto.c | 2 +-
 vswitchd/vswitch.xml  | 3 ++-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index ad96354966f1..442141ccd91d 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -2099,6 +2099,10 @@ should_revalidate(const struct udpif *udpif, uint64_t 
packets,
 {
 long long int metric, now, duration;
 
+if (!ofproto_min_revalidate_pps) {
+return true;
+}
+
 if (!used) {
 /* Always revalidate the first time a flow is dumped. */
 return true;
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 17f636ed9dc5..e4a1bee769d0 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -724,7 +724,7 @@ ofproto_set_max_revalidator(unsigned max_revalidator)
 void
 ofproto_set_min_revalidate_pps(unsigned min_revalidate_pps)
 {
-ofproto_min_revalidate_pps = min_revalidate_pps ? min_revalidate_pps : 1;
+ofproto_min_revalidate_pps = min_revalidate_pps;
 }
 
 /* If forward_bpdu is true, the NORMAL action will forward frames with
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 8c4acfb1817f..90174e57fb13 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -205,10 +205,11 @@
   
 
   
+  type='{"type": "integer", "minInteger": 0}'>
 
   Set minimum pps that flow must have in order to be revalidated when
   revalidation duration exceeds half of max-revalidator config 
variable.
+  Setting to 0 means always revalidate flows regardless of pps.
 
 
   The default is 5.
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6] revalidator: add a USDT probe after evaluation when flows are deleted.

2023-01-16 Thread Han Zhou
On Fri, Oct 21, 2022 at 9:36 AM Kevin Sprague 
wrote:
>
> During normal operations, it is useful to understand when a particular
flow
> gets removed from the system. This can be useful when debugging
performance
> issues tied to ofproto flow changes, trying to determine deployed traffic
> patterns, or while debugging dynamic systems where ports come and go.
>
> Prior to this change, there was a lack of visibility around flow
expiration.
> The existing debugging infrastructure could tell us when a flow was added
to
> the datapath, but not when it was removed or why.
>
> This change introduces a USDT probe at the point where the revalidator
> determines that the flow should be removed.  Additionally, we track the
> reason for the flow eviction and provide that information as well.  With
> this change, we can track the complete flow lifecycle for the netlink
datapath
> by hooking the upcall tracepoint in kernel, the flow put USDT, and the
> revaldiator USDT, letting us watch as flows are added and removed from the
> kernel datapath.
>
> This change only enables this information via USDT probe, so it won't be
> possible to access this information any other way (see:
> Documentation/topics/usdt-probes.rst).
>
> Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py)
that
> serves as a demonstration of how the new USDT probe might be used going
> forward.
>
> Change since v5: fixed author information.
>
> Signed-off-by: Kevin Sprague 
> ---
>  Documentation/topics/usdt-probes.rst |  23 +
>  ofproto/ofproto-dpif-upcall.c|  45 +-
>  utilities/automake.mk|   2 +
>  utilities/usdt-scripts/flow_reval_monitor.py | 607 +++
>  4 files changed, 671 insertions(+), 6 deletions(-)
>  create mode 100755 utilities/usdt-scripts/flow_reval_monitor.py
>
> diff --git a/Documentation/topics/usdt-probes.rst
b/Documentation/topics/usdt-probes.rst
> index 7ce19aaed..45cbc0d89 100644
> --- a/Documentation/topics/usdt-probes.rst
> +++ b/Documentation/topics/usdt-probes.rst
> @@ -214,6 +214,7 @@ Available probes in ``ovs_vswitchd``:
>  - dpif_recv:recv_upcall
>  - main:poll_block
>  - main:run_start
> +- revalidate:flow_result
>
>
>  dpif_netlink_operate\_\_:op_flow_del
> @@ -294,6 +295,7 @@ DPIF_OP_FLOW_PUT operation as part of the dpif
``operate()`` callback.
>
>  **Script references**:
>
> +- ``utilities/usdt-scripts/flow_reval_monitor.py``
>  - ``utilities/usdt-scripts/upcall_cost.py``
>
>
> @@ -358,6 +360,27 @@ See also the ``main:run_start`` probe above.
>  - ``utilities/usdt-scripts/bridge_loop.bt``
>
>
> +probe revalidate:flow_result
> +~
> +
> +**Description**:
> +This probe is triggered when the revalidator decides whether or not to
> +revalidate a flow. ``reason`` is an enum that denotes that either the
flow
> +is being kept, or the reason why the flow is being deleted. The
> +``flow_reval_monitor.py`` script uses this probe to notify users when
flows
> +matching user-provided criteria are deleted.
> +
> +**Arguments**:
> +
> +- *arg0*: ``(enum flow_del_reason) reason``
> +- *arg1*: ``(struct udpif *) udpif``
> +- *arg2*: ``(struct udpif_key *) ukey``
> +
> +**Script references**:
> +
> +- ``utilities/usdt-scripts/flow_reval_monitor.py``
> +
> +
>  Adding your own probes
>  --
>
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index ad9635496..dc0ecc1da 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -31,6 +31,7 @@
>  #include "openvswitch/list.h"
>  #include "netlink.h"
>  #include "openvswitch/ofpbuf.h"
> +#include "openvswitch/usdt-probes.h"
>  #include "ofproto-dpif-ipfix.h"
>  #include "ofproto-dpif-sflow.h"
>  #include "ofproto-dpif-xlate.h"
> @@ -260,6 +261,18 @@ enum ukey_state {
>  };
>  #define N_UKEY_STATES (UKEY_DELETED + 1)
>

Hi Kevin, Eelco,

I have some minor comments regarding the namings, since I am revisiting and
trying to align:
https://patchwork.ozlabs.org/project/openvswitch/patch/20220809014100.3716850-2-hz...@ovn.org/

>
> +enum flow_del_reason {
> +FDR_FLOW_LIVE = 0,  /* The flow was revalidated. */

How about FDR_REVALIDATE? LIVE looks a little confusing as a reason to
delete. FLOW is also redundant because FDR means Flow Delete Reason already.

> +FDR_FLOW_TIME_OUT,  /* The flow went unused and was deleted. */

It is really about *idle* timeout, so would it be more clear to be:
FDR_IDLE?

> +FDR_TOO_EXPENSIVE,  /* The flow was too expensive to revalidate.
*/
> +FDR_FLOW_WILDCARDED,/* The flow needed a narrower wildcard mask.
*/
> +FDR_BAD_ODP_FIT,/* The flow had a bad ODP flow fit. */
> +FDR_ASSOCIATED_OFPROTO, /* The flow didn't have an associated
ofproto. */

Would FDR_NO_OFPROTO be more clear (and concise)?

> +FDR_XLATION_ERROR,  /* There was an error translating the flow.
*/
> +FDR_AVOID_CACHING,  /* Flow deleted to avoid caching. */
> 

Re: [ovs-dev] [PATCH ovn] doc: Explicitly mention the current LTS release.

2023-01-16 Thread Mark Michelson

On 1/13/23 03:01, Dumitru Ceara wrote:

On 1/12/23 20:59, Han Zhou wrote:

On Thu, Jan 12, 2023 at 9:50 AM Dumitru Ceara  wrote:


Until now we only mentioned it on the ovn.org releases page:
https://www.ovn.org/en/releases/

Signed-off-by: Dumitru Ceara 
---
  Documentation/internals/release-process.rst | 1 +
  1 file changed, 1 insertion(+)

diff --git a/Documentation/internals/release-process.rst

b/Documentation/internals/release-process.rst

index 13a22fa60b..d75dd3bb91 100644
--- a/Documentation/internals/release-process.rst
+++ b/Documentation/internals/release-process.rst
@@ -93,6 +93,7 @@ LTS releases are scheduled to be released once every

two years. This means

  that any given LTS will receive bug fix support for two years, followed

by

  one year of critical bug fixes and security fixes.

+The current LTS release is 22.03.x.

  Release Numbering
  -
--
2.31.1


Acked-by: Han Zhou 



Thanks for the review!


Alternatively we can put a link to the https://www.ovn.org/en/releases/,


I think it's good to have that link anyway.  I'll add it in v2.


instead of updating the "process" document every time a new LTS release is
created.


The releases page is also updated manually so someone still needs to
push a commit there too, e.g.:

https://github.com/ovn-org/ovn-website/commit/0c1fd615275cbb6cb9213a754f731460aceb5d3a


But I am ok either way.



Mark, Numan, Ilya, do you have preferences?  I'll wait a bit before
posting v2.


Before addressing this particular change, I'd like to point out that I 
have an outstanding PR for ovn-website that adds release notes for the 
recent 22.03.2, 22.06.1, and 22.09.1 releases. It also updates the 
"Releases" page to have an LTS section. See here: 
https://github.com/ovn-org/ovn-website/pull/49 .


I'm of the opinion that the more mentions of the LTS, the better. But 
also we don't want to be in a position where it's easy to forget to 
update certain documents when the LTS changes.


I wonder if it would be worth having a permanent link to something like 
https://ovn.org/en/releases/long-term-support that always contains 
information about the current LTS. This way, documentation could always 
link to this page, and we only have to update this page when LTS 
information changes.





Thanks,
Han



Thanks,
Dumitru



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd, controller: Commit flows dropped by ACLs to conntrack

2023-01-16 Thread Mark Michelson

Hello Abhiram,

I haven't taken a close look at every line of the series, but I have two 
high-level questions/observations.


First, what is the benefit of using this compared to ACL logging or drop 
sampling?


Second, I think that if we are to accept this patch, the behavior needs 
to be optional, and it needs to be opt-in. In other words, someone needs 
to explicitly enable the behavior on the logical switch in order to 
commit dropped connections to CT. If the goal is to use this as a 
debugging tool, it should only be enabled when attempting to debug. I'm 
not knowledgeable about the inner workings of CT, but committing every 
dropped connection to CT sounds like a vector for a DOS exploit to me. 
Someone else can comment in case I'm completely wrong here.


Thanks,
Mark Michelson

On 1/13/23 07:44, Abhiram Sangana wrote:

This patch commits connections dropped/rejected by ACLs with label
(introduced in 0e0228be (northd: Add ACL label)) to the connection
tracking table. The dropped connections are committed in a separate
conntrack zone so that they can be managed independently and do not
interact with the connection tracking state of allowed connections.

Each logical switch is assigned a new conntrack zone for committing
dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
A new lflow action "ct_commit_drop" is introduced that commits flows
to connection tracking table in a zone identified by
MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
and non-empty label translates to include "ct_commit_drop" in its
actions instead of simply dropping/rejecting the packet.

This provides a new approach to identify connections dropped by ACLs
besides the existing ACL logging and drop sampling approaches.

Signed-off-by: Abhiram Sangana 
---
  controller/ovn-controller.c  |  14 +++--
  controller/physical.c|  32 ++-
  include/ovn/actions.h|   1 +
  include/ovn/logical-fields.h |   1 +
  lib/actions.c|  65 ++
  lib/ovn-util.c   |   4 +-
  lib/ovn-util.h   |   2 +-
  northd/northd.c  |  19 ++-
  northd/ovn-northd.8.xml  |  18 ++-
  ovn-nb.xml   |   8 ++-
  ovn-sb.xml   |  22 
  tests/ovn-nbctl.at   |  10 ++--
  tests/ovn-northd.at  | 102 +--
  tests/ovn.at |  90 ++-
  utilities/ovn-nbctl.c|   7 ---
  15 files changed, 323 insertions(+), 72 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 351cf1c5b..c19b56838 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -734,8 +734,8 @@ update_ct_zones(const struct shash *binding_lports,
  HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
  /* XXX Add method to limit zone assignment to logical router
   * datapaths with NAT */
-char *dnat = alloc_nat_zone_key(>datapath->header_.uuid, "dnat");
-char *snat = alloc_nat_zone_key(>datapath->header_.uuid, "snat");
+char *dnat = alloc_ct_zone_key(>datapath->header_.uuid, "dnat");
+char *snat = alloc_ct_zone_key(>datapath->header_.uuid, "snat");
  sset_add(_users, dnat);
  sset_add(_users, snat);
  
@@ -745,6 +745,14 @@ update_ct_zones(const struct shash *binding_lports,

  }
  free(dnat);
  free(snat);
+
+/* Zone for committing connections dropped by ACLs with labels. */
+if (ld->is_switch) {
+char *drop = alloc_ct_zone_key(
+>datapath->header_.uuid, "drop");
+sset_add(_users, drop);
+free(drop);
+}
  }
  
  /* Delete zones that do not exist in above sset. */

@@ -2415,7 +2423,7 @@ ct_zones_datapath_binding_handler(struct engine_node 
*node, void *data)
  /* Check if the requested snat zone has changed for the datapath
   * or not.  If so, then fall back to full recompute of
   * ct_zone engine. */
-char *snat_dp_zone_key = alloc_nat_zone_key(>header_.uuid, "snat");
+char *snat_dp_zone_key = alloc_ct_zone_key(>header_.uuid, "snat");
  struct simap_node *simap_node = simap_find(_zones_data->current,
 snat_dp_zone_key);
  free(snat_dp_zone_key);
diff --git a/controller/physical.c b/controller/physical.c
index 4dcf44e01..3c573c492 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -60,6 +60,7 @@ struct zone_ids {
  int ct; /* MFF_LOG_CT_ZONE. */
  int dnat;   /* MFF_LOG_DNAT_ZONE. */
  int snat;   /* MFF_LOG_SNAT_ZONE. */
+int drop;   /* MFF_LOG_ACL_DROP_ZONE. */
  };
  
  struct tunnel {

@@ -204,14 +205,18 @@ get_zone_ids(const struct sbrec_port_binding *binding,
  
  const struct uuid *key = 

Re: [ovs-dev] [PATCH v2 1/2] ovsdb-idl: Provide API to disable set_db_change_aware request.

2023-01-16 Thread Ilya Maximets
On 1/12/23 18:08, Han Zhou wrote:
> For ovsdb clients that are short-lived, e.g. when using
> ovn-nbctl/ovn-sbctl to read some metrics from the OVN NB/SB server, they
> don't really need to be aware of db changes, because they exit
> immediately after getting the initial response for the requested data.
> In such use cases, however, the clients still send 'set_db_change_aware'
> request, which results in server side error logs when the server tries
> to send out the response for the 'set_db_change_aware' request, because
> at the moment the client that is supposed to receive the request has
> already closed the connection and exited. E.g.:
> 
> 2023-01-10T18:23:29.431Z|7|jsonrpc|WARN|unix#3: receive error: Connection 
> reset by peer
> 2023-01-10T18:23:29.431Z|8|reconnect|WARN|unix#3: connection dropped 
> (Connection reset by peer)
> 
> To avoid such problems, this patch provides an API to allow a client to
> choose to not send the 'set_db_change_aware' request.
> 
> There was an earlier attempt to fix this [0], but it was not accepted
> back then as discussed in the email [1]. It was also discussed in the
> emails that an alternative approach is to use notification instead of
> request, but that would require protocol changes and taking backward
> compatibility into consideration. So this patch takes a different
> approach and tries to keep the change small.
> 
> [0] 
> http://patchwork.ozlabs.org/project/openvswitch/patch/1594380801-32134-1-git-send-email-dce...@redhat.com/
> 
> [1] 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2021-February/050919.html
> 
> Reported-by: Girish Moodalbail 
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2020-July/050343.html
> Reported-by: Tobias Hofmann 
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2021-February/050914.html
> Signed-off-by: Han Zhou 
> Acked-by: Dumitru Ceara 
> ---
> v2: Addressed Dumitru's comments.

Thanks, Han and Dumitru!  Applied (before branching).

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 0/2] Add extension for partial CT flush

2023-01-16 Thread Ilya Maximets
On 1/16/23 12:45, Ales Musil wrote:
> Add extension that will allow to flush
> connection from CT specified by full
> orig 5-tuple or partial match that combines
> orig or reply direction.
> 
> Ales Musil (2):
>   ofp, dpif: Allow CT flush based on partial match
>   openflow: Add extension to flush CT by generic match
> 
>  NEWS|   9 +
>  include/openflow/nicira-ext.h   |  37 +++
>  include/openvswitch/automake.mk |   1 +
>  include/openvswitch/ofp-ct.h|  80 +++
>  include/openvswitch/ofp-msgs.h  |   4 +
>  lib/automake.mk |   1 +
>  lib/ct-dpif.c   | 298 ++-
>  lib/ct-dpif.h   |   5 +-
>  lib/dpctl.c |  43 +++-
>  lib/dpctl.man   |  24 +-
>  lib/ofp-bundle.c|   1 +
>  lib/ofp-ct.c| 410 
>  lib/ofp-print.c |  20 ++
>  lib/rconn.c |   1 +
>  ofproto/ofproto-dpif.c  |   9 +-
>  ofproto/ofproto-provider.h  |   7 +-
>  ofproto/ofproto.c   |  29 ++-
>  tests/ofp-print.at  | 108 +
>  tests/ovs-ofctl.at  |  38 +++
>  tests/system-traffic.at | 120 +-
>  utilities/ovs-ofctl.8.in|  27 +++
>  utilities/ovs-ofctl.c   |  51 
>  22 files changed, 1170 insertions(+), 153 deletions(-)
>  create mode 100644 include/openvswitch/ofp-ct.h
>  create mode 100644 lib/ofp-ct.c
> 


Thanks!  I fixed a few other style issues, typos and inconsistencies
and applied the set before branching.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] OVS has now branched for the 3.1 release.

2023-01-16 Thread Ilya Maximets
Hi, everyone.

branch-3.1 was just created.  Please, test it and report issues!

Official release date, according to our release process, should be
on Thursday, February 16th (1 month from now).

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] Prepare for post-3.1.0 (3.1.90).

2023-01-16 Thread Ilya Maximets
On 1/16/23 18:12, Aaron Conole wrote:
> Ilya Maximets  writes:
> 
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> Acked-by: Aaron Conole 
> 


Thanks!  Applied.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] dpif-netdev: Optimize flushing of output packet buffers

2023-01-16 Thread Ilya Maximets
On 1/13/23 13:20, Dheeraj Kumar wrote:
> Problem Statement:
> Before OVS 2.12 the OVS-DPDK datapath transmitted processed rx packet batches
> directly to the wanted tx queues. In OVS 2.12 each PMD stores the processed
> packets in an intermediate buffer per output port and flushes these output
> buffers in a separate step. This buffering was introduced to allow better
> batching of packets for transmit.
> 
> The current implementation of the function that flushes the output buffers
> performs a full scan overall output ports, even if only one single packet
> was buffered for a single output port. In systems with hundreds of ports
> this can take a long time and degrades OVS-DPDK performance significantly.
> 
> Solution:
> Maintain a list of output ports with buffered packets for each PMD thread and
> only iterate over that list when flushing output buffers.
> 
> Signed-off-by: Dheeraj Kumar 
> ---
>  lib/dpif-netdev-private-thread.h |  7 ---
>  lib/dpif-netdev.c| 24 
>  2 files changed, 16 insertions(+), 15 deletions(-)

Hi, Dheeraj.  Thanks for the updated version.
Unfortunately, on a quick test with bidirectional traffic between
2 vhost-user ports I see a noticeable 4% performance degradation
with this change applied.  Could you, please, check why is that
happening or if we can avoid this?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1 1/2] ofproto-ipfix: use per-domain template timeouts

2023-01-16 Thread Mike Pattrick
On Mon, Jan 16, 2023 at 10:17 AM Adrián Moreno  wrote:
>
> From: Adrian Moreno 
>
> IPFIX templates have to be sent for each Observation Domain ID.
> Currently, a timer is kept at each dpif_ipfix_exporter to send them.
> This works fine for per-bridge sampling where there is only one
> Observation Domain ID per exporter. However, this is does not work for
> per-flow sampling where more than one Observation Domain IDs can be
> specified by the controller. In this case, ovs-vswitchd will only send
> template information for one (arbitrary) DomainID.
>
> Fix per-flow sampling by using an hmap to keep a timer for each
> Observation Domain ID.
>
> Signed-off-by: Adrian Moreno 
>
> ---
> - v1:
>   - Added unit test.
>   - Drop domain_id from "struct dpif_ipfix_domain" since the hashmap
>   already uses it as index.
>   - Added OVS_REQUES(mutex) to dpif_ipfix_exporter_find_domain.
> ---
>  ofproto/ofproto-dpif-ipfix.c | 127 ---
>  tests/system-traffic.at  |  51 ++
>  2 files changed, 155 insertions(+), 23 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-ipfix.c b/ofproto/ofproto-dpif-ipfix.c
> index 742eed399..fa71fda0f 100644
> --- a/ofproto/ofproto-dpif-ipfix.c
> +++ b/ofproto/ofproto-dpif-ipfix.c
> @@ -124,11 +124,18 @@ struct dpif_ipfix_port {
>  uint32_t ifindex;
>  };
>
> +struct dpif_ipfix_domain {
> +struct hmap_node hmap_node; /* In struct dpif_ipfix_exporter's domains. 
> */
> +time_t last_template_set_time;
> +};
> +
>  struct dpif_ipfix_exporter {
>  uint32_t exporter_id; /* Exporting Process identifier */
>  struct collectors *collectors;
>  uint32_t seq_number;
> -time_t last_template_set_time;
> +struct hmap domains; /* Contains struct dpif_ipfix_domain indexed by
> +observation domain id. */
> +time_t last_stats_sent_time;
>  struct hmap cache_flow_key_map;  /* ipfix_flow_cache_entry. */
>  struct ovs_list cache_flow_start_timestamp_list;  /* 
> ipfix_flow_cache_entry. */
>  uint32_t cache_active_timeout;  /* In seconds. */
> @@ -617,6 +624,9 @@ static void get_export_time_now(uint64_t *, uint32_t *);
>
>  static void dpif_ipfix_cache_expire_now(struct dpif_ipfix_exporter *, bool);
>
> +static void dpif_ipfix_exporter_del_domain(struct dpif_ipfix_exporter * ,
> +   struct dpif_ipfix_domain * );
> +
>  static bool
>  ofproto_ipfix_bridge_exporter_options_equal(
>  const struct ofproto_ipfix_bridge_exporter_options *a,
> @@ -697,13 +707,14 @@ dpif_ipfix_exporter_init(struct dpif_ipfix_exporter 
> *exporter)
>  exporter->exporter_id = ++exporter_total_count;
>  exporter->collectors = NULL;
>  exporter->seq_number = 1;
> -exporter->last_template_set_time = 0;
> +exporter->last_stats_sent_time = 0;
>  hmap_init(>cache_flow_key_map);
>  ovs_list_init(>cache_flow_start_timestamp_list);
>  exporter->cache_active_timeout = 0;
>  exporter->cache_max_flows = 0;
>  exporter->virtual_obs_id = NULL;
>  exporter->virtual_obs_len = 0;
> +hmap_init(>domains);
>
>  memset(>ipfix_global_stats, 0,
> sizeof(struct dpif_ipfix_global_stats));
> @@ -711,6 +722,7 @@ dpif_ipfix_exporter_init(struct dpif_ipfix_exporter 
> *exporter)
>
>  static void
>  dpif_ipfix_exporter_clear(struct dpif_ipfix_exporter *exporter)
> +OVS_REQUIRES(mutex)
>  {
>  /* Flush the cache with flow end reason "forced end." */
>  dpif_ipfix_cache_expire_now(exporter, true);
> @@ -719,22 +731,29 @@ dpif_ipfix_exporter_clear(struct dpif_ipfix_exporter 
> *exporter)
>  exporter->exporter_id = 0;
>  exporter->collectors = NULL;
>  exporter->seq_number = 1;
> -exporter->last_template_set_time = 0;
> +exporter->last_stats_sent_time = 0;
>  exporter->cache_active_timeout = 0;
>  exporter->cache_max_flows = 0;
>  free(exporter->virtual_obs_id);
>  exporter->virtual_obs_id = NULL;
>  exporter->virtual_obs_len = 0;
>
> +struct dpif_ipfix_domain *dom;
> +HMAP_FOR_EACH_SAFE (dom, hmap_node, >domains) {
> +dpif_ipfix_exporter_del_domain(exporter, dom);
> +}
> +
>  memset(>ipfix_global_stats, 0,
> sizeof(struct dpif_ipfix_global_stats));
>  }
>
>  static void
>  dpif_ipfix_exporter_destroy(struct dpif_ipfix_exporter *exporter)
> +OVS_REQUIRES(mutex)
>  {
>  dpif_ipfix_exporter_clear(exporter);
>  hmap_destroy(>cache_flow_key_map);
> +hmap_destroy(>domains);
>  }
>
>  static bool
> @@ -742,7 +761,7 @@ dpif_ipfix_exporter_set_options(struct 
> dpif_ipfix_exporter *exporter,
>  const struct sset *targets,
>  const uint32_t cache_active_timeout,
>  const uint32_t cache_max_flows,
> -const char *virtual_obs_id)
> +const char *virtual_obs_id) 
> OVS_REQUIRES(mutex)
>  {
>  

Re: [ovs-dev] [PATCH 2/2] Prepare for post-3.1.0 (3.1.90).

2023-01-16 Thread Aaron Conole
Ilya Maximets  writes:

> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] Prepare for 3.1.0.

2023-01-16 Thread Aaron Conole
Ilya Maximets  writes:

> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] controller: use packet proto for hairpin traffic learned action if not specified

2023-01-16 Thread Mark Michelson
Thanks Lorenzo and Ales. I pushed this change all the way back to 
branch-21.12


On 1/13/23 06:14, Ales Musil wrote:

On Wed, Jan 11, 2023 at 11:46 PM Lorenzo Bianconi <
lorenzo.bianc...@redhat.com> wrote:


Rely on IP protocol from the incoming packet for learn action
in table 68 if it has not specified in the related load-balancer.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2157846
Signed-off-by: Lorenzo Bianconi 
---
  controller/lflow.c  | 24 ++--
  tests/system-ovn.at | 16 
  2 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index bb47bb0c7..4b1cfe318 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1567,9 +1567,6 @@ add_lb_vip_hairpin_reply_action(struct in6_addr
*vip6, ovs_be32 vip,
  /* Hairpin replies have the same nw_proto as packets that created the
   * session.
   */
-union mf_value imm_proto = {
-.u8 = lb_proto,
-};
  ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
  ol_spec->dst.field = mf_from_id(MFF_IP_PROTO);
  ol_spec->src.field = mf_from_id(MFF_IP_PROTO);
@@ -1577,16 +1574,21 @@ add_lb_vip_hairpin_reply_action(struct in6_addr
*vip6, ovs_be32 vip,
  ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
  ol_spec->n_bits = ol_spec->dst.n_bits;
  ol_spec->dst_type = NX_LEARN_DST_MATCH;
-ol_spec->src_type = NX_LEARN_SRC_IMMEDIATE;
-mf_write_subfield_value(_spec->dst, _proto, );
-
-/* Push value last, as this may reallocate 'ol_spec' */
-imm_bytes = DIV_ROUND_UP(ol_spec->dst.n_bits, 8);
-src_imm = ofpbuf_put_zeros(ofpacts, OFPACT_ALIGN(imm_bytes));
-memcpy(src_imm, _proto, imm_bytes);

  /* Hairpin replies have source port == . */
  if (has_l4_port) {
+union mf_value imm_proto = {
+.u8 = lb_proto,
+};
+
+ol_spec->src_type = NX_LEARN_SRC_IMMEDIATE;
+mf_write_subfield_value(_spec->dst, _proto, );
+
+/* Push value last, as this may reallocate 'ol_spec' */
+imm_bytes = DIV_ROUND_UP(ol_spec->dst.n_bits, 8);
+src_imm = ofpbuf_put_zeros(ofpacts, OFPACT_ALIGN(imm_bytes));
+memcpy(src_imm, _proto, imm_bytes);
+
  ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
  switch (lb_proto) {
  case IPPROTO_TCP:
@@ -1610,6 +1612,8 @@ add_lb_vip_hairpin_reply_action(struct in6_addr
*vip6, ovs_be32 vip,
  ol_spec->n_bits = ol_spec->dst.n_bits;
  ol_spec->dst_type = NX_LEARN_DST_MATCH;
  ol_spec->src_type = NX_LEARN_SRC_FIELD;
+} else {
+ol_spec->src_type = NX_LEARN_SRC_FIELD;
  }

  /* Set MLF_LOOKUP_LB_HAIRPIN_BIT for hairpin replies. */
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 710cf22a2..a34aeb0fa 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -4649,10 +4649,12 @@ ovn-nbctl lb-add lb-ipv4-tcp 88.88.88.88:8080
42.42.42.1:4041 tcp
  ovn-nbctl lb-add lb-ipv4-tcp-dup 88.88.88.89:8080 42.42.42.1:4041 tcp
  ovn-nbctl lb-add lb-ipv4-udp 88.88.88.88:4040 42.42.42.1:2021 udp
  ovn-nbctl lb-add lb-ipv4-udp-dup 88.88.88.89:4040 42.42.42.1:2021 udp
+ovn-nbctl lb-add lb-ipv4 88.88.88.90 42.42.42.1
  ovn-nbctl ls-lb-add sw lb-ipv4-tcp
  ovn-nbctl ls-lb-add sw lb-ipv4-tcp-dup
  ovn-nbctl ls-lb-add sw lb-ipv4-udp
  ovn-nbctl ls-lb-add sw lb-ipv4-udp-dup
+ovn-nbctl ls-lb-add sw lb-ipv4

  ovn-nbctl lr-add rtr
  ovn-nbctl lrp-add rtr rtr-sw 00:00:00:00:01:00 42.42.42.254/24
@@ -4673,21 +4675,23 @@ NS_CHECK_EXEC([lsp], [timeout 2s nc -k -l
42.42.42.1 4041 &], [0])
  # Check that IPv4 TCP hairpin connection succeeds on both VIPs.
  NS_CHECK_EXEC([lsp], [nc 88.88.88.88 8080 -z], [0], [ignore], [ignore])
  NS_CHECK_EXEC([lsp], [nc 88.88.88.89 8080 -z], [0], [ignore], [ignore])
+NS_CHECK_EXEC([lsp], [nc 88.88.88.90 4041 -z], [0], [ignore], [ignore])

  # Capture IPv4 UDP hairpinned packets.
  filter="dst 42.42.42.1 and dst port 2021 and udp"
-NS_CHECK_EXEC([lsp], [tcpdump -nn -c 2 -i lsp ${filter} > lsp.pcap &])
+NS_CHECK_EXEC([lsp], [tcpdump -nn -c 3 -i lsp ${filter} > lsp.pcap &])

  sleep 1

  # Generate IPv4 UDP hairpin traffic.
  NS_CHECK_EXEC([lsp], [echo a | nc -u 88.88.88.88 4040 &], [0])
  NS_CHECK_EXEC([lsp], [echo a | nc -u 88.88.88.89 4040 &], [0])
+NS_CHECK_EXEC([lsp], [echo a | nc -u 88.88.88.90 2021 &], [0])

  # Check hairpin traffic.
  OVS_WAIT_UNTIL([
  total_pkts=$(cat lsp.pcap | wc -l)
-test "${total_pkts}" = "2"
+test "${total_pkts}" = "3"
  ])

  OVS_APP_EXIT_AND_WAIT([ovn-controller])
@@ -4736,10 +4740,12 @@ ovn-nbctl lb-add lb-ipv6-tcp
  [[8800::0088]]:8080 [[4200::1]]:4041 tcp
  ovn-nbctl lb-add lb-ipv6-tcp-dup [[8800::0089]]:8080 [[4200::1]]:4041 tcp
  ovn-nbctl lb-add lb-ipv6-udp [[8800::0088]]:4040 [[4200::1]]:2021 udp
  ovn-nbctl lb-add lb-ipv6-udp-dup [[8800::0089]]:4040 [[4200::1]]:2021 udp
+ovn-nbctl lb-add lb-ipv6 8800::0090 4200::1
  ovn-nbctl ls-lb-add sw lb-ipv6-tcp
  ovn-nbctl ls-lb-add sw 

[ovs-dev] [PATCH ovn] ci: ovn-kubernetes: Use Fedora 37 as base image.

2023-01-16 Thread Dumitru Ceara
Fedora 35 is EOL:
https://docs.fedoraproject.org/en-US/releases/eol/

Signed-off-by: Dumitru Ceara 
---
 .ci/ovn-kubernetes/Dockerfile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.ci/ovn-kubernetes/Dockerfile b/.ci/ovn-kubernetes/Dockerfile
index 7edf86a13a..1cb26d0d0e 100644
--- a/.ci/ovn-kubernetes/Dockerfile
+++ b/.ci/ovn-kubernetes/Dockerfile
@@ -1,7 +1,7 @@
 ARG OVNKUBE_COMMIT=master
 ARG LIBOVSDB_COMMIT=8081fe24e48f
 
-FROM fedora:35 AS ovnbuilder
+FROM fedora:37 AS ovnbuilder
 
 USER root
 
@@ -64,7 +64,7 @@ COPY --from=ovnbuilder /tmp/ovn/ovn-sb.ovsschema 
pkg/sbdb/ovn-sb.ovsschema
 RUN go generate ./pkg/nbdb && go generate ./pkg/sbdb && make
 
 # Build the final image
-FROM fedora:35
+FROM fedora:37
 
 # install needed dependencies
 RUN INSTALL_PKGS=" \
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/2] Prepare for post-3.1.0 (3.1.90).

2023-01-16 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 NEWS | 4 
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 933c5700f..02ef48590 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,7 @@
+Post-v3.1.0
+
+
+
 v3.1.0 - xx xxx 
 
- ovs-vswitchd now detects changes in CPU affinity and adjusts the number
diff --git a/configure.ac b/configure.ac
index 9bf896c01..d05e544b5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 3.1.0, b...@openvswitch.org)
+AC_INIT(openvswitch, 3.1.90, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([vswitchd/ovs-vswitchd.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index 6b99df5af..c62bb5646 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+openvswitch (3.1.90-1) unstable; urgency=low
+
+   * New upstream version
+
+ -- Open vSwitch team   Mon, 16 Jan 2023 16:51:01 +0100
+
 openvswitch (3.1.0-1) unstable; urgency=low
 
* New upstream version
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/2] Prepare for 3.1.0.

2023-01-16 Thread Ilya Maximets
Signed-off-by: Ilya Maximets 
---
 Documentation/faq/releases.rst | 1 +
 NEWS   | 2 +-
 configure.ac   | 2 +-
 debian/changelog   | 4 ++--
 debian/rules   | 4 ++--
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 53ce23004..9e1b42262 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -216,6 +216,7 @@ Q: What DPDK version does each Open vSwitch release work 
with?
 2.16.x   20.11.6
 2.17.x   21.11.2
 3.0.x21.11.2
+3.1.x22.11.1
  
 
 Q: Are all the DPDK releases that OVS versions work with maintained?
diff --git a/NEWS b/NEWS
index 4f9291bf1..933c5700f 100644
--- a/NEWS
+++ b/NEWS
@@ -1,4 +1,4 @@
-Post-v3.0.0
+v3.1.0 - xx xxx 
 
- ovs-vswitchd now detects changes in CPU affinity and adjusts the number
  of handler and revalidator threads if necessary.
diff --git a/configure.ac b/configure.ac
index adfd9f006..9bf896c01 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 3.0.90, b...@openvswitch.org)
+AC_INIT(openvswitch, 3.1.0, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([vswitchd/ovs-vswitchd.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index 32cea72d9..6b99df5af 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,8 +1,8 @@
-openvswitch (3.0.90-1) unstable; urgency=low
+openvswitch (3.1.0-1) unstable; urgency=low
 
* New upstream version
 
- -- Open vSwitch team   Mon, 15 Aug 2022 17:43:59 +0200
+ -- Open vSwitch team   Mon, 16 Jan 2023 16:51:00 +0100
 
 openvswitch (3.0.0-1) unstable; urgency=low
 
diff --git a/debian/rules b/debian/rules
index ddbd4dc5c..28c249d07 100755
--- a/debian/rules
+++ b/debian/rules
@@ -134,8 +134,8 @@ override_dh_python3:
 # Helper target for creating snapshots from upstream git
 DATE=$(shell date +%Y%m%d)
 # Upstream branch to track
-BRANCH=branch-3.0
-VERSION=3.0.0
+BRANCH=branch-3.1
+VERSION=3.1.0
 
 get-orig-snapshot:
rm -Rf openvswitch-upstream
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 0/2] Patches to branch for 3.1.

2023-01-16 Thread Ilya Maximets
There is still a couple of changes that need to go in today
before the actual branching, includng CT_FLUSH extention
and db_change_aware change.  Sending this patch set earlier
to get it reviewed in time and not bother people later.

Ilya Maximets (2):
  Prepare for 3.1.0.
  Prepare for post-3.1.0 (3.1.90).

 Documentation/faq/releases.rst |  1 +
 NEWS   |  6 +-
 configure.ac   |  2 +-
 debian/changelog   | 10 --
 debian/rules   |  4 ++--
 5 files changed, 17 insertions(+), 6 deletions(-)

-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 5/6] controller: only sample flow if Collector Set exists

2023-01-16 Thread Adrian Moreno



On 1/13/23 10:30, Adrian Moreno wrote:



On 1/6/23 20:15, Mark Michelson wrote:

The code change on its own is fine, so

Acked-by: Mark Michelson 

I have a note down below.

On 12/19/22 11:18, Adrian Moreno wrote:

Adding a OFPACT_SAMPLE action to a flow is useless (and even detrimental
in terms of performance) if a Flow_Sample_Collector_Set row does not
exist with the correspondent id. The sample (i.e: upcall) would take
place but ovs-vswitchd would not have a target IPFIX collector to send
the sample to.

In order to be able to control what Chassis perform sampling without
incurring in any performance cost if not needed, this patch makes the
controller only encode the OFPACT_SAMPLE if there's a valid
Flow_Sample_Collector_Set configured on that chassis.

In order to do that, a new input is added to the en_lflow_output engine
node that is associated with the OVSDB table. But since the information
has to be available in the lib/actions layer, an intermediate helper
struct is added (in lib/ovn-util.{h,c}) to help store and query the
configured collector set ids instead of using the idl directly.

Signed-off-by: Adrian Moreno 
---
  controller/lflow.c  |  1 +
  controller/lflow.h  |  8 --
  controller/ovn-controller.c | 56 -
  include/ovn/actions.h   |  4 +++
  lib/actions.c   |  9 +-
  lib/ovn-util.c  | 51 +
  lib/ovn-util.h  | 27 +++---
  tests/ovn-performance.at    | 24 
  tests/ovn.at    |  8 ++
  tests/test-ovn.c    |  9 ++
  10 files changed, 188 insertions(+), 9 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index bb47bb0c7..d2d0a6d1c 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -911,6 +911,7 @@ add_matches_to_flow_table(const struct sbrec_logical_flow 
*lflow,

  .is_switch = ldp->is_switch,
  .group_table = l_ctx_out->group_table,
  .meter_table = l_ctx_out->meter_table,
+    .collector_ids = l_ctx_in->collector_ids,
  .lflow_uuid = lflow->header_.uuid,
  .dp_key = ldp->datapath->tunnel_key,
diff --git a/controller/lflow.h b/controller/lflow.h
index 9e8f9afd3..9bb61c039 100644
--- a/controller/lflow.h
+++ b/controller/lflow.h
@@ -43,11 +43,12 @@
  #include "openvswitch/uuid.h"
  #include "openvswitch/list.h"
-struct ovn_extend_table;
-struct ovsdb_idl_index;
-struct ovn_desired_flow_table;
  struct hmap;
  struct hmap_node;
+struct ovn_desired_flow_table;
+struct ovn_extend_table;
+struct ovsdb_idl_index;
+struct ovsrec_flow_sample_collector_set_table;
  struct sbrec_chassis;
  struct sbrec_load_balancer;
  struct sbrec_logical_flow_table;
@@ -114,6 +115,7 @@ struct lflow_ctx_in {
  const struct hmap *dhcpv6_opts;
  const struct controller_event_options *controller_event_opts;
  const struct smap *template_vars;
+    const struct flow_collector_ids *collector_ids;
  bool lb_hairpin_use_ct_mark;
  };
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 882d9..09f843541 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -988,6 +988,8 @@ ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
  ovsdb_idl_add_column(ovs_idl, _ssl_col_private_key);
  ovsdb_idl_add_table(ovs_idl, _table_datapath);
  ovsdb_idl_add_column(ovs_idl, _datapath_col_capabilities);
+    ovsdb_idl_add_table(ovs_idl, _table_flow_sample_collector_set);
+
  chassis_register_ovs_idl(ovs_idl);
  encaps_register_ovs_idl(ovs_idl);
  binding_register_ovs_idl(ovs_idl);
@@ -1004,6 +1006,10 @@ ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
  ovsdb_idl_track_add_column(ovs_idl, _port_col_name);
  ovsdb_idl_track_add_column(ovs_idl, _port_col_interfaces);
  ovsdb_idl_track_add_column(ovs_idl, _port_col_external_ids);
+    ovsdb_idl_track_add_column(ovs_idl,
+   _flow_sample_collector_set_col_bridge);
+    ovsdb_idl_track_add_column(ovs_idl,
+   _flow_sample_collector_set_col_id);
  }
  #define SB_NODES \
@@ -1042,7 +1048,8 @@ enum sb_engine_node {
  OVS_NODE(bridge, "bridge") \
  OVS_NODE(port, "port") \
  OVS_NODE(interface, "interface") \
-    OVS_NODE(qos, "qos")
+    OVS_NODE(qos, "qos") \
+    OVS_NODE(flow_sample_collector_set, "flow_sample_collector_set")
  enum ovs_engine_node {
  #define OVS_NODE(NAME, NAME_STR) OVS_##NAME,
@@ -2813,6 +2820,9 @@ struct ed_type_lflow_output {
  /* Fixed controller_event supported options. */
  struct controller_event_options controller_event_opts;
+
+    /* Configured Flow Sample Collector Sets. */
+    struct flow_collector_ids collector_ids;
  };
  static void
@@ -2951,6 +2961,7 @@ init_lflow_ctx(struct engine_node *node,
  l_ctx_in->dhcpv6_opts = _opts->v6_opts;
  l_ctx_in->controller_event_opts = >controller_event_opts;
  

Re: [ovs-dev] OVS "soft freeze" for 3.1 is in effect.

2023-01-16 Thread Eelco Chaudron


On 16 Jan 2023, at 14:05, Ilya Maximets wrote:

> On 1/12/23 20:38, Dumitru Ceara wrote:
>> On 1/12/23 19:13, Han Zhou wrote:
>>> On Mon, Jan 2, 2023 at 3:08 PM Ilya Maximets  wrote:

 Hi.  As described in Documentation/internals/release-process.rst, we are
 in a "soft freeze" state:

During the freeze, we ask committers to refrain from applying patches
>>> that
add new features unless those patches were already being publicly
>>> discussed
and reviewed before the freeze began.  Bug fixes are welcome at any
>>> time.
Please propose and discuss exceptions on ovs-dev.

 We should branch for version 3.1 in two weeks from now, on Monday, Jan 16.
 With that, release should be on Thursday, Feb 16.

 Best regards, Ilya Maximets.
 ___
 dev mailing list
 d...@openvswitch.org
 https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>> Hi Ilya,
>>>
>>> Here is a list of patches I'd like to bring to your attention:
>>>
>>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=336342
>>> This series is new, but it provides an IDL API to avoid a problem, so it
>>> falls in between "bug fixes" and "new features". Please see if it can be
>>> included in the release.
>>>
>>
>> It would be nice indeed if this could make it into 3.1.  It's contained
>> enough and only changes behavior when explicitly requested.  On the
>> other hand it solves the (benign) error message that often raises questions.
>
> OK.  Added this to my queue.
>
>>
>>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=313042
>>> This series was posted in Aug. We have been using it downstream since then.
>>> There was feedback from Adrian and he agreed with the first patch, so
>>> hopefully at least this one can catch the release.
>
> There is an ongoing discussion about the issue here:
> https://patchwork.ozlabs.org/project/openvswitch/patch/167274555298.379205.1390528419336184506.stgit@ebuild/
>
> The issue is that the mechanism of removing flows without revalidation
> should not kick in in the first place.  We need more data on where and
> why exactly revalidator threads are getting stuck.  I'm not comfortable
> applying workarounds without RCA.
>
>>> For the second patch, Adrian mentioned about a patch came later that is
>>> related. Both patches are useful but it would be better to be aligned with
>>> each other for the counter names. Could you suggest how to proceed with it?
>
> I would suggest asking Eelco for review on this patch.  I agree that
> coverage counters are orthogonal to USDT probes.  If necessary, we
> could get them after branching as there are no changes in the logic
> and the patch is very simple.  But, yes, we need to agree on naming.
> I hope, Eelco and Adrian can help with that.

Replied to the patches. I think the first patch might need some documentation 
to describe the 0 value. The second patch would be nice if it could be aligned 
with Kevin’s. However, I do not think Kevin will address his patch in the 3.1 
timeframe, maybe we can take the code changes only, and not his script, and 
then Han can add coverage counters based on that patch?

>>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=312997
>>> This one was also posted in Aug. The first patch got an ACK, and the second
>>> one is still waiting for feedback. Could you comment on it, too?
>
> We can get this as a bug fix later, as it seems to be a bug fix more
> than a new feature.  But I had no chance to properly review it.
>
> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1 1/2] ofproto-ipfix: use per-domain template timeouts

2023-01-16 Thread 0-day Robot
Bleep bloop.  Greetings Adrián Moreno, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Comment with 'xxx' marker
#271 FILE: ofproto/ofproto-dpif-ipfix.c:2925:
/* XXX: Make frequency of the (Options) Template and Exporter Process

Lines checked: 378, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] ofproto-dpif-upcall.c: Add counters for reasons of DP flow deletions.

2023-01-16 Thread Eelco Chaudron


On 18 Oct 2022, at 18:11, Adrian Moreno wrote:

> On 10/14/22 20:07, Han Zhou wrote:
>>
>>
>> On Thu, Oct 13, 2022 at 11:51 PM Adrian Moreno > > wrote:
>>>
>>>
>>>
>>> On 8/9/22 03:41, Han Zhou wrote:
 Signed-off-by: Han Zhou mailto:hz...@ovn.org>>
 ---
   ofproto/ofproto-dpif-upcall.c | 13 -
   1 file changed, 12 insertions(+), 1 deletion(-)

 diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
 index 23b52ef40..f26a035f4 100644
 --- a/ofproto/ofproto-dpif-upcall.c
 +++ b/ofproto/ofproto-dpif-upcall.c
 @@ -58,6 +58,10 @@ COVERAGE_DEFINE(upcall_ukey_replace);
   COVERAGE_DEFINE(revalidate_missed_dp_flow);
   COVERAGE_DEFINE(upcall_flow_limit_hit);
   COVERAGE_DEFINE(upcall_flow_limit_kill);
 +COVERAGE_DEFINE(upcall_flow_del_rev);
 +COVERAGE_DEFINE(upcall_flow_del_no_rev);
 +COVERAGE_DEFINE(upcall_flow_del_idle_or_limit);
 +COVERAGE_DEFINE(upcall_flow_del_purge);

   /* A thread that reads upcalls from dpif, forwards each upcall's packet,
    * and possibly sets up a kernel flow as a cache. */
 @@ -2334,7 +2338,12 @@ revalidate_ukey(struct udpif *udpif, struct 
 udpif_key *ukey,
               }
               result = revalidate_ukey__(udpif, ukey, push.tcp_flags,
                                          odp_actions, recircs, 
 ukey->xcache);
 -        } /* else delete; too expensive to revalidate */
 +            if (result == UKEY_DELETE) {
 +                COVERAGE_INC(upcall_flow_del_rev);
 +            }
 +        } else { /* else delete; too expensive to revalidate */
 +            COVERAGE_INC(upcall_flow_del_no_rev);
 +        }
       } else if (!push.n_packets || ukey->xcache
                  || !populate_xcache(udpif, ukey, push.tcp_flags)) {
           result = UKEY_KEEP;
 @@ -2771,6 +2780,7 @@ revalidate(struct revalidator *revalidator)
               }
               if (kill_them_all || (used && used < now - max_idle)) {
                   result = UKEY_DELETE;
 +                COVERAGE_INC(upcall_flow_del_idle_or_limit);
               } else {
                   result = revalidate_ukey(udpif, ukey, , 
 _actions,
                                            reval_seq, ,
 @@ -2852,6 +2862,7 @@ revalidator_sweep__(struct revalidator *revalidator, 
 bool purge)

                   if (purge) {
                       result = UKEY_DELETE;
 +                    COVERAGE_INC(upcall_flow_del_purge);
                   } else if (!seq_mismatch) {
                       result = UKEY_KEEP;
                   } else {
>>>
>>>
>>> This has some overlap with the work Kevin has been doing [1].
>>>
>>> https://patchwork.ozlabs.org/project/openvswitch/patch/20221003200742.78047-1-ksprague0...@gmail.com/
>>>  
>>> 
>>>
>>> Maybe this patch can be combined with Kevin's to provide unified visibility
>>> (coverage counters + USDT probes) to the revalidation results?
>>>
>> Thanks Adrian for reviewing! This has been there for 2 months and I was 
>> about to send a reminder.
>> And thanks for the pointer to the related patch. These two patches are for 
>> similar purposes, while they provide different ways to access the 
>> information, and I think both are useful. The coverage counters added here 
>> try to provide stats of basic DP flow deletion categories, while USDT probes 
>> may provide more details, so I am not sure whether these two need to match 
>> to each other exactly. But I agree that at least for the same del reasons 
>> these two patches should use the same names to avoid confusion.
>> Do you have any detailed comments regarding the counters in this patch?
>>
>
> I think the counters added by this patch are useful. I just think (as you 
> mention above) that the reason names should be unified with Kevin's work so 
> we have both USDT and counters reporting revalidator decisions.

I was adding a lot of changes above on naming, but I think it would be better 
to maybe merge his patch with yours?!

However, as his patch is waiting for a v7. Maybe you can take his code changes 
without the script, and add some counters @ the probe location. As I’m fine 
with his patch, except for the python script.

Latest Kevin patch, 
https://patchwork.ozlabs.org/project/openvswitch/patch/20221021163543.508906-1-ksprague0...@gmail.com/


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v1 1/2] ofproto-ipfix: use per-domain template timeouts

2023-01-16 Thread Adrián Moreno
From: Adrian Moreno 

IPFIX templates have to be sent for each Observation Domain ID.
Currently, a timer is kept at each dpif_ipfix_exporter to send them.
This works fine for per-bridge sampling where there is only one
Observation Domain ID per exporter. However, this is does not work for
per-flow sampling where more than one Observation Domain IDs can be
specified by the controller. In this case, ovs-vswitchd will only send
template information for one (arbitrary) DomainID.

Fix per-flow sampling by using an hmap to keep a timer for each
Observation Domain ID.

Signed-off-by: Adrian Moreno 

---
- v1:
  - Added unit test.
  - Drop domain_id from "struct dpif_ipfix_domain" since the hashmap
  already uses it as index.
  - Added OVS_REQUES(mutex) to dpif_ipfix_exporter_find_domain.
---
 ofproto/ofproto-dpif-ipfix.c | 127 ---
 tests/system-traffic.at  |  51 ++
 2 files changed, 155 insertions(+), 23 deletions(-)

diff --git a/ofproto/ofproto-dpif-ipfix.c b/ofproto/ofproto-dpif-ipfix.c
index 742eed399..fa71fda0f 100644
--- a/ofproto/ofproto-dpif-ipfix.c
+++ b/ofproto/ofproto-dpif-ipfix.c
@@ -124,11 +124,18 @@ struct dpif_ipfix_port {
 uint32_t ifindex;
 };
 
+struct dpif_ipfix_domain {
+struct hmap_node hmap_node; /* In struct dpif_ipfix_exporter's domains. */
+time_t last_template_set_time;
+};
+
 struct dpif_ipfix_exporter {
 uint32_t exporter_id; /* Exporting Process identifier */
 struct collectors *collectors;
 uint32_t seq_number;
-time_t last_template_set_time;
+struct hmap domains; /* Contains struct dpif_ipfix_domain indexed by
+observation domain id. */
+time_t last_stats_sent_time;
 struct hmap cache_flow_key_map;  /* ipfix_flow_cache_entry. */
 struct ovs_list cache_flow_start_timestamp_list;  /* 
ipfix_flow_cache_entry. */
 uint32_t cache_active_timeout;  /* In seconds. */
@@ -617,6 +624,9 @@ static void get_export_time_now(uint64_t *, uint32_t *);
 
 static void dpif_ipfix_cache_expire_now(struct dpif_ipfix_exporter *, bool);
 
+static void dpif_ipfix_exporter_del_domain(struct dpif_ipfix_exporter * ,
+   struct dpif_ipfix_domain * );
+
 static bool
 ofproto_ipfix_bridge_exporter_options_equal(
 const struct ofproto_ipfix_bridge_exporter_options *a,
@@ -697,13 +707,14 @@ dpif_ipfix_exporter_init(struct dpif_ipfix_exporter 
*exporter)
 exporter->exporter_id = ++exporter_total_count;
 exporter->collectors = NULL;
 exporter->seq_number = 1;
-exporter->last_template_set_time = 0;
+exporter->last_stats_sent_time = 0;
 hmap_init(>cache_flow_key_map);
 ovs_list_init(>cache_flow_start_timestamp_list);
 exporter->cache_active_timeout = 0;
 exporter->cache_max_flows = 0;
 exporter->virtual_obs_id = NULL;
 exporter->virtual_obs_len = 0;
+hmap_init(>domains);
 
 memset(>ipfix_global_stats, 0,
sizeof(struct dpif_ipfix_global_stats));
@@ -711,6 +722,7 @@ dpif_ipfix_exporter_init(struct dpif_ipfix_exporter 
*exporter)
 
 static void
 dpif_ipfix_exporter_clear(struct dpif_ipfix_exporter *exporter)
+OVS_REQUIRES(mutex)
 {
 /* Flush the cache with flow end reason "forced end." */
 dpif_ipfix_cache_expire_now(exporter, true);
@@ -719,22 +731,29 @@ dpif_ipfix_exporter_clear(struct dpif_ipfix_exporter 
*exporter)
 exporter->exporter_id = 0;
 exporter->collectors = NULL;
 exporter->seq_number = 1;
-exporter->last_template_set_time = 0;
+exporter->last_stats_sent_time = 0;
 exporter->cache_active_timeout = 0;
 exporter->cache_max_flows = 0;
 free(exporter->virtual_obs_id);
 exporter->virtual_obs_id = NULL;
 exporter->virtual_obs_len = 0;
 
+struct dpif_ipfix_domain *dom;
+HMAP_FOR_EACH_SAFE (dom, hmap_node, >domains) {
+dpif_ipfix_exporter_del_domain(exporter, dom);
+}
+
 memset(>ipfix_global_stats, 0,
sizeof(struct dpif_ipfix_global_stats));
 }
 
 static void
 dpif_ipfix_exporter_destroy(struct dpif_ipfix_exporter *exporter)
+OVS_REQUIRES(mutex)
 {
 dpif_ipfix_exporter_clear(exporter);
 hmap_destroy(>cache_flow_key_map);
+hmap_destroy(>domains);
 }
 
 static bool
@@ -742,7 +761,7 @@ dpif_ipfix_exporter_set_options(struct dpif_ipfix_exporter 
*exporter,
 const struct sset *targets,
 const uint32_t cache_active_timeout,
 const uint32_t cache_max_flows,
-const char *virtual_obs_id)
+const char *virtual_obs_id) OVS_REQUIRES(mutex)
 {
 size_t virtual_obs_len;
 collectors_destroy(exporter->collectors);
@@ -769,6 +788,37 @@ dpif_ipfix_exporter_set_options(struct dpif_ipfix_exporter 
*exporter,
 return true;
 }
 
+static struct dpif_ipfix_domain *
+dpif_ipfix_exporter_find_domain(const struct dpif_ipfix_exporter *exporter,
+  

[ovs-dev] [PATCH v1 2/2] ipfix: make template and stats interval configurable

2023-01-16 Thread Adrián Moreno
From: Adrian Moreno 

Add options to the IPFIX table configure the interval to send statistics
and template information.

Signed-off-by: Adrian Moreno 

---
- v1:
  - Added unit test.
---
 NEWS |  2 ++
 ofproto/ofproto-dpif-ipfix.c | 38 +
 ofproto/ofproto.h|  9 +++
 tests/system-traffic.at  | 46 
 vswitchd/bridge.c| 17 +
 vswitchd/vswitch.ovsschema   | 12 +-
 vswitchd/vswitch.xml | 20 
 7 files changed, 133 insertions(+), 11 deletions(-)

diff --git a/NEWS b/NEWS
index 4f9291bf1..6f0f9e3a0 100644
--- a/NEWS
+++ b/NEWS
@@ -33,6 +33,8 @@ Post-v3.0.0
  * Add new experimental PMD load based sleeping feature. PMD threads can
request to sleep up to a user configured 'pmd-maxsleep' value under
low load conditions.
+   - IPFIX template and statistics intervals can be configured through two new
+   options in the IPFIX table: 'template_interval' and 'stats_interval'.
 
 
 v3.0.0 - 15 Aug 2022
diff --git a/ofproto/ofproto-dpif-ipfix.c b/ofproto/ofproto-dpif-ipfix.c
index fa71fda0f..fdb48b5fd 100644
--- a/ofproto/ofproto-dpif-ipfix.c
+++ b/ofproto/ofproto-dpif-ipfix.c
@@ -140,6 +140,8 @@ struct dpif_ipfix_exporter {
 struct ovs_list cache_flow_start_timestamp_list;  /* 
ipfix_flow_cache_entry. */
 uint32_t cache_active_timeout;  /* In seconds. */
 uint32_t cache_max_flows;
+uint32_t stats_interval;
+uint32_t template_interval;
 char *virtual_obs_id;
 uint8_t virtual_obs_len;
 
@@ -174,11 +176,6 @@ struct dpif_ipfix {
 
 #define IPFIX_VERSION 0x000a
 
-/* When using UDP, IPFIX Template Records must be re-sent regularly.
- * The standard default interval is 10 minutes (600 seconds).
- * Cf. IETF RFC 5101 Section 10.3.6. */
-#define IPFIX_TEMPLATE_INTERVAL 600
-
 /* Cf. IETF RFC 5101 Section 3.1. */
 OVS_PACKED(
 struct ipfix_header {
@@ -637,6 +634,8 @@ ofproto_ipfix_bridge_exporter_options_equal(
 && a->sampling_rate == b->sampling_rate
 && a->cache_active_timeout == b->cache_active_timeout
 && a->cache_max_flows == b->cache_max_flows
+&& a->stats_interval == b->stats_interval
+&& a->template_interval == b->template_interval
 && a->enable_tunnel_sampling == b->enable_tunnel_sampling
 && a->enable_input_sampling == b->enable_input_sampling
 && a->enable_output_sampling == b->enable_output_sampling
@@ -674,6 +673,8 @@ ofproto_ipfix_flow_exporter_options_equal(
 return (a->collector_set_id == b->collector_set_id
 && a->cache_active_timeout == b->cache_active_timeout
 && a->cache_max_flows == b->cache_max_flows
+&& a->stats_interval == b->stats_interval
+&& a->template_interval == b->template_interval
 && a->enable_tunnel_sampling == b->enable_tunnel_sampling
 && sset_equals(>targets, >targets)
 && nullable_string_is_equal(a->virtual_obs_id, b->virtual_obs_id));
@@ -712,6 +713,9 @@ dpif_ipfix_exporter_init(struct dpif_ipfix_exporter 
*exporter)
 ovs_list_init(>cache_flow_start_timestamp_list);
 exporter->cache_active_timeout = 0;
 exporter->cache_max_flows = 0;
+exporter->stats_interval = OFPROTO_IPFIX_DEFAULT_TEMPLATE_INTERVAL;
+exporter->template_interval = OFPROTO_IPFIX_DEFAULT_TEMPLATE_INTERVAL;
+exporter->last_stats_sent_time = 0;
 exporter->virtual_obs_id = NULL;
 exporter->virtual_obs_len = 0;
 hmap_init(>domains);
@@ -734,6 +738,9 @@ dpif_ipfix_exporter_clear(struct dpif_ipfix_exporter 
*exporter)
 exporter->last_stats_sent_time = 0;
 exporter->cache_active_timeout = 0;
 exporter->cache_max_flows = 0;
+exporter->stats_interval = OFPROTO_IPFIX_DEFAULT_TEMPLATE_INTERVAL;
+exporter->template_interval = OFPROTO_IPFIX_DEFAULT_TEMPLATE_INTERVAL;
+exporter->last_stats_sent_time = 0;
 free(exporter->virtual_obs_id);
 exporter->virtual_obs_id = NULL;
 exporter->virtual_obs_len = 0;
@@ -761,6 +768,8 @@ dpif_ipfix_exporter_set_options(struct dpif_ipfix_exporter 
*exporter,
 const struct sset *targets,
 const uint32_t cache_active_timeout,
 const uint32_t cache_max_flows,
+const uint32_t stats_interval,
+const uint32_t template_interval,
 const char *virtual_obs_id) OVS_REQUIRES(mutex)
 {
 size_t virtual_obs_len;
@@ -775,6 +784,8 @@ dpif_ipfix_exporter_set_options(struct dpif_ipfix_exporter 
*exporter,
 }
 exporter->cache_active_timeout = cache_active_timeout;
 exporter->cache_max_flows = cache_max_flows;
+exporter->stats_interval = stats_interval;
+exporter->template_interval = template_interval;
 virtual_obs_len = virtual_obs_id ? 

Re: [ovs-dev] [PATCH v7 2/2] openflow: Add extension to flush CT by generic match

2023-01-16 Thread Ales Musil
On Mon, Jan 16, 2023 at 12:45 PM Ales Musil  wrote:

> Add extension that allows to flush connections from CT
> by specifying fields that the connections should be
> matched against. This allows to match only some fields
> of the connection e.g. source address for orig direction.
>
> Reported-at: https://bugzilla.redhat.com/2120546
> Signed-off-by: Ales Musil 
> ---
> v7: Rebase on top of current master.
> Address comments from Ilya:
> * Use the public header for encode/decode functions.
> * Adjust the code accordingly to the new header file.
> * Address some style related comments.
> * Rename the nested TLV, so they do not overlap with
>   the outer types.
> * Document the ICMP partial match limitation.
>
> v6: Rebase on top of current master.
> Address comments from Ilya.
> v5: Add missing usage and man for ovs-ofctl command.
> v4: Allow ovs-ofctl flush/conntrack without any zone/tuple.
> v3: Rebase on top of master.
> v2: Rebase on top of master.
> Use suggestion from Ilya.
> ---
>  NEWS   |   6 +
>  include/openflow/nicira-ext.h  |  37 +++
>  include/openvswitch/ofp-ct.h   |  10 ++
>  include/openvswitch/ofp-msgs.h |   4 +
>  lib/ofp-bundle.c   |   1 +
>  lib/ofp-ct.c   | 196 +
>  lib/ofp-print.c|  20 
>  lib/rconn.c|   1 +
>  ofproto/ofproto-dpif.c |   9 +-
>  ofproto/ofproto-provider.h |   7 +-
>  ofproto/ofproto.c  |  29 -
>  tests/ofp-print.at | 108 ++
>  tests/ovs-ofctl.at |  38 +++
>  tests/system-traffic.at|  38 ---
>  utilities/ovs-ofctl.8.in   |  27 +
>  utilities/ovs-ofctl.c  |  51 +
>  16 files changed, 560 insertions(+), 22 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 035fcb0ee..22840402c 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -36,6 +36,12 @@ Post-v3.0.0
> - ovs-dpctl and 'ovs-appctl dpctl/' commands:
>   * "flush-conntrack" is now capable of handling partial 5-tuple,
>  with additional optional parameter to specify the reply direction.
> +   - OpenFlow:
> + * New OpenFlow extension NXT_CT_FLUSH to flush connections matching
> +   the specified fields.
> +   - ovs-ofctl:
> + * New command "flush-conntrack" that accepts zone and 5-tuple or
> partial
> +   5-tuple for both directions.
>
>
>  v3.0.0 - 15 Aug 2022
> diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h
> index b68804991..35f66a76d 100644
> --- a/include/openflow/nicira-ext.h
> +++ b/include/openflow/nicira-ext.h
> @@ -1064,4 +1064,41 @@ struct nx_zone_id {
>  };
>  OFP_ASSERT(sizeof(struct nx_zone_id) == 8);
>
> +/* CT flush available TLVs. */
> +enum nx_ct_flush_tlv_type {
> +/* Outer types. */
> +NXT_CT_ORIG_TUPLE = 0, /* Outer type for original tuple TLV.
> +* The nested TLV are specified
> +* by 'enum nx_ct_flush_tuple_tlv_type'. */
> +NXT_CT_REPLY_TUPLE = 1,/* Outer type for reply tuple TLC. *
> +* The nested TLV are specified
> +* by 'enum nx_ct_flush_tuple_tlv_type'*/
> +/* Primitive types. */
> +NXT_CT_ZONE_ID = 2,/* be16 zone id. */
> +};
> +
> +/* CT flush nested TLVs. */
> +enum nx_ct_flush_tuple_tlv_type {
> +NXT_CT_TUPLE_SRC = 0,/* IPv6 or mapped IPv4 address. */
> +NXT_CT_TUPLE_DST = 1,/* IPv6 or mapped IPv4 address. */
> +NXT_CT_TUPLE_SRC_PORT = 2,   /* be16 source port. */
> +NXT_CT_TUPLE_DST_PORT = 3,   /* be16 destination port. */
> +NXT_CT_TUPLE_ICMP_ID = 4,   /* be16 ICMP id. */
> +NXT_CT_TUPLE_ICMP_TYPE = 5, /* u8 ICMP type. */
> +NXT_CT_TUPLE_ICMP_CODE = 6, /* u8 ICMP code. */
> +};
> +
> +/* NXT_CT_FLUSH.
> + *
> + * Flushes the connection tracking specified by 5-tuple.
> + * The struct should be followed by TLVs specifying the matching
> parameters.
> + * Currently there is a limitation for ICMP, in order to partially match
> on
> + * ICMP parameters the tuple should include at least SRC/DST. */
> +struct nx_ct_flush {
> +uint8_t ip_proto;  /* IP protocol. */
> +uint8_t pad[7];   /* Must be zero. */
> +/* Followed by optional TLVs of type 'enum nx_ct_flush_tlv_type'. */
> +};
> +OFP_ASSERT(sizeof(struct nx_ct_flush) == 8);
> +
>  #endif /* openflow/nicira-ext.h */
> diff --git a/include/openvswitch/ofp-ct.h b/include/openvswitch/ofp-ct.h
> index e7af45337..094da0a97 100644
> --- a/include/openvswitch/ofp-ct.h
> +++ b/include/openvswitch/ofp-ct.h
> @@ -21,6 +21,8 @@
>  #include 
>  #include 
>
> +#include "openflow/nicira-ext.h"
> +
>  #ifdef __cplusplus
>  extern "C" {
>  #endif
> @@ -63,6 +65,14 @@ bool ofp_ct_tuple_parse(struct ofp_ct_tuple *tuple,
> const char *s,
>
>  bool ofp_ct_is_match_zero(const struct 

Re: [ovs-dev] [PATCH 1/2] Revalidator: Allow min-revalidator-pps to be 0 (disabled).

2023-01-16 Thread Eelco Chaudron


On 18 Oct 2022, at 18:08, Adrian Moreno wrote:

> On 10/14/22 19:49, Han Zhou wrote:
>>
>>
>> On Thu, Oct 13, 2022 at 11:43 PM Adrian Moreno > > wrote:
>>>
>>> Hi Han.
>>>
>>> On 8/9/22 03:40, Han Zhou wrote:
 Today the minimum value for this setting is 1. This patch allows it to
 be 0, meaning not checking pps at all, and always do revalidation.

 This is particularly useful for environments where some of the
 applications with long-lived connections may have very low traffic for
 certain period but have high rate of burst periodically. It is desirable
 to keep the datapath flows instead of periodically deleting them to
 avoid burst of packet miss to userspace.

 When setting to 0, there may be more datapath flows to be revalidated,
 resulting in higher CPU cost of revalidator threads. This is the
 downside but in certain cases this is still more desirable than packet
 misses to user space.

>>> I am trying to quantify this CPU cost. Do you have any numbers so we 
>>> understand
>>> the effects of disabling pps-based evictions?
>>>
>> Hi Adrian,
>>
>> Thanks for reviewing. First of all, the CPU cost is added to revalidator 
>> threads only, but saving cost for the handler threads which is more critical 
>> to the packet processing.
>> Now regarding the actual CPU cost of revalidator threads, the extra cost 
>> really depends on the traffic pattern - how many long lived DP flows are 
>> there with pps smaller than . The more such DP flows, 
>> the more revalidtor CPU, of course. We don't see a problem when there are 
>> 10k DP flows when setting min-revalidator-pps to 0, and this is on a DPU 
>> with small number of less powerful ARM cores. It also depends on whether it 
>> is a dedicated network node/DPU where it is ok for OVS to take a significant 
>> portion of the CPU or it is a compute node where more cores are supposed to 
>> be reserved for applications.
>> In any case, this is just an option and totally depends on user settings 
>> according to their use case, as explained in the commit message.
>>
>
> I agree. There are many factors that come into play. My concern is that we 
> have so many knobs that tweak the revalidator/handler load distribution that 
> depend on so many external factors that it's quite complicated to determine 
> when to recommend their use.
>
> That's why I wanted to have some ballpark numbers to get an intuition of the 
> potential side effects.
>
> Having said that, I think this particular proposal is beneficial as I also 
> have seen drops caused by long-lived connections.

I do not see anything wrong with adding this feature, however, the 
documentation needs to be clear on what 0 means, i.e. does it result in always 
revalidate or never revalidate?

//Eelco


 Signed-off-by: Han Zhou mailto:hz...@ovn.org>>
 ---
   ofproto/ofproto-dpif-upcall.c | 4 
   ofproto/ofproto.c             | 2 +-
   vswitchd/vswitch.xml          | 2 +-
   3 files changed, 6 insertions(+), 2 deletions(-)

 diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
 index 57f94df54..23b52ef40 100644
 --- a/ofproto/ofproto-dpif-upcall.c
 +++ b/ofproto/ofproto-dpif-upcall.c
 @@ -2079,6 +2079,10 @@ should_revalidate(const struct udpif *udpif, 
 uint64_t packets,
   {
       long long int metric, now, duration;

 +    if (!ofproto_min_revalidate_pps) {
 +        return true;
 +    }
 +
       if (!used) {
           /* Always revalidate the first time a flow is dumped. */
           return true;
 diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
 index 3a527683c..259fad4b5 100644
 --- a/ofproto/ofproto.c
 +++ b/ofproto/ofproto.c
 @@ -723,7 +723,7 @@ ofproto_set_max_revalidator(unsigned max_revalidator)
   void
   ofproto_set_min_revalidate_pps(unsigned min_revalidate_pps)
   {
 -    ofproto_min_revalidate_pps = min_revalidate_pps ? min_revalidate_pps 
 : 1;
 +    ofproto_min_revalidate_pps = min_revalidate_pps;
   }

   /* If forward_bpdu is true, the NORMAL action will forward frames with
 diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
 index 4a9284f6b..c1296 100644
 --- a/vswitchd/vswitch.xml
 +++ b/vswitchd/vswitch.xml
 @@ -212,7 +212,7 @@
         

         >>> -              type='{"type": "integer", "minInteger": 1}'>
 +              type='{"type": "integer", "minInteger": 0}'>
           
             Set minimum pps that flow must have in order to be revalidated 
 when
             revalidation duration exceeds half of max-revalidator config 
 variable.
>>>
>>> --
>>> Adrián Moreno
>>>
>
> -- 
> Adrián Moreno
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] northd: Add logical flows to allow rpl/rel traffic in acl_after_lb stage.

2023-01-16 Thread Dumitru Ceara
Otherwise we cannot correctly implement "allow-related" semantics when
default_acl_drop is set to "true".

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1947807#c11
74d82e296f80 ("northd: Support the option to apply from-lport ACLs after load 
balancer.")
Signed-off-by: Dumitru Ceara 
---
 northd/northd.c |  12 +++-
 northd/ovn-northd.8.xml |  14 +++-
 tests/ovn-northd.at |  63 ++
 tests/system-ovn.at | 142 
 4 files changed, 199 insertions(+), 32 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 4751feab4f..f21ceae78d 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -216,6 +216,7 @@ enum ovn_stage {
 #define REGBIT_FROM_RAMP  "reg0[14]"
 #define REGBIT_PORT_SEC_DROP  "reg0[15]"
 #define REGBIT_ACL_STATELESS  "reg0[16]"
+#define REGBIT_ACL_HINT_ALLOW_REL "reg0[17]"
 
 #define REG_ORIG_DIP_IPV4 "reg1"
 #define REG_ORIG_DIP_IPV6 "xxreg1"
@@ -6770,7 +6771,8 @@ build_acls(struct ovn_datapath *od, const struct 
chassis_features *features,
   ct_blocked_match);
 ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX - 3,
   ds_cstr(), REGBIT_ACL_HINT_DROP" = 0; "
-  REGBIT_ACL_HINT_BLOCK" = 0; next;");
+  REGBIT_ACL_HINT_BLOCK" = 0; "
+  REGBIT_ACL_HINT_ALLOW_REL" = 1; next;");
 ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX - 3,
   ds_cstr(), "next;");
 
@@ -6791,7 +6793,8 @@ build_acls(struct ovn_datapath *od, const struct 
chassis_features *features,
   use_ct_inv_match ? " && !ct.inv" : "",
   ct_blocked_match);
 ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX - 3,
-  ds_cstr(), "ct_commit_nat;");
+  ds_cstr(),
+  REGBIT_ACL_HINT_ALLOW_REL" = 1; ct_commit_nat;");
 ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX - 3,
   ds_cstr(), "ct_commit_nat;");
 
@@ -6802,6 +6805,11 @@ build_acls(struct ovn_datapath *od, const struct 
chassis_features *features,
   "nd || nd_ra || nd_rs || mldv1 || mldv2", "next;");
 ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX - 3,
   "nd || nd_ra || nd_rs || mldv1 || mldv2", "next;");
+
+/* Reply and related traffic matched by an "allow-related" ACL
+ * should be allowed in the ls_in_acl_after_lb stage too. */
+ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL_AFTER_LB, UINT16_MAX - 3,
+  REGBIT_ACL_HINT_ALLOW_REL" == 1", "next;");
 }
 
 /* Ingress or Egress ACL Table (Various priorities). */
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 25a742c904..41ba30d463 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -794,8 +794,9 @@
 policy, ct_mark.blocked will get set and packets in the
 reply direction will no longer be allowed, either. This flow also
 clears the register bits reg0[9] and
-reg0[10].  If ACL logging and logging of related packets
-is enabled, then a companion priority-65533 flow will be installed that
+reg0[10] and sets register bit reg0[17].
+If ACL logging and logging of related packets is enabled, then a
+companion priority-65533 flow will be installed that
 accomplishes the same thing but also logs the traffic.
   
 
@@ -1087,6 +1088,15 @@
   
 
 
+
+  
+One priority-65532 flow matching packets with reg0[17]
+set (either replies to existing sessions or traffic related to
+existing sessions) and allows these by advancing to the next
+table.
+  
+
+
 
   
 One priority-0 fallback flow that matches all packets and advances to
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index a95468347f..455b75d9e8 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2472,8 +2472,8 @@ AT_CHECK([ovn-sbctl lflow-list ls | grep -e 
ls_in_acl_hint -e ls_out_acl_hint -e
   table=7 (ls_in_acl_hint ), priority=7, match=(ct.new && !ct.est), 
action=(reg0[[7]] = 1; reg0[[9]] = 1; next;)
   table=8 (ls_in_acl  ), priority=1, match=(ip && !ct.est), 
action=(reg0[[1]] = 1; next;)
   table=8 (ls_in_acl  ), priority=1, match=(ip && ct.est && 
ct_mark.blocked == 1), action=(reg0[[1]] = 1; next;)
-  table=8 (ls_in_acl  ), priority=65532, match=(!ct.est && ct.rel && 
!ct.new && !ct.inv && ct_mark.blocked == 0), action=(ct_commit_nat;)
-  table=8 (ls_in_acl  ), priority=65532, match=(ct.est && !ct.rel && 
!ct.new && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(reg0[[9]] = 0; 
reg0[[10]] = 0; next;)
+  table=8 (ls_in_acl  ), priority=65532, match=(!ct.est && ct.rel && 
!ct.new && !ct.inv && ct_mark.blocked == 0), 

Re: [ovs-dev] [PATCH] netdev-dpdk: Fix deadlock due to virtqueue stats retrieval.

2023-01-16 Thread Ilya Maximets
On 1/16/23 14:41, David Marchand wrote:
> On Fri, Jan 13, 2023 at 4:57 PM Ilya Maximets  wrote:
> @@ -5766,6 +5804,7 @@ static const struct netdev_class dpdk_vhost_class = 
> {
>  static const struct netdev_class dpdk_vhost_client_class = {
>  .type = "dpdkvhostuserclient",
>  NETDEV_DPDK_CLASS_COMMON,
> +.run = netdev_dpdk_vhost_run,

 This part is also needed for the 'server' type.

 However, there is no guarantee that run() will be called in any reasonable
 time, because we don't have a wait() method that would schedule a wake up
 for the main thread.   The main thread wakes up for other reasons like
 stats update, but that happens not that frequently and we can't rely on 
 that
 anyway.  For example, default stats update interval is 5 seconds.

 The problem is that we can't really iterate over devices and take 
 dev->mutex
 in the context of a callback.  Meaning we can't really implement a good
 wait() method that would trigger some seqno change that would wake up the
 main thread.  We could trigger a global connectivity seqno, I guess.  Not
 sure how safe it is, since sequence numbers use internal mutex.

 OTOH, maybe we can employ dpdk-watchdog thread to perform vhost 
 housekeeping?
 Currently it's doing a very similar thing for physical ports, i.e. updates
 the link state periodically.  It's sleep interval is a bit large though,
 but maybe we could tweak that, or update phy link states with the same
 interval as it is now, but check the vhost event queue a bit more 
 frequently.
 Maybe even have something like a backoff in dp_netdev_flow_offload_main().
>>>
>>> Having a dedicated thread seems safer, for the same reason as Maxime
>>> mentionned for OVS  main thread: vhost events may be triggered by a
>>> malicious guest and OVS would be blind to some phy link updates.
>>>
>>>
>>> Another thought on my patch... is it safe to have an unbound queue wrt
>>> to memory consumption?
>>> Again, a malicious guest could make OVS consume a lot of memory (in my
>>> tests, OVS complained that heap increased to 500M).
> 
> *Note* heap was at ~400M before starting the test. So it was an
> increase of around 100M.
> 
> The reason was likely that .run() was not run often enough and the
> queue was growing with pending updates.
> Now that I went with the dedicated thread approach, all I see is an
> increase of 1-2M, maximum.
> 
> I think we can avoid going with your suggestion.
> WDYT?

OK.  Sounds OK then.

Maybe add a rate-limited warning message if the queue exceeds
1000 entries or so?

> 
> 
>>
>> We're entering a gross over-engineering territory, but the solution
>> might be following:
>>
>> RCU-list with nodes with following elements:
>>  - ifname
>>  - qid
>>  - atomic_bool enabled
>>  - atomic_bool seen
>>  - timestamp
> 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-dpdk: Fix deadlock due to virtqueue stats retrieval.

2023-01-16 Thread David Marchand
On Fri, Jan 13, 2023 at 4:57 PM Ilya Maximets  wrote:
> >>> @@ -5766,6 +5804,7 @@ static const struct netdev_class dpdk_vhost_class = 
> >>> {
> >>>  static const struct netdev_class dpdk_vhost_client_class = {
> >>>  .type = "dpdkvhostuserclient",
> >>>  NETDEV_DPDK_CLASS_COMMON,
> >>> +.run = netdev_dpdk_vhost_run,
> >>
> >> This part is also needed for the 'server' type.
> >>
> >> However, there is no guarantee that run() will be called in any reasonable
> >> time, because we don't have a wait() method that would schedule a wake up
> >> for the main thread.   The main thread wakes up for other reasons like
> >> stats update, but that happens not that frequently and we can't rely on 
> >> that
> >> anyway.  For example, default stats update interval is 5 seconds.
> >>
> >> The problem is that we can't really iterate over devices and take 
> >> dev->mutex
> >> in the context of a callback.  Meaning we can't really implement a good
> >> wait() method that would trigger some seqno change that would wake up the
> >> main thread.  We could trigger a global connectivity seqno, I guess.  Not
> >> sure how safe it is, since sequence numbers use internal mutex.
> >>
> >> OTOH, maybe we can employ dpdk-watchdog thread to perform vhost 
> >> housekeeping?
> >> Currently it's doing a very similar thing for physical ports, i.e. updates
> >> the link state periodically.  It's sleep interval is a bit large though,
> >> but maybe we could tweak that, or update phy link states with the same
> >> interval as it is now, but check the vhost event queue a bit more 
> >> frequently.
> >> Maybe even have something like a backoff in dp_netdev_flow_offload_main().
> >
> > Having a dedicated thread seems safer, for the same reason as Maxime
> > mentionned for OVS  main thread: vhost events may be triggered by a
> > malicious guest and OVS would be blind to some phy link updates.
> >
> >
> > Another thought on my patch... is it safe to have an unbound queue wrt
> > to memory consumption?
> > Again, a malicious guest could make OVS consume a lot of memory (in my
> > tests, OVS complained that heap increased to 500M).

*Note* heap was at ~400M before starting the test. So it was an
increase of around 100M.

The reason was likely that .run() was not run often enough and the
queue was growing with pending updates.
Now that I went with the dedicated thread approach, all I see is an
increase of 1-2M, maximum.

I think we can avoid going with your suggestion.
WDYT?


>
> We're entering a gross over-engineering territory, but the solution
> might be following:
>
> RCU-list with nodes with following elements:
>  - ifname
>  - qid
>  - atomic_bool enabled
>  - atomic_bool seen
>  - timestamp


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Update release-process.rst to have the 2024 calendar.

2023-01-16 Thread 0-day Robot
Bleep bloop.  Greetings Mark Michelson, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Author Mark Michelson  needs to sign off.
Lines checked: 62, Warnings: 0, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] Update release-process.rst to have the 2024 calendar.

2023-01-16 Thread Mark Michelson
---
 Documentation/internals/release-process.rst | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/Documentation/internals/release-process.rst 
b/Documentation/internals/release-process.rst
index 13a22fa60..ac278bd4a 100644
--- a/Documentation/internals/release-process.rst
+++ b/Documentation/internals/release-process.rst
@@ -134,34 +134,34 @@ approximate:
 Release Calendar
 
 
-The 2022 timetable is shown below. Note that these dates are not set in stone.
+The 2023 timetable is shown below. Note that these dates are not set in stone.
 If extenuating circumstances arise, a release may be delayed from its target
 date.
 
 +-+-+-+-+
 | Release | Soft Freeze | Branch Creation | Release |
 +-+-+-+-+
-| 22.03.0 | Feb 4   | Feb 18  | Mar 4   |
+| 23.03.0 | Feb 3   | Feb 17  | Mar 3   |
 +-+-+-+-+
-| 22.06.0 | May 6   | May 20  | Jun 3   |
+| 23.06.0 | May 5   | May 19  | Jun 2   |
 +-+-+-+-+
-| 22.09.0 | Aug 5   | Aug 19  | Sep 2   |
+| 23.09.0 | Aug 4   | Aug 18  | Sep 1   |
 +-+-+-+-+
-| 22.12.0 | Nov 4   | Nov 18  | Dec 2   |
+| 23.12.0 | Nov 3   | Nov 17  | Dec 1   |
 +-+-+-+-+
 
-Below is the 2023 timetable
+Below is the 2024 timetable
 
 +-+-+-+-+
 | Release | Soft Freeze | Branch Creation | Release |
 +-+-+-+-+
-| 23.03.0 | Feb 3   | Feb 17  | Mar 3   |
+| 24.03.0 | Feb 2   | Feb 16  | Mar 1   |
 +-+-+-+-+
-| 23.06.0 | May 5   | May 19  | Jun 2   |
+| 24.06.0 | May 10  | May 24  | Jun 7   |
 +-+-+-+-+
-| 23.09.0 | Aug 4   | Aug 18  | Sep 1   |
+| 24.09.0 | Aug 9   | Aug 23  | Sep 6   |
 +-+-+-+-+
-| 23.12.0 | Nov 3   | Nov 17  | Dec 1   |
+| 24.12.0 | Nov 8   | Nov 22  | Dec 6   |
 +-+-+-+-+
 
 Contact
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] id-pool: Refactor to use a bitmap.

2023-01-16 Thread Mark Michelson
Using a bitmap makes the id-pool use less memory and be more
cache-friendly. It theoretically should be faster since hashes do not
have to be computed.

This takes the approach of expanding the bitmap when necessary rather
than allocating the entire space at once. This makes the approach less
memory-intensive for id pools with a large theoretical maximum number of
values.

Signed-off-by: Mark Michelson 
---
v1 -> v2:
 * Switched a MAX to a MIN. Without this change, we'll always just
   realloc the bitmap to the max size on the first realloc. Whoops.
 lib/id-pool.c | 110 +-
 1 file changed, 45 insertions(+), 65 deletions(-)

diff --git a/lib/id-pool.c b/lib/id-pool.c
index 69910ad08..32ca6ce58 100644
--- a/lib/id-pool.c
+++ b/lib/id-pool.c
@@ -17,25 +17,20 @@
 
 #include 
 #include "id-pool.h"
-#include "openvswitch/hmap.h"
-#include "hash.h"
+#include "bitmap.h"
 
-struct id_node {
-struct hmap_node node;
-uint32_t id;
-};
+#define ID_POOL_BASE_ALLOC 64
 
 struct id_pool {
-struct hmap map;
+unsigned long *bitmap;
+uint32_t bitmap_size;  /* Current highest offset the bitmap can hold */
 uint32_t base; /* IDs in the range of [base, base + n_ids). */
 uint32_t n_ids;/* Total number of ids in the pool. */
-uint32_t next_free_id; /* Possible next free id. */
 };
 
 static void id_pool_init(struct id_pool *pool,
  uint32_t base, uint32_t n_ids);
 static void id_pool_uninit(struct id_pool *pool);
-static struct id_node *id_pool_find(struct id_pool *pool, uint32_t id);
 
 struct id_pool *
 id_pool_create(uint32_t base, uint32_t n_ids)
@@ -62,96 +57,81 @@ id_pool_init(struct id_pool *pool, uint32_t base, uint32_t 
n_ids)
 {
 pool->base = base;
 pool->n_ids = n_ids;
-pool->next_free_id = base;
-hmap_init(>map);
+pool->bitmap_size = MIN(n_ids, ID_POOL_BASE_ALLOC);
+pool->bitmap = bitmap_allocate(pool->bitmap_size);
 }
 
 static void
 id_pool_uninit(struct id_pool *pool)
 {
-struct id_node *id_node;
+free(pool->bitmap);
+pool->bitmap = NULL;
+}
 
-HMAP_FOR_EACH_POP(id_node, node, >map) {
-free(id_node);
+static bool
+id_pool_expand(struct id_pool *pool, uint32_t target)
+{
+if (target >= pool->n_ids) {
+return false;
 }
 
-hmap_destroy(>map);
+uint32_t new_bitmap_size = MAX(pool->bitmap_size * 2, target);
+new_bitmap_size = MIN(new_bitmap_size, pool->n_ids);
+unsigned long *new_bitmap = bitmap_allocate(new_bitmap_size);
+
+memcpy(new_bitmap, pool->bitmap, bitmap_n_bytes(pool->bitmap_size));
+
+id_pool_uninit(pool);
+pool->bitmap = new_bitmap;
+pool->bitmap_size = new_bitmap_size;
+return true;
 }
 
-static struct id_node *
-id_pool_find(struct id_pool *pool, uint32_t id)
+static bool
+id_pool_add__(struct id_pool *pool, uint32_t offset)
 {
-size_t hash;
-struct id_node *id_node;
-
-hash = hash_int(id, 0);
-HMAP_FOR_EACH_WITH_HASH(id_node, node, hash, >map) {
-if (id == id_node->id) {
-return id_node;
-}
+if (offset >= pool->n_ids) {
+return false;
+}
+
+if (offset >= pool->bitmap_size && !id_pool_expand(pool, offset)) {
+return false;
 }
-return NULL;
+
+bitmap_set1(pool->bitmap, offset);
+return true;
 }
 
 void
 id_pool_add(struct id_pool *pool, uint32_t id)
 {
-struct id_node *id_node = xmalloc(sizeof *id_node);
-size_t hash;
+if (id < pool->base) {
+return;
+}
 
-id_node->id = id;
-hash = hash_int(id, 0);
-hmap_insert(>map, _node->node, hash);
+id_pool_add__(pool, id - pool->base);
 }
 
 bool
 id_pool_alloc_id(struct id_pool *pool, uint32_t *id_)
 {
-uint32_t id;
+size_t offset;
 
 if (pool->n_ids == 0) {
 return false;
 }
 
-if (!(id_pool_find(pool, pool->next_free_id))) {
-id = pool->next_free_id;
-goto found_free_id;
-}
-
-for(id = pool->base; id < pool->base + pool->n_ids; id++) {
-if (!id_pool_find(pool, id)) {
-goto found_free_id;
-}
-}
-
-/* Not available. */
-return false;
-
-found_free_id:
-id_pool_add(pool, id);
-
-if (id + 1 < pool->base + pool->n_ids) {
-pool->next_free_id = id + 1;
-} else {
-pool->next_free_id = pool->base;
+offset = bitmap_scan(pool->bitmap, 0, 0, pool->bitmap_size);
+if (!id_pool_add__(pool, offset)) {
+return false;
 }
 
-*id_ = id;
+*id_ = offset + pool->base;
 return true;
 }
 
 void
 id_pool_free_id(struct id_pool *pool, uint32_t id)
 {
-struct id_node *id_node;
-if (id >= pool->base && (id < pool->base + pool->n_ids)) {
-id_node = id_pool_find(pool, id);
-if (id_node) {
-hmap_remove(>map, _node->node);
-if (id < pool->next_free_id) {
-pool->next_free_id = id;
-}
-free(id_node);
-}
-}
+  

Re: [ovs-dev] OVS "soft freeze" for 3.1 is in effect.

2023-01-16 Thread Ilya Maximets
On 1/12/23 20:38, Dumitru Ceara wrote:
> On 1/12/23 19:13, Han Zhou wrote:
>> On Mon, Jan 2, 2023 at 3:08 PM Ilya Maximets  wrote:
>>>
>>> Hi.  As described in Documentation/internals/release-process.rst, we are
>>> in a "soft freeze" state:
>>>
>>>During the freeze, we ask committers to refrain from applying patches
>> that
>>>add new features unless those patches were already being publicly
>> discussed
>>>and reviewed before the freeze began.  Bug fixes are welcome at any
>> time.
>>>Please propose and discuss exceptions on ovs-dev.
>>>
>>> We should branch for version 3.1 in two weeks from now, on Monday, Jan 16.
>>> With that, release should be on Thursday, Feb 16.
>>>
>>> Best regards, Ilya Maximets.
>>> ___
>>> dev mailing list
>>> d...@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>> Hi Ilya,
>>
>> Here is a list of patches I'd like to bring to your attention:
>>
>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=336342
>> This series is new, but it provides an IDL API to avoid a problem, so it
>> falls in between "bug fixes" and "new features". Please see if it can be
>> included in the release.
>>
> 
> It would be nice indeed if this could make it into 3.1.  It's contained
> enough and only changes behavior when explicitly requested.  On the
> other hand it solves the (benign) error message that often raises questions.

OK.  Added this to my queue.

> 
>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=313042
>> This series was posted in Aug. We have been using it downstream since then.
>> There was feedback from Adrian and he agreed with the first patch, so
>> hopefully at least this one can catch the release.

There is an ongoing discussion about the issue here:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/167274555298.379205.1390528419336184506.stgit@ebuild/

The issue is that the mechanism of removing flows without revalidation
should not kick in in the first place.  We need more data on where and
why exactly revalidator threads are getting stuck.  I'm not comfortable
applying workarounds without RCA.

>> For the second patch, Adrian mentioned about a patch came later that is
>> related. Both patches are useful but it would be better to be aligned with
>> each other for the counter names. Could you suggest how to proceed with it?

I would suggest asking Eelco for review on this patch.  I agree that
coverage counters are orthogonal to USDT probes.  If necessary, we
could get them after branching as there are no changes in the logic
and the patch is very simple.  But, yes, we need to agree on naming.
I hope, Eelco and Adrian can help with that.


>> * https://patchwork.ozlabs.org/project/openvswitch/list/?series=312997
>> This one was also posted in Aug. The first patch got an ACK, and the second
>> one is still waiting for feedback. Could you comment on it, too?

We can get this as a bug fix later, as it seems to be a bug fix more
than a new feature.  But I had no chance to properly review it.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 1/2] ofp, dpif: Allow CT flush based on partial match

2023-01-16 Thread 0-day Robot
Bleep bloop.  Greetings Ales Musil, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 106 characters long (recommended limit is 79)
#619 FILE: lib/dpctl.man:305:
\*(DX\fBflush\-conntrack\fR [\fIdp\fR] [\fBzone=\fIzone\fR] 
[\fIct-origin-tuple\fR [\fIct-reply-tuple\fR]]

Lines checked: 996, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ofproto-dpif-upcall: Wait for valid hw flow stats before applying min-revalidate-pps

2023-01-16 Thread Ilya Maximets
On 1/10/23 11:02, Eelco Chaudron wrote:
> On 5 Jan 2023, at 14:39, Eelco Chaudron wrote:
> 
> 
> 
 I'm not saying that we don't have issue with statistics.  We do have
 2 separate issues here - dump duration and statistics update frequency.
 Just trying to figure out which is a primary and which is a secondary
 contributor to the flow deletion issues that you observe.

 My point is that if dump durations wasn't that unreasonably high, we
 would not see flow deletions.  OTOH, if statistics updates wasn't delayed,
 we would not see the issue either.  So, I propose a draw. :)
>>
>> Well with the default config a run of 251ms in the default scenario could 
>> hit the problem if the stats were not updated between the next and the 
>> previous (too long) run.
>>
 We know the root cause of the statistics delay (driver), and we don't
 know the root cause of the long dump duration.  So we need to figure
 that out.
>>
>> Yes, the excessive delay we need to figure out.
>>
>> William can you get some more accurate values (I think you have my reval 
>> patch applied, 
>> https://patchwork.ozlabs.org/project/openvswitch/patch/166367078734.325473.17899801287395410245.stgit@ebuild/,
>>  so you could run the script with -q the get the values over time, maybe 
>> also do -w for offline analysis).
> 
> William was able to run my revalidator script on the system for 2d and 
> 17:16:14 hours. The results are below, however one thing I did not realise is 
> that the n_flows value is the actual dp flows, not including the hw offloaded 
> ones.

Sounds like something we should fix.

> 
> So to determine the offloaded flows I used the n_reval_flows value, which is 
> no revalidating needed to happening is reported as 0.
> 
> So here are some histograms for the entire period:
> 
> => n_flows:
> # NumSamples = 477616; Min = 0.00; Max = 20.00
> # Mean = 2.564169; Variance = 2.860125; SD = 1.691190; Median 2.00
> # each ∎ represents a count of 1930
> 0. - 1. [144762]: 
> ∎∎∎
> 1. - 2. [128073]: 
> ∎∎
> 2. - 3. [ 94831]: 
> ∎
> 3. - 4. [ 59869]: ∎∎∎
> 4. - 5. [ 29455]: ∎∎∎
> 5. - 6. [ 12025]: ∎∎
> 6. - 7. [  4425]: ∎∎
> 7. - 8. [  1461]:
> 8. - 9. [   364]:
> 9. -10. [   123]:
>10. -11. [21]:
>11. -12. [40]:
>12. -13. [   247]:
>13. -14. [   569]:
>14. -15. [   509]:
>15. -16. [   393]:
>16. -17. [   220]:
>17. -18. [   169]:
>18. -19. [51]:
>19. -20. [ 9]:
> 
> => n_reval_flows:
> # NumSamples = 477616; Min = 0.00; Max = 148.00
> # Mean = 4.193549; Variance = 336.919508; SD = 18.355367; Median 0.00
> # each ∎ represents a count of 6041
> 0. - 7.4000 [453117]: 
> ∎∎∎
> 7.4000 -14.8000 [72]:
>14.8000 -22.2000 [   382]:
>22.2000 -29.6000 [   117]:
>29.6000 -37. [49]:
>37. -44.4000 [25]:
>44.4000 -51.8000 [49]:
>51.8000 -59.2000 [   239]:
>59.2000 -66.6000 [  1020]:
>66.6000 -74. [  3640]:
>74. -81.4000 [  5819]:
>81.4000 -88.8000 [  6191]: ∎
>88.8000 -96.2000 [  4288]:
>96.2000 -   103.6000 [  1550]:
>   103.6000 -   111. [   603]:
>   111. -   118.4000 [   194]:
>   118.4000 -   125.8000 [   126]:
>   125.8000 -   133.2000 [87]:
>   133.2000 -   140.6000 [34]:
>   140.6000 -   148. [14]:
> 
> => dump_duration:
> # NumSamples = 477616; Min = 1.00; Max = 2396.00
> # Mean = 20.440632; Variance = 4416.771835; SD = 66.458798; Median 2.00
> # each ∎ represents a count of 6078
> 1. -   120.7500 [455905]: 
> ∎∎∎
>   120.7500 -   240.5000 [ 12804]: ∎∎
>   240.5000 -   360.2500 [  4972]:
>   360.2500 -   480. [  2455]:
>   480. -   599.7500 [   704]:
>   599.7500 -   719.5000 [   276]:
>   719.5000 -   839.2500 [   169]:
>   839.2500 -   959. [   137]:
>   959. -  1078.7500 [65]:
>  1078.7500 -  1198.5000 [58]:
>  1198.5000 -  1318.2500 [27]:
>  1318.2500 -  1438. [16]:
>  1438. -  1557.7500 [11]:
>  1557.7500 -  1677.5000 [ 9]:
>  1677.5000 -  1797.2500 [ 2]:
>  1797.2500 -  1917. [ 2]:
>  1917. -  2036.7500 [ 1]:
>  2036.7500 -  2156.5000 [ 1]:
>  2156.5000 -  2276.2500 [ 0]:
>  2276.2500 -  2396. [ 2]:
> 
> 
> So it looks like the maximum 

[ovs-dev] [PATCH v7 1/2] ofp, dpif: Allow CT flush based on partial match

2023-01-16 Thread Ales Musil
Currently, the CT can be flushed by dpctl only by specifying
the whole 5-tuple. This is not very convenient when there are
only some fields known to the user of CT flush. Add new struct
ofputil_ct_match which represents the generic filtering that can
be done for CT flush. The match is done only on fields that are
non-zero with exception to the icmp fields.

This allows the filtering just within dpctl, however
it is a preparation for OpenFlow extension.

Reported-at: https://bugzilla.redhat.com/2120546
Signed-off-by: Ales Musil 
---
v7: Rebase on top of current master.
Address comments from Ilya:
* Move the structs and function to public header.
* Address some style related comments.
* Document the ICMP patial match limitation.

v6: Rebase on top of current master.
Address comments from Ilya.
v5: Add ack from Paolo.
v4: Fix a flush all scenario.
v3: Rebase on top of master.
Address the C99 comment and missing dpif_close call.
v2: Rebase on top of master.
Address comments from Paolo.
---
 NEWS|   3 +
 include/openvswitch/automake.mk |   1 +
 include/openvswitch/ofp-ct.h|  70 
 lib/automake.mk |   1 +
 lib/ct-dpif.c   | 298 +++-
 lib/ct-dpif.h   |   5 +-
 lib/dpctl.c |  43 +++--
 lib/dpctl.man   |  24 ++-
 lib/ofp-ct.c| 214 +++
 tests/system-traffic.at | 102 ++-
 10 files changed, 620 insertions(+), 141 deletions(-)
 create mode 100644 include/openvswitch/ofp-ct.h
 create mode 100644 lib/ofp-ct.c

diff --git a/NEWS b/NEWS
index 4f9291bf1..035fcb0ee 100644
--- a/NEWS
+++ b/NEWS
@@ -33,6 +33,9 @@ Post-v3.0.0
  * Add new experimental PMD load based sleeping feature. PMD threads can
request to sleep up to a user configured 'pmd-maxsleep' value under
low load conditions.
+   - ovs-dpctl and 'ovs-appctl dpctl/' commands:
+ * "flush-conntrack" is now capable of handling partial 5-tuple,
+with additional optional parameter to specify the reply direction.
 
 
 v3.0.0 - 15 Aug 2022
diff --git a/include/openvswitch/automake.mk b/include/openvswitch/automake.mk
index 84670d80a..0cc1f569e 100644
--- a/include/openvswitch/automake.mk
+++ b/include/openvswitch/automake.mk
@@ -15,6 +15,7 @@ openvswitchinclude_HEADERS = \
include/openvswitch/ofp-actions.h \
include/openvswitch/ofp-bundle.h \
include/openvswitch/ofp-connection.h \
+   include/openvswitch/ofp-ct.h \
include/openvswitch/ofp-ed-props.h \
include/openvswitch/ofp-errors.h \
include/openvswitch/ofp-flow.h \
diff --git a/include/openvswitch/ofp-ct.h b/include/openvswitch/ofp-ct.h
new file mode 100644
index 0..e7af45337
--- /dev/null
+++ b/include/openvswitch/ofp-ct.h
@@ -0,0 +1,70 @@
+/* Copyright (c) 2022, Red Hat, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef OVS_OFP_CT_H
+#define OVS_OFP_CT_H
+
+#include 
+#include 
+#include 
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct ofp_ct_tuple {
+struct in6_addr src;
+struct in6_addr dst;
+
+union {
+ovs_be16 src_port;
+ovs_be16 icmp_id;
+};
+union {
+ovs_be16 dst_port;
+struct {
+uint8_t icmp_code;
+uint8_t icmp_type;
+};
+};
+};
+
+struct ofp_ct_match {
+uint8_t ip_proto;
+uint16_t l3_type;
+
+struct ofp_ct_tuple tuple_orig;
+struct ofp_ct_tuple tuple_reply;
+};
+
+bool ofp_ct_tuple_is_zero(const struct ofp_ct_tuple *tuple, uint8_t ip_proto);
+
+bool ofp_ct_tuple_is_five_tuple(const struct ofp_ct_tuple *tuple,
+uint8_t ip_proto);
+
+void ofp_ct_match_format(struct ds *ds, const struct ofp_ct_match *match);
+
+bool ofp_ct_tuple_parse(struct ofp_ct_tuple *tuple, const char *s,
+struct ds *ds, uint8_t *ip_proto,
+uint16_t *l3_type);
+
+bool ofp_ct_is_match_zero(const struct ofp_ct_match *match);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* include/openvswitch/ofp-ct.h */
diff --git a/lib/automake.mk b/lib/automake.mk
index 61bdc308f..e64ee76ce 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -226,6 +226,7 @@ lib_libopenvswitch_la_SOURCES = \
lib/ofp-actions.c \
lib/ofp-bundle.c \
lib/ofp-connection.c \
+   lib/ofp-ct.c \

[ovs-dev] [PATCH v7 0/2] Add extension for partial CT flush

2023-01-16 Thread Ales Musil
Add extension that will allow to flush
connection from CT specified by full
orig 5-tuple or partial match that combines
orig or reply direction.

Ales Musil (2):
  ofp, dpif: Allow CT flush based on partial match
  openflow: Add extension to flush CT by generic match

 NEWS|   9 +
 include/openflow/nicira-ext.h   |  37 +++
 include/openvswitch/automake.mk |   1 +
 include/openvswitch/ofp-ct.h|  80 +++
 include/openvswitch/ofp-msgs.h  |   4 +
 lib/automake.mk |   1 +
 lib/ct-dpif.c   | 298 ++-
 lib/ct-dpif.h   |   5 +-
 lib/dpctl.c |  43 +++-
 lib/dpctl.man   |  24 +-
 lib/ofp-bundle.c|   1 +
 lib/ofp-ct.c| 410 
 lib/ofp-print.c |  20 ++
 lib/rconn.c |   1 +
 ofproto/ofproto-dpif.c  |   9 +-
 ofproto/ofproto-provider.h  |   7 +-
 ofproto/ofproto.c   |  29 ++-
 tests/ofp-print.at  | 108 +
 tests/ovs-ofctl.at  |  38 +++
 tests/system-traffic.at | 120 +-
 utilities/ovs-ofctl.8.in|  27 +++
 utilities/ovs-ofctl.c   |  51 
 22 files changed, 1170 insertions(+), 153 deletions(-)
 create mode 100644 include/openvswitch/ofp-ct.h
 create mode 100644 lib/ofp-ct.c

-- 
2.39.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v7 2/2] openflow: Add extension to flush CT by generic match

2023-01-16 Thread Ales Musil
Add extension that allows to flush connections from CT
by specifying fields that the connections should be
matched against. This allows to match only some fields
of the connection e.g. source address for orig direction.

Reported-at: https://bugzilla.redhat.com/2120546
Signed-off-by: Ales Musil 
---
v7: Rebase on top of current master.
Address comments from Ilya:
* Use the public header for encode/decode functions.
* Adjust the code accordingly to the new header file.
* Address some style related comments.
* Rename the nested TLV, so they do not overlap with
  the outer types.
* Document the ICMP partial match limitation.

v6: Rebase on top of current master.
Address comments from Ilya.
v5: Add missing usage and man for ovs-ofctl command.
v4: Allow ovs-ofctl flush/conntrack without any zone/tuple.
v3: Rebase on top of master.
v2: Rebase on top of master.
Use suggestion from Ilya.
---
 NEWS   |   6 +
 include/openflow/nicira-ext.h  |  37 +++
 include/openvswitch/ofp-ct.h   |  10 ++
 include/openvswitch/ofp-msgs.h |   4 +
 lib/ofp-bundle.c   |   1 +
 lib/ofp-ct.c   | 196 +
 lib/ofp-print.c|  20 
 lib/rconn.c|   1 +
 ofproto/ofproto-dpif.c |   9 +-
 ofproto/ofproto-provider.h |   7 +-
 ofproto/ofproto.c  |  29 -
 tests/ofp-print.at | 108 ++
 tests/ovs-ofctl.at |  38 +++
 tests/system-traffic.at|  38 ---
 utilities/ovs-ofctl.8.in   |  27 +
 utilities/ovs-ofctl.c  |  51 +
 16 files changed, 560 insertions(+), 22 deletions(-)

diff --git a/NEWS b/NEWS
index 035fcb0ee..22840402c 100644
--- a/NEWS
+++ b/NEWS
@@ -36,6 +36,12 @@ Post-v3.0.0
- ovs-dpctl and 'ovs-appctl dpctl/' commands:
  * "flush-conntrack" is now capable of handling partial 5-tuple,
 with additional optional parameter to specify the reply direction.
+   - OpenFlow:
+ * New OpenFlow extension NXT_CT_FLUSH to flush connections matching
+   the specified fields.
+   - ovs-ofctl:
+ * New command "flush-conntrack" that accepts zone and 5-tuple or partial
+   5-tuple for both directions.
 
 
 v3.0.0 - 15 Aug 2022
diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h
index b68804991..35f66a76d 100644
--- a/include/openflow/nicira-ext.h
+++ b/include/openflow/nicira-ext.h
@@ -1064,4 +1064,41 @@ struct nx_zone_id {
 };
 OFP_ASSERT(sizeof(struct nx_zone_id) == 8);
 
+/* CT flush available TLVs. */
+enum nx_ct_flush_tlv_type {
+/* Outer types. */
+NXT_CT_ORIG_TUPLE = 0, /* Outer type for original tuple TLV.
+* The nested TLV are specified
+* by 'enum nx_ct_flush_tuple_tlv_type'. */
+NXT_CT_REPLY_TUPLE = 1,/* Outer type for reply tuple TLC. *
+* The nested TLV are specified
+* by 'enum nx_ct_flush_tuple_tlv_type'*/
+/* Primitive types. */
+NXT_CT_ZONE_ID = 2,/* be16 zone id. */
+};
+
+/* CT flush nested TLVs. */
+enum nx_ct_flush_tuple_tlv_type {
+NXT_CT_TUPLE_SRC = 0,/* IPv6 or mapped IPv4 address. */
+NXT_CT_TUPLE_DST = 1,/* IPv6 or mapped IPv4 address. */
+NXT_CT_TUPLE_SRC_PORT = 2,   /* be16 source port. */
+NXT_CT_TUPLE_DST_PORT = 3,   /* be16 destination port. */
+NXT_CT_TUPLE_ICMP_ID = 4,   /* be16 ICMP id. */
+NXT_CT_TUPLE_ICMP_TYPE = 5, /* u8 ICMP type. */
+NXT_CT_TUPLE_ICMP_CODE = 6, /* u8 ICMP code. */
+};
+
+/* NXT_CT_FLUSH.
+ *
+ * Flushes the connection tracking specified by 5-tuple.
+ * The struct should be followed by TLVs specifying the matching parameters.
+ * Currently there is a limitation for ICMP, in order to partially match on
+ * ICMP parameters the tuple should include at least SRC/DST. */
+struct nx_ct_flush {
+uint8_t ip_proto;  /* IP protocol. */
+uint8_t pad[7];   /* Must be zero. */
+/* Followed by optional TLVs of type 'enum nx_ct_flush_tlv_type'. */
+};
+OFP_ASSERT(sizeof(struct nx_ct_flush) == 8);
+
 #endif /* openflow/nicira-ext.h */
diff --git a/include/openvswitch/ofp-ct.h b/include/openvswitch/ofp-ct.h
index e7af45337..094da0a97 100644
--- a/include/openvswitch/ofp-ct.h
+++ b/include/openvswitch/ofp-ct.h
@@ -21,6 +21,8 @@
 #include 
 #include 
 
+#include "openflow/nicira-ext.h"
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -63,6 +65,14 @@ bool ofp_ct_tuple_parse(struct ofp_ct_tuple *tuple, const 
char *s,
 
 bool ofp_ct_is_match_zero(const struct ofp_ct_match *match);
 
+enum ofperr ofp_ct_match_decode(struct ofp_ct_match *match, bool *with_zone,
+uint16_t *zone_id,
+const struct ofp_header *oh);
+
+struct ofpbuf * ofp_ct_match_encode(const struct ofp_ct_match *match,
+

Re: [ovs-dev] [PATCH v6 1/2] ofp, dpif: Allow CT flush based on partial match

2023-01-16 Thread Ales Musil
On Mon, Jan 16, 2023 at 12:20 PM Ilya Maximets  wrote:

> On 1/16/23 12:08, Ales Musil wrote:
> >
> >
> > On Mon, Jan 16, 2023 at 10:53 AM Ilya Maximets  > wrote:
> >
> > On 1/16/23 07:21, Ales Musil wrote:
> > >
> > >
> > > On Fri, Jan 13, 2023 at 9:58 PM Ilya Maximets   >> wrote:
> > >
> > > On 1/12/23 18:17, Ales Musil wrote:
> > > > Currently, the CT can be flushed by dpctl only by specifying
> > > > the whole 5-tuple. This is not very convenient when there are
> > > > only some fields known to the user of CT flush. Add new
> struct
> > > > ofputil_ct_match which represents the generic filtering that
> can
> > > > be done for CT flush. The match is done only on fields that
> are
> > > > non-zero with exception to the icmp fields.
> > > >
> > > > This allows the filtering just within dpctl, however
> > > > it is a preparation for OpenFlow extension.
> > > >
> > > > Reported-at: https://bugzilla.redhat.com/2120546 <
> https://bugzilla.redhat.com/2120546>  >
> > > > Signed-off-by: Ales Musil  amu...@redhat.com> >>
> > > > ---
> > > > v6: Rebase on top of current master.
> > > > Address comments from Ilya.
> > >
> > > Nit: this kind of version history is not really useful.  It
> should say
> > > what actually changed.  I don't remember every comment I made,
> other
> > > people have no idea what I was asking for.
> > >
> > > > v5: Add ack from Paolo.
> > > > v4: Fix a flush all scenario.
> > > > v3: Rebase on top of master.
> > > > Address the C99 comment and missing dpif_close call.
> > > > v2: Rebase on top of master.
> > > > Address comments from Paolo.
> > > > ---
> > > >  NEWS   |   3 +
> > > >  include/openvswitch/ofp-util.h |  28 +++
> > > >  lib/automake.mk   http://automake.mk>>|   2 +
> > > >  lib/ct-dpif.c  | 210 ++-
> > > >  lib/ct-dpif.h  |   4 +-
> > > >  lib/dpctl.c|  43 +++--
> > > >  lib/dpctl.man  |  15 +-
> > > >  lib/ofp-ct-util.c  | 303
> +
> > > >  lib/ofp-ct-util.h  |  40 +
> > > >  tests/system-traffic.at  <
> http://system-traffic.at >| 102
> ++-
> > > >  10 files changed, 611 insertions(+), 139 deletions(-)
> > > >  create mode 100644 lib/ofp-ct-util.c
> > > >  create mode 100644 lib/ofp-ct-util.h
> > > >
> >
> > 
> >
> > > > +bool
> > > > +ofputil_ct_tuple_is_zero(const struct ofputil_ct_tuple
> *tuple, uint8_t ip_proto)
> > > > +{
> > > > +bool is_zero = ipv6_is_zero(>src) &&
> ipv6_is_zero(>dst);
> > > > +
> > > > +if (!(ip_proto == IPPROTO_ICMP || ip_proto ==
> IPPROTO_ICMPV6)) {
> > > > +is_zero = is_zero && !tuple->src_port &&
> !tuple->dst_port;
> > > > +}
> > > > +
> > > > +return is_zero;
> > >
> > > Do we require ICMP tuples to always have at least one address
> specified?
> > > Or should the 'if' above have an 'else { return false; }' ?
> > >
> > >
> > > The ICMP is very problematic, if we do the 'else { return false;
> }' the shortcut for
> > > the full 5-tuple stops working. Without it if user specifies reply
> with only ICMP parameters
> > > we won't flush anything. I'm not sure what the correct approach
> here would be.
> > > Maybe documenting the limitation? Adding note that for ICMP in
> reply direction to work
> > > user needs to specify at least one IP?
> >
> > I think, the correct solution to construct a mask of what the user
> > have specified while parsing and use the mask instead of values for
> > is_zero().  And also use it during comparison to con compare fields
> > that are not specified.  Will that solve the problem?
> >
> > Since partial matches are not supported, it can be a bitmask of
> > NXT_CT_TUPLE_* values.
> >
> > It should be OK to just document for now.  This also should be
> > mentioned in the header along with the definition of the struct
> > nx_ct_flush.
> >
> > A bitmap thing can go as a fixup after the branching, if it works.
> >
> > Best regards, Ilya Maximets.
> >
> >
> >
> > Yes, that would work and it would be probably faster.
> > I will update the NXT_CT_TUPLE_* 

Re: [ovs-dev] [PATCH v6 1/2] ofp, dpif: Allow CT flush based on partial match

2023-01-16 Thread Ilya Maximets
On 1/16/23 12:08, Ales Musil wrote:
> 
> 
> On Mon, Jan 16, 2023 at 10:53 AM Ilya Maximets  > wrote:
> 
> On 1/16/23 07:21, Ales Musil wrote:
> >
> >
> > On Fri, Jan 13, 2023 at 9:58 PM Ilya Maximets    >> wrote:
> >
> >     On 1/12/23 18:17, Ales Musil wrote:
> >     > Currently, the CT can be flushed by dpctl only by specifying
> >     > the whole 5-tuple. This is not very convenient when there are
> >     > only some fields known to the user of CT flush. Add new struct
> >     > ofputil_ct_match which represents the generic filtering that can
> >     > be done for CT flush. The match is done only on fields that are
> >     > non-zero with exception to the icmp fields.
> >     >
> >     > This allows the filtering just within dpctl, however
> >     > it is a preparation for OpenFlow extension.
> >     >
> >     > Reported-at: https://bugzilla.redhat.com/2120546 
>   >
> >     > Signed-off-by: Ales Musil    >>
> >     > ---
> >     > v6: Rebase on top of current master.
> >     >     Address comments from Ilya.
> >
> >     Nit: this kind of version history is not really useful.  It should 
> say
> >     what actually changed.  I don't remember every comment I made, other
> >     people have no idea what I was asking for.
> >
> >     > v5: Add ack from Paolo.
> >     > v4: Fix a flush all scenario.
> >     > v3: Rebase on top of master.
> >     >     Address the C99 comment and missing dpif_close call.
> >     > v2: Rebase on top of master.
> >     >     Address comments from Paolo.
> >     > ---
> >     >  NEWS                           |   3 +
> >     >  include/openvswitch/ofp-util.h |  28 +++
> >     >  lib/automake.mk   >                |   2 +
> >     >  lib/ct-dpif.c                  | 210 ++-
> >     >  lib/ct-dpif.h                  |   4 +-
> >     >  lib/dpctl.c                    |  43 +++--
> >     >  lib/dpctl.man                  |  15 +-
> >     >  lib/ofp-ct-util.c              | 303 
> +
> >     >  lib/ofp-ct-util.h              |  40 +
> >     >  tests/system-traffic.at  
> >        | 102 ++-
> >     >  10 files changed, 611 insertions(+), 139 deletions(-)
> >     >  create mode 100644 lib/ofp-ct-util.c
> >     >  create mode 100644 lib/ofp-ct-util.h
> >     >
> 
> 
> 
> >     > +bool
> >     > +ofputil_ct_tuple_is_zero(const struct ofputil_ct_tuple *tuple, 
> uint8_t ip_proto)
> >     > +{
> >     > +    bool is_zero = ipv6_is_zero(>src) && 
> ipv6_is_zero(>dst);
> >     > +
> >     > +    if (!(ip_proto == IPPROTO_ICMP || ip_proto == 
> IPPROTO_ICMPV6)) {
> >     > +        is_zero = is_zero && !tuple->src_port && 
> !tuple->dst_port;
> >     > +    }
> >     > +
> >     > +    return is_zero;
> >
> >     Do we require ICMP tuples to always have at least one address 
> specified?
> >     Or should the 'if' above have an 'else { return false; }' ?
> >
> >
> > The ICMP is very problematic, if we do the 'else { return false; }' the 
> shortcut for
> > the full 5-tuple stops working. Without it if user specifies reply with 
> only ICMP parameters
> > we won't flush anything. I'm not sure what the correct approach here 
> would be.
> > Maybe documenting the limitation? Adding note that for ICMP in reply 
> direction to work
> > user needs to specify at least one IP? 
> 
> I think, the correct solution to construct a mask of what the user
> have specified while parsing and use the mask instead of values for
> is_zero().  And also use it during comparison to con compare fields
> that are not specified.  Will that solve the problem?
> 
> Since partial matches are not supported, it can be a bitmask of
> NXT_CT_TUPLE_* values.
> 
> It should be OK to just document for now.  This also should be
> mentioned in the header along with the definition of the struct
> nx_ct_flush.
> 
> A bitmap thing can go as a fixup after the branching, if it works.
> 
> Best regards, Ilya Maximets.
> 
> 
> 
> Yes, that would work and it would be probably faster.
> I will update the NXT_CT_TUPLE_* values, so they can be included in bitmask
> without breaking compatibility with the current approach.

They can be included in the mask as long as they are unique,
i.e. with (1 << NXTC_CT_TUPLE_*).

What do you have 

Re: [ovs-dev] [PATCH v6 1/2] ofp, dpif: Allow CT flush based on partial match

2023-01-16 Thread Ales Musil
On Mon, Jan 16, 2023 at 10:53 AM Ilya Maximets  wrote:

> On 1/16/23 07:21, Ales Musil wrote:
> >
> >
> > On Fri, Jan 13, 2023 at 9:58 PM Ilya Maximets  > wrote:
> >
> > On 1/12/23 18:17, Ales Musil wrote:
> > > Currently, the CT can be flushed by dpctl only by specifying
> > > the whole 5-tuple. This is not very convenient when there are
> > > only some fields known to the user of CT flush. Add new struct
> > > ofputil_ct_match which represents the generic filtering that can
> > > be done for CT flush. The match is done only on fields that are
> > > non-zero with exception to the icmp fields.
> > >
> > > This allows the filtering just within dpctl, however
> > > it is a preparation for OpenFlow extension.
> > >
> > > Reported-at: https://bugzilla.redhat.com/2120546 <
> https://bugzilla.redhat.com/2120546>
> > > Signed-off-by: Ales Musil  amu...@redhat.com>>
> > > ---
> > > v6: Rebase on top of current master.
> > > Address comments from Ilya.
> >
> > Nit: this kind of version history is not really useful.  It should
> say
> > what actually changed.  I don't remember every comment I made, other
> > people have no idea what I was asking for.
> >
> > > v5: Add ack from Paolo.
> > > v4: Fix a flush all scenario.
> > > v3: Rebase on top of master.
> > > Address the C99 comment and missing dpif_close call.
> > > v2: Rebase on top of master.
> > > Address comments from Paolo.
> > > ---
> > >  NEWS   |   3 +
> > >  include/openvswitch/ofp-util.h |  28 +++
> > >  lib/automake.mk |   2 +
> > >  lib/ct-dpif.c  | 210 ++-
> > >  lib/ct-dpif.h  |   4 +-
> > >  lib/dpctl.c|  43 +++--
> > >  lib/dpctl.man  |  15 +-
> > >  lib/ofp-ct-util.c  | 303
> +
> > >  lib/ofp-ct-util.h  |  40 +
> > >  tests/system-traffic.at | 102
> ++-
> > >  10 files changed, 611 insertions(+), 139 deletions(-)
> > >  create mode 100644 lib/ofp-ct-util.c
> > >  create mode 100644 lib/ofp-ct-util.h
> > >
>
> 
>
> > > +bool
> > > +ofputil_ct_tuple_is_zero(const struct ofputil_ct_tuple *tuple,
> uint8_t ip_proto)
> > > +{
> > > +bool is_zero = ipv6_is_zero(>src) &&
> ipv6_is_zero(>dst);
> > > +
> > > +if (!(ip_proto == IPPROTO_ICMP || ip_proto ==
> IPPROTO_ICMPV6)) {
> > > +is_zero = is_zero && !tuple->src_port && !tuple->dst_port;
> > > +}
> > > +
> > > +return is_zero;
> >
> > Do we require ICMP tuples to always have at least one address
> specified?
> > Or should the 'if' above have an 'else { return false; }' ?
> >
> >
> > The ICMP is very problematic, if we do the 'else { return false; }' the
> shortcut for
> > the full 5-tuple stops working. Without it if user specifies reply with
> only ICMP parameters
> > we won't flush anything. I'm not sure what the correct approach here
> would be.
> > Maybe documenting the limitation? Adding note that for ICMP in reply
> direction to work
> > user needs to specify at least one IP?
>
> I think, the correct solution to construct a mask of what the user
> have specified while parsing and use the mask instead of values for
> is_zero().  And also use it during comparison to con compare fields
> that are not specified.  Will that solve the problem?
>
> Since partial matches are not supported, it can be a bitmask of
> NXT_CT_TUPLE_* values.
>
> It should be OK to just document for now.  This also should be
> mentioned in the header along with the definition of the struct
> nx_ct_flush.
>
> A bitmap thing can go as a fixup after the branching, if it works.
>
> Best regards, Ilya Maximets.
>
>

Yes, that would work and it would be probably faster.
I will update the NXT_CT_TUPLE_* values, so they can be included in bitmask
without breaking compatibility with the current approach.

Thanks,
Ales

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.comIM: amusil

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 1/2] ofp, dpif: Allow CT flush based on partial match

2023-01-16 Thread Ilya Maximets
On 1/16/23 07:21, Ales Musil wrote:
> 
> 
> On Fri, Jan 13, 2023 at 9:58 PM Ilya Maximets  > wrote:
> 
> On 1/12/23 18:17, Ales Musil wrote:
> > Currently, the CT can be flushed by dpctl only by specifying
> > the whole 5-tuple. This is not very convenient when there are
> > only some fields known to the user of CT flush. Add new struct
> > ofputil_ct_match which represents the generic filtering that can
> > be done for CT flush. The match is done only on fields that are
> > non-zero with exception to the icmp fields.
> >
> > This allows the filtering just within dpctl, however
> > it is a preparation for OpenFlow extension.
> >
> > Reported-at: https://bugzilla.redhat.com/2120546 
> 
> > Signed-off-by: Ales Musil mailto:amu...@redhat.com>>
> > ---
> > v6: Rebase on top of current master.
> >     Address comments from Ilya.
> 
> Nit: this kind of version history is not really useful.  It should say
> what actually changed.  I don't remember every comment I made, other
> people have no idea what I was asking for.
> 
> > v5: Add ack from Paolo.
> > v4: Fix a flush all scenario.
> > v3: Rebase on top of master.
> >     Address the C99 comment and missing dpif_close call.
> > v2: Rebase on top of master.
> >     Address comments from Paolo.
> > ---
> >  NEWS                           |   3 +
> >  include/openvswitch/ofp-util.h |  28 +++
> >  lib/automake.mk                 |   2 +
> >  lib/ct-dpif.c                  | 210 ++-
> >  lib/ct-dpif.h                  |   4 +-
> >  lib/dpctl.c                    |  43 +++--
> >  lib/dpctl.man                  |  15 +-
> >  lib/ofp-ct-util.c              | 303 +
> >  lib/ofp-ct-util.h              |  40 +
> >  tests/system-traffic.at         | 102 
> ++-
> >  10 files changed, 611 insertions(+), 139 deletions(-)
> >  create mode 100644 lib/ofp-ct-util.c
> >  create mode 100644 lib/ofp-ct-util.h
> >



> > +bool
> > +ofputil_ct_tuple_is_zero(const struct ofputil_ct_tuple *tuple, uint8_t 
> ip_proto)
> > +{
> > +    bool is_zero = ipv6_is_zero(>src) && 
> ipv6_is_zero(>dst);
> > +
> > +    if (!(ip_proto == IPPROTO_ICMP || ip_proto == IPPROTO_ICMPV6)) {
> > +        is_zero = is_zero && !tuple->src_port && !tuple->dst_port;
> > +    }
> > +
> > +    return is_zero;
> 
> Do we require ICMP tuples to always have at least one address specified?
> Or should the 'if' above have an 'else { return false; }' ?
> 
> 
> The ICMP is very problematic, if we do the 'else { return false; }' the 
> shortcut for
> the full 5-tuple stops working. Without it if user specifies reply with only 
> ICMP parameters
> we won't flush anything. I'm not sure what the correct approach here would be.
> Maybe documenting the limitation? Adding note that for ICMP in reply 
> direction to work
> user needs to specify at least one IP? 

I think, the correct solution to construct a mask of what the user
have specified while parsing and use the mask instead of values for
is_zero().  And also use it during comparison to con compare fields
that are not specified.  Will that solve the problem?

Since partial matches are not supported, it can be a bitmask of
NXT_CT_TUPLE_* values.

It should be OK to just document for now.  This also should be
mentioned in the header along with the definition of the struct
nx_ct_flush.

A bitmap thing can go as a fixup after the branching, if it works.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev