Re: [ovs-dev] OVN LTS/non-LTS bug fix backport strategy (was: Re: [PATCH ovn 2/2] northd.c: Avoid sending ICMP time exceeded for multicast packets.)

2023-04-17 Thread Han Zhou
On Mon, Apr 17, 2023 at 12:01 PM Ilya Maximets  wrote:
>
> On 4/17/23 19:20, Han Zhou wrote:
> >
> >
> > On Mon, Apr 17, 2023 at 9:24 AM Numan Siddique > wrote:
> >>
> >> On Mon, Apr 17, 2023 at 5:57 AM Dumitru Ceara mailto:dce...@redhat.com>> wrote:
> >> >
> >> > On 4/17/23 11:55, Dumitru Ceara wrote:
> >> > > On 4/14/23 18:01, Han Zhou wrote:
> >> > >>
> >> > >>
> >> > >> On Fri, Apr 14, 2023 at 2:16 AM Dumitru Ceara mailto:dce...@redhat.com>
> >> > >> >> wrote:
> >> > >>>
> >> > >>> On 4/14/23 04:15, Han Zhou wrote:
> >> >  On Thu, Apr 13, 2023 at 12:44 PM Han Zhou mailto:hz...@ovn.org>
> >> > >> >> wrote:
> >> > >
> >> >  I backported this series to the last release branch-23.03 and
the LTS
> >> >  branch-22.03.
> >> >  The code base has changed a lot so was mostly manually
backported
> >> > >> instead
> >> >  of cherry-pick.
> >> > 
> >> > >>>
> >> > >>> Thanks for that but I think we shouldn't skip stable branches in
> >> > >>> between.  We likely still have users on those that won't upgrade
to the
> >> > >>> latest 23.03 but instead to a new z-stream release (when that
happens)
> >> > >>> of the stable branch.
> >> > >>>
> >> > >>> I cherry picked your backports and made sure the tests pass and
then
> >> > >>> pushed them to branches 22.12, 22.09 and 22.06.
> >> > >>
> >> > >> Thanks Dumitru for helping backporting. I was uncertain about
whether we
> >> > >> should continue backporting without skipping branches, as our
growing
> >> > >> number of released branches makes this approach increasingly
difficult
> >> > >> to sustain.
> >> > >>
> >> > >> The release process document
> >> > >> (https://docs.ovn.org/en/latest/internals/release-process.html <
https://docs.ovn.org/en/latest/internals/release-process.html>
> >> > >> >) doesn't
> >> > >> specify how long we should maintain non-LTS branches. My
understanding
> >> > >> was that we should primarily maintain LTS and the latest releases
for
> >> > >> regular bug fixes, while critical/security fixes would be applied
to all
> >> > >> branches. It appears we have different interpretations of this
process.
> >> > >> While it's beneficial to backport to all branches, I thought this
was
> >> > >> more of a convenience when there weren't too many branches to
manage. In
> >> > >> the worst case, a patch may require manual backporting to each
branch if
> >> > >> new features make the codebase in each branch unique. I also
question
> >> > >> what differentiates an LTS branch if we're backporting to all
branches
> >> > >> anyway. My assumption was that users opting for a non-LTS branch
should
> >> > >> be prepared to upgrade to newer releases, given the "short-term"
> >> > >> maintenance of these branches. Is my understanding incorrect?
> >> > >>
> >> > > Hi Han,
> >> > >
> >> > > I changed the subject of the email to better reflect the new
discussion.
> >> > >
> >> > > There's this statement in the release-process document: "LTS
releases
> >> > > will receive bug fixes until the point that another LTS is
released. At
> >> > > that point, the old LTS will receive an additional year of
critical and
> >> > > security fixes."
> >> > >
> >> > > Which further differentiates an LTS release from a non-LTS release.
> >
> > If you are saying "the old LTS will receive an additional year of
critical and security fixes" is the different part, I agree. But still, in
this case, for critical and security fixes, when applied to the LTS for the
*additional* year, do you agree that we will skip non-LTS releases? If not
skipping, then it makes no difference.
> >
> >> > >
> >> > >> Perhaps we should discuss and formalize our backporting practice
going
> >> > >> forward, as the number of branches will only continue to grow in
the
> >> > >> coming months and years. cc @Mark Michelson
> >> > >> >  @Numan
Siddique >
> >> > >>
> >> > >
> >> > > Checking the OVS/OVN release-process documents again it's indeed
not
> >> > > explicitly specified that non-LTS stable release branches are
supposed
> >> > > to get all bug fixes.  But the OVS document has this addition:
> >> > >
> >> > > "While branches other than LTS and the latest release are not
formally
> >> > > maintained, the OVS project usually provides stable releases for
these
> >> > > branches for at least 2 years, i.e. stable releases are provided
for the
> >> > >  last 4 release branches.  However, these branches may not include
all
> >> > > the fixes that LTS has in case backporting is not straightforward
and
> >> > > developers are not willing to spend their time on that (this mostly
> >> > > affects branches that are older than the LTS, because backporting
to LTS
> >> > > implies 

Re: [ovs-dev] [PATCH ovn] Expose distributed gateway port information in NB DB

2023-04-17 Thread Han Zhou
(Sorry that my previous reply includes redundant texts. Please ignore that
one and use this version :D)

On Mon, Apr 17, 2023 at 7:18 AM Lucas Martins  wrote:
>
> Thanks all for the discussion and all the ideas here.
>
> After reading the emails, I think it boils down to two proposed
approaches:
>
> 1) CMS to continue to connect to the Southbound database if they need
> information about the physical location of the resources. That would
> avoid the inefficiency of having to copy data back-and-forth from the
> Northbound and Southbound database.
>
> I guess the downside of this approach is that CMS will have to
> maintain a connection with both databases (which is already the case
> today).
>
> If we go with this approach, it would be good to have consensus from
> the core OVN team where some tables in the Southbound must be kept
> stable with backward compatibility in case of changes. Tables such
> Chassis, Chassis_Private and Port_Binding at least will need that. I
> guess that makes part of the Southbound database to not be considered
> OVN internal only.

It doesn't look very clean to expose the SB DB to CMS, but in practice I
think it is not a real problem for doing so for keeping tables backward
compatible, because even without considering CMS, OVN itself needs to keep
the compatibility for upgrading.
I am still open to the idea of keeping things in NB DB only, but at least
one issue not addressed so far in this discussion (even with the status DB)
is how to manage the orphan chassis if SB is not exposed to CMS. It seems
still more practical to me to let CMS connect to SB directly instead of
introducing a copy of Chassis table in NB and ovn-northd doing the
back-and-forth data sync. So I am not sure if it should be a goal to remove
all the SB access from CMS.

On the other hand, giving another thought, it doesn't sound too much extra
cost for propagating the "hosting-chassis" information for chassis-redirect
ports back to NB, considering that the number of DGP/LRP records is
relatively small, and we already update "UP" back for LSPs (which are far
more than LRPs) when they bind to a chassis. For code complexity, we can
make sure they are handled in similar ways (today only as part of
ovnsb_db_run(), but I am working on improvements with incremental
processing).

If we do that, I'd rather not use the "options" column, because this is not
a configuration, but a status. Maybe we should introduce a new
"status/details" column which is extensible for more information in the
future.

But again, we should only do this if it is really the best option for the
CMS (see below).

I think my real concern here is in fact connecting to SB or NB from every
node, like the example use case of the ovn-bgp-agent. There are below
options worth considering:
- The local OVSDB on each node is also considered as an interface between
OVN and CMS. It is possible for ovn-controller to expose some information
to the local OVSDB (e.g. through external-ids of br-int bridge or just the
openvswitch table). The benefit of this approach is that it reuses existing
SB connection of ovn-controller and avoids extra connections, but I am not
sure if this is a better or worse interface for the use case.
- Deploy a dedicated OVSDB relay cluster (for SB, or NB if it is decided to
propagate "hosting-chassis" to NB) for the per-node agents if scale becomes
an issue. This would introduce some operational complexity of course.
- A centralized CMS component replicates such information from SB/NB to
CMS's own DB/API, and the per-node component reads through CMS's DB/API
instead directly from OVN SB/NB. This leaves the scale problem to CMS, and
for my understanding this is probably more applicable for K8S (thanks to
the k8s api-server framework) but challenging for OpenStack.
- The "status DB" proposed by Mark (see my comments below).

>
> 2) The second approach is what Mark has described below, having a
> separated database for the runtime information.
>
> One thing I like about this approach is that it keep the role and
> permissions of each database quite clear, as described:
>
> NB DB: CMS writes, OVN reads
> SB DB: OVN-internal
> Status DB: CMS reads, OVN writes
>
> It also seems to enable a lot of room for extra information about the
> runtime to be added in the future (as mentioned in Mark's email, LB
> health status, LSP packet counts etc...).
>
> The con is that it's yet another database that needs to be deployed
> and maintained and another connection for CMS to maintain if they need
> that sort of information.
>

This is indeed an interesting idea, and it's not easy to determine whether
it's ultimately beneficial or detrimental. On one hand, it clearly defines
the role of each database and can be helpful for scaling, but on the other
hand, it might be too heavy to maintain an extra database from both OVN
development and operational perspectives.

For use cases like the ovn-bgp-agent, this approach may be more complex
than necessary. 

Re: [ovs-dev] [PATCH ovn] Expose distributed gateway port information in NB DB

2023-04-17 Thread Han Zhou
On Mon, Apr 17, 2023 at 7:18 AM Lucas Martins  wrote:
>
> Thanks all for the discussion and all the ideas here.
>
> After reading the emails, I think it boils down to two proposed
approaches:
>
> 1) CMS to continue to connect to the Southbound database if they need
> information about the physical location of the resources. That would
> avoid the inefficiency of having to copy data back-and-forth from the
> Northbound and Southbound database.
>
> I guess the downside of this approach is that CMS will have to
> maintain a connection with both databases (which is already the case
> today).
>
> If we go with this approach, it would be good to have consensus from
> the core OVN team where some tables in the Southbound must be kept
> stable with backward compatibility in case of changes. Tables such
> Chassis, Chassis_Private and Port_Binding at least will need that. I
> guess that makes part of the Southbound database to not be considered
> OVN internal only.

It doesn't look very clean to expose the SB DB to CMS, but in practice I
think it is not a real problem for doing so for keeping tables backward
compatible, because even without considering CMS, OVN itself needs to keep
the compatibility for upgrading.
I am still open to the idea of keeping things in NB DB only, but at least
one issue not addressed so far in this discussion (even with the status DB)
is how to manage the orphan chassis if SB is not exposed to CMS. It seems
still more practical to me to let CMS connect to SB directly instead of
introducing a copy of Chassis table in NB and ovn-northd doing the
back-and-forth data sync. So I am not sure if it should be a goal to remove
all the SB access from CMS.

On the other hand, giving another thought, it doesn't sound too much extra
cost for propagating the "hosting-chassis" information for chassis-redirect
ports back to NB, considering that the number of DGP/LRP records is
relatively small, and we already update "UP" back for LSPs (which are far
more than LRPs) when they bind to a chassis. For code complexity, we can
make sure they are handled in similar ways (today only as part of
ovnsb_db_run(), but I am working on improvements with incremental
processing).

If we do that, I'd rather not use the "options" column, because this is not
a configuration, but a status. Maybe we should introduce a new
"status/details" column which is extensible for more information in the
future.

But again, we should only do this if it is really the best option for the
CMS (see below).

I think my real concern here is in fact connecting to SB or NB from every
node, like the example use case of the ovn-bgp-agent. There are below
options worth considering:
- The local OVSDB on each node is also considered as an interface between
OVN and CMS. It is possible for ovn-controller to expose some information
to the local OVSDB (e.g. through external-ids of br-int bridge or just the
openvswitch table). The benefit of this approach is that it reuses existing
SB connection of ovn-controller and avoids extra connections, but I am not
sure if this is a better or worse interface for the use case.
- Deploy a dedicated OVSDB relay cluster (for SB, or NB if it is decided to
propagate "hosting-chassis" to NB) for the per-node agents if scale becomes
an issue. This would introduce some operational complexity of course.
- A centralized CMS component replicates such information from SB/NB to
CMS's own DB/API, and the per-node component reads through CMS's DB/API
instead directly from OVN SB/NB. This leaves the scale problem to CMS, and
for my understanding this is probably more applicable for K8S (thanks to
the k8s api-server framework) but challenging for OpenStack.
- The "status DB" proposed by Mark (see my comments below).

>
> 2) The second approach is what Mark has described below, having a
> separated database for the runtime information.
>
> One thing I like about this approach is that it keep the role and
> permissions of each database quite clear, as described:
>
> NB DB: CMS writes, OVN reads
> SB DB: OVN-internal
> Status DB: CMS reads, OVN writes
>
> It also seems to enable a lot of room for extra information about the
> runtime to be added in the future (as mentioned in Mark's email, LB
> health status, LSP packet counts etc...).
>
> The con is that it's yet another database that needs to be deployed
> and maintained and another connection for CMS to maintain if they need
> that sort of information.

This is indeed an interesting idea and it seems not easy to conclude
whether it's ultimately beneficial or detrimental. On one hand it does make
the role for each DB clear and also helpful for scaling, but on the other
hand it sounds too heavy for an extra DB to maintain (both from OVN
development and operation perspective).
I think for the use case like the ovn-bgp-agent, this is probably
unnecessarily complex than it really needs to be, but for the new
requirement of the telemetry counters brought up by Mark, it seems
necessary 

Re: [ovs-dev] OVN LTS/non-LTS bug fix backport strategy (was: Re: [PATCH ovn 2/2] northd.c: Avoid sending ICMP time exceeded for multicast packets.)

2023-04-17 Thread Ilya Maximets
On 4/17/23 19:20, Han Zhou wrote:
> 
> 
> On Mon, Apr 17, 2023 at 9:24 AM Numan Siddique  > wrote:
>>
>> On Mon, Apr 17, 2023 at 5:57 AM Dumitru Ceara > > wrote:
>> >
>> > On 4/17/23 11:55, Dumitru Ceara wrote:
>> > > On 4/14/23 18:01, Han Zhou wrote:
>> > >>
>> > >>
>> > >> On Fri, Apr 14, 2023 at 2:16 AM Dumitru Ceara > > >> 
>> > >> >> wrote:
>> > >>>
>> > >>> On 4/14/23 04:15, Han Zhou wrote:
>> >  On Thu, Apr 13, 2023 at 12:44 PM Han Zhou > >  
>> > >> >> wrote:
>> > >
>> >  I backported this series to the last release branch-23.03 and the LTS
>> >  branch-22.03.
>> >  The code base has changed a lot so was mostly manually backported
>> > >> instead
>> >  of cherry-pick.
>> > 
>> > >>>
>> > >>> Thanks for that but I think we shouldn't skip stable branches in
>> > >>> between.  We likely still have users on those that won't upgrade to the
>> > >>> latest 23.03 but instead to a new z-stream release (when that happens)
>> > >>> of the stable branch.
>> > >>>
>> > >>> I cherry picked your backports and made sure the tests pass and then
>> > >>> pushed them to branches 22.12, 22.09 and 22.06.
>> > >>
>> > >> Thanks Dumitru for helping backporting. I was uncertain about whether we
>> > >> should continue backporting without skipping branches, as our growing
>> > >> number of released branches makes this approach increasingly difficult
>> > >> to sustain.
>> > >>
>> > >> The release process document
>> > >> (https://docs.ovn.org/en/latest/internals/release-process.html 
>> > >> 
>> > >> > > >> >) 
>> > >> doesn't
>> > >> specify how long we should maintain non-LTS branches. My understanding
>> > >> was that we should primarily maintain LTS and the latest releases for
>> > >> regular bug fixes, while critical/security fixes would be applied to all
>> > >> branches. It appears we have different interpretations of this process.
>> > >> While it's beneficial to backport to all branches, I thought this was
>> > >> more of a convenience when there weren't too many branches to manage. In
>> > >> the worst case, a patch may require manual backporting to each branch if
>> > >> new features make the codebase in each branch unique. I also question
>> > >> what differentiates an LTS branch if we're backporting to all branches
>> > >> anyway. My assumption was that users opting for a non-LTS branch should
>> > >> be prepared to upgrade to newer releases, given the "short-term"
>> > >> maintenance of these branches. Is my understanding incorrect?
>> > >>
>> > > Hi Han,
>> > >
>> > > I changed the subject of the email to better reflect the new discussion.
>> > >
>> > > There's this statement in the release-process document: "LTS releases
>> > > will receive bug fixes until the point that another LTS is released. At
>> > > that point, the old LTS will receive an additional year of critical and
>> > > security fixes."
>> > >
>> > > Which further differentiates an LTS release from a non-LTS release.
> 
> If you are saying "the old LTS will receive an additional year of critical 
> and security fixes" is the different part, I agree. But still, in this case, 
> for critical and security fixes, when applied to the LTS for the *additional* 
> year, do you agree that we will skip non-LTS releases? If not skipping, then 
> it makes no difference.
> 
>> > >
>> > >> Perhaps we should discuss and formalize our backporting practice going
>> > >> forward, as the number of branches will only continue to grow in the
>> > >> coming months and years. cc @Mark Michelson
>> > >> >  @Numan 
>> > >> Siddique >
>> > >>
>> > >
>> > > Checking the OVS/OVN release-process documents again it's indeed not
>> > > explicitly specified that non-LTS stable release branches are supposed
>> > > to get all bug fixes.  But the OVS document has this addition:
>> > >
>> > > "While branches other than LTS and the latest release are not formally
>> > > maintained, the OVS project usually provides stable releases for these
>> > > branches for at least 2 years, i.e. stable releases are provided for the
>> > >  last 4 release branches.  However, these branches may not include all
>> > > the fixes that LTS has in case backporting is not straightforward and
>> > > developers are not willing to spend their time on that (this mostly
>> > > affects branches that are older than the LTS, because backporting to LTS
>> > > implies backporting to all intermediate branches)."
>> > >
>> > > This last part really makes me think that we SHOULD backport to all
>> > > 

Re: [ovs-dev] OVN LTS/non-LTS bug fix backport strategy (was: Re: [PATCH ovn 2/2] northd.c: Avoid sending ICMP time exceeded for multicast packets.)

2023-04-17 Thread Han Zhou
On Mon, Apr 17, 2023 at 9:24 AM Numan Siddique  wrote:
>
> On Mon, Apr 17, 2023 at 5:57 AM Dumitru Ceara  wrote:
> >
> > On 4/17/23 11:55, Dumitru Ceara wrote:
> > > On 4/14/23 18:01, Han Zhou wrote:
> > >>
> > >>
> > >> On Fri, Apr 14, 2023 at 2:16 AM Dumitru Ceara  > >> > wrote:
> > >>>
> > >>> On 4/14/23 04:15, Han Zhou wrote:
> >  On Thu, Apr 13, 2023 at 12:44 PM Han Zhou  > >> > wrote:
> > >
> > >
> > >
> > > On Thu, Apr 13, 2023 at 12:35 PM Dumitru Ceara  > >> > wrote:
> > >>
> > >> On 4/13/23 21:21, Han Zhou wrote:
> > >>> On Thu, Apr 13, 2023 at 8:37 AM Dumitru Ceara  > >> >
> >  wrote:
> > 
> >  On 4/13/23 17:34, Han Zhou wrote:
> > > On Thu, Apr 13, 2023 at 12:54 AM Dumitru Ceara
> > >> mailto:dce...@redhat.com>>
> > >>> wrote:
> > >>
> > >> On 4/13/23 07:07, Han Zhou wrote:
> > >>> In RFC1812 section 5.3.1, it is mentioned that:
> > >>>
> > >>>If the TTL is reduced to zero (or less), the packet MUST
be
> > >>>discarded, and if the destination is not a multicast
address
> >  the
> > >>>router MUST send an ICMP Time Exceeded message ...
> > >>>
> > >>
> > >> The code itself looks OK but I wonder a bit about the
rationale.
> >  Do
> > >>> you
> > >> have an example in which OVN replies with Time Exceeded for
> >  multicast
> > >> destinations and that causes issues?
> > >>
> > >>> So if the destionation is a multicast address the route
shouldn't
> >  send
> > >>> ICMP Time Exceeded, but the current OVN implementation
didn't
> >  check
> > >>> multicast and tries to send ICMP regardless. This patch
fixes it.
> > >>
> > >> The statement "if destination is not a multicast address the
> > >> router
> > >>> MUST
> > >> send an ICMP Time Exceeded message" implies that "if
destination
> >  is a
> > >> multicast address the router MAY or MAY NOT send an ICMP Time
> >  Exceeded
> > >> message".  So the fact that OVN sends one is not necessarily
> > >> wrong.
> > >
> > > For my limited understanding of multicast, sending ICMP time
> >  exceeded is
> > > not a good idea. In multicast TTL has special meanings, for
> > >> example:
> > > TTL 0: Restricted to the same host, not transmitted by the
> >  router.
> > > TTL 1: Restricted to the same subnet, not forwarded by the
> >  router.
> > > If we send ICMP for such packets, it means for a very common
use
> >  case of
> > > multicast (ttl = 1, same subnet), we will end up sending ICMP
for
> >  every
> > > normal packet.
> > > In production we saw this happening with a rate higher than
10k
> > >> pps!
> > >
> > 
> >  Makes sense, thanks for the details!
> > 
> > > So I believe this is the reason behind the "if" statement in
the
> >  RFC.
> > >>> Maybe
> > > I should add this rationale in the comment, too.
> > >
> > 
> >  If you could add some of the details above to the commit log
too
> >  then:
> > 
> >  Acked-by: Dumitru Ceara  > >> >
> > >>>
> > >>> Thanks Dumitru. I applied the series to main.
> > >>> Han
> > >>
> > >> This sounds like a relevant bug fix.  Should this get backported
to
> > >> stable branches too?
> > >>
> > > Sound good. I will do that.
> > 
> >  I backported this series to the last release branch-23.03 and the
LTS
> >  branch-22.03.
> >  The code base has changed a lot so was mostly manually backported
> > >> instead
> >  of cherry-pick.
> > 
> > >>>
> > >>> Thanks for that but I think we shouldn't skip stable branches in
> > >>> between.  We likely still have users on those that won't upgrade to
the
> > >>> latest 23.03 but instead to a new z-stream release (when that
happens)
> > >>> of the stable branch.
> > >>>
> > >>> I cherry picked your backports and made sure the tests pass and then
> > >>> pushed them to branches 22.12, 22.09 and 22.06.
> > >>
> > >> Thanks Dumitru for helping backporting. I was uncertain about
whether we
> > >> should continue backporting without skipping branches, as our growing
> > >> number of released branches makes this approach increasingly
difficult
> > >> to sustain.
> > >>
> > >> The release process document
> > >> (https://docs.ovn.org/en/latest/internals/release-process.html
> > >> )
doesn't
> > >> specify how long we should maintain non-LTS branches. My
understanding
> > >> was that we should primarily maintain LTS and the latest releases for
> > >> regular bug fixes, while critical/security 

[ovs-dev] [PATCH] system-offloads-traffic: Remove tc ingress pps check on meter offload.

2023-04-17 Thread David Marchand
Caught during some code review.
The incriminated commit had put an unneeded check on tc ingress support
for the meter offloading test.

Note: SUPPORT_TC_INGRESS_PPS had been reworked in the commit 5f0fdf5e2c2e
("test: Move check for tc ingress pps support to test script.").

Fixes: 5660b89a309d ("dpif-netlink: Offloading meter to tc police action")
Signed-off-by: David Marchand 
---
 tests/system-offloads-traffic.at | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index da18597cd8..df43caa9e5 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -240,7 +240,6 @@ AT_CLEANUP
 
 AT_SETUP([offloads - check interface meter offloading -  offloads enabled])
 AT_KEYWORDS([offload-meter])
-AT_SKIP_IF([test $SUPPORT_TC_INGRESS_PPS = "no"])
 AT_SKIP_IF([test $HAVE_NC = "no"])
 OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . 
other_config:hw-offload=true])
 
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVN LTS/non-LTS bug fix backport strategy (was: Re: [PATCH ovn 2/2] northd.c: Avoid sending ICMP time exceeded for multicast packets.)

2023-04-17 Thread Numan Siddique
On Mon, Apr 17, 2023 at 5:57 AM Dumitru Ceara  wrote:
>
> On 4/17/23 11:55, Dumitru Ceara wrote:
> > On 4/14/23 18:01, Han Zhou wrote:
> >>
> >>
> >> On Fri, Apr 14, 2023 at 2:16 AM Dumitru Ceara  >> > wrote:
> >>>
> >>> On 4/14/23 04:15, Han Zhou wrote:
>  On Thu, Apr 13, 2023 at 12:44 PM Han Zhou  >> > wrote:
> >
> >
> >
> > On Thu, Apr 13, 2023 at 12:35 PM Dumitru Ceara  >> > wrote:
> >>
> >> On 4/13/23 21:21, Han Zhou wrote:
> >>> On Thu, Apr 13, 2023 at 8:37 AM Dumitru Ceara  >> >
>  wrote:
> 
>  On 4/13/23 17:34, Han Zhou wrote:
> > On Thu, Apr 13, 2023 at 12:54 AM Dumitru Ceara
> >> mailto:dce...@redhat.com>>
> >>> wrote:
> >>
> >> On 4/13/23 07:07, Han Zhou wrote:
> >>> In RFC1812 section 5.3.1, it is mentioned that:
> >>>
> >>>If the TTL is reduced to zero (or less), the packet MUST be
> >>>discarded, and if the destination is not a multicast address
>  the
> >>>router MUST send an ICMP Time Exceeded message ...
> >>>
> >>
> >> The code itself looks OK but I wonder a bit about the rationale.
>  Do
> >>> you
> >> have an example in which OVN replies with Time Exceeded for
>  multicast
> >> destinations and that causes issues?
> >>
> >>> So if the destionation is a multicast address the route shouldn't
>  send
> >>> ICMP Time Exceeded, but the current OVN implementation didn't
>  check
> >>> multicast and tries to send ICMP regardless. This patch fixes it.
> >>
> >> The statement "if destination is not a multicast address the
> >> router
> >>> MUST
> >> send an ICMP Time Exceeded message" implies that "if destination
>  is a
> >> multicast address the router MAY or MAY NOT send an ICMP Time
>  Exceeded
> >> message".  So the fact that OVN sends one is not necessarily
> >> wrong.
> >
> > For my limited understanding of multicast, sending ICMP time
>  exceeded is
> > not a good idea. In multicast TTL has special meanings, for
> >> example:
> > TTL 0: Restricted to the same host, not transmitted by the
>  router.
> > TTL 1: Restricted to the same subnet, not forwarded by the
>  router.
> > If we send ICMP for such packets, it means for a very common use
>  case of
> > multicast (ttl = 1, same subnet), we will end up sending ICMP for
>  every
> > normal packet.
> > In production we saw this happening with a rate higher than 10k
> >> pps!
> >
> 
>  Makes sense, thanks for the details!
> 
> > So I believe this is the reason behind the "if" statement in the
>  RFC.
> >>> Maybe
> > I should add this rationale in the comment, too.
> >
> 
>  If you could add some of the details above to the commit log too
>  then:
> 
>  Acked-by: Dumitru Ceara  >> >
> >>>
> >>> Thanks Dumitru. I applied the series to main.
> >>> Han
> >>
> >> This sounds like a relevant bug fix.  Should this get backported to
> >> stable branches too?
> >>
> > Sound good. I will do that.
> 
>  I backported this series to the last release branch-23.03 and the LTS
>  branch-22.03.
>  The code base has changed a lot so was mostly manually backported
> >> instead
>  of cherry-pick.
> 
> >>>
> >>> Thanks for that but I think we shouldn't skip stable branches in
> >>> between.  We likely still have users on those that won't upgrade to the
> >>> latest 23.03 but instead to a new z-stream release (when that happens)
> >>> of the stable branch.
> >>>
> >>> I cherry picked your backports and made sure the tests pass and then
> >>> pushed them to branches 22.12, 22.09 and 22.06.
> >>
> >> Thanks Dumitru for helping backporting. I was uncertain about whether we
> >> should continue backporting without skipping branches, as our growing
> >> number of released branches makes this approach increasingly difficult
> >> to sustain.
> >>
> >> The release process document
> >> (https://docs.ovn.org/en/latest/internals/release-process.html
> >> ) doesn't
> >> specify how long we should maintain non-LTS branches. My understanding
> >> was that we should primarily maintain LTS and the latest releases for
> >> regular bug fixes, while critical/security fixes would be applied to all
> >> branches. It appears we have different interpretations of this process.
> >> While it's beneficial to backport to all branches, I thought this was
> >> more of a convenience when there weren't too many branches to manage. In
> >> the worst case, a patch may require manual 

Re: [ovs-dev] [PATCH v10] utilities/ofctl: add-meters for save and restore

2023-04-17 Thread Simon Horman
On Tue, Apr 11, 2023 at 09:25:43PM +0800, Wan Junjie wrote:
> put dump-meters' result in one line so add-meters can handle.
> save and restore meters when restart ovs.
> bundle functions are not implemented in this patch.
> 
> Signed-off-by: Wan Junjie 
> 
> ---
> v10:
> merge oneline to verbosity in parse_options
> ofp_to_string__ handle verbosity bits right

Thanks, this one looks good to me.

Reviewed-by: Simon Horman 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ci: Add job to build Fedora RPMs on every push/PR

2023-04-17 Thread Enrique Llorente Pastora
This is better than nothing, copr would be the usually thing to have, but
looks like it needs
a lot of administrative duties, we can improve this in the future.

+1

On Wed, Apr 12, 2023 at 10:59 AM Ales Musil  wrote:

> In order to have up-to-date Fedora RPMs from main
> branch add job that will build the RPMs automatically
> and publishes them as artifacts. Those artifacts are
> available for download to any logged-in user on GH.
>
> Reported-at: https://bugzilla.redhat.com/2178936
> Signed-off-by: Ales Musil 
> ---
>  .github/workflows/test.yml | 50 ++
>  1 file changed, 50 insertions(+)
>
> diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
> index b41f95936..12cae 100644
> --- a/.github/workflows/test.yml
> +++ b/.github/workflows/test.yml
> @@ -201,3 +201,53 @@ jobs:
>with:
>  name: logs-osx-clang---disable-ssl
>  path: config.log
> +
> +  build-linux-rpm:
> +name: linux rpm fedora
> +runs-on: ubuntu-latest
> +container: fedora:latest
> +timeout-minutes: 30
> +
> +strategy:
> +  fail-fast: false
> +
> +steps:
> +  - name: install dependencies
> +run: dnf install -y dnf-plugins-core git rpm-build
> +
> +  - name: checkout
> +uses: actions/checkout@v3
> +with:
> +  submodules: recursive
> +
> +  - name: install build dependencies
> +run: |
> +  sed -e 's/@VERSION@/0.0.1/' rhel/ovn-fedora.spec.in \
> +  > /tmp/ovn.spec
> +  dnf builddep -y /tmp/ovn.spec
> +
> +  - name: configure OvS
> +run: ./boot.sh && ./configure
> +working-directory: ovs
> +
> +  - name: make dist OvS
> +run: make dist
> +working-directory: ovs
> +
> +  - name: configure OVN
> +run:  ./boot.sh && ./configure
> +
> +  - name: make dist OVN
> +run: make dist
> +
> +  - name: build RPM
> +run:  make rpm-fedora
> +
> +  - name: upload rpm packages
> +uses: actions/upload-artifact@v3
> +with:
> +  name: rpm-packages
> +  path: |
> +rpm/rpmbuild/SRPMS/*.rpm
> +rpm/rpmbuild/RPMS/*/*.rpm
> +  retention-days: 14
> --
> 2.39.2
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>

-- 
*Quique Llorente*

CNV networking Senior Software Engineer

Red Hat EMEA 

ellor...@redhat.com 
@RedHat    Red Hat
  Red Hat


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 3/3] ci: Switch Cirrus CI to use the new image

2023-04-17 Thread Ales Musil
Use the image with preinstalled dependencies
also on Cirrus CI.

Signed-off-by: Ales Musil 
---
 .cirrus.yml | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 10869d8a2..bd4cd08aa 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -1,7 +1,7 @@
 arm_unit_tests_task:
 
   arm_container:
-image: quay.io/fedora/fedora:37
+image: ghcr.io/ovn-org/ovn-tests:fedora
 memory: 4G
 cpu: 2
 
@@ -9,10 +9,6 @@ arm_unit_tests_task:
 ARCH: aarch64
 CIRRUS_CLONE_SUBMODULES: true
 PATH: ${HOME}/bin:${HOME}/.local/bin:${PATH}
-DEPENDENCIES: autoconf automake clang curl ethtool gcc git
-  glibc-langpack-en groff iproute iproute-tc iputils libcap-ng-devel
-  libtool net-tools nmap-ncat openssl openssl-devel python3-devel
-  python3-pip python3-sphinx tcpdump unbound unbound-devel wget
 matrix:
   - CC: gcc
 TESTSUITE: test
@@ -35,12 +31,5 @@ arm_unit_tests_task:
 
   name: ARM64 ${CC} ${TESTSUITE} ${TEST_RANGE}
 
-  dependencies_script:
-- dnf -y update
-- dnf -y install ${DEPENDENCIES}
-
-  prepare_script:
-- ./.ci/linux-prepare.sh
-
   build_script:
 - ./.ci/linux-build.sh
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 2/3] ci: Use container to run the tests

2023-04-17 Thread Ales Musil
Move the ci.sh script into .ci folder
and remove the linux-prepare.sh as it
is no longer needed with all the requirements
installed in container.

Signed-off-by: Ales Musil 
---
 {utilities/containers => .ci}/ci.sh |  0
 .ci/linux-prepare.sh| 21 ---
 .github/workflows/test.yml  | 54 ++---
 Makefile.am |  2 +-
 utilities/automake.mk   |  1 -
 5 files changed, 3 insertions(+), 75 deletions(-)
 rename {utilities/containers => .ci}/ci.sh (100%)
 delete mode 100755 .ci/linux-prepare.sh

diff --git a/utilities/containers/ci.sh b/.ci/ci.sh
similarity index 100%
rename from utilities/containers/ci.sh
rename to .ci/ci.sh
diff --git a/.ci/linux-prepare.sh b/.ci/linux-prepare.sh
deleted file mode 100755
index 6617d0c42..0
--- a/.ci/linux-prepare.sh
+++ /dev/null
@@ -1,21 +0,0 @@
-#!/bin/bash
-
-set -ev
-
-# Build and install sparse.
-#
-# Explicitly disable sparse support for llvm because some travis
-# environments claim to have LLVM (llvm-config exists and works) but
-# linking against it fails.
-# Disabling sqlite support because sindex build fails and we don't
-# really need this utility being installed.
-git clone git://git.kernel.org/pub/scm/devel/sparse/sparse.git
-cd sparse && make -j4 HAVE_LLVM= HAVE_SQLITE= install && cd ..
-
-# Installing wheel separately because it may be needed to build some
-# of the packages during dependency backtracking and pip >= 22.0 will
-# abort backtracking on build failures:
-# https://github.com/pypa/pip/issues/10655
-pip3 install --disable-pip-version-check --user wheel
-pip3 install --disable-pip-version-check --user \
--r utilities/containers/py-requirements.txt
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
index b41f95936..3251d2385 100644
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -14,12 +14,7 @@ concurrency:
 jobs:
   build-linux:
 env:
-  dependencies: |
-automake libtool gcc bc libjemalloc2 libjemalloc-dev\
-libssl-dev llvm-dev libelf-dev libnuma-dev libpcap-dev  \
-selinux-policy-dev ncat python3-scapy isc-dhcp-server \
-iputils-arping
-  m32_dependecies: gcc-multilib
+  IMAGE_NAME:  ghcr.io/ovn-org/ovn-tests:ubuntu
   ARCH:${{ matrix.cfg.arch }}
   CC:  ${{ matrix.cfg.compiler }}
   LIBS:${{ matrix.cfg.libs }}
@@ -89,53 +84,8 @@ jobs:
 sort -V | tail -1)
   working-directory: ovs
 
-- name: update APT cache
-  run:  sudo apt update
-
-- name: remove netcat-openbsd
-  run:  sudo apt remove -y netcat-openbsd
-
-- name: install required dependencies
-  run:  sudo apt install -y ${{ env.dependencies }}
-
-- name: install libunbound libunwind
-  if:   matrix.cfg.arch != 'x86'
-  run:  sudo apt install -y libunbound-dev libunwind-dev
-
-- name: install 32-bit dependencies
-  if:   matrix.cfg.arch == 'x86'
-  run:  sudo apt install -y ${{ env.m32_dependecies }}
-
-- name: update PATH
-  run:  |
-echo "$HOME/bin">> $GITHUB_PATH
-echo "$HOME/.local/bin" >> $GITHUB_PATH
-
-- name: set up python
-  uses: actions/setup-python@v4
-  with:
-python-version: '3.x'
-
-- name: prepare
-  run:  ./.ci/linux-prepare.sh
-
 - name: build
-  run:  ./.ci/linux-build.sh
-
-- name: copy logs on failure
-  if: failure() || cancelled()
-  run: |
-# upload-artifact@v3 throws exceptions if it tries to upload socket
-# files and we could have some socket files in testsuite.dir.
-# Also, upload-artifact@v3 doesn't work well enough with wildcards.
-# So, we're just archiving everything here to avoid any issues.
-mkdir logs
-cp config.log ./logs/
-cp -r ./*/_build/sub/tests/testsuite.* ./logs/ || true
-# System tests are run as root, need to adjust permissions.
-sudo chmod -R +r ./tests/system-kmod-testsuite.* || true
-cp -r ./tests/system-kmod-testsuite.* ./logs/ || true
-tar -czvf logs.tgz logs/
+  run: sudo -E ./.ci/ci.sh --archive-logs
 
 - name: upload logs on failure
   if: failure() || cancelled()
diff --git a/Makefile.am b/Makefile.am
index 4be2d48d4..2c4c7eeef 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -86,8 +86,8 @@ EXTRA_DIST = \
README.rst \
NOTICE \
.cirrus.yml \
+   .ci/ci.sh \
.ci/linux-build.sh \
-   .ci/linux-prepare.sh \
.ci/osx-build.sh \
.ci/osx-prepare.sh \
.ci/ovn-kubernetes/Dockerfile \
diff --git a/utilities/automake.mk b/utilities/automake.mk
index bb261439d..c44563c26 100644
--- a/utilities/automake.mk
+++ b/utilities/automake.mk
@@ -37,7 +37,6 @@ EXTRA_DIST += \
 utilities/ovn_detrace.py.in \
 utilities/ovndb-servers.ocf \
 utilities/checkpatch.py \
-utilities/containers/ci.sh \
 utilities/containers/Makefile \
  

[ovs-dev] [PATCH ovn 1/3] ci: Conditionally install gcc-multilib

2023-04-17 Thread Ales Musil
The gcc-multilib is needed for x86 build on
x86_64, but the package is not available for
arm64. We need to conditonally install only
when we run the x86 job.

Signed-off-by: Ales Musil 
---
 .ci/linux-build.sh | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 7dfb5c317..542bda3ba 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -34,6 +34,12 @@ function configure_gcc()
 # Adding m32 flag directly to CC to avoid any possible issues
 # with API/ABI difference on 'configure' and 'make' stages.
 export CC="$CC -m32"
+if which apt; then
+# We should install gcc-multilib for x86 build, we cannot
+# do it directly because gcc-multilib is not available
+# for arm64
+sudo apt install -y gcc-multilib
+fi
 fi
 }
 
@@ -81,6 +87,7 @@ function execute_system_tests()
   fi
 }
 
+
 configure_$CC
 
 if [ "$TESTSUITE" ]; then
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Expose distributed gateway port information in NB DB

2023-04-17 Thread Lucas Martins
Thanks all for the discussion and all the ideas here.

After reading the emails, I think it boils down to two proposed approaches:

1) CMS to continue to connect to the Southbound database if they need
information about the physical location of the resources. That would
avoid the inefficiency of having to copy data back-and-forth from the
Northbound and Southbound database.

I guess the downside of this approach is that CMS will have to
maintain a connection with both databases (which is already the case
today).

If we go with this approach, it would be good to have consensus from
the core OVN team where some tables in the Southbound must be kept
stable with backward compatibility in case of changes. Tables such
Chassis, Chassis_Private and Port_Binding at least will need that. I
guess that makes part of the Southbound database to not be considered
OVN internal only.

2) The second approach is what Mark has described below, having a
separated database for the runtime information.

One thing I like about this approach is that it keep the role and
permissions of each database quite clear, as described:

NB DB: CMS writes, OVN reads
SB DB: OVN-internal
Status DB: CMS reads, OVN writes

It also seems to enable a lot of room for extra information about the
runtime to be added in the future (as mentioned in Mark's email, LB
health status, LSP packet counts etc...).

The con is that it's yet another database that needs to be deployed
and maintained and another connection for CMS to maintain if they need
that sort of information.

...

Is there any preference of which approach the core OVN team would prefer here ?

Cheers,
Lucas


On Thu, Apr 13, 2023 at 6:11 PM Mark Michelson  wrote:
>
> Hi all,
>
> I just caught up on this discussion and wanted to complicate things
> further by suggesting another idea. I think the Red Hat folks have heard
> this before, but I'm not sure if it has been brought up on this list before.
>
> Aside from this issue, there is also this high-priority issue from Red
> Hat Openstack: https://bugzilla.redhat.com/show_bug.cgi?id=2123176 .
>
> IMO, this all converges on the idea of introducing a third database to
> OVN. We can refer to this as the "Status" DB.
>
> The Status DB would be a place for state information generated by
> OVN/OVS to be stored. Some ideas for existing things that could go in
> the Status DB would be:
> * Logical port up/down state.
> * Logical switch port dynamic addresses (maybe, this is more complicated)
> * BFD status
> * Logical port installation status and installation timestamp.
>
> In addition to these existing items, the Status DB would be a place for
> additional items that do not exist yet, such as
> * Load balancer health check status
> * Logical port packet/byte counts
> * Gateway port bound chassis
>
> With the implementation of the Status DB, it would cement a relationship
> between the DBs as such:
>
> NB DB: CMS writes, OVN reads
> SB DB: OVN-internal
> Status DB: CMS reads, OVN writes
>
> It may be tempting to get this patch merged as-is, with the intention of
> migrating this to the new DB once it gets implemented. I don't think
> this is a good idea. Between this issue and the one I linked, I think
> the implementation of a Status DB is a good idea, and one that should be
> implemented very soon.
>
> Since this particular problem is already worked around by OpenStack, I
> think it makes more sense to implement this feature in a way that will
> be easier to maintain long-term than to get it in quickly. If we merge
> this as-is, then we are on the hook for supporting this status in the NB
> DB for quite a long time since we would need to take time to deprecate
> it properly. If we instead treat this as the impetus to write the Status
> DB, then I think this lightweight use-case would give us a good starting
> point towards adding the other items we're interested in.
>
> What do you think?
>
> On 4/13/23 09:32, Lucas Martins wrote:
> > Hi Han, Dumitru and Luis,
> >
> > Thanks for the discussion and ideas so far. My reply is inline:
> >
> > On Thu, Apr 13, 2023 at 10:45 AM Luis Tomas Bolivar  
> > wrote:
> >>
> >>
> >>
> >> On Thu, Apr 13, 2023 at 9:33 AM Dumitru Ceara  wrote:
> >>>
> >>> On 4/12/23 23:07, Han Zhou wrote:
>  On Wed, Apr 12, 2023 at 8:01 AM  wrote:
> >
> > From: Lucas Alvares Gomes 
> >
> > In order for the CMS to know which Chassis a distributed gateway port
> > is bond to, this patch updates the ovn-northd daemon to populate the
> > Logical_Router_Port table with that information.
> >
> > To avoid changing the database schema, ovn-northd is setting a new key
> > called "hosting-chassis" in the options column from the LRP table. This
> > key value points to the name of the Chassis that is currently hosting
> > the distributed port.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2107515
> > Signed-off-by: Lucas Alvares Gomes 
> >>>
> >>> Hi, Lucas, Han,
> >>>

[ovs-dev] [PATCH v10] netdev-dpdk: add control plane protection support

2023-04-17 Thread Robin Jarry
Some control protocols are used to maintain link status between
forwarding engines (e.g. LACP). When the system is not sized properly,
the PMD threads may not be able to process all incoming traffic from the
configured Rx queues. When a signaling packet of such protocols is
dropped, it can cause link flapping, worsening the situation.

Use the RTE flow API to redirect these protocols into a dedicated Rx
queue. The assumption is made that the ratio between control protocol
traffic and user data traffic is very low and thus this dedicated Rx
queue will never get full. The RSS redirection table is re-programmed to
only use the other Rx queues. The RSS table size is stored in the
netdev_dpdk structure at port initialization to avoid requesting the
information again when changing the port configuration.

The additional Rx queue will be assigned a PMD core like any other Rx
queue. Polling that extra queue may introduce increased latency and
a slight performance penalty at the benefit of preventing link flapping.

This feature must be enabled per port on specific protocols via the
cp-protection option. This option takes a coma-separated list of
protocol names. It is only supported on ethernet ports. This feature is
experimental.

If the user has already configured multiple Rx queues on the port, an
additional one will be allocated for control plane packets. If the
hardware cannot satisfy the requested number of requested Rx queues, the
last Rx queue will be assigned for control plane. If only one Rx queue
is available, the cp-protection feature will be disabled. If the
hardware does not support the RTE flow matchers/actions, the feature
will be disabled.

It cannot be enabled when other_config:hw-offload=true as it may
conflict with the offloaded RTE flows. Similarly, if hw-offload is
enabled, cp-protection will be forcibly disabled on all ports.

Example use:

 ovs-vsctl add-bond br-phy bond0 phy0 phy1 -- \
   set interface phy0 type=dpdk options:dpdk-devargs=:ca:00.0 -- \
   set interface phy0 options:cp-protection=lacp -- \
   set interface phy1 type=dpdk options:dpdk-devargs=:ca:00.1 -- \
   set interface phy1 options:cp-protection=lacp

As a starting point, only one protocol is supported: LACP. Other
protocols can be added in the future. NIC compatibility should be
checked.

To validate that this works as intended, I used a traffic generator to
generate random traffic slightly above the machine capacity at line rate
on a two ports bond interface. OVS is configured to receive traffic on
two VLANs and pop/push them in a br-int bridge based on tags set on
patch ports.

   +--+
   | DUT  |
   |++|
   ||   br-int   || 
in_port=patch10,actions=mod_dl_src:$patch11,mod_dl_dst:$tgen1,output:patch11
   |||| 
in_port=patch11,actions=mod_dl_src:$patch10,mod_dl_dst:$tgen0,output:patch10
   || patch10patch11 ||
   |+---|---|+|
   ||   | |
   |+---|---|+|
   || patch00patch01 ||
   ||  tag:10tag:20  ||
   ||||
   ||   br-phy   || default flow, action=NORMAL
   ||||
   ||   bond0|| balance-slb, lacp=passive, lacp-time=fast
   ||phy0   phy1 ||
   |+--|-|---+|
   +---|-|+
   | |
   +---|-|+
   | port0  port1 | balance L3/L4, lacp=active, lacp-time=fast
   | lag  | mode trunk VLANs 10, 20
   |  |
   |switch|
   |  |
   |  vlan 10vlan 20  |  mode access
   |   port2  port3   |
   +-|--|-+
 |  |
   +-|--|-+
   |   tgen0  tgen1   |  Random traffic that is properly balanced
   |  |  across the bond ports in both directions.
   |  traffic generator   |
   +--+

Without cp-protection, the bond0 links are randomly switching to
"defaulted" when one of the LACP packets sent by the switch is dropped
because the RX queues are full and the PMD threads did not process them
fast enough. When that happens, all traffic must go through a single
link which causes above line rate traffic to be dropped.

 ~# ovs-appctl lacp/show-stats bond0
  bond0 statistics 
 member: phy0:
   TX PDUs: 347246
   RX PDUs: 14865
   RX Bad PDUs: 0
   RX Marker Request PDUs: 0
   Link Expired: 168
   Link Defaulted: 0
   Carrier Status Changed: 0
 member: phy1:
   TX PDUs: 347245
   RX PDUs: 14919
   RX Bad PDUs: 0
   RX Marker Request PDUs: 0
   Link Expired: 147
   Link Defaulted: 1
   Carrier Status Changed: 0

When cp-protection is enabled, no LACP packet is dropped and the bond
links remain enabled at all times, maximizing the throughput. Neither
the "Link Expired" nor the "Link Defaulted" counters are incremented
anymore.

This feature may be considered as "QoS". However, it does not work by
limiting the rate of 

Re: [ovs-dev] [PATCH v4] util: fix an issue that thread name cannot be set

2023-04-17 Thread Eelco Chaudron



On 17 Apr 2023, at 8:54, Songtao Zhan wrote:

> To: d...@openvswitch.org,
> i.maxim...@ovn.org
>
> The name of the current thread consists of a name with a maximum
> length of 16 bytes and a thread ID. The final name may be longer
> than 16 bytes. If the name is longer than 16 bytes, the thread
> name will fail to be set

Some inline comments below.

Cheers,

Eelco

> Signed-off-by: Songtao Zhan 
> ---
>  lib/util.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/lib/util.c b/lib/util.c
> index 96a71550d..60a1e56a2 100644
> --- a/lib/util.c
> +++ b/lib/util.c
> @@ -645,6 +645,13 @@ set_subprogram_name(const char *subprogram_name)
>  free(subprogram_name_set(pname));
>
>  #if HAVE_GLIBC_PTHREAD_SETNAME_NP
> +/* The maximum thead name including '\0' supported is 16.
> + * add '>' at 0th position to highlight that the name was truncated. */
> +if (strlen(pname) > 15) {
> +memcpy(pname, [strlen(pname) - 15], 15);

I think you need to use memmove() as srd/dst can overlap.

> +pname[15] = '\0';

If you make the memmove + 1, i.e. memmove(pname, [strlen(pname) - 15], 15 
+ 1);, you can skip the above line.

> +pname[0] = '>';
> +}
>  pthread_setname_np(pthread_self(), pname);
>  #elif HAVE_NETBSD_PTHREAD_SETNAME_NP
>  pthread_setname_np(pthread_self(), "%s", pname);
> -- 
> 2.39.1
>
>
>
> zhan...@chinatelecom.cn
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVN LTS/non-LTS bug fix backport strategy (was: Re: [PATCH ovn 2/2] northd.c: Avoid sending ICMP time exceeded for multicast packets.)

2023-04-17 Thread Dumitru Ceara
On 4/17/23 11:55, Dumitru Ceara wrote:
> On 4/14/23 18:01, Han Zhou wrote:
>>
>>
>> On Fri, Apr 14, 2023 at 2:16 AM Dumitru Ceara > > wrote:
>>>
>>> On 4/14/23 04:15, Han Zhou wrote:
 On Thu, Apr 13, 2023 at 12:44 PM Han Zhou > > wrote:
>
>
>
> On Thu, Apr 13, 2023 at 12:35 PM Dumitru Ceara > > wrote:
>>
>> On 4/13/23 21:21, Han Zhou wrote:
>>> On Thu, Apr 13, 2023 at 8:37 AM Dumitru Ceara > >
 wrote:

 On 4/13/23 17:34, Han Zhou wrote:
> On Thu, Apr 13, 2023 at 12:54 AM Dumitru Ceara
>> mailto:dce...@redhat.com>>
>>> wrote:
>>
>> On 4/13/23 07:07, Han Zhou wrote:
>>> In RFC1812 section 5.3.1, it is mentioned that:
>>>
>>>    If the TTL is reduced to zero (or less), the packet MUST be
>>>    discarded, and if the destination is not a multicast address
 the
>>>    router MUST send an ICMP Time Exceeded message ...
>>>
>>
>> The code itself looks OK but I wonder a bit about the rationale.
 Do
>>> you
>> have an example in which OVN replies with Time Exceeded for
 multicast
>> destinations and that causes issues?
>>
>>> So if the destionation is a multicast address the route shouldn't
 send
>>> ICMP Time Exceeded, but the current OVN implementation didn't
 check
>>> multicast and tries to send ICMP regardless. This patch fixes it.
>>
>> The statement "if destination is not a multicast address the
>> router
>>> MUST
>> send an ICMP Time Exceeded message" implies that "if destination
 is a
>> multicast address the router MAY or MAY NOT send an ICMP Time
 Exceeded
>> message".  So the fact that OVN sends one is not necessarily
>> wrong.
>
> For my limited understanding of multicast, sending ICMP time
 exceeded is
> not a good idea. In multicast TTL has special meanings, for
>> example:
>     TTL 0: Restricted to the same host, not transmitted by the
 router.
>     TTL 1: Restricted to the same subnet, not forwarded by the
 router.
> If we send ICMP for such packets, it means for a very common use
 case of
> multicast (ttl = 1, same subnet), we will end up sending ICMP for
 every
> normal packet.
> In production we saw this happening with a rate higher than 10k
>> pps!
>

 Makes sense, thanks for the details!

> So I believe this is the reason behind the "if" statement in the
 RFC.
>>> Maybe
> I should add this rationale in the comment, too.
>

 If you could add some of the details above to the commit log too
 then:

 Acked-by: Dumitru Ceara > >
>>>
>>> Thanks Dumitru. I applied the series to main.
>>> Han
>>
>> This sounds like a relevant bug fix.  Should this get backported to
>> stable branches too?
>>
> Sound good. I will do that.

 I backported this series to the last release branch-23.03 and the LTS
 branch-22.03.
 The code base has changed a lot so was mostly manually backported
>> instead
 of cherry-pick.

>>>
>>> Thanks for that but I think we shouldn't skip stable branches in
>>> between.  We likely still have users on those that won't upgrade to the
>>> latest 23.03 but instead to a new z-stream release (when that happens)
>>> of the stable branch.
>>>
>>> I cherry picked your backports and made sure the tests pass and then
>>> pushed them to branches 22.12, 22.09 and 22.06.
>>
>> Thanks Dumitru for helping backporting. I was uncertain about whether we
>> should continue backporting without skipping branches, as our growing
>> number of released branches makes this approach increasingly difficult
>> to sustain.
>>
>> The release process document
>> (https://docs.ovn.org/en/latest/internals/release-process.html
>> ) doesn't
>> specify how long we should maintain non-LTS branches. My understanding
>> was that we should primarily maintain LTS and the latest releases for
>> regular bug fixes, while critical/security fixes would be applied to all
>> branches. It appears we have different interpretations of this process.
>> While it's beneficial to backport to all branches, I thought this was
>> more of a convenience when there weren't too many branches to manage. In
>> the worst case, a patch may require manual backporting to each branch if
>> new features make the codebase in each branch unique. I also question
>> what differentiates an LTS branch if we're backporting to all branches
>> anyway. My assumption was that users opting for a non-LTS branch should
>> be prepared to upgrade to newer releases, given the 

[ovs-dev] OVN LTS/non-LTS bug fix backport strategy (was: Re: [PATCH ovn 2/2] northd.c: Avoid sending ICMP time exceeded for multicast packets.)

2023-04-17 Thread Dumitru Ceara
On 4/14/23 18:01, Han Zhou wrote:
> 
> 
> On Fri, Apr 14, 2023 at 2:16 AM Dumitru Ceara  > wrote:
>>
>> On 4/14/23 04:15, Han Zhou wrote:
>> > On Thu, Apr 13, 2023 at 12:44 PM Han Zhou  > wrote:
>> >>
>> >>
>> >>
>> >> On Thu, Apr 13, 2023 at 12:35 PM Dumitru Ceara  > wrote:
>> >>>
>> >>> On 4/13/23 21:21, Han Zhou wrote:
>>  On Thu, Apr 13, 2023 at 8:37 AM Dumitru Ceara  >
>> > wrote:
>> >
>> > On 4/13/23 17:34, Han Zhou wrote:
>> >> On Thu, Apr 13, 2023 at 12:54 AM Dumitru Ceara
> mailto:dce...@redhat.com>>
>>  wrote:
>> >>>
>> >>> On 4/13/23 07:07, Han Zhou wrote:
>>  In RFC1812 section 5.3.1, it is mentioned that:
>> 
>>     If the TTL is reduced to zero (or less), the packet MUST be
>>     discarded, and if the destination is not a multicast address
>> > the
>>     router MUST send an ICMP Time Exceeded message ...
>> 
>> >>>
>> >>> The code itself looks OK but I wonder a bit about the rationale.
>> > Do
>>  you
>> >>> have an example in which OVN replies with Time Exceeded for
>> > multicast
>> >>> destinations and that causes issues?
>> >>>
>>  So if the destionation is a multicast address the route shouldn't
>> > send
>>  ICMP Time Exceeded, but the current OVN implementation didn't
>> > check
>>  multicast and tries to send ICMP regardless. This patch fixes it.
>> >>>
>> >>> The statement "if destination is not a multicast address the
> router
>>  MUST
>> >>> send an ICMP Time Exceeded message" implies that "if destination
>> > is a
>> >>> multicast address the router MAY or MAY NOT send an ICMP Time
>> > Exceeded
>> >>> message".  So the fact that OVN sends one is not necessarily
> wrong.
>> >>
>> >> For my limited understanding of multicast, sending ICMP time
>> > exceeded is
>> >> not a good idea. In multicast TTL has special meanings, for
> example:
>> >>     TTL 0: Restricted to the same host, not transmitted by the
>> > router.
>> >>     TTL 1: Restricted to the same subnet, not forwarded by the
>> > router.
>> >> If we send ICMP for such packets, it means for a very common use
>> > case of
>> >> multicast (ttl = 1, same subnet), we will end up sending ICMP for
>> > every
>> >> normal packet.
>> >> In production we saw this happening with a rate higher than 10k
> pps!
>> >>
>> >
>> > Makes sense, thanks for the details!
>> >
>> >> So I believe this is the reason behind the "if" statement in the
>> > RFC.
>>  Maybe
>> >> I should add this rationale in the comment, too.
>> >>
>> >
>> > If you could add some of the details above to the commit log too
>> > then:
>> >
>> > Acked-by: Dumitru Ceara  >
>> 
>>  Thanks Dumitru. I applied the series to main.
>>  Han
>> >>>
>> >>> This sounds like a relevant bug fix.  Should this get backported to
>> >>> stable branches too?
>> >>>
>> >> Sound good. I will do that.
>> >
>> > I backported this series to the last release branch-23.03 and the LTS
>> > branch-22.03.
>> > The code base has changed a lot so was mostly manually backported
> instead
>> > of cherry-pick.
>> >
>>
>> Thanks for that but I think we shouldn't skip stable branches in
>> between.  We likely still have users on those that won't upgrade to the
>> latest 23.03 but instead to a new z-stream release (when that happens)
>> of the stable branch.
>>
>> I cherry picked your backports and made sure the tests pass and then
>> pushed them to branches 22.12, 22.09 and 22.06.
> 
> Thanks Dumitru for helping backporting. I was uncertain about whether we
> should continue backporting without skipping branches, as our growing
> number of released branches makes this approach increasingly difficult
> to sustain.
> 
> The release process document
> (https://docs.ovn.org/en/latest/internals/release-process.html
> ) doesn't
> specify how long we should maintain non-LTS branches. My understanding
> was that we should primarily maintain LTS and the latest releases for
> regular bug fixes, while critical/security fixes would be applied to all
> branches. It appears we have different interpretations of this process.
> While it's beneficial to backport to all branches, I thought this was
> more of a convenience when there weren't too many branches to manage. In
> the worst case, a patch may require manual backporting to each branch if
> new features make the codebase in each branch unique. I also question
> what differentiates an LTS branch if we're backporting to all branches
> anyway. My assumption was that users opting for a non-LTS branch should
> be prepared to upgrade to newer releases, given the "short-term"
> maintenance of these branches. Is my understanding incorrect?
> 
Hi 

[ovs-dev] [Patch ovn] pinctrl: fix restart of controller when bfd min_tx set to 1

2023-04-17 Thread wangchuanlei
when create bfd entry, set min_tx to 1, (tx_timeout * 25)/100 is
zero, which cause controller to restart!

Signed-off-by: wangchuanlei 
---
 controller/pinctrl.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 795847729..7817da6e0 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -7190,7 +7190,6 @@ bfd_monitor_send_msg(struct rconn *swconn, long long int 
*bfd_time)
 pinctrl_send_bfd_tx_msg(swconn, entry, false);
 
 tx_timeout = MAX(entry->local_min_tx, entry->remote_min_rx);
-tx_timeout -= random_range((tx_timeout * 25) / 100);
 entry->next_tx = cur_time + tx_timeout;
 next:
 if (*bfd_time > entry->next_tx) {
-- 
2.27.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 0/3] selftests: openvswitch: add support for testing upcall interface

2023-04-17 Thread patchwork-bot+netdevbpf
Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller :

On Fri, 14 Apr 2023 09:17:47 -0400 you wrote:
> The existing selftest suite for openvswitch will work for regression
> testing the datapath feature bits, but won't test things like adding
> interfaces, or the upcall interface.  Here, we add some additional
> test facilities.
> 
> First, extend the ovs-dpctl.py python module to support the OVS_FLOW
> and OVS_PACKET netlink families, with some associated messages.  These
> can be extended over time, but the initial support is for more well
> known cases (output, userspace, and CT).
> 
> [...]

Here is the summary with links:
  - [net-next,1/3] selftests: openvswitch: add interface support
https://git.kernel.org/netdev/net-next/c/74cc26f416b9
  - [net-next,2/3] selftests: openvswitch: add flow dump support
https://git.kernel.org/netdev/net-next/c/e52b07aa1a54
  - [net-next,3/3] selftests: openvswitch: add support for upcall testing
https://git.kernel.org/netdev/net-next/c/9feac87b673c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4] util: fix an issue that thread name cannot be set

2023-04-17 Thread Songtao Zhan


To: d...@openvswitch.org,
i.maxim...@ovn.org

The name of the current thread consists of a name with a maximum
length of 16 bytes and a thread ID. The final name may be longer
than 16 bytes. If the name is longer than 16 bytes, the thread
name will fail to be set

Signed-off-by: Songtao Zhan 
---
 lib/util.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/lib/util.c b/lib/util.c
index 96a71550d..60a1e56a2 100644
--- a/lib/util.c
+++ b/lib/util.c
@@ -645,6 +645,13 @@ set_subprogram_name(const char *subprogram_name)
 free(subprogram_name_set(pname));

 #if HAVE_GLIBC_PTHREAD_SETNAME_NP
+/* The maximum thead name including '\0' supported is 16.
+ * add '>' at 0th position to highlight that the name was truncated. */
+if (strlen(pname) > 15) {
+memcpy(pname, [strlen(pname) - 15], 15);
+pname[15] = '\0';
+pname[0] = '>';
+}
 pthread_setname_np(pthread_self(), pname);
 #elif HAVE_NETBSD_PTHREAD_SETNAME_NP
 pthread_setname_np(pthread_self(), "%s", pname);
-- 
2.39.1



zhan...@chinatelecom.cn
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev