Re: [ovs-discuss] disable OpenFlow inactivity probing in v22.03

2024-07-12 Thread Vladislav Odintsov via discuss
Hi Krzysztof,

talking about OVN < 23.09:

1. For the main OF connection, which is used to configure OF flows, probing is 
disabled by default. Just ensure that 
external_ids:ovn-openflow-probe-interval is not configured. Disconnects there are 
harmful for ha-chassis-group functioning (false-positive failover events), 
which leads to flow recomputation and traffic interruptions.
2. There are still two OF connections, for packet-ins and feature probing, which 
have probing hardcoded to a constant interval of 5 seconds. Disconnects 
on them are not harmful (they just waste CPU time on connection re-establishment).
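
For example, something like this should show whether the knob is set and remove it 
if needed (a sketch, assuming the usual Open_vSwitch external_ids layout):

  ovs-vsctl get Open_vSwitch . external_ids:ovn-openflow-probe-interval
  # fails with "no key ..." when the knob is not configured, which is what you want here
  ovs-vsctl remove Open_vSwitch . external_ids ovn-openflow-probe-interval
  # drops the knob so the main OF connection keeps probing disabled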

For all OVN versions: the mentioned patch and settings disable probing only in 
the OVN -> OVS direction. The reverse direction still uses the default probing 
configured on the OVS side. OVN doesn’t configure OVS OF probing yet, so you 
should consider it on your own. For more details, see thread [1].

So, I guess the backport is possible but not really necessary (if you have 
other inputs, please share them).

The discussion about default OVS bridge probe configuration stopped here [2], 
but we can continue from that point.

@Ilya, what do you think about disabling OF probes by default for unix domain 
sockets in OVS? If you’re okay with it, I can work on it and submit a patch.

1: https://mail.openvswitch.org/pipermail/ovs-dev/2023-June/405296.html
2: https://mail.openvswitch.org/pipermail/ovs-dev/2023-September/408070.html

regards,
Vladislav Odintsov

> On 11 Jul 2024, at 17:41, Krzysztof Tomaszewski via discuss 
>  wrote:
> 
> 
> Hello,
> 
> In newer releases of OVN, OpenFlow inactivity probing is already disabled
> (discussion [1], commit [2]), but the problem still exists in v22.03.
> 
> IMHO the problem is quite impactful, as it causes many reconnections after 5s of 
> inactivity,
> even on medium-scale deployments, which leads to OVN system instability.
> 
> Worth mentioning is that the v22.03 release is still the main OVN package version in 
> many LTS distributions,
> like Ubuntu 22.04.
> 
> Are there any plans to include those changes (disabling inactivity probing for 
> OpenFlow) in the OVN v22.03 release?
> 
> 
> 1: https://mail.openvswitch.org/pipermail/ovs-dev/2023-May/404625.html
> 2: 
> https://github.com/ovn-org/ovn/commit/c16e5da803838fa66129eb61d7930fc84d237f85
> 
> Regards,
> 
> Krzysztof Tomaszewski
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Incorrect max_tunid

2024-06-24 Thread Vladislav Odintsov via discuss
Hi,

There is a patchset under review [1] for this functionality.
You can try it out locally or wait for it to be accepted upstream.

1: https://patchwork.ozlabs.org/project/ovn/list/?series=410010

regards,

Vladislav Odintsov

-Original Message-
From: discuss  on behalf of ab via discuss 

Reply to: ab 
Date: Monday, 24 June 2024 at 17:46
To: "ovs-discuss@openvswitch.org" 
Subject: [ovs-discuss] Incorrect max_tunid

Hello

In continuation of the thread 
https://www.mail-archive.com/ovs-discuss@openvswitch.org/msg08817.html
we want to make a patch so that OVN returns the correct 
max_tunid value. We are trying to add a new value "ovn-vxlan-for-hw-vtep-only" 
in the chassis->other_config section. If this value is true, then the 
max_tunid value will be 16 million; if false, max_tunid=4095. The 
condition will be met if ovn-vxlan-for-hw-vtep-only=true is set on 
any of the chassis. Please tell us, would such a patch be correct? Can we 
add such a feature this way?
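
For illustration only: if such a patch were accepted, the proposed knob would 
presumably be set per chassis roughly like this (a sketch; the option name comes 
from the proposal above and is not part of any released OVN):

  ovs-vsctl set Open_vSwitch . external_ids:ovn-vxlan-for-hw-vtep-only=true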



___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Clustered ovsdb-server 3.1 to 3.2+ upgrade question

2024-04-25 Thread Vladislav Odintsov via discuss
Thank you for your answers!

> On 25 Apr 2024, at 13:01, Ilya Maximets  wrote:
> 
> On 4/25/24 11:51, Vladislav Odintsov wrote:
>> 
>> 
>>> On 25 Apr 2024, at 12:20, Ilya Maximets  wrote:
>>> 
>>> On 4/25/24 10:53, Vladislav Odintsov wrote:
 Hi Ilya,
 
 I’ve got a question regarding the upgrade of clustered ovsdb-servers from 3.1 
 to 3.2+.
 ovsdb(7) states that the recommended upgrade path is to upgrade 
 ovsdb-servers one-by-one,
 that after the upgrade ovsdb-server should be started with the option 
 --disable-file-no-data-conversion,
 and that after the whole cluster is upgraded, no-data 
 conversion needs to be re-enabled via appctl.
 
 I’ve run through the code and did some upgrade tests, so my question is:
 Do I understand correctly that, if there is no need to perform a schema 
 conversion after the start
 and before the end of the cluster upgrade, it is allowed to just restart the 
 ovsdb-servers without
 the --disable-file-no-data-conversion option, and this will not have any 
 downsides?
>>> 
>>> Simply re-starting without the option is enough.  There is no need
>>> to enable it specifically via appctl in this case.
>> 
>> I’m talking about a slightly different thing:
>> I want not to start each upgraded ovsdb-server with 
>> --disable-file-no-data-conversion at all,
>> because it is guaranteed in my case that there will be no schema conversions 
>> before the full cluster
>> upgrade is finished.
>> So, in this case there will be no need to enable it back via appctl or 
>> remove the "disable"
>> option and restart.
>> 
>> Am I right, or am I missing something?
> 
> Ah, sorry.  If you're sure that there will be no conversion
> before all the servers are upgraded, then it should be fine
> to just upgrade as usual.
> 
>> 
>>> 
>>> Also, we did actually backport parts of the format change to 3.1.
>>> It should be in v3.1.2 release and newer.  So, technically, if you're
>>> performing upgrade from v3.1.2+, it should be safe to just upgrade as
>>> usual.  ovsdb-server v3.1.2 understands the new database format, though it
>>> can't produce it on its own.  See the following commit on branch-3.1:
>>> 
>>>  9529e9aa967c ("ovsdb: Allow conversion records with no data in a clustered 
>>> storage.")
>> 
>> Hmm, nice. But my source version is 3.0, so this is not applicable :)
> 
> The same change should be available in v3.0.5.
> 
>> 
>>> 
>>> It would be painful to document all the combinations of minor versions,
>>> so it's not an officially supported upgrade path, but it is there if
>>> you know what you are doing.
>> 
>> Sure.
>> 
>>> 
>>> Best regards, Ilya Maximets.
>> 
>> 
>> Regards,
>> Vladislav Odintsov


Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Clustered ovsdb-server 3.1 to 3.2+ upgrade question

2024-04-25 Thread Vladislav Odintsov via discuss


> On 25 Apr 2024, at 12:20, Ilya Maximets  wrote:
> 
> On 4/25/24 10:53, Vladislav Odintsov wrote:
>> Hi Ilya,
>> 
>> I’ve got a question regarding the upgrade of clustered ovsdb-servers from 3.1 
>> to 3.2+.
>> ovsdb(7) states that the recommended upgrade path is to upgrade 
>> ovsdb-servers one-by-one,
>> that after the upgrade ovsdb-server should be started with the option 
>> --disable-file-no-data-conversion,
>> and that after the whole cluster is upgraded, no-data 
>> conversion needs to be re-enabled via appctl.
>> 
>> I’ve run through the code and did some upgrade tests, so my question is:
>> Do I understand correctly that, if there is no need to perform a schema 
>> conversion after the start
>> and before the end of the cluster upgrade, it is allowed to just restart the 
>> ovsdb-servers without
>> the --disable-file-no-data-conversion option, and this will not have any 
>> downsides?
> 
> Simply re-starting without the option is enough.  There is no need
> to enable it specifically via appctl in this case.

I’m talking about a slightly different thing:
I want not to start each upgraded ovsdb-server with 
--disable-file-no-data-conversion at all, because it is guaranteed in my case that 
there will be no schema conversions before the full cluster upgrade is finished.
So, in this case there will be no need to enable it back via appctl or remove 
the "disable" option and restart.

Am I right, or am I missing something?

> 
> Also, we did actually backport parts of the format change to 3.1.
> It should be in v3.1.2 release and newer.  So, technically, if you're
> performing upgrade from v3.1.2+, it should be safe to just upgrade as
> usual.  ovsdb-server v3.1.2 understands the new database format, though it
> can't produce it on its own.  See the following commit on branch-3.1:
> 
>  9529e9aa967c ("ovsdb: Allow conversion records with no data in a clustered 
> storage.")

Hmm, nice. But my source version is 3.0, so this is not applicable :)

> 
> It would be painful to document all the combinations of minor versions,
> so it's not an officially supported upgrade path, but it is there if
> you know what you are doing.

Sure.

> 
> Best regards, Ilya Maximets.


Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] Clustered ovsdb-server 3.1 to 3.2+ upgrade question

2024-04-25 Thread Vladislav Odintsov via discuss
Hi Ilya,

I’ve got a question regarding the upgrade of clustered ovsdb-servers from 3.1 to 
3.2+.
ovsdb(7) states that the recommended upgrade path is to upgrade 
ovsdb-servers one-by-one, that after the upgrade ovsdb-server should be started 
with the option --disable-file-no-data-conversion, and that after the whole 
cluster is upgraded, no-data conversion needs to be re-enabled via appctl.

I’ve run through the code and did some upgrade tests, so my question is:
Do I understand correctly that, if there is no need to perform a schema 
conversion after the start and before the end of the cluster upgrade, it is 
allowed to just restart the ovsdb-servers without the 
--disable-file-no-data-conversion option, and this will not have any downsides?
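
For context, the documented flow looks roughly like the sketch below (only the 
command-line flag is taken from ovsdb(7); other options and the exact appctl knob 
are omitted here):

  # on each cluster member, one by one, while the cluster runs mixed versions:
  ovsdb-server --disable-file-no-data-conversion [usual options and DB file]
  # once every member runs the new version, re-enable no-data conversion via the
  # corresponding ovs-appctl command described in ovsdb(7)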

Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Huge Logical_DP_Groups; NorthD 100% CPU

2024-03-28 Thread Vladislav Odintsov via discuss
Sorry, it first appeared in 23.06.

> On 28 Mar 2024, at 17:13, Vladislav Odintsov  wrote:
> 
> Hi George,
> 
> do you experience these problems with the 22.03 version?
> If yes, you may want to try the patch by Ilya [1], which is a part of the 23.09 release.
> In short, there was a problem where a new datapath in OVN caused a full 
> re-creation of datapath groups, which resulted in a huge ovsdb transaction.
> 
> 1: 
> https://github.com/ovn-org/ovn/commit/2a57b204459612a9a4edb49b9c5ebc44e101ee93
> 
>> On 28 Mar 2024, at 16:24, Шагов Георгий via discuss 
>>  wrote:
>> 
>> Hello everyone
>>  
>> In our setup (*) we have encountered a problem with NorthD consuming 100% CPU.
>> Looking into the transaction log of the SBDB, it was found that some transactions 
>> from NorthD take almost 2 MB in JSON; these are not exceptions, but rather regular.
>> Looking deeper into a transaction, we encountered a huge 
>> Logical_DP_Group record.
>> In more detail: a 2 MB JSON SBDB transaction record was pretty-printed by json_pp to 
>> a JSON file that takes 157K lines, of which 152K lines are Logical_DP_Group.
>> That Logical_DP_Group record contains 10 groups, with approx. 4K datapaths per 
>> group.
>>  
>> Googling for a similar case, I have found one that looked close to ours: 
>> (link1) – this is not our case – we checked that.
>> Unfortunately, the basic information regarding Logical DP Groups is only 
>> documented in the code of NorthD.
>>  
>> Any clue/help/questions are extremely appreciated
>>  
>> (*) Setup:
>> - OVN 22.03.3 – we are in the process of migrating to 24.03
>> - OVS 2.17.7 – likewise migrating to 3.3
>> - Routers – approx. 5K
>>  
>> (link1): 
>> https://github.com/ovn-org/ovn/commit/db4cb7098c890e974175d4d05dd70dc409fad91e
>>  
>> Yours truly, George
>>  
>> CONFIDENTIALITY NOTICE: This email and any files attached to it are 
>> confidential. If you are not the intended recipient you are notified that 
>> using, copying, distributing or taking any action in reliance on the 
>> contents of this information is strictly prohibited. If you have received 
>> this email in error please notify the sender and delete this email.
> 
> 
> 
> 
> Regards,
> Vladislav Odintsov
> 


Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Huge Logical_DP_Groups; NorthD 100% CPU

2024-03-28 Thread Vladislav Odintsov via discuss
Hi George,

do you experience these problems with the 22.03 version?
If yes, you may want to try the patch by Ilya [1], which is a part of the 23.09 release.
In short, there was a problem where a new datapath in OVN caused a full 
re-creation of datapath groups, which resulted in a huge ovsdb transaction.

1: 
https://github.com/ovn-org/ovn/commit/2a57b204459612a9a4edb49b9c5ebc44e101ee93
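
As a quick sanity check, something like this can roughly show how many datapath 
group rows currently exist in the Southbound DB (a sketch; point it at your SB DB):

  ovn-sbctl list Logical_DP_Group | grep -c '^_uuid'
  # counts the Logical_DP_Group records currently stored in the SB database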

> On 28 Mar 2024, at 16:24, Шагов Георгий via discuss 
>  wrote:
> 
> Hello everyone
>  
> In our setup (*) we have encountered a problem with NorthD consuming 100% CPU.
> Looking into the transaction log of the SBDB, it was found that some transactions from 
> NorthD take almost 2 MB in JSON; these are not exceptions, but rather regular.
> Looking deeper into a transaction, we encountered a huge Logical_DP_Group 
> record.
> In more detail: a 2 MB JSON SBDB transaction record was pretty-printed by json_pp to 
> a JSON file that takes 157K lines, of which 152K lines are Logical_DP_Group.
> That Logical_DP_Group record contains 10 groups, with approx. 4K datapaths per 
> group.
>  
> Googling for a similar case, I have found one that looked close to ours: 
> (link1) – this is not our case – we checked that.
> Unfortunately, the basic information regarding Logical DP Groups is only 
> documented in the code of NorthD.
>  
> Any clue/help/questions are extremely appreciated
>  
> (*) Setup:
> - OVN 22.03.3 – we are in the process of migrating to 24.03
> - OVS 2.17.7 – likewise migrating to 3.3
> - Routers – approx. 5K
>  
> (link1): 
> https://github.com/ovn-org/ovn/commit/db4cb7098c890e974175d4d05dd70dc409fad91e
>  
> Yours truly, George
>  
> CONFIDENTIALITY NOTICE: This email and any files attached to it are 
> confidential. If you are not the intended recipient you are notified that 
> using, copying, distributing or taking any action in reliance on the contents 
> of this information is strictly prohibited. If you have received this email 
> in error please notify the sender and delete this email.




Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN SB DB from RAFT cluster to Relay DB

2024-03-05 Thread Vladislav Odintsov via discuss
Hi Felix,

> On 6 Mar 2024, at 10:16, Felix Huettner via discuss 
>  wrote:
> 
> Hi Srini,
> 
> i can share what works for us for ~1k hypervisors:
> 
> On Tue, Mar 05, 2024 at 09:51:43PM -0800, Sri kor via discuss wrote:
>> Hi Team,
>> 
>> 
>> Currently, we are using OVN in RAFT cluster mode. We have 3 NB and SB
>> ovsdb-servers operating in RAFT cluster mode. Currently we have 500
>> hypervisors connected to this RAFT cluster.
>> 
>> For our next deployment, our scale would increase to 3000 hypervisors. To
>> accommodate these scaled hypervisors, we are migrating to the DB relay with
>> multigroup deployment model. This increase helps with OVN SB DB read
>> transactions. But for write transactions, only the leader in the RAFT
>> cluster can update the DB. This creates load on the RAFT leader. Is
>> there a way to address the load on the RAFT cluster leader?
> 
> We do the following:
> * If you need TLS on the ovsdb path, separate it out to some
>  reverseproxy that can do just L4 TLS Termination (e.g. traefik, or so)

Do I understand correctly that with such TLS "offload" you can’t use RBAC for 
hypervisors?

> * Have nobody besides northd connect to the SB DB directly, everyone
>  else needs to use a relay
> * Do not run backups on the cluster leader, but on one of the current
>  followers
> * Increase the raft election timeout significantly (we have 120s in
>  there). However there is a patch afaik in 3.3 that makes that better
> * If you create metrics or so from database content generate these on
>  the relays instead of the raft cluster
> 
> Overall, when our southbound DB had issues, most of the time it was some
> client constantly reconnecting to it and thereby always pulling a full
> DB dump.
> 
>> 
>> 
>> As the scale increases, the number of updates coming to ovn-controller from
>> the OVN SB increases. That creates pressure on ovn-controller. Is there a way
>> to minimize the load on ovn-controller?
> 
> Did not see any kind of issue there yet.
> However, if you are using some Python tooling outside of OVN (e.g.
> OpenStack), ensure that you have JSON parsing via a C library available
> in the ovs lib. This brings significant performance benefits if you have
> a lot of updates.
> You can check with `python3 -c "import ovs.json; print(ovs.json.PARSER)"`,
> which should return "C".
> 
>> 
>> I wish there were a way for ovn-controller to subscribe to updates specific
>> to this hypervisor. Are there any known ovn-controller subscription methods
>> available and being used in the OVS community?
> 
> Yes, they do that by default. However, for us we saw that this creates
> increased load on the relays due to the needed additional filtering and
> JSON serializing per target node. So we turned it off and thereby trade
> less ovsdb load for more network bandwidth.
> The relevant setting is `external_ids:ovn-monitor-all`.
> 
> Thanks
> Felix
> 
>> 
>> 
>> How can I optimize the load on the leader node in an OVN RAFT cluster to
>> handle increased write transactions?
>> 
>> 
>> 
>> Thanks,
>> 
>> Srini
> 


Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] LB with HW-VTEP

2023-12-19 Thread Vladislav Odintsov via discuss
Hi Sergey,

LB support for HW VTEP is not implemented. The main trouble is the fact 
that the HW VTEP device itself cannot modify packets or do anything with control 
plane traffic (answer ARPs, etc.). At least for now.

An additional L3 Gateway chassis can be used to connect routed overlay topologies 
with HW VTEP. In this case the L3 GW chassis will answer ARP requests for the overlay 
GW IP for requests arriving from the "vlan" side and route packets to other subnets. 
This is asymmetric routing: ingress traffic (from outside into OVN) goes 
from the HW VTEP chassis to the L3 GW chassis and next to the destination 
VM's/container's chassis (hypervisor). In the reverse direction traffic goes 
directly from this hypervisor to the HW VTEP, skipping the L3 Gateway.

So in the load balancing case the L3 Gateway could be the chassis which does the load 
balancing, but we must make reply traffic go via this L3 GW for "un-DNATing". 
This is not implemented yet, thus the load balancing stage is skipped in the logical 
switch pipeline for packets coming from the HW VTEP.

Hope, this answers your question.

regards,
Vladislav Odintsov

> On 20 Dec 2023, at 07:12, Sergey via discuss  
> wrote:
> 
> Hello!
> 
> We are using the OVN HW-VTEP mechanism with our OpenStack
> installation, and it works great.
> But we are testing LB (Load Balancer) now, and it does not work in the
> case of this kind of network.
> 
> According to documentation and code:
> https://github.com/ovn-org/ovn/blob/v23.09.1/northd/northd.c#L6996
> All tables about LB are skipped in the case of HW-VTEP, it goes
> directly to table 17.
> 
> My question is, is it reasonable?
> Or does it have to be fixed?
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Enormous amount of records into openvswitch db Bridge table external_ids ct-zone

2023-11-15 Thread Vladislav Odintsov via discuss


> On 15 Nov 2023, at 16:41, Шагов Георгий via discuss 
>  wrote:
> 
> Hello Vladislav
>  
> I really appreciate your reply to the problem
> AFAIK, ovn-controller doesn’t use TCP sockets to connect to local OVS, so it 
> seems that this ERR message is not related to OVS<->ovn-controller interaction
> Looking at this picture: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/373671.html
> I would say that ovn-controller establishes TCP connections to both the SBDB and 
> ovswitch_db, but indeed it uses a UNIX socket to the openvswitch daemon.

Ahh, I was talking about non-container deployment with systemd units.

>  
> If you see such errors in ovsdb-server, which handles Open_vSwitch database, 
> you should check which service has connected to it.
>  
> Getting back to the error from ovswitch_db
> reconnect|ERR|tcp:127.0.0.1:53560
>  
> we checked this port (53560) in the ovn-controller logs and found a corresponding 
> error with the same port and timing, so I am sure this is a connection from 
> ovn-controller.
>  
> In addition, we indeed re-established/reconfigured the connection 
> from ovn-controller to ovswitch_db using a unix socket, which gave us a 
> default probe interval of 10 secs instead of 5 secs for TCP. And that worked! 
> ovn-controller dropped CPU consumption from 100% to 60%! Though ovswitch_db 
> increased by about 20%, I would say.
>  
>  
> I’d suggest enabling dbg logs for ovn-controller (via ovn-appctl vlog/set 
> command) to see which module makes most of CPU load.
> Hm… Yes, we indeed extended logs to DBG, but the option of ‘to see which 
> module makes most of CPU load’ – slipped my attention, thanks for the hint, will 
> check

You can use ovn-appctl stopwatch/show to try to find which module consumes the most 
time and look at its values.
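
For example (run on the node where ovn-controller is running):

  ovn-appctl -t ovn-controller stopwatch/show
  # prints statistics for ovn-controller's internal stopwatches (e.g. flow generation times)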

>  
> What we did, and this seems to be working: we hacked OVS, changing 
> RECONNECT_DEFAULT_PROBE_INTERVAL to 6, checking at the moment…..
>  
> Thanks in advance
>  
>  
>  
> From: Vladislav Odintsov mailto:odiv...@gmail.com>>
> Date: Wednesday, 15 November 2023, 15:10
> To: Шагов Георгий mailto:gmsha...@cloud.ru>>
> Cc: Ales Musil mailto:amu...@redhat.com>>, 
> "ovs-discuss@openvswitch.org " 
> mailto:ovs-discuss@openvswitch.org>>
> Subject: Re: [ovs-discuss] Enormous amount of records into openvswitch db 
> Bridge table external_ids ct-zone
>  
> Hi,
> 
> 
> On 15 Nov 2023, at 12:49, Шагов Георгий via discuss 
>  wrote:
>  
> Hello Ales
>  
> I really appreciate your reply. It helps a lot.
>  
> ovn-appctl -t ovn-controller ct-zone-list
> this request produced about 7K+ records, so the ~6.5K records for 
> ct-zone in the Bridge table seem to be valid.
>  
> Digging deeper, we have found that a major cause of the 100% CPU in 
> ovn-controller appears to be that it performs a full recompute constantly.
> Looking into the logs of the ovsdb-server handling openvswitch_db, we see constant messages:
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 
> seconds, disconnecting
>  
> So it seems like openvswitch_db constantly drops the connection from 
> ovn-controller due to inactivity.
>  
> We tried to change the inactivity probe interval at openvswitch_db using 
> the ovn-controller setting in the openvswitch_db Open_vSwitch table, 
> external_ids:ovn-openflow-probe-interval,
> as explained here: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/373671.html
> Yet, it does not seem to work; regardless of the value we set (e.g. 
> ovn-openflow-probe-interval="60") we still observe the same 5-second 
> interval in the openvswitch_db logs:
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 
> seconds, disconnecting
> This is very confusing.
>  
> AFAIK, ovn-controller doesn’t use TCP sockets to connect to local OVS, so it 
> seems that this ERR message is not related to OVS<->ovn-controller 
> interaction.
> TCP can be used to connect to OVN_Southbound database...
> If you see such errors in ovsdb-server, which handles Open_vSwitch database, 
> you should check which service has connected to it.
>  
> If you think the problem is about connections, it is important to 
> understand which socket brings these problems.
>  
> The ovn-controller <-> ovs-vswitchd OpenFlow connection uses a unix socket 
> (the default for br-int is /var/run/openvswitch/br-int.mgmt). There is a 
> configuration knob, external_ids:ovn-openflow-probe-interval, for this 
> connection. Its default is 0; I’d leave it as is.
> The local ovsdb connection is done via a unix socket to ovsdb-server 
> (/var/run/openvswitch/db.sock).
>  
> I’d suggest enabling dbg logs for ovn-controller (via ovn-appctl vlog/set 
> command) to see which module makes most of CPU load.
> 
> 
>  
> Are we missing anything here? Any hint is appreciated.
> 

Re: [ovs-discuss] Enormous amount of records into openvswitch db Bridge table external_ids ct-zone

2023-11-15 Thread Vladislav Odintsov via discuss
Hi,

> On 15 Nov 2023, at 12:49, Шагов Георгий via discuss 
>  wrote:
> 
> Hello Ales
>  
> I really appreciate your reply. It helps a lot.
>  
> ovn-appctl -t ovn-controller ct-zone-list
> this request produced about 7K+ records, so the ~6.5K records for 
> ct-zone in the Bridge table seem to be valid.
>  
> Digging deeper, we have found that a major cause of the 100% CPU in 
> ovn-controller appears to be that it performs a full recompute constantly.
> Looking into the logs of the ovsdb-server handling openvswitch_db, we see constant messages:
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 
> seconds, disconnecting
>  
> So it seems like openvswitch_db constantly drops the connection from 
> ovn-controller due to inactivity.
>  
> We tried to change the inactivity probe interval at openvswitch_db using 
> the ovn-controller setting in the openvswitch_db Open_vSwitch table, 
> external_ids:ovn-openflow-probe-interval,
> as explained here: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2020-August/373671.html
> Yet, it does not seem to work; regardless of the value we set (e.g. 
> ovn-openflow-probe-interval="60") we still observe the same 5-second 
> interval in the openvswitch_db logs:
> reconnect|ERR|tcp:127.0.0.1:53560: no response to inactivity probe after 5 
> seconds, disconnecting
> This is very confusing.

AFAIK, ovn-controller doesn’t use TCP sockets to connect to the local OVS, so it 
seems that this ERR message is not related to OVS<->ovn-controller interaction.
TCP can be used to connect to the OVN_Southbound database...
If you see such errors in the ovsdb-server which handles the Open_vSwitch database, 
you should check which service has connected to it.

If you think the problem is about connections, it is important to understand 
which socket brings these problems.

The ovn-controller <-> ovs-vswitchd OpenFlow connection uses a unix socket (the default 
for br-int is /var/run/openvswitch/br-int.mgmt). There is a configuration knob, 
external_ids:ovn-openflow-probe-interval, for this connection. Its default is 0; 
I’d leave it as is.
The local ovsdb connection is done via a unix socket to ovsdb-server 
(/var/run/openvswitch/db.sock).
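
To double-check that knob on a node, something like this should work (a sketch):

  ovs-vsctl --if-exists get Open_vSwitch . external_ids:ovn-openflow-probe-interval
  # prints nothing when the key is unset, i.e. probing on this connection stays at its default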

I’d suggest enabling dbg logs for ovn-controller (via the ovn-appctl vlog/set 
command) to see which module generates most of the CPU load.
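
For instance (a sketch; the module name in the second command is just an example of 
narrowing the scope and may differ in your build):

  # enable debug logging for all ovn-controller modules:
  ovn-appctl -t ovn-controller vlog/set dbg
  # or narrow it down to a single module, e.g. the incremental processing engine:
  ovn-appctl -t ovn-controller vlog/set inc_proc_eng:file:dbg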

>  
> Are we missing anything here? Any hint is appreciated.
> Thanks in advance.
>  
>  
> From: Ales Musil mailto:amu...@redhat.com>>
> Date: Tuesday, 14 November 2023, 14:19
> To: Шагов Георгий mailto:gmsha...@cloud.ru>>
> Cc: "ovs-discuss@openvswitch.org " 
> mailto:ovs-discuss@openvswitch.org>>
> Subject: Re: [ovs-discuss] Enormous amount of records into openvswitch db 
> Bridge table external_ids ct-zone
>  
>  
>  
> On Tue, Nov 14, 2023 at 12:02 PM Шагов Георгий via discuss 
> mailto:ovs-discuss@openvswitch.org>> wrote:
> Hello All
>  
>  
> Hi,
> 
>  
> We observe a strangely (for our installation) large amount of records in the 
> openvswitch db Bridge table external_ids:ct-zone, i.e. 6.5K+
>  
> A CT zone is allocated for most of the LSPs (there are some exceptions) and for 
> all LR DNAT and SNAT that are local to the specified controller. This means 
> that you have a lot of ports and possibly routers on that single controller, 
> or the external-ids are not cleared on update (which would be a bug). You can 
> actually check the zone list by running: ovn-appctl -t ovn-controller 
> ct-zone-list, to see if that matches the count of active zones that 
> ovn-controller knows about.
>  
>  
> grep -A20 '^Bridge table' ./ovs.dump | grep external_ids | sed 
> 's/ct-zone-/\nct-zone-/g' | sort | uniq | wc -l
> 6659
>  
> Details:
>   5   "Bridge" : {
>   6  "06ef9e06-188e-4654-93b2-5242a324a5c7" : {
>   7 "initial" : {
>   8"datapath_type" : "system",
>   9"external_ids" : [
>  10   "map",
>  11   [
>  12  [
>  13 
> "ct-zone-00368809-59f5-4408-8ae3-fb5401ff6ea4_dnat",
>  14 "60"
>  15  ],
>  16  [
>  
> In that same time if I run:
> ovs-dpctl ct-stats-show
> Connections Stats:
> Total: 1672
>   TCP: 1269
>   UDP: 398
>   ICMP: 5
>  
> The questions are:
> Who is writing into the openvswitch db Bridge table external_ids:ct-zone?
>  
> ovn-controller is writing those values for the purpose of restoring the zones 
> after restart.
>  
>  
> Is there any way to manage these records in the openvswitch db Bridge table 
> external_ids? I want to purge them…
>  
> ovn-controller will still write those that are new/changed if you purge them. 
> I would 

Re: [ovs-discuss] OVS & OVN HWOL with Nvidia ConnectX-6 Dx - Kernel flower acknowledgment does not match request

2023-10-31 Thread Vladislav Odintsov via discuss
Hi Ilya and Eelco,

thanks for your work!
I’ll try out the mentioned patch this week to make sure that there are no warnings 
anymore.

> On 31 Oct 2023, at 11:28, Eelco Chaudron  wrote:
> 
> 
> 
> On 30 Oct 2023, at 15:47, Eelco Chaudron wrote:
> 
>> On 30 Oct 2023, at 15:08, Ilya Maximets wrote:
>> 
>>> On 10/26/23 14:05, Odintsov Vladislav wrote:
>>>> Hi,
>>>> 
>>>>> On 19 Oct 2023, at 17:06, Vladislav Odintsov via discuss 
>>>>>  wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 18 Oct 2023, at 18:43, Ilya Maximets via discuss 
>>>>>>  wrote:
>>>>>> 
>>>>>> On 10/18/23 16:24, Vladislav Odintsov wrote:
>>>>>>> Hi Ilya,
>>>>>>> 
>>>>>>> thanks for your response!
>>>>>>> 
>>>>>>>> On 18 Oct 2023, at 15:59, Ilya Maximets via discuss 
>>>>>>>>  wrote:
>>>>>>>> 
>>>>>>>> On 10/17/23 16:30, Vladislav Odintsov via discuss wrote:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> I’m testing OVS hardware offload with tc flower with Mellanox/NVidia 
>>>>>>>>> ConnectX-6 Dx smartnic and see next warning in ovs-vswitchd log:
>>>>>>>>> 
>>>>>>>>> 2023-10-17T14:23:15.116Z|00386|tc(handler20)|WARN|Kernel flower 
>>>>>>>>> acknowledgment does not match request!  Set dpif_netlink to dbg to 
>>>>>>>>> see which rule caused this error.
>>>>>>>>> 
>>>>>>>>> With dpif_netlink debug logs enabled, after this message appears two 
>>>>>>>>> additional lines:
>>>>>>>>> 
>>>>>>>>> 2023-10-17T14:23:15.117Z|00387|dpif_netlink(handler20)|DBG|added flow
>>>>>>>>> 2023-10-17T14:23:15.117Z|00388|dpif_netlink(handler20)|DBG|system@ovs-system:
>>>>>>>>>  put[create] ufid:d8a3ab6d-77d1-4574-8bbf-634b01a116f3 
>>>>>>>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x10,src=10.1.0.105,dst=10.1.0.109,ttl=64/0,tp_src=59507/0,tp_dst=6081/0,geneve({class=0x102,type=0x80,len=4,0x60002}),flags(-df+csum+key)),in_port(4),skb_mark(0/0),ct_state(0/0x2f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x3),eth(src=00:00:ba:a4:6e:ad,dst=00:01:ba:a4:6e:ad),eth_type(0x0800),ipv4(src=172.32.2.4/0.0.0.0,dst=172.32.1.4/0.0.0.0,proto=1,tos=0/0x3,ttl=63/0,frag=no),icmp(type=8/0,code=0/0),
>>>>>>>>>  
>>>>>>>>> actions:set(tunnel(tun_id=0xff0011,src=10.1.0.109,dst=10.1.1.18,ttl=64,tp_src=59507,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x18000b}),flags(df|csum|key))),4
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Could you also enable debug logs for 'tc' module in OVS?
>>>>>>>> It should give more information about where exactly the 
>>>>>>>> difference is between what OVS asked for and what the kernel
>>>>>>>> reported back.
>>>>>>>> 
>>>>>>>> In general this warning typically signifies a kernel bug,
>>>>>>>> but it could be that OVS doesn't format something correctly
>>>>>>>> as well.
>>>>>>> 
>>>>>>> With enabled tc logs I see mismatches in expected/real keys and actions:
>>>>>>> 
>>>>>>> 2023-10-18T13:33:35.882Z|00118|tc(handler21)|DBG|tc flower compare 
>>>>>>> failed action compare
>>>>>>> Expected Mask:
>>>>>>>   ff ff 00 00 ff ff ff ff-ff ff ff ff ff ff ff ff
>>>>>>> 0030  00 00 2f 00 00 00 00 00-00 00 00 00 00 00 00 00
>>>>>>> 0040  03 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>>>>>>> 0050  00 00 00 00 ff ff ff ff-00 00 00 00 00 00 00 00
>>>>>>> 0060  00 00 00 00 ff 00 00 00-00 00 00 00 00 00 00 00
>>>>>>> 0090  00 00 00 00 00 00 00 00-ff ff ff ff ff ff ff ff
>>>>>>> 00c0  ff 00 00 00 ff ff 00 00-ff ff ff ff ff ff ff ff
>>>>>>> 00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>>>>>>> 00e0  ff ff ff 01 ff ff ff ff-00 00 00 00 00 00 00 00
>>>>>>> 
>>>>>>> Received Mask:
>>>>>>>

Re: [ovs-discuss] OVS & OVN HWOL with Nvidia ConnectX-6 Dx - Kernel flower acknowledgment does not match request

2023-10-19 Thread Vladislav Odintsov via discuss


> On 18 Oct 2023, at 18:43, Ilya Maximets via discuss 
>  wrote:
> 
> On 10/18/23 16:24, Vladislav Odintsov wrote:
>> Hi Ilya,
>> 
>> thanks for your response!
>> 
>>> On 18 Oct 2023, at 15:59, Ilya Maximets via discuss 
>>>  wrote:
>>> 
>>> On 10/17/23 16:30, Vladislav Odintsov via discuss wrote:
>>>> Hi,
>>>> 
>>>> I’m testing OVS hardware offload with tc flower with Mellanox/NVidia 
>>>> ConnectX-6 Dx smartnic and see next warning in ovs-vswitchd log:
>>>> 
>>>> 2023-10-17T14:23:15.116Z|00386|tc(handler20)|WARN|Kernel flower 
>>>> acknowledgment does not match request!  Set dpif_netlink to dbg to see 
>>>> which rule caused this error.
>>>> 
>>>> With dpif_netlink debug logs enabled, after this message appears two 
>>>> additional lines:
>>>> 
>>>> 2023-10-17T14:23:15.117Z|00387|dpif_netlink(handler20)|DBG|added flow
>>>> 2023-10-17T14:23:15.117Z|00388|dpif_netlink(handler20)|DBG|system@ovs-system:
>>>>  put[create] ufid:d8a3ab6d-77d1-4574-8bbf-634b01a116f3 
>>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x10,src=10.1.0.105,dst=10.1.0.109,ttl=64/0,tp_src=59507/0,tp_dst=6081/0,geneve({class=0x102,type=0x80,len=4,0x60002}),flags(-df+csum+key)),in_port(4),skb_mark(0/0),ct_state(0/0x2f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x3),eth(src=00:00:ba:a4:6e:ad,dst=00:01:ba:a4:6e:ad),eth_type(0x0800),ipv4(src=172.32.2.4/0.0.0.0,dst=172.32.1.4/0.0.0.0,proto=1,tos=0/0x3,ttl=63/0,frag=no),icmp(type=8/0,code=0/0),
>>>>  
>>>> actions:set(tunnel(tun_id=0xff0011,src=10.1.0.109,dst=10.1.1.18,ttl=64,tp_src=59507,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x18000b}),flags(df|csum|key))),4
>>>> 
>>> 
>>> Could you also enable debug logs for 'tc' module in OVS?
>>> It should give more information about where exactly the
>>> difference is between what OVS asked for and what the kernel
>>> reported back.
>>> 
>>> In general this warning typically signifies a kernel bug,
>>> but it could be that OVS doesn't format something correctly
>>> as well.
>> 
>> With enabled tc logs I see mismatches in expected/real keys and actions:
>> 
>> 2023-10-18T13:33:35.882Z|00118|tc(handler21)|DBG|tc flower compare failed 
>> action compare
>> Expected Mask:
>>   ff ff 00 00 ff ff ff ff-ff ff ff ff ff ff ff ff
>> 0030  00 00 2f 00 00 00 00 00-00 00 00 00 00 00 00 00
>> 0040  03 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>> 0050  00 00 00 00 ff ff ff ff-00 00 00 00 00 00 00 00
>> 0060  00 00 00 00 ff 00 00 00-00 00 00 00 00 00 00 00
>> 0090  00 00 00 00 00 00 00 00-ff ff ff ff ff ff ff ff
>> 00c0  ff 00 00 00 ff ff 00 00-ff ff ff ff ff ff ff ff
>> 00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>> 00e0  ff ff ff 01 ff ff ff ff-00 00 00 00 00 00 00 00
>> 
>> Received Mask:
>>   ff ff 00 00 ff ff ff ff-ff ff ff ff ff ff ff ff
>> 0030  00 00 2f 00 00 00 00 00-00 00 00 00 00 00 00 00
>> 0040  03 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>> 0050  00 00 00 00 ff ff ff ff-00 00 00 00 00 00 00 00
>> 0060  00 00 00 00 ff 00 00 00-00 00 00 00 00 00 00 00
>> 0090  00 00 00 00 00 00 00 00-ff ff ff ff ff ff ff ff
>> 00c0  ff 00 00 00 ff ff 00 00-ff ff ff ff ff ff ff ff
>> 00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>> 00e0  ff ff ff 01 ff ff ff ff-00 00 00 00 00 00 00 00
>> 
>> Expected Key:
>>   08 06 00 00 ff ff ff ff-ff ff 00 00 ba a4 6e ad
>> 0050  a9 fe 64 01 a9 fe 64 03-00 00 ba a4 6e ad 00 00  <— mismatch in 
>> this line
>> 0060  00 00 00 00 01 00 00 00-00 00 00 00 00 00 00 00
>> 0090  00 00 00 00 00 00 00 00-0a 01 00 68 0a 01 00 6d
>> 00c0  00 40 c0 5b 17 c1 00 00-00 00 00 00 00 00 00 10  <— mismatch in 
>> this line
>> 00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>> 00e0  01 02 80 01 00 03 00 02-00 00 00 00 00 00 00 00
>> 
>> Received Key:
>>   08 06 00 00 ff ff ff ff-ff ff 00 00 ba a4 6e ad
>> 0050  00 00 00 00 a9 fe 64 03-00 00 00 00 00 00 00 00  <— mismatch in 
>> this line
>> 0060  00 00 00 00 01 00 00 00-00 00 00 00 00 00 00 00
>> 0090  00 00 00 00 00 00 00 00-0a 01 00 68 0a 01 00 6d
>> 00c0  00 00 00 00 17 c1 00 00-00 00 00 00 00 00 00 10  <— mismatch in 
>> this line
>> 00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>> 00e0  01 02 80 01 00 03 00 02-00 00 00 00 00 00 00 00
>

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-10-18 Thread Vladislav Odintsov via discuss


> On 18 Oct 2023, at 18:59, Ilya Maximets  wrote:
> 
> On 10/18/23 17:14, Vladislav Odintsov wrote:
>> Hi Ilya, Terry,
>> 
>>> On 7 Mar 2023, at 14:03, Ilya Maximets  wrote:
>>> 
>>> On 3/7/23 00:15, Vladislav Odintsov wrote:
 Hi Ilya,
 
 I’m wondering whether there are possible configuration parameters for 
 ovsdb relay -> main ovsdb server inactivity probe timer.
 My cluster is experiencing issues where the relay disconnects from the main cluster 
 due to the 5 sec. inactivity probe timeout.
 The main cluster has quite a big database and a bunch of daemons which connect 
 to it, and that makes it difficult to maintain connections in time.
 
 For ovsdb relay as a remote I use in-db configuration (to provide 
 inactivity probe and rbac configuration for ovn-controllers).
 For ovsdb-server, which serves SB, I just set --remote=pssl:.
 
 I’d like to configure remote for ovsdb cluster via DB to set inactivity 
 probe setting, but I’m not sure about the correct way for that.
 
 For now I see only two options:
 1. Setup custom database scheme with connection table, serve it in same SB 
 cluster and specify this connection when start ovsdb sb server.
>>> 
>>> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
>>> used for that purpose.  But you'll need to craft transactions for it
>>> manually with ovsdb-client.
>>> 
>>> There is a control tool prepared by Terry:
>>>  
>>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
>>> 
>>> But it's not in the repo yet (I need to get back to reviews on that
>>> topic at some point).  The tool itself should be fine, but maybe name
>>> will change.
>> 
>> I want to step back to this thread.
>> The mentioned patch is archived with the "Changes Requested" state, but there are 
>> no review comments on this patch.
>> If there is no ongoing work on it, I can take it over to finalise it.
>> For now it needs a small rebase, so I can do that and resend, but before that I want 
>> to hear your thoughts on this.
>> 
>> Internally we have used this patch to work with the Local_Config DB for almost 6 
>> months and it works fine.
>> On each OVS update we have to re-apply it and sometimes resolve conflicts, so it 
>> would be nice to have this patch upstream.
> 
> Hi, I'm currently in the middle of re-working the ovsdb-server configuration
> for a different approach that will replace command-line and appctl configs
> with a config file (cmdline and appctls will be preserved for backward
> compatibility, but there will be a new way of setting things up).  This should
> be much more flexible and user-friendly than working with a local-config
> database.  That should also address most of the concerns raised by Terry
> regarding usability of local-config (having way too many ways of configuring
> the same thing mainly, and requirement to use special tools to modify the
> configuration).  I'm planning to post the first version of the change
> relatively soon.  I can Cc you on the patches.

Okay, got it.
It would be nice if you could Cc me so I don't miss the patches, thanks!

> 
> Best regards, Ilya Maximets.
> 
>> 
>>> 
 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
 and deploy cluster separately from ovsdb relay, because they both start 
 same connections and conflict on ports. (I don’t use docker here, so I 
 need a separate server for that).
>>> 
>>> That's an easy option available right now, true.  If they are deployed
>>> on different nodes, you may even use the same connection record.
>>> 
 
 Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
 inactivity probe (say, to 60k), I guess it’s still not enough to have 
 ovsdb pings every 60 seconds. Inactivity probe must be the same from both 
 ends - right? From the ovsdb relay process.
>>> 
>>> Inactivity probes don't need to be the same.  They are separate for each
>>> side of a connection and so configured separately.
>>> 
>>> You can set up inactivity probe for the server side of the connection via
>>> database.  So, server will probe the relay every 60 seconds, but today
>>> it's not possible to set inactivity probe for the relay-to-server direction.
>>> So, relay will probe the server every 5 seconds.
>>> 
>>> The way out from this situation is to allow configuration of relays via
>>> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
>>> require addition of a new table to the Local_Config database and allowing
>>> relay config to be parsed from the database in the code.  That wasn't
>>> implemented yet.
>>> 
 I saw your talk on last ovscon about this topic, and the solution was in 
 progress there. But maybe there were some changes from that time? I’m 
 ready to test it if any. Or, maybe there’s any workaround?
>>> 
>>> Sorry, we didn't move forward much on that topic since the presentation.
>>> There are few unanswered questions 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-10-18 Thread Vladislav Odintsov via discuss
Hi Ilya, Terry,

> On 7 Mar 2023, at 14:03, Ilya Maximets  wrote:
> 
> On 3/7/23 00:15, Vladislav Odintsov wrote:
>> Hi Ilya,
>> 
>> I’m wondering whether there are possible configuration parameters for ovsdb 
>> relay -> main ovsdb server inactivity probe timer.
>> My cluster is experiencing issues where the relay disconnects from the main cluster due 
>> to the 5 sec. inactivity probe timeout.
>> The main cluster has quite a big database and a bunch of daemons which connect 
>> to it, and that makes it difficult to maintain connections in time.
>> 
>> For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
>> probe and rbac configuration for ovn-controllers).
>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>> 
>> I’d like to configure remote for ovsdb cluster via DB to set inactivity 
>> probe setting, but I’m not sure about the correct way for that.
>> 
>> For now I see only two options:
>> 1. Setup custom database scheme with connection table, serve it in same SB 
>> cluster and specify this connection when start ovsdb sb server.
> 
> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
> used for that purpose.  But you'll need to craft transactions for it
> manually with ovsdb-client.
> 
> There is a control tool prepared by Terry:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
> 
> But it's not in the repo yet (I need to get back to reviews on that
> topic at some point).  The tool itself should be fine, but maybe name
> will change.

I want to step back to this thread.
The mentioned patch is archived with the "Changes Requested" state, but there are no 
review comments on this patch.
If there is no ongoing work on it, I can take it over to finalise it.
For now it needs a small rebase, so I can do that and resend, but before that I want to 
hear your thoughts on this.

Internally we have used this patch to work with the Local_Config DB for almost 6 months 
and it works fine.
On each OVS update we have to re-apply it and sometimes resolve conflicts, so it 
would be nice to have this patch upstream.

> 
>> 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
>> and deploy cluster separately from ovsdb relay, because they both start same 
>> connections and conflict on ports. (I don’t use docker here, so I need a 
>> separate server for that).
> 
> That's an easy option available right now, true.  If they are deployed
> on different nodes, you may even use the same connection record.
> 
>> 
>> Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
>> inactivity probe (say, to 60k), I guess it’s still not enough to have ovsdb 
>> pings every 60 seconds. Inactivity probe must be the same from both ends - 
>> right? From the ovsdb relay process.
> 
> Inactivity probes don't need to be the same.  They are separate for each
> side of a connection and so configured separately.
> 
> You can set up inactivity probe for the server side of the connection via
> database.  So, server will probe the relay every 60 seconds, but today
> it's not possible to set inactivity probe for the relay-to-server direction.
> So, relay will probe the server every 5 seconds.
> 
> The way out from this situation is to allow configuration of relays via
> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
> require addition of a new table to the Local_Config database and allowing
> relay config to be parsed from the database in the code.  That wasn't
> implemented yet.
> 
>> I saw your talk on last ovscon about this topic, and the solution was in 
>> progress there. But maybe there were some changes from that time? I’m ready 
>> to test it if any. Or, maybe there’s any workaround?
> 
> Sorry, we didn't move forward much on that topic since the presentation.
> There are few unanswered questions around local config database.  Mainly
> regarding upgrades from cmdline/main db -based configuration to a local
> config -based.  But I hope we can figure that out in the current release
> time frame, i.e. before 3.2 release.
> 
> There is also this workaround:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao@easystack.cn/
> It simply takes the server->relay inactivity probe value and applies it
> to the relay->server connection.  But it's not a correct solution, because
> it relies on certain database names.
> 
> Out of curiosity, what kind of poll intervals you see on your main server
> setup that triggers inactivity probe failures?  Can upgrade to OVS 3.1
> solve some of these issues?  3.1 should be noticeably faster than 2.17,
> and also parallel compaction introduced in 3.0 removes one of the big
> reasons for large poll intervals.  OVN upgrade to 22.09+ or even 23.03
> should also help with database sizes.
> 
> Best regards, Ilya Maximets.


Regards,
Vladislav Odintsov

___
discuss mailing list

Re: [ovs-discuss] OVS & OVN HWOL with Nvidia ConnectX-6 Dx - Kernel flower acknowledgment does not match request

2023-10-18 Thread Vladislav Odintsov via discuss
Hi Ilya,

thanks for your response!

> On 18 Oct 2023, at 15:59, Ilya Maximets via discuss 
>  wrote:
> 
> On 10/17/23 16:30, Vladislav Odintsov via discuss wrote:
>> Hi,
>> 
>> I’m testing OVS hardware offload with tc flower with Mellanox/NVidia 
>> ConnectX-6 Dx smartnic and see next warning in ovs-vswitchd log:
>> 
>> 2023-10-17T14:23:15.116Z|00386|tc(handler20)|WARN|Kernel flower 
>> acknowledgment does not match request!  Set dpif_netlink to dbg to see which 
>> rule caused this error.
>> 
>> With dpif_netlink debug logs enabled, after this message appears two 
>> additional lines:
>> 
>> 2023-10-17T14:23:15.117Z|00387|dpif_netlink(handler20)|DBG|added flow
>> 2023-10-17T14:23:15.117Z|00388|dpif_netlink(handler20)|DBG|system@ovs-system:
>>  put[create] ufid:d8a3ab6d-77d1-4574-8bbf-634b01a116f3 
>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x10,src=10.1.0.105,dst=10.1.0.109,ttl=64/0,tp_src=59507/0,tp_dst=6081/0,geneve({class=0x102,type=0x80,len=4,0x60002}),flags(-df+csum+key)),in_port(4),skb_mark(0/0),ct_state(0/0x2f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x3),eth(src=00:00:ba:a4:6e:ad,dst=00:01:ba:a4:6e:ad),eth_type(0x0800),ipv4(src=172.32.2.4/0.0.0.0,dst=172.32.1.4/0.0.0.0,proto=1,tos=0/0x3,ttl=63/0,frag=no),icmp(type=8/0,code=0/0),
>>  
>> actions:set(tunnel(tun_id=0xff0011,src=10.1.0.109,dst=10.1.1.18,ttl=64,tp_src=59507,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x18000b}),flags(df|csum|key))),4
>> 
> 
> Could you also enable debug logs for 'tc' module in OVS?
> It should give more information about where exactly the
> difference is between what OVS asked for and what the kernel
> reported back.
> 
> In general this warning typically signifies a kernel bug,
> but it could be that OVS doesn't format something correctly
> as well.

With enabled tc logs I see mismatches in expected/real keys and actions:

2023-10-18T13:33:35.882Z|00118|tc(handler21)|DBG|tc flower compare failed 
action compare
Expected Mask:
  ff ff 00 00 ff ff ff ff-ff ff ff ff ff ff ff ff
0030  00 00 2f 00 00 00 00 00-00 00 00 00 00 00 00 00
0040  03 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
0050  00 00 00 00 ff ff ff ff-00 00 00 00 00 00 00 00
0060  00 00 00 00 ff 00 00 00-00 00 00 00 00 00 00 00
0090  00 00 00 00 00 00 00 00-ff ff ff ff ff ff ff ff
00c0  ff 00 00 00 ff ff 00 00-ff ff ff ff ff ff ff ff
00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00e0  ff ff ff 01 ff ff ff ff-00 00 00 00 00 00 00 00

Received Mask:
  ff ff 00 00 ff ff ff ff-ff ff ff ff ff ff ff ff
0030  00 00 2f 00 00 00 00 00-00 00 00 00 00 00 00 00
0040  03 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
0050  00 00 00 00 ff ff ff ff-00 00 00 00 00 00 00 00
0060  00 00 00 00 ff 00 00 00-00 00 00 00 00 00 00 00
0090  00 00 00 00 00 00 00 00-ff ff ff ff ff ff ff ff
00c0  ff 00 00 00 ff ff 00 00-ff ff ff ff ff ff ff ff
00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00e0  ff ff ff 01 ff ff ff ff-00 00 00 00 00 00 00 00

Expected Key:
  08 06 00 00 ff ff ff ff-ff ff 00 00 ba a4 6e ad
0050  a9 fe 64 01 a9 fe 64 03-00 00 ba a4 6e ad 00 00  <— mismatch in this 
line
0060  00 00 00 00 01 00 00 00-00 00 00 00 00 00 00 00
0090  00 00 00 00 00 00 00 00-0a 01 00 68 0a 01 00 6d
00c0  00 40 c0 5b 17 c1 00 00-00 00 00 00 00 00 00 10  <— mismatch in this 
line
00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00e0  01 02 80 01 00 03 00 02-00 00 00 00 00 00 00 00

Received Key:
  08 06 00 00 ff ff ff ff-ff ff 00 00 ba a4 6e ad
0050  00 00 00 00 a9 fe 64 03-00 00 00 00 00 00 00 00  <— mismatch in this 
line
0060  00 00 00 00 01 00 00 00-00 00 00 00 00 00 00 00
0090  00 00 00 00 00 00 00 00-0a 01 00 68 0a 01 00 6d
00c0  00 00 00 00 17 c1 00 00-00 00 00 00 00 00 00 10  <— mismatch in this 
line
00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00e0  01 02 80 01 00 03 00 02-00 00 00 00 00 00 00 00

Expected Masked Key:
  08 06 00 00 ff ff ff ff-ff ff 00 00 ba a4 6e ad
0050  00 00 00 00 a9 fe 64 03-00 00 00 00 00 00 00 00
0060  00 00 00 00 01 00 00 00-00 00 00 00 00 00 00 00
0090  00 00 00 00 00 00 00 00-0a 01 00 68 0a 01 00 6d
00c0  00 00 00 00 17 c1 00 00-00 00 00 00 00 00 00 10
00d0  08 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
00e0  01 02 80 01 00 03 00 02-00 00 00 00 00 00 00 00

Received Masked Key:
  08 06 00 00 ff ff ff ff-ff ff 00 00 ba a4 6e ad
0050  00 00 00 00 a9 fe 64 03-00 00 00 00 00 00 00 00
0060  00 00 00 00 01 00 00 00-00 00 00 00 00 00 00 00
0090  00 00 00 00 00 00 00 00-0a 01 00 68 0a 01 00 6d
00c0  00 00 00 00 17 c1 00 00-00 00 00 00 0

[ovs-discuss] OVS & OVN HWOL with Nvidia ConnectX-6 Dx - Kernel flower acknowledgment does not match request

2023-10-17 Thread Vladislav Odintsov via discuss
Hi,

I’m testing OVS hardware offload with tc flower with Mellanox/NVidia ConnectX-6 
Dx smartnic and see next warning in ovs-vswitchd log:

2023-10-17T14:23:15.116Z|00386|tc(handler20)|WARN|Kernel flower acknowledgment 
does not match request!  Set dpif_netlink to dbg to see which rule caused this 
error.
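
For reference, enabling that debug logging can be done roughly like this:

  ovs-appctl vlog/set dpif_netlink:file:dbg
  # the same pattern works for the 'tc' module whose warning is shown above:
  ovs-appctl vlog/set tc:file:dbg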

With dpif_netlink debug logs enabled, after this message appears two additional 
lines:

2023-10-17T14:23:15.117Z|00387|dpif_netlink(handler20)|DBG|added flow
2023-10-17T14:23:15.117Z|00388|dpif_netlink(handler20)|DBG|system@ovs-system: 
put[create] ufid:d8a3ab6d-77d1-4574-8bbf-634b01a116f3 
recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x10,src=10.1.0.105,dst=10.1.0.109,ttl=64/0,tp_src=59507/0,tp_dst=6081/0,geneve({class=0x102,type=0x80,len=4,0x60002}),flags(-df+csum+key)),in_port(4),skb_mark(0/0),ct_state(0/0x2f),ct_zone(0/0),ct_mark(0/0),ct_label(0/0x3),eth(src=00:00:ba:a4:6e:ad,dst=00:01:ba:a4:6e:ad),eth_type(0x0800),ipv4(src=172.32.2.4/0.0.0.0,dst=172.32.1.4/0.0.0.0,proto=1,tos=0/0x3,ttl=63/0,frag=no),icmp(type=8/0,code=0/0),
 
actions:set(tunnel(tun_id=0xff0011,src=10.1.0.109,dst=10.1.1.18,ttl=64,tp_src=59507,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x18000b}),flags(df|csum|key))),4

The test system is CentOS 8.4 with the elrepo mainline kernel 6.5.5 installed, 
OVS 3.1.1 and OVN 22.09.1.
The workload I’m testing is an L3 Gateway for OVN IC (cross-AZ traffic).

tc monitor at the same moment outputs the following:

replaced filter dev genev_sys_6081 ingress protocol ip pref 2 flower chain 0 
handle 0x3
  dst_mac 00:01:ba:a4:6e:ad
  src_mac 00:00:ba:a4:6e:ad
  eth_type ipv4
  ip_proto icmp
  ip_tos 0/0x3
  enc_dst_ip 10.1.0.109
  enc_src_ip 10.1.0.105
  enc_key_id 16
  enc_dst_port 6081
  enc_tos 0
  geneve_opts 0102:80:00060002/:ff:
  ip_flags nofrag
  ct_state -trk-new-est
  ct_label /03
  in_hw in_hw_count 2
action order 1: tunnel_key  unset pipe
 index 5 ref 1 bind 1
no_percpu
used_hw_stats delayed

action order 2: tunnel_key  set
src_ip 10.1.0.109
dst_ip 10.1.1.18
key_id 16711697
dst_port 6081
geneve_opts 0102:80:0018000b
csum
ttl 64 pipe
 index 6 ref 1 bind 1
no_percpu
used_hw_stats delayed

action order 3: mirred (Egress Redirect to device genev_sys_6081) stolen
index 3 ref 1 bind 1
cookie 6daba3d87445d1774b63bf8bf316a101
no_percpu
used_hw_stats delayed


Despite these warnings, the flow is finally offloaded and the traffic 
traverses this GW node well; only the first packets of an ICMP sequence reach the CPU 
(seen in tcpdump):

# ovs-appctl dpctl/dump-flows type=offloaded
tunnel(tun_id=0x10,src=10.1.0.107,dst=10.1.0.109,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x50002}),flags(+key)),ct_state(-new-est-rel-rpl-trk),ct_label(0/0x3),recirc_id(0),in_port(4),eth(src=00:00:ba:a4:6e:ad,dst=00:01:ba:a4:6e:ad),eth_type(0x0800),ipv4(proto=1,tos=0/0x3,frag=no),
 packets:3192, bytes:312816, used:1.240s, 
actions:set(tunnel(tun_id=0xff0011,src=10.1.0.109,dst=10.1.1.18,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x18000b}),flags(csum|key))),4
tunnel(tun_id=0xff0011,src=10.1.1.18,dst=10.1.0.109,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0xb0018}),flags(+key)),ct_state(-new-est-rel-rpl-trk),ct_label(0/0x3),recirc_id(0),in_port(4),eth(src=00:01:ba:a4:6e:ad,dst=00:00:ba:a4:6e:ad),eth_type(0x0800),ipv4(src=172.32.1.0/255.255.255.0,dst=172.32.0.4,proto=1,tos=0/0x3,ttl=63,frag=no),
 packets:3192, bytes:312816, used:1.240s, 
actions:set(tunnel(tun_id=0x11,src=10.1.0.109,dst=10.1.0.107,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x10002}),flags(csum|key))),set(eth(src=d0:fe:00:00:00:1d,dst=0a:00:66:ec:f7:40)),set(ipv4(ttl=62)),4
tunnel(tun_id=0x10,src=10.1.0.105,dst=10.1.0.109,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x60002}),flags(+key)),ct_state(-new-est-rel-rpl-trk),ct_label(0/0x3),recirc_id(0),in_port(4),eth(src=00:00:ba:a4:6e:ad,dst=00:01:ba:a4:6e:ad),eth_type(0x0800),ipv4(proto=1,tos=0/0x3,frag=no),
 packets:293, bytes:28714, used:1.240s, 
actions:set(tunnel(tun_id=0xff0011,src=10.1.0.109,dst=10.1.1.18,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x18000b}),flags(csum|key))),4
tunnel(tun_id=0xff0011,src=10.1.1.18,dst=10.1.0.109,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0xb0018}),flags(+key)),ct_state(-new-est-rel-rpl-trk),ct_label(0/0x3),recirc_id(0),in_port(4),eth(src=00:01:ba:a4:6e:ad,dst=00:00:ba:a4:6e:ad),eth_type(0x0800),ipv4(src=172.32.1.0/255.255.255.0,dst=172.32.2.4,proto=1,tos=0/0x3,ttl=63,frag=no),
 packets:293, bytes:28714, used:1.240s, 
actions:set(tunnel(tun_id=0x17,src=10.1.0.109,dst=10.1.0.105,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x10002}),flags(csum|key))),set(eth(src=d0:fe:00:00:00:8e,dst=0a:00:40:c2:76:a0)),set(ipv4(ttl=62)),4

Re: [ovs-discuss] OVN: scaling L2 networks beyond 10k chassis - proposals

2023-09-30 Thread Vladislav Odintsov via discuss
regards,
Vladislav Odintsov

On 30 Sep 2023, at 23:24, Han Zhou  wrote:
On Sat, Sep 30, 2023 at 9:56 AM Vladislav Odintsov <odiv...@gmail.com> wrote:
>
>
>
> regards,
> Vladislav Odintsov
>
> > On 30 Sep 2023, at 16:50, Robin Jarry <rja...@redhat.com> wrote:
> >
> > Hi Vladislav, Frode,
> >
> > Thanks for your replies.
> >
> > Frode Nordahl, Sep 30, 2023 at 10:55:
> >> On Sat, Sep 30, 2023 at 9:43 AM Vladislav Odintsov via discuss
> >> <ovs-discuss@openvswitch.org> wrote:
> >>>> On 29 Sep 2023, at 18:14, Robin Jarry via discuss <ovs-discuss@openvswitch.org> wrote:
> >>>>
> >>>> Felix Huettner, Sep 29, 2023 at 15:23:
> >>>>>> Distributed mac learning
> >>>>>> 
> >>>> [snip]
> >>>>>>
> >>>>>> Cons:
> >>>>>>
> >>>>>> - How to manage seamless upgrades?
> >>>>>> - Requires ovn-controller to move/plug ports in the correct bridge.
> >>>>>> - Multiple openflow connections (one per managed bridge).
> >>>>>> - Requires ovn-trace to be reimplemented differently (maybe other tools
> >>>>>> as well).
> >>>>>
> >>>>> - No central information anymore on mac bindings. All nodes need to
> >>>>> update their data individually
> >>>>> - Each bridge generates also a linux network interface. I do not know if
> >>>>> there is some kind of limit to the linux interfaces or the ovs bridges
> >>>>> somewhere.
> >>>>
> >>>> That's a good point. However, only the bridges related to one
> >>>> implemented logical network would need to be created on a single
> >>>> chassis. Even with the largest OVN deployments, I doubt this would be
> >>>> a limitation.
> >>>>
> >>>>> Would you still preprovision static mac addresses on the bridge for all
> >>>>> port_bindings we know the mac address from, or would you rather leave
> >>>>> that up for learning as well?
> >>>>
> >>>> I would leave everything dynamic.
> >>>>
> >>>>> I do not know if there is some kind of performance/optimization penality
> >>>>> for moving packets between different bridges.
> >>>>
> >>>> As far as I know, once the openflow pipeline has been resolved into
> >>>> a datapath flow, there is no penalty.
> >>>>
> >>>>> You can also not only use the logical switch that have a local port
> >>>>> bound. Assume the following topology:
> >>>>> +---+ +---+ +---+ +---+ +---+ +---+ +---+
> >>>>> |vm1+-+ls1+-+lr1+-+ls2+-+lr2+-+ls3+-+vm2|
> >>>>> +---+ +---+ +---+ +---+ +---+ +---+ +---+
> >>>>> vm1 and vm2 are both running on the same hypervisor. Creating only local
> >>>>> logical switches would mean only ls1 and ls3 are available on that
> >>>>> hypervisor. This would break the connection between the two vms which
> >>>>> would in the current implementation just traverse the two logical
> >>>>> routers.
> >>>>> I guess we would need to create bridges for each locally reachable
> >>>>> logical switch. I am concerned about the potentially significant
> >>>>> increase in bridges and openflow connections this brings.
> >>>>
> >>>> That is one of the concerns I raised in the last point. In my opinion
> >>>> this is a trade off. You remove centralization and require more local
> >>>> processing. But overall, the processing cost should remain equivalent.
> >>>
> >>> Just want to clarify.
> >>>
> >>> For topology described by Felix above, you propose to create 2 OVS
> >>> bridges, right? How will the packet traverse from vm1 to vm2?
> >
> > In this particular case, there would be 3 OVS bridges, one for each
> > logical switch.
>
> Yeah, agree, this is typo. Below I named three bridges :).
>
> >
> >>> Currently when the packet enters OVS all the logical switching and
> >>> routing openflow calculation is done with no packet re-entering OVS,
> >>> and this results in one DP flow match to deliver this packet from
> >>> vm1 to vm2 (if no conntrack used, which could introduce
> >>> recirculations).
> >>>
> >>> Do I understand correctly, that 

Re: [ovs-discuss] OVN: scaling L2 networks beyond 10k chassis - proposals

2023-09-30 Thread Vladislav Odintsov via discuss


regards,
Vladislav Odintsov

> On 30 Sep 2023, at 16:50, Robin Jarry  wrote:
> 
> Hi Vladislav, Frode,
> 
> Thanks for your replies.
> 
> Frode Nordahl, Sep 30, 2023 at 10:55:
>> On Sat, Sep 30, 2023 at 9:43 AM Vladislav Odintsov via discuss
>>  wrote:
>>>> On 29 Sep 2023, at 18:14, Robin Jarry via discuss 
>>>>  wrote:
>>>> 
>>>> Felix Huettner, Sep 29, 2023 at 15:23:
>>>>>> Distributed mac learning
>>>>>> 
>>>> [snip]
>>>>>> 
>>>>>> Cons:
>>>>>> 
>>>>>> - How to manage seamless upgrades?
>>>>>> - Requires ovn-controller to move/plug ports in the correct bridge.
>>>>>> - Multiple openflow connections (one per managed bridge).
>>>>>> - Requires ovn-trace to be reimplemented differently (maybe other tools
>>>>>> as well).
>>>>> 
>>>>> - No central information anymore on mac bindings. All nodes need to
>>>>> update their data individually
>>>>> - Each bridge generates also a linux network interface. I do not know if
>>>>> there is some kind of limit to the linux interfaces or the ovs bridges
>>>>> somewhere.
>>>> 
>>>> That's a good point. However, only the bridges related to one
>>>> implemented logical network would need to be created on a single
>>>> chassis. Even with the largest OVN deployments, I doubt this would be
>>>> a limitation.
>>>> 
>>>>> Would you still preprovision static mac addresses on the bridge for all
>>>>> port_bindings we know the mac address from, or would you rather leave
>>>>> that up for learning as well?
>>>> 
>>>> I would leave everything dynamic.
>>>> 
>>>>> I do not know if there is some kind of performance/optimization penality
>>>>> for moving packets between different bridges.
>>>> 
>>>> As far as I know, once the openflow pipeline has been resolved into
>>>> a datapath flow, there is no penalty.
>>>> 
>>>>> You can also not only use the logical switch that have a local port
>>>>> bound. Assume the following topology:
>>>>> +---+ +---+ +---+ +---+ +---+ +---+ +---+
>>>>> |vm1+-+ls1+-+lr1+-+ls2+-+lr2+-+ls3+-+vm2|
>>>>> +---+ +---+ +---+ +---+ +---+ +---+ +---+
>>>>> vm1 and vm2 are both running on the same hypervisor. Creating only local
>>>>> logical switches would mean only ls1 and ls3 are available on that
>>>>> hypervisor. This would break the connection between the two vms which
>>>>> would in the current implementation just traverse the two logical
>>>>> routers.
>>>>> I guess we would need to create bridges for each locally reachable
>>>>> logical switch. I am concerned about the potentially significant
>>>>> increase in bridges and openflow connections this brings.
>>>> 
>>>> That is one of the concerns I raised in the last point. In my opinion
>>>> this is a trade off. You remove centralization and require more local
>>>> processing. But overall, the processing cost should remain equivalent.
>>> 
>>> Just want to clarify.
>>> 
>>> For topology described by Felix above, you propose to create 2 OVS
>>> bridges, right? How will the packet traverse from vm1 to vm2?
> 
> In this particular case, there would be 3 OVS bridges, one for each
> logical switch.

Yeah, agree, this is a typo. Below I named three bridges :).

> 
>>> Currently when the packet enters OVS all the logical switching and
>>> routing openflow calculation is done with no packet re-entering OVS,
>>> and this results in one DP flow match to deliver this packet from
>>> vm1 to vm2 (if no conntrack used, which could introduce
>>> recirculations).
>>> 
>>> Do I understand correctly, that in this proposal OVS needs to
>>> receive packet from “ls1” bridge, next run through lrouter “lr1”
>>> OpenFlow pipelines, then output packet to “ls2” OVS bridge for mac
>>> learning between logical routers (should we have here OF flow with
>>> learn action?), then send packet again to OVS, calculate “lr2”
>>> OpenFlow pipeline and finally reach destination OVS bridge “ls3” to
>>> send packet to a vm2?
> 
> What I am proposing is to implement the northbound L2 network intent
> with actual OVS bridges

Re: [ovs-discuss] OVN: scaling L2 networks beyond 10k chassis - proposals

2023-09-30 Thread Vladislav Odintsov via discuss
Hi Robin,

Please, see inline.

regards,
Vladislav Odintsov

> On 29 Sep 2023, at 18:14, Robin Jarry via discuss 
>  wrote:
> 
> Felix Huettner, Sep 29, 2023 at 15:23:
>>> Distributed mac learning
>>> 
> [snip]
>>> 
>>> Cons:
>>> 
>>> - How to manage seamless upgrades?
>>> - Requires ovn-controller to move/plug ports in the correct bridge.
>>> - Multiple openflow connections (one per managed bridge).
>>> - Requires ovn-trace to be reimplemented differently (maybe other tools
>>>  as well).
>> 
>> - No central information anymore on mac bindings. All nodes need to
>>  update their data individually
>> - Each bridge generates also a linux network interface. I do not know if
>>  there is some kind of limit to the linux interfaces or the ovs bridges
>>  somewhere.
> 
> That's a good point. However, only the bridges related to one
> implemented logical network would need to be created on a single
> chassis. Even with the largest OVN deployments, I doubt this would be
> a limitation.
> 
>> Would you still preprovision static mac addresses on the bridge for all
>> port_bindings we know the mac address from, or would you rather leave
>> that up for learning as well?
> 
> I would leave everything dynamic.
> 
>> I do not know if there is some kind of performance/optimization penality
>> for moving packets between different bridges.
> 
> As far as I know, once the openflow pipeline has been resolved into
> a datapath flow, there is no penalty.
> 
>> You can also not only use the logical switch that have a local port
>> bound. Assume the following topology:
>> +---+ +---+ +---+ +---+ +---+ +---+ +---+
>> |vm1+-+ls1+-+lr1+-+ls2+-+lr2+-+ls3+-+vm2|
>> +---+ +---+ +---+ +---+ +---+ +---+ +---+
>> vm1 and vm2 are both running on the same hypervisor. Creating only local
>> logical switches would mean only ls1 and ls3 are available on that
>> hypervisor. This would break the connection between the two vms which
>> would in the current implementation just traverse the two logical
>> routers.
>> I guess we would need to create bridges for each locally reachable
>> logical switch. I am concerned about the potentially significant
>> increase in bridges and openflow connections this brings.
> 
> That is one of the concerns I raised in the last point. In my opinion
> this is a trade off. You remove centralization and require more local
> processing. But overall, the processing cost should remain equivalent.

Just want to clarify.
For the topology described by Felix above, you propose to create 2 OVS bridges, 
right? How will the packet traverse from vm1 to vm2? 

Currently, when the packet enters OVS, all the logical switching and routing 
OpenFlow calculation is done with no packet re-entering OVS, and this results 
in one DP flow match to deliver the packet from vm1 to vm2 (if no conntrack is 
used, which could introduce recirculations).
Do I understand correctly that in this proposal OVS needs to receive the packet 
from the “ls1” bridge, next run it through the “lr1” logical router OpenFlow 
pipeline, then output the packet to the “ls2” OVS bridge for MAC learning between 
logical routers (should we have an OF flow with a learn action here?), then send 
the packet to OVS again, calculate the “lr2” OpenFlow pipeline and finally reach 
the destination OVS bridge “ls3” to send the packet to vm2? 
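
As a side note, the current single-datapath-flow behaviour described above can be 
checked on a chassis with ovs-appctl, e.g. (output is setup-specific):

ovs-appctl dpctl/dump-flows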

Also, will such behavior remain compatible with HW offload to SmartNICs/DPUs?

> 
>>> Use multicast for overlay networks
>>> ==
> [snip]
>>> - 24bit VNI allows for more than 16 million logical switches. No need
>>>  for extended GENEVE tunnel options.
>> Note that using vxlan at the moment significantly reduces the ovn
>> featureset. This is because the geneve header options are currently used
>> for data that would not fit into the vxlan vni.
>> 
>> From ovn-architecture.7.xml:
>> ```
>> The maximum number of networks is reduced to 4096.
>> The maximum number of ports per network is reduced to 2048.
>> ACLs matching against logical ingress port identifiers are not supported.
>> OVN interconnection feature is not supported.
>> ```
> 
> In my understanding, the main reason why GENEVE replaced VXLAN is
> because Openstack uses full mesh point to point tunnels and that the
> sender needs to know behind which chassis any mac address is to send it
> into the correct tunnel. GENEVE allowed to reduce the lookup time both
> on the sender and receiver thanks to ingress/egress port metadata.
> 
> https://blog.russellbryant.net/2017/05/30/ovn-geneve-vs-vxlan-does-it-matter/
> https://dani.foroselectronica.es/ovn-geneve-encapsulation-541/
> 
> If VXLAN + multicast and address learning was used, the "correct" tunnel
> would be established ad-hoc and both sender and receiver lookups would
> only be a simple mac forwarding with learning. The ingress pipeline
> would probably cost a little more.
> 
> Maybe multicast + address learning could be implemented for GENEVE as
> well. But it would not be interoperable with other VTEPs.
> 
>>> - Limited 

Re: [ovs-discuss] OVN: DHCP Support for vtep port type

2023-08-31 Thread Vladislav Odintsov via discuss
Hi Austin,

There is a special function in northd.c to handle hairpin traffic - 
build_vtep_hairpin().
The mentioned flows are not matched due to flow [1], which by default sends all 
traffic received from a vtep lport to the l2lkp table for local delivery.
As you correctly noticed, there is no handling for "special" traffic 
coming from a VTEP.
But there is a mechanism introduced in [2], [3], where you can attach the LS to 
an LR and set a gateway chassis for the associated LRP on one of your chassis.
ARP responses for LRPs are handled with such logic [4].

You can try to add similar logic for DHCP traffic: set the loopback flag and 
resubmit the packet to the appropriate table. This may not be enough.
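
As an illustration of the gateway-chassis part (port and chassis names below are 
hypothetical), pinning the LRP to a particular chassis looks roughly like:

ovn-nbctl lrp-set-gateway-chassis lrp-ext hv1 10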

Hope this helps.

1: https://github.com/ovn-org/ovn/blob/0d021216c/northd/northd.c#L8374-L8377
2: 
https://github.com/ovn-org/ovn/commit/4e90bcf55c2ef1ab9940836455e4bfe36e57f307
3: 
https://github.com/ovn-org/ovn/commit/f20f664bc962094e4679ff2d3a8d834637bff27f
4: https://github.com/ovn-org/ovn/blob/0d021216c/northd/northd.c#L8403-L8405

> On 31 Aug 2023, at 21:04, Austin Cormier via discuss 
>  wrote:
> 
> I’m trying to understand whether DHCP is supported (or can easily be 
> supported) for vtep port types.  The use case I am driving towards is to 
> support DHCP/PXE booting for hardware over vxlan overlay network via a TOR 
> switch.  I believe this could work using Neutron DHCP Agent but we’ve hit 
> scaling issues in that area so we’re attempting to use OVN DHCP server if 
> possible.
>  
> In the NB DB, the vtep port is added to the ironic provisioning network with 
> both the IP/MAC set.
>  
>  
> $ ovn-nbctl show
> …
> switch 99a907e0-7525-4d4e-a029-fddaafeff3e5 
> (neutron-a2d49759-d8fa-43a3-8620-35201420f72b) (aka Ironic Provisioning)
> port provnet-2fd9fb49-7e68-4f86-8690-aec776e9f128
> type: localnet
> addresses: ["unknown"]
> port 2ebf0521-3f7e-44b6-848f-82a0214e3c25
> type: localport
> addresses: ["fa:16:3e:7f:e9:56 10.90.100.1"]
> port 23af2a74-0333-4600-9e11-e99bff860999
> type: vtep
> addresses: ["0c:c4:7a:ac:75:4e 10.90.100.32"]
>  
>  
> In the SB database, I see the port binding under the data-switch-01 chassis:
>  
> $ ovn-sbctl show
> Chassis compute-test
> hostname: compute-test
> Encap vxlan
> ip: "10.40.0.1"
> options: {csum="true"}
> Port_Binding "04901637-9467-4f01-b16b-e22de768a1b8"
> Chassis data-switch-01
> Encap vxlan
> ip: "17.17.17.17"
> options: {csum="false"}
> Port_Binding "23af2a74-0333-4600-9e11-e99bff860999"
> Chassis controller-test
> hostname: controller-test
> Encap geneve
> ip: "10.40.0.3"
> options: {csum="true"}
> Encap vxlan
> ip: "10.40.0.3"
> options: {csum="true"}
> Port_Binding cr-lrp-876f85ad-0a6f-4668-8736-e122fedc32c2
>  
>  
>  
> In the logical flow list, it appears that the DHCP responses are added:
>  
>   table=19(ls_in_arp_rsp  ), priority=50   , match=(arp.tpa == 
> 10.90.100.1 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 
> fa:16:3e:7f:e9:56; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 
> fa:16:3e:7f:e9:56; arp.tpa = arp.spa; arp.spa = 10.90.100.1; ou
> tport = inport; flags.loopback = 1; output;)
>   table=19(ls_in_arp_rsp  ), priority=50   , match=(arp.tpa == 
> 10.90.100.32 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 
> 0c:c4:7a:ac:75:4e; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 
> 0c:c4:7a:ac:75:4e; arp.tpa = arp.spa; arp.spa = 10.90.100.32;
> outport = inport; flags.loopback = 1; output;)
>   table=19(ls_in_arp_rsp  ), priority=0, match=(1), action=(next;)
>   table=20(ls_in_dhcp_options ), priority=100  , match=(inport == 
> "23af2a74-0333-4600-9e11-e99bff860999" && eth.src == 0c:c4:7a:ac:75:4e && 
> ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst 
> == 67), action=(reg0[3] = put_dhcp_opts(offerip = 10.90
> .100.32, bootfile_name = http://192.168.3.3:8089/boot.ipxe, bootfile_name_alt 
> = "snponly.efi", classless_static_route = {169.254.169.254/32,10.90.100.1}, 
> dns_server = {127.0.0.53}, lease_time = 43200, mtu = 1500, netmask = 
> 255.255.0.0, next_server = 192.168.3.3, server_
> id = 10.90.100.1, tftp_server = "192.168.3.3", tftp_server_address = 
> 192.168.3.3); next;)
> table=20(ls_in_dhcp_options ), priority=100  , match=(inport == 
> "23af2a74-0333-4600-9e11-e99bff860999" && eth.src == 0c:c4:7a:ac:75:4e && 
> ip4.src == 10.90.100.32 && ip4.dst == {10.90.100.1, 255.255.255.255} && 
> udp.src == 68 && udp.dst == 67), action=(reg0[3] = put_dhcp_
> opts(offerip = 10.90.100.32, bootfile_name = 
> http://192.168.3.3:8089/boot.ipxe, bootfile_name_alt = "snponly.efi", 
> classless_static_route = {169.254.169.254/32,10.90.100.1}, dns_server = 
> {127.0.0.53}, lease_time = 43200, mtu = 1500, netmask = 255.255.0.0, 
> next_server =
> 192.168.3.3, server_id = 10.90.100.1, tftp_server = "192.168.3.3", 
> tftp_server_address = 

Re: [ovs-discuss] [ovn] Routing loop in OVN crashes OVS with segmentation violation

2023-06-15 Thread Vladislav Odintsov via discuss
Also, it’s probably worth detecting routing loops on the OVN side and not 
creating the SB logical flows associated with the affected routes.
But it may not be easy to build a full map of all interconnected routes and 
find the routes which create a loop...

> On 15 Jun 2023, at 18:26, Mike Pattrick  wrote:
> 
> On Thu, Jun 15, 2023 at 11:11 AM Eelco Chaudron  <mailto:echau...@redhat.com>> wrote:
>> 
>> 
>> 
>> On 15 Jun 2023, at 17:07, Vladislav Odintsov wrote:
>> 
>>>> On 15 Jun 2023, at 16:16, Eelco Chaudron via discuss 
>>>>  wrote:
>>>> 
>>>> 
>>>> 
>>>> On 15 Jun 2023, at 14:36, Vladislav Odintsov via discuss wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I’ve faced condition in flow lookup where OVS crashes with segmentation 
>>>>> violation because of insufficient stack limit size for ovs-vswitchd 
>>>>> daemon.
>>>>> Below is the reproducer:
>>>>> 
>>>>> # --->
>>>>> # Ensure there is a default LimitSTACK in ovs-vswitchd.service file with 
>>>>> which OVS is run (should be 2M):
>>>>> grep LimitSTACK /usr/lib/systemd/system/ovs-vswitchd.service
>>>>> 
>>>>> # create 2 LRs and connect them via ls
>>>>> ovn-nbctl lr-add lr1
>>>>> ovn-nbctl lr-add lr2
>>>>> ovn-nbctl lrp-add lr1 lrp1 00:00:00:00:00:01 10.0.0.1/24
>>>>> ovn-nbctl lrp-add lr2 lrp2 00:00:00:00:00:02 10.0.0.2/24
>>>>> ovn-nbctl ls-add ls
>>>>> ovn-nbctl lsp-add ls ls-lrp1 -- lsp-set-type ls-lrp1 router -- 
>>>>> lsp-set-addresses ls-lrp1 router -- lsp-set-options ls-lrp1 
>>>>> router-port=lrp1
>>>>> ovn-nbctl lsp-add ls ls-lrp2 -- lsp-set-type ls-lrp2 router -- 
>>>>> lsp-set-addresses ls-lrp2 router -- lsp-set-options ls-lrp2 
>>>>> router-port=lrp2
>>>>> 
>>>>> # create route to same cidr looping routing
>>>>> ovn-nbctl lr-route-add lr1 1.1.1.1/32 10.0.0.2
>>>>> ovn-nbctl lr-route-add lr2 1.1.1.1/32 10.0.0.1
>>>>> 
>>>>> # create vif lport and configure it
>>>>> ovn-nbctl lsp-add ls lp1 -- lsp-set-addresses lp1 00:00:00:00:00:f1
>>>>> ovs-vsctl add-port br-int lp1 -- set int lp1 type=internal 
>>>>> external_ids:iface-id=lp1
>>>>> ip li set lp1 addr 00:00:00:00:00:f1
>>>>> ip a add 10.0.0.200/24 dev lp1
>>>>> ip li set lp1 up
>>>>> ip r add 1.1.1.1/32 via 10.0.0.1
>>>>> ping 1.1.1.1 -c1
>>>>> 
>>>>> # <---
>>>>> 
>>>>> This problem was first described in [1] and continued in [2].
>>>>> I’m wonder whether [2] was discussed somewhere in another place or had no 
>>>>> resolution.
>>>>> 
>>>>> OVS crash reproduces on different versions: 2.13, 2.17, 3.1.
>>>>> Default stack limit shipped with OVS looks not enough to reach 'Recursion 
>>>>> too deep'. In my tests for this reproducer it is needed at least 2293K to 
>>>>> work properly.
>>>>> 
>>>>> I understand, that such configuration should be validated and avoided 
>>>>> from the CMS side, but I think that there should be no possibility so 
>>>>> easily bring system to crashed state.
>>>>> 
>>>>> Should the default OVS StackLimit in systemd.unit be increased?
>>>>> Or, maybe, OVN should document the need to increase OVS stack limit 
>>>>> manually by users?
>>>>> Or, should OVN supply systemd drop-in unit to override default OVS 
>>>>> StackLimit?
>>>>> 
>>>>> 1: https://bugzilla.redhat.com/show_bug.cgi?id=1821185#c3
>>>>> 2: https://mail.openvswitch.org/pipermail/ovs-dev/2020-April/369776.html
>>>> 
>>>> There is a recent patch series sent out by Mike, you might want to give 
>>>> that a try.
>>>> 
>>>> https://patchwork.ozlabs.org/project/openvswitch/list/?submitter=82705
>>>> 
>>>> 
>>>> Would be interesting to see if that solves the crash part.
>>> 
>>> Hi Eelco,
>>> 
>>> thank you for pointing out to this patch.
>>> 
>>> I’ve tried it and it seems working with default ovs-vswitchd stack limit 
>>> (2M)! That’s cool.
>>> Also, I’ve tried to find new watermark for the described scenario (to find 
>>> a new stack limit value, where ovs-vswitchd will crash with segv).
>>> So, its value decreased from 2293KB to 1633K.
>> 
>> Thanks for testing, this is good news :)
>> 
>>> Thanks @Mike for your improvement!
>>> 
>>> Can this patch be considered for backporting after merge to upstream as it 
>>> fixes this issue?
>> 
>> Guess this is up to the maintainers to decide, maybe Mike cant tell how 
>> impactful the change is.
>> I still need to review the latest revision, so can’t comment on this from 
>> the top of my head.
> 
> I just tested applying it back to 2.15. It applied relatively cleanly
> with only minor changes required. So I think a backport is reasonable,
> but as Eelco said this is up to the maintainers.
> 
> 
> -M
> 
>> 
>> //Eelco


Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] [ovn] Routing loop in OVN crashes OVS with segmentation violation

2023-06-15 Thread Vladislav Odintsov via discuss


> On 15 Jun 2023, at 16:16, Eelco Chaudron via discuss 
>  wrote:
> 
> 
> 
> On 15 Jun 2023, at 14:36, Vladislav Odintsov via discuss wrote:
> 
>> Hi all,
>> 
>> I’ve faced condition in flow lookup where OVS crashes with segmentation 
>> violation because of insufficient stack limit size for ovs-vswitchd daemon.
>> Below is the reproducer:
>> 
>> # --->
>> # Ensure there is a default LimitSTACK in ovs-vswitchd.service file with 
>> which OVS is run (should be 2M):
>> grep LimitSTACK /usr/lib/systemd/system/ovs-vswitchd.service
>> 
>> # create 2 LRs and connect them via ls
>> ovn-nbctl lr-add lr1
>> ovn-nbctl lr-add lr2
>> ovn-nbctl lrp-add lr1 lrp1 00:00:00:00:00:01 10.0.0.1/24
>> ovn-nbctl lrp-add lr2 lrp2 00:00:00:00:00:02 10.0.0.2/24
>> ovn-nbctl ls-add ls
>> ovn-nbctl lsp-add ls ls-lrp1 -- lsp-set-type ls-lrp1 router -- 
>> lsp-set-addresses ls-lrp1 router -- lsp-set-options ls-lrp1 router-port=lrp1
>> ovn-nbctl lsp-add ls ls-lrp2 -- lsp-set-type ls-lrp2 router -- 
>> lsp-set-addresses ls-lrp2 router -- lsp-set-options ls-lrp2 router-port=lrp2
>> 
>> # create route to same cidr looping routing
>> ovn-nbctl lr-route-add lr1 1.1.1.1/32 10.0.0.2
>> ovn-nbctl lr-route-add lr2 1.1.1.1/32 10.0.0.1
>> 
>> # create vif lport and configure it
>> ovn-nbctl lsp-add ls lp1 -- lsp-set-addresses lp1 00:00:00:00:00:f1
>> ovs-vsctl add-port br-int lp1 -- set int lp1 type=internal 
>> external_ids:iface-id=lp1
>> ip li set lp1 addr 00:00:00:00:00:f1
>> ip a add 10.0.0.200/24 dev lp1
>> ip li set lp1 up
>> ip r add 1.1.1.1/32 via 10.0.0.1
>> ping 1.1.1.1 -c1
>> 
>> # <---
>> 
>> This problem was first described in [1] and continued in [2].
>> I’m wonder whether [2] was discussed somewhere in another place or had no 
>> resolution.
>> 
>> OVS crash reproduces on different versions: 2.13, 2.17, 3.1.
>> Default stack limit shipped with OVS looks not enough to reach 'Recursion 
>> too deep'. In my tests for this reproducer it is needed at least 2293K to 
>> work properly.
>> 
>> I understand, that such configuration should be validated and avoided from 
>> the CMS side, but I think that there should be no possibility so easily 
>> bring system to crashed state.
>> 
>> Should the default OVS StackLimit in systemd.unit be increased?
>> Or, maybe, OVN should document the need to increase OVS stack limit manually 
>> by users?
>> Or, should OVN supply systemd drop-in unit to override default OVS 
>> StackLimit?
>> 
>> 1: https://bugzilla.redhat.com/show_bug.cgi?id=1821185#c3
>> 2: https://mail.openvswitch.org/pipermail/ovs-dev/2020-April/369776.html
> 
> There is a recent patch series sent out by Mike, you might want to give that 
> a try.
> 
> https://patchwork.ozlabs.org/project/openvswitch/list/?submitter=82705
> 
> 
> Would be interesting to see if that solves the crash part.

Hi Eelco,

thank you for pointing out this patch.

I’ve tried it and it seems to work with the default ovs-vswitchd stack limit 
(2M)! That’s cool.
Also, I’ve tried to find the new watermark for the described scenario (the stack 
limit value below which ovs-vswitchd crashes with SIGSEGV).
Its value decreased from 2293K to 1633K.

Thanks @Mike for your improvement!

Can this patch be considered for backporting after it is merged upstream, as it 
fixes this issue?

> 
> //Eelco
> 
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] [ovn] Routing loop in OVN crashes OVS with segmentation violation

2023-06-15 Thread Vladislav Odintsov via discuss
Hi all,

I’ve faced a condition in flow lookup where OVS crashes with a segmentation 
violation because of an insufficient stack limit for the ovs-vswitchd daemon.
Below is the reproducer:

# --->
# Ensure there is a default LimitSTACK in ovs-vswitchd.service file with which 
OVS is run (should be 2M):
grep LimitSTACK /usr/lib/systemd/system/ovs-vswitchd.service

# create 2 LRs and connect them via ls
ovn-nbctl lr-add lr1
ovn-nbctl lr-add lr2
ovn-nbctl lrp-add lr1 lrp1 00:00:00:00:00:01 10.0.0.1/24
ovn-nbctl lrp-add lr2 lrp2 00:00:00:00:00:02 10.0.0.2/24
ovn-nbctl ls-add ls
ovn-nbctl lsp-add ls ls-lrp1 -- lsp-set-type ls-lrp1 router -- 
lsp-set-addresses ls-lrp1 router -- lsp-set-options ls-lrp1 router-port=lrp1
ovn-nbctl lsp-add ls ls-lrp2 -- lsp-set-type ls-lrp2 router -- 
lsp-set-addresses ls-lrp2 router -- lsp-set-options ls-lrp2 router-port=lrp2

# create route to same cidr looping routing
ovn-nbctl lr-route-add lr1 1.1.1.1/32 10.0.0.2
ovn-nbctl lr-route-add lr2 1.1.1.1/32 10.0.0.1

# create vif lport and configure it
ovn-nbctl lsp-add ls lp1 -- lsp-set-addresses lp1 00:00:00:00:00:f1
ovs-vsctl add-port br-int lp1 -- set int lp1 type=internal 
external_ids:iface-id=lp1
ip li set lp1 addr 00:00:00:00:00:f1
ip a add 10.0.0.200/24 dev lp1
ip li set lp1 up
ip r add 1.1.1.1/32 via 10.0.0.1
ping 1.1.1.1 -c1

# <---

This problem was first described in [1] and continued in [2].
I’m wondering whether [2] was discussed somewhere else or simply had no 
resolution.

The OVS crash reproduces on different versions: 2.13, 2.17, 3.1.
The default stack limit shipped with OVS is not enough to reach 'Recursion too 
deep'. In my tests, this reproducer needs a stack limit of at least 2293K to 
work properly.

I understand that such a configuration should be validated and avoided on the 
CMS side, but I think it should not be possible to bring the system to a crashed 
state so easily.

Should the default OVS LimitSTACK in the systemd unit be increased?
Or should OVN document the need for users to increase the OVS stack limit 
manually?
Or should OVN supply a systemd drop-in unit to override the default OVS 
LimitSTACK?
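
If the drop-in route were taken, a minimal sketch (path and value are purely 
illustrative, not a tested recommendation) could look like:

mkdir -p /etc/systemd/system/ovs-vswitchd.service.d
cat > /etc/systemd/system/ovs-vswitchd.service.d/stack.conf <<'EOF'
[Service]
LimitSTACK=8M
EOF
systemctl daemon-reload && systemctl restart ovs-vswitchd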

1: https://bugzilla.redhat.com/show_bug.cgi?id=1821185#c3
2: https://mail.openvswitch.org/pipermail/ovs-dev/2020-April/369776.html

Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] CPU pinned at 100% , ovn-controller to ovnsb_db unstable

2023-05-10 Thread Vladislav Odintsov via discuss


> On 10 May 2023, at 17:15, Vladislav Odintsov  wrote:
> 
> Hi all,
> 
>> On 3 May 2023, at 15:11, Ilya Maximets  wrote:
>> 
>> On 5/3/23 12:47, Vladislav Odintsov wrote:
>>> Thanks Ilya for your inputs.
>>> 
>>>> On 2 May 2023, at 21:49, Ilya Maximets  wrote:
>>>> 
>>>> On 5/2/23 19:22, Ilya Maximets wrote:
>>>>> On 5/2/23 19:04, Vladislav Odintsov via discuss wrote:
>>>>>> I ran perf record -F99 -p $(ovsdb-server) -- sleep 30 on ovsdb-server 
>>>>>> process during CPU spike. perf report result:
>>>>>> 
>>>>> 
>>>>> Could you run it for a couple of minutes during that 5-6 minute window?
>>> 
>>> Sure, here it is (this report was collected during ~3 minutes while 
>>> ovsdb-server was under 100% CPU load):
>>> 
>>> # To display the perf.data header info, please use --header/--header-only 
>>> options.
>>> #
>>> #
>>> # Total Lost Samples: 0
>>> #
>>> # Samples: 12K of event 'cpu-clock'
>>> # Event count (approx.): 130030301730
>>> #
>>> # Overhead  Command   Shared ObjectSymbol
>>> #     ...  
>>> ..
>>> #
>>> 21.20%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] uuid_compare_3way
>>> 10.49%  ovsdb-server  libc-2.17.so [.] 
>>> malloc_consolidate
>>> 10.04%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
>>> ovsdb_clause_evaluate
>>>  9.40%  ovsdb-server  libc-2.17.so [.] _int_malloc
>>>  6.42%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_destroy__
>>>  4.36%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
>>> ovsdb_atom_compare_3way
>>>  3.29%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
>>> json_serialize_string
>>>  3.23%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
>>> ovsdb_condition_match_any_clause
>>>  3.05%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_serialize
>>>  2.60%  ovsdb-server  [kernel.kallsyms][k] clear_page_c_e
>>>  1.87%  ovsdb-server  libc-2.17.so [.] 
>>> __memcpy_ssse3_back
>>>  1.80%  ovsdb-server  libc-2.17.so [.] free
>>>  1.67%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
>>> json_serialize_object_member
>>>  1.60%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
>>> ovsdb_atom_is_default
>>>  1.47%  ovsdb-server  libc-2.17.so [.] vfprintf
>>>  1.17%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] resize
>>>  1.12%  ovsdb-server  libc-2.17.so [.] _int_free
>>>  1.10%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
>>> ovsdb_atom_compare_3way@plt
>>>  1.05%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] shash_find__
>> 
>> Thanks!  Yeah, the conditional monitoring appears to be the
>> main issue here.
>> 
>> 
>> 
>>> 
>>>>> Also, is it a single 5-6 minute poll interval or a several shorter ones
>>>>> (from the log)?
>>> 
>>> I see from monitoring that during these 5-6 minutes ovsdb-server utilizes 
>>> 100% of one core. With top command it is also constantly shown as 100%.
>>> 
>>> Worth to add that I see next log warnings in ovsdb-server relay (long poll 
>>> interval for 84 seconds):
>>> 
>>> 2023-05-03T10:21:53.928Z|11522|timeval|WARN|Unreasonably long 84348ms poll 
>>> interval (84270ms user, 14ms system)
>>> 2023-05-03T10:21:53.931Z|11523|timeval|WARN|context switches: 0 voluntary, 
>>> 229 involuntary
>>> 2023-05-03T10:21:53.933Z|11524|coverage|INFO|Skipping details of duplicate 
>>> event coverage for hash=580d57f8
>>> 2023-05-03T10:21:53.935Z|11525|poll_loop|INFO|wakeup due to [POLLIN] on fd 
>>> 21 (0.0.0.0:6642<->) at lib/stream-ssl.c:978 (99% CPU usage)
>>> 2023-05-03T10:21:54.094Z|11526|stream_ssl|WARN|SSL_write: system error 
>>> (Broken pipe)
>>> 2023-05-03T10:21:54.096Z|11527|jsonrpc|WARN|ssl:x.x.x.x:46894: send error: 
>>> Broken pipe
>>> 2023-05-03T10:21:54.120Z|11528|stream_ssl|WARN|SSL_accept: unexpected SSL 
>>> connection close
>>> 2023-05-03T10:21:54.120Z|11529|jsonrpc|WARN|ssl:x.x.x.x:46950: receive 
>>> error: Protocol error
>>> 2023-05-03T10:21:54.122Z|11530|poll_loo

Re: [ovs-discuss] CPU pinned at 100% , ovn-controller to ovnsb_db unstable

2023-05-10 Thread Vladislav Odintsov via discuss
Hi all,

> On 3 May 2023, at 15:11, Ilya Maximets  wrote:
> 
> On 5/3/23 12:47, Vladislav Odintsov wrote:
>> Thanks Ilya for your inputs.
>> 
>>> On 2 May 2023, at 21:49, Ilya Maximets  wrote:
>>> 
>>> On 5/2/23 19:22, Ilya Maximets wrote:
>>>> On 5/2/23 19:04, Vladislav Odintsov via discuss wrote:
>>>>> I ran perf record -F99 -p $(ovsdb-server) -- sleep 30 on ovsdb-server 
>>>>> process during CPU spike. perf report result:
>>>>> 
>>>> 
>>>> Could you run it for a couple of minutes during that 5-6 minute window?
>> 
>> Sure, here it is (this report was collected during ~3 minutes while 
>> ovsdb-server was under 100% CPU load):
>> 
>> # To display the perf.data header info, please use --header/--header-only 
>> options.
>> #
>> #
>> # Total Lost Samples: 0
>> #
>> # Samples: 12K of event 'cpu-clock'
>> # Event count (approx.): 130030301730
>> #
>> # Overhead  Command   Shared ObjectSymbol
>> #     ...  
>> ..
>> #
>> 21.20%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] uuid_compare_3way
>> 10.49%  ovsdb-server  libc-2.17.so [.] malloc_consolidate
>> 10.04%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
>> ovsdb_clause_evaluate
>>  9.40%  ovsdb-server  libc-2.17.so [.] _int_malloc
>>  6.42%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_destroy__
>>  4.36%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
>> ovsdb_atom_compare_3way
>>  3.29%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
>> json_serialize_string
>>  3.23%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
>> ovsdb_condition_match_any_clause
>>  3.05%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_serialize
>>  2.60%  ovsdb-server  [kernel.kallsyms][k] clear_page_c_e
>>  1.87%  ovsdb-server  libc-2.17.so [.] 
>> __memcpy_ssse3_back
>>  1.80%  ovsdb-server  libc-2.17.so [.] free
>>  1.67%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
>> json_serialize_object_member
>>  1.60%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
>> ovsdb_atom_is_default
>>  1.47%  ovsdb-server  libc-2.17.so [.] vfprintf
>>  1.17%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] resize
>>  1.12%  ovsdb-server  libc-2.17.so [.] _int_free
>>  1.10%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
>> ovsdb_atom_compare_3way@plt
>>  1.05%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] shash_find__
> 
> Thanks!  Yeah, the conditional monitoring appears to be the
> main issue here.
> 
> 
> 
>> 
>>>> Also, is it a single 5-6 minute poll interval or a several shorter ones
>>>> (from the log)?
>> 
>> I see from monitoring that during these 5-6 minutes ovsdb-server utilizes 
>> 100% of one core. With top command it is also constantly shown as 100%.
>> 
>> Worth to add that I see next log warnings in ovsdb-server relay (long poll 
>> interval for 84 seconds):
>> 
>> 2023-05-03T10:21:53.928Z|11522|timeval|WARN|Unreasonably long 84348ms poll 
>> interval (84270ms user, 14ms system)
>> 2023-05-03T10:21:53.931Z|11523|timeval|WARN|context switches: 0 voluntary, 
>> 229 involuntary
>> 2023-05-03T10:21:53.933Z|11524|coverage|INFO|Skipping details of duplicate 
>> event coverage for hash=580d57f8
>> 2023-05-03T10:21:53.935Z|11525|poll_loop|INFO|wakeup due to [POLLIN] on fd 
>> 21 (0.0.0.0:6642<->) at lib/stream-ssl.c:978 (99% CPU usage)
>> 2023-05-03T10:21:54.094Z|11526|stream_ssl|WARN|SSL_write: system error 
>> (Broken pipe)
>> 2023-05-03T10:21:54.096Z|11527|jsonrpc|WARN|ssl:x.x.x.x:46894: send error: 
>> Broken pipe
>> 2023-05-03T10:21:54.120Z|11528|stream_ssl|WARN|SSL_accept: unexpected SSL 
>> connection close
>> 2023-05-03T10:21:54.120Z|11529|jsonrpc|WARN|ssl:x.x.x.x:46950: receive 
>> error: Protocol error
>> 2023-05-03T10:21:54.122Z|11530|poll_loop|INFO|wakeup due to [POLLIN] on fd 
>> 21 (0.0.0.0:6642<->) at lib/stream-ssl.c:978 (99% CPU usage)
>> 
>>>> 
>>>> And you seem to miss some debug symbols for libovsdb.
>> 
>> Thanks. Now I’ve installed openvswitch-debuginfo package.
>> 
>>>> 
>>>> One potentially quick fix for your setup would be to disable conditional
>>>> monitoring, i.e. set ovn-monitor-all=

Re: [ovs-discuss] CPU pinned at 100% , ovn-controller to ovnsb_db unstable

2023-05-03 Thread Vladislav Odintsov via discuss
Thanks Ilya for your inputs.

> On 2 May 2023, at 21:49, Ilya Maximets  wrote:
> 
> On 5/2/23 19:22, Ilya Maximets wrote:
>> On 5/2/23 19:04, Vladislav Odintsov via discuss wrote:
>>> I ran perf record -F99 -p $(ovsdb-server) -- sleep 30 on ovsdb-server 
>>> process during CPU spike. perf report result:
>>> 
>> 
>> Could you run it for a couple of minutes during that 5-6 minute window?

Sure, here it is (this report was collected during ~3 minutes while 
ovsdb-server was under 100% CPU load):

# To display the perf.data header info, please use --header/--header-only 
options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of event 'cpu-clock'
# Event count (approx.): 130030301730
#
# Overhead  Command   Shared ObjectSymbol
#     ...  
..
#
21.20%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] uuid_compare_3way
10.49%  ovsdb-server  libc-2.17.so [.] malloc_consolidate
10.04%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] ovsdb_clause_evaluate
 9.40%  ovsdb-server  libc-2.17.so [.] _int_malloc
 6.42%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_destroy__
 4.36%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
ovsdb_atom_compare_3way
 3.29%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_serialize_string
 3.23%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_condition_match_any_clause
 3.05%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_serialize
 2.60%  ovsdb-server  [kernel.kallsyms][k] clear_page_c_e
 1.87%  ovsdb-server  libc-2.17.so [.] __memcpy_ssse3_back
 1.80%  ovsdb-server  libc-2.17.so [.] free
 1.67%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
json_serialize_object_member
 1.60%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_atom_is_default
 1.47%  ovsdb-server  libc-2.17.so [.] vfprintf
 1.17%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] resize
 1.12%  ovsdb-server  libc-2.17.so [.] _int_free
 1.10%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_atom_compare_3way@plt
 1.05%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] shash_find__
 0.99%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_string
 0.87%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_monitor_changes_update
 0.79%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_datum_destroy
 0.71%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_datum_clone
 0.54%  ovsdb-server  libc-2.17.so [.] _IO_default_xsputn
 0.53%  ovsdb-server  libc-2.17.so [.] malloc
 0.35%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] hash_bytes
 0.33%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_monitor_compose_update
 0.30%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
ovsdb_datum_is_default
 0.28%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_monitor_change_set_destroy
 0.27%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] shash_add_nocopy__
 0.22%  ovsdb-server  [kernel.kallsyms][k] 
copy_user_enhanced_fast_string
 0.21%  ovsdb-server  ld-2.17.so   [.] __tls_get_addr
 0.20%  ovsdb-server  libc-2.17.so [.] __memcpy_sse2
 0.20%  ovsdb-server  libc-2.17.so [.] __strlen_sse2_pminub
 0.19%  ovsdb-server  [kernel.kallsyms][k] 
system_call_after_swapgs
 0.19%  ovsdb-server  libc-2.17.so [.] 
_IO_str_init_static_internal
 0.19%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] xmalloc
 0.19%  ovsdb-server  libc-2.17.so [.] __strchrnul
 0.19%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_monitor_compose_row_update2
 0.18%  ovsdb-server  [kernel.kallsyms][k] audit_filter_syscall
 0.17%  ovsdb-server  libc-2.17.so [.] _itoa_word
 0.15%  ovsdb-server  [kernel.kallsyms][k] 
audit_filter_rules.isra.8
 0.13%  ovsdb-server  [kernel.kallsyms][k] __do_softirq
 0.13%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] hmap_swap
 0.12%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] xmalloc__
 0.11%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ds_put_uninit
 0.11%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_monitor_get_initial
 0.10%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_atom_to_json__
 0.09%  ovsdb-server  libc-2.17.so [.] __strcmp_sse42
 0.09%  ovsdb-server  [vdso]   [.] __vdso_clock_gettime
 0.09%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ds_put_format_valist
 0.08%  ovsdb-server  [kernel.kallsyms][k] __d_lookup_rcu
 0.07%  ovsdb-server  libc-2.17.so

Re: [ovs-discuss] CPU pinned at 100% , ovn-controller to ovnsb_db unstable

2023-05-02 Thread Vladislav Odintsov via discuss
I ran perf record -F 99 -p $(pidof ovsdb-server) -- sleep 30 on the ovsdb-server 
process during a CPU spike. perf report result:

Samples: 2K of event 'cpu-clock', Event count (approx.): 29989898690
Overhead  Command   Shared ObjectSymbol
  58.71%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] uuid_compare_3way
   7.61%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00011058
   6.60%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_atom_compare_3way
   5.93%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_condition_match_any_clause
   2.26%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00011070
   2.19%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] uuid_compare_3way@plt
   2.16%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00010ffd
   1.68%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00010fe8
   1.65%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x000110ec
   1.38%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00011084
   1.31%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001100e
   1.25%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00011063
   1.08%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
ovsdb_datum_compare_3way
   1.04%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00010fe0
   0.67%  ovsdb-server  libc-2.17.so [.] _int_malloc
   0.54%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001105a
   0.51%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_monitor_get_update
   0.47%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_datum_hash
   0.30%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_destroy__
   0.24%  ovsdb-server  libc-2.17.so [.] malloc_consolidate
   0.24%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_atom_hash
   0.20%  ovsdb-server  [kernel.kallsyms][k] __do_softirq
   0.17%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] json_string
   0.13%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_datum_destroy
   0.13%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00010fe5
   0.10%  ovsdb-server  libc-2.17.so [.] __memset_sse2
   0.10%  ovsdb-server  libc-2.17.so [.] __strlen_sse2_pminub
   0.10%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00010fea
   0.10%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00011020
   0.10%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00011067
   0.07%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 
ovsdb_datum_compare_3way@plt
   0.07%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_datum_equals
   0.07%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00010ff1
   0.07%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00011025
   0.07%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001108b
   0.03%  ovsdb-server  [kernel.kallsyms][k] __netif_schedule
   0.03%  ovsdb-server  libc-2.17.so [.] __strncmp_sse42
   0.03%  ovsdb-server  libc-2.17.so [.] _int_free
   0.03%  ovsdb-server  libc-2.17.so [.] free
   0.03%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_datum_clone
   0.03%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] ovsdb_datum_from_json
   0.03%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] shash_find
   0.03%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 0x0018f7e2
   0.03%  ovsdb-server  libopenvswitch-3.1.so.0.0.0  [.] 0x0018f7e8
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 
ovsdb_condition_from_json
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00010fe1
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x00010fff
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001105e
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x000110c3
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x000110f2
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001c007
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001c030
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001c05b
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001c45f
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001c535
   0.03%  ovsdb-server  libovsdb-3.1.so.0.0.0[.] 0x0001c644

> On 2 May 2023, at 19:45, Ilya Maximets via discuss 
>  wrote:
> 
> On 5/2/23 16:45, Vladislav Odintsov wrote:
>> Hi Ilya,
>> 
>> let me jump into this thread.
>> 
>> Right now I’m debugging the behaviour of ovn (22.09.x) and ovsdb-server 
>> 3.1.0 where one ovsdb update3 request makes ovsdb-server, which acts as a 
>> relay for OVN Southbound DB with only 5-6 clients connected to it 
>> (ovn-controllers, acting as a central chassis for external access with 
>> enabled ha_group for edge LRs), 

Re: [ovs-discuss] CPU pinned at 100% , ovn-controller to ovnsb_db unstable

2023-05-02 Thread Vladislav Odintsov via discuss
Hi Ilya,

let me jump into this thread.

Right now I’m debugging the behaviour of OVN (22.09.x) and ovsdb-server 3.1.0, 
where a single ovsdb update3 request makes ovsdb-server, which acts as a relay 
for the OVN Southbound DB with only 5-6 clients connected to it (ovn-controllers 
acting as central chassis for external access, with ha_group enabled for edge 
LRs), utilize 100% CPU for 5-6 minutes.
During this time the ovsdb relay fails to answer ovsdb inactivity probes, and 
then clients and even upstream ovsdb-servers disconnect this relay because of 
the 60-second probe timeout. All the probe intervals are configured to 60 seconds 
(ovsdb-server SB cluster <-> ovsdb SB relay <-> ovn-controller). Earlier 
I posted a long-read with some of the problems listed [1].
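
For reference, by "probe intervals" I mean the usual knobs, roughly along these 
lines (values are illustrative):

# ovn-controller -> SB connection
ovs-vsctl set Open_vSwitch . external_ids:ovn-remote-probe-interval=60000
# SB side: in-database Connection row served to the clients
ovn-sbctl set connection . inactivity_probe=60000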

IIUC, this update is generated by ovn-northd after one LS with only one LSP of 
type router and an attached LB is removed.
You can see the request JSON here: [2]
Such updates appear not only when an LS/LB is removed but also in some other 
operations; this is just an example.
So it seems like ovn-northd re-creates a big DP group, and such an update is for 
some reason difficult for the ovsdb relay to handle (actually, ovn-controllers 
also utilize 100% CPU).

Have you seen such behaviour? Do you have any suggestions about the reason and a 
possible fix for such a huge load from a single update3?

Thanks.

1: https://mail.openvswitch.org/pipermail/ovs-dev/2023-April/403699.html
2: https://gist.github.com/odivlad/bba4443e589a268a0f389c2972511df3


> On 2 May 2023, at 14:49, Ilya Maximets via discuss 
>  wrote:
> 
> Form my side, the first option would be increasing the inactivity
> probe on the ovn-controller side and see if that resolves the issue.
> Deployments typically have 60+ seconds set, just in case.
> 
> Also, if you're not already using latest versions of OVS/OVN, upgrade
> may resolve the issue as well.  For example, OVS 2.17 provides a big
> performance improvement over previous versions and 3.0 and 3.1 give
> even more on top.  And with new OVN releases, the southbound database
> size usually goes down significantly reducing the load on the OVSDB
> server.  I'd suggest to use releases after OVN 22.09 for large scale
> deployments.
> 
> However, if your setup have only one switch with 250 ports and you
> have an issue, that should not really be related to scale and you
> need to investigate further on what exactly is happening.
> 
> Best regards, Ilya Maximets.
> 
> On 5/2/23 08:58, Felix Hüttner via discuss wrote:
>> Hi Gavin,
>> 
>> we saw similar issues after reaching a certain number of hypervisors. This 
>> happened because our ovsdb processes ran at 100% cpu utilization (and they 
>> are not multithreaded).
>> 
>> Our solutions where:
>> 
>> 1. If you use ssl on your north-/southbound db. Disable it and add a tls 
>> terminating reverse proxy (like traefik) in front
>> 2. Increase the inactivity probe significantly (you might need to change it 
>> on the ovn-controller and ovsdb side, not sure anymore)
>> 3. Introduce ovsdb relays and connect the ovn-controllers there.
>> 
>> --
>> 
>> Felix Huettner
>> 
>>  
>> 
>> *From:* discuss  *On Behalf Of *Gavin 
>> McKee via discuss
>> *Sent:* Monday, May 1, 2023 9:20 PM
>> *To:* ovs-discuss 
>> *Subject:* [ovs-discuss] CPU pinned at 100% , ovn-controller to ovnsb_db 
>> unstable
>> 
>> Hi ,
>> 
>> I'm having a pretty bad issue with OVN controller on the hypervisors being 
>> unable to connect to the OVS SB DB ,
>> 
>>  
>> 
>> 2023-05-01T19:13:33.969Z|00541|reconnect|ERR|tcp:10.193.1.2:6642 
>> : no response to inactivity probe after 5 seconds, 
>> disconnecting
>> 2023-05-01T19:13:33.969Z|00542|reconnect|INFO|tcp:10.193.1.2:6642 
>> : connection dropped
>> 2023-05-01T19:13:43.043Z|00543|reconnect|INFO|tcp:10.193.1.2:6642 
>> : connected
>> 2023-05-01T19:13:56.115Z|00544|reconnect|ERR|tcp:10.193.1.2:6642 
>> : no response to inactivity probe after 5 seconds, 
>> disconnecting
>> 2023-05-01T19:13:56.115Z|00545|reconnect|INFO|tcp:10.193.1.2:6642 
>> : connection dropped
>> 2023-05-01T19:14:36.177Z|00546|reconnect|INFO|tcp:10.193.1.2:6642 
>> : connected
>> 2023-05-01T19:14:44.996Z|00547|jsonrpc|WARN|tcp:10.193.1.2:6642 
>> : receive error: Connection reset by peer
>> 2023-05-01T19:14:44.996Z|00548|reconnect|WARN|tcp:10.193.1.2:6642 
>> : connection dropped (Connection reset by peer)
>> 2023-05-01T19:15:44.131Z|00549|reconnect|INFO|tcp:10.193.1.2:6642 
>> : connected
>> 2023-05-01T19:15:54.137Z|00550|reconnect|ERR|tcp:10.193.1.2:6642 
>> : no response to inactivity probe after 5 seconds, 
>> disconnecting
>> 2023-05-01T19:15:54.137Z|00551|reconnect|INFO|tcp:10.193.1.2:6642 
>> : connection dropped
>> 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-30 Thread Vladislav Odintsov via discuss
Thanks Ilya for such a detailed description about inactivity probes and 
keepalives.

regards,
Vladislav Odintsov



> On 31 Mar 2023, at 00:37, Ilya Maximets via discuss 
>  wrote:
> 
> On 3/30/23 22:51, Vladislav Odintsov via discuss wrote:
>> Hi Ilya,
>> following your recomendation I’ve built OVS 3.1.0 plus Terry’s patch [1].
>> It’s a bit outdated, but with some changes related to "last_command" logic 
>> in ovs*ctl it successfully built.
>> Also, I grabbed your idea gust to hardcode inactivity interval for ovsdb 
>> relay because it just solves my issue.
>> So, after testing it seems to work fine. I’ve managed to run ovn-sb-db 
>> cluster with custom connections from local db and ovsdb relay with 
>> connections from sb db.
> 
> Good to know.
> 
>> I’ve got a question here:
>> Do we actually need probing from relay to sb cluster if we have configured 
>> probing from the other side in other direction (db cluster to relay)? Maybe 
>> we even can just set to 0 inactivity probes in ovsdb/relay.c?
> 
> If connection between relay and the main cluster is lost, relay may
> not notice this and just think that there are no new updates.  All the
> clients connected to that relay will have stale data as a result.
> Inactivity probe interval is essentially a value for how long you think
> you can afford that condition to last.

Do I understand you correctly that by “connection is lost” you mean an 
accidental termination of the TCP session? Like an iptables drop, or a cluster 
member killed by SIGKILL?
In my understanding, if a cluster member is just gracefully stopped, it will 
gracefully shut down the connection and the relay will reconnect to another 
cluster member?

Just out of curiosity: in the case of an accidental termination, suppose some 
“outdated” ovn-controller is connected to a relay which in turn thinks it is 
connected to the cluster but is not. If in such a condition the ovn-controller 
tries to claim a VIF, will the relay detect the connection failure and reconnect 
to another “upstream”?

> 
>> Also, ovsdb relay has active bidirectional probing to ovn-controllers.
>> If tcp session got dropped, ovsdb relay wont notice this without probing?
> 
> TCP timeouts can be very high or may not exist at all, if the network
> connectivity suddenly disappears (a firewall in between or one of
> the nodes crashed), both the client and the server may not notice
> that for a very long time.  I've seen in practice OVN clusters where
> nodes suddenly disappeared (crashed) and other nodes didn't notice
> that for many hours (caused by non-working inactivity probes).
> 
> Another interesting side effect to consider is if controller disappears
> and the relay will keep sending updates to it, that may cause significant
> memory usage increase on the relay, because it will keep the backlog of
> data that underlying socket didn't accept.  May end up being killed by
> OOM killer, if that continues long enough.

By disappearing, you mean the death of an ovn-controller without proper 
connection termination?
So if I understand correctly, relay-to-controller probing is a “must have”.
That’s interesting, thanks!

> 
> If you don't want to deal with inactivity probes, you may partially
> replace them with TCP keepalive.  Disable probes and start daemons with
> keepalive library preloaded, i.e. LD_PRELOAD=libkeepalive.so with the
> configuration you think is suitable (default keepalive time is 2 hours
> on many systems, so defaults are likely not a good choice).  You will
> loose ability to detect infinite loops or deadlocks and stuff like that,
> but at least, you'll be protected from pure network failures.
> See some examples at the end of this page:
> https://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/
> 
> Running a cluster without any bidirectional probes is not advisable.
> 
> Best regards, Ilya Maximets.
> 
>> Thank you for your help and Terry for his patch!
>> 1: 
>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
>>>> On 7 Mar 2023, at 19:43, Ilya Maximets via discuss 
>>>>  wrote:
>>> On 3/7/23 16:58, Vladislav Odintsov wrote:
>>>> I’ve sent last mail from wrong account and indentation was lost.
>>>> Resending...
>>>>> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
>>>>>  wrote:
>>>>> Thanks Ilya for the quick and detailed response!
>>>>>> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>>>>>>  wrote:
>>>>>> On 3/7/23 00:15, Vladislav Odintsov wrote:
>>>>>>> Hi Ilya,
>>>>>>> I’m wondering whether there are possibl

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-30 Thread Vladislav Odintsov via discuss
Hi Ilya,

following your recommendation I’ve built OVS 3.1.0 plus Terry’s patch [1].
It’s a bit outdated, but with some changes related to the "last_command" logic in 
ovs*ctl it built successfully.

Also, I grabbed your idea to just hardcode the inactivity interval for the ovsdb relay 
because it solves my issue.

So, after testing, it seems to work fine. I’ve managed to run an ovn-sb-db cluster 
with custom connections from a local DB and an ovsdb relay with connections from 
the SB DB.
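For reference, a rough sketch of such a relay setup; the addresses, ports and paths
below are illustrative rather than taken from this deployment:

    # relay serving ovn-controllers; it takes its listener configuration
    # (inactivity_probe, RBAC role) from the relayed OVN_Southbound database
    ovsdb-server --remote=db:OVN_Southbound,SB_Global,connections \
        relay:OVN_Southbound:ssl:10.0.0.1:6642,ssl:10.0.0.2:6642,ssl:10.0.0.3:6642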

I’ve got a question here:
Do we actually need probing from the relay to the SB cluster if we have configured 
probing in the other direction (DB cluster to relay)? Maybe we can even just set 
inactivity probes to 0 in ovsdb/relay.c?
Also, the ovsdb relay has active bidirectional probing to ovn-controllers.
If a TCP session got dropped, the ovsdb relay won’t notice this without probing?


Thank you for your help and Terry for his patch!

1: 
https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/

> On 7 Mar 2023, at 19:43, Ilya Maximets via discuss 
>  wrote:
> 
> On 3/7/23 16:58, Vladislav Odintsov wrote:
>> I’ve sent last mail from wrong account and indentation was lost.
>> Resending...
>> 
>>> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
>>>  wrote:
>>> 
>>> Thanks Ilya for the quick and detailed response!
>>> 
>>>> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>>>>  wrote:
>>>> 
>>>> On 3/7/23 00:15, Vladislav Odintsov wrote:
>>>>> Hi Ilya,
>>>>> 
>>>>> I’m wondering whether there are possible configuration parameters for 
>>>>> ovsdb relay -> main ovsdb server inactivity probe timer.
>>>>> My cluster is experiencing issues where the relay disconnects from the main cluster 
>>>>> due to the 5 sec. inactivity probe timeout.
>>>>> The main cluster has a quite big database and a bunch of daemons connecting 
>>>>> to it, which makes it difficult to service the connections in time.
>>>>> 
>>>>> For ovsdb relay as a remote I use in-db configuration (to provide 
>>>>> inactivity probe and rbac configuration for ovn-controllers).
>>>>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>>>>> 
>>>>> I’d like to configure the remote for the ovsdb cluster via DB to set the inactivity 
>>>>> probe, but I’m not sure about the correct way to do that.
>>>>> 
>>>>> For now I see only two options:
>>>>> 1. Set up a custom database schema with a connection table, serve it in the same 
>>>>> SB cluster and specify this connection when starting the ovsdb SB server.
>>>> 
>>>> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
>>>> used for that purpose.  But you'll need to craft transactions for it
>>>> manually with ovsdb-client.
>>>> 
>>>> There is a control tool prepared by Terry:
>>>>  
>>>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
>>> 
>>> Thanks for pointing on a patch, I guess, I’ll test it out.
>>> 
>>>> 
>>>> But it's not in the repo yet (I need to get back to reviews on that
>>>> topic at some point).  The tool itself should be fine, but maybe name
>>>> will change.
>>> 
>>> Am I right that the in-DB remote configuration must be hosted by a database 
>>> served by this ovsdb-server?
> 
> Yes.
> 
>>> What is the best way to configure an additional DB on ovsdb-server so that 
>>> this configuration is permanent?
> 
> You may specify multiple database files on the command-line for ovsdb-server
> process.  It will open and serve each of them.  They all can be in different
> modes, e.g. you have multiple clustered, standalone and relay databases in
> the same ovsdb-server process.
> 
> There is also ovsdb-server/add-db appctl to add a new database to a running
> process, but it will not survive the restart.
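A quick sketch of both variants; the file and socket paths are illustrative and
depend on the installation:

    # serve the clustered SB database and a standalone local-config DB
    # from the same ovsdb-server process
    ovsdb-server /etc/ovn/ovnsb_db.db /etc/ovn/local-config.db ...

    # or add the extra DB to an already running server
    # (this does not survive a restart)
    ovs-appctl -t /var/run/ovn/ovnsb_db.ctl ovsdb-server/add-db /etc/ovn/local-config.db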
> 
>>> Also, do I understand correctly that there is no need for this DB to 
>>> be clustered?
> 
> It's kind of a point of the Local_Config database to not be clustered.
> The original use case was to allow each cluster member to listen on a
> different IP. i.e. if you don't want to listen on 0.0.0.0 and your
> cluster members are on different nodes, so have different listening IPs.
> 
>>> 
>>>> 
>>>>> 2. Setup second connection in ovn sb database to be used for ovsdb 
>>>>> cluster and deploy cluster

Re: [ovs-discuss] OVN interconnection and NAT

2023-03-15 Thread Vladislav Odintsov via discuss
I’m sorry, of course I meant gateway_port instead of logical_port:

   gateway_port: optional weak reference to Logical_Router_Port
      A distributed gateway port in the Logical_Router_Port table where
      the NAT rule needs to be applied.

      When multiple distributed gateway ports are configured on a
      Logical_Router, applying a NAT rule at each of the distributed
      gateway ports might not be desired. Consider the case where a
      logical router has 2 distributed gateway ports, one with networks
      50.0.0.10/24 and the other with networks 60.0.0.10/24. If the
      logical router has a NAT rule of type snat, logical_ip 10.1.1.0/24
      and external_ip 50.1.1.20/24, the rule needs to be selectively
      applied on matching packets entering/leaving through the
      distributed gateway port with networks 50.0.0.10/24.

      When a logical router has multiple distributed gateway ports and
      this column is not set for a NAT rule, then the rule will be
      applied at the distributed gateway port which is in the same
      network as the external_ip of the NAT rule, if such a router port
      exists. If the logical router has a single distributed gateway
      port and this column is not set for a NAT rule, the rule will be
      applied at the distributed gateway port even if the router port is
      not in the same network as the external_ip of the NAT rule.
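As an illustration of how that column can be set with the generic ovn-nbctl DB
commands (the NAT UUID is the one shown in the quoted output below; the LRP UUID
is a placeholder):

    # find the UUID of the distributed gateway port the rule should be pinned to
    ovn-nbctl --columns=_uuid find logical_router_port name=r1_public

    # pin the existing dnat_and_snat rule to that gateway port
    ovn-nbctl set nat df2b79d3-1334-4af3-8f61-5a46490f8a9c gateway_port=<LRP_UUID>

Newer ovn-nbctl builds also accept a --gateway-port option for lr-nat-add, if your
version has it.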

> On 15 Mar 2023, at 20:05, Vladislav Odintsov via discuss 
>  wrote:
> 
> Hi,
> 
> since you’ve configured multiple LRPs with GW chassis, you must supply a 
> logical_port for the NAT rule. Did you configure it?
> You should see an appropriate message in the ovn-northd logfile.
> 
>    logical_port: optional string
>       The name of the logical port where the logical_ip resides.
> 
>       This is only used on distributed routers. This must be specified
>       in order for the NAT rule to be processed in a distributed manner
>       on all chassis. If this is not specified for a NAT rule on a
>       distributed router, then this NAT rule will be processed in a
>       centralized manner on the gateway port instance on the gateway
>       chassis.
> 
>> On 15 Mar 2023, at 19:22, Tiago Pires via discuss 
>>  wrote:
>> 
>> Hi,
>> 
>> In an OVN Interconnection environment (OVN 22.03) with a few AZs, I noticed 
>> that when the OVN router has a SNAT enabled or DNAT_AND_SNAT,
>> the traffic between the AZs is nated.
>> When checking the OVN router's logical flows, it is possible to see the LSP 
>> that is connected into the transit switch with NAT enabled:
>> 
>> Scenario:
>> 
>> OVN Global database:
>> # ovn-ic-sbctl show
>> availability-zone az1
>> gateway ovn-central-1
>> hostname: ovn-central-1
>> type: geneve
>> ip: 192.168.40.50
>> port ts1-r1-az1
>> transit switch: ts1
>> address: ["aa:aa:aa:aa:aa:10 169.254.100.10/24 
>> <http://169.254.100.10/24>"]
>> availability-zone az2
>> gateway ovn-central-2
>> hostname: ovn-central-2
>> type: geneve
>> ip: 192.168.40.221
>> port ts1-r1-az2
>> transit switch: ts1
>> address: ["aa:aa:aa:aa:aa:20 169.254.100.20/24 
>> <http://169.254.100.20/24>"]
>> availability-zone az3
>> gateway ovn-central-3
>> hostname: ovn-central-3
>> type: geneve
>> ip: 192.168.40.247
>> port ts1-r1-az3
>> transit switch: ts1
>> address: ["aa:aa:aa:aa:aa:30 169.254.100.30/24 
>> <http://169.254.100.30/24>"]
>> 
>> OVN Central (az1)
>> 
>> # ovn-nbctl show r1
>> router 3e80e81a-58b5-41b1-9600-5bfc917c4ace (r1)
>> port r1-ts1-az1
>> mac: "aa:aa:aa:aa:aa:10"
>> networks: ["169.254.100.10/24 <http://169.254.100.10/24>"]
>> gateway chassis: [ovn-central-1]
>> port r1_s1
>> mac: "00:de:ad:fe:0:1"
>> networks: ["10.0.1.1/24 <http://10.0.1.1/24>"]
>> port r1_public
>> mac: "00:de:ad:ff:0:1"
>> networks: ["200.10.0.1/24 <http://200.10.0.1/24>"]
>> gateway chassis: [ovn-central-1]
>> nat df2b79d3-1334-4af3-8f61-5a46490f8a9c
>> external ip: "200.10.0.101"
>> logical ip: "10.0.1.2"
>> type: "dnat_and

Re: [ovs-discuss] OVN interconnection and NAT

2023-03-15 Thread Vladislav Odintsov via discuss
Hi,

since you’ve configured multiple LRPs with GW chassis, you must supply a 
logical_port for the NAT rule. Did you configure it?
You should see an appropriate message in the ovn-northd logfile.

   logical_port: optional string
      The name of the logical port where the logical_ip resides.

      This is only used on distributed routers. This must be specified
      in order for the NAT rule to be processed in a distributed manner
      on all chassis. If this is not specified for a NAT rule on a
      distributed router, then this NAT rule will be processed in a
      centralized manner on the gateway port instance on the gateway
      chassis.
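For illustration, the logical_port (together with an external MAC) can be supplied
directly to lr-nat-add so the rule is processed in a distributed manner; the port
name and MAC below are hypothetical:

    # logical_port + external_mac make the rule be handled on the chassis
    # where the VIF actually resides, instead of on the gateway chassis
    ovn-nbctl lr-nat-add r1 dnat_and_snat 200.10.0.101 10.0.1.2 vm1-port 00:00:00:aa:bb:01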

> On 15 Mar 2023, at 19:22, Tiago Pires via discuss 
>  wrote:
> 
> Hi,
> 
> In an OVN Interconnection environment (OVN 22.03) with a few AZs, I noticed 
> that when the OVN router has a SNAT enabled or DNAT_AND_SNAT,
> the traffic between the AZs is nated.
> When checking the OVN router's logical flows, it is possible to see the LSP 
> that is connected into the transit switch with NAT enabled:
> 
> Scenario:
> 
> OVN Global database:
> # ovn-ic-sbctl show
> availability-zone az1
> gateway ovn-central-1
> hostname: ovn-central-1
> type: geneve
> ip: 192.168.40.50
> port ts1-r1-az1
> transit switch: ts1
> address: ["aa:aa:aa:aa:aa:10 169.254.100.10/24 
> "]
> availability-zone az2
> gateway ovn-central-2
> hostname: ovn-central-2
> type: geneve
> ip: 192.168.40.221
> port ts1-r1-az2
> transit switch: ts1
> address: ["aa:aa:aa:aa:aa:20 169.254.100.20/24 
> "]
> availability-zone az3
> gateway ovn-central-3
> hostname: ovn-central-3
> type: geneve
> ip: 192.168.40.247
> port ts1-r1-az3
> transit switch: ts1
> address: ["aa:aa:aa:aa:aa:30 169.254.100.30/24 
> "]
> 
> OVN Central (az1)
> 
> # ovn-nbctl show r1
> router 3e80e81a-58b5-41b1-9600-5bfc917c4ace (r1)
> port r1-ts1-az1
> mac: "aa:aa:aa:aa:aa:10"
> networks: ["169.254.100.10/24 "]
> gateway chassis: [ovn-central-1]
> port r1_s1
> mac: "00:de:ad:fe:0:1"
> networks: ["10.0.1.1/24 "]
> port r1_public
> mac: "00:de:ad:ff:0:1"
> networks: ["200.10.0.1/24 "]
> gateway chassis: [ovn-central-1]
> nat df2b79d3-1334-4af3-8f61-5a46490f8a9c
> external ip: "200.10.0.101"
> logical ip: "10.0.1.2"
> type: "dnat_and_snat"
> 
> OVN Logical Flows:
> table=3 (lr_out_snat), priority=161  , match=(ip && ip4.src == 
> 10.0.1.2 && outport == "r1-ts1-az1" && is_chassis_resident("cr-r1-ts1-az1")), 
> action=(ct_snat_in_czone(200.10.0.101);)
> 
> The datapath flows into OVS shows that the traffic is being nated and sent to 
> the remote chassi gateway in AZ2:
> 
> recirc_id(0x14),in_port(3),eth(src=aa:aa:aa:aa:aa:10,dst=aa:aa:aa:aa:aa:20),eth_type(0x0800),ipv4(dst=200.16.0.0/255.240.0.0,tos=0/0x3,frag=no
>  ), packets:3, bytes:294, 
> used:0.888s, 
> actions:ct_clear,set(tunnel(tun_id=0xff0002,dst=192.168.40.221,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x10002}),flags(df|csum|key))),2
> recirc_id(0x13),in_port(3),eth(),eth_type(0x0800),ipv4(src=10.0.1.2,frag=no), 
> packets:3, bytes:294, used:0.888s, 
> actions:ct(commit,zone=2,nat(src=200.10.0.101)),recirc(0x14)
> recirc_id(0),in_port(3),eth(src=00:de:ad:01:00:01,dst=00:de:ad:fe:00:01),eth_type(0x0800),ipv4(src=10.0.1.2,dst=200.20.0.0/255.255.255.0,ttl=64,frag=no
>  ), packets:3, bytes:294, 
> used:0.888s, actions:set(e
> th(src=aa:aa:aa:aa:aa:10,dst=aa:aa:aa:aa:aa:20)),set(ipv4(ttl=63)),ct(zone=2,nat),recirc(0x13)
> 
> Is this behavior expected by design or is it a bug? In my use case, I would 
> like for the traffic between AZs to be routed instead of nated.
> 
> Tiago Pires
> 
> 
> 
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Regards,
Vladislav Odintsov


Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-07 Thread Vladislav Odintsov via discuss
I’ve sent the last mail from the wrong account and the indentation was lost.
Resending...

> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
>  wrote:
> 
> Thanks Ilya for the quick and detailed response!
> 
>> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>>  wrote:
>> 
>> On 3/7/23 00:15, Vladislav Odintsov wrote:
>>> Hi Ilya,
>>> 
>>> I’m wondering whether there are possible configuration parameters for ovsdb 
>>> relay -> main ovsdb server inactivity probe timer.
>>> My cluster is experiencing issues where the relay disconnects from the main cluster 
>>> due to the 5 sec. inactivity probe timeout.
>>> The main cluster has a quite big database and a bunch of daemons connecting 
>>> to it, which makes it difficult to service the connections in time.
>>> 
>>> For ovsdb relay as a remote I use in-db configuration (to provide 
>>> inactivity probe and rbac configuration for ovn-controllers).
>>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>>> 
>>> I’d like to configure the remote for the ovsdb cluster via DB to set the inactivity 
>>> probe, but I’m not sure about the correct way to do that.
>>> 
>>> For now I see only two options:
>>> 1. Set up a custom database schema with a connection table, serve it in the same SB 
>>> cluster and specify this connection when starting the ovsdb SB server.
>> 
>> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
>> used for that purpose.  But you'll need to craft transactions for it
>> manually with ovsdb-client.
>> 
>> There is a control tool prepared by Terry:
>>  
>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
> 
> Thanks for pointing on a patch, I guess, I’ll test it out.
> 
>> 
>> But it's not in the repo yet (I need to get back to reviews on that
>> topic at some point).  The tool itself should be fine, but maybe name
>> will change.
> 
> Am I right that the in-DB remote configuration must be hosted by a database 
> served by this ovsdb-server?
> What is the best way to configure an additional DB on ovsdb-server so that this 
> configuration is permanent?
> Also, do I understand correctly that there is no need for this DB to be 
> clustered?
> 
>> 
>> 2. Set up a second connection in the OVN SB database to be used by the ovsdb cluster 
>> and deploy the cluster separately from the ovsdb relay, because they both start the 
>> same connections and conflict on ports. (I don’t use docker here, so I need 
>> a separate server for that.)
>> 
>> That's an easy option available right now, true.  If they are deployed
>> on different nodes, you may even use the same connection record.
>> 
>>> 
>> Anyway, if I configure the ovsdb remote for the ovsdb cluster with a specified 
>> inactivity probe (say, 60k), I guess it’s still not enough to have ovsdb 
>> pings every 60 seconds. The inactivity probe must be the same on both ends, 
>> right? Including on the ovsdb relay process.
>> 
>> Inactivity probes don't need to be the same.  They are separate for each
>> side of a connection and so configured separately.
>> 
>> You can set up inactivity probe for the server side of the connection via
>> database.  So, server will probe the relay every 60 seconds, but today
>> it's not possible to set inactivity probe for the relay-to-server direction.
>> So, relay will probe the server every 5 seconds.
>> 
>> The way out from this situation is to allow configuration of relays via
>> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
>> require addition of a new table to the Local_Config database and allowing
>> relay config to be parsed from the database in the code.  That wasn't
>> implemented yet.
>> 
>> I saw your talk at the last OVS conference about this topic, and the solution was in 
>> progress there. But maybe there have been some changes since then? I’m ready 
>> to test it if so. Or maybe there’s a workaround?
>> 
>> Sorry, we didn't move forward much on that topic since the presentation.
> There are a few unanswered questions around the local config database.  Mainly
>> regarding upgrades from cmdline/main db -based configuration to a local
>> config -based.  But I hope we can figure that out in the current release
>> time frame, i.e. before 3.2 release.

Regarding the configuration method… Just an idea (I haven’t seen this variant 
listed as one of the possible ones).
Remote add/remove is possible via the ovsdb-server ctl socket. Could introducing a 
new command
"ovsdb-server/set-remote-param PAR

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-07 Thread Vladislav Odintsov via discuss
Thanks Ilya for the quick and detailed response!

> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>  wrote:
> 
> On 3/7/23 00:15, Vladislav Odintsov wrote:
>> Hi Ilya,
>> 
>> I’m wondering whether there are possible configuration parameters for ovsdb 
>> relay -> main ovsdb server inactivity probe timer.
>> My cluster is experiencing issues where the relay disconnects from the main cluster due 
>> to the 5 sec. inactivity probe timeout.
>> The main cluster has a quite big database and a bunch of daemons connecting 
>> to it, which makes it difficult to service the connections in time.
>> 
>> For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
>> probe and rbac configuration for ovn-controllers).
>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>> 
>> I’d like to configure the remote for the ovsdb cluster via DB to set the inactivity 
>> probe, but I’m not sure about the correct way to do that.
>> 
>> For now I see only two options:
>> 1. Set up a custom database schema with a connection table, serve it in the same SB 
>> cluster and specify this connection when starting the ovsdb SB server.
> 
> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
> used for that purpose.  But you'll need to craft transactions for it
> manually with ovsdb-client.
> 
> There is a control tool prepared by Terry:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
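A rough sketch of the manual route, assuming the Config/Connection table and column
names from the shipped local-config.ovsschema (double-check them against the schema,
and adjust the file and socket paths to your installation):

    # create a standalone DB from the shipped schema
    ovsdb-tool create /etc/ovn/local-config.db /usr/share/openvswitch/local-config.ovsschema

    # start the SB server with both databases and take its remotes from the local DB
    ovsdb-server --remote=db:Local_Config,Config,connections \
        /etc/ovn/ovnsb_db.db /etc/ovn/local-config.db

    # add a listener with a 60 s inactivity probe
    ovsdb-client transact unix:/var/run/ovn/ovnsb_db.sock '["Local_Config",
      {"op": "insert", "table": "Connection",
       "row": {"target": "pssl:16642", "inactivity_probe": 60000},
       "uuid-name": "conn"},
      {"op": "insert", "table": "Config",
       "row": {"connections": ["set", [["named-uuid", "conn"]]]}}]'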

Thanks for pointing on a patch, I guess, I’ll test it out.

> 
> But it's not in the repo yet (I need to get back to reviews on that
> topic at some point).  The tool itself should be fine, but maybe name
> will change.

Am I right that the in-DB remote configuration must be hosted by a database 
served by this ovsdb-server?
What is the best way to configure an additional DB on ovsdb-server so that this 
configuration is permanent?
Also, do I understand correctly that there is no need for this DB to be 
clustered?

> 
>> 2. Set up a second connection in the OVN SB database to be used by the ovsdb cluster 
>> and deploy the cluster separately from the ovsdb relay, because they both start the same 
>> connections and conflict on ports. (I don’t use docker here, so I need a 
>> separate server for that.)
> 
> That's an easy option available right now, true.  If they are deployed
> on different nodes, you may even use the same connection record.
> 
>> 
>> Anyway, if I configure the ovsdb remote for the ovsdb cluster with a specified 
>> inactivity probe (say, 60k), I guess it’s still not enough to have ovsdb 
>> pings every 60 seconds. The inactivity probe must be the same on both ends, 
>> right? Including on the ovsdb relay process.
> 
> Inactivity probes don't need to be the same.  They are separate for each
> side of a connection and so configured separately.
> 
> You can set up inactivity probe for the server side of the connection via
> database.  So, server will probe the relay every 60 seconds, but today
> it's not possible to set inactivity probe for the relay-to-server direction.
> So, relay will probe the server every 5 seconds.
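For completeness, a short sketch of that server-side setting; the pssl:16642 target
mirrors the port seen in the relay logs further down, so adjust it to your own remote:

    # create the in-DB listener with a 60 s probe towards relays/controllers
    ovn-sbctl --inactivity-probe=60000 set-connection pssl:16642

    # or, if the connection record already exists, just bump the probe
    ovn-sbctl set connection . inactivity_probe=60000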
> 
> The way out from this situation is to allow configuration of relays via
> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
> require addition of a new table to the Local_Config database and allowing
> relay config to be parsed from the database in the code.  That wasn't
> implemented yet.
> 
>> I saw your talk at the last OVS conference about this topic, and the solution was in 
>> progress there. But maybe there have been some changes since then? I’m ready 
>> to test it if so. Or maybe there’s a workaround?
> 
> Sorry, we didn't move forward much on that topic since the presentation.
> There are a few unanswered questions around the local config database.  Mainly
> regarding upgrades from cmdline/main db -based configuration to a local
> config -based.  But I hope we can figure that out in the current release
> time frame, i.e. before 3.2 release.
> 
> There is also this workaround:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao@easystack.cn/
> It simply takes the server->relay inactivity probe value and applies it
> to the relay->server connection.  But it's not a correct solution, because
> it relies on certain database names.
> 
> Out of curiosity, what kind of poll intervals do you see on your main server
> setup that trigger the inactivity probe failures?  Can an upgrade to OVS 3.1
> solve some of these issues?  3.1 should be noticeably faster than 2.17,
> and also parallel compaction introduced in 3.0 removes one of the big
> reasons for large poll intervals.  OVN upgrade to 22.09+ or even 23.03
> should also help with database sizes.

We see failures on the OVSDB Relay side:

2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection dropped

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-06 Thread Vladislav Odintsov via discuss
Hi Ilya,

I’m wondering whether there are possible configuration parameters for ovsdb 
relay -> main ovsdb server inactivity probe timer.
My cluster is experiencing issues where the relay disconnects from the main cluster due to 
the 5 sec. inactivity probe timeout.
The main cluster has a quite big database and a bunch of daemons connecting to 
it, which makes it difficult to service the connections in time.

For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
probe and rbac configuration for ovn-controllers).
For ovsdb-server, which serves SB, I just set --remote=pssl:.

I’d like to configure the remote for the ovsdb cluster via DB to set the inactivity 
probe, but I’m not sure about the correct way to do that.

For now I see only two options:
1. Set up a custom database schema with a connection table, serve it in the same SB 
cluster and specify this connection when starting the ovsdb SB server.
2. Set up a second connection in the OVN SB database to be used by the ovsdb cluster and 
deploy the cluster separately from the ovsdb relay, because they both start the same 
connections and conflict on ports. (I don’t use docker here, so I need a 
separate server for that.)

Anyway, if I configure the ovsdb remote for the ovsdb cluster with a specified inactivity 
probe (say, 60k), I guess it’s still not enough to have ovsdb pings every 60 
seconds. The inactivity probe must be the same on both ends, right? Including on the 
ovsdb relay process.
I saw your talk at the last OVS conference about this topic, and the solution was in 
progress there. But maybe there have been some changes since then? I’m ready to 
test it if so. Or maybe there’s a workaround?

Regards,
Vladislav Odintsov

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss