from:"Vladislav Odintsov"

Thank you, Dumitru, for the bumps in all branches.

regards,
Vladislav Odintsov

> On 18 Oct 2024, at 18:46, Dumitru Ceara  wrote:
> 
> On 10/18/24 10:13, Vladislav Odintsov wrote:
>> Nit: commit subject should be adjusted to: "ovs: Bump submodule to OVS
>> v3.0.7.".
>> 
> 
> Thanks for pointing that out!
> 
> I fixed it up and applied the patch to 22.03.
> 
> Thanks,
> Dumitru
> 
>> That was my mistake in previous message.
>> 
>>> On 18.10.2024 10:55, Dumitru Ceara wrote:
>>> From: Vladislav Odintsov 
>>> 
>>> This picks up the following relevant OVS changes:
>>>   4198bcdfb compiler: Fix errors in Clang 17 ubsan checks.
>>>   ec048ab62 vlog: Destroy async_append first then close log_fd.
>>>   dbaf7271c hash, jhash: Fix unaligned access to the hash remainder.
>>>   85285fb45 ovs-atomic: Fix inclusion of Clang header by GCC 14.
>>>   e040f35b2 vconn: Count vconn_sent regardless of log level.
>>>   ... and others.
>>> 
>>> Reported-at: 
>>> https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
>>> Signed-off-by: Vladislav Odintsov 
>>> ---
>>>  ovs | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/ovs b/ovs
>>> index 94191b7a49..55ee005bef 160000
>>> --- a/ovs
>>> +++ b/ovs
>>> @@ -1 +1 @@
>>> -Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
>>> +Subproject commit 55ee005bef94566fe829b667960d7bb9ac925974
>> 
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn branch-22.09] ovs: Bump submodule to OVS 3.0.7.

Nit: commit subject should be adjusted to: "ovs: Bump submodule to OVS 
v3.0.7.".

On 18.10.2024 10:54, Dumitru Ceara wrote:
> From: Vladislav Odintsov 
>
> This picks up the following relevant OVS changes:
>   4198bcdfb compiler: Fix errors in Clang 17 ubsan checks.
>   ec048ab62 vlog: Destroy async_append first then close log_fd.
>   dbaf7271c hash, jhash: Fix unaligned access to the hash remainder.
>   85285fb45 ovs-atomic: Fix inclusion of Clang header by GCC 14.
>   e040f35b2 vconn: Count vconn_sent regardless of log level.
>   ... and others.
>
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
> Signed-off-by: Vladislav Odintsov 
> ---
>   ovs | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ovs b/ovs
> index 94191b7a49..55ee005bef 160000
> --- a/ovs
> +++ b/ovs
> @@ -1 +1 @@
> -Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
> +Subproject commit 55ee005bef94566fe829b667960d7bb9ac925974

-- 
Regards,
Vladislav Odintsov

Re: [ovs-dev] [PATCH ovn branch-22.03] ovs: Bump submodule to OVS 3.0.7.

Nit: commit subject should be adjusted to: "ovs: Bump submodule to OVS 
v3.0.7.".

That was my mistake in the previous message.

On 18.10.2024 10:55, Dumitru Ceara wrote:
> From: Vladislav Odintsov 
>
> This picks up the following relevant OVS changes:
>   4198bcdfb compiler: Fix errors in Clang 17 ubsan checks.
>   ec048ab62 vlog: Destroy async_append first then close log_fd.
>   dbaf7271c hash, jhash: Fix unaligned access to the hash remainder.
>   85285fb45 ovs-atomic: Fix inclusion of Clang header by GCC 14.
>   e040f35b2 vconn: Count vconn_sent regardless of log level.
>   ... and others.
>
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
> Signed-off-by: Vladislav Odintsov 
> ---
>   ovs | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ovs b/ovs
> index 94191b7a49..55ee005bef 160000
> --- a/ovs
> +++ b/ovs
> @@ -1 +1 @@
> -Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
> +Subproject commit 55ee005bef94566fe829b667960d7bb9ac925974

-- 
Regards,
Vladislav Odintsov

Re: [ovs-dev] [PATCH ovn branch-22.06] ovs: Bump submodule to OVS 3.0.7.

Nit: commit subject should be adjusted to: "ovs: Bump submodule to OVS 
v3.0.7.".

On 18.10.2024 10:54, Dumitru Ceara wrote:
> From: Vladislav Odintsov 
>
> This picks up the following relevant OVS changes:
>   4198bcdfb compiler: Fix errors in Clang 17 ubsan checks.
>   ec048ab62 vlog: Destroy async_append first then close log_fd.
>   dbaf7271c hash, jhash: Fix unaligned access to the hash remainder.
>   85285fb45 ovs-atomic: Fix inclusion of Clang header by GCC 14.
>   e040f35b2 vconn: Count vconn_sent regardless of log level.
>   ... and others.
>
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
> Signed-off-by: Vladislav Odintsov 
> ---
>   ovs | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ovs b/ovs
> index 94191b7a49..55ee005bef 160000
> --- a/ovs
> +++ b/ovs
> @@ -1 +1 @@
> -Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
> +Subproject commit 55ee005bef94566fe829b667960d7bb9ac925974

-- 
Regards,
Vladislav Odintsov

Re: [ovs-dev] [PATCH ovn branch-24.09] ovs: Bump submodule to latest OVS branch-3.4.

2024-10-17 Thread Vladislav Odintsov

Hi Dumitru,

On 15.10.2024 18:34, Dumitru Ceara wrote:
> On 10/14/24 16:41, Vladislav Odintsov wrote:
>> On 14.10.2024 16:13, Dumitru Ceara wrote:
>>> On 10/14/24 15:01, Vladislav Odintsov wrote:
>>>> On 14.10.2024 14:47, Dumitru Ceara wrote:
>>>>> On 10/13/24 10:19, Vladislav Odintsov wrote:
>>>>>> This picks up the following relevant OVS changes:
>>>>>>  a15ce086d ofproto-dpif: Improve load balancing in dp_hash select 
>>>>>> groups.
>>>>>>  76ba41b5c vconn: Always properly free flow stats reply.
>>>>>>  64cb90507 ovsdb-idl: Fix IDL memory leak.
>>>>>>  ... and others.
>>>>>>
>>>>>> Reported-at: 
>>>>>> https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
>>>>>> Signed-off-by: Vladislav Odintsov 
>>>>>> ---
>>>>> Hi Vladislav,
>>>>>
>>>>>> ovs | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/ovs b/ovs
>>>>>> index c598c05c8..a15ce086d 160000
>>>>>> --- a/ovs
>>>>>> +++ b/ovs
>>>>>> @@ -1 +1 @@
>>>>>> -Subproject commit c598c05c85b2d38874a0ce8f7f088f6aae4fdabc
>>>>>> +Subproject commit a15ce086d41f9dfe6c1589333413b8e777401ef0
>>>>> This would make branch-24.09 use a newer submodule version than the OVN
>>>>> main branch does.
>>>>>
>>>>> I think we need this commit on main too, what do you think?
>>>> Hi Dumitru,
>>>>
>>>> Agree.
>>>>
>>>> The patch:
>>>> https://patchwork.ozlabs.org/project/ovn/patch/20241014125151.74094-1-vlodintsov@k2.cloud/
>>>>
>>> I was talking offline to Ilya about all these bumps and he made a good
>>> point: while the tip of OVS branch-3.y and the latest v3.y.z get almost
>>> the same amount of testing in the OVS repo it might be a bit better to
>>> use v3.y.z instead.  That's because likely external users of OVS run
>>> tagged OVS releases in production so those might get more external testing.
>>>
>>> I had a quick look at the main differences between choosing the tip of
>>> branch-3.y and v3.y.z on all branches and I think we'd only miss:
>>>
>>> 99e7cf9cce1c vconn: Always properly free flow stats reply.
>>> f59f19bf69a4 ovsdb-idl: Fix IDL memory leak.
>>>
>>> which might be OK.
>>>
>>> If you agree I can change your patches on all branches (no need to post
>>> new ones) and apply them.
>>>
>>> What do you think?
>> Well, I'm totally fine with this. Please feel free to modify my patches.
>>
> I prepared them here, it would be great if you could double check:
>
> https://github.com/dceara/ovn/tree/refs/heads/tmp-branch-24.03
> https://github.com/dceara/ovn/tree/refs/heads/tmp-branch-23.09
> https://github.com/dceara/ovn/tree/refs/heads/tmp-branch-23.03
> https://github.com/dceara/ovn/tree/refs/heads/tmp-branch-22.12
> https://github.com/dceara/ovn/tree/refs/heads/tmp-branch-22.09

Could you please update the commit subject to "ovs: Bump submodule to OVS 
3.0.7." for the patches on branch-22.09, branch-22.06 and 
branch-22.03?

> https://github.com/dceara/ovn/tree/refs/heads/tmp-branch-22.06
> https://github.com/dceara/ovn/tree/refs/heads/tmp-branch-22.03
>
> I skipped main and branch-24.09 because those are already on the latest
> OVS v3.4.z.
Thanks for the update. Other than the commit subject, these patches look 
good to me!
>
>
>> Also, shouldn't we update documentation to reflect this approach in
>> Documentation/internals/ovs_submodule.rst for further bumps? And if
> We should, you're right, I'll prepare a patch.
>
>> talking about documentation, I've got one note, which should be covered
>> by new process. Imagine situation, where quite old OVS branch (let's
>> say, 3.0) has a wanted commit (for example, fix for build with new
>> compiler or latest libs), but the new patch release is not created
>> because it is not a critical problem). I'd say we either need to request
>> OVS community to bump patch release or bump from release to commit sha.
>> What do you think here? Or, just leave it as is and decide how to bump
>> in flexible manner in each individual case?
>>
> I think this is not the common case so maybe we can leave it flexible
> for now.
>
> Regards,
> Dumitru
>
-- 
Regards,
Vladislav Odintsov

Re: [ovs-dev] [ovn-dev] Port mirroring filter support in ovn.

2024-10-15 Thread Vladislav Odintsov

 useful to have the 
ability to encapsulate this traffic in a VXLAN, GENEVE, GRE or ERSPAN 
tunnel (different network analysis solutions support these protocols 
for mirrored traffic; AWS does the same), set the outer src/dst IP and the 
correct tunnel key, and inject it into the LS pipeline, so that the 
encapsulated packet (in fact, double-encapsulated) can traverse the 
overlay to any point of the infrastructure, relying on overlay routing. 
This would let us send mirrored traffic to any destination inside the 
overlay topology: another subnet, another availability zone, or even 
outside of OVN but inside the same VRF.

However, we found that the OF encap() action supports only nsh and mpls. 
Could you advise whether it is possible to send double-encapsulated 
traffic with OVN somehow? Or should we extend the encap() functionality 
in this case?

Potentially, this double encapsulation can be reused for a feature similar 
to AWS Gateway Load Balancers [1].

1: https://aws.amazon.com/elasticloadbalancing/gateway-load-balancer/

>
>> since both options are currently not possible, I would greatly
>> appreciate any insights or advice you may have regarding these approaches.
>>
> Thanks,
> Dumitru
>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

-- 
Regards,
Vladislav Odintsov

Re: [ovs-dev] [PATCH ovn branch-24.09] ovs: Bump submodule to latest OVS branch-3.4.

2024-10-14 Thread Vladislav Odintsov



On 14.10.2024 16:13, Dumitru Ceara wrote:
> On 10/14/24 15:01, Vladislav Odintsov wrote:
>> On 14.10.2024 14:47, Dumitru Ceara wrote:
>>> On 10/13/24 10:19, Vladislav Odintsov wrote:
>>>> This picks up the following relevant OVS changes:
>>>> a15ce086d ofproto-dpif: Improve load balancing in dp_hash select 
>>>> groups.
>>>> 76ba41b5c vconn: Always properly free flow stats reply.
>>>> 64cb90507 ovsdb-idl: Fix IDL memory leak.
>>>> ... and others.
>>>>
>>>> Reported-at: 
>>>> https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
>>>> Signed-off-by: Vladislav Odintsov 
>>>> ---
>>> Hi Vladislav,
>>>
>>>>ovs | 2 +-
>>>>1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/ovs b/ovs
>>>> index c598c05c8..a15ce086d 160000
>>>> --- a/ovs
>>>> +++ b/ovs
>>>> @@ -1 +1 @@
>>>> -Subproject commit c598c05c85b2d38874a0ce8f7f088f6aae4fdabc
>>>> +Subproject commit a15ce086d41f9dfe6c1589333413b8e777401ef0
>>> This would make branch-24.09 use a newer submodule version than the OVN
>>> main branch does.
>>>
>>> I think we need this commit on main too, what do you think?
>> Hi Dumitru,
>>
>> Agree.
>>
>> The patch:
>> https://patchwork.ozlabs.org/project/ovn/patch/20241014125151.74094-1-vlodintsov@k2.cloud/
>>
> I was talking offline to Ilya about all these bumps and he made a good
> point: while the tip of OVS branch-3.y and the latest v3.y.z get almost
> the same amount of testing in the OVS repo it might be a bit better to
> use v3.y.z instead.  That's because likely external users of OVS run
> tagged OVS releases in production so those might get more external testing.
>
> I had a quick look at the main differences between choosing the tip of
> branch-3.y and v3.y.z on all branches and I think we'd only miss:
>
> 99e7cf9cce1c vconn: Always properly free flow stats reply.
> f59f19bf69a4 ovsdb-idl: Fix IDL memory leak.
>
> which might be OK.
>
> If you agree I can change your patches on all branches (no need to post
> new ones) and apply them.
>
> What do you think?

Well, I'm totally fine with this. Please feel free to modify my patches.

Also, shouldn't we update Documentation/internals/ovs_submodule.rst to 
reflect this approach for further bumps? Speaking of documentation, 
I have one note that the new process should cover. Imagine a situation 
where a fairly old OVS branch (let's say 3.0) has a wanted commit (for 
example, a fix for building with a new compiler or the latest libs), but 
no new patch release has been created because the problem is not 
critical. I'd say we either need to ask the OVS community to cut a new 
patch release, or bump from a release to a commit SHA. What do you 
think? Or should we just leave it as is and decide how to bump flexibly 
in each individual case?
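
To make the "tag versus branch tip" comparison above concrete, here is a 
minimal sketch of how one might list the commits a tagged release lacks 
relative to the branch tip. It only builds the `git log` command line; 
the `ovs` checkout path and the `origin` remote name are assumptions, 
and `v3.0.7` / `branch-3.0` are used purely as an example pair.

```python
import shlex


def missing_from_tag_cmd(tag, branch, repo="ovs"):
    """Build the git command that lists commits reachable from `branch`
    but not from `tag` (git's two-dot range syntax: tag..branch)."""
    # `repo` is the path to the OVS checkout (assumed to be ./ovs here).
    return ["git", "-C", repo, "log", "--oneline", f"{tag}..{branch}"]


# Example pair: the v3.0.7 tag versus the tip of branch-3.0 on an
# assumed remote called "origin".
cmd = missing_from_tag_cmd("v3.0.7", "origin/branch-3.0")
print(shlex.join(cmd))
```

Running the printed command inside an OVN tree with the submodule 
fetched would show what bumping to the tag leaves out.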

>
> Thanks,
> Dumitru
>
-- 
Regards,
Vladislav Odintsov

Re: [ovs-dev] [PATCH ovn branch-24.09] ovs: Bump submodule to latest OVS branch-3.4.

2024-10-14 Thread Vladislav Odintsov



On 14.10.2024 14:47, Dumitru Ceara wrote:
> On 10/13/24 10:19, Vladislav Odintsov wrote:
>> This picks up the following relevant OVS changes:
>>   a15ce086d ofproto-dpif: Improve load balancing in dp_hash select groups.
>>   76ba41b5c vconn: Always properly free flow stats reply.
>>   64cb90507 ovsdb-idl: Fix IDL memory leak.
>>   ... and others.
>>
>> Reported-at: 
>> https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
>> Signed-off-by: Vladislav Odintsov 
>> ---
> Hi Vladislav,
>
>>   ovs | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/ovs b/ovs
>> index c598c05c8..a15ce086d 160000
>> --- a/ovs
>> +++ b/ovs
>> @@ -1 +1 @@
>> -Subproject commit c598c05c85b2d38874a0ce8f7f088f6aae4fdabc
>> +Subproject commit a15ce086d41f9dfe6c1589333413b8e777401ef0
> This would make branch-24.09 use a newer submodule version than the OVN
> main branch does.
>
> I think we need this commit on main too, what do you think?

Hi Dumitru,

Agree.

The patch: 
https://patchwork.ozlabs.org/project/ovn/patch/20241014125151.74094-1-vlodintsov@k2.cloud/

>
> Thanks,
> Dumitru
>
-- 
Regards,
Vladislav Odintsov

[ovs-dev] [PATCH ovn] ovs: Bump submodule to latest OVS branch-3.4.

2024-10-14 Thread Vladislav Odintsov

This picks up the following relevant OVS changes:
  a15ce086d ofproto-dpif: Improve load balancing in dp_hash select groups.
  76ba41b5c vconn: Always properly free flow stats reply.
  64cb90507 ovsdb-idl: Fix IDL memory leak.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index c598c05c8..a15ce086d 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit c598c05c85b2d38874a0ce8f7f088f6aae4fdabc
+Subproject commit a15ce086d41f9dfe6c1589333413b8e777401ef0
-- 
2.46.2
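
For anyone double-checking a bump like the one above, here is a minimal 
sketch (not part of the patch) that parses a submodule-bump diff and 
confirms that the abbreviated hashes on the `index` line are prefixes of 
the full `Subproject commit` hashes. The function name is hypothetical; 
the sample diff is the one from this patch.

```python
import re

INDEX_RE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)", re.M)
OLD_RE = re.compile(r"^-Subproject commit ([0-9a-f]{40})$", re.M)
NEW_RE = re.compile(r"^\+Subproject commit ([0-9a-f]{40})$", re.M)


def parse_submodule_bump(patch):
    """Return (old_sha, new_sha) from a submodule-bump diff.

    Raises ValueError if the text is not such a diff or if the
    abbreviated hashes disagree with the full ones.
    """
    index = INDEX_RE.search(patch)
    old = OLD_RE.search(patch)
    new = NEW_RE.search(patch)
    if not (index and old and new):
        raise ValueError("not a submodule bump diff")
    old_sha, new_sha = old.group(1), new.group(1)
    if not old_sha.startswith(index.group(1)):
        raise ValueError("old hash does not match the index line")
    if not new_sha.startswith(index.group(2)):
        raise ValueError("new hash does not match the index line")
    return old_sha, new_sha


# 160000 is the file mode git uses for a submodule (gitlink) entry.
PATCH = """\
diff --git a/ovs b/ovs
index c598c05c8..a15ce086d 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit c598c05c85b2d38874a0ce8f7f088f6aae4fdabc
+Subproject commit a15ce086d41f9dfe6c1589333413b8e777401ef0
"""

old_sha, new_sha = parse_submodule_bump(PATCH)
print(old_sha[:9], "->", new_sha[:9])
```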

[ovs-dev] [PATCH ovn branch-22.12 v2] ovs: Bump submodule to latest OVS branch-3.1.

This picks up the following relevant OVS changes:
  a0af48b75 ofproto-dpif: Improve load balancing in dp_hash select groups.
  99e7cf9cc vconn: Always properly free flow stats reply.
  f59f19bf6 ovsdb-idl: Fix IDL memory leak.
  7694dfacb compiler: Fix errors in Clang 17 ubsan checks.
  faf175155 vlog: Destroy async_append first then close log_fd.
  483bc24e4 hash, jhash: Fix unaligned access to the hash remainder.
  bd5b5d3b3 ovs-atomic: Fix inclusion of Clang header by GCC 14.
  bb61b5fe8 vconn: Count vconn_sent regardless of log level.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
v2:
  There was an error: previous patch was bumped as branch-22.09 by mistake.
  New version bumps OVS submodule to latest branch-3.1 commit instead of
  branch-3.0.
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 8fd5f77cd..a0af48b75 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 8fd5f77cd84ea04667f987c7b84181604dc99f60
+Subproject commit a0af48b753ef3215091356f112bbb89737f286d9
-- 
2.46.2
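
Since the v1 of this patch mixed up the branches, it may help to write 
the pairing down explicitly. The table below is a sketch assembled only 
from the subjects of the patches in this series (it is not an official 
OVN/OVS compatibility matrix), and the helper name is hypothetical.

```python
# OVN stable branch -> OVS branch its submodule tracks, as used by the
# patches in this series.
OVS_BRANCH_FOR_OVN = {
    "branch-22.03": "branch-3.0",
    "branch-22.06": "branch-3.0",
    "branch-22.09": "branch-3.0",
    "branch-22.12": "branch-3.1",
    "branch-23.03": "branch-3.1",
    "branch-23.06": "branch-3.1",
    "branch-23.09": "branch-3.2",
    "branch-24.03": "branch-3.3",
    "branch-24.09": "branch-3.4",
}


def bump_subject(ovn_branch):
    """Build the patch subject used in this series for an OVN branch."""
    ovs_branch = OVS_BRANCH_FOR_OVN[ovn_branch]  # KeyError if unknown
    return (f"[PATCH ovn {ovn_branch}] "
            f"ovs: Bump submodule to latest OVS {ovs_branch}.")


print(bump_subject("branch-22.12"))
```

Looking the branch up in one place, rather than copying a subject from a 
neighbouring patch, is one way to avoid the mismatch that v2 corrects.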

[ovs-dev] [PATCH ovn branch-22.03] ovs: Bump submodule to latest OVS branch-3.0.

This picks up the following relevant OVS changes:
  876584141 vconn: Always properly free flow stats reply.
  2d60ee374 ovsdb-idl: Fix IDL memory leak.
  4198bcdfb compiler: Fix errors in Clang 17 ubsan checks.
  ec048ab62 vlog: Destroy async_append first then close log_fd.
  dbaf7271c hash, jhash: Fix unaligned access to the hash remainder.
  85285fb45 ovs-atomic: Fix inclusion of Clang header by GCC 14.
  e040f35b2 vconn: Count vconn_sent regardless of log level.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 94191b7a4..a9fb87867 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
+Subproject commit a9fb878679fb08bc3807cf455b0d3ce1797be26d
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-22.06] ovs: Bump submodule to latest OVS branch-3.0.

This picks up the following relevant OVS changes:
  876584141 vconn: Always properly free flow stats reply.
  2d60ee374 ovsdb-idl: Fix IDL memory leak.
  4198bcdfb compiler: Fix errors in Clang 17 ubsan checks.
  ec048ab62 vlog: Destroy async_append first then close log_fd.
  dbaf7271c hash, jhash: Fix unaligned access to the hash remainder.
  85285fb45 ovs-atomic: Fix inclusion of Clang header by GCC 14.
  e040f35b2 vconn: Count vconn_sent regardless of log level.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 94191b7a4..a9fb87867 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
+Subproject commit a9fb878679fb08bc3807cf455b0d3ce1797be26d
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-22.09] ovs: Bump submodule to latest OVS branch-3.0.

This picks up the following relevant OVS changes:
  876584141 vconn: Always properly free flow stats reply.
  2d60ee374 ovsdb-idl: Fix IDL memory leak.
  4198bcdfb compiler: Fix errors in Clang 17 ubsan checks.
  ec048ab62 vlog: Destroy async_append first then close log_fd.
  dbaf7271c hash, jhash: Fix unaligned access to the hash remainder.
  85285fb45 ovs-atomic: Fix inclusion of Clang header by GCC 14.
  e040f35b2 vconn: Count vconn_sent regardless of log level.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 94191b7a4..a9fb87867 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
+Subproject commit a9fb878679fb08bc3807cf455b0d3ce1797be26d
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-22.12] ovs: Bump submodule to latest OVS branch-3.0.

This picks up the following relevant OVS changes:
  876584141 vconn: Always properly free flow stats reply.
  2d60ee374 ovsdb-idl: Fix IDL memory leak.
  4198bcdfb compiler: Fix errors in Clang 17 ubsan checks.
  ec048ab62 vlog: Destroy async_append first then close log_fd.
  dbaf7271c hash, jhash: Fix unaligned access to the hash remainder.
  85285fb45 ovs-atomic: Fix inclusion of Clang header by GCC 14.
  e040f35b2 vconn: Count vconn_sent regardless of log level.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 94191b7a4..a9fb87867 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
+Subproject commit a9fb878679fb08bc3807cf455b0d3ce1797be26d
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-23.03] ovs: Bump submodule to latest OVS branch-3.1.

This picks up the following relevant OVS changes:
  a0af48b75 ofproto-dpif: Improve load balancing in dp_hash select groups.
  99e7cf9cc vconn: Always properly free flow stats reply.
  f59f19bf6 ovsdb-idl: Fix IDL memory leak.
  7694dfacb compiler: Fix errors in Clang 17 ubsan checks.
  faf175155 vlog: Destroy async_append first then close log_fd.
  483bc24e4 hash, jhash: Fix unaligned access to the hash remainder.
  bd5b5d3b3 ovs-atomic: Fix inclusion of Clang header by GCC 14.
  bb61b5fe8 vconn: Count vconn_sent regardless of log level.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 8fd5f77cd..a0af48b75 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 8fd5f77cd84ea04667f987c7b84181604dc99f60
+Subproject commit a0af48b753ef3215091356f112bbb89737f286d9
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-23.06] ovs: Bump submodule to latest OVS branch-3.1.

This picks up the following relevant OVS changes:
  a0af48b75 ofproto-dpif: Improve load balancing in dp_hash select groups.
  99e7cf9cc vconn: Always properly free flow stats reply.
  f59f19bf6 ovsdb-idl: Fix IDL memory leak.
  7694dfacb compiler: Fix errors in Clang 17 ubsan checks.
  faf175155 vlog: Destroy async_append first then close log_fd.
  483bc24e4 hash, jhash: Fix unaligned access to the hash remainder.
  bd5b5d3b3 ovs-atomic: Fix inclusion of Clang header by GCC 14.
  bb61b5fe8 vconn: Count vconn_sent regardless of log level.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 8fd5f77cd..a0af48b75 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 8fd5f77cd84ea04667f987c7b84181604dc99f60
+Subproject commit a0af48b753ef3215091356f112bbb89737f286d9
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-23.09] ovs: Bump submodule to branch-3.2.

From: Dumitru Ceara 

Specifically the following commit:
  4102674b3e ovsdb-idl: Preserve change_seqno when deleting rows.

Without it, in specific cases, the IDL might incorrectly report deletion
of yet to be seen records.

This commit differs from original by bumping OVS submodule to branch-3.2
related commit ec1d73016 ("ovsdb-idl: Preserve change_seqno when deleting
rows.")

Signed-off-by: Dumitru Ceara 
Acked-by: Ilya Maximets 
(cherry picked from commit 66ef6709678486f7abf88db10eed15fb72edcc4a)
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
The current revision differs from the original commit referenced above:
- the ovs submodule commit SHA is different
- the change at classifier_lookup() is discarded
- the commit message is adjusted

@Dumitru, @Ilya, I'm not sure how to correctly handle Signed-off-by and
Acked-by tags, or changes to commit content and message, when
cherry-picking; I just wanted to preserve credit. If it is not right to
keep the tags or to modify a backport, please let me know and I can
re-send v2 as a normal non-backport patch.
---
 controller/ofctrl.c | 2 +-
 ovs | 2 +-
 tests/test-ovn.c| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index 497890eed..718baac18 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -3059,7 +3059,7 @@ ofctrl_inject_pkt(const struct ovsrec_bridge *br_int, const char *flow_s,
 uint64_t packet_stub[128 / 8];
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
-flow_compose(&packet, &uflow, NULL, 64);
+flow_compose(&packet, &uflow, NULL, 64, false);
 
 uint64_t ofpacts_stub[1024 / 8];
 struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(ofpacts_stub);
diff --git a/ovs b/ovs
index c88a35fc2..c2f287013 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit c88a35fc29f0c0eb6189853bfc738c2100d4860f
+Subproject commit c2f287013025ad5b0c40e0c7fc3a9042d4899ce1
diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index 16d2d779d..6f38b1493 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -1238,7 +1238,7 @@ test_expr_to_packets(struct ovs_cmdl_context *ctx OVS_UNUSED)
 uint64_t packet_stub[128 / 8];
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
-flow_compose(&packet, &uflow, NULL, 64);
+flow_compose(&packet, &uflow, NULL, 64, false);
 
 struct ds output = DS_EMPTY_INITIALIZER;
 const uint8_t *buf = dp_packet_data(&packet);
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-24.03] ovs: Bump submodule to latest OVS branch-3.3.

This picks up the following relevant OVS changes:
  618944a79 ofproto-dpif: Improve load balancing in dp_hash select groups.
  bb49e027c vconn: Always properly free flow stats reply.
  58ff23947 ovsdb-idl: Fix IDL memory leak.
  f02dc3cfe vlog: Destroy async_append first then close log_fd.
  01eca18be hash, jhash: Fix unaligned access to the hash remainder.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index f19448b86..618944a79 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit f19448b8618967a108ec6f34713dd811ce1d1334
+Subproject commit 618944a79fec8e98d5880ca2bbb60304855d4437
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-24.09] ovs: Bump submodule to latest OVS branch-3.4.

This picks up the following relevant OVS changes:
  a15ce086d ofproto-dpif: Improve load balancing in dp_hash select groups.
  76ba41b5c vconn: Always properly free flow stats reply.
  64cb90507 ovsdb-idl: Fix IDL memory leak.
  ... and others.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417627.html
Signed-off-by: Vladislav Odintsov 
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index c598c05c8..a15ce086d 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit c598c05c85b2d38874a0ce8f7f088f6aae4fdabc
+Subproject commit a15ce086d41f9dfe6c1589333413b8e777401ef0
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-23.03] ovs: Bump submodule to tip of OVS branch-3.2.

2024-10-07 Thread Vladislav Odintsov

From: Dumitru Ceara 

This picks up the following relevant commit:
  cd8ffc956c3c ovs-atomic: Fix inclusion of Clang header by GCC 14.

Without this builds on Fedora 40 (rawhide) are broken due to failing to
compile the submodule.

Signed-off-by: Dumitru Ceara 
Acked-by: Numan Siddique 
Signed-off-by: Numan Siddique 
(cherry picked from commit f224c6e5f69c099ddb008f99dba2e19a902a612f)
Signed-off-by: Vladislav Odintsov 
---
Without this patch there are errors building OVN on modern systems.

I kindly request that this patch be backported down to 22.03 LTS,
including the already officially unsupported branches 23.03, 22.09 and
22.06, since we internally still need to base our development on the
22.09 branch.

Thanks in advance if it is possible to make an exception, ignore the
backport rules for non-LTS releases, and patch them too.
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 8fd5f77cd..49e64f13b 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 8fd5f77cd84ea04667f987c7b84181604dc99f60
+Subproject commit 49e64f13b2c965f5b53a65eeab70ac2e3f0bf69a
-- 
2.46.2

[ovs-dev] [PATCH ovn branch-23.09] ovs: Bump submodule to tip of OVS branch-3.2.

2024-10-07 Thread Vladislav Odintsov

From: Dumitru Ceara 

This picks up the following relevant commit:
  cd8ffc956c3c ovs-atomic: Fix inclusion of Clang header by GCC 14.

Without this builds on Fedora 40 (rawhide) are broken due to failing to
compile the submodule.

Signed-off-by: Dumitru Ceara 
Acked-by: Numan Siddique 
Signed-off-by: Numan Siddique 
(cherry picked from commit f224c6e5f69c099ddb008f99dba2e19a902a612f)
Signed-off-by: Vladislav Odintsov 
---
Without this patch there are errors building OVN on modern systems.

I kindly request that this patch be backported down to 22.03 LTS,
including the already officially unsupported branches 23.09, 23.03 and
22.09, since we internally still need to base our development on the
22.09 branch.

Thanks in advance if it is possible to make an exception, ignore the
backport rules for non-LTS releases, and patch them too.
---
 ovs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovs b/ovs
index 8fd5f77cd..49e64f13b 160000
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 8fd5f77cd84ea04667f987c7b84181604dc99f60
+Subproject commit 49e64f13b2c965f5b53a65eeab70ac2e3f0bf69a
-- 
2.46.2

Re: [ovs-dev] RFC OVN: fabric integration

2024-09-17 Thread Vladislav Odintsov

On 16.09.2024 17:41, Dumitru Ceara wrote:

On 9/12/24 19:45, Roberto Bartzen Acosta wrote:

 >          - The result of this synchronization is basically:
 >          - SB->NB: creating/deleting/updating
 Logical_Router_Static_Route
 >          entries for learned routes in the Routing_Information_Base
 > table (using the
 >          key).

 Why would you see the need to push the learned routes to the northbound
 database? I would see this as just creating chaos in the CMS.
 I would rather keep them only in the SB and let northd do the merging.

This is not necessary at all! It would not be necessary to sync a new
table with NB since the learned routes can be redistributed as static
routes via OVN-IC, for example. The solution as a whole needs to be
generic enough that we can redistribute/learn static routes in addition
to directly connected routes.
So, as long as northd merges these routes, it should work perfectly
fine! Of course we need to be careful with the route policies that can
block some prefix.

While I agree that it's probably not desirable to write these dynamic
routes in the NB I think it would be useful to have a way to dump both
"static" and "dynamic" routing table contents from a single place.  It
would make debugging easier.

I totally agree: it is highly desirable for the CMS to be able to dump
BGP-learnt routes from OVN, since OVN is the only source of this type of
data. This is useful not only for debugging, but also for returning these
routes to end users through the CMS API.

NB is a good place to reflect useful information back to the CMS. It
already has some read-only data populated by ovn-northd
(NB_Global.options.max_tunid, Logical_Switch_Port.options.tag,
Logical_Router_Static_Route.external_ids.ic-learned-route and others).

It seems to me that the recommended way to "gather" such information
into the CMS is through the NB database. There is currently one example
of an anti-pattern, though: the Service_Monitor SB table, which we use to
dump the state (online/offline) of Load Balancers' backends. Giving the
CMS direct access to the SB is not a clean solution, since the SB is not
a config plane but OVN internals.

E.g.:

ovn-nbctl --all lr-route-list 
[dumps all static routes and also learnt (received) routes]

However, that means we either need ovn-nbctl to connect to the SB or we
propagate the information to the NB for nbctl to read.

Making the listing of BGP routes optional can give the administrator a
misleading view of routing. The static and ic-learnt routes alone are
insufficient to understand the real routing table if BGP routes are in
the game.

Also, giving ovn-nbctl access to the SB seems to be a serious violation
of the NB/SB split concept, so it looks like ovn-northd needs to sync
this information back to the NB.

Regards,
Dumitru


[ovs-dev] [PATCH ovn] news: Fix indentation for an entry.

2024-09-02 Thread Vladislav Odintsov

Signed-off-by: Vladislav Odintsov 
---
 NEWS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 318f63195..e0a48aaa7 100644
--- a/NEWS
+++ b/NEWS
@@ -59,7 +59,7 @@ OVN v24.09.0 - xx xxx 
   - The NB_Global.debug_drop_domain_id configured value is now overridden by
 the ID associated with the Sampling_App record created for drop sampling
 (Sampling_App.type configured as "drop").
-- Add support for ACL sampling through the new Sample_Collector and Sample
+  - Add support for ACL sampling through the new Sample_Collector and Sample
 tables.  Sampling is supported for both traffic that creates new
 connections and for traffic that is part of an existing connection.
   - Add "external_ids:ovn-encap-ip-default" config for ovn-controller to
-- 
2.45.2


[ovs-dev] [PATCH ovn] news: Fix indentation for an entry.

2024-09-02 Thread Vladislav Odintsov

Signed-off-by: Vladislav Odintsov 
---
 NEWS | 2 +-
 ovs  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 318f63195..e0a48aaa7 100644
--- a/NEWS
+++ b/NEWS
@@ -59,7 +59,7 @@ OVN v24.09.0 - xx xxx 
   - The NB_Global.debug_drop_domain_id configured value is now overridden by
 the ID associated with the Sampling_App record created for drop sampling
 (Sampling_App.type configured as "drop").
-- Add support for ACL sampling through the new Sample_Collector and Sample
+  - Add support for ACL sampling through the new Sample_Collector and Sample
 tables.  Sampling is supported for both traffic that creates new
 connections and for traffic that is part of an existing connection.
   - Add "external_ids:ovn-encap-ip-default" config for ovn-controller to
diff --git a/ovs b/ovs
index 0aa14d912..bf1b16364 16
--- a/ovs
+++ b/ovs
@@ -1 +1 @@
-Subproject commit 0aa14d912d9a29d07ebc727007a1f21e3639eea5
+Subproject commit bf1b16364b3f01b0ff5f2f6e76842e666226a17b
-- 
2.45.2


Re: [ovs-dev] [PATCH ovn v3] northd: Support routing over other address families.

2024-08-26 Thread Vladislav Odintsov

Hi Frode,

regards,
Vladislav Odintsov

> On 26 Aug 2024, at 09:22, Frode Nordahl  wrote:
> 
> On Sun, Aug 25, 2024 at 8:43 AM Vladislav Odintsov  wrote:
>> 
>> Hi Felix,
>> 
>> I’m wondering which problem you want to solve with this change. While this
>> is definitely a useful feature in the physical world, where network
>> switches can use EUI-64 link-local addresses, how do you plan to use it with
>> OVN?
>> Do you plan to implement auto-generated unique LRP MAC addresses and
>> EUI-64 IPv6 LLAs, so that this patch can take the IPAM complexity of
>> allocating addresses for peering networks away from the CMS? Or are you
>> combining it somehow with the Logical_Router_Port.options.prefix feature?
>> If not, why IPv6 rather than just IPv4 LLAs?
>>
>> So, could you please explain the entire use case in more detail?
> 
> FWIW; just wanted to chime in with our interest / support for this
> being a useful addition.
> 
> We plan to use it together with the stream of work coming out of the
> OVN fabric integration thread [0] this coming cycle. The use case is
> to form relationships with the physical ToR switches, which as you
> point out typically use IPv6 LLA to implement BGP "unnumbered"
> functionality.
> 
> IPv6 LLAs are there by default today in OVN and in the ToRs, so I'd
> flip the question on the head and ask why would you want to use IPv4
> LLAs?

Oh, that was the missing piece of the puzzle! Please forgive my ignorance,
I didn’t know this feature was already implemented.
Now the picture is totally clear.

Thanks!

Though the CMS still has to ensure that unique MAC addresses are configured,
right? Shouldn’t we go ahead and add support for generating them?
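
To illustrate why unique MAC addresses matter here, the following is a minimal standalone sketch (not OVN code) of the standard modified EUI-64 derivation from RFC 4291, Appendix A, which turns a MAC address into an fe80::/64 link-local address. The helper names `mac_to_lla` and `lla_to_string` are hypothetical and only serve the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build the fe80::/64 link-local address for a
 * 48-bit MAC using the modified EUI-64 interface identifier
 * (RFC 4291, Appendix A). */
static void
mac_to_lla(const uint8_t mac[6], struct in6_addr *lla)
{
    memset(lla, 0, sizeof *lla);
    lla->s6_addr[0] = 0xfe;
    lla->s6_addr[1] = 0x80;
    lla->s6_addr[8] = mac[0] ^ 0x02;    /* Flip the universal/local bit. */
    lla->s6_addr[9] = mac[1];
    lla->s6_addr[10] = mac[2];
    lla->s6_addr[11] = 0xff;            /* Insert ff:fe in the middle. */
    lla->s6_addr[12] = 0xfe;
    lla->s6_addr[13] = mac[3];
    lla->s6_addr[14] = mac[4];
    lla->s6_addr[15] = mac[5];
}

/* Hypothetical helper: format the address as text.  Returns 'buf' on
 * success or NULL on failure, exactly as inet_ntop() does. */
static const char *
lla_to_string(const struct in6_addr *lla, char *buf, size_t len)
{
    return inet_ntop(AF_INET6, lla, buf, (socklen_t) len);
}
```

Because the interface identifier is a pure function of the MAC, two router ports on the same link that share a MAC would also share a link-local address, which is why uniqueness has to be guaranteed somewhere, either by the CMS or by a generator.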

> 
> 0: https://mail.openvswitch.org/pipermail/ovs-dev/2024-August/416296.html
> 
> --
> Frode Nordahl
> 
>> regards,
>> Vladislav Odintsov
>> 
>>>> On 22 Apr 2024, at 18:46, Felix Huettner via dev  
>>>> wrote:
>>> In most cases IPv4 packets are routed only over other IPv4 networks and
>>> IPv6 packets are routed only over IPv6 networks. However there is no
>>> inherent reason for this limitation. Routing IPv4 packets over IPv6
>>> networks just requires the router to contain a route for an IPv4 network
>>> with an IPv6 nexthop.
>>> 
>>> This was previously prevented in OVN in ovn-nbctl and northd. By
>>> removing these filters the forwarding will work if the mac addresses are
>>> prepopulated.
>>> 
>>> If the mac addresses are not prepopulated we will attempt to resolve them 
>>> using
>>> the original address family of the packet and not the address family of the
>>> nexthop. This will fail and we will not forward the packet.
>>> 
>>> This feature can for example be used by service providers to
>>> interconnect multiple IPv4 networks of a customer without needing to
>>> negotiate free IPv4 addresses by just using any IPv6 address.
>>> 
>>> Signed-off-by: Felix Huettner 
>>> ---
>>> v2->v3: fix uninitialized variable
>>> v1->v2:
>>> - move ipv4 info to parsed_route
>>> - add tests for lr-route-add
>>> - switch tests to use fmt_pkt
>>> - some minor test cleanups
>>> NEWS  |   4 +
>>> northd/northd.c   |  57 ++---
>>> tests/ovn-nbctl.at|  26 ++-
>>> tests/ovn.at  | 511 ++
>>> utilities/ovn-nbctl.c |  12 +-
>>> 5 files changed, 571 insertions(+), 39 deletions(-)
>>> 
>>> diff --git a/NEWS b/NEWS
>>> index 141f1831c..14a935c86 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -13,6 +13,10 @@ Post v24.03.0
>>>"lflow-stage-to-oftable STAGE_NAME" that converts stage name into 
>>> OpenFlow
>>>table id.
>>>  - Rename the ovs-sandbox script to ovn-sandbox.
>>> +  - Allow Static Routes where the address families of ip_prefix and nexthop
>>> +diverge (e.g. IPv4 packets over IPv6 links). This is currently limited 
>>> to
>>> +nexthops that have their mac addresses prepopulated (so
>>> +dynamic_neigh_routers must be false).
>>> 
>>> OVN v24.03.0 - 01 Mar 2024
>>> --
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index 331d9c267..f2357af15 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>> @@ -10194,6 +10194,8 @@ struct parsed_route {
>>>const struct nbrec_logical_router_static_route *rou

Re: [ovs-dev] [PATCH ovn v3] northd: Support routing over other address families.

2024-08-24 Thread Vladislav Odintsov

Hi Felix,

I’m wondering which problem you want to solve with this change. While this
is definitely a useful feature in the physical world, where network switches
can use EUI-64 link-local addresses, how do you plan to use it with OVN?
Do you plan to implement auto-generated unique LRP MAC addresses and
EUI-64 IPv6 LLAs, so that this patch can take the IPAM complexity of
allocating addresses for peering networks away from the CMS? Or are you
combining it somehow with the Logical_Router_Port.options.prefix feature?
If not, why IPv6 rather than just IPv4 LLAs?

So, could you please explain the entire use case in more detail?
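
For context, the patch quoted below keeps both the route prefix and the nexthop as `struct in6_addr` and uses `IN6_IS_ADDR_V4MAPPED()` to tell the address families apart. A minimal standalone sketch of that idea follows. It is not the northd code: the helpers `parse_ip46` and `route_crosses_families` are hypothetical, and the prefix is passed without a mask for simplicity.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse an IPv4 or IPv6 literal into an in6_addr.
 * IPv4 addresses are stored in IPv4-mapped form (::ffff:a.b.c.d).
 * Returns 0 on success, -1 if 's' is not a valid address. */
static int
parse_ip46(const char *s, struct in6_addr *out)
{
    struct in_addr v4;

    if (inet_pton(AF_INET, s, &v4) == 1) {
        memset(out, 0, sizeof *out);
        out->s6_addr[10] = 0xff;
        out->s6_addr[11] = 0xff;
        memcpy(&out->s6_addr[12], &v4, sizeof v4);
        return 0;
    }
    return inet_pton(AF_INET6, s, out) == 1 ? 0 : -1;
}

/* Hypothetical helper: returns 1 if 'prefix' and 'nexthop' belong to
 * different address families (e.g. an IPv4 prefix with an IPv6 nexthop),
 * 0 if they share a family, and -1 if either one fails to parse. */
static int
route_crosses_families(const char *prefix, const char *nexthop)
{
    struct in6_addr p, n;

    if (parse_ip46(prefix, &p) || parse_ip46(nexthop, &n)) {
        return -1;
    }
    return !IN6_IS_ADDR_V4MAPPED(&p) != !IN6_IS_ADDR_V4MAPPED(&n);
}
```

Under these assumptions, `route_crosses_families("192.0.2.0", "fe80::1")` reports a cross-family route, which is the case the old check rejected and the patch now allows.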

regards,
Vladislav Odintsov

> On 22 Apr 2024, at 18:46, Felix Huettner via dev  
> wrote:
> In most cases IPv4 packets are routed only over other IPv4 networks and
> IPv6 packets are routed only over IPv6 networks. However there is no
> inherent reason for this limitation. Routing IPv4 packets over IPv6
> networks just requires the router to contain a route for an IPv4 network
> with an IPv6 nexthop.
> 
> This was previously prevented in OVN in ovn-nbctl and northd. By
> removing these filters the forwarding will work if the mac addresses are
> prepopulated.
> 
> If the mac addresses are not prepopulated we will attempt to resolve them 
> using
> the original address family of the packet and not the address family of the
> nexthop. This will fail and we will not forward the packet.
> 
> This feature can for example be used by service providers to
> interconnect multiple IPv4 networks of a customer without needing to
> negotiate free IPv4 addresses by just using any IPv6 address.
> 
> Signed-off-by: Felix Huettner 
> ---
> v2->v3: fix uninitialized variable
> v1->v2:
>  - move ipv4 info to parsed_route
>  - add tests for lr-route-add
>  - switch tests to use fmt_pkt
>  - some minor test cleanups
> NEWS  |   4 +
> northd/northd.c   |  57 ++---
> tests/ovn-nbctl.at|  26 ++-
> tests/ovn.at  | 511 ++
> utilities/ovn-nbctl.c |  12 +-
> 5 files changed, 571 insertions(+), 39 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 141f1831c..14a935c86 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -13,6 +13,10 @@ Post v24.03.0
> "lflow-stage-to-oftable STAGE_NAME" that converts stage name into OpenFlow
> table id.
>   - Rename the ovs-sandbox script to ovn-sandbox.
> +  - Allow Static Routes where the address families of ip_prefix and nexthop
> +diverge (e.g. IPv4 packets over IPv6 links). This is currently limited to
> +nexthops that have their mac addresses prepopulated (so
> +dynamic_neigh_routers must be false).
> 
> OVN v24.03.0 - 01 Mar 2024
> --
> diff --git a/northd/northd.c b/northd/northd.c
> index 331d9c267..f2357af15 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -10194,6 +10194,8 @@ struct parsed_route {
> const struct nbrec_logical_router_static_route *route;
> bool ecmp_symmetric_reply;
> bool is_discard_route;
> +bool is_ipv4_prefix;
> +bool is_ipv4_nexthop;
> };
> 
> static uint32_t
> @@ -10219,6 +10221,8 @@ parsed_routes_add(struct ovn_datapath *od, const 
> struct hmap *lr_ports,
> /* Verify that the next hop is an IP address with an all-ones mask. */
> struct in6_addr nexthop;
> unsigned int plen;
> +bool is_ipv4_nexthop = true;
> +bool is_ipv4_prefix;
> bool is_discard_route = !strcmp(route->nexthop, "discard");
> bool valid_nexthop = route->nexthop[0] && !is_discard_route;
> if (valid_nexthop) {
> @@ -10237,6 +10241,7 @@ parsed_routes_add(struct ovn_datapath *od, const 
> struct hmap *lr_ports,
>  UUID_ARGS(&route->header_.uuid));
> return NULL;
> }
> +is_ipv4_nexthop = IN6_IS_ADDR_V4MAPPED(&nexthop);
> }
> 
> /* Parse ip_prefix */
> @@ -10248,18 +10253,7 @@ parsed_routes_add(struct ovn_datapath *od, const 
> struct hmap *lr_ports,
>  UUID_ARGS(&route->header_.uuid));
> return NULL;
> }
> -
> -/* Verify that ip_prefix and nexthop have same address familiy. */
> -if (valid_nexthop) {
> -if (IN6_IS_ADDR_V4MAPPED(&prefix) != IN6_IS_ADDR_V4MAPPED(&nexthop)) 
> {
> -static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> -VLOG_WARN_RL(&rl, "Address family doesn't match between 
> 'ip_prefix'"
> - " %s and 'nexthop' %s in static route "UUID_FMT,
> - route->ip_prefix, route->nexthop,
> -

Re: [ovs-dev] [Patch ovn v9] northd: Routing protocol port redirection.

2024-08-13 Thread Vladislav Odintsov


Thank you Martin and Dumitru for your work on this!

On 13.08.2024 22:46, Dumitru Ceara wrote:

On 8/12/24 19:30, Vladislav Odintsov wrote:

On 13.08.2024 00:14, Dumitru Ceara wrote:

On 8/12/24 18:21, Martin Kalcok wrote:

This change adds two new LRP options:
   * routing-protocol-redirect
   * routing-protocols

These allow redirection of routing protocol traffic to
a Logical Switch Port. This enables external routing daemons
to listen on an interface bound to an LSP and effectively act
as if they were listening on (and speaking from) LRP's IP address.

Option 'routing-protocols' takes a comma-separated list of routing
protocols whose traffic should be redirected. Currently supported
are BGP (tcp 179) and BFD (udp 3784).

Option 'routing-protocol-redirect' expects a string with an LSP
name.

When both of these options are set, any traffic entering LS
that's destined for LRP's IP addresses (including IPv6 LLA) and
routing protocol's port number, is redirected to the LSP specified
in the 'routing-protocol-redirect' value.

NOTE: this feature is experimental and may be subject to
removal/change in the future.
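
As a rough illustration of the description above, the sketch below builds the kind of logical flow match string that the patch later in this message formats with "ip%d.dst == %s && %s.dst == %d" (and the same with ".src" for replies). It is a standalone approximation using snprintf() rather than the OVS dynamic-string helpers, and the function name `format_redirect_match` and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: format the match for traffic destined to the LRP
 * address 's_addr' on a routing protocol port.  With 'match_src_port'
 * set, the match is on the source port instead, which covers replies to
 * connections that the routing daemon itself initiated.
 * Returns the value returned by snprintf(). */
static int
format_redirect_match(char *buf, size_t len, const char *s_addr,
                      const char *proto, int port, bool is_ipv6,
                      bool match_src_port)
{
    return snprintf(buf, len, "ip%d.dst == %s && %s.%s == %d",
                    is_ipv6 ? 6 : 4, s_addr, proto,
                    match_src_port ? "src" : "dst", port);
}
```

For BGP on an IPv4 LRP address this yields a match such as "ip4.dst == 169.254.252.1 && tcp.dst == 179"; for BFD the protocol is "udp" and the port is 3784.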

Signed-off-by: Martin Kalcok
---

   v9 contains small changes based on the review of v8:
   * Simplified search for the port specified in 'routing-protocol-redirect',
     using 'ovn_port_find'
     * As a result of this change, a new possible warning was added when LRP
       is not connected to the same LS as LSP specified in
       'routing-protocol-redirect'.
   * Datapath test for this feature now includes verification of BFD's
     UDP traffic.
     * These tests required some more care as Ncat produced false positive
       results even when sending to a port where nothing was listening. My
       assumption is that Ncat tries to assert success of a UDP connection
       based on lack of ICMP Port Unreachable message, and LR probably
       does not generate these?
   * nit: null pointer checks changed from 'if (p == NULL)' to 'if (!p)'
     for consistency.
   

This version looks good to me!

Acked-by: Dumitru Ceara

Vladislav, do you happen to have some time to try this version out on
your end too?

Yes, just now finished testing.

With centralized routing, BGP and BFD work well (in my setup there are
two VRFs with BGP and BFD peerings configured as a hairpin inside one
node, peering with each other):

# sh run
<...snip...>
router bgp 64512 vrf dxvif-62C25580
  bgp router-id 169.254.252.1
  no bgp ebgp-requires-policy
  no bgp network import-check
  neighbor 169.254.252.2 remote-as 64513
  neighbor 169.254.252.2 bfd
exit
!
router bgp 64513 vrf dxvif-9ED34880
  bgp router-id 169.254.252.2
  no bgp ebgp-requires-policy
  no bgp network import-check
  neighbor 169.254.252.1 remote-as 64512
  neighbor 169.254.252.1 bfd
  !
  address-family ipv4 unicast
   network 10.0.0.0/24
  exit-address-family
exit

# sh ip bgp vrf all

Instance dxvif-62C25580:
BGP table version is 1, local router ID is 169.254.252.1, vrf id 760
Default local pref 100, local AS 64512
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.0.0/24      169.254.252.2            0             0 64513 i

Displayed  1 routes and 1 total paths

Instance dxvif-9ED34880:
BGP table version is 1, local router ID is 169.254.252.2, vrf id 763
Default local pref 100, local AS 64513
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.0.0/24      0.0.0.0                  0         32768 i

Displayed  1 routes and 1 total paths

# sh bfd peers
BFD Peers:
     peer 169.254.252.1 local-address 169.254.252.2 vrf dxvif-9ED34880
interface dx-9ED34880-v
     ID: 3199966869
     Remote ID: 2401437121
     Active mode
     Status: up
     Uptime: 26 second(s)
     Diagnostics: ok
     Remote diagnostics: ok
     Peer Type: dynamic
     RTT min/avg/max: 0/0/0 usec
     Local timers:
     Detect-multiplier: 3
     Receive interval: 300ms
     Transmission interval: 300ms
     Echo receive interval: 50ms
     Echo transmission interval: disabled
     Remote timers:
     Detect-multiplier: 3
     Receive interval: 300ms
     Transmission interval: 300ms
     Echo rec

Re: [ovs-dev] [Patch ovn v9] northd: Routing protocol port redirection.

  Status: up
    Uptime: 26 second(s)
    Diagnostics: ok
    Remote diagnostics: ok
    Peer Type: dynamic
    RTT min/avg/max: 0/0/0 usec
    Local timers:
    Detect-multiplier: 3
    Receive interval: 300ms
    Transmission interval: 300ms
    Echo receive interval: 50ms
    Echo transmission interval: disabled
    Remote timers:
    Detect-multiplier: 3
    Receive interval: 300ms
    Transmission interval: 300ms
    Echo receive interval: 50ms

At this point this looks good, with one note: for LB VIPs and NAT
addresses we use an ha-chassis-group with primary/secondary chassis,
which is currently not supported by the redirect feature.

This should probably be addressed in future development.

Tested-by: Vladislav Odintsov 



Mark, Numan, Han, as discussed before branching, I think it's fine to
include this experimental feature on branch 24.09 (and in v24.09.0) too.

Do you guys agree?

Thanks,
Dumitru


  northd/northd.c         | 231
  northd/northd.h         |   7
  northd/ovn-northd.8.xml |  54
  ovn-nb.xml              |  42
  tests/ovn-northd.at     |  93
  tests/system-ovn.at     | 149
  6 files changed, 576 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 5ad30d854..8a240d93d 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -14002,6 +14002,234 @@ build_arp_resolve_flows_for_lrp(struct ovn_port *op,
  }
  }
  
+static void
+build_routing_protocols_redirect_rule__(
+const char *s_addr, const char *redirect_port_name, int protocol_port,
+const char *proto, bool is_ipv6, struct ovn_port *ls_peer,
+struct lflow_table *lflows, struct ds *match, struct ds *actions,
+struct lflow_ref *lflow_ref)
+{
+int ip_ver = is_ipv6 ? 6 : 4;
+ds_clear(actions);
+ds_put_format(actions, "outport = \"%s\"; output;", redirect_port_name);
+
+/* Redirect packets in the input pipeline destined for LR's IP
+ * and the routing protocol's port to the LSP specified in
+ * 'routing-protocol-redirect' option.*/
+ds_clear(match);
+ds_put_format(match, "ip%d.dst == %s && %s.dst == %d", ip_ver, s_addr,
+  proto, protocol_port);
+ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+  ds_cstr(match),
+  ds_cstr(actions),
+  lflow_ref);
+
+/* To accomodate "peer" nature of the routing daemons, redirect also
+ * replies to the daemons' client requests. */
+ds_clear(match);
+ds_put_format(match, "ip%d.dst == %s && %s.src == %d", ip_ver, s_addr,
+  proto, protocol_port);
+ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+  ds_cstr(match),
+  ds_cstr(actions),
+  lflow_ref);
+}
+
+static void
+apply_routing_protocols_redirect__(
+const char *s_addr, const char *redirect_port_name, int protocol_flags,
+bool is_ipv6, struct ovn_port *ls_peer, struct lflow_table *lflows,
+struct ds *match, struct ds *actions, struct lflow_ref *lflow_ref)
+{
+if (protocol_flags & REDIRECT_BGP) {
+build_routing_protocols_redirect_rule__(s_addr, redirect_port_name,
+179, "tcp", is_ipv6, ls_peer,
+lflows, match, actions,
+lflow_ref);
+}
+
+if (protocol_flags & REDIRECT_BFD) {
+build_routing_protocols_redirect_rule__(s_addr, redirect_port_name,
+3784, "udp", is_ipv6, ls_peer,
+lflows, match, actions,
+lflow_ref);
+}
+
+/* Because the redirected port shares IP and MAC addresses with the LRP,
+ * special consideration needs to be given to the signaling protocols. */
+ds_clear(actions);
+ds_put_format(actions,
+ "clone { outport = \"%s\"; output; }; "
+ "outport = %s; output;",
+  redirect_port_name, ls_peer->json_key);
+if (is_ipv6) {
+/* Ensure that redirect port receives copy of NA messages destined to
+ * its IP.*/
+ds_clear(match);
+ds_put_format(match, "ip6.dst == %s && nd_na", s_addr);
+ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+  ds_cstr(match),
+  ds_cstr(actions),
+  lflow_ref);
+} else {
+/* Ensure that redirect port receives copy of ARP replies destined to

Re: [ovs-dev] [ovn] OF connection dropped when OVS port mac is changes


Hi Ilya, thanks for the quick response!

On 12.08.2024 23:42, Ilya Maximets wrote:

On 8/12/24 18:33, Ilya Maximets wrote:

On 8/12/24 18:24, Vladislav Odintsov wrote:

Hi,

I've run into a behavior of OVS/OVN (I couldn't work out which part is
to blame) where the OpenFlow connection is lost if I change the MAC
address of a port added to br-int, which ovn-controller has claimed as a
normal VIF port. Here is a simple reproducer script:

ovn-nbctl ls-add test
ovn-nbctl lsp-add test test1
ip li add test1 type dummy
ovs-vsctl add-port br-int test1
ip li set test1 add 00:00:00:00:00:01

When the last line is executed, I see in ovn-controller.log that the
OpenFlow connection is re-established:

2024-08-12T15:42:02.493Z|03705|rconn(ovn_pinctrl0)|INFO|unix:/run/openvswitch/br-int.mgmt: connection closed by peer
2024-08-12T15:42:02.494Z|00222|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connection closed by peer
2024-08-12T15:42:02.494Z|00223|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connection closed by peer
2024-08-12T15:42:03.495Z|00224|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connecting...
2024-08-12T15:42:03.495Z|00225|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connecting...
2024-08-12T15:42:03.495Z|03706|rconn(ovn_pinctrl0)|INFO|unix:/run/openvswitch/br-int.mgmt: connecting...
2024-08-12T15:42:03.495Z|00226|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connected

With DBG logging enabled, ovs-vswitchd logs the following:

2024-08-12T16:01:08.694Z|3796712|poll_loop|DBG|wakeup due to [POLLIN] on
fd 18 (NETLINK_ROUTE<->NETLINK_ROUTE) at lib/netlink-socket.c:1409 (0%
CPU usage)
2024-08-12T16:01:08.695Z|3796713|poll_loop|DBG|wakeup due to [POLLIN] on
fd 16 (NETLINK_ROUTE<->NETLINK_ROUTE) at lib/netlink-socket.c:1409 (0%
CPU usage)
2024-08-12T16:01:08.695Z|3796714|netlink_socket|DBG|Dropped 505 log
messages in last 1 seconds (most recently, 0 seconds ago) due to
excessive rate
2024-08-12T16:01:08.695Z|3796715|netlink_socket|DBG|nl_sock_recv__
(Success): nl(len:1360, type=16(family-defined), flags=0, seq=0, pid=0
2024-08-12T16:01:08.695Z|3796716|dpif|DBG|Dropped 15 log messages in
last 0 seconds (most recently, 0 seconds ago) due to excessive rate
2024-08-12T16:01:08.695Z|3796717|dpif|DBG|system@ovs-system: device
internet is on port 1
2024-08-12T16:01:08.695Z|3796718|dpif|DBG|system@ovs-system: device
br-ext is on port 2
2024-08-12T16:01:08.695Z|3796719|jsonrpc|DBG|unix:/var/run/openvswitch/db.sock:
send request, method="transact",
params=["Open_vSwitch",{"where":[["_uuid","==",["uuid","acae6b73-de5f-46e4-a3d9-7874efd43cb4"]]],"row":{"mac_in_use":"00:00:00:00:00:02"},"op":"update","table":"Interface"},{"lock":"ovs_vswitchd","op":"assert"}],
id=4284801
2024-08-12T16:01:08.699Z|3796720|poll_loop|DBG|wakeup due to [POLLIN] on
fd 37 (FIFO pipe:[103980657]) at vswitchd/bridge.c:421 (0% CPU usage)
2024-08-12T16:01:08.699Z|3796721|poll_loop|DBG|wakeup due to [POLLIN] on
fd 17 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (0% CPU
usage)
2024-08-12T16:01:08.699Z|00690|poll_loop(urcu5)|DBG|wakeup due to
[POLLIN] on fd 56 (FIFO pipe:[103975681]) at lib/ovs-rcu.c:363
2024-08-12T16:01:08.699Z|3796722|jsonrpc|DBG|unix:/var/run/openvswitch/db.sock:
received notification, method="update3",
params=[["monid","Open_vSwitch"],"----",{"Interface":{"acae6b73-de5f-46e4-a3d9-7874efd43cb4":{"modify":{"mac_in_use":"00:00:00:00:00:02"]
2024-08-12T16:01:08.699Z|3796723|jsonrpc|DBG|unix:/var/run/openvswitch/db.sock:
received reply, result=[{"count":1},{}], id=4284801
2024-08-12T16:01:08.699Z|3796724|vconn|DBG|unix#121: sent (Success):
OFPT_PORT_STATUS (OF1.5) (xid=0x0): MOD: 15(test1): addr:00:00:00:00:00:02
   config: PORT_DOWN
   state:  LINK_DOWN
   speed: 0 Mbps now, 0 Mbps max
2024-08-12T16:01:08.699Z|3796725|bridge|INFO|bridge br-int: using
datapath ID 0002
2024-08-12T16:01:08.699Z|3796726|rconn|INFO|br-int<->unix#121: disconnecting
2024-08-12T16:01:08.699Z|3796727|rconn|DBG|br-int<->unix#121: entering
DISCONNECTED
2024-08-12T16:01:08.699Z|3796728|rconn|INFO|br-int<->unix#122: disconnecting
2024-08-12T16:01:08.699Z|3796729|rconn|DBG|br-int<->unix#122: entering
DISCONNECTED
2024-08-12T16:01:08.699Z|3796730|rconn|INFO|br-int<->unix#123: disconnecting
2024-08-12T16:01:08.699Z|3796731|rconn|DBG|br-int<->unix#123: entering
DISCONNECTED
2024-08-12T16:01:08.699Z|3796732|jsonrpc|DBG|unix:/var/run/openvswitch/db.sock:
send request, method="transact",
params=["Open_vSwitch",{"where":[["_uuid","==",["uuid","5c37aa7c-9394-4068-8a4d-ab23705036d7"]]],"row":{"datapath_id

[ovs-dev] [ovn] OF connection dropped when OVS port mac is changes


Hi,

I've run into a behavior of OVS/OVN (I couldn't work out which part is
to blame) where the OpenFlow connection is lost if I change the MAC
address of a port added to br-int, which ovn-controller has claimed as a
normal VIF port. Here is a simple reproducer script:


ovn-nbctl ls-add test
ovn-nbctl lsp-add test test1
ip li add test1 type dummy
ovs-vsctl add-port br-int test1
ip li set test1 add 00:00:00:00:00:01

When the last line is executed, I see in ovn-controller.log that the
OpenFlow connection is re-established:


2024-08-12T15:42:02.493Z|03705|rconn(ovn_pinctrl0)|INFO|unix:/run/openvswitch/br-int.mgmt: connection closed by peer
2024-08-12T15:42:02.494Z|00222|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connection closed by peer
2024-08-12T15:42:02.494Z|00223|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connection closed by peer
2024-08-12T15:42:03.495Z|00224|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connecting...
2024-08-12T15:42:03.495Z|00225|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connecting...
2024-08-12T15:42:03.495Z|03706|rconn(ovn_pinctrl0)|INFO|unix:/run/openvswitch/br-int.mgmt: connecting...
2024-08-12T15:42:03.495Z|00226|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connected


With DBG logging enabled, ovs-vswitchd logs the following:

2024-08-12T16:01:08.694Z|3796712|poll_loop|DBG|wakeup due to [POLLIN] on 
fd 18 (NETLINK_ROUTE<->NETLINK_ROUTE) at lib/netlink-socket.c:1409 (0% 
CPU usage)
2024-08-12T16:01:08.695Z|3796713|poll_loop|DBG|wakeup due to [POLLIN] on 
fd 16 (NETLINK_ROUTE<->NETLINK_ROUTE) at lib/netlink-socket.c:1409 (0% 
CPU usage)
2024-08-12T16:01:08.695Z|3796714|netlink_socket|DBG|Dropped 505 log 
messages in last 1 seconds (most recently, 0 seconds ago) due to 
excessive rate
2024-08-12T16:01:08.695Z|3796715|netlink_socket|DBG|nl_sock_recv__ 
(Success): nl(len:1360, type=16(family-defined), flags=0, seq=0, pid=0
2024-08-12T16:01:08.695Z|3796716|dpif|DBG|Dropped 15 log messages in 
last 0 seconds (most recently, 0 seconds ago) due to excessive rate
2024-08-12T16:01:08.695Z|3796717|dpif|DBG|system@ovs-system: device 
internet is on port 1
2024-08-12T16:01:08.695Z|3796718|dpif|DBG|system@ovs-system: device 
br-ext is on port 2
2024-08-12T16:01:08.695Z|3796719|jsonrpc|DBG|unix:/var/run/openvswitch/db.sock: 
send request, method="transact", 
params=["Open_vSwitch",{"where":[["_uuid","==",["uuid","acae6b73-de5f-46e4-a3d9-7874efd43cb4"]]],"row":{"mac_in_use":"00:00:00:00:00:02"},"op":"update","table":"Interface"},{"lock":"ovs_vswitchd","op":"assert"}], 
id=4284801
2024-08-12T16:01:08.699Z|3796720|poll_loop|DBG|wakeup due to [POLLIN] on 
fd 37 (FIFO pipe:[103980657]) at vswitchd/bridge.c:421 (0% CPU usage)
2024-08-12T16:01:08.699Z|3796721|poll_loop|DBG|wakeup due to [POLLIN] on 
fd 17 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (0% CPU 
usage)
2024-08-12T16:01:08.699Z|00690|poll_loop(urcu5)|DBG|wakeup due to 
[POLLIN] on fd 56 (FIFO pipe:[103975681]) at lib/ovs-rcu.c:363
2024-08-12T16:01:08.699Z|3796722|jsonrpc|DBG|unix:/var/run/openvswitch/db.sock: 
received notification, method="update3", 
params=[["monid","Open_vSwitch"],"----",{"Interface":{"acae6b73-de5f-46e4-a3d9-7874efd43cb4":{"modify":{"mac_in_use":"00:00:00:00:00:02"]
2024-08-12T16:01:08.699Z|3796723|jsonrpc|DBG|unix:/var/run/openvswitch/db.sock: 
received reply, result=[{"count":1},{}], id=4284801
2024-08-12T16:01:08.699Z|3796724|vconn|DBG|unix#121: sent (Success): 
OFPT_PORT_STATUS (OF1.5) (xid=0x0): MOD: 15(test1): addr:00:00:00:00:00:02

 config: PORT_DOWN
 state:  LINK_DOWN
 speed: 0 Mbps now, 0 Mbps max
2024-08-12T16:01:08.699Z|3796725|bridge|INFO|bridge br-int: using 
datapath ID 0002

2024-08-12T16:01:08.699Z|3796726|rconn|INFO|br-int<->unix#121: disconnecting
2024-08-12T16:01:08.699Z|3796727|rconn|DBG|br-int<->unix#121: entering 
DISCONNECTED

2024-08-12T16:01:08.699Z|3796728|rconn|INFO|br-int<->unix#122: disconnecting
2024-08-12T16:01:08.699Z|3796729|rconn|DBG|br-int<->unix#122: entering 
DISCONNECTED

2024-08-12T16:01:08.699Z|3796730|rconn|INFO|br-int<->unix#123: disconnecting
2024-08-12T16:01:08.699Z|3796731|rconn|DBG|br-int<->unix#123: entering 
DISCONNECTED
2024-08-12T16:01:08.699Z|3796732|jsonrpc|DBG|unix:/var/run/openvswitch/db.sock: 
send request, method="transact", 
params=["Open_vSwitch",{"where":[["_uuid","==",["uuid","5c37aa7c-9394-4068-8a4d-ab23705036d7"]]],"row":{"datapath_id":"0002"},"op":"update","table":"Bridge"},{"lock":"ovs_vswitchd","op":"assert"}], 
id=4284802
2024-08-12T16:01:08.701Z|3796733|poll_loop|DBG|wakeup due to [POLLIN] on 
fd 37 (FIFO pipe:[103980657]) at vswitchd/bridge.c:421 (0% CPU usage)
2024-08-12T16:01:08.701Z|3796734|poll_loop|DBG|wakeup due to [POLLIN] on 
fd 17 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (0% CPU 
usage)
2024-08-12T16:01:08.701Z|00691|poll_loop(urcu5)|DBG|wakeup due to 
[POLLIN] on fd 56 (FIFO pipe:[103975681]) at lib/ovs-rcu.c:236
2024-08-12T16:01:08.701Z|37967

Re: [ovs-dev] [Patch ovn v8] northd: Routing protocol port redirection.



On 12.08.2024 20:20, martin.kal...@canonical.com wrote:

Sorry, one more follow-up question.

On Mon, 2024-08-12 at 15:09 +0200,martin.kal...@canonical.com wrote:

Hi Dumitru,
thanks for the fast review.

On Mon, 2024-08-12 at 14:41 +0200, Dumitru Ceara wrote:

On 8/12/24 13:44, Martin Kalcok wrote:

This change adds two new LRP options:
  * routing-protocol-redirect
  * routing-protocols

These allow redirection of routing protocol traffic to
a Logical Switch Port. This enables external routing daemons
to listen on an interface bound to an LSP and effectively act
as if they were listening on (and speaking from) LRP's IP address.

Option 'routing-protocols' takes a comma-separated list of routing
protocols whose traffic should be redirected. Currently supported
are BGP (tcp 179) and BFD (udp 3784).

Option 'routing-protocol-redirect' expects a string with an LSP
name.

When both of these options are set, any traffic entering LS
that's destined for LRP's IP addresses (including IPv6 LLA) and
routing protocol's port number, is redirected to the LSP specified
in the 'routing-protocol-redirect' value.

NOTE: this feature is experimental and may be subject to
removal/change in the future.
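
As a rough illustration of how a 'routing-protocols' value could map to
protocol flags and L4 ports, here is a minimal, self-contained C sketch. It is
not the code from the patch: the table layout, the REDIRECT_BFD name, the
function name and the exact "BGP"/"BFD" spelling of the tokens are assumptions
made for the example. Only REDIRECT_BGP and the port numbers (tcp 179,
udp 3784) come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values. REDIRECT_BGP appears in the patch; the BFD
 * counterpart is an assumed name. */
enum {
    REDIRECT_BGP = 1 << 0,
    REDIRECT_BFD = 1 << 1,
};

/* Illustrative table: option token -> flag, L4 protocol and port. */
struct redirect_proto {
    const char *name;
    int flag;
    const char *l4_proto;
    int port;
};

static const struct redirect_proto redirect_protos[] = {
    { "BGP", REDIRECT_BGP, "tcp", 179 },
    { "BFD", REDIRECT_BFD, "udp", 3784 },
};

/* Parses a comma-separated list such as "BGP,BFD" into a bitmask of
 * REDIRECT_* flags.  Unknown tokens are silently ignored. */
static int
parse_routing_protocols(const char *value)
{
    int flags = 0;

    assert(value != NULL);
    while (*value) {
        size_t len = strcspn(value, ",");   /* Length of the current token. */

        for (size_t i = 0;
             i < sizeof redirect_protos / sizeof redirect_protos[0]; i++) {
            const char *name = redirect_protos[i].name;

            if (strlen(name) == len && !strncmp(value, name, len)) {
                flags |= redirect_protos[i].flag;
            }
        }
        value += len;
        if (*value == ',') {
            value++;                        /* Skip the separator. */
        }
    }
    return flags;
}
```

With this sketch, parse_routing_protocols("BGP,BFD") yields both flags set,
and a caller could then walk the table to emit one redirect rule per enabled
protocol.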

Signed-off-by: Martin Kalcok
---

Hi Martin,


  v8 patch fixes the intermittent segfault issue present in previous versions.
  It was caused by mistakenly using the peer port's 'lflow_ref'
  (op->peer->lflow_ref) when inserting rules into the peer's datapath. I
  assumed that when rules are injected into the peer's datapath, they should
  use the peer port's 'lflow_ref'. However, that is not the case[0], and the
  thread-unsafe nature of 'lflow_ref'[1] caused crashes when another thread
  tried to use the peer port's lflow_ref at the same time.


Ah, good catch!  Yes, lflow_ref is not straightforward.


  I'm running tests for this feature in a loop and I'm at 500+ successful
  executions, whereas before I would see crashes within about 10 attempts.

  While I was looking at this patch with fresh eyes, I also included a few
  nits:
   * rename 'redirect_port' var. to the more descriptive 'redirect_port_name'
   * rename the potentially confusing iterator variable 'lsp_peer' to the
     more descriptive 'lsp_in_peer'
   * relax the overly cautious string comparison
     'if (s1_len == s2_len && !strncmp(s1, s2, s1_len))' to the simpler
     'if (!strcmp(s1, s2))', as I believe that we can safely assume that both
     s1 and s2 are null-terminated strings
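
To make the last nit concrete, the two comparison forms can be put side by
side. This is a small standalone sketch, not code from the patch; the function
names are made up for the example, and it assumes (as the nit does) that both
arguments are valid null-terminated strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The earlier, length-checked form of the comparison. */
static bool
names_equal_strncmp(const char *s1, const char *s2)
{
    assert(s1 && s2);
    size_t s1_len = strlen(s1);
    size_t s2_len = strlen(s2);

    return s1_len == s2_len && !strncmp(s1, s2, s1_len);
}

/* The simplified form.  For null-terminated inputs strcmp() already
 * stops at the first difference, including a difference in length. */
static bool
names_equal_strcmp(const char *s1, const char *s2)
{
    assert(s1 && s2);
    return !strcmp(s1, s2);
}
```

Both helpers return the same result for any pair of null-terminated strings,
so the shorter form loses nothing as long as that assumption holds.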

  [0]
https://github.com/ovn-org/ovn/blob/f0a368143e492c798d3233439f9f097f1c9690cd/northd/northd.c#L13953-L13957
  [1]
https://github.com/ovn-org/ovn/blob/f0a368143e492c798d3233439f9f097f1c9690cd/northd/northd.h#L684


I have a couple of small comments below.

Otherwise, the code looks good to me!


  northd/northd.c | 226

  northd/northd.h |   7 ++
  northd/ovn-northd.8.xml |  54 ++
  ovn-nb.xml  |  42 
  tests/ovn-northd.at |  93 +
  tests/system-ovn.at | 120 +
  6 files changed, 542 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 5ad30d854..53012de89 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -14002,6 +14002,230 @@ build_arp_resolve_flows_for_lrp(struct ovn_port *op,
  }
  }
  
+static void
+build_routing_protocols_redirect_rule__(
+        const char *s_addr, const char *redirect_port_name, int protocol_port,
+        const char *proto, bool is_ipv6, struct ovn_port *ls_peer,
+        struct lflow_table *lflows, struct ds *match, struct ds *actions,
+        struct lflow_ref *lflow_ref)
+{
+    int ip_ver = is_ipv6 ? 6 : 4;
+    ds_clear(actions);
+    ds_put_format(actions, "outport = \"%s\"; output;", redirect_port_name);
+
+    /* Redirect packets in the input pipeline destined for LR's IP
+     * and the routing protocol's port to the LSP specified in
+     * 'routing-protocol-redirect' option.*/
+    ds_clear(match);
+    ds_put_format(match, "ip%d.dst == %s && %s.dst == %d", ip_ver, s_addr,
+                  proto, protocol_port);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  lflow_ref);
+
+    /* To accommodate "peer" nature of the routing daemons, redirect also
+     * replies to the daemons' client requests. */
+    ds_clear(match);
+    ds_put_format(match, "ip%d.dst == %s && %s.src == %d", ip_ver, s_addr,
+                  proto, protocol_port);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  lflow_ref);
+}
+
+static void
+apply_routing_protocols_redirect__(
+        const char *s_addr, const char *redirect_port_name, int protocol_flags,
+        bool is_ipv6, struct ovn_port *ls_peer, struct lflow_table *lflows,
+        struct ds *match, struct ds *actions, struct lflow_ref *lflow_ref)
+{
+    if (protocol_flags & R

Re: [ovs-dev] [Patch ovn v5] northd: Routing protocol port redirection.

2024-08-09 Thread Vladislav Odintsov



On 09.08.2024 16:51, martin.kal...@canonical.com wrote:

On Fri, 2024-08-09 at 11:28 +0200, Dumitru Ceara wrote:

On Friday, August 9, 2024, wrote:

I don't mind merging them, but while we are on the topic: are there
processing performance benefits to "fewer, more complex rules"
vs "more, less complex rules"? Or is it just to improve
readability?



In this case I don’t expect we’ll have a lot of ports with redirect
enabled so, from my perspective, it’s just readability (I was ok with
the 2 flow version too).  From an openflow perspective there’s no
difference.

Thanks, I think I'll have to keep the current 2-line implementation
and add this to the list of future improvements. @Vladislav, I do
appreciate the review and the feedback, and I don't want to look like
I'm just ignoring it, but due to the time pressure (freeze today) and
this change resulting in tests/docs changes (which always take me way
longer than I expect), I don't think I'll fit it in. It doesn't look like
it, but the overhead adds up.


Sure, no problem.

I'm going to test v6. Will return with feedback soon. Stay tuned :)



Martin.


Thanks,
Dumitru
  

Martin.

On Fri, 2024-08-09 at 11:08 +0200, Dumitru Ceara wrote:

On Friday, August 9, 2024, Vladislav Odintsov
wrote:

Don't we want to merge these two conditions into one logical
flow?

E.g.:

"(ip%d.dst == %s && (%s.dst == %d && %s.src == %d)"
  
Sorry, there is a typo. It should be:
  
"ip%d.dst == %s && (%s.dst == %d || %s.src == %d)"
  

?

This will make one logical flow per LRP IP per protocol
instead of two.

I didn’t test this but I think that looks ok.

Regards,
Dumitru




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [Patch ovn v5] northd: Routing protocol port redirection.

2024-08-09 Thread Vladislav Odintsov



On 09.08.2024 15:48, Vladislav Odintsov wrote:

Hi Martin, Dumitru,

I'm going to do some quick testing after the rebase and get back with
feedback.


Please see one question below.

On 09.08.2024 15:25, martin.kal...@canonical.com wrote:

Hi Dumitru,
thank you for the fast review. I'll post v6 momentarily, I just want to
do a quick, fresh end-to-end setup/test first.
I also want to add that you were correct about the ARP/ND. It didn't
work properly and it was living off the entries populated by unicast
traffic from the peer. However, the entries were flapping wildly, and this
was discovered when I threw BFD into the mix (a real blessing in
disguise). The slower BGP protocol was able to deal with it without
disconnects, but BFD was disconnecting every once in a while. Hence the
`clone` rules for the ARP replies and IPv6 NAs.


On Fri, 2024-08-09 at 08:19 +0200, Dumitru Ceara wrote:

On 8/9/24 04:45, Martin Kalcok wrote:

This change adds two new LRP options:
  * routing-protocol-redirect
  * routing-protocols

These allow redirection of routing protocol traffic to
a Logical Switch Port. This enables external routing daemons
to listen on an interface bound to an LSP and effectively act
as if they were listening on (and speaking from) LRP's IP address.

Option 'routing-protocols' takes a comma-separated list of routing
protocols whose traffic should be redirected. Currently supported
are BGP (tcp 179) and BFD (udp 3784).

Option 'routing-protocol-redirect' expects a string with an LSP
name.

When both of these options are set, any traffic entering LS
that's destined for LRP's IP addresses (including IPv6 LLA) and
routing protocol's port number, is redirected to the LSP specified
in the 'routing-protocol-redirect' value.

NOTE: this feature is experimental and may be subject to
removal/change in the future.

Signed-off-by: Martin Kalcok
---

Hi Martin,

Thanks for v5!  Unfortunately it needs a rebase.

Also there were some minor things that need to be fixed in the man
pages.  While at it I have a few other small comments.


  v5 includes:
   * change from pure BGP redirect to a configurable protocol redirection
   * fixes issue with local routing daemon not being able to receive reply
     traffic when contacting its peer.
   * ARP and ND traffic is cloned to the redirected port, to properly populate
     information about its neighbors.


  northd/northd.c | 217

  northd/northd.h |   7 ++
  northd/ovn-northd.8.xml |  54 ++
  ovn-nb.xml  |  42 
  tests/ovn-northd.at |  93 +
  tests/system-ovn.at | 100 ++
  6 files changed, 513 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 4353df07d..39b1998fd 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -13048,6 +13048,221 @@ build_arp_resolve_flows_for_lrp(struct ovn_port *op,
  }
  }
  
+static void
+build_r_p_redirect_rule__(

I think I prefer a more explicit name, e.g.,
build_routing_protocols_redirect_rule__().


+        const char *s_addr, const char *redirect_port_name, int protocol_port,
+        const char *proto, bool is_ipv6, struct ovn_port *ls_peer,
+        struct lflow_table *lflows, struct ds *match, struct ds *actions)
+{
+    int ip_ver = is_ipv6 ? 6 : 4;
+    ds_clear(actions);
+    ds_put_format(actions, "outport = \"%s\"; output;", redirect_port_name);
+
+    /* Redirect packets in the input pipeline destined for LR's IP
+     * and the routing protocol's port to the LSP specified in
+     * 'routing-protocol-redirect' option.*/
+    ds_clear(match);
+    ds_put_format(match, "ip%d.dst == %s && %s.dst == %d", ip_ver, s_addr,
+                  proto, protocol_port);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  ls_peer->lflow_ref);
+
+    /* To accommodate "peer" nature of the routing daemons, redirect also
+     * replies to the daemons' client requests. */
+    ds_clear(match);
+    ds_put_format(match, "ip%d.dst == %s && %s.src == %d", ip_ver, s_addr,
+                  proto, protocol_port);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  ls_peer->lflow_ref);
+}


Don't we want to merge these two conditions into one logical flow?

E.g.:

"(ip%d.dst == %s && (%s.dst == %d && %s.src == %d)"


Sorry, there is a typo. It should be:

"ip%d.dst == %s && (%s.dst == %d || %s.src == %d)"


?

This will make one logical flow per LRP IP per protocol instead of two.
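
For illustration, the merged match could be built roughly as follows. This is
a standalone sketch that uses plain snprintf() instead of the 'struct ds'
helpers from the patch, and the function name is made up for the example. The
format string is the corrected proposal with balanced parentheses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Formats a single match covering both directions: packets sent to the
 * routing protocol's port and replies coming from it.  Returns true if
 * the result fit into 'buf'. */
static bool
format_merged_redirect_match(char *buf, size_t size, const char *s_addr,
                             const char *proto, int protocol_port,
                             bool is_ipv6)
{
    assert(buf && s_addr && proto && strlen(proto) > 0);

    int ip_ver = is_ipv6 ? 6 : 4;
    int n = snprintf(buf, size,
                     "ip%d.dst == %s && (%s.dst == %d || %s.src == %d)",
                     ip_ver, s_addr, proto, protocol_port,
                     proto, protocol_port);

    return n >= 0 && (size_t) n < size;
}
```

For an IPv4 address 192.168.0.1 and BGP this produces
"ip4.dst == 192.168.0.1 && (tcp.dst == 179 || tcp.src == 179)", i.e. one
logical flow where the patch currently adds two.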


+
+static void
+apply_r_p_redirect__(

Same here, I think I prefer apply_routing_protocols_redirect__().

Re: [ovs-dev] [Patch ovn v5] northd: Routing protocol port redirection.

2024-08-09 Thread Vladislav Odintsov


Hi Martin, Dumitru,

I'm going to do some quick testing after the rebase and get back with
feedback.


Please see one question below.

On 09.08.2024 15:25, martin.kal...@canonical.com wrote:

Hi Dumitru,
thank you for the fast review. I'll post v6 momentarily, I just want to
do a quick, fresh end-to-end setup/test first.
I also want to add that you were correct about the ARP/ND. It didn't
work properly and it was living off the entries populated by unicast
traffic from the peer. However, the entries were flapping wildly, and this
was discovered when I threw BFD into the mix (a real blessing in
disguise). The slower BGP protocol was able to deal with it without
disconnects, but BFD was disconnecting every once in a while. Hence the
`clone` rules for the ARP replies and IPv6 NAs.


On Fri, 2024-08-09 at 08:19 +0200, Dumitru Ceara wrote:

On 8/9/24 04:45, Martin Kalcok wrote:

This change adds two new LRP options:
  * routing-protocol-redirect
  * routing-protocols

These allow redirection of routing protocol traffic to
a Logical Switch Port. This enables external routing daemons
to listen on an interface bound to an LSP and effectively act
as if they were listening on (and speaking from) LRP's IP address.

Option 'routing-protocols' takes a comma-separated list of routing
protocols whose traffic should be redirected. Currently supported
are BGP (tcp 179) and BFD (udp 3784).

Option 'routing-protocol-redirect' expects a string with an LSP
name.

When both of these options are set, any traffic entering LS
that's destined for LRP's IP addresses (including IPv6 LLA) and
routing protocol's port number, is redirected to the LSP specified
in the 'routing-protocol-redirect' value.

NOTE: this feature is experimental and may be subject to
removal/change in the future.

Signed-off-by: Martin Kalcok
---

Hi Martin,

Thanks for v5!  Unfortunately it needs a rebase.

Also there were some minor things that need to be fixed in the man
pages.  While at it I have a few other small comments.


  v5 includes:
   * change from pure BGP redirect to a configurable protocol redirection
   * fixes issue with local routing daemon not being able to receive reply
     traffic when contacting its peer.
   * ARP and ND traffic is cloned to the redirected port, to properly populate
     information about its neighbors.


  northd/northd.c | 217

  northd/northd.h |   7 ++
  northd/ovn-northd.8.xml |  54 ++
  ovn-nb.xml  |  42 
  tests/ovn-northd.at |  93 +
  tests/system-ovn.at | 100 ++
  6 files changed, 513 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 4353df07d..39b1998fd 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -13048,6 +13048,221 @@ build_arp_resolve_flows_for_lrp(struct ovn_port *op,
  }
  }
  
+static void
+build_r_p_redirect_rule__(

I think I prefer a more explicit name, e.g.,
build_routing_protocols_redirect_rule__().


+        const char *s_addr, const char *redirect_port_name, int protocol_port,
+        const char *proto, bool is_ipv6, struct ovn_port *ls_peer,
+        struct lflow_table *lflows, struct ds *match, struct ds *actions)
+{
+    int ip_ver = is_ipv6 ? 6 : 4;
+    ds_clear(actions);
+    ds_put_format(actions, "outport = \"%s\"; output;", redirect_port_name);
+
+    /* Redirect packets in the input pipeline destined for LR's IP
+     * and the routing protocol's port to the LSP specified in
+     * 'routing-protocol-redirect' option.*/
+    ds_clear(match);
+    ds_put_format(match, "ip%d.dst == %s && %s.dst == %d", ip_ver, s_addr,
+                  proto, protocol_port);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  ls_peer->lflow_ref);
+
+    /* To accommodate "peer" nature of the routing daemons, redirect also
+     * replies to the daemons' client requests. */
+    ds_clear(match);
+    ds_put_format(match, "ip%d.dst == %s && %s.src == %d", ip_ver, s_addr,
+                  proto, protocol_port);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  ls_peer->lflow_ref);
+}


Don't we want to merge these two conditions into one logical flow?

E.g.:

"(ip%d.dst == %s && (%s.dst == %d && %s.src == %d)"
?

This will make one logical flow per LRP IP per protocol instead of two.


+
+static void
+apply_r_p_redirect__(

Same here, I think I prefer apply_routing_protocols_redirect__().


+        const char *s_addr, const char *redirect_port_name, int protocol_flags,
+        bool is_ipv6, struct ovn_port *ls_peer, struct lflow_table *lflows,
+        struct ds *match, struct ds *actions)
+{
+    if (protocol_flags & REDIRECT_BGP) {
+        build_r_p_redirect_rule__(s_addr, redirect_port_name, 179, "tcp",
+                                  is_ipv6,

Re: [ovs-dev] [Patch ovn v4] northd: BGP port redirecting.


With OVN we do use IPv4 LLA for now. IPv6 is in future plans.

On 08.08.2024 19:31, martin.kal...@canonical.com wrote:

Would it be useful to redirect only traffic for LRP's IPv6 LLA?
@Vladislav, in your setups, do you use ipv4 or ipv6 LLAs for setting up
BGP peering?


Re: [ovs-dev] [Patch ovn v4] northd: BGP port redirecting.

On 08.08.2024 18:45, Dumitru Ceara wrote:

On 8/8/24 11:14, Vladislav Odintsov wrote:

On 08.08.2024 15:51,martin.kal...@canonical.com wrote:

On Thu, 2024-08-08 at 10:46 +0200, Dumitru Ceara wrote:

On 8/8/24 10:42, Vladislav Odintsov wrote:

On 08.08.2024 03:16, Dumitru Ceara wrote:

Re-adding the dev list.

On 8/7/24 18:12, Vladislav Odintsov wrote:

Hi Dumitru,

Hi Vladislav,

I'd like to add some thoughts to your inputs.

Thanks for that, I added some more of my own below. :)

On 06.08.2024 19:23, Dumitru Ceara wrote:

Hi Martin,

Sorry, I was reviewing V3 but I was slow in actually sending
out the email.

On 8/6/24 13:17, Martin Kalcok wrote:

This change adds a 'bgp-redirect' option to LRP that allows
redirecting of BGP control plane traffic to an arbitrary LSP
in its peer LS.

The option expects a string with a LSP name. When set,
any traffic entering LS that's destined for any of the
LRP's IP addresses (including IPv6 LLA) is redirected
to the LSP specified in the option's value.

This enables external BGP daemons to listen on an interface
bound to a LSP and effectively act as if they were listening
on (and speaking from) LRP's IP address.

Signed-off-by: Martin Kalcok
---

Strictly on this version of the patch, and with the thought in mind that
we might want to consider this feature experimental [0] and maybe change
it (NB too) in the future, I left a few comments inline. With those
addressed I think the patch looks OK to me.

[0]
https://patchwork.ozlabs.org/project/ovn/patch/20240802155449.3528777-1-mmich...@redhat.com/

Now, for the point when we'd remove the "experimental" tag:

In general, it makes me a bit uneasy that we have to share the MAC and
IP between the LRP and the VIF of logical switch port that's connected
to the same switch as the LRP.

I was thinking of alternatives for the future and the only things I
could come up with until now are:

1. Create a separate, "disconnected" logical switch:
ovn-nbctl ls-add bgp

Add the bgp-daemon port to it:
ovn-nbctl lsp-add bgp bgp-daemon.

Then we don't need "unknown" either, I think.

But it's not possible today to move packets from the ingress pipeline of
a logical switch ("public" in this test) to the egress pipeline of a
different logical switch ("bgp" in my example). It also feels icky to
implement that.

2. Add the ability to bind an OVS port directly to a logical router port:

then we could do the same type of redirect you do here but instead in
the logical router pipeline. The advantage is we don't have to drop any
non-bgp traffic in the switch pipeline. The switch keeps functioning as
it does today.

Maybe an advantage of this second alternative would be that we can
easily attach a filtering option to this LRP (e.g.,
LRP.options:control-traffic-filter) to allow us to be more flexible in
what kind of traffic we forward to the actual routing protocol daemon
that runs behind that OVS port - Vladislav also mentioned during the
meeting it might be interesting to forward BFD packets to the FRR (or
whatever application implements the routing protocol) instance too.

The idea to be able to bind LRP to OVS port sounds very interesting to
me. But with a note that it's not a pure "bind", but a partial: as you
wrote with some "filter" applied to pass not all the traffic. The main
idea here is to pass only control plane traffic destined to LRP's

As we're discussing on the other thread (Martin pointed it out) we also
probably need to involve conntrack and allow replies to any kind of
traffic initiated by the entity behind the LRP's VIF (e.g., BGP sessions
initiated from the host).

addresses. Other than that seems odd. Transit traffic should remain
in the LR pipeline; otherwise it will be no different from a regular VIF LSP.

Definitely, traffic that needs to be DNATed (DNAT/unSNAT rules or LB
rules that will change the destination IP from LRP IP to something else)
should not be affected. All other "transit" traffic doesn't have LRP
IP, does it?

You're right.

@Dumitru, @Martin, what if we just redirect all traffic destined to LRP
IPs to the redirect port? Are there any drawbacks?
For security it is possible to optionally use ACLs with or without
conntrack. It's up to user/CMS.

This seems quite simple in the code and very flexible. So no additional
flows seem to be needed in future to support any other routing
protocols (or for other possible non-routing use cases).

Won't this break all "transit" traffic that has destination IP the LRP
IP (DNAT, LB), etc?

I'm afraid so. I guess I was just tired yesterday when I made that
proposal for general redirect. Given that the redirect is implemented
in the LS pipeline, it would eat up all the traffic for LRP's IP before
the DNAT/unSNAT occurs in the LR pipeline. I'll give it a quick

Re: [ovs-dev] [Patch ovn v4] northd: BGP port redirecting.

On 08.08.2024 15:51, martin.kal...@canonical.com wrote:

On Thu, 2024-08-08 at 10:46 +0200, Dumitru Ceara wrote:

On 8/8/24 10:42, Vladislav Odintsov wrote:

On 08.08.2024 03:16, Dumitru Ceara wrote:

Re-adding the dev list.

On 8/7/24 18:12, Vladislav Odintsov wrote:

Hi Dumitru,

Hi Vladislav,

I'd like to add some thoughts to your inputs.

Thanks for that, I added some more of my own below. :)

On 06.08.2024 19:23, Dumitru Ceara wrote:

Hi Martin,

Sorry, I was reviewing V3 but I was slow in actually sending
out the email.

On 8/6/24 13:17, Martin Kalcok wrote:

This change adds a 'bgp-redirect' option to LRP that allows
redirecting of BGP control plane traffic to an arbitrary LSP
in its peer LS.

This enables external BGP daemons to listen on an interface
bound to a LSP and effectively act as if they were listening
on (and speaking from) LRP's IP address.

Signed-off-by: Martin Kalcok
---

[0]
https://patchwork.ozlabs.org/project/ovn/patch/20240802155449.3528777-1-mmich...@redhat.com/

Now, for the point when we'd remove the "experimental" tag:

In general, it makes me a bit uneasy that we have to share the MAC and
IP between the LRP and the VIF of logical switch port that's connected
to the same switch as the LRP.

I was thinking of alternatives for the future and the only things I
could come up with until now are:

1. Create a separate, "disconnected" logical switch:
ovn-nbctl ls-add bgp

Add the bgp-daemon port to it:
ovn-nbctl lsp-add bgp bgp-daemon.

Then we don't need "unknown" either, I think.

2. Add the ability to bind an OVS port directly to a logical router port:

addresses. Other than that seems odd. Transit traffic should remain
in the LR pipeline; otherwise it will be no different from a regular VIF LSP.

You're right.

This seems quite simple in the code and very flexible. So no additional
flows seem to be needed in future to support any other routing
protocols (or for other possible non-routing use cases).

Won't this break all "transit" traffic that has destination IP the LRP
IP (DNAT, LB), etc?

Re: [ovs-dev] [Patch ovn v4] northd: BGP port redirecting.

On 08.08.2024 03:16, Dumitru Ceara wrote:

Re-adding the dev list.

On 8/7/24 18:12, Vladislav Odintsov wrote:

Hi Dumitru,

Hi Vladislav,

I'd like to add some thoughts to your inputs.

Thanks for that, I added some more of my own below. :)

On 06.08.2024 19:23, Dumitru Ceara wrote:

Hi Martin,

Sorry, I was reviewing V3 but I was slow in actually sending out the email.

On 8/6/24 13:17, Martin Kalcok wrote:

This change adds a 'bgp-redirect' option to LRP that allows
redirecting of BGP control plane traffic to an arbitrary LSP
in its peer LS.

This enables external BGP daemons to listen on an interface
bound to a LSP and effectively act as if they were listening
on (and speaking from) LRP's IP address.

Signed-off-by: Martin Kalcok
---

Strictly on this version of the patch, and with the thought in mind that
we might want to consider this feature experimental [0] and maybe change
it (NB too) in the future, I left a few comments inline. With those
addressed I think the patch looks OK to me.

[0]
https://patchwork.ozlabs.org/project/ovn/patch/20240802155449.3528777-1-mmich...@redhat.com/

Now, for the point when we'd remove the "experimental" tag:

In general, it makes me a bit uneasy that we have to share the MAC and
IP between the LRP and the VIF of logical switch port that's connected
to the same switch as the LRP.

I was thinking of alternatives for the future and the only things I
could come up with until now are:

1. Create a separate, "disconnected" logical switch:
ovn-nbctl ls-add bgp

Add the bgp-daemon port to it:
ovn-nbctl lsp-add bgp bgp-daemon.

Then we don't need "unknown" either, I think.

But it's not possible today to move packets from the ingress pipeline of
a logical switch ("public" in this test) to the egress pipeline of a
different logical switch ("bgp" in my example). It also feels icky to
implement that.

2. Add the ability to bind an OVS port directly to a logical router port:

then we could do the same type of redirect you do here but instead in
the logical router pipeline. The advantage is we don't have to drop any
non-bgp traffic in the switch pipeline. The switch keeps functioning as
it does today.

Maybe an advantage of this second alternative would be that we can
easily attach a filtering option to this LRP (e.g.,
LRP.options:control-traffic-filter) to allow us to be more flexible in
what kind of traffic we forward to the actual routing protocol daemon
that runs behind that OVS port - Vladislav also mentioned during the
meeting it might be interesting to forward BFD packets to the FRR (or
whatever application implements the routing protocol) instance too.

The idea to be able to bind LRP to OVS port sounds very interesting to
me. But with a note that it's not a pure "bind", but a partial: as you
wrote with some "filter" applied to pass not all the traffic. The main
idea here is to pass only control plane traffic destined to LRP's

As we're discussing on the other thread (Martin pointed it out) we also
probably need to involve conntrack and allow replies to any kind of
traffic initiated by the entity behind the LRP's VIF (e.g., BGP sessions
initiated from the host).

addresses. Other than that seems odd. Transit traffic should remain
in the LR pipeline; otherwise it will be no different from a regular VIF LSP.

Definitely, traffic that needs to be DNATed (DNAT/unSNAT rules or LB
rules that will change the destination IP from LRP IP to something else)
should not be affected. All other "transit" traffic doesn't have LRP
IP, does it?

You're right.

@Dumitru, @Martin, what if we just redirect all traffic destined to LRP
IPs to the redirect port? Are there any drawbacks?
For security it is possible to optionally use ACLs with or without
conntrack. It's up to user/CMS.

This seems quite simple in the code and very flexible. So no additional
flows seem to be needed in future to support any other routing
protocols (or for other possible non-routing use cases).

Having a filter (or match like in ACL or Logical_Router_Policies) looks
more flexible in terms of used protocols. This can coexist with proposal
from current patch, where the flow is not programmable from user.

I think so too.

Regards,
Dumitru

But again, if we consider the current feature "experimental", we can
spend more time during the next release cycle to figure out what's best.

v4 of this patch renames the feature from "bgp-mirror" to "bgp-redirect"
as discussed during community meeting.

northd/northd.c | 108
northd/ovn

Re: [ovs-dev] [Patch ovn v4] northd: BGP port redirecting.

2024-08-07 Thread Vladislav Odintsov

On 07.08.2024 16:23, Dumitru Ceara wrote:

On 8/7/24 11:17,martin.kal...@canonical.com wrote:

Hi Dumitru and Vladislav,
Thank you both for the review and the feedback

On Wed, 2024-08-07 at 09:17 +0200, Dumitru Ceara wrote:

On 8/6/24 19:22, Vladislav Odintsov wrote:

Hi Martin, Dumitru,

Hi Vladislav,

On 06.08.2024 19:23, Dumitru Ceara wrote:

Hi Martin,

Sorry, I was reviewing V3 but I was slow in actually sending out
the email.

On 8/6/24 13:17, Martin Kalcok wrote:

This change adds a 'bgp-redirect' option to LRP that allows
redirecting of BGP control plane traffic to an arbitrary LSP
in its peer LS.

This enables external BGP daemons to listen on an interface
bound to a LSP and effectively act as if they were listening
on (and speaking from) LRP's IP address.

Signed-off-by: Martin Kalcok
---

Strictly on this version of the patch, and with the thought in mind that
we might want to consider this feature experimental [0] and maybe change
it (NB too) in the future, I left a few comments inline. With those
addressed I think the patch looks OK to me.

[0]
https://patchwork.ozlabs.org/project/ovn/patch/20240802155449.3528777-1-mmich...@redhat.com/

I'm keeping an eye on this one and I think you made a good point,
Dumitru, about the need for a clear way to mark a feature as experimental.
I'll chime in on that patch discussion as well.

Thanks!

Now, for the point when we'd remove the "experimental" tag:

In general, it makes me a bit uneasy that we have to share the MAC and
IP between the LRP and the VIF of logical switch port that's connected
to the same switch as the LRP.

I think that in one way or another, the requirement for the interface
to share IP/MAC with the LRP will remain. The reason for sharing them
is to allow operating BGP unnumbered, which uses IPv6 LLA instead of
pre-configured peer addresses.

I think we don't need it with any of the alternative options below but
we can discuss those in detail after branching, in the 25.03 cycle.

I was thinking of alternatives for the future and the only things I
could come up with until now are:

1. Create a separate, "disconnected" logical switch:
ovn-nbctl ls-add bgp

Add the bgp-daemon port to it:
ovn-nbctl lsp-add bgp bgp-daemon.

Then we don't need "unknown" either, I think.

But it's not possible today to move packets from the ingress pipeline of
a logical switch ("public" in this test) to the egress pipeline of a
different logical switch ("bgp" in my example). It also feels icky to
implement that.

2. Add the ability to bind an OVS port directly to a logical router port:

then we could do the same type of redirect you do here but instead in
the logical router pipeline. The advantage is we don't have to drop any
non-bgp traffic in the switch pipeline. The switch keeps functioning as
it does today.

I like the second approach a lot. It would indeed feel more "in place"
to have this logic in the LR, instead of LS. I didn't think this would
be possible initially.

Ack, maybe we should come back to this discussion after branching.

Maybe an advantage of this second alternative would be that we can
easily attach a filtering option to this LRP (e.g.,
LRP.options:control-traffic-filter) to allow us to be more flexible in
what kind of traffic we forward to the actual routing protocol daemon
that runs behind that OVS port - Vladislav also mentioned during the
meeting it might be interesting to forward BFD packets to the FRR (or
whatever application implements the routing protocol) instance too.

@Martin, I'm again kindly asking about possible support for BFD
redirect. Is it possible to incorporate these changes as an additional
configuration knob ("bfd-redirect") into your patch so we can start
using this functionality internally and possibly give some feedback
(hopefully positive)?

It doesn't seem to be a very big change but looks very attractive for us.

I can send a separate patch for this on top of your patch, but
technically it won't be able to jump into 24.09 because formally
a soft-freeze is already in progress. As an option I can send a patch
which can be squashed into this one.

What do you think?

I didn't check with other maintainers (added them in CC now), but given
that we agreed to try to accept the port redirecting patch as
experimental I think it's fine to expand it to allow BFD support too.
Posting a follow up patch on top of this one is fine from my perspective.

Indeed the change is not that big code-wise, but given that it would
require some forethought about the implementation I was a bit hesitant to
include it in here. I didn't want to rock th

Re: [ovs-dev] [Patch ovn v4] northd: BGP port redirecting.

2024-08-06 Thread Vladislav Odintsov


Hi Martin, Dumitru,

On 06.08.2024 19:23, Dumitru Ceara wrote:

Hi Martin,

Sorry, I was reviewing V3 but I was slow in actually sending out the email.

On 8/6/24 13:17, Martin Kalcok wrote:

This change adds a 'bgp-redirect' option to LRP that allows
redirecting of BGP control plane traffic to an arbitrary LSP
in its peer LS.

The option expects a string with a LSP name. When set,
any traffic entering LS that's destined for any of the
LRP's IP addresses (including IPv6 LLA) is redirected
to the LSP specified in the option's value.

This enables external BGP daemons to listen on an interface
bound to a LSP and effectively act as if they were listening
on (and speaking from) LRP's IP address.

Signed-off-by: Martin Kalcok
---

Strictly on this version of the patch, and with the thought in mind that
we might want to consider this feature experimental [0] and maybe change
it (NB too) in the future, I left a few comments inline.  With those
addressed I think the patch looks OK to me.

[0]
https://patchwork.ozlabs.org/project/ovn/patch/20240802155449.3528777-1-mmich...@redhat.com/

Now, for the point when we'd remove the "experimental" tag:

In general, it makes me a bit uneasy that we have to share the MAC and
IP between the LRP and the VIF of logical switch port that's connected
to the same switch as the LRP.

I was thinking of alternatives for the future and the only things I
could come up with until now are:

1. Create a separate, "disconnected" logical switch:
ovn-nbctl ls-add bgp

Add the bgp-daemon port to it:
ovn-nbctl lsp-add bgp bgp-daemon.

Then we don't need "unknown" either, I think.

But it's not possible today to move packets from the ingress pipeline of
a logical switch ("public" in this test) to the egress pipeline of a
different logical switch ("bgp" in my example).  It also feels icky to
implement that.

2. Add the ability to bind an OVS port directly to a logical router port:

then we could do the same type of redirect you do here but instead in
the logical router pipeline.  The advantage is we don't have to drop any
non-bgp traffic in the switch pipeline.  The switch keeps functioning as
it does today.

Maybe an advantage of this second alternative would be that we can
easily attach a filtering option to this LRP (e.g.,
LRP.options:control-traffic-filter) to allow us to be more flexible in
what kind of traffic we forward to the actual routing protocol daemon
that runs behind that OVS port - Vladislav also mentioned during the
meeting it might be interesting to forward BFD packets to the FRR (or
whatever application implements the routing protocol) instance too.


@Martin, I'm again kindly asking about possible support for BFD 
redirect. Is it possible to incorporate these changes as an additional 
configuration knob ("bfd-redirect") into your patch so we can start
using this functionality internally and possibly give some feedback 
(hopefully positive)?


It doesn't seem to be a very big change but looks very attractive for us.

I can send a separate patch for this on top of your patch, but 
technically it won't be able to jump into the 24.09 because formally 
it's already a soft-freeze in progress. As an option I can send a patch, 
which can be squashed into this one.


What do you think?



But again, if we consider the current feature "experimental", we can
spend more time during the next release cycle to figure out what's best.


v4 of this patch renames the feature from "bgp-mirror" to "bgp-redirect"
as discussed during community meeting.

  northd/northd.c | 108 
  northd/ovn-northd.8.xml |  23 +
  ovn-nb.xml  |  14 ++
  tests/ovn-northd.at |  58 +
  tests/system-ovn.at |  86 
  5 files changed, 289 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 4353df07d..088104f25 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -13048,6 +13048,113 @@ build_arp_resolve_flows_for_lrp(struct ovn_port *op,
  }
  }
  
+static void
+build_bgp_redirect_rule__(
+    const char *s_addr, const char *bgp_port_name, bool is_ipv6,
+    struct ovn_port *ls_peer, struct lflow_table *lflows,
+    struct ds *match, struct ds *actions)
+{
+    int ip_ver = is_ipv6 ? 6 : 4;
+    /* Redirect packets in the input pipeline destined for LR's IP to
+     * the port specified in 'bgp-redirect' option.
+     */
+    ds_clear(match);
+    ds_clear(actions);
+    ds_put_format(match, "ip%d.dst == %s && tcp.dst == 179", ip_ver, s_addr);
+    ds_put_format(actions, "outport = \"%s\"; output;", bgp_port_name);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  ls_peer->lflow_ref);
+
+
+    /* Drop any traffic originating from 'bgp-redirect' port that does
+     * not originate from BGP daemon port. This blocks unnecessary
+     * traffic like ARP bro

[ovs-dev] [PATCH ovn] tests: Fix typo in read-only sb ssl-ciphers test.

2024-08-01 Thread Vladislav Odintsov

Though this typo does not affect test correctness, it is worth
fixing properly.

Fixes: bcc650a29d3f ("tests: Fix ssl-ciphers RO sb test with old openssl.")
Signed-off-by: Vladislav Odintsov 
---
 tests/ovn.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index b31afbfb3..5b81fc210 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -37939,7 +37939,7 @@ AT_CHECK([ovn-sbctl --db=ssl:127.0.0.1:$TCP_PORT \
 --ca-cert=$PKIDIR/testpki-cacert.pem \
 --ssl-ciphers='HIGH:!aNULL:!MD5:@SECLEVEL=1' \
 --ssl-protocols='TLSv1,TLSv1.1,TLSv1.2' \
-chassis-add ch vxlan 1.2.4.8 2>&1 | grep 'transaction error]', [0], [dnl
+chassis-add ch vxlan 1.2.4.8 2>&1 | grep 'transaction error'], [0], [dnl
 ovn-sbctl: transaction error: {"details":"insert operation not allowed when database server is in read only mode","error":"not allowed"}
 ], [ignore])
 
-- 
2.45.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [Patch ovn v2 1/1] northd: BGP port mirroring.

2024-07-29 Thread Vladislav Odintsov


Hi Martin, Frode,

I'd like to summarize OVN technical meeting discussion.

First, there was a discussion about naming the new option. Ales proposed
the terms "forward" and "redirect" (IIRC). Do we want the name to reflect
the exact behavior of the new option? From my perspective, "redirect"
could be a good candidate.


Please correct me if I'm wrong on any of the following 4 points:

1. In the current BGP support patch series [0], OVN only installs NAT and
LB VIP addresses into a separate routing table via Netlink; these should
then be redistributed by an external routing daemon (FRR, Bird, etc.).
The external routing daemon is configured outside of OVN.
2. OVN doesn't import routes that the routing daemon receives and installs
into the separate kernel routing table. Routes for the "OVN to outside"
direction are configured as normal Logical_Router_Static_Route records in
OVN_Northbound by the CMS/external automation, while routes for the
"outside to OVN" direction are installed on the external router(s)
automatically with BGP.
3. If a user has more than one BGP peering with a leaf/external router and
needs fast (sub-second) failover for the "OVN to outside" direction, BFD
should be configured on the OVN side for ECMP
Logical_Router_Static_Routes, with these external router IPs as a nexthop
group. On the external router side, BFD within BGP must be configured.
4. If BFD traffic is forwarded from the LRP to the "bgp daemon" LSP
unconditionally, the functionality from #3 will break.


I'm wondering: if we don't configure BFD for ECMP routes from OVN and use
external tooling for route learning, could we conditionally add BFD
redirect rules with a separate option? Would it be safe, or are there
any negative consequences?
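
The conditional BFD redirect asked about above could be sketched roughly
as follows. This is a minimal C illustration, not code from the patch:
format_bgp_match() reuses the match string from
build_bgp_redirect_rule__(), while format_bfd_match() and its
'bfd_redirect' flag are hypothetical names, and matching on UDP/3784
(the single-hop BFD control port per RFC 5881) is an assumption about
how such a rule might look.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Well-known ports: BGP is TCP/179, single-hop BFD control is UDP/3784. */
enum { BGP_TCP_PORT = 179, BFD_UDP_PORT = 3784 };

/* Writes the logical flow match for BGP traffic destined for 'lrp_ip'
 * into 'buf'.  Same match string as in the patch under discussion. */
static int
format_bgp_match(char *buf, size_t size, const char *lrp_ip, bool is_ipv6)
{
    assert(buf && size > 0 && lrp_ip && strlen(lrp_ip) > 0);
    return snprintf(buf, size, "ip%d.dst == %s && tcp.dst == %d",
                    is_ipv6 ? 6 : 4, lrp_ip, BGP_TCP_PORT);
}

/* Hypothetical: writes a BFD match only when 'bfd_redirect' is set, so
 * that OVN-managed BFD for static routes keeps working by default.
 * Returns 0 and leaves 'buf' empty when the option is off. */
static int
format_bfd_match(char *buf, size_t size, const char *lrp_ip, bool is_ipv6,
                 bool bfd_redirect)
{
    assert(buf && size > 0 && lrp_ip && strlen(lrp_ip) > 0);
    if (!bfd_redirect) {
        buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, size, "ip%d.dst == %s && udp.dst == %d",
                    is_ipv6 ? 6 : 4, lrp_ip, BFD_UDP_PORT);
}
```

With the flag off, no BFD match is produced at all, which corresponds
to keeping the behavior described in point 3 intact.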


0: https://patchwork.ozlabs.org/project/ovn/list/?series=416659

On 26.07.2024 23:10, Vladislav Odintsov wrote:

On 26.07.2024 21:21, martin.kal...@canonical.com wrote:

On Fri, 2024-07-26 at 21:07 +0700, Vladislav Odintsov wrote:

Hi Frode,

On 26.07.2024 19:17, Frode Nordahl wrote:

On Fri, Jul 26, 2024 at 11:28 AM Vladislav Odintsov
 wrote:

Hi Martin, thanks for the patch.

Typically for faster BGP (or other routing protocols) convergence BFD
signalling is used. Would you mind adding flows to forward BFD traffic
to the same LSP as it is already done for BGP?

It could be an additional option like ("enable-bfd-for-bgp") or
something like this, or we can install flows unconditionally. UDP/3784
is the default BFD proto/port.

BFD is indeed an important part for fast convergence of routing
protocols, it is however also an important part of end to end liveness
detection for a data path.

In this work our goal is to exchange control plane information with an
external daemon so that it can take care of the routing protocol state
machine. We do want to keep the data path in OVS so that users can
benefit from all the data path implementations it has to offer,
including hardware acceleration.

With this in mind, does it really make sense for the external daemon
to speak BFD, or would it be better to integrate with the OVN managed
BFD for static routes which is implemented in the OVS data path?

Typically BFD for routing protocols is configured in routing daemons (on
both sides of peering), because the main routing daemon (e.g. bgpd) has to
get notifications from the BFD engine (e.g. bfdd) that the connection is
lost. OVS-based BFD sessions seem to have nothing to do with this. I
proposed to install "redirect" flows similar to BGP: forward udp/3784 to
a dedicated LSP for the routing daemon. Once the flows are installed, the
OpenFlow control plane is not needed for BFD to work. The OVS datapath in
this case just forwards traffic from the external network (for instance,
a leaf switch) to the internal OVS port of the routing daemon.

Hope this explains the idea more clearly.

Wouldn't this be like having multiple "sources of truth" in the system?
On one hand there's OVN injecting routes [0], that are picked up by the
BGP daemon, and on the other there's a BFD daemon that will be removing
them if it believes that they are unreachable. Couldn't this lead to
some flapping?

It shouldn't. Just in case - I'm not talking about the OVN BFD for static
routes feature. I mean BFD within the routing daemon. The BFD daemon is just
a "sidecar" for BGP that notifies the latter when connectivity is
lost. After BFD detects a connectivity failure, it notifies BGP, which
terminates the BGP session and removes the routes learnt from the dead
peer. [0] This works for both sides: for the routing daemon on the "OVN
side" and for the external BGP speaker. This can protect against 2 types
of failures:


1. L3 Gateway failure: power outage, physical disconnection, kernel
panic, OVS failure, etc. If ha-chassis-group is configured, other OVN
cluster nodes will detect this failure through OVS-based BFD and
trigger failover to the next ha-chassis in the group. At the same time
external BGP speaker will also detect that BFD se

Re: [ovs-dev] [Patch ovn v2 1/1] northd: BGP port mirroring.


On 26.07.2024 21:21, martin.kal...@canonical.com wrote:

On Fri, 2024-07-26 at 21:07 +0700, Vladislav Odintsov wrote:

Hi Frode,

On 26.07.2024 19:17, Frode Nordahl wrote:

On Fri, Jul 26, 2024 at 11:28 AM Vladislav Odintsov
 wrote:

Hi Martin, thanks for the patch.

Typically for faster BGP (or other routing protocols) convergence BFD
signalling is used. Would you mind adding flows to forward BFD traffic
to the same LSP as it is already done for BGP?

It could be an additional option like ("enable-bfd-for-bgp") or
something like this, or we can install flows unconditionally. UDP/3784
is the default BFD proto/port.

BFD is indeed an important part for fast convergence of routing
protocols, it is however also an important part of end to end liveness
detection for a data path.

In this work our goal is to exchange control plane information with an
external daemon so that it can take care of the routing protocol state
machine. We do want to keep the data path in OVS so that users can
benefit from all the data path implementations it has to offer,
including hardware acceleration.

With this in mind, does it really make sense for the external daemon
to speak BFD, or would it be better to integrate with the OVN managed
BFD for static routes which is implemented in the OVS data path?

Typically BFD for routing protocols is configured in routing daemons (on
both sides of peering), because the main routing daemon (e.g. bgpd) has to
get notifications from the BFD engine (e.g. bfdd) that the connection is
lost. OVS-based BFD sessions seem to have nothing to do with this. I
proposed to install "redirect" flows similar to BGP: forward udp/3784 to
a dedicated LSP for the routing daemon. Once the flows are installed, the
OpenFlow control plane is not needed for BFD to work. The OVS datapath in
this case just forwards traffic from the external network (for instance,
a leaf switch) to the internal OVS port of the routing daemon.

Hope this explains the idea more clearly.

Wouldn't this be like having multiple "sources of truth" in the system?
On one hand there's OVN injecting routes [0], that are picked up by the
BGP daemon, and on the other there's a BFD daemon that will be removing
them if it believes that they are unreachable. Couldn't this lead to
some flapping?

It shouldn't. Just in case - I'm not talking about the OVN BFD for static
routes feature. I mean BFD within the routing daemon. The BFD daemon is just
a "sidecar" for BGP that notifies the latter when connectivity is
lost. After BFD detects a connectivity failure, it notifies BGP, which
terminates the BGP session and removes the routes learnt from the dead
peer. [0] This works for both sides: for the routing daemon on the "OVN
side" and for the external BGP speaker. This can protect against 2 types
of failures:


1. L3 Gateway failure: power outage, physical disconnection, kernel
panic, OVS failure, etc. If ha-chassis-group is configured, other OVN
cluster nodes will detect this failure through OVS-based BFD and trigger
failover to the next ha-chassis in the group. At the same time, the
external BGP speaker will also detect that the BFD session went down,
terminate the BGP session with the routing daemon, and send a BGP update
removing the routes through itself, because they are not reachable
anymore. Then the next l3gateway by priority claims the chassis-redirect
LRP and starts to advertise the NAT/LB VIP addresses.


2. Leaf/external router failure. In the case where we have two or more
peers/leafs, these leafs can go up and down. FRR should install/delete
routes through each of them as fast as possible. Here BFD also helps
to detect failures faster than BGP keepalives do. IIUC, in the current
BGP integration OVN doesn't import routes learned from BGP and installed
by FRR into the VRF. But this should be done in the future so that the LR
could send traffic only to an alive peer.
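
To put rough numbers on "faster than BGP keepalives", here is a small C
sketch. It is a simplification under stated assumptions: BFD detection
time is taken as the detect multiplier times the negotiated interval
(asynchronous mode, RFC 5880), and BGP is assumed to notice a dead peer
only when its hold timer expires. The function names are made up for
illustration, and the timer values in the note below are example
settings, not taken from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified BFD failure detection time in milliseconds:
 * detect multiplier times the negotiated packet interval. */
static uint32_t
bfd_detection_time_ms(uint32_t interval_ms, uint32_t detect_mult)
{
    assert(detect_mult > 0);
    return interval_ms * detect_mult;
}

/* Without BFD, a BGP speaker declares the peer dead when the hold
 * timer (given in seconds) expires. */
static uint32_t
bgp_detection_time_ms(uint32_t hold_time_s)
{
    return hold_time_s * 1000u;
}
```

For example, bfd_detection_time_ms(300, 3) gives 900 ms, whereas
bgp_detection_time_ms(180) gives 180000 ms, i.e. three minutes.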


For clarity, I've prepared a small illustration of the interacting
components in drawio [1] and PNG [2] formats. There are two pairs of curved
arrows between FRR running in a VRF and 2 external BGP speakers. The
light blue and orange arrows show the BGP session traffic datapaths and the
dashed red and blue lines show the BFD traffic datapaths.


Please correct me if I misunderstood your point.

0: https://docs.frrouting.org/en/latest/bfd.html#bgp-bfd-configuration

1: https://s3.k2.cloud/vlodintsov/public-artifacts/ovn-native-bgp.drawio

2: 
https://s3.k2.cloud/vlodintsov/public-artifacts/ovn-native-bgp.drawio.png

[0]
https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416038.html


Also, do we need to hard-code the BGP protocol, or can it be
generalized so that an end user can pass a proto and optionally a port
to forward? This can bring OSPF or other dynamic routing protocols support.

What do you think?

While it is true that this may apply generally to other routing
protocols, it does make it more complicated to configure.

Well, I thought about it more and agree with you. To implement OSP

Re: [ovs-dev] [Patch ovn v2 1/1] northd: BGP port mirroring.


Hi Frode,

On 26.07.2024 19:17, Frode Nordahl wrote:

On Fri, Jul 26, 2024 at 11:28 AM Vladislav Odintsov  wrote:

Hi Martin, thanks for the patch.

Typically for faster BGP (or other routing protocols) convergence BFD
signalling is used. Would you mind adding flows to forward BFD traffic
to the same LSP as it is already done for BGP?

It could be an additional option like ("enable-bfd-for-bgp") or
something like this, or we can install flows unconditionally. UDP/3784
is the default BFD proto/port.

BFD is indeed an important part for fast convergence of routing
protocols, it is however also an important part of end to end liveness
detection for a data path.

In this work our goal is to exchange control plane information with an
external daemon so that it can take care of the routing protocol state
machine. We do want to keep the data path in OVS so that users can
benefit from all the data path implementations it has to offer,
including hardware acceleration.

With this in mind, does it really make sense for the external daemon
to speak BFD, or would it be better to integrate with the OVN managed
BFD for static routes which is implemented in the OVS data path?


Typically BFD for routing protocols is configured in routing daemons (on
both sides of peering), because the main routing daemon (e.g. bgpd) has to
get notifications from the BFD engine (e.g. bfdd) that the connection is
lost. OVS-based BFD sessions seem to have nothing to do with this. I
proposed to install "redirect" flows similar to BGP: forward udp/3784 to
a dedicated LSP for the routing daemon. Once the flows are installed, the
OpenFlow control plane is not needed for BFD to work. The OVS datapath in
this case just forwards traffic from the external network (for instance,
a leaf switch) to the internal OVS port of the routing daemon.


Hope this explains the idea more clearly.




Also, do we need to hard-code the BGP protocol, or can it be
generalized so that an end user can pass a proto and optionally a port
to forward? This can bring OSPF or other dynamic routing protocols support.

What do you think?

While it is true that this may apply generally to other routing
protocols, it does make it more complicated to configure.

Well, I thought about it more and agree with you. To implement OSPF,
more work must be done, like handling multicast traffic for a
specific IPv4 or IPv6 address. This could be done in a separate patch
when OSPF support is needed.


--
Frode Nordahl


On 25.07.2024 02:02, Mark Michelson wrote:

Hi Martin, thanks for the patch.

I have one note below, but other than that it looks good to me.

On 7/16/24 02:59, Martin Kalcok wrote:

This change adds a 'bgp-mirror' option to LRP that allows
mirroring of BGP control plane traffic to an arbitrary LSP
in its peer LS.

The option expects a string with a LSP name. When set,
any traffic entering LS that's destined for any of the
LRP's IP addresses (including IPv6 LLA) is redirected
to the LSP specified in the option's value.

This enables external BGP daemons to listen on an interface
bound to a LSP and effectively act as if they were listening
on (and speaking from) LRP's IP address.

Signed-off-by: Martin Kalcok 
---
   northd/northd.c | 87 +
   northd/ovn-northd.8.xml | 23 +++
   ovn-nb.xml  | 14 +++
   tests/ovn-northd.at | 45 +
   tests/system-ovn.at | 86 
   5 files changed, 255 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 4353df07d..e07bf68cc 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -13048,6 +13048,92 @@ build_arp_resolve_flows_for_lrp(struct ovn_port *op,
   }
   }
   
+static void
+build_bgp_redirect_rule__(
+    const char *s_addr, const char *bgp_port_name, bool is_ipv6,
+    struct ovn_port *ls_peer, struct lflow_table *lflows,
+    struct ds *match, struct ds *actions)
+{
+    int ip_ver = is_ipv6 ? 6 : 4;
+    /* Redirect packets in the input pipeline destined for LR's IP to
+     * the port specified in 'bgp-mirror' option.
+     */
+    ds_clear(match);
+    ds_clear(actions);
+    ds_put_format(match, "ip%d.dst == %s && tcp.dst == 179", ip_ver, s_addr);
+    ds_put_format(actions, "outport = \"%s\"; output;", bgp_port_name);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+                  ds_cstr(match),
+                  ds_cstr(actions),
+                  ls_peer->lflow_ref);
+
+
+    /* Drop any traffic originating from 'bgp-mirror' port that does
+     * not originate from BGP daemon port. This blocks unnecessary
+     * traffic like ARP broadcasts or IPv6 router solicitation packets
+     * from the dummy 'bgp-mirror' port.
+     */
+    ds_clear(match);
+    ds_put_format(match, "inport == \"%s\"", bgp_port_name);
+    ovn_

Re: [ovs-dev] [Patch ovn v2 1/1] northd: BGP port mirroring.


Hi Martin, thanks for the patch.

Typically for faster BGP (or other routing protocols) convergence BFD 
signalling is used. Would you mind adding flows to forward BFD traffic 
to the same LSP as it is already done for BGP?


It could be an additional option like ("enable-bfd-for-bgp") or
something like this, or we can install flows unconditionally. UDP/3784
is the default BFD proto/port.


Also, do we need to hard-code the BGP protocol, or can it be
generalized so that an end user can pass a proto and optionally a port
to forward? This can bring OSPF or other dynamic routing protocols support.


What do you think?

On 25.07.2024 02:02, Mark Michelson wrote:

Hi Martin, thanks for the patch.

I have one note below, but other than that it looks good to me.

On 7/16/24 02:59, Martin Kalcok wrote:

This change adds a 'bgp-mirror' option to LRP that allows
mirroring of BGP control plane traffic to an arbitrary LSP
in its peer LS.

The option expects a string with a LSP name. When set,
any traffic entering LS that's destined for any of the
LRP's IP addresses (including IPv6 LLA) is redirected
to the LSP specified in the option's value.

This enables external BGP daemons to listen on an interface
bound to a LSP and effectively act as if they were listening
on (and speaking from) LRP's IP address.

Signed-off-by: Martin Kalcok 
---
  northd/northd.c | 87 +
  northd/ovn-northd.8.xml | 23 +++
  ovn-nb.xml  | 14 +++
  tests/ovn-northd.at | 45 +
  tests/system-ovn.at | 86 
  5 files changed, 255 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 4353df07d..e07bf68cc 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -13048,6 +13048,92 @@ build_arp_resolve_flows_for_lrp(struct ovn_port *op,
  }
  }
  
+static void
+build_bgp_redirect_rule__(
+    const char *s_addr, const char *bgp_port_name, bool is_ipv6,
+    struct ovn_port *ls_peer, struct lflow_table *lflows,
+    struct ds *match, struct ds *actions)
+{
+    int ip_ver = is_ipv6 ? 6 : 4;
+    /* Redirect packets in the input pipeline destined for LR's IP to
+ * the port specified in 'bgp-mirror' option.
+ */
+    ds_clear(match);
+    ds_clear(actions);
+    ds_put_format(match, "ip%d.dst == %s && tcp.dst == 179", ip_ver, s_addr);
+    ds_put_format(actions, "outport = \"%s\"; output;", bgp_port_name);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100,
+  ds_cstr(match),
+  ds_cstr(actions),
+  ls_peer->lflow_ref);
+
+
+    /* Drop any traffic originating from 'bgp-mirror' port that does
+ * not originate from BGP daemon port. This blocks unnecessary
+ * traffic like ARP broadcasts or IPv6 router solicitation packets
+ * from the dummy 'bgp-mirror' port.
+ */
+    ds_clear(match);
+    ds_put_format(match, "inport == \"%s\"", bgp_port_name);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_CHECK_PORT_SEC, 80,
+  ds_cstr(match),
+  REGBIT_PORT_SEC_DROP " = 1; next;",
+  ls_peer->lflow_ref);
+
+    ds_put_format(match,
+  " && ip%d.src == %s && tcp.src == 179",
+  ip_ver,
+  s_addr);
+    ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_CHECK_PORT_SEC, 81,
+  ds_cstr(match),
+  REGBIT_PORT_SEC_DROP " = check_in_port_sec(); next;",
+  ls_peer->lflow_ref);
+}
+
+static void
+build_lrouter_bgp_redirect(
+    struct ovn_port *op, struct lflow_table *lflows,
+    struct ds *match, struct ds *actions)
+{
+    /* LRP has to have a peer.*/
+    if (op->peer == NULL) {
+    return;
+    }
+    /* LRP has to have NB record.*/
+    if (op->nbrp == NULL) {
+    return;
+    }
+
+    /* Proceed only for LRPs that have 'bgp-mirror' option set. Value of this
+     * option is the name of LSP to which a BGP traffic will be mirrored.
+     */
+    const char *bgp_port = smap_get(&op->nbrp->options, "bgp-mirror");
+    if (bgp_port == NULL) {
+    return;
+    }
+
+    if (op->cr_port != NULL) {
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+    VLOG_WARN_RL(&rl, "Option 'bgp-mirror' is not supported on"
+  " Distributed Gateway Port '%s'", op->key);
+    return;
+    }

Somewhere around here would be a good place to ensure that "bgp_port"
exists on op->peer. If the port does not exist, then print a warning
and return early.


It would also be a good idea to add this as part of the ovn-northd 
test. Set the router port's bgp_port to a nonexistent port on the 
connected logical switch and ensure that BGP-related logical flows are 
not installed.
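
The check suggested above could look roughly like the following
standalone C sketch. It deliberately does not use northd's real data
structures: 'struct switch_ports', switch_has_port() and
should_install_redirect() are hypothetical stand-ins for however the
peer logical switch's ports would actually be looked up, and the
warning goes to stderr here instead of through the rate-limited VLOG
macros used in northd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the LSP names on the peer logical switch. */
struct switch_ports {
    const char *const *names;
    size_t n;
};

/* Returns true if 'name' is one of the ports of 'ls'. */
static bool
switch_has_port(const struct switch_ports *ls, const char *name)
{
    assert(ls && name);
    for (size_t i = 0; i < ls->n; i++) {
        if (!strcmp(ls->names[i], name)) {
            return true;
        }
    }
    return false;
}

/* Returns true only if the option is set and names an existing port.
 * Otherwise no BGP-related flows should be installed. */
static bool
should_install_redirect(const struct switch_ports *ls, const char *bgp_port)
{
    if (!bgp_port) {
        return false;       /* Option not set, nothing to do. */
    }
    if (!switch_has_port(ls, bgp_port)) {
        fprintf(stderr, "warning: port '%s' not found on peer switch, "
                        "skipping BGP redirect\n", bgp_port);
        return false;       /* Warn and return early. */
    }
    return true;
}
```

A northd test along the lines described above would then set the option
to a nonexistent port name and expect no redirect flows.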



+
+    /* Mirror traffic destined for LRP's IPs and default BGP port
+ * to the port defined in 'bgp-mirror' option.
+ */
+    for (size_t

Re: [ovs-dev] [PATCH ovn v2 4/5] northd: Add options for distributed route exchange.

Hi Frode,

First of all, thanks for the patch set and for starting work on OVN native
BGP integration!

I’ve got some questions/comments, please see inline.

regards,
Vladislav Odintsov




On 19 Jul 2024, at 09:10, Frode Nordahl  wrote:

Add three new options for Logical Router Ports that control
ovn-controller route exchange for NAT addresses and LB VIPs.

Load Balancers already have structured data in the Southbound
database which the ovn-controller can use directly.

NAT addresses are however currently only expressed as specialized
rules in the Port_Binding table nat_addresses column on LSP peer
records for LRPs, used for (G)ARP processing, as well as
logical flow rules for OpenFlow processing.

Options considered for how to redistribute these addresses to the
ovn-controllers in a structured way:
* Introduce even more conditional processing of the lsp
 nat_addresses column.
* Parse ct_dnat records in the Logical_Flow table.
* Add column to the Port_Binding table.
* Copy the Northbound NAT table over to the Southbound database,
 similar to what is done with Load Balancers.
* Populate Port_Binding table nat_addresses column on LRPs peer
 record (the proposed approach).

The Port_Binding table LRP peer records nat_addresses column is
currently unused, populate it with NAT addresses for route
exchange, when the redistribute-nat LRP option is set to 'true'.

The options are only processed for gateway routers.

Signed-off-by: Frode Nordahl 
---
controller/pinctrl.c |  8 +--
northd/northd.c  | 22 +++
ovn-nb.xml   | 45 ++
ovn-sb.xml   | 51 +++-
tests/ovn-northd.at  | 35 ++
5 files changed, 158 insertions(+), 3 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 708240e24..d9ef97ce1 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -6428,11 +6428,15 @@ get_nat_addresses_and_keys(struct ovsdb_idl_index *sbrec_port_binding_by_name,
const struct sbrec_port_binding *pb;

pb = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port);
-if (!pb) {
+if (!pb || !pb->datapath) {
continue;
}

-if (pb->n_nat_addresses) {
+/* We only want to consider nat_addresses column for LS datapaths. */
+const char *logical_switch = smap_get(&pb->datapath->external_ids,
+  "logical-switch");
+
+if (pb->n_nat_addresses && logical_switch) {
for (int i = 0; i < pb->n_nat_addresses; i++) {
consider_nat_address(sbrec_port_binding_by_name,
 pb->nat_addresses[i], pb,
diff --git a/northd/northd.c b/northd/northd.c
index 6898daa00..10d78b561 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3939,6 +3939,28 @@ sync_pb_for_lrp(struct ovn_port *op,
}
if (chassis_name) {
smap_add(&new, "l3gateway-chassis", chassis_name);
+if (smap_get_bool(&op->nbrp->options, "maintain-vrf", false)) {
+smap_add(&new, "maintain-vrf", "true");
+}
+if (smap_get_bool(&op->nbrp->options,
+  "redistribute-nat", false)) {
+smap_add(&new, "redistribute-nat", "true");
+
+size_t n_nats = 0;
+char **nats = NULL;
+nats = get_nat_addresses(op, &n_nats, false, false, NULL);
+sbrec_port_binding_set_nat_addresses(op->sb,
+ (const char **) nats,
+ n_nats);
+for (size_t i = 0; i < n_nats; i++) {
+free(nats[i]);
+}
+free(nats);
+}
+if (smap_get_bool(&op->nbrp->options,
+  "redistribute-lb-vips", false)) {
+smap_add(&new, "redistribute-lb-vips", "true");
+}
}
}

diff --git a/ovn-nb.xml b/ovn-nb.xml
index 9552534f6..7a5c1be57 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -3451,6 +3451,51 @@ or
   option.

  
+
+  
+
+  When configured the ovn-controller will redistribute
+  host routes to Load Balancer VIPs that are local to its chassis and
+  associated with the LR datapath.

 

Though I'm not a native English speaker, it seems that this sentence should be
adjusted because it is unclear what "host routes" are and why they should be
redistributed TO Load Balancer VIPs. Also, datapath is an internal term, should
we use just "Logical Router" instead?

 

Also, I

[ovs-dev] [PATCH ovn] tests: Fix ssl-ciphers RO sb test with old openssl.

2024-07-06 Thread Vladislav Odintsov

The test "read-only sb db:pssl access with ssl-ciphers and ssl-protocols"
fails when running with an openssl version which doesn't support some of
the passed values.
For instance, openssl 1.0.2 has no support for 'SECLEVEL', and the
test fails due to an extra line in stderr, which is asserted as part of
the test:

  ./ovn.at:37851: ovn-sbctl --db=ssl:127.0.0.1:$TCP_PORT \
--private-key=$PKIDIR/testpki-test-privkey.pem \
  --certificate=$PKIDIR/testpki-test-cert.pem \
  --ca-cert=$PKIDIR/testpki-cacert.pem \
  --ssl-ciphers='HIGH:!aNULL:!MD5:@SECLEVEL=1' \
  --ssl-protocols='TLSv1,TLSv1.1,TLSv1.2' \
chassis-add ch vxlan 1.2.4.8
  --- - 2024-07-05 13:48:11.697647047 +0300
  +++ /builddir/build/BUILD/ovn-24.03.90/tests/testsuite.dir/at-groups/520/stderr 2024-07-05 13:48:11.694353357 +0300
  @@ -1,2 +1,3 @@
  +2024-07-05T10:48:11Z|1|stream_ssl|ERR|SSL_CTX_set_cipher_list: error:140E6118:SSL routines:SSL_CIPHER_PROCESS_RULESTR:invalid command
   ovn-sbctl: transaction error: {"details":"insert operation not allowed when database server is in read only mode","error":"not allowed"}

This patch fixes the test by adding a grep for the expected transaction error.

CC: Aliasgar Ginwala 
Fixes: 620203f9f0d9 ("Fix segfault due to ssl-ciphers.")
Signed-off-by: Vladislav Odintsov 
---
 tests/ovn.at | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index 87a64499f..2341f52d5 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -37854,9 +37854,9 @@ AT_CHECK([ovn-sbctl --db=ssl:127.0.0.1:$TCP_PORT \
 --ca-cert=$PKIDIR/testpki-cacert.pem \
 --ssl-ciphers='HIGH:!aNULL:!MD5:@SECLEVEL=1' \
 --ssl-protocols='TLSv1,TLSv1.1,TLSv1.2' \
-chassis-add ch vxlan 1.2.4.8], [1], [ignore],
-[ovn-sbctl: transaction error: {"details":"insert operation not allowed when database server is in read only mode","error":"not allowed"}
-])
+chassis-add ch vxlan 1.2.4.8 2>&1 | grep 'transaction error]', [0], [dnl
+ovn-sbctl: transaction error: {"details":"insert operation not allowed when database server is in read only mode","error":"not allowed"}
+], [ignore])
 
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 AT_CLEANUP
-- 
2.45.2


Re: [ovs-dev] [PATCH ovn v6 0/2] Add support to disable VXLAN mode.

2024-07-05 Thread Vladislav Odintsov

Hi Alexey,

The discussion of explicit configuration vs. automatic detection of VXLAN mode
can be found in the following thread [1].

1: https://mail.openvswitch.org/pipermail/ovs-dev/2024-April/412986.html

regards,
 
Vladislav Odintsov

-Original Message-
From: dev  on behalf of Aleksey Baulin via dev 

Reply to: Aleksey Baulin 
Date: Friday, 5 July 2024 at 14:32
To: "ovs-dev@openvswitch.org" 
Subject: Re: [ovs-dev] [PATCH ovn v6 0/2] Add support to disable VXLAN mode.

I missed the discussion on this patch set. I can see that it's been
accepted. Still, I'd like to ask a question.

Originally there was the VTEP-only VxLAN scenario that worked great in
OVN. Then, the patch by Ihar Hrachyshka provided support for internal
VxLANs. The tunnel id space was severely limited which, in turn,
became the limitation for the VTEP-only VxLAN scenario as well. In
other words, the patch that introduced internal VxLANs broke the
scenario for VTEP-only VxLANs.

In essence, Vladislav Odintsov asserts that the patch by Ihar
Hrachyshka introduced an implicit configuration option the state of
which was determined in software (the function is_vxlan_mode()). The
new patch by Vladislav Odintsov makes that option explicit. That makes
the behavior of a configured cluster completely depend on its value -
as opposed to determining the behavior in software from configuration.

On the one hand, that makes it simple - one option to control whether
the cluster supports internal VxLANs or it is VTEP-only VxLANs. On the
other hand I can't help but wonder - why can't that be determined in
software - like in the original patch? I would think that all
necessary data is present in a configuration and there's no need for
an extra explicit option. With the new patch from Vladislav Odintsov
it can be done just once in one place.

So the question is: what are the pros and cons of each variant? Why is
the variant with a new option chosen?

Thanks!

Re: [ovs-dev] [PATCH ovn v6 0/2] Add support to disable VXLAN mode.

2024-06-28 Thread Vladislav Odintsov

Thank you all!

regards,
 
Vladislav Odintsov

-Original Message-
From: dev  on behalf of Dumitru Ceara 

Date: Friday, 28 June 2024 at 12:03
To: Vladislav Odintsov , "d...@openvswitch.org" 

Subject: Re: [ovs-dev] [PATCH ovn v6 0/2] Add support to disable VXLAN mode.

On 6/7/24 15:54, Vladislav Odintsov wrote:
> v6:
>   - Addressed Mark's review comments:
> 1. Removed global variable "vxlan_mode" change from "global" engine node.
> 2. Configuration knob "disable_vxlan_mode" was renamed to "vxlan_mode"
> v5:
>   - Addressed Ihar's review comments:
> 1. fixed errors after incorrect conflicts solving on rebase;
> 2. changed VXLAN mode naming to capitalized;
> 3. clarified VXLAN mode in ovn-architecture man page.
> v4:
>   - Addressed Dumitru's and Ihar's review comments;
>   - single patch was split into two:
> 1. function call replaced with a global variable `vxlan_mode`;
> 2. introduced `disable_vxlan_mode` configuration knob;
>   - rebased onto latest main branch.
> v3:
>   - Removed accidental ovs submodule change.
> v2:
>   - Added NEWS item.
> 
> Vladislav Odintsov (2):
>   northd: Make `vxlan_mode` a global variable.
>   northd: Add support for disabling vxlan mode.
> 

Thanks, Vladislav, Ihar and Ales!

I applied this series to main.

Best regards,
Dumitru


Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a global variable.

I’ve posted a new version (v6) of the patch set:

https://patchwork.ozlabs.org/project/ovn/list/?series=410010

> On 6 Jun 2024, at 22:40, Ihar Hrachyshka  wrote:
> 
> On Thu, Jun 6, 2024 at 1:27 AM Vladislav Odintsov wrote:
> 
>> Thanks Mark for such a detailed answer!
>> 
>> I agree with your points, and also was thinking about them, but could not
>> value their importance in terms of I-P logic. You helped with that.
>> 
>> I’d prefer to apply my proposal to revert back to “bool is_vxlan_mode()”
>> to make the “global” a true global. Will submit v6.
>> 
>> 
> Happy we are doing it. Mark is more eloquent than me. :)
> 
> 
>> regards,
>> Vladislav Odintsov
>> 
>>> On 5 Jun 2024, at 23:13, Mark Michelson  wrote:
>>> 
>>> On 6/5/24 08:51, Vladislav Odintsov wrote:
>>>> Hi Mark,
>>>> Thanks for the review!
>>>> Please, see below.
>>>> regards,
>>>> Vladislav Odintsov
>>>> -Original Message-
>>>> From: Mark Michelson 
>>>> Date: Tuesday, 4 June 2024 at 03:45
>>>> To: Vladislav Odintsov , 
>>>> Subject: Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a
>> global variable.
>>>>Hi Vladislav,
>>>>Generally speaking, I agree with this change. However, I think that
>>>>setting a global variable from an incremental processing engine node
>>>>runner feels wrong.
>>>> The init_vxlan_mode() is called inside the en_global_config_run() only
>> to
>>>> initialize global value, which is then read by
>> get_ovn_max_dp_key_local() to
>>>> fill the "max_tunid" variable inside incremental processing engine node.
>>>> Which drawbacks do you see of such variable initialization?
>>> 
>>> The biggest drawbacks are:
>>> * Reasoning about "ownership" of the vxlan_mode global variable
>>> * Maintenance of the en_global_config I-P engine node.
>>> 
>>> On the first point, since vxlan_mode is a global variable in northd.c,
>> it's not obvious that the owner of this data is the en_global_config engine
>> node. It's an easy mistake for someone to reference the variable before it
>> has been initialized, for instance. However, if the boolean is on the
>> ed_type_global_config struct, then it's clear to see that this data is
>> scoped to the en_global_config engine node.
>>> 
>>> On the second point, if someone were to overhaul the en_global_config
>> engine node, it would be an easy mistake to make to not notice that
>> vxlan_mode is being set by the engine node. I could see a developer
>> splitting the node into separate nodes, for instance. In doing so, the
>> developer could easily miss that the global vxlan_mode is being set by the
>> engine node, since it's hidden behind an init_ function call. However,
>> placing vxlan_mode on the ed_type_global_config makes it more clear that
>> the en_global_config engine node is responsible for setting the value.
>> 
>>> 
>>>>I think that instead, the "vxlan_mode" variable you have introduced
>>>>should be a field on struct ed_type_global_config. This way, the
>> engine
>>>>node is only modifying data local to itself.
>>>> I guess, that moving this to the struct ed_type_global_config will make
>> the code
>>>> a bit more complex: we have to pass this variable through all function
>> calls to
>>>> be able to read vxlan_mode value inside
>> ovn_datapath_assign_requested_tnl_id(),
>>>> ovn_port_assign_requested_tnl_id() and ovn_port_allocate_key().
>>> 
>>> I think dependency injection makes the code easier to read, understand,
>> and maintain rather than making it more complex. It's clearer that the data
>> from the en_global_config engine node is needed in all of the functions you
>> listed if those functions require an ed_type_global_config argument.
>>> 
>>>> Apart of this, the "vxlan_mode" variable has the same "global" meaning
>> as
>>>> "use_ct_inv_match", "check_lsp_is_up", "use_common_zone" and other
>> global
>>>> variables, which configure the global OVN behaviour. The difference is
>> that it
>>>> is required to read its value inside the en_global_config_run() to
>> reflect the
>>>> max_tunid back to NB_Global.
>>> 
>>> Personally, I don't l

[ovs-dev] [PATCH ovn v6 2/2] northd: Add support for disabling vxlan mode.

Commit [1] introduced a "VXLAN mode" concept.  It brought a limitation
on available tunnel IDs because of the lack of space in the VXLAN VNI.
In VXLAN mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
and 2047 logical ports per datapath.
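
As a rough illustration of where these limits come from, the sketch below
reproduces the arithmetic in Python.  The constant names mirror those visible
in the patches in this thread (OVN_MAX_DP_KEY, OVN_MAX_DP_GLOBAL_NUM,
OVN_MAX_DP_VXLAN_KEY), but their numeric values are assumptions chosen to
match the figures quoted here (4095, 2047 and 16711680).  The split of the
port half of the VNI between ports and multicast groups is likewise an
assumption, not something taken from the OVN sources.

```python
# Sketch only: all numeric values below are assumptions chosen to match
# the figures quoted in this thread, not values copied from OVN.
VXLAN_VNI_BITS = 24
DP_KEY_BITS_VXLAN = VXLAN_VNI_BITS // 2                     # 12 bits: datapath
PORT_KEY_BITS_VXLAN = VXLAN_VNI_BITS - DP_KEY_BITS_VXLAN    # 12 bits: port side

OVN_MAX_DP_KEY = (1 << 24) - 1          # assumed full datapath key space
OVN_MAX_DP_GLOBAL_NUM = (1 << 16) - 1   # assumed range reserved for global keys
OVN_MAX_DP_VXLAN_KEY = (1 << DP_KEY_BITS_VXLAN) - 1

# Assumption: half of the 12-bit port space is kept for multicast groups,
# which leaves 11 bits for logical ports.
MAX_PORT_KEY_VXLAN = (1 << (PORT_KEY_BITS_VXLAN - 1)) - 1


def get_ovn_max_dp_key_local(vxlan_mode: bool) -> int:
    """Largest locally allocatable datapath tunnel key."""
    if vxlan_mode:
        # The global range does not apply in VXLAN mode.
        return OVN_MAX_DP_VXLAN_KEY
    return OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM


print(get_ovn_max_dp_key_local(True))    # 4095
print(get_ovn_max_dp_key_local(False))   # 16711680
print(MAX_PORT_KEY_VXLAN)                # 2047
```

Under these assumptions, disabling VXLAN mode raises the local datapath key
limit from 4095 to 16711680, which is the range quoted in the NEWS entry.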

Prior to this patch VXLAN mode was enabled automatically if at least one
chassis had an encap of VXLAN type.  In scenarios where one wants to use
VXLAN only for a HW VTEP (RAMP) switch, such a limitation makes no sense.

This patch adds support for explicit disabling of VXLAN mode via
Northbound database.
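
The resulting check can be sketched as follows.  This is a Python paraphrase
of the C change in this patch, not the OVN code itself: chassis are modelled
as plain lists of encap type strings and NB_Global options as a dict of
strings.  Both representations, and the helper name get_bool(), are
assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def get_bool(options: Dict[str, str], key: str, default: bool) -> bool:
    """Hypothetical stand-in for smap_get_bool(): parse "true"/"false"."""
    value = options.get(key)
    if value is None:
        return default
    lowered = value.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    return default


def is_vxlan_mode(nb_options: Dict[str, str],
                  chassis_encaps: Iterable[List[str]]) -> bool:
    # An explicit vxlan_mode=false in NB_Global options always wins.
    if not get_bool(nb_options, "vxlan_mode", True):
        return False
    # Otherwise fall back to auto-detection: any chassis with a VXLAN encap.
    return any("vxlan" in encaps for encaps in chassis_encaps)
```

For example, is_vxlan_mode({}, [["geneve", "vxlan"]]) is True, while
is_vxlan_mode({"vxlan_mode": "false"}, [["geneve", "vxlan"]]) is False.
Leaving the option unset keeps the previous automatic behaviour.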

1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068

Acked-By: Ihar Hrachyshka 
Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Vladislav Odintsov 
---
 NEWS  |  4 
 northd/en-global-config.c |  8 +++-
 northd/northd.c   | 10 --
 northd/northd.h   |  3 ++-
 ovn-architecture.7.xml|  6 ++
 ovn-nb.xml| 10 ++
 tests/ovn-northd.at   | 29 +
 7 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 3bdc55172..aa1669d9c 100644
--- a/NEWS
+++ b/NEWS
@@ -31,6 +31,10 @@ Post v24.03.0
 has been renamed to "options:ic-route-denylist" in order to comply with
 inclusive language guidelines. The previous name is still recognized to
 aid with backwards compatibility.
+  - Added new global config option NB_Global:options:vxlan_mode to support
+ability to disable "VXLAN mode" to extend available tunnel IDs space for
+datapaths from 4095 to 16711680.  For more details see man ovn-nb(5) for
+mentioned option.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index df0f8e58c..784538a14 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -117,7 +117,8 @@ en_global_config_run(struct engine_node *node , void *data)
 
 char *max_tunid = xasprintf("%d",
 get_ovn_max_dp_key_local(
-is_vxlan_mode(sbrec_chassis_table)));
+is_vxlan_mode(&nb->options,
+  sbrec_chassis_table)));
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
@@ -534,6 +535,11 @@ check_nb_options_out_of_sync(const struct nbrec_nb_global *nb,
 return true;
 }
 
+if (config_out_of_sync(&nb->options, &config_data->nb_options,
+   "vxlan_mode", false)) {
+return true;
+}
+
 return false;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index 6d118a19a..a4937b472 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -886,8 +886,13 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table,
 }
 
 bool
-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+is_vxlan_mode(const struct smap *nb_options,
+  const struct sbrec_chassis_table *sbrec_chassis_table)
 {
+if (!smap_get_bool(nb_options, "vxlan_mode", true)) {
+return false;
+}
+
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
@@ -17605,7 +17610,8 @@ ovnnb_db_run(struct northd_input *input_data,
 use_common_zone = smap_get_bool(input_data->nb_options, "use_common_zone",
 false);
 
-vxlan_mode = is_vxlan_mode(input_data->sbrec_chassis_table);
+vxlan_mode = is_vxlan_mode(input_data->nb_options,
+   input_data->sbrec_chassis_table);
 
 build_datapaths(ovnsb_txn,
 input_data->nbrec_logical_switch_table,
diff --git a/northd/northd.h b/northd/northd.h
index 987f82954..2f2fdb673 100644
--- a/northd/northd.h
+++ b/northd/northd.h
@@ -790,7 +790,8 @@ lr_has_multiple_gw_ports(const struct ovn_datapath *od)
 }
 
 bool
-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table);
+is_vxlan_mode(const struct smap *nb_options,
+  const struct sbrec_chassis_table *sbrec_chassis_table);
 
 uint32_t get_ovn_max_dp_key_local(bool _vxlan_mode);
 
diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
index e32d1a9f7..640944faf 100644
--- a/ovn-architecture.7.xml
+++ b/ovn-architecture.7.xml
@@ -2920,4 +2920,10 @@
 the future, gateways that do not support encapsulations with large amounts
 of metadata may continue to have a reduced feature set.
   
+  
+VXLAN mode is recommended to be disabled if VXLAN encap at
+hypervisors is needed only to support HW VTEP L2 Gateway functionality.
+See man ovn-nb(5) for table NB_Global column
+options key vxlan_mode for more details.
+  
 
diff --git a/ovn-nb.xml b/ovn-nb.xml
inde

[ovs-dev] [PATCH ovn v6 1/2] northd: Make `vxlan_mode` a global variable.

This simplifies the code.  The subsequent commit, which allows VXLAN mode
to be explicitly disabled, builds on these changes.

Also, the term "VXLAN mode" is introduced in the ovn-architecture man page.

Signed-off-by: Vladislav Odintsov 
---
 northd/en-global-config.c |  3 +-
 northd/northd.c   | 76 ---
 northd/northd.h   |  5 ++-
 ovn-architecture.7.xml| 10 +++---
 4 files changed, 41 insertions(+), 53 deletions(-)

diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 28c78a12c..df0f8e58c 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -116,7 +116,8 @@ en_global_config_run(struct engine_node *node , void *data)
 }
 
 char *max_tunid = xasprintf("%d",
-get_ovn_max_dp_key_local(sbrec_chassis_table));
+get_ovn_max_dp_key_local(
+is_vxlan_mode(sbrec_chassis_table)));
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
diff --git a/northd/northd.c b/northd/northd.c
index 9f81afccb..6d118a19a 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
  */
 static bool default_acl_drop;
 
+/* If this option is 'true' northd will use limited 24-bit space for datapath
+ * and ports tunnel key allocation (12 bits for each instead of default 16). */
+static bool vxlan_mode;
+
 #define MAX_OVN_TAGS 4096
 
 
@@ -881,7 +885,7 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table,
 }
 }
 
-static bool
+bool
 is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
 {
 const struct sbrec_chassis *chassis;
@@ -896,25 +900,22 @@ is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
 }
 
 uint32_t
-get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
+get_ovn_max_dp_key_local(bool _vxlan_mode)
 {
-if (is_vxlan_mode(sbrec_chassis_table)) {
-/* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
+if (_vxlan_mode) {
+/* OVN_MAX_DP_GLOBAL_NUM doesn't apply for VXLAN mode. */
 return OVN_MAX_DP_VXLAN_KEY;
 }
 return OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM;
 }
 
 static void
-ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
-  struct hmap *datapaths, struct hmap *dp_tnlids,
+ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
   struct ovn_datapath *od, uint32_t *hint)
 {
 if (!od->tunnel_key) {
 od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(sbrec_ch_table),
-hint);
+OVN_MIN_DP_KEY_LOCAL, get_ovn_max_dp_key_local(vxlan_mode), hint);
 if (!od->tunnel_key) {
 if (od->sb) {
 sbrec_datapath_binding_delete(od->sb);
@@ -927,7 +928,6 @@ ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
 
 static void
 ovn_datapath_assign_requested_tnl_id(
-const struct sbrec_chassis_table *sbrec_chassis_table,
 struct hmap *dp_tnlids, struct ovn_datapath *od)
 {
 const struct smap *other_config = (od->nbs
@@ -936,8 +936,7 @@ ovn_datapath_assign_requested_tnl_id(
 uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key", 0);
 if (tunnel_key) {
 const char *interconn_ts = smap_get(other_config, "interconn-ts");
-if (!interconn_ts && is_vxlan_mode(sbrec_chassis_table) &&
-tunnel_key >= 1 << 12) {
+if (!interconn_ts && vxlan_mode && tunnel_key >= 1 << 12) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
 VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for datapath %s is "
  "incompatible with VXLAN", tunnel_key,
@@ -985,7 +984,6 @@ build_datapaths(struct ovsdb_idl_txn *ovnsb_txn,
 const struct nbrec_logical_switch_table *nbrec_ls_table,
 const struct nbrec_logical_router_table *nbrec_lr_table,
 const struct sbrec_datapath_binding_table *sbrec_dp_table,
-const struct sbrec_chassis_table *sbrec_chassis_table,
 struct ovn_datapaths *ls_datapaths,
 struct ovn_datapaths *lr_datapaths,
 struct ovs_list *lr_list)
@@ -1000,12 +998,11 @@ build_datapaths(struct ovsdb_idl_txn *ovnsb_txn,
 struct hmap dp_tnlids = HMAP_INITIALIZER(&dp_tnlids);
 struct ovn_datapath *od;
 LIST_FOR_EACH (od, list, &both) {
-ovn_datapath_assign_requested_tnl_id(sbrec_chassis_table, &dp_tnlids,
-

[ovs-dev] [PATCH ovn v6 0/2] Add support to disable VXLAN mode.

v6:
  - Addressed Mark's review comments:
1. Removed global variable "vxlan_mode" change from "global" engine node.
2. Configuration knob "disable_vxlan_mode" was renamed to "vxlan_mode"
v5:
  - Addressed Ihar's review comments:
1. fixed errors after incorrect conflicts solving on rebase;
2. changed VXLAN mode naming to capitalized;
3. clarified VXLAN mode in ovn-architecture man page.
v4:
  - Addressed Dumitru's and Ihar's review comments;
  - single patch was split into two:
1. function call replaced with a global variable `vxlan_mode`;
2. introduced `disable_vxlan_mode` configuration knob;
  - rebased onto latest main branch.
v3:
  - Removed accidental ovs submodule change.
v2:
  - Added NEWS item.

Vladislav Odintsov (2):
  northd: Make `vxlan_mode` a global variable.
  northd: Add support for disabling vxlan mode.

 NEWS  |  4 ++
 northd/en-global-config.c |  9 -
 northd/northd.c   | 84 +--
 northd/northd.h   |  6 ++-
 ovn-architecture.7.xml| 16 +---
 ovn-nb.xml| 10 +
 tests/ovn-northd.at   | 29 ++
 7 files changed, 104 insertions(+), 54 deletions(-)

-- 
2.44.0


Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a global variable.

2024-06-05 Thread Vladislav Odintsov

Thanks Mark for such a detailed answer!

I agree with your points and had been thinking about them as well, but could
not gauge their importance in terms of I-P logic. You helped with that.

I’d prefer to go with my proposal to revert to “bool is_vxlan_mode()” and
make the “global” a true global. Will submit v6.

regards,
Vladislav Odintsov

> On 5 Jun 2024, at 23:13, Mark Michelson  wrote:
> 
> On 6/5/24 08:51, Vladislav Odintsov wrote:
>> Hi Mark,
>> Thanks for the review!
>> Please, see below.
>> regards,
>>  Vladislav Odintsov
>> -Original Message-
>> From: Mark Michelson 
>> Date: Tuesday, 4 June 2024 at 03:45
>> To: Vladislav Odintsov , 
>> Subject: Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a global 
>> variable.
>> Hi Vladislav,
>> Generally speaking, I agree with this change. However, I think that
>> setting a global variable from an incremental processing engine node
>> runner feels wrong.
>> The init_vxlan_mode() is called inside the en_global_config_run() only to
>> initialize global value, which is then read by get_ovn_max_dp_key_local() to
>> fill the "max_tunid" variable inside incremental processing engine node.
>> Which drawbacks do you see of such variable initialization?
> 
> The biggest drawbacks are:
> * Reasoning about "ownership" of the vxlan_mode global variable
> * Maintenance of the en_global_config I-P engine node.
> 
> On the first point, since vxlan_mode is a global variable in northd.c, it's 
> not obvious that the owner of this data is the en_global_config engine node. 
> It's an easy mistake for someone to reference the variable before it has been 
> initialized, for instance. However, if the boolean is on the 
> ed_type_global_config struct, then it's clear to see that this data is scoped 
> to the en_global_config engine node.
> 
> On the second point, if someone were to overhaul the en_global_config engine 
> node, it would be an easy mistake to make to not notice that vxlan_mode is 
> being set by the engine node. I could see a developer splitting the node into 
> separate nodes, for instance. In doing so, the developer could easily miss 
> that the global vxlan_mode is being set by the engine node, since it's hidden 
> behind an init_ function call. However, placing vxlan_mode on the 
> ed_type_global_config makes it more clear that the en_global_config engine 
> node is responsible for setting the value.

> 
>> I think that instead, the "vxlan_mode" variable you have introduced
>> should be a field on struct ed_type_global_config. This way, the engine
>> node is only modifying data local to itself.
>> I guess that moving this to the struct ed_type_global_config will make the
>> code a bit more complex: we have to pass this variable through all function
>> calls to be able to read vxlan_mode value inside
>> ovn_datapath_assign_requested_tnl_id(),
>> ovn_port_assign_requested_tnl_id() and ovn_port_allocate_key().
> 
> I think dependency injection makes the code easier to read, understand, and 
> maintain rather than making it more complex. It's clearer that the data from 
> the en_global_config engine node is needed in all of the functions you listed 
> if those functions require an ed_type_global_config argument.
> 
>> Apart from this, the "vxlan_mode" variable has the same "global" meaning as
>> "use_ct_inv_match", "check_lsp_is_up", "use_common_zone" and other global
>> variables, which configure the global OVN behaviour. The difference is that
>> it is required to read its value inside the en_global_config_run() to
>> reflect the max_tunid back to NB_Global.
> 
> Personally, I don't like those global variables either :)
> 
> But those globals are also set within northd.c, and are initialized at the 
> beginning of a DB run. From the perspective of northd processing, they are
> truly "global" in their scope. The engine nodes form a dependency tree, and 
> so it's important that data that is scoped to a particular node is housed in 
> that engine node's data. This way, when nodes are being created, it's clear 
> to know which other engine nodes they depend on. If engine nodes are setting 
> global variables, then it becomes harder to understand the dependencies.
>> If the global variable setting is totally not acceptable from engine node, I
>> can propose another approach here. Revert init_vxlan_mode() back to
>> `bool is_vxlan_mode()` and assign global variable outside of global engine 
>>

Re: [ovs-dev] [PATCH ovn v5 2/2] northd: Add support for disabling vxlan mode.

2024-06-05 Thread Vladislav Odintsov

Hi Mark,

Please see below.

regards,
 
Vladislav Odintsov

-Original Message-
From: dev  on behalf of Mark Michelson 

Date: Tuesday, 4 June 2024 at 03:45
To: Vladislav Odintsov , "d...@openvswitch.org" 

Subject: Re: [ovs-dev] [PATCH ovn v5 2/2] northd: Add support for disabling vxlan mode.

Hi Vladislav,

I have just one comment below.

On 5/3/24 04:13, Vladislav Odintsov wrote:
> Commit [1] introduced a "VXLAN mode" concept.  It brought a limitation
> for available tunnel IDs because of lack of space in VXLAN VNI.
> In VXLAN mode OVN is limited by 4095 datapaths (LRs or non-transit LSs)
> and 2047 logical ports per datapath.
> 
> Prior to this patch VXLAN mode was enabled automatically if at least one
> chassis had encap of VXLAN type.  In scenarios where one want to use
> VXLAN only for HW VTEP (RAMP) switch, such limitation makes no sence.
> 
> This patch adds support for explicit disabling of VXLAN mode via
> Northbound database.
> 
> 1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068
> 
> Acked-By: Ihar Hrachyshka 
> Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
> Signed-off-by: Vladislav Odintsov 
> ---
>   NEWS  |  4 
>   northd/en-global-config.c |  7 ++-
>   northd/northd.c   | 10 --
>   northd/northd.h   |  3 ++-
>   ovn-architecture.7.xml|  6 ++
>   ovn-nb.xml| 10 ++
>   tests/ovn-northd.at   | 29 +
>   7 files changed, 65 insertions(+), 4 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 3b5e93dc9..007b27f3d 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -17,6 +17,10 @@ Post v24.03.0
>   external-ids, the option is no longer needed as it became 
effectively
>   "true" for all scenarios.
> - Added DHCPv4 relay support.
> +  - Added new global config option NB_Global:options:disable_vxlan_mode 
to
> +extend available tunnel IDs space for datapaths from 4095 to 16711680
> +when running in "VXLAN mode".  For more details see man ovn-nb(5) for
> +mentioned option.
>   
>   OVN v24.03.0 - 01 Mar 2024
>   --
> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
> index 873649a89..f5e2a8154 100644
> --- a/northd/en-global-config.c
> +++ b/northd/en-global-config.c
> @@ -115,7 +115,7 @@ en_global_config_run(struct engine_node *node , void 
*data)
>config_data->svc_monitor_mac);
>   }
>   
> -init_vxlan_mode(sbrec_chassis_table);
> +init_vxlan_mode(&nb->options, sbrec_chassis_table);
>   char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
>   smap_replace(options, "max_tunid", max_tunid);
>   free(max_tunid);
> @@ -533,6 +533,11 @@ check_nb_options_out_of_sync(const struct 
nbrec_nb_global *nb,
>   return true;
>   }
>   
> +if (config_out_of_sync(&nb->options, &config_data->nb_options,
> +   "disable_vxlan_mode", false)) {
> +return true;
> +}
> +
>   return false;
>   }
>   
> diff --git a/northd/northd.c b/northd/northd.c
> index 0e0ae24db..7bdffe531 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -886,8 +886,14 @@ join_datapaths(const struct 
nbrec_logical_switch_table *nbrec_ls_table,
>   }
>   
>   void
> -init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
> +init_vxlan_mode(const struct smap *nb_options,
> +const struct sbrec_chassis_table *sbrec_chassis_table)
>   {
> +if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
> +vxlan_mode = false;
> +return;
> +}
> +
>   const struct sbrec_chassis *chassis;
>   SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
>   for (int i = 0; i < chassis->n_encaps; i++) {
> @@ -17596,7 +17602,7 @@ ovnnb_db_run(struct northd_input *input_data,
>   use_common_zone = smap_get_bool(input_data->nb_options, 
"use_common_zone",
>   false);
>   
> -init_vxlan_mode(input_data->sbrec_chassis_table);
> +init_vxlan_mode(input_data->nb_options, 
input_data->sbrec_chassis_table);
>

Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a global variable.

2024-06-05 Thread Vladislav Odintsov

Hi Mark,

Thanks for the review!
Please, see below.

regards,
 
Vladislav Odintsov

-Original Message-
From: Mark Michelson 
Date: Tuesday, 4 June 2024 at 03:45
To: Vladislav Odintsov , 
Subject: Re: [ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a global variable.

    Hi Vladislav,

    Generally speaking, I agree with this change. However, I think that
    setting a global variable from an incremental processing engine node
    runner feels wrong.

init_vxlan_mode() is called inside en_global_config_run() only to
initialize the global value, which is then read by get_ovn_max_dp_key_local()
to fill the "max_tunid" variable inside the incremental processing engine node.

Which drawbacks do you see in initializing the variable this way?

    I think that instead, the "vxlan_mode" variable you have introduced
    should be a field on struct ed_type_global_config. This way, the engine
    node is only modifying data local to itself.

I guess that moving this to struct ed_type_global_config will make the code
a bit more complex: we would have to pass this variable through all function
calls to be able to read the vxlan_mode value inside
ovn_datapath_assign_requested_tnl_id(),
ovn_port_assign_requested_tnl_id() and ovn_port_allocate_key().
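
To make the trade-off concrete, here is a small Python sketch of the two
styles under discussion. It is not OVN code: GlobalConfig loosely stands in
for struct ed_type_global_config, the function names are hypothetical, and
the 1 << 12 bound is borrowed from the patch purely for illustration.

```python
from dataclasses import dataclass

# Style 1: a module-level global that some other code sets during a run.
vxlan_mode = False


def tnl_key_allowed_global(tunnel_key: int) -> bool:
    # Depends on hidden state: the caller cannot tell from the signature
    # that vxlan_mode must have been initialized first.
    return not (vxlan_mode and tunnel_key >= 1 << 12)


# Style 2: the value travels with the data of the node that owns it.
@dataclass(frozen=True)
class GlobalConfig:
    vxlan_mode: bool


def tnl_key_allowed(cfg: GlobalConfig, tunnel_key: int) -> bool:
    # The dependency on the global config data is visible in the signature.
    return not (cfg.vxlan_mode and tunnel_key >= 1 << 12)
```

The second form needs the extra argument at every call site, which is the
added plumbing mentioned above. In exchange, each function states which data
it depends on.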

Apart from this, the "vxlan_mode" variable has the same "global" meaning as
"use_ct_inv_match", "check_lsp_is_up", "use_common_zone" and other global
variables, which configure global OVN behaviour. The difference is that its
value must be read inside en_global_config_run() to reflect max_tunid back to
NB_Global.

If setting the global variable from an engine node is not acceptable at all,
I can propose another approach here: revert init_vxlan_mode() back to
`bool is_vxlan_mode()` and assign the global variable outside of the global
engine node:


diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 873649a89..df0f8e58c 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,8 +115,9 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-init_vxlan_mode(sbrec_chassis_table);
-char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
+char *max_tunid = xasprintf("%d",
+get_ovn_max_dp_key_local(
+is_vxlan_mode(sbrec_chassis_table)));
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
diff --git a/northd/northd.c b/northd/northd.c
index 0e0ae24db..9ac608f03 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -885,25 +885,24 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table,
 }
 }
 
-void
-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+bool
+is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
 {
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
 if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
-vxlan_mode = true;
-return;
+return true;
 }
 }
 }
-vxlan_mode = false;
+return false;
 }
 
 uint32_t
-get_ovn_max_dp_key_local(void)
+get_ovn_max_dp_key_local(bool _vxlan_mode)
 {
-if (vxlan_mode) {
+if (_vxlan_mode) {
 /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for VXLAN mode. */
 return OVN_MAX_DP_VXLAN_KEY;
 }
@@ -916,9 +915,7 @@ ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
 {
 if (!od->tunnel_key) {
 od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(),
-hint);
+OVN_MIN_DP_KEY_LOCAL, get_ovn_max_dp_key_local(vxlan_mode), hint);
 if (!od->tunnel_key) {
 if (od->sb) {
 sbrec_datapath_binding_delete(od->sb);
@@ -17596,7 +17593,7 @@ ovnnb_db_run(struct northd_input *input_data,
 use_common_zone = smap_get_bool(input_data->nb_options, "use_common_zone",
 false);
 
-init_vxlan_mode(input_data->sbrec_chassis_table);
+vxlan_mode = is_vxlan_mode(input_data->sbrec_chassis_table);
 
 build_datapaths(ovnsb_txn,
 input_data->nbrec_logical_switch_table,
diff --git a/northd/northd.h b/northd/northd.h
index be480003e..c613652e9 100644
--- a/northd/northd.h
+++ b/northd/northd.h
@@ -791,9 +791,9 @@ lr_has_multiple_gw_ports(const struct ovn_datapath *od)
 return od->n_l3dgw_ports > 1 && !od->is_gw_router;
 }
 
-vo

Re: [ovs-dev] [PATCH] python: idl: Fix index not being updated on row modification.

2024-05-28 Thread Vladislav Odintsov

Hi Ilya,

Thanks for the fix!
Some time ago we noticed a problem with index updates internally. It was not a
serious issue for us, but I tried your fix and it resolves what we originally
saw.

How we observed that issue:

In terminal one:
ovn-nbctl ls-add test

In terminal two:
Run Python, load the IDL and query the name of the created object (ls "test").
We use ovsdbapp, so the example uses it as well:

>>> from ovsdbapp.backend.ovs_idl import connection, idlutils
>>> import ovsdbapp.schema.ovn_northbound.impl_idl as nb_idl
>>>
>>> idl = connection.OvsdbIdl.from_server("tcp:127.0.0.1:6641", "OVN_Northbound")
>>> api_idl = nb_idl.OvnNbApiIdlImpl(connection.Connection(idl, 100))
>>>
>>> api_idl.ls_get("test").execute().name
'test'

Then switch back to the first terminal and change the ls 'name' (which is an
indexed field):

ovn-nbctl set logical-switch test name=test2

Switch back to python terminal and try to get the name again.
With an affected python-ovs, a lookup by the old name "test" still finds the
row, which now reports the new name "test2", while a lookup by "test2" finds
nothing:

>>> api_idl.ls_get("test").execute().name
'test2'
>>> api_idl.ls_get("test2").execute().name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'name'

With your patch it works as expected:

>>> api_idl.ls_get("test").execute().name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'name'
>>> api_idl.ls_get("test2").execute().name
'test2'

I just wanted to share our experience with this problem and patch.
You can add this to the OVS Python tests, if you think it's worth it.

Thanks again :)

regards,
 
Vladislav Odintsov

-Original Message-
From: dev  on behalf of Ilya Maximets 

Date: Tuesday, 28 May 2024 at 00:39
To: "ovs-dev@openvswitch.org" 
Cc: Ilya Maximets , Dumitru Ceara 
Subject: [ovs-dev] [PATCH] python: idl: Fix index not being updated on row modification.

When a row is modified, the Python IDL doesn't perform any operations on
existing client-side indexes.  This means that if the column on which an
index is created changes, the old value will remain in the index and
the new one will not be added to the index.  Besides lookup failures,
this also makes it impossible to remove modified rows, because the
new column value doesn't exist in the index, causing an exception on
an attempt to remove it:

 Traceback (most recent call last):
   File "ovsdbapp/backend/ovs_idl/connection.py", line 110, in run
 self.idl.run()
   File "ovs/db/idl.py", line 465, in run
 self.__parse_update(msg.params[2], OVSDB_UPDATE3)
   File "ovs/db/idl.py", line 924, in __parse_update
 self.__do_parse_update(update, version, self.tables)
   File "ovs/db/idl.py", line 964, in __do_parse_update
 changes = self.__process_update2(table, uuid, row_update)
   File "ovs/db/idl.py", line 991, in __process_update2
 del table.rows[uuid]
   File "ovs/db/custom_index.py", line 102, in __delitem__
 index.remove(val)
   File "ovs/db/custom_index.py", line 66, in remove
 self.values.remove(self.index_entry_from_row(row))
   File "sortedcontainers/sortedlist.py", line 2015, in remove
 raise ValueError('{0!r} not in list'.format(value))
 ValueError: Datapath_Binding(
   uuid=UUID('498e66a2-70bc-4587-a66f-0433baf82f60'),
   tunnel_key=16711683, load_balancers=[], external_ids={}) not in list

Fix that by always removing an existing row from indexes before
modification and adding back afterwards.  This ensures that old
values are removed from the index and new ones are added.

This behavior is consistent with the C implementation.
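
As a rough illustration of the remove-then-re-add approach described above,
here is a toy Python model.  It is not the code from ovs/db/idl.py or
ovs/db/custom_index.py; the class IndexedRows and all of its methods are
hypothetical names made up for this sketch, and the "index" is a plain dict
rather than a sorted list.

```python
class IndexedRows:
    """Toy table: rows keyed by UUID plus one client-side index on 'name'.

    Hypothetical stand-in for illustration only, not the python IDL API.
    """

    def __init__(self):
        self.rows = {}     # uuid -> row dict
        self.by_name = {}  # value of the indexed column -> uuid

    def insert(self, uuid, row):
        self.rows[uuid] = row
        self.by_name[row["name"]] = uuid

    def modify_buggy(self, uuid, **changes):
        # The row changes in place, but the index is never touched, so the
        # old column value stays in the index and the new one is missing.
        self.rows[uuid].update(changes)

    def modify_fixed(self, uuid, **changes):
        row = self.rows[uuid]
        del self.by_name[row["name"]]     # remove the entry for the old value
        row.update(changes)
        self.by_name[row["name"]] = uuid  # add it back under the new value

    def get(self, name):
        uuid = self.by_name.get(name)
        return self.rows[uuid] if uuid is not None else None

    def delete(self, uuid):
        row = self.rows.pop(uuid)
        del self.by_name[row["name"]]     # raises KeyError on a stale index


buggy = IndexedRows()
buggy.insert("u1", {"name": "test"})
buggy.modify_buggy("u1", name="test2")

fixed = IndexedRows()
fixed.insert("u1", {"name": "test"})
fixed.modify_fixed("u1", name="test2")
```

In this model, buggy.get("test") still finds the row (now named "test2"),
buggy.get("test2") finds nothing, and buggy.delete("u1") fails because the
new value was never indexed.  The fixed variant behaves the other way round,
which mirrors the lookup and removal failures described in the commit message.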

A new test that reproduces the removal issue is added.  Some extra
testing infrastructure is added to be able to handle and print out the
'indexed' table from the idltest schema.

Fixes: 13973bc41524 ("Add multi-column index support for the Python IDL")
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2024-May/053159.html
Reported-by: Roberto Bartzen Acosta 
Signed-off-by: Ilya Maximets 
---
 python/ovs/db/idl.py | 13 --
 tests/ovsdb-idl.at   | 95 +++-
 tests/test-ovsdb.c   | 43 
 tests/test-ovsdb.py  | 15 +++
 4 files ch

Re: [ovs-dev] OVS 3.3.1 release date

Thanks, Ilya, for the clarification!
Looking forward to seeing the new release.

regards,

Vladislav Odintsov

-Original Message-
From: dev  on behalf of Ilya Maximets 

Date: Wednesday, 22 May 2024 at 18:27
To: Vladislav Odintsov , "ovs-dev@openvswitch.org" 

Cc: "i.maxim...@ovn.org" , David Marchand 

Subject: Re: [ovs-dev] OVS 3.3.1 release date

    On 5/22/24 16:17, Vladislav Odintsov wrote:
> Hi all,
> 
> I’m wondering whether there is a planned date for OVS 3.3.1 release?
> 
> Currently there are a lot of useful bugfixes in branch-3.3 above 3.3.1
> tag and latest release was on the 17th of February (>3 months ago).

Hi.  I plan to make a series of releases in the next couple of weeks,
ideally by the end of May, but maybe the first week of June.

The current plan is to try to incorporate, at least partially, David's fixes:
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=403694

And get DPDK update to the recently released v23.11.1.

But I think we'll need to make stable releases even if we will not be
able to incorporate changes above in time.

Best regards, Ilya Maximets.

[ovs-dev] OVS 3.3.1 release date

Hi all,

I’m wondering whether there is a planned date for the OVS 3.3.1 release?

Currently there are a lot of useful bugfixes in branch-3.3 above 3.3.1 tag, and
the latest release was on the 17th of February (>3 months ago).

regards,

Vladislav Odintsov


Re: [ovs-dev] [PATCH ovn v2] controller: Store src_mac, src_ip in svc_monitor struct.

Again adding forgotten tag:

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-April/413198.html

regards,
 
Vladislav Odintsov

On 22.05.2024, 15:19, "Vladislav Odintsov"  wrote:

These structure members are read in pinctrl_handler() in a separate thread.
This is unsafe: when IDL is re-connected or some IDL objects are freed
after svc_monitors list with svc_monitor structures, which point to
sbrec_service_monitor is initialized, sb_svc_mon structure property can
point to wrong address, which then leads to segmentation fault in
svc_monitor_send_tcp_health_check__() and
svc_monitor_send_udp_health_check() on accessing svc_mon->sb_svc_mon.

Imagine situation:

Main ovn-controller thread:
1. Connects to SB and fills IDL with DB contents.
2. run pinctrl_run() with pinctrl mutex and sync_svc_monitors() as a part
   of it.
3. Release mutex.

...
4. Loss of OVNSB connection for any reason (network outage/inactivity probe
   timeout/etc), start new main-loop iteration, re-initialize IDL in
   ovsdb_idl_run() (which probably will destroy previous IDL objects).
...

pinctrl thread:
5. Awake from poll_block().
... run new iteration in its main loop ...
6. Acquire mutex lock.
7. Run svc_monitors_run(), run svc_monitor_send_tcp_health_check__() or
   svc_monitor_send_udp_health_check(), which try to access IDL objects
   with outdated pointers.

8. Segmentation fault:

Stack trace of thread 212406:
   __GI_strlen (libc.so.6)
   inet_pton (libc.so.6)
   ip_parse (ovn-controller)
   svc_monitor_send_tcp_health_check__.part.37 (ovn-controller)
   svc_monitor_send_health_check (ovn-controller)
   pinctrl_handler (ovn-controller)
   ovsthread_wrapper (ovn-controller)
   start_thread (libpthread.so.0)
   __clone (libc.so.6)

This patch removes unsafe access to IDL objects from pinctrl thread.
New svc_monitor structure members are used to store needed data.

CC: Numan Siddique 
Acked-by: Ales Musil 
Fixes: 8be01f4a5329 ("Send service monitor health checks")
Signed-off-by: Vladislav Odintsov 

---
v1 -> v2:
  - Addressed Ales's comment: replaced ip_parse() & ipv6_parse() with
ip46_parse().
---
 controller/pinctrl.c | 37 -
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 6a2c3dc68..0178ac6cc 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -7307,6 +7307,9 @@ struct svc_monitor {
 long long int timestamp;
 bool is_ip6;

+struct eth_addr src_mac;
+struct in6_addr src_ip;
+
 long long int wait_time;
 long long int next_send_time;

@@ -7501,6 +7504,9 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
 svc_mon->n_success = 0;
 svc_mon->n_failures = 0;

+eth_addr_from_string(sb_svc_mon->src_mac, &svc_mon->src_mac);
+ip46_parse(sb_svc_mon->src_ip, &svc_mon->src_ip);
+
 hmap_insert(&svc_monitors_map, &svc_mon->hmap_node, hash);
 ovs_list_push_back(&svc_monitors, &svc_mon->list_node);
 changed = true;
@@ -8259,19 +8265,14 @@ svc_monitor_send_tcp_health_check__(struct rconn *swconn,
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);

-struct eth_addr eth_src;
-eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, &eth_src);
 if (svc_mon->is_ip6) {
-struct in6_addr ip6_src;
-ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
-pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
- &ip6_src, &svc_mon->ip, IPPROTO_TCP,
+pinctrl_compose_ipv6(&packet, svc_mon->src_mac, svc_mon->ea,
+ &svc_mon->src_ip, &svc_mon->ip, IPPROTO_TCP,
  63, TCP_HEADER_LEN);
 } else {
-ovs_be32 ip4_src;
-ip_parse(svc_mon->sb_svc_mon->src_ip, &ip4_src);
-pinctrl_compose_ipv4(&packet, eth_src, svc_mon->ea,
- ip4_src, in6_addr_get_mapped_ipv4(&svc_mon->ip),
+pinctrl_compose_ipv4(&packet, svc_mon->src_mac, svc_mon->ea,
+ in6_addr_get_mapped_ipv4(&svc_mon->src_ip),
+ in6_addr_get_mapped_ipv4(&svc_mon->ip),
  IPPROTO_TCP, 63, TCP_HEADER_LEN);
 }

@@ -8327,24 +8328,18 @@ svc_monitor_send_udp_health_check(struct rconn *swconn,

Re: [ovs-dev] [PATCH ovn] controller: Store src_mac, src_ip in svc_monitor struct.

Hi Ales!

Thanks for the review.
I've addressed requested changes in v2 with your Acked-by added:

https://patchwork.ozlabs.org/project/ovn/patch/20240522121913.609332-1-odiv...@gmail.com/

regards,
 
Vladislav Odintsov

On 22.05.2024, 10:59, "dev on behalf of Ales Musil" 
 wrote:

On Tue, May 14, 2024 at 9:25 PM Vladislav Odintsov 
wrote:

> These structure members are read in pinctrl_handler() in a separate thread.
> This is unsafe: when IDL is re-connected or some IDL objects are freed
> after svc_monitors list with svc_monitor structures, which point to
> sbrec_service_monitor is initialized, sb_svc_mon structure property can
> point to wrong address, which then leads to segmentation fault in
> svc_monitor_send_tcp_health_check__() and
> svc_monitor_send_udp_health_check() on accessing svc_mon->sb_svc_mon.
>
> Imagine situation:
>
> Main ovn-controller thread:
> 1. Connects to SB and fills IDL with DB contents.
> 2. run pinctrl_run() with pinctrl mutex and sync_svc_monitors() as a part
>of it.
> 3. Release mutex.
>
> ...
> 4. Loss of OVNSB connection for any reason (network outage/inactivity probe
>timeout/etc), start new main-loop iteration, re-initialize IDL in
>ovsdb_idl_run() (which probably will destroy previous IDL objects).
> ...
>
> pinctrl thread:
> 5. Awake from poll_block().
> ... run new iteration in its main loop ...
> 6. Acquire mutex lock.
> 7. Run svc_monitors_run(), run svc_monitor_send_tcp_health_check__() or
>svc_monitor_send_udp_health_check(), which try to access IDL objects
>with outdated pointers.
>
> 8. Segmentation fault:
>
> Stack trace of thread 212406:
> >> __GI_strlen (libc.so.6)
> >> inet_pton (libc.so.6)
> >> ip_parse (ovn-controller)
> >> svc_monitor_send_tcp_health_check__.part.37 (ovn-controller)
> >> svc_monitor_send_health_check (ovn-controller)
> >> pinctrl_handler (ovn-controller)
> >> ovsthread_wrapper (ovn-controller)
> >> start_thread (libpthread.so.0)
> >> __clone (libc.so.6)
>
> This patch removes unsafe access to IDL objects from pinctrl thread.
> New svc_monitor structure members are used to store needed data.
>
> CC: Numan Siddique 
> Fixes: 8be01f4a5329 ("Send service monitor health checks")
> Signed-off-by: Vladislav Odintsov 
> ---
>

Hi Vladislav,

thank you for the fix. I have one comment down below.


>  controller/pinctrl.c | 43 ++-
>  1 file changed, 22 insertions(+), 21 deletions(-)
>
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index 6a2c3dc68..b843edb35 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -7307,6 +7307,9 @@ struct svc_monitor {
>  long long int timestamp;
>  bool is_ip6;
>
> +struct eth_addr src_mac;
> +struct in6_addr src_ip;
> +
>  long long int wait_time;
>  long long int next_send_time;
>
> @@ -7501,6 +7504,15 @@ sync_svc_monitors(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>  svc_mon->n_success = 0;
>  svc_mon->n_failures = 0;
>
> +eth_addr_from_string(sb_svc_mon->src_mac, &svc_mon->src_mac);
> +if (is_ipv4) {
> +ovs_be32 ip4_src;
> +ip_parse(sb_svc_mon->src_ip, &ip4_src);
> +svc_mon->src_ip = in6_addr_mapped_ipv4(ip4_src);
> +} else {
> +ipv6_parse(sb_svc_mon->src_ip, &svc_mon->src_ip);
> +}
> +
>

The whole if else block can be replaced with ip46_parse().
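
To illustrate the idea behind this suggestion, here is a small Python analogy
using the standard ipaddress module.  It is not the OVS ip46_parse()
implementation; parse_ip46() is a hypothetical helper written for this sketch.
It assumes what the patch context implies: IPv4 and IPv6 text can both be
stored in one IPv6-sized value, with IPv4 kept as an IPv4-mapped address
(::ffff:a.b.c.d) that can be unwrapped again later.

```python
import ipaddress

# 10 zero bytes followed by 0xffff: the ::ffff:0:0/96 IPv4-mapped prefix.
V4_MAPPED_PREFIX = b"\x00" * 10 + b"\xff\xff"


def parse_ip46(text):
    """Parse IPv4 or IPv6 text into a single IPv6Address.

    Hypothetical helper: IPv4 input becomes an IPv4-mapped IPv6 address,
    IPv6 input is returned unchanged.
    """
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return ipaddress.IPv6Address(V4_MAPPED_PREFIX + addr.packed)
    return addr


v4 = parse_ip46("192.0.2.10")
v6 = parse_ip46("2001:db8::1")

# ipv4_mapped unwraps an IPv4-mapped address and is None for other addresses.
print(v4, v4.ipv4_mapped)
print(v6, v6.ipv4_mapped)
```

With a single storage type, the caller no longer needs separate IPv4 and IPv6
branches at parse time; only the code that builds the packet has to decide
whether to unwrap the mapped IPv4 address.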

 hmap_insert(&svc_monitors_map, &svc_mon->hmap_node, hash);
>  ovs_list_push_back(&svc_monitors, &svc_mon->list_node);
>  changed = true;
> @@ -8259,19 +8271,14 @@ svc_monitor_send_tcp_health_check__(struct rconn
> *swconn,
>  struct dp_packet packet;
>  dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
>
> -struct eth_addr eth_src;
> -eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, &eth_src);
>  if (svc_mon->is_ip6) {
> -struct in6_addr ip6_src;
> -ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
> -pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
> -

[ovs-dev] [PATCH ovn v2] controller: Store src_mac, src_ip in svc_monitor struct.

These structure members are read in pinctrl_handler() in a separate thread.
This is unsafe: when IDL is re-connected or some IDL objects are freed
after svc_monitors list with svc_monitor structures, which point to
sbrec_service_monitor is initialized, sb_svc_mon structure property can
point to wrong address, which then leads to segmentation fault in
svc_monitor_send_tcp_health_check__() and
svc_monitor_send_udp_health_check() on accessing svc_mon->sb_svc_mon.

Imagine situation:

Main ovn-controller thread:
1. Connects to SB and fills IDL with DB contents.
2. run pinctrl_run() with pinctrl mutex and sync_svc_monitors() as a part
   of it.
3. Release mutex.

...
4. Loss of OVNSB connection for any reason (network outage/inactivity probe
   timeout/etc), start new main-loop iteration, re-initialize IDL in
   ovsdb_idl_run() (which probably will destroy previous IDL objects).
...

pinctrl thread:
5. Awake from poll_block().
... run new iteration in its main loop ...
6. Acquire mutex lock.
7. Run svc_monitors_run(), run svc_monitor_send_tcp_health_check__() or
   svc_monitor_send_udp_health_check(), which try to access IDL objects
   with outdated pointers.

8. Segmentation fault:

Stack trace of thread 212406:
   __GI_strlen (libc.so.6)
   inet_pton (libc.so.6)
   ip_parse (ovn-controller)
   svc_monitor_send_tcp_health_check__.part.37 (ovn-controller)
   svc_monitor_send_health_check (ovn-controller)
   pinctrl_handler (ovn-controller)
   ovsthread_wrapper (ovn-controller)
   start_thread (libpthread.so.0)
   __clone (libc.so.6)

This patch removes unsafe access to IDL objects from pinctrl thread.
New svc_monitor structure members are used to store needed data.
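
The difference between keeping a pointer to a database row and keeping a copy
of the needed fields can be sketched in Python.  This is only an analogy:
Python cannot reproduce a use-after-free, so the hypothetical SbRow.invalidate()
below merely models the IDL dropping a row on reconnect.  None of these class
names exist in OVN.

```python
class SbRow:
    """Stand-in for an IDL row owned by the database layer (hypothetical)."""

    def __init__(self, src_mac, src_ip):
        self.src_mac = src_mac
        self.src_ip = src_ip

    def invalidate(self):
        # Models the row being destroyed when the IDL is re-initialized.
        self.src_mac = None
        self.src_ip = None


class MonitorByReference:
    """Keeps a reference to the row and reads it later (the unsafe pattern)."""

    def __init__(self, row):
        self.row = row

    def source(self):
        return (self.row.src_mac, self.row.src_ip)


class MonitorBySnapshot:
    """Copies the needed fields while the row is known to be valid."""

    def __init__(self, row):
        self.src_mac = row.src_mac
        self.src_ip = row.src_ip

    def source(self):
        return (self.src_mac, self.src_ip)


row = SbRow("0a:00:00:00:00:01", "192.0.2.1")
by_ref = MonitorByReference(row)
by_copy = MonitorBySnapshot(row)

row.invalidate()  # e.g. connection loss between the two threads' iterations

print(by_ref.source())   # sees whatever the row holds now
print(by_copy.source())  # still has the values captured at sync time
```

In the C code the stale read is undefined behavior rather than a None value,
which is why the patch copies src_mac and src_ip into struct svc_monitor while
the main thread still holds valid IDL objects.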

CC: Numan Siddique 
Acked-by: Ales Musil 
Fixes: 8be01f4a5329 ("Send service monitor health checks")
Signed-off-by: Vladislav Odintsov 

---
v1 -> v2:
  - Addressed Ales's comment: replaced ip_parse() & ipv6_parse() with
ip46_parse().
---
 controller/pinctrl.c | 37 -
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 6a2c3dc68..0178ac6cc 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -7307,6 +7307,9 @@ struct svc_monitor {
 long long int timestamp;
 bool is_ip6;
 
+struct eth_addr src_mac;
+struct in6_addr src_ip;
+
 long long int wait_time;
 long long int next_send_time;
 
@@ -7501,6 +7504,9 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
 svc_mon->n_success = 0;
 svc_mon->n_failures = 0;
 
+eth_addr_from_string(sb_svc_mon->src_mac, &svc_mon->src_mac);
+ip46_parse(sb_svc_mon->src_ip, &svc_mon->src_ip);
+
 hmap_insert(&svc_monitors_map, &svc_mon->hmap_node, hash);
 ovs_list_push_back(&svc_monitors, &svc_mon->list_node);
 changed = true;
@@ -8259,19 +8265,14 @@ svc_monitor_send_tcp_health_check__(struct rconn *swconn,
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
 
-struct eth_addr eth_src;
-eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, &eth_src);
 if (svc_mon->is_ip6) {
-struct in6_addr ip6_src;
-ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
-pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
- &ip6_src, &svc_mon->ip, IPPROTO_TCP,
+pinctrl_compose_ipv6(&packet, svc_mon->src_mac, svc_mon->ea,
+ &svc_mon->src_ip, &svc_mon->ip, IPPROTO_TCP,
  63, TCP_HEADER_LEN);
 } else {
-ovs_be32 ip4_src;
-ip_parse(svc_mon->sb_svc_mon->src_ip, &ip4_src);
-pinctrl_compose_ipv4(&packet, eth_src, svc_mon->ea,
- ip4_src, in6_addr_get_mapped_ipv4(&svc_mon->ip),
+pinctrl_compose_ipv4(&packet, svc_mon->src_mac, svc_mon->ea,
+ in6_addr_get_mapped_ipv4(&svc_mon->src_ip),
+ in6_addr_get_mapped_ipv4(&svc_mon->ip),
  IPPROTO_TCP, 63, TCP_HEADER_LEN);
 }
 
@@ -8327,24 +8328,18 @@ svc_monitor_send_udp_health_check(struct rconn *swconn,
   struct svc_monitor *svc_mon,
   ovs_be16 udp_src)
 {
-struct eth_addr eth_src;
-eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, &eth_src);
-
 uint64_t packet_stub[128 / 8];
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
 
 if (svc_mon->is_ip6) {
-struct in6_addr ip6_src;
-ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
-pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
-

Re: [ovs-dev] [PATCH ovn] ovn-ctl: Support for --config-file ovsdb-server option.

2024-05-15 Thread Vladislav Odintsov

Thanks Numan!

regards,
Vladislav Odintsov

> On 15 May 2024, at 23:55, Numan Siddique  wrote:
> 
> On Fri, May 3, 2024 at 2:05 AM Ales Musil  wrote:
>> 
>>> On Tue, Apr 23, 2024 at 6:43 PM Vladislav Odintsov 
>>> wrote:
>>> 
>>> Since OVS 3.3.0 ovsdb-server accepts databases and remotes configuration
>>> via JSON text file.  This patch adds support for such option.
>>> 
>>> Signed-off-by: Vladislav Odintsov 
> 
> Thanks for the patch.
> 
> I applied this with the below changes
> 
> -
> diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
> index fd1ae12567..a4f410e4f7 100755
> --- a/utilities/ovn-ctl
> +++ b/utilities/ovn-ctl
> @@ -1242,8 +1242,7 @@ File location options:
>   --db-sb-relay-sock=SOCKET  OVN_IC_Northbound db socket (default:
> $DB_SB_RELAY_SOCK)
>   --db-sb-relay-pidfile=FILE OVN_Southbound relay db pidfile
> (default: $DB_SB_RELAY_CTRL_PIDFILE)
>   --db-sb-relay-ctrl-sock=SOCKET OVN_Southbound relay db control
> socket (default: $DB_SB_RELAY_CTRL_SOCK)
> -  --db-sb-relay-config-file=FILE OVN_IC_Northbound ovsdb-server
> configuration file
> - Mutually exclusive with
> --db-ic-nb-use-remote-in-db=yes.
> +  --db-sb-relay-config-file=FILE OVN_Southbound relay ovsdb-server
> configuration file.

Oops, copy-paste typo.

>   --ovn-sb-relay-db-ssl-key=KEY OVN_Southbound DB relay SSL private key file
>   --ovn-sb-relay-db-ssl-cert=CERT OVN_Southbound DB relay SSL certificate file
>   --ovn-sb-relay-db-ssl-ca-cert=CERT OVN OVN_Southbound DB relay SSL
> CA certificate file
> diff --git a/utilities/ovn-ctl.8.xml b/utilities/ovn-ctl.8.xml
> index c0fbb0792d..4f21ba4ea3 100644
> --- a/utilities/ovn-ctl.8.xml
> +++ b/utilities/ovn-ctl.8.xml
> @@ -86,6 +86,11 @@
> --db-ic-sb-schema=FILE
> --db-ic-sb-create-insecure-remote=yes|no
> --db-ic-nb-create-insecure-remote=yes|no
> +--db-nb-config-file=FILE
> +--db-sb-config-file=FILE
> +--db-ic-nb-config-file=FILE
> +--db-ic-sb-config-file=FILE
> +--db-sb-relay-config-file=FILE
> --ovn-controller-ssl-key=KEY
> --ovn-controller-ssl-cert=CERT
> --ovn-controller-ssl-ca-cert=CERT
> 
> -
> 
> 
> Numan
> 
>>> ---
>>> NEWS  |  1 +
>>> utilities/ovn-ctl | 39 +++
>>> 2 files changed, 36 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/NEWS b/NEWS
>>> index 9adf6a31c..39ea88d78 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -16,6 +16,7 @@ Post v24.03.0
>>>   - Remove "ovn-set-local-ip" config option from vswitchd
>>> external-ids, the option is no longer needed as it became effectively
>>> "true" for all scenarios.
>>> +  - Add support for ovsdb-server `--config-file` option in ovn-ctl.
>>> 
>>> OVN v24.03.0 - 01 Mar 2024
>>> --
>>> diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
>>> index dae5e22f4..fd1ae1256 100755
>>> --- a/utilities/ovn-ctl
>>> +++ b/utilities/ovn-ctl
>>> @@ -169,6 +169,7 @@ start_ovsdb__() {
>>> local sync_from_port
>>> local file
>>> local schema
>>> +local config_file
>>> local logfile
>>> local log
>>> local sock
>>> @@ -199,6 +200,7 @@ start_ovsdb__() {
>>> eval sync_from_port=\$DB_${DB}_SYNC_FROM_PORT
>>> eval file=\$DB_${DB}_FILE
>>> eval schema=\$DB_${DB}_SCHEMA
>>> +eval config_file=\$DB_${DB}_CONFIG_FILE
>>> eval logfile=\$OVN_${DB}_LOGFILE
>>> eval log=\$OVN_${DB}_LOG
>>> eval sock=\$DB_${DB}_SOCK
>>> @@ -281,7 +283,12 @@ $cluster_remote_port
>>> 
>>> set ovsdb-server
>>> set "$@" $log --log-file=$logfile
>>> -set "$@" --remote=punix:$sock --pidfile=$db_pid_file
>>> +set "$@" --pidfile=$db_pid_file
>>> +if test X"$config_file" == X; then
>>> +set "$@" --remote=punix:$sock
>>> +else
>>> +set "$@" --config-file=$config_file
>>> +fi
>>> set "$@" --unixctl=$ctrl_sock
>>> 
>>> [ "$OVN_USER" != "" ] && set "$@" --user "$OVN_USER"
>>> @@ -297,7 +304,7 @@ $cluster_remote_port
>>> set exec "$@"
>>> fi
>>>

Re: [ovs-dev] [PATCH ovn] controller: Store src_mac, src_ip in svc_monitor struct.

2024-05-14 Thread Vladislav Odintsov

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2024-April/413198.html

> On 14 May 2024, at 22:25, Vladislav Odintsov  wrote:
> 
> These structure members are read in pinctrl_handler() in a separate thread.
> This is unsafe: when IDL is re-connected or some IDL objects are freed
> after svc_monitors list with svc_monitor structures, which point to
> sbrec_service_monitor is initialized, sb_svc_mon structure property can
> point to wrong address, which then leads to segmentation fault in
> svc_monitor_send_tcp_health_check__() and
> svc_monitor_send_udp_health_check() on accessing svc_mon->sb_svc_mon.
> 
> Imagine situation:
> 
> Main ovn-controller thread:
> 1. Connects to SB and fills IDL with DB contents.
> 2. run pinctrl_run() with pinctrl mutex and sync_svc_monitors() as a part
>   of it.
> 3. Release mutex.
> 
> ...
> 4. Loss of OVNSB connection for any reason (network outage/inactivity probe
>   timeout/etc), start new main-loop iteration, re-initialize IDL in
>   ovsdb_idl_run() (which probably will destroy previous IDL objects).
> ...
> 
> pinctrl thread:
> 5. Awake from poll_block().
> ... run new iteration in its main loop ...
> 6. Acquire mutex lock.
> 7. Run svc_monitors_run(), run svc_monitor_send_tcp_health_check__() or
>   svc_monitor_send_udp_health_check(), which try to access IDL objects
>   with outdated pointers.
> 
> 8. Segmentation fault:
> 
> Stack trace of thread 212406:
>>> __GI_strlen (libc.so.6)
>>> inet_pton (libc.so.6)
>>> ip_parse (ovn-controller)
>>> svc_monitor_send_tcp_health_check__.part.37 (ovn-controller)
>>> svc_monitor_send_health_check (ovn-controller)
>>> pinctrl_handler (ovn-controller)
>>> ovsthread_wrapper (ovn-controller)
>>> start_thread (libpthread.so.0)
>>> __clone (libc.so.6)
> 
> This patch removes unsafe access to IDL objects from pinctrl thread.
> New svc_monitor structure members are used to store needed data.
> 
> CC: Numan Siddique 
> Fixes: 8be01f4a5329 ("Send service monitor health checks")
> Signed-off-by: Vladislav Odintsov 
> ---
> controller/pinctrl.c | 43 ++-
> 1 file changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index 6a2c3dc68..b843edb35 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -7307,6 +7307,9 @@ struct svc_monitor {
> long long int timestamp;
> bool is_ip6;
> 
> +struct eth_addr src_mac;
> +struct in6_addr src_ip;
> +
> long long int wait_time;
> long long int next_send_time;
> 
> @@ -7501,6 +7504,15 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
> svc_mon->n_success = 0;
> svc_mon->n_failures = 0;
> 
> +eth_addr_from_string(sb_svc_mon->src_mac, &svc_mon->src_mac);
> +if (is_ipv4) {
> +ovs_be32 ip4_src;
> +ip_parse(sb_svc_mon->src_ip, &ip4_src);
> +svc_mon->src_ip = in6_addr_mapped_ipv4(ip4_src);
> +} else {
> +ipv6_parse(sb_svc_mon->src_ip, &svc_mon->src_ip);
> +}
> +
> hmap_insert(&svc_monitors_map, &svc_mon->hmap_node, hash);
> ovs_list_push_back(&svc_monitors, &svc_mon->list_node);
> changed = true;
> @@ -8259,19 +8271,14 @@ svc_monitor_send_tcp_health_check__(struct rconn 
> *swconn,
> struct dp_packet packet;
> dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
> 
> -struct eth_addr eth_src;
> -eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, &eth_src);
> if (svc_mon->is_ip6) {
> -struct in6_addr ip6_src;
> -ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
> -pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
> - &ip6_src, &svc_mon->ip, IPPROTO_TCP,
> +pinctrl_compose_ipv6(&packet, svc_mon->src_mac, svc_mon->ea,
> + &svc_mon->src_ip, &svc_mon->ip, IPPROTO_TCP,
>  63, TCP_HEADER_LEN);
> } else {
> -ovs_be32 ip4_src;
> -ip_parse(svc_mon->sb_svc_mon->src_ip, &ip4_src);
> -pinctrl_compose_ipv4(&packet, eth_src, svc_mon->ea,
> - ip4_src, in6_addr_get_mapped_ipv4(&svc_mon->ip),
> +pinctrl_compose_ipv4(&packet, svc_mon->src_mac, svc_mon->ea,
> + in6_addr_get_mapped_ipv4(&svc_mon->src_ip),
> +

[ovs-dev] [PATCH ovn] controller: Store src_mac, src_ip in svc_monitor struct.

2024-05-14 Thread Vladislav Odintsov

These structure members are read in pinctrl_handler() in a separate thread.
This is unsafe: when IDL is re-connected or some IDL objects are freed
after svc_monitors list with svc_monitor structures, which point to
sbrec_service_monitor is initialized, sb_svc_mon structure property can
point to wrong address, which then leads to segmentation fault in
svc_monitor_send_tcp_health_check__() and
svc_monitor_send_udp_health_check() on accessing svc_mon->sb_svc_mon.

Imagine situation:

Main ovn-controller thread:
1. Connects to SB and fills IDL with DB contents.
2. run pinctrl_run() with pinctrl mutex and sync_svc_monitors() as a part
   of it.
3. Release mutex.

...
4. Loss of OVNSB connection for any reason (network outage/inactivity probe
   timeout/etc), start new main-loop iteration, re-initialize IDL in
   ovsdb_idl_run() (which probably will destroy previous IDL objects).
...

pinctrl thread:
5. Awake from poll_block().
... run new iteration in its main loop ...
6. Acquire mutex lock.
7. Run svc_monitors_run(), run svc_monitor_send_tcp_health_check__() or
   svc_monitor_send_udp_health_check(), which try to access IDL objects
   with outdated pointers.

8. Segmentation fault:

Stack trace of thread 212406:
>> __GI_strlen (libc.so.6)
>> inet_pton (libc.so.6)
>> ip_parse (ovn-controller)
>> svc_monitor_send_tcp_health_check__.part.37 (ovn-controller)
>> svc_monitor_send_health_check (ovn-controller)
>> pinctrl_handler (ovn-controller)
>> ovsthread_wrapper (ovn-controller)
>> start_thread (libpthread.so.0)
>> __clone (libc.so.6)

This patch removes unsafe access to IDL objects from pinctrl thread.
New svc_monitor structure members are used to store needed data.

CC: Numan Siddique 
Fixes: 8be01f4a5329 ("Send service monitor health checks")
Signed-off-by: Vladislav Odintsov 
---
 controller/pinctrl.c | 43 ++-
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 6a2c3dc68..b843edb35 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -7307,6 +7307,9 @@ struct svc_monitor {
 long long int timestamp;
 bool is_ip6;
 
+struct eth_addr src_mac;
+struct in6_addr src_ip;
+
 long long int wait_time;
 long long int next_send_time;
 
@@ -7501,6 +7504,15 @@ sync_svc_monitors(struct ovsdb_idl_txn *ovnsb_idl_txn,
 svc_mon->n_success = 0;
 svc_mon->n_failures = 0;
 
+eth_addr_from_string(sb_svc_mon->src_mac, &svc_mon->src_mac);
+if (is_ipv4) {
+ovs_be32 ip4_src;
+ip_parse(sb_svc_mon->src_ip, &ip4_src);
+svc_mon->src_ip = in6_addr_mapped_ipv4(ip4_src);
+} else {
+ipv6_parse(sb_svc_mon->src_ip, &svc_mon->src_ip);
+}
+
 hmap_insert(&svc_monitors_map, &svc_mon->hmap_node, hash);
 ovs_list_push_back(&svc_monitors, &svc_mon->list_node);
 changed = true;
@@ -8259,19 +8271,14 @@ svc_monitor_send_tcp_health_check__(struct rconn *swconn,
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
 
-struct eth_addr eth_src;
-eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, &eth_src);
 if (svc_mon->is_ip6) {
-struct in6_addr ip6_src;
-ipv6_parse(svc_mon->sb_svc_mon->src_ip, &ip6_src);
-pinctrl_compose_ipv6(&packet, eth_src, svc_mon->ea,
- &ip6_src, &svc_mon->ip, IPPROTO_TCP,
+pinctrl_compose_ipv6(&packet, svc_mon->src_mac, svc_mon->ea,
+ &svc_mon->src_ip, &svc_mon->ip, IPPROTO_TCP,
  63, TCP_HEADER_LEN);
 } else {
-ovs_be32 ip4_src;
-ip_parse(svc_mon->sb_svc_mon->src_ip, &ip4_src);
-pinctrl_compose_ipv4(&packet, eth_src, svc_mon->ea,
- ip4_src, in6_addr_get_mapped_ipv4(&svc_mon->ip),
+pinctrl_compose_ipv4(&packet, svc_mon->src_mac, svc_mon->ea,
+ in6_addr_get_mapped_ipv4(&svc_mon->src_ip),
+ in6_addr_get_mapped_ipv4(&svc_mon->ip),
  IPPROTO_TCP, 63, TCP_HEADER_LEN);
 }
 
@@ -8327,24 +8334,18 @@ svc_monitor_send_udp_health_check(struct rconn *swconn,
   struct svc_monitor *svc_mon,
   ovs_be16 udp_src)
 {
-struct eth_addr eth_src;
-eth_addr_from_string(svc_mon->sb_svc_mon->src_mac, &eth_src);
-
 uint64_t packet_stub[128 / 8];
 struct dp_packet packet;
 dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
 
 if (svc_mon->is_ip6) {
-struct in6_addr ip6_src;
-

[ovs-dev] [PATCH ovn v5 1/2] northd: Make `vxlan_mode` a global variable.

2024-05-03 Thread Vladislav Odintsov

This simplifies the code, and the subsequent commit, which allows VXLAN mode
to be explicitly disabled, is based on these changes.

Also, the "VXLAN mode" term is introduced in the ovn-architecture man page.

Signed-off-by: Vladislav Odintsov 
---
 northd/en-global-config.c |  4 +-
 northd/northd.c   | 85 +--
 northd/northd.h   |  5 ++-
 ovn-architecture.7.xml| 10 ++---
 4 files changed, 47 insertions(+), 57 deletions(-)

diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 28c78a12c..873649a89 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-char *max_tunid = xasprintf("%d",
-get_ovn_max_dp_key_local(sbrec_chassis_table));
+init_vxlan_mode(sbrec_chassis_table);
+char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
diff --git a/northd/northd.c b/northd/northd.c
index 133cddb69..0e0ae24db 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
  */
 static bool default_acl_drop;
 
+/* If this option is 'true' northd will use limited 24-bit space for datapath
+ * and ports tunnel key allocation (12 bits for each instead of default 16). */
+static bool vxlan_mode;
+
 #define MAX_OVN_TAGS 4096
 
 
@@ -881,40 +885,40 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table,
 }
 }
 
-static bool
-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+void
+init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
 {
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
 if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
-return true;
+vxlan_mode = true;
+return;
 }
 }
 }
-return false;
+vxlan_mode = false;
 }
 
 uint32_t
-get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
+get_ovn_max_dp_key_local(void)
 {
-if (is_vxlan_mode(sbrec_chassis_table)) {
-/* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
+if (vxlan_mode) {
+/* OVN_MAX_DP_GLOBAL_NUM doesn't apply for VXLAN mode. */
 return OVN_MAX_DP_VXLAN_KEY;
 }
 return OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM;
 }
 
 static void
-ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
-  struct hmap *datapaths, struct hmap *dp_tnlids,
+ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
   struct ovn_datapath *od, uint32_t *hint)
 {
 if (!od->tunnel_key) {
 od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(sbrec_ch_table),
-hint);
+OVN_MIN_DP_KEY_LOCAL,
+get_ovn_max_dp_key_local(),
+hint);
 if (!od->tunnel_key) {
 if (od->sb) {
 sbrec_datapath_binding_delete(od->sb);
@@ -927,7 +931,6 @@ ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
 
 static void
 ovn_datapath_assign_requested_tnl_id(
-const struct sbrec_chassis_table *sbrec_chassis_table,
 struct hmap *dp_tnlids, struct ovn_datapath *od)
 {
 const struct smap *other_config = (od->nbs
@@ -936,8 +939,7 @@ ovn_datapath_assign_requested_tnl_id(
 uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key", 0);
 if (tunnel_key) {
 const char *interconn_ts = smap_get(other_config, "interconn-ts");
-if (!interconn_ts && is_vxlan_mode(sbrec_chassis_table) &&
-tunnel_key >= 1 << 12) {
+if (!interconn_ts && vxlan_mode && tunnel_key >= 1 << 12) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
 VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for datapath %s is "
  "incompatible with VXLAN", tunnel_key,
@@ -985,7 +987,6 @@ build_datapaths(struct ovsdb_idl_txn *ovnsb_txn,
 const struct nbrec_logical_switch_table *nbrec_ls_table,
 const struct nbrec_logical_router_table *nbrec_lr_table,
 const struct sbrec_datapath_binding_table *sbrec_dp_table,
-const struct sbrec_chassis_table *sbrec_chassis

[ovs-dev] [PATCH ovn v5 2/2] northd: Add support for disabling vxlan mode.

2024-05-03 Thread Vladislav Odintsov

Commit [1] introduced a "VXLAN mode" concept.  It limits the available
tunnel IDs because the 24-bit VXLAN VNI has no room for the full key space.
In VXLAN mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
and 2047 logical ports per datapath.

Prior to this patch VXLAN mode was enabled automatically if at least one
chassis had an encap of VXLAN type.  In scenarios where one wants to use
VXLAN only for a HW VTEP (RAMP) switch, such a limitation makes no sense.

This patch adds support for explicit disabling of VXLAN mode via
Northbound database.
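
As a rough illustration of the limits involved (a minimal sketch, not the
northd code itself): the constant values below are assumptions chosen to
reproduce the 4095 and 16711680 figures quoted in this patch, and
max_dp_key_local() is a hypothetical stand-in for get_ovn_max_dp_key_local().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: they reproduce the figures quoted in this patch. */
#define OVN_MAX_DP_KEY        ((1u << 24) - 1)  /* Full 24-bit key space. */
#define OVN_MAX_DP_GLOBAL_NUM ((1u << 16) - 1)  /* Keys outside the local range. */
#define OVN_MAX_DP_VXLAN_KEY  ((1u << 12) - 1)  /* 12 bits of the VNI. */

/* Hypothetical stand-in: returns the highest datapath tunnel key that may
 * be allocated locally for the given mode. */
uint32_t
max_dp_key_local(bool vxlan_mode)
{
    return vxlan_mode
           ? OVN_MAX_DP_VXLAN_KEY
           : OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM;
}
```

With these assumed constants, disabling VXLAN mode moves the upper bound
from 4095 to 16711680, which matches the NEWS entry below.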

1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068

Acked-By: Ihar Hrachyshka 
Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Vladislav Odintsov 
---
 NEWS  |  4 
 northd/en-global-config.c |  7 ++-
 northd/northd.c   | 10 --
 northd/northd.h   |  3 ++-
 ovn-architecture.7.xml|  6 ++
 ovn-nb.xml| 10 ++
 tests/ovn-northd.at   | 29 +
 7 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 3b5e93dc9..007b27f3d 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,10 @@ Post v24.03.0
 external-ids, the option is no longer needed as it became effectively
 "true" for all scenarios.
   - Added DHCPv4 relay support.
+  - Added new global config option NB_Global:options:disable_vxlan_mode to
+extend available tunnel IDs space for datapaths from 4095 to 16711680
+when running in "VXLAN mode".  For more details see man ovn-nb(5) for
+mentioned option.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 873649a89..f5e2a8154 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,7 +115,7 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-init_vxlan_mode(sbrec_chassis_table);
+init_vxlan_mode(&nb->options, sbrec_chassis_table);
 char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
@@ -533,6 +533,11 @@ check_nb_options_out_of_sync(const struct nbrec_nb_global *nb,
 return true;
 }
 
+if (config_out_of_sync(&nb->options, &config_data->nb_options,
+   "disable_vxlan_mode", false)) {
+return true;
+}
+
 return false;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index 0e0ae24db..7bdffe531 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -886,8 +886,14 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table,
 }
 
 void
-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table)
 {
+if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
+vxlan_mode = false;
+return;
+}
+
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
@@ -17596,7 +17602,7 @@ ovnnb_db_run(struct northd_input *input_data,
 use_common_zone = smap_get_bool(input_data->nb_options, "use_common_zone",
 false);
 
-init_vxlan_mode(input_data->sbrec_chassis_table);
+init_vxlan_mode(input_data->nb_options, input_data->sbrec_chassis_table);
 
 build_datapaths(ovnsb_txn,
 input_data->nbrec_logical_switch_table,
diff --git a/northd/northd.h b/northd/northd.h
index be480003e..d0322e621 100644
--- a/northd/northd.h
+++ b/northd/northd.h
@@ -792,7 +792,8 @@ lr_has_multiple_gw_ports(const struct ovn_datapath *od)
 }
 
 void
-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table);
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table);
 
 uint32_t get_ovn_max_dp_key_local(void);
 
diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
index 3ecb58933..f4eae340c 100644
--- a/ovn-architecture.7.xml
+++ b/ovn-architecture.7.xml
@@ -2920,4 +2920,10 @@
 the future, gateways that do not support encapsulations with large amounts
 of metadata may continue to have a reduced feature set.
   
+  
+VXLAN mode is recommended to be disabled if VXLAN encap at
+hypervisors is needed only to support HW VTEP L2 Gateway functionality.
+See man ovn-nb(5) for table NB_Global column
+options key disable_vxlan_mode for more details.
+  
 
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 5cb6ba640..84f1e07b6 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -381,6 +381,16 @@
 of SB changes would be very noticeable.
   
 
+  
+

[ovs-dev] [PATCH ovn v5 0/2] Add support to disable VXLAN mode.

2024-05-03 Thread Vladislav Odintsov

v5:
  - Addressed Ihar's review comments:
1. fixed errors introduced by incorrect conflict resolution during rebase;
2. changed VXLAN mode naming to capitalized;
3. clarified VXLAN mode in ovn-architecture man page.
v4:
  - Addressed Dumitru's and Ihar's review comments;
  - single patch was split into two:
1. function call replaced with a global variable `vxlan_mode`;
2. introduced `disable_vxlan_mode` configuration knob;
  - rebased onto latest main branch.
v3:
  - Removed accidental ovs submodule change.
v2:
  - Added NEWS item.

Vladislav Odintsov (2):
  northd: Make `vxlan_mode` a global variable.
  northd: Add support for disabling vxlan mode.

 NEWS  |  4 ++
 northd/en-global-config.c |  9 +++-
 northd/northd.c   | 91 ++-
 northd/northd.h   |  6 ++-
 ovn-architecture.7.xml| 16 ---
 ovn-nb.xml| 10 +
 tests/ovn-northd.at   | 29 +
 7 files changed, 108 insertions(+), 57 deletions(-)

-- 
2.44.0


Re: [ovs-dev] [PATCH ovn v4 1/2] northd: Make `vxlan_mode` a global variable.

Hi Ihar,

thanks for your review!

> On 2 May 2024, at 18:11, Ihar Hrachyshka  wrote:
> 
> On Thu, May 2, 2024 at 5:51 AM Vladislav Odintsov wrote:
> 
>> This simplifies the code; the subsequent commit to explicitly disable vxlan
>> 
> 
> I personally find it debatable that moving from explicit dependency through
> a function argument to implicit dependency through a global variable is a
> simplification. But I will leave others to chime in.
> 

What I meant is that many functions took this argument only to look up
VXLAN encaps.  With the argument removed there is less code and there are
fewer parameters to pass around, which is why I called it simpler.

> 
>> mode is based on these changes.
>> 
>> Also `vxlan mode` term is introduced in ovn-architecture man page.
>> 
> 
> Should the mode name keep VXLAN capitalized?
> 

Not sure.
I followed the spelling used in the initial commit [1].
I’m fine with either spelling.


> 
>> 
>> Signed-off-by: Vladislav Odintsov 
>> ---
>> northd/en-global-config.c |  4 +-
>> northd/northd.c   | 94 ---
>> northd/northd.h   |  5 ++-
>> ovn-architecture.7.xml| 11 +++--
>> 4 files changed, 50 insertions(+), 64 deletions(-)
>> 
>> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
>> index 28c78a12c..873649a89 100644
>> --- a/northd/en-global-config.c
>> +++ b/northd/en-global-config.c
>> @@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void
>> *data)
>>  config_data->svc_monitor_mac);
>> }
>> 
>> -char *max_tunid = xasprintf("%d",
>> -get_ovn_max_dp_key_local(sbrec_chassis_table));
>> +init_vxlan_mode(sbrec_chassis_table);
>> +char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
>> smap_replace(options, "max_tunid", max_tunid);
>> free(max_tunid);
>> 
>> diff --git a/northd/northd.c b/northd/northd.c
>> index 5e12fd1e8..b54219a85 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
>>  */
>> static bool default_acl_drop;
>> 
>> +/* If this option is 'true' northd will use limited 24-bit space for
>> datapath
>> + * and ports tunnel key allocation (12 bits for each instead of default
>> 16). */
>> +static bool vxlan_mode;
>> +
>> #define MAX_OVN_TAGS 4096
>> 
>> 
>> @@ -881,24 +885,25 @@ join_datapaths(const struct
>> nbrec_logical_switch_table *nbrec_ls_table,
>> }
>> }
>> 
>> -static bool
>> -is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
>> +void
>> +init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
>> {
>> const struct sbrec_chassis *chassis;
>> SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
>> for (int i = 0; i < chassis->n_encaps; i++) {
>> if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
>> -return true;
>> +vxlan_mode = true;
>> +return;
>> }
>> }
>> }
>> -return false;
>> +vxlan_mode = false;
>> }
>> 
>> uint32_t
>> -get_ovn_max_dp_key_local(const struct sbrec_chassis_table
>> *sbrec_chassis_table)
>> +get_ovn_max_dp_key_local(void)
>> {
>> -if (is_vxlan_mode(sbrec_chassis_table)) {
>> +if (vxlan_mode) {
>> /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
>> return OVN_MAX_DP_VXLAN_KEY;
>> }
>> @@ -906,15 +911,14 @@ get_ovn_max_dp_key_local(const struct
>> sbrec_chassis_table *sbrec_chassis_table)
>> }
>> 
>> static void
>> -ovn_datapath_allocate_key(const struct sbrec_chassis_table
>> *sbrec_ch_table,
>> -  struct hmap *datapaths, struct hmap *dp_tnlids,
>> +ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
>>   struct ovn_datapath *od, uint32_t *hint)
>> {
>> if (!od->tunnel_key) {
>> od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
>> -OVN_MIN_DP_KEY_LOCAL,
>> -
>> get_ovn_max_dp_key_local(sbrec_ch_table),
>> -hint);
>> +OVN_MIN_DP_KEY_LOCAL,
>> +get_ovn_max_dp_key_local(),
>> +

[ovs-dev] [PATCH ovn v4 1/2] northd: Make `vxlan_mode` a global variable.

This simplifies the code; the subsequent commit to explicitly disable vxlan
mode is based on these changes.

Also, the `vxlan mode` term is introduced in the ovn-architecture man page.

Signed-off-by: Vladislav Odintsov 
---
 northd/en-global-config.c |  4 +-
 northd/northd.c   | 94 ---
 northd/northd.h   |  5 ++-
 ovn-architecture.7.xml| 11 +++--
 4 files changed, 50 insertions(+), 64 deletions(-)

diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 28c78a12c..873649a89 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-char *max_tunid = xasprintf("%d",
-get_ovn_max_dp_key_local(sbrec_chassis_table));
+init_vxlan_mode(sbrec_chassis_table);
+char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
diff --git a/northd/northd.c b/northd/northd.c
index 5e12fd1e8..b54219a85 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
  */
 static bool default_acl_drop;
 
+/* If this option is 'true' northd will use limited 24-bit space for datapath
+ * and ports tunnel key allocation (12 bits for each instead of default 16). */
+static bool vxlan_mode;
+
 #define MAX_OVN_TAGS 4096
 
 
@@ -881,24 +885,25 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table,
 }
 }
 
-static bool
-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+void
+init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
 {
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
 if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
-return true;
+vxlan_mode = true;
+return;
 }
 }
 }
-return false;
+vxlan_mode = false;
 }
 
 uint32_t
-get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
+get_ovn_max_dp_key_local(void)
 {
-if (is_vxlan_mode(sbrec_chassis_table)) {
+if (vxlan_mode) {
 /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
 return OVN_MAX_DP_VXLAN_KEY;
 }
@@ -906,15 +911,14 @@ get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
 }
 
 static void
-ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
-  struct hmap *datapaths, struct hmap *dp_tnlids,
+ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
   struct ovn_datapath *od, uint32_t *hint)
 {
 if (!od->tunnel_key) {
 od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(sbrec_ch_table),
-hint);
+OVN_MIN_DP_KEY_LOCAL,
+get_ovn_max_dp_key_local(),
+hint);
 if (!od->tunnel_key) {
 if (od->sb) {
 sbrec_datapath_binding_delete(od->sb);
@@ -927,7 +931,6 @@ ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
 
 static void
 ovn_datapath_assign_requested_tnl_id(
-const struct sbrec_chassis_table *sbrec_chassis_table,
 struct hmap *dp_tnlids, struct ovn_datapath *od)
 {
 const struct smap *other_config = (od->nbs
@@ -936,8 +939,7 @@ ovn_datapath_assign_requested_tnl_id(
 uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key", 0);
 if (tunnel_key) {
 const char *interconn_ts = smap_get(other_config, "interconn-ts");
-if (!interconn_ts && is_vxlan_mode(sbrec_chassis_table) &&
-tunnel_key >= 1 << 12) {
+if (!interconn_ts && vxlan_mode && tunnel_key >= 1 << 12) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
 VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for datapath %s is "
  "incompatible with VXLAN", tunnel_key,
@@ -985,7 +987,6 @@ build_datapaths(struct ovsdb_idl_txn *ovnsb_txn,
 const struct nbrec_logical_switch_table *nbrec_ls_table,
 const struct nbrec_logical_router_table *nbrec_lr_table,
 const struct sbrec_datapath_binding_table *sbrec_dp_table,
-const struct sbrec_chassis_table *sbrec_chassis_table,
 struct ovn_dat

[ovs-dev] [PATCH ovn v4 2/2] northd: Add support for disabling vxlan mode.

Commit [1] introduced a "vxlan mode" concept.  It limits the available
tunnel IDs because the 24-bit VXLAN VNI has no room for the full key space.
In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
and 2047 logical switch ports per datapath.

Prior to this patch vxlan mode was enabled automatically if at least one
chassis had an encap of vxlan type.  In scenarios where one wants to use VXLAN
only for a HW VTEP (RAMP) switch, such a limitation makes no sense.

This patch adds support for explicit disabling of vxlan mode via
Northbound database.
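
The resulting decision can be sketched roughly as follows.  This is only an
illustration under stated assumptions: compute_vxlan_mode() is a hypothetical
pure function, and the flat `encap_types` array stands in for iterating the
encaps of every chassis in the Southbound Chassis table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 'encap_types' flattens the encap types of all
 * chassis into one array of 'n_encaps' strings. */
bool
compute_vxlan_mode(bool disable_vxlan_mode,
                   const char *const *encap_types, size_t n_encaps)
{
    /* The explicit knob wins over whatever the chassis advertise. */
    if (disable_vxlan_mode) {
        return false;
    }

    /* Otherwise a single VXLAN encap is enough to enable the mode. */
    for (size_t i = 0; i < n_encaps; i++) {
        if (!strcmp(encap_types[i], "vxlan")) {
            return true;
        }
    }
    return false;
}
```

Checking the knob first keeps the previous automatic behaviour as the
default and only changes the outcome when the option is set to true.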

1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068

CC: Ihar Hrachyshka 
Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Vladislav Odintsov 
---
 NEWS  |  3 +++
 northd/en-global-config.c |  7 ++-
 northd/northd.c   | 10 --
 northd/northd.h   |  3 ++-
 ovn-architecture.7.xml|  6 ++
 ovn-nb.xml| 10 ++
 tests/ovn-northd.at   | 29 +
 7 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 3b5e93dc9..43ab05a68 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,9 @@ Post v24.03.0
 external-ids, the option is no longer needed as it became effectively
 "true" for all scenarios.
   - Added DHCPv4 relay support.
+  - Added new global config option NB_Global:options:disable_vxlan_mode to
+extend available tunnel IDs space for datapaths from 4095 to 16711680.
+For more details see man ovn-nb(5) for mentioned option.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 873649a89..f5e2a8154 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,7 +115,7 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-init_vxlan_mode(sbrec_chassis_table);
+init_vxlan_mode(&nb->options, sbrec_chassis_table);
 char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
@@ -533,6 +533,11 @@ check_nb_options_out_of_sync(const struct nbrec_nb_global *nb,
 return true;
 }
 
+if (config_out_of_sync(&nb->options, &config_data->nb_options,
+   "disable_vxlan_mode", false)) {
+return true;
+}
+
 return false;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index b54219a85..d1535172e 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -886,8 +886,14 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table,
 }
 
 void
-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table)
 {
+if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
+vxlan_mode = false;
+return;
+}
+
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
@@ -17593,7 +17599,7 @@ ovnnb_db_run(struct northd_input *input_data,
 use_common_zone = smap_get_bool(input_data->nb_options, "use_common_zone",
 false);
 
-init_vxlan_mode(input_data->sbrec_chassis_table);
+init_vxlan_mode(input_data->nb_options, input_data->sbrec_chassis_table);
 
 build_datapaths(ovnsb_txn,
 input_data->nbrec_logical_switch_table,
diff --git a/northd/northd.h b/northd/northd.h
index be480003e..d0322e621 100644
--- a/northd/northd.h
+++ b/northd/northd.h
@@ -792,7 +792,8 @@ lr_has_multiple_gw_ports(const struct ovn_datapath *od)
 }
 
 void
-init_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table);
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table);
 
 uint32_t get_ovn_max_dp_key_local(void);
 
diff --git a/ovn-architecture.7.xml b/ovn-architecture.7.xml
index 7abb1fa83..251c9c514 100644
--- a/ovn-architecture.7.xml
+++ b/ovn-architecture.7.xml
@@ -2919,4 +2919,10 @@
 the future, gateways that do not support encapsulations with large amounts
 of metadata may continue to have a reduced feature set.
   
+  
+vxlan mode is recommended to be disabled if VXLAN encap at
+hypervisors is needed only to support HW VTEP L2 Gateway functionality.
+See man ovn-nb(5) for table NB_Global column
+options key disable_vxlan_mode for more details.
+  
 
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 5cb6ba640..a99e663e5 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -381,6 +381,16 @@
 of SB changes would be very noticeable.
   
 
+  
+By default if at least one chassis in OVN

[ovs-dev] [PATCH ovn v4 0/2] Add support to disable vxlan mode.

v4:
  - Addressed Dumitru's and Ihar's review comments;
  - single patch was split into two:
1. function call replaced with a global variable `vxlan_mode`;
2. introduced `disable_vxlan_mode` configuration knob;
  - rebased onto latest main branch.
v3:
  - Removed accidental ovs submodule change.
v2:
  - Added NEWS item.

Vladislav Odintsov (2):
  northd: Make `vxlan_mode` a global variable.
  northd: Add support for disabling vxlan mode.

 NEWS  |   3 ++
 northd/en-global-config.c |   9 +++-
 northd/northd.c   | 100 +-
 northd/northd.h   |   6 ++-
 ovn-architecture.7.xml|  17 ---
 ovn-nb.xml|  10 
 tests/ovn-northd.at   |  29 +++
 7 files changed, 110 insertions(+), 64 deletions(-)

-- 
2.44.0


[ovs-dev] [PATCH ovn] ovn-ctl: Support for --config-file ovsdb-server option.

2024-04-23 Thread Vladislav Odintsov

Since OVS 3.3.0, ovsdb-server accepts the configuration of databases and
remotes via a JSON text file.  This patch adds support for that option
(`--config-file`) to ovn-ctl.

Signed-off-by: Vladislav Odintsov 
---
 NEWS  |  1 +
 utilities/ovn-ctl | 39 +++
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 9adf6a31c..39ea88d78 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,7 @@ Post v24.03.0
   - Remove "ovn-set-local-ip" config option from vswitchd
 external-ids, the option is no longer needed as it became effectively
 "true" for all scenarios.
+  - Add support for ovsdb-server `--config-file` option in ovn-ctl.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
index dae5e22f4..fd1ae1256 100755
--- a/utilities/ovn-ctl
+++ b/utilities/ovn-ctl
@@ -169,6 +169,7 @@ start_ovsdb__() {
 local sync_from_port
 local file
 local schema
+local config_file
 local logfile
 local log
 local sock
@@ -199,6 +200,7 @@ start_ovsdb__() {
 eval sync_from_port=\$DB_${DB}_SYNC_FROM_PORT
 eval file=\$DB_${DB}_FILE
 eval schema=\$DB_${DB}_SCHEMA
+eval config_file=\$DB_${DB}_CONFIG_FILE
 eval logfile=\$OVN_${DB}_LOGFILE
 eval log=\$OVN_${DB}_LOG
 eval sock=\$DB_${DB}_SOCK
@@ -281,7 +283,12 @@ $cluster_remote_port
 
 set ovsdb-server
 set "$@" $log --log-file=$logfile
-set "$@" --remote=punix:$sock --pidfile=$db_pid_file
+set "$@" --pidfile=$db_pid_file
+if test X"$config_file" = X; then
+set "$@" --remote=punix:$sock
+else
+set "$@" --config-file=$config_file
+fi
 set "$@" --unixctl=$ctrl_sock
 
 [ "$OVN_USER" != "" ] && set "$@" --user "$OVN_USER"
@@ -297,7 +304,7 @@ $cluster_remote_port
 set exec "$@"
 fi
 
-if test X"$use_remote_in_db" != Xno; then
+if test X"$use_remote_in_db" != Xno && test X"$config_file" = X; then
 set "$@" --remote=db:$schema_name,$table_name,connections
 fi
 
@@ -343,6 +350,11 @@ $cluster_remote_port
 
 local run_ovsdb_in_bg="no"
 local process_id=
+
+if test X$config_file = X; then
+set "$@" "$file"
+fi
+
 if test X$detach = Xno && test $mode = cluster && test -z "$cluster_remote_addr" ; then
 # When detach is no (for run_nb_ovsdb/run_sb_ovsdb commands)
 # we want to run ovsdb-server in background rather than running it in
@@ -351,10 +363,10 @@ $cluster_remote_port
 # Note: We run only the ovsdb-server in backgroud which created the
 # cluster (i.e cluster_remote_addr is not set.).
 run_ovsdb_in_bg="yes"
-"$@" $file &
+"$@" &
 process_id=$!
 else
-start_wrapped_daemon "$wrapper" ovsdb-$db "" "$@" "$file"
+start_wrapped_daemon "$wrapper" ovsdb-$db "" "$@"
 fi
 
 # Initialize the database if it's NOT joining a cluster.
@@ -776,6 +788,7 @@ set_defaults () {
 DB_NB_SYNC_FROM_PORT=6641
 DB_NB_PROBE_INTERVAL_TO_ACTIVE=6
 DB_NB_ELECTION_TIMER=
+DB_NB_CONFIG_FILE=
 
 DB_SB_SOCK=$OVN_RUNDIR/ovnsb_db.sock
 DB_SB_PIDFILE=$OVN_RUNDIR/ovnsb_db.pid
@@ -788,6 +801,7 @@ set_defaults () {
 DB_SB_SYNC_FROM_PORT=6642
 DB_SB_PROBE_INTERVAL_TO_ACTIVE=6
 DB_SB_ELECTION_TIMER=
+DB_SB_CONFIG_FILE=
 
 DB_IC_NB_SOCK=$OVN_RUNDIR/ovn_ic_nb_db.sock
 DB_IC_NB_PIDFILE=$OVN_RUNDIR/ovn_ic_nb_db.pid
@@ -798,6 +812,7 @@ set_defaults () {
 DB_IC_NB_SYNC_FROM_PROTO=tcp
 DB_IC_NB_SYNC_FROM_ADDR=
 DB_IC_NB_SYNC_FROM_PORT=6645
+DB_IC_NB_CONFIG_FILE=
 
 DB_IC_SB_SOCK=$OVN_RUNDIR/ovn_ic_sb_db.sock
 DB_IC_SB_PIDFILE=$OVN_RUNDIR/ovn_ic_sb_db.pid
@@ -808,6 +823,7 @@ set_defaults () {
 DB_IC_SB_SYNC_FROM_PROTO=tcp
 DB_IC_SB_SYNC_FROM_ADDR=
 DB_IC_SB_SYNC_FROM_PORT=6646
+DB_IC_SB_CONFIG_FILE=
 
 DB_NB_SCHEMA=$ovn_datadir/ovn-nb.ovsschema
 DB_SB_SCHEMA=$ovn_datadir/ovn-sb.ovsschema
@@ -951,6 +967,7 @@ set_defaults () {
 OVN_SB_RELAY_DB_SSL_CERT=""
 OVN_SB_RELAY_DB_SSL_CA_CERT=""
 DB_SB_RELAY_USE_REMOTE_IN_DB="yes"
+DB_SB_RELAY_CONFIG_FILE=
 
 DB_CLUSTER_SCHEMA_UPGRADE="yes"
 }
@@ -1124,12 +1141,16 @@ File location options:
   --db-nb-create-insecure-remote=yes|no Create ptcp OVN Northbound remote (default: $DB_NB_CREATE_INSECURE_REMOTE)
   --db-nb-probe-interval-to-active Active probe interval from standby to active ovsdb-server remote (default: $DB_NB_PROBE_INTERVAL_TO_ACTIVE)
   --db-nb-election-timer=MS OVN Northbound RAFT db election timer to use on db creation (in

[ovs-dev] [ovn] ovn-controller segmentation fault in svc_monitor_send_tcp_health_check__()

2024-04-11 Thread Vladislav Odintsov

Hi all,

I’m running OVN 22.09 and sometimes see ovn-controller crash with a
segmentation fault.  The backtrace is as follows:

(gdb) bt
#0  0x7f0742707de1 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x7f0742788c5d in inet_pton () from /lib64/libc.so.6
#2  0x564f45a1c784 in ip_parse (s=<optimized out>, ip=ip@entry=0x7f074040f90c) at lib/packets.c:698
#3  0x564f4594cbfb in svc_monitor_send_tcp_health_check__ (swconn=swconn@entry=0x7f0738000940,
    svc_mon=svc_mon@entry=0x564f4c2960c0, ctl_flags=ctl_flags@entry=2, tcp_seq=3858078915, tcp_ack=tcp_ack@entry=0,
    tcp_src=<optimized out>) at controller/pinctrl.c:7513
#4  0x564f4594d47c in svc_monitor_send_tcp_health_check__ (tcp_src=<optimized out>, tcp_ack=0, tcp_seq=<optimized out>,
    ctl_flags=2, svc_mon=0x564f4c2960c0, swconn=0x7f0738000940) at controller/pinctrl.c:7502
#5  svc_monitor_send_health_check (swconn=swconn@entry=0x7f0738000940, svc_mon=svc_mon@entry=0x564f4c2960c0)
    at controller/pinctrl.c:7621
#6  0x564f4595869b in svc_monitors_run (svc_monitors_next_run_time=0x564f45dd3970,
    swconn=0x7f0738000940) at controller/pinctrl.c:7693
#7  pinctrl_handler (arg_=0x564f45e11240) at controller/pinctrl.c:3499
#8  0x564f45a0ad6f in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:422
#9  0x7f074325bea5 in start_thread () from /lib64/libpthread.so.0
#10 0x7f07427798dd in clone () from /lib64/libc.so.6

After moving to frame #3, I can read the actual data from the svc_mon structure
(port/protocol/dp_key/port_key).  I looked these up in the SB DB and found the
port_binding, which belongs to a logical port that resides on this chassis and
has an LB with a health check (HC) configured.  Everything seems fine there.
The svc_mon->sb_svc_mon structure, however, appears to contain garbage:
"Address 0x564f out of bounds", logical_port == 0, etc. (but I may be
wrong):

$1 = (const struct sbrec_service_monitor *) 0x564f54db2b40
(gdb) print *svc_mon->sb_svc_mon
$2 = {header_ = {hmap_node = {hash = 94898726054728, next = 0x0}, uuid = {parts 
= {0, 0, 0, 0}}, src_arcs = {prev = 0x564f54aae0d0, next = 0x0}, dst_arcs = 
{prev = 0x564f7f8bd470, next = 0x564f7f8bd540}, table = 0x64, old_datum = 0xf,
parsed = 152, reparse_node = {prev = 0x0, next = 0x0}, new_datum = 0x0, 
prereqs = 0x52eb8916, written = 0x171, txn_node = {hash = 1, next = 
0x564f54db2db0}, map_op_written = 0x0, map_op_lists = 0x0, set_op_written = 0x0,
set_op_lists = 0x0, change_seqno = {0, 0, 0}, track_node = {prev = 
0x564f, next = 0x0}, updated = 0x0, tracked_old_datum = 0x0}, 
external_ids = {map = {buckets = 0x1, one = 0x564f54db2d90, mask = 0, n = 0}},
  ip = 0x564f , logical_port = 
0x0, options = {map = {buckets = 0x0, one = 0x0, mask = 1, n = 
94898780242768}}, port = 0, protocol = 0x0, src_ip = 0x1 ,
  src_mac = 0x564f54db2d70 "`Ջ\177OV", status = 0x0}
…
(gdb) print svc_mon->state
$8 = SVC_MON_S_ONLINE
(gdb) print svc_mon->status
$9 = SVC_MON_ST_ONLINE
(gdb) print svc_mon->protocol
$10 = SVC_MON_PROTO_TCP
(gdb) print svc_mon->sb_svc_mon

This crash occurred right after the OVSDB SB connection was lost due to an
inactivity probe failure.  ovn-controller was therefore reconnecting to the SB,
and I guess this could somehow have re-initialized the SB IDL objects.

I’m not sure I can reproduce this behaviour on the latest main branch, so my
question is: could this theoretically be connected with re-initialization of
the IDL?  If yes, what should be done to avoid such behaviour?
Should ovn-controller process changes while its IDL is in an inconsistent state?

Any help is appreciated.

Regards,
Vladislav Odintsov


Re: [ovs-dev] [PATCH ovn v4] Make tunnel ids exhaustion test trigger the problem.

2024-04-05 Thread Vladislav Odintsov

Hi Ihar,

Thanks for the cooperation and the enhancements to the test cases!
The patch looks good to me.

> On 5 Apr 2024, at 19:14, Ihar Hrachyshka  wrote:
> 
> The original version of the scenario passed with or without the fix.
> This is because all LSs were processed in one go, so the allocate
> function was never entered with *hint==0.
> 
> Also, added another scenario that will check behavior when *hint is out
> of [min;max] bounds but > max (this happens in an obscure scenario where
> a vxlan chassis is added to the cluster midflight, forcing northd to
> reduce its effective max value for tunnel ids; which may become lower
> than the current *hint for ports.)
> 
> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
> Co-Authored-By: Vladislav Odintsov 
> Signed-off-by: Vladislav Odintsov 
> Signed-off-by: Ihar Hrachyshka 
> ---
> v1: initial version.
> v2: cover both cases of hint = 0 and hint > max.
> v3: reduce the number of ports to create in the hint > max scenario needed to trigger the problem.
> v4: remove spurious lib/ovn-util.c change.
> ---
> tests/ovn-northd.at | 43 ---
> 1 file changed, 40 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index be006fb32..1a4e7274d 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -2823,7 +2823,7 @@ AT_CLEANUP
> ])
> 
> OVN_FOR_EACH_NORTHD_NO_HV([
> -AT_SETUP([check tunnel ids exhaustion])
> +AT_SETUP([check datapath tunnel ids exhaustion])
> ovn_start
> 
> # Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
> @@ -2833,13 +2833,18 @@ ovn-sbctl \
> 
> cmd="ovn-nbctl --wait=sb"
> 
> -for i in {1..4097}; do
> +for i in {1..4095}; do
> cmd="${cmd} -- ls-add lsw-${i}"
> done
> 
> eval $cmd
> 
> -check_row_count nb:Logical_Switch 4097
> +check_row_count nb:Logical_Switch 4095
> +wait_row_count sb:Datapath_Binding 4095
> +
> +ovn-nbctl ls-add lsw-exhausted
> +
> +check_row_count nb:Logical_Switch 4096
> wait_row_count sb:Datapath_Binding 4095
> 
> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
> northd/ovn-northd.log])
> @@ -2847,6 +2852,38 @@ OVS_WAIT_UNTIL([grep "all datapath tunnel ids 
> exhausted" northd/ovn-northd.log])
> AT_CLEANUP
> ])
> 
> +OVN_FOR_EACH_NORTHD_NO_HV([
> +AT_SETUP([check port tunnel ids exhaustion; vxlan chassis pops up midflight])
> +ovn_start
> +
> +cmd="ovn-nbctl --wait=sb"
> +
> +cmd="${cmd} -- ls-add lsw"
> +for i in {1..2048}; do
> +cmd="${cmd} -- lsp-add lsw lsp-${i}"
> +done
> +
> +eval $cmd
> +
> +check_row_count nb:Logical_Switch_Port 2048
> +wait_row_count sb:Port_Binding 2048
> +
> +# Now create a fake chassis with vxlan encap to lower MAX port tunnel key to 2^11
> +ovn-sbctl \
> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
> +-- --id=@c create chassis name=hv1 encaps=@e
> +
> +ovn-nbctl lsp-add lsw lsp-exhausted
> +
> +check_row_count nb:Logical_Switch_Port 2049
> +wait_row_count sb:Port_Binding 2048
> +
> +OVS_WAIT_UNTIL([grep "all port tunnel ids exhausted" northd/ovn-northd.log])
> +
> +AT_CLEANUP
> +])
> +
> +
> 
> OVN_FOR_EACH_NORTHD_NO_HV([
> AT_SETUP([Logical Flow Datapath Groups])
> -- 
> 2.41.0
> 


Regards,
Vladislav Odintsov


Re: [ovs-dev] [PATCH ovn] Make tunnel ids exhaustion test case trigger the problem.

2024-04-05 Thread Vladislav Odintsov



> On 5 Apr 2024, at 18:35, Ihar Hrachyshka  wrote:
> 
> On Thu, Apr 4, 2024 at 3:56 PM Vladislav Odintsov wrote:
>> Thanks Ihar for the patch.
>> 
>> It definitely triggers the bug mentioned in the Fixes commit, but what do you 
>> think of the following diff as an alternative?
>> It seems a little simpler to me, because it shows the real limit and the 
>> situation where the problem occurred (a separate ls-add):
>> 
> 
> Ah, I think we are talking about two separate scenarios, both resulting in 
> *hint out of [min; max] bounds!
> 
> - You are talking about hint=0 with min:max = [1; 4096] - which indeed can be 
> triggered by creating a new DP *after* tunnel ids are exhausted;
> - I am talking about a more obscure case where hint=4097 (because originally 
> there were no vxlan chassis in the cluster); then a vxlan chassis is created 
> (reducing max to 4096); then the allocation function is entered with 
> hint=4097.
> 
> Both scenarios are fixed by your patch. It may be worth having both test 
> cases, one per scenario, in the test suite then. What do you think?

I agree, it’s worth adding both.
Thanks for clarification!
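
To keep the two scenarios straight, here is a minimal sketch of an allocator
that tolerates both hint == 0 and hint > max.  It is not the
ovn_allocate_tnlid() implementation: allocate_id(), next_id() and the bool
array that stands in for the set of used ids are all hypothetical, and the
sketch assumes 1 <= min <= max so that 0 can signal exhaustion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the candidate that follows 'id' in [min, max], wrapping to 'min'.
 * An 'id' outside the range also maps to 'min'. */
static uint32_t
next_id(uint32_t id, uint32_t min, uint32_t max)
{
    return (id >= min && id < max) ? id + 1 : min;
}

/* 'used' is indexed by id and must have at least max + 1 entries.
 * Returns the allocated id, or 0 when every id in [min, max] is taken. */
uint32_t
allocate_id(bool *used, uint32_t min, uint32_t max, uint32_t *hint)
{
    /* Normalize an out-of-range hint so that the scan below visits every
     * id exactly once and always terminates. */
    uint32_t start = (*hint >= min && *hint <= max) ? *hint : max;
    uint32_t id = start;

    do {
        id = next_id(id, min, max);
        if (!used[id]) {
            used[id] = true;
            *hint = id;
            return id;
        }
    } while (id != start);

    return 0;
}
```

Without the normalization step, a loop that only stops when the candidate
equals the hint would never terminate once the hint falls outside
[min; max], which is the situation both test cases are meant to reach.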

> 
> (Side Note: I now find the runtime flip of max cap as a vxlan chassis is 
> added - that I myself implemented - unfortunate.)
> 
> Ihar
>  
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 6edb1129e..cef144f10 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2862,13 +2862,18 @@ ovn-sbctl \
>>  
>>  cmd="ovn-nbctl --wait=sb"
>>  
>> -for i in {1..4097}; do
>> +for i in {1..4095}; do
>>  cmd="${cmd} -- ls-add lsw-${i}"
>>  done
>>  
>>  eval $cmd
>>  
>> -check_row_count nb:Logical_Switch 4097
>> +check_row_count nb:Logical_Switch 4095
>> +wait_row_count sb:Datapath_Binding 4095
>> +
>> +ovn-nbctl ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4096
>>  wait_row_count sb:Datapath_Binding 4095
>>  
>>  OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
>> northd/ovn-northd.log])
>> 
>> 
>>> On 4 Apr 2024, at 20:13, Ihar Hrachyshka <ihrac...@redhat.com> wrote:
>>> 
>>> The original version of the scenario passed with or without the fix.
>>> 
>>> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
>>> Signed-off-by: Ihar Hrachyshka <ihrac...@redhat.com>
>>> ---
>>> tests/ovn-northd.at | 17 +++--
>>> 1 file changed, 11 insertions(+), 6 deletions(-)
>>> 
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index fc2c972a4..e8ea8b050 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -2826,11 +2826,6 @@ OVN_FOR_EACH_NORTHD_NO_HV([
>>> AT_SETUP([check tunnel ids exhaustion])
>>> ovn_start
>>> 
>>> -# Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
>>> -ovn-sbctl \
>>> ---id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>>> --- --id=@c create chassis name=hv1 encaps=@e
>>> -
>>> cmd="ovn-nbctl --wait=sb"
>>> 
>>> for i in {1..4097}; do
>>> @@ -2840,7 +2835,17 @@ done
>>> eval $cmd
>>> 
>>> check_row_count nb:Logical_Switch 4097
>>> -wait_row_count sb:Datapath_Binding 4095
>>> +wait_row_count sb:Datapath_Binding 4097
>>> +
>>> +# Now create a fake chassis with vxlan encap to lower MAX DP tunnel key to 
>>> 2^12
>>> +ovn-sbctl \
>>> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>>> +-- --id=@c create chassis name=hv1 encaps=@e
>>> +
>>> +ovn-nbctl --wait=sb ls-add lsw-exhausted
>>> +
>>> +check_row_count nb:Logical_Switch 4098
>>> +wait_row_count sb:Datapath_Binding 4097
>>> 
>>> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
>>> northd/ovn-northd.log])
>>> 
>>> -- 
>>> 2.41.0
>>> 
>> 
>> 
>> 
>> 
>> Regards,
>> Vladislav Odintsov
>> 


Regards,
Vladislav Odintsov


Re: [ovs-dev] [PATCH ovn] Make tunnel ids exhaustion test case trigger the problem.

Yes, this diff is from main.
To trigger the original bug it is enough to create a new LS/LR while all
available datapath tunnel ids (4095) are already in use.
This is because we need to enter ovn_allocate_tnlid() with *hint=0 to trigger
the infinite loop.
That is why I suggest simply creating 4095 LSs and then creating one more.
I’ve tested this diff and see that northd goes to 100% CPU and doesn’t print
the warning log about ids exhaustion.
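
To illustrate why *hint=0 matters, here is a minimal sketch of the pre-fix
for-loop with a step cap added so that it can be run safely. Assumptions:
next_tnlid() steps by one and wraps from max back to min (its body is not
shown in this thread), every id in [min, max] is already taken, and
old_loop_terminates() with its 'cap' parameter is a hypothetical helper, not
OVN code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed wrap-around step; the real next_tnlid() body is not shown in
 * this thread. */
static uint32_t
next_tnlid(uint32_t tnlid, uint32_t min, uint32_t max)
{
    return tnlid + 1 <= max ? tnlid + 1 : min;
}

/* Replays the pre-fix for-loop for the case where every id in [min, max]
 * is already in use, giving up after 'cap' steps.  Returns true if the
 * loop reaches its exit condition (tnlid == hint) before the cap. */
static bool
old_loop_terminates(uint32_t hint, uint32_t min, uint32_t max, uint32_t cap)
{
    assert(min <= max);

    uint32_t steps = 0;
    for (uint32_t tnlid = next_tnlid(hint, min, max); tnlid != hint;
         tnlid = next_tnlid(tnlid, min, max)) {
        /* ovn_add_tnlid() would fail here: all ids are in use. */
        if (++steps >= cap) {
            return false;
        }
    }
    return true;
}
```

old_loop_terminates(0, 1, 4095, 100000) reports false, because tnlid cycles
through 1..4095 and never equals 0, while a hint inside the range (7, for
example) is reached again after one pass.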

> On 4 Apr 2024, at 23:34, Ihar Hrachyshka  wrote:
> 
> On Thu, Apr 4, 2024 at 3:56 PM Vladislav Odintsov  wrote:
> 
>> Thanks Ihar for the patch.
>> 
>> It definitely triggers the bug mentioned in Fixes commit, but how do you
>> like next diff as an alternative?
>> It seems a little easier to me, because it shows the real limit and the
>> situation where the problem was (separate ls-add):
>> 
> 
> Is it a diff from main? I don't think it will trigger the issue. The key is
> to trigger northd to change its max cap for tunnel ids AFTER it bumped hint
> beyond the "vxlan mode max tun_id" (which is why I have to create vxlan
> chassis AFTER I create enough LSs to get into unsafe territory.)
> 
> Note: I haven't tried your version yet; I may check your version some time
> later. So it's the initial thought only.
> 
> 
>> 
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 6edb1129e..cef144f10 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2862,13 +2862,18 @@ ovn-sbctl \
>> 
>> cmd="ovn-nbctl --wait=sb"
>> 
>> -for i in {1..4097}; do
>> +for i in {1..4095}; do
>> cmd="${cmd} -- ls-add lsw-${i}"
>> done
>> 
>> eval $cmd
>> 
>> -check_row_count nb:Logical_Switch 4097
>> +check_row_count nb:Logical_Switch 4095
>> +wait_row_count sb:Datapath_Binding 4095
>> +
>> +ovn-nbctl ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4096
>> wait_row_count sb:Datapath_Binding 4095
>> 
>> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>> northd/ovn-northd.log])
>> 
>> 
>> On 4 Apr 2024, at 20:13, Ihar Hrachyshka  wrote:
>> 
>> The original version of the scenario passed with or without the fix.
>> 
>> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
>> Signed-off-by: Ihar Hrachyshka 
>> ---
>> tests/ovn-northd.at | 17 +++--
>> 1 file changed, 11 insertions(+), 6 deletions(-)
>> 
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index fc2c972a4..e8ea8b050 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2826,11 +2826,6 @@ OVN_FOR_EACH_NORTHD_NO_HV([
>> AT_SETUP([check tunnel ids exhaustion])
>> ovn_start
>> 
>> -# Create a fake chassis with vxlan encap to lower MAX DP tunnel key to
>> 2^12
>> -ovn-sbctl \
>> ---id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>> --- --id=@c create chassis name=hv1 encaps=@e
>> -
>> cmd="ovn-nbctl --wait=sb"
>> 
>> for i in {1..4097}; do
>> @@ -2840,7 +2835,17 @@ done
>> eval $cmd
>> 
>> check_row_count nb:Logical_Switch 4097
>> -wait_row_count sb:Datapath_Binding 4095
>> +wait_row_count sb:Datapath_Binding 4097
>> +
>> +# Now create a fake chassis with vxlan encap to lower MAX DP tunnel key
>> to 2^12
>> +ovn-sbctl \
>> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
>> +-- --id=@c create chassis name=hv1 encaps=@e
>> +
>> +ovn-nbctl --wait=sb ls-add lsw-exhausted
>> +
>> +check_row_count nb:Logical_Switch 4098
>> +wait_row_count sb:Datapath_Binding 4097
>> 
>> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>> northd/ovn-northd.log])
>> 
>> --
>> 2.41.0
>> 
>> 
>> 
>> 
>> 
>> 
>> Regards,
>> Vladislav Odintsov
>> 
>> 


Regards,
Vladislav Odintsov


Re: [ovs-dev] [PATCH ovn] northd: fix infinite loop in ovn_allocate_tnlid()



> On 4 Apr 2024, at 22:51, Mark Michelson  wrote:
> 
> On 4/4/24 12:46, Dumitru Ceara wrote:
>> On 4/4/24 17:52, Vladislav Odintsov wrote:
>>> Thanks Dumitru!
>>> I’m totally fine with your change.
>>> Should I send backport patches with resolved conflicts for remaining 
>>> branches at least till 22.03, which is an LTS?
>>> 
>> Well, 24.03 is the most recent LTS.  We don't really backport patches to
>> 22.03 unless they fix critical issues.  I'm not completely convinced
>> that this is such a critical issue though.  You need 4K logical
>> datapaths with vxlan enabled before this gets hit.  In any case, Mark,
>> what do you think?
> 
> I don't think this needs backporting down to 22.03.

I just wanted to mention that to reproduce this bug it is enough to have at
least one chassis with vxlan encap and to create 4096 LSs/LRs.
Once the problem is triggered, ovn-northd starts consuming 100% CPU and hangs
(doesn’t process any changes) until the excess LS/LR is removed and northd is
restarted.
I can submit backport patches for the older branches if needed (already rebased).

> 
>> Regards,
>> Dumitru
>>>> On 4 Apr 2024, at 18:26, Dumitru Ceara  wrote:
>>>> 
>>>> On 4/1/24 16:27, Mark Michelson wrote:
>>>>> Thanks Vladislav,
>>>>> 
>>>>> Acked-by: Mark Michelson <mmich...@redhat.com>
>>>>> 
>>>> 
>>>> Thanks, Vladislav and Mark!  Applied to main and backported down to
>>>> 23.06 with a minor test change, please see below.
>>>> 
>>>>> On 4/1/24 08:15, Vladislav Odintsov wrote:
>>>>>> In case if all tunnel ids are exhausted, ovn_allocate_tnlid() function
>>>>>> iterates over tnlids indefinitely when *hint is outside of [min, max].
>>>>>> This is because when tnlid reaches max, next tnlid is min and for-loop
>>>>>> never reaches exit condition for tnlid != *hint.
>>>>>> 
>>>>>> This patch fixes mentioned issue and adds a testcase.
>>>>>> 
>>>>>> Signed-off-by: Vladislav Odintsov 
>>>>>> ---
>>>>>>   lib/ovn-util.c  | 10 +++---
>>>>>>   tests/ovn-northd.at | 26 ++
>>>>>>   2 files changed, 33 insertions(+), 3 deletions(-)
>>>>>> 
>>>>>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
>>>>>> index ee5cbcdc3..9f97ae2ca 100644
>>>>>> --- a/lib/ovn-util.c
>>>>>> +++ b/lib/ovn-util.c
>>>>>> @@ -693,13 +693,17 @@ uint32_t
>>>>>>   ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
>>>>>>  uint32_t max, uint32_t *hint)
>>>>>>   {
>>>>>> -for (uint32_t tnlid = next_tnlid(*hint, min, max); tnlid != *hint;
>>>>>> - tnlid = next_tnlid(tnlid, min, max)) {
>>>>>> +/* Normalize hint, because it can be outside of [min, max]. */
>>>>>> +*hint = next_tnlid(*hint, min, max);
>>>>>> +
>>>>>> +uint32_t tnlid = *hint;
>>>>>> +do {
>>>>>>   if (ovn_add_tnlid(set, tnlid)) {
>>>>>>   *hint = tnlid;
>>>>>>   return tnlid;
>>>>>>   }
>>>>>> -}
>>>>>> +tnlid = next_tnlid(tnlid, min, max);
>>>>>> +} while (tnlid != *hint);
>>>>>> static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>>>>>>   VLOG_WARN_RL(&rl, "all %s tunnel ids exhausted", name);
>>>>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>>>>> index cd53755b2..174dbacda 100644
>>>>>> --- a/tests/ovn-northd.at
>>>>>> +++ b/tests/ovn-northd.at
>>>>>> @@ -2822,6 +2822,32 @@ AT_CHECK([test $lsp02 = 3 && test $ls1 = 123])
>>>>>>   AT_CLEANUP
>>>>>>   ])
>>>>>>   +OVN_FOR_EACH_NORTHD_NO_HV([
>>>>>> +AT_SETUP([check tunnel ids exhaustion])
>>>>>> +ovn_start
>>>>>> +
>>>>>> +# Create a fake chassis with vxlan encap to lower MAX DP tunnel key
>>>>>> to 2^12
>>>>>> +ovn-sbctl \
>>>>>> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1"
>>>>>> type="vxlan" \
>>>>>> +-- --id=@c create chassis name=hv1 encaps=@e
>>>>>> +
>>>>>> +cmd="ovn-nbctl --wait=sb"
>>>>>> +
>>>>>> +for i in {1..4097..1}; do
>>>> 
>>>> This can be changed to:
>>>> 
>>>> for i in {1..4097}; do
>>>> 
>>>>>> +cmd="${cmd} -- ls-add lsw-${i}"
>>>>>> +done
>>>>>> +
>>>>>> +eval $cmd
>>>>>> +
>>>>>> +check_row_count nb:Logical_Switch 4097
>>>>>> +wait_row_count sb:Datapath_Binding 4095
>>>>>> +
>>>>>> +OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>>>>>> northd/ovn-northd.log])
>>>>>> +
>>>>>> +AT_CLEANUP
>>>>>> +])
>>>>>> +
>>>>>> +
>>>>>>   OVN_FOR_EACH_NORTHD_NO_HV([
>>>>>>   AT_SETUP([Logical Flow Datapath Groups])
>>>>>>   ovn_start
>>>> 
>>>> Regards,
>>>> Dumitru
>>>> 
>>> 
>>> 
>>> Regards,
>>> Vladislav Odintsov
>>> 
>>> 
> 


Regards,
Vladislav Odintsov


Re: [ovs-dev] [PATCH ovn] Make tunnel ids exhaustion test case trigger the problem.

Thanks Ihar for the patch.

It definitely triggers the bug mentioned in the Fixes commit, but how do you
like the following diff as an alternative?
It seems a little simpler to me, because it shows the real limit and the
situation where the problem occurred (a separate ls-add):

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 6edb1129e..cef144f10 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2862,13 +2862,18 @@ ovn-sbctl \
 
 cmd="ovn-nbctl --wait=sb"
 
-for i in {1..4097}; do
+for i in {1..4095}; do
 cmd="${cmd} -- ls-add lsw-${i}"
 done
 
 eval $cmd
 
-check_row_count nb:Logical_Switch 4097
+check_row_count nb:Logical_Switch 4095
+wait_row_count sb:Datapath_Binding 4095
+
+ovn-nbctl ls-add lsw-exhausted
+
+check_row_count nb:Logical_Switch 4096
 wait_row_count sb:Datapath_Binding 4095
 
 OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" northd/ovn-northd.log])


> On 4 Apr 2024, at 20:13, Ihar Hrachyshka  wrote:
> 
> The original version of the scenario passed with or without the fix.
> 
> Fixes: a1f165a7b807 ("northd: fix infinite loop in ovn_allocate_tnlid()")
> Signed-off-by: Ihar Hrachyshka 
> ---
> tests/ovn-northd.at | 17 +++--
> 1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index fc2c972a4..e8ea8b050 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -2826,11 +2826,6 @@ OVN_FOR_EACH_NORTHD_NO_HV([
> AT_SETUP([check tunnel ids exhaustion])
> ovn_start
> 
> -# Create a fake chassis with vxlan encap to lower MAX DP tunnel key to 2^12
> -ovn-sbctl \
> ---id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
> --- --id=@c create chassis name=hv1 encaps=@e
> -
> cmd="ovn-nbctl --wait=sb"
> 
> for i in {1..4097}; do
> @@ -2840,7 +2835,17 @@ done
> eval $cmd
> 
> check_row_count nb:Logical_Switch 4097
> -wait_row_count sb:Datapath_Binding 4095
> +wait_row_count sb:Datapath_Binding 4097
> +
> +# Now create a fake chassis with vxlan encap to lower MAX DP tunnel key to 
> 2^12
> +ovn-sbctl \
> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="vxlan" \
> +-- --id=@c create chassis name=hv1 encaps=@e
> +
> +ovn-nbctl --wait=sb ls-add lsw-exhausted
> +
> +check_row_count nb:Logical_Switch 4098
> +wait_row_count sb:Datapath_Binding 4097
> 
> OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted" 
> northd/ovn-northd.log])
> 
> -- 
> 2.41.0
> 




Regards,
Vladislav Odintsov


Re: [ovs-dev] [PATCH ovn] northd: fix infinite loop in ovn_allocate_tnlid()



> On 4 Apr 2024, at 21:07, Ihar Hrachyshka  wrote:
> 
> On Thu, Apr 4, 2024 at 1:46 PM Dumitru Ceara <dce...@redhat.com> wrote:
>> On 4/4/24 19:17, Ihar Hrachyshka wrote:
>> > I tried to revert the util change and the test case passed just fine.
>> > 
>> 
>> I had done that before pushing the patch but.. I got tricked by the fact
>> that northd was spinning and using cpu 100% while the switches were
>> added.  My bad.
>> 
>> > I think the scenario that may get the hint out of bounds is 1) start with
>> > no vxlan chassis; 2) create 4097 DPs; 3) add a vxlan chassis - this makes
>> > northd downgrade its max key to 4096. Now when we create a DP, it will spin
>> > in circles. Posted this here:
>> > https://patchwork.ozlabs.org/project/ovn/patch/20240404171358.54678-1-ihrac...@redhat.com/
>> > 

Nice catch! Thanks for the patch!

>> > (We can probably discuss in this context whether it's a good idea for a
>> > cluster to change the max tun id value as chassis come and go. Perhaps it
>> > should be initialized once and for all?)
>> > 
>> > What I also notice is that with the new patch, *hint is always overridden
>> > at the start of the function, so it's always bumped by 1. I am not sure it
>> > was intended. Comments?
>> > 
>> 
>> But the actual change in behavior for '*hint' is only for the case in
>> which we run out of IDs, or am I missing something?  It didn't seem that
>> big of a deal to me.
> 
> Yes, I also don't see a problem, but want the author to confirm if there's a 
> reason for that.

I’ve just reviewed the code again and see that for the case where *hint = 0,
min = 10 and max = 20 this still will not work.
However, I’m not sure this must be fixed, since there are no such cases for
now.
What do you think?

Bumping *hint every time in the normal situation (where we have enough
available IDs) should be safe, because it behaves similarly to the previous
implementation.
There, tnlid was first set to *hint + 1 and then *hint was set to the
current tnlid.
It seems the same to me. Am I missing something?
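
A small sketch of that corner case, under assumptions: next_tnlid() is
assumed to step by one and wrap from max back to min, and normalize_hint() is
a hypothetical helper that is not part of the patch. It only shows that a
hint far below min stays out of range after a single next_tnlid() step, and
one possible way such a hint could be clamped.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed wrap-around step; the real next_tnlid() body is not shown in
 * this thread. */
static uint32_t
next_tnlid(uint32_t tnlid, uint32_t min, uint32_t max)
{
    return tnlid + 1 <= max ? tnlid + 1 : min;
}

/* Hypothetical helper, not part of the patch: advance the hint by one
 * step and additionally clamp results that are still below 'min'. */
static uint32_t
normalize_hint(uint32_t hint, uint32_t min, uint32_t max)
{
    assert(min <= max);

    uint32_t next = next_tnlid(hint, min, max);
    return next < min ? min : next;
}
```

With min=1 (the datapath key case discussed here) a single next_tnlid() step
is already enough, since next_tnlid(0, 1, max) is 1; the extra clamp only
matters when min is greater than *hint + 1.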

>  
>> 
>> > This is all probably relevant to the question of whether any backports
>> > should happen for this patch.
>> > 
>> > Ihar
>> > 
>> 
>> Regards,
>> Dumitru
>> 
>> > 
>> > On Thu, Apr 4, 2024 at 12:46 PM Dumitru Ceara <dce...@redhat.com> wrote:
>> > 
>> >> On 4/4/24 17:52, Vladislav Odintsov wrote:
>> >>> Thanks Dumitru!
>> >>> I’m totally fine with your change.
>> >>> Should I send backport patches with resolved conflicts for remaining
>> >> branches at least till 22.03, which is an LTS?
>> >>>
>> >>
>> >> Well, 24.03 is the most recent LTS.  We don't really backport patches to
>> >> 22.03 unless they fix critical issues.  I'm not completely convinced
>> >> that this is such a critical issue though.  You need 4K logical
>> >> datapaths with vxlan enabled before this gets hit.  In any case, Mark,
>> >> what do you think?
>> >>
>> >> Regards,
>> >> Dumitru
>> >>
>> >>>> On 4 Apr 2024, at 18:26, Dumitru Ceara <dce...@redhat.com> wrote:
>> >>>>
>> >>>> On 4/1/24 16:27, Mark Michelson wrote:
>> >>>>> Thanks Vladislav,
>> >>>>>
>> >>>>> Acked-by: Mark Michelson <mmich...@redhat.com>
>> >>>>>
>> >>>>
>> >>>> Thanks, Vladislav and Mark!  Applied to main and backported down to
>> >>>> 23.06 with a minor test change, please see below.
>> >>>>
>> >>>>> On 4/1/24 08:15, Vladislav Odintsov wrote:
>> >>>>>> In case if all tunnel ids are exhausted, ovn_allocate_tnlid() function
>> >>>>>> iterates over tnlids indefinitely when *hint is outside of [min, max].
>> >>>>>> This is because when tnlid reaches max, next tnlid is min and for-loop
>> >>>>>> never reaches exit condition for tnlid != *hint.
>> >>>>>>
>> >>>>>> This patch fixes mentioned issue and adds a testcase.
>> >>>>>>
>> >>>>>> Signed-off-by: Vladislav Odintsov

[ovs-dev] [PATCH ovn v3] northd: Add support for disabling vxlan mode.

Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
for available tunnel IDs because of lack of space in VXLAN VNI.
In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
and 2047 logical switch ports per datapath.

Prior to this patch vxlan mode was enabled automatically if at least one
chassis had encap of vxlan type.  In scenarios where one wants to use VXLAN
only for HW VTEP (RAMP) switch, such limitation makes no sense.

This patch adds support for explicit disabling of vxlan mode via
Northbound database.

1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068

CC: Ihar Hrachyshka 
Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Vladislav Odintsov 
---
 NEWS  |  3 ++
 northd/en-global-config.c |  9 +++-
 northd/northd.c   | 90 ++-
 northd/northd.h   |  6 ++-
 ovn-nb.xml| 12 ++
 tests/ovn-northd.at   | 29 +
 6 files changed, 97 insertions(+), 52 deletions(-)

diff --git a/NEWS b/NEWS
index 141f1831c..fe95391cd 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,9 @@ Post v24.03.0
 "lflow-stage-to-oftable STAGE_NAME" that converts stage name into OpenFlow
 table id.
   - Rename the ovs-sandbox script to ovn-sandbox.
+  - Added new global config option NB_Global:options:disable_vxlan_mode to
+extend available tunnel IDs space for datapaths from 4095 to 16711680.
+For more details see man ovn-nb for mentioned option.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 34e393b33..9310c4575 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-char *max_tunid = xasprintf("%d",
-get_ovn_max_dp_key_local(sbrec_chassis_table));
+init_vxlan_mode(&nb->options, sbrec_chassis_table);
+char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
@@ -523,6 +523,11 @@ check_nb_options_out_of_sync(const struct nbrec_nb_global 
*nb,
 return true;
 }
 
+if (config_out_of_sync(&nb->options, &config_data->nb_options,
+   "disable_vxlan_mode", false)) {
+return true;
+}
+
 return false;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index c568f6360..859b233e8 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
  */
 static bool default_acl_drop;
 
+/* If this option is 'true' northd will use limited 24-bit space for datapath
+ * and ports tunnel key allocation (12 bits for each instead of default 16). */
+static bool vxlan_mode;
+
 #define MAX_OVN_TAGS 4096
 
 
@@ -875,24 +879,31 @@ join_datapaths(const struct nbrec_logical_switch_table 
*nbrec_ls_table,
 }
 }
 
-static bool
-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+void
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table)
 {
+if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
+vxlan_mode = false;
+return;
+}
+
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
 if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
-return true;
+vxlan_mode = true;
+return;
 }
 }
 }
-return false;
+vxlan_mode = false;
 }
 
 uint32_t
-get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
+get_ovn_max_dp_key_local(void)
 {
-if (is_vxlan_mode(sbrec_chassis_table)) {
+if (vxlan_mode) {
 /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
 return OVN_MAX_DP_VXLAN_KEY;
 }
@@ -900,15 +911,14 @@ get_ovn_max_dp_key_local(const struct sbrec_chassis_table 
*sbrec_chassis_table)
 }
 
 static void
-ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
-  struct hmap *datapaths, struct hmap *dp_tnlids,
+ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
   struct ovn_datapath *od, uint32_t *hint)
 {
 if (!od->tunnel_key) {
 od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(sbrec_ch_table),
-hint);
+OVN_MIN_DP_KEY_LOC

Re: [ovs-dev] [PATCH ovn v2] northd: Add support for disabling vxlan mode.

Oh, my bad. I’ll send out v3.
Sorry.

> On 4 Apr 2024, at 19:53, Dumitru Ceara  wrote:
> 
> On 4/4/24 18:06, Vladislav Odintsov wrote:
>> Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
>> for available tunnel IDs because of lack of space in VXLAN VNI.
>> In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
>> and 2047 logical switch ports per datapath.
>> 
>> Prior to this patch vxlan mode was enabled automatically if at least one
>> chassis had encap of vxlan type.  In scenarios where one wants to use VXLAN
>> only for HW VTEP (RAMP) switch, such limitation makes no sense.
>> 
>> This patch adds support for explicit disabling of vxlan mode via
>> Northbound database.
>> 
>> 1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068
>> 
>> CC: Ihar Hrachyshka 
>> Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
>> Signed-off-by: Vladislav Odintsov 
>> ---
> 
> 
>> diff --git a/ovs b/ovs
>> index fe55ce37a..94191b7a4 16
>> --- a/ovs
>> +++ b/ovs
>> @@ -1 +1 @@
>> -Subproject commit fe55ce37a7b090d09dee5c01ae0797320ad678f6
>> +Subproject commit 94191b7a4926204510931770c52992c9ea24d4e2
> 
> Looks like you included a submodule change by accident.  This causes the
> CI to fail:
> https://github.com/ovsrobot/ovn/actions/runs/8557944614
> 
> Regards,
> Dumitru


Regards,
Vladislav Odintsov


Re: [ovs-dev] [PATCH ovn] northd: Add support for disabling vxlan mode.

I’ve sent v2:
https://patchwork.ozlabs.org/project/ovn/patch/20240404160628.970615-1-odiv...@gmail.com/

> On 4 Apr 2024, at 18:27, Dumitru Ceara  wrote:
> 
> On 4/4/24 14:38, Vladislav Odintsov wrote:
>> *Patch [1] is 
>> https://patchwork.ozlabs.org/project/ovn/patch/20240401121510.758326-1-odiv...@gmail.com/
>> 
>>> On 4 Apr 2024, at 15:33, Vladislav Odintsov  wrote:
>>> 
>>> Hi Dumitru,
>>> 
>>> thanks for your attention on this!
>>> 
>>>> On 4 Apr 2024, at 13:06, Dumitru Ceara  wrote:
>>>> 
>>>> On 4/3/24 22:05, Vladislav Odintsov wrote:
>>>>> re-sending email because ovs list rejected previous its content for some 
>>>>> reason:
>>>>> 
>>>>> Hi Ihar,
>>>>> 
>>>> 
>>>> Hi Vladislav, Ihar,
>>>> 
>>>>> thanks for your quick reaction!
>>>>> I didn’t see mentioned thread, but I think that it is not safe enough to 
>>>>> have automatic detection of this scenario here.
>>>>> 
>>>>> Imagine: for VXLAN with HW VTEP scenario besides VXLAN encap one must 
>>>>> configure also either GENEVE and/or STT encap(s) for HV chassis.
>>>>> 
>>>>> So, detection could be implemented like this:
>>>>> Check all non-VTEP chassis' encaps and find "effective encap" for each of 
>>>>> them. If we detect at least one chassis with "effective encap" == vxlan, 
>>>>> then enable vxlan mode. Normal mode otherwise.
>>>>> "effective encap" means that for 'vxlan,geneve,stt' encaps effective is 
>>>>> geneve, for 'vxlan,stt' -> stt, for 'vxlan' -> vxlan.
>>>>> Such behavior was my first idea.
>>>>> 
>>>>> But I decided that mode flapping is possible if there is a
>>>>> problem/bug in deployment tooling: it is enough to have only one
>>>>> chassis with a wrong encap set to affect vxlan mode for the entire
>>>>> OVN cluster.
>>>>> Such mode flapping can result in problems with tunnel ids allocation.
>>>> 
>>>> These are valid points.
>>>> 
>>>>> So it seems that to have an option that statically sets vxlan mode is 
>>>>> more resilient.
>>>> 
>>>> In general we try to avoid new config knobs.
>>>> .
>>>>> What do you think?
>>>>> 
>>>> 
>>>> But in this case it may actually be easier if we offload the work of
>>>> determining vxlan-mode to the CMS.
>>>> 
>>>>> 
>>>>>> On 3 Apr 2024, at 20:43, Ihar Hrachyshka  wrote:
>>>>>> 
>>>>>> Thank you Vladislav.
>>>>>> 
>>>>>> FYI it was reported in the past in 
>>>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-July/051931.html 
>>>>>> but fell through cracks then. Thanks for picking it up!
>>>>>> 
>>>>>> In your patch, you introduce a new config option to disable the 
>>>>>> 'vxlan-mode' behavior. This will definitely work. But I wonder if we can 
>>>>>> automatically detect this scenario by ignoring the chassis that are VTEP 
>>>>>> from consideration? I believe ovn-controller-vtep sets `is-vtep` in 
>>>>>> other_config, so - would it work if we modify is_vxlan_mode to consider 
>>>>>> it too?
>>>>>> 
>>>>>> Thanks again for looking into this.
>>>>>> Ihar
>>>>>> 
>>>>>> On Wed, Apr 3, 2024 at 6:34 AM Vladislav Odintsov <odiv...@gmail.com> wrote:
>>>>>>> Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
>>>>>>> for available tunnel IDs because of lack of space in VXLAN VNI.
>>>>>>> In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
>>>>>>> and 2047 logical switch ports per datapath.
>>>>>>> 
>>>>>>> Prior to this patch vxlan mode was enabled automatically if at least one
>>>>>>> chassis had encap of vxlan type.  In scenarios where one wants to use VXLAN
>>>>>>> only for HW VTEP (RAMP) switch, such limitation makes no sense.
>>>>>>> 
>>>>>>> This patch

[ovs-dev] [PATCH ovn v2] northd: Add support for disabling vxlan mode.

Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
for available tunnel IDs because of lack of space in VXLAN VNI.
In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
and 2047 logical switch ports per datapath.

Prior to this patch vxlan mode was enabled automatically if at least one
chassis had encap of vxlan type.  In scenarios where one wants to use VXLAN
only for HW VTEP (RAMP) switch, such limitation makes no sense.

This patch adds support for explicit disabling of vxlan mode via
Northbound database.

1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068

CC: Ihar Hrachyshka 
Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Vladislav Odintsov 
---
 NEWS  |  3 ++
 northd/en-global-config.c |  9 +++-
 northd/northd.c   | 90 ++-
 northd/northd.h   |  6 ++-
 ovn-nb.xml| 12 ++
 ovs   |  2 +-
 tests/ovn-northd.at   | 29 +
 7 files changed, 98 insertions(+), 53 deletions(-)

diff --git a/NEWS b/NEWS
index 141f1831c..fe95391cd 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,9 @@ Post v24.03.0
 "lflow-stage-to-oftable STAGE_NAME" that converts stage name into OpenFlow
 table id.
   - Rename the ovs-sandbox script to ovn-sandbox.
+  - Added new global config option NB_Global:options:disable_vxlan_mode to
+extend available tunnel IDs space for datapaths from 4095 to 16711680.
+For more details see man ovn-nb for mentioned option.
 
 OVN v24.03.0 - 01 Mar 2024
 --
diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 34e393b33..9310c4575 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-char *max_tunid = xasprintf("%d",
-get_ovn_max_dp_key_local(sbrec_chassis_table));
+init_vxlan_mode(&nb->options, sbrec_chassis_table);
+char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
@@ -523,6 +523,11 @@ check_nb_options_out_of_sync(const struct nbrec_nb_global 
*nb,
 return true;
 }
 
+if (config_out_of_sync(&nb->options, &config_data->nb_options,
+   "disable_vxlan_mode", false)) {
+return true;
+}
+
 return false;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index c568f6360..859b233e8 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
  */
 static bool default_acl_drop;
 
+/* If this option is 'true' northd will use limited 24-bit space for datapath
+ * and ports tunnel key allocation (12 bits for each instead of default 16). */
+static bool vxlan_mode;
+
 #define MAX_OVN_TAGS 4096
 
 
@@ -875,24 +879,31 @@ join_datapaths(const struct nbrec_logical_switch_table 
*nbrec_ls_table,
 }
 }
 
-static bool
-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+void
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table)
 {
+if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
+vxlan_mode = false;
+return;
+}
+
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
 if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
-return true;
+vxlan_mode = true;
+return;
 }
 }
 }
-return false;
+vxlan_mode = false;
 }
 
 uint32_t
-get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
+get_ovn_max_dp_key_local(void)
 {
-if (is_vxlan_mode(sbrec_chassis_table)) {
+if (vxlan_mode) {
 /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
 return OVN_MAX_DP_VXLAN_KEY;
 }
@@ -900,15 +911,14 @@ get_ovn_max_dp_key_local(const struct sbrec_chassis_table 
*sbrec_chassis_table)
 }
 
 static void
-ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
-  struct hmap *datapaths, struct hmap *dp_tnlids,
+ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
   struct ovn_datapath *od, uint32_t *hint)
 {
 if (!od->tunnel_key) {
 od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(sbrec_ch_table),
-hint);
+

Re: [ovs-dev] [PATCH ovn] northd: fix infinite loop in ovn_allocate_tnlid()

Thanks Dumitru!
I’m totally fine with your change.
Should I send backport patches with resolved conflicts for remaining branches 
at least till 22.03, which is an LTS?

> On 4 Apr 2024, at 18:26, Dumitru Ceara  wrote:
> 
> On 4/1/24 16:27, Mark Michelson wrote:
>> Thanks Vladislav,
>> 
>> Acked-by: Mark Michelson <mmich...@redhat.com>
>> 
> 
> Thanks, Vladislav and Mark!  Applied to main and backported down to
> 23.06 with a minor test change, please see below.
> 
>> On 4/1/24 08:15, Vladislav Odintsov wrote:
>>> If all tunnel ids are exhausted, the ovn_allocate_tnlid() function
>>> iterates over tnlids indefinitely when *hint is outside of [min, max].
>>> This is because when tnlid reaches max, the next tnlid is min, so the
>>> for-loop condition tnlid != *hint never becomes false.
>>> 
>>> This patch fixes the mentioned issue and adds a testcase.
>>> 
>>> Signed-off-by: Vladislav Odintsov 
>>> ---
>>>   lib/ovn-util.c  | 10 +++---
>>>   tests/ovn-northd.at | 26 ++
>>>   2 files changed, 33 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
>>> index ee5cbcdc3..9f97ae2ca 100644
>>> --- a/lib/ovn-util.c
>>> +++ b/lib/ovn-util.c
>>> @@ -693,13 +693,17 @@ uint32_t
>>>   ovn_allocate_tnlid(struct hmap *set, const char *name, uint32_t min,
>>>  uint32_t max, uint32_t *hint)
>>>   {
>>> -for (uint32_t tnlid = next_tnlid(*hint, min, max); tnlid != *hint;
>>> - tnlid = next_tnlid(tnlid, min, max)) {
>>> +/* Normalize hint, because it can be outside of [min, max]. */
>>> +*hint = next_tnlid(*hint, min, max);
>>> +
>>> +uint32_t tnlid = *hint;
>>> +do {
>>>   if (ovn_add_tnlid(set, tnlid)) {
>>>   *hint = tnlid;
>>>   return tnlid;
>>>   }
>>> -}
>>> +tnlid = next_tnlid(tnlid, min, max);
>>> +} while (tnlid != *hint);
>>> static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>>>   VLOG_WARN_RL(&rl, "all %s tunnel ids exhausted", name);
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index cd53755b2..174dbacda 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -2822,6 +2822,32 @@ AT_CHECK([test $lsp02 = 3 && test $ls1 = 123])
>>>   AT_CLEANUP
>>>   ])
>>>   +OVN_FOR_EACH_NORTHD_NO_HV([
>>> +AT_SETUP([check tunnel ids exhaustion])
>>> +ovn_start
>>> +
>>> +# Create a fake chassis with vxlan encap to lower MAX DP tunnel key
>>> to 2^12
>>> +ovn-sbctl \
>>> +--id=@e create encap chassis_name=hv1 ip="192.168.0.1"
>>> type="vxlan" \
>>> +-- --id=@c create chassis name=hv1 encaps=@e
>>> +
>>> +cmd="ovn-nbctl --wait=sb"
>>> +
>>> +for i in {1..4097..1}; do
> 
> This can be changed to:
> 
> for i in {1..4097}; do
> 
>>> +cmd="${cmd} -- ls-add lsw-${i}"
>>> +done
>>> +
>>> +eval $cmd
>>> +
>>> +check_row_count nb:Logical_Switch 4097
>>> +wait_row_count sb:Datapath_Binding 4095
>>> +
>>> +OVS_WAIT_UNTIL([grep "all datapath tunnel ids exhausted"
>>> northd/ovn-northd.log])
>>> +
>>> +AT_CLEANUP
>>> +])
>>> +
>>> +
>>>   OVN_FOR_EACH_NORTHD_NO_HV([
>>>   AT_SETUP([Logical Flow Datapath Groups])
>>>   ovn_start
> 
> Regards,
> Dumitru
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Regards,
Vladislav Odintsov

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
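
To illustrate the loop fix discussed in this thread, here is a minimal,
self-contained sketch of the same do/while pattern. It is not the OVN code:
the array-backed set, the sketch_* names and the simplified successor
function are assumptions made for illustration, standing in for the hmap,
ovn_add_tnlid() and next_tnlid() that the patch touches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hmap-backed id set used by OVN:
 * a small array where used[id] marks an allocated tunnel id. */
#define SKETCH_CAPACITY 16
static bool used[SKETCH_CAPACITY];

/* Marks 'id' as allocated.  Returns false if it is already taken. */
static bool
sketch_add_tnlid(uint32_t id)
{
    assert(id < SKETCH_CAPACITY);
    if (used[id]) {
        return false;
    }
    used[id] = true;
    return true;
}

/* Simplified successor function (an assumption, not OVN's next_tnlid()):
 * any value outside [min, max) maps to 'min', so the result always lies
 * inside [min, max]. */
static uint32_t
sketch_next_tnlid(uint32_t id, uint32_t min, uint32_t max)
{
    return id >= min && id < max ? id + 1 : min;
}

/* Returns a free id in [min, max], or 0 if every id is taken.
 * Assumes min >= 1 so that 0 can signal exhaustion. */
static uint32_t
sketch_allocate_tnlid(uint32_t min, uint32_t max, uint32_t *hint)
{
    /* Normalize the hint first, so the value the loop compares against
     * is one that the iteration can actually reach. */
    *hint = sketch_next_tnlid(*hint, min, max);

    uint32_t id = *hint;
    do {
        if (sketch_add_tnlid(id)) {
            *hint = id;
            return id;
        }
        id = sketch_next_tnlid(id, min, max);
    } while (id != *hint);

    return 0;
}
```

With min = 1, max = 4 and an out-of-range hint such as 100, four calls hand
out ids 1 through 4 and a fifth call returns 0 after one full pass. A
for-loop that compared against the unnormalized hint would never see
tnlid == 100 and would keep cycling through 1..4.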

Re: [ovs-dev] [PATCH ovn] northd: Add support for disabling vxlan mode.

*Patch [1] is 
https://patchwork.ozlabs.org/project/ovn/patch/20240401121510.758326-1-odiv...@gmail.com/

> On 4 Apr 2024, at 15:33, Vladislav Odintsov  wrote:
> 
> Hi Dumitru,
> 
> thanks for your attention to this!
> 
>> On 4 Apr 2024, at 13:06, Dumitru Ceara  wrote:
>> 
>> On 4/3/24 22:05, Vladislav Odintsov wrote:
>>> Re-sending this email because the ovs list rejected its previous content
>>> for some reason:
>>> 
>>> Hi Ihar,
>>> 
>> 
>> Hi Vladislav, Ihar,
>> 
>>> thanks for your quick reaction!
>>> I hadn't seen the mentioned thread, but I think it is not safe enough to
>>> have automatic detection of this scenario here.
>>> 
>>> Imagine: in the VXLAN with HW VTEP scenario, besides the VXLAN encap one
>>> must also configure GENEVE and/or STT encap(s) for the HV chassis.
>>> 
>>> So, detection could be implemented like this:
>>> Check all non-VTEP chassis' encaps and find the "effective encap" for each
>>> of them. If we detect at least one chassis with "effective encap" == vxlan,
>>> then enable vxlan mode; use normal mode otherwise.
>>> "Effective encap" means that for 'vxlan,geneve,stt' encaps the effective
>>> one is geneve, for 'vxlan,stt' -> stt, for 'vxlan' -> vxlan.
>>> Such behavior was my first idea.
>>> 
>>> But I decided that the modes could flap if there is a problem/bug in the
>>> deployment tooling, since a single chassis with a wrong encap set is enough
>>> to affect vxlan mode for the entire OVN cluster. Such mode flapping can
>>> result in problems with tunnel id allocation.
>> 
>> These are valid points.
>> 
>>> So it seems that having an option that statically sets vxlan mode is more
>>> resilient.
>> 
>> In general we try to avoid new config knobs.
>>
>>> What do you think?
>>> 
>> 
>> But in this case it may actually be easier if we offload the work of
>> determining vxlan-mode to the CMS.
>> 
>>> 
>>>> On 3 Apr 2024, at 20:43, Ihar Hrachyshka  wrote:
>>>> 
>>>> Thank you Vladislav.
>>>> 
>>>> FYI it was reported in the past in 
>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-July/051931.html 
>>>> but fell through the cracks then. Thanks for picking it up!
>>>> 
>>>> In your patch, you introduce a new config option to disable the 
>>>> 'vxlan-mode' behavior. This will definitely work. But I wonder if we can 
>>>> automatically detect this scenario by ignoring the chassis that are VTEP 
>>>> from consideration? I believe ovn-controller-vtep sets `is-vtep` in 
>>>> other_config, so - would it work if we modify is_vxlan_mode to consider it 
>>>> too?
>>>> 
>>>> Thanks again for looking into this.
>>>> Ihar
>>>> 
>>>> On Wed, Apr 3, 2024 at 6:34 AM Vladislav Odintsov <odiv...@gmail.com> wrote:
>>>>> Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
>>>>> on available tunnel IDs because of the lack of space in the VXLAN VNI.
>>>>> In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
>>>>> and 2047 logical switch ports per datapath.
>>>>> 
>>>>> Prior to this patch vxlan mode was enabled automatically if at least one
>>>>> chassis had an encap of vxlan type.  In scenarios where one wants to use
>>>>> VXLAN only for a HW VTEP (RAMP) switch, such a limitation makes no sense.
>>>>> 
>>>>> This patch adds support for explicitly disabling vxlan mode via the
>>>>> Northbound database.
>>>>> 
>>>>> 1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068
>>>>> 
>>>>> CC: Ihar Hrachyshka <ihrac...@redhat.com>
>>>>> Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
>>>>> Signed-off-by: Vladislav Odintsov <odiv...@gmail.com>
>>>>> ---
>>>>> northd/en-global-config.c |  9 +++-
>>>>> northd/northd.c   | 90 ++-
>>>>> northd/northd.h   |  6 ++-
>>>>> ovn-nb.xml| 12 ++
>>>>> tests/ovn-northd.at   | 29 +

Re: [ovs-dev] [PATCH ovn] northd: Add support for disabling vxlan mode.

Hi Dumitru,

thanks for your attention to this!

> On 4 Apr 2024, at 13:06, Dumitru Ceara  wrote:
> 
> On 4/3/24 22:05, Vladislav Odintsov wrote:
>> Re-sending this email because the ovs list rejected its previous content
>> for some reason:
>> 
>> Hi Ihar,
>> 
> 
> Hi Vladislav, Ihar,
> 
>> thanks for your quick reaction!
>> I hadn't seen the mentioned thread, but I think it is not safe enough to
>> have automatic detection of this scenario here.
>> 
>> Imagine: in the VXLAN with HW VTEP scenario, besides the VXLAN encap one
>> must also configure GENEVE and/or STT encap(s) for the HV chassis.
>> 
>> So, detection could be implemented like this:
>> Check all non-VTEP chassis' encaps and find the "effective encap" for each
>> of them. If we detect at least one chassis with "effective encap" == vxlan,
>> then enable vxlan mode; use normal mode otherwise.
>> "Effective encap" means that for 'vxlan,geneve,stt' encaps the effective
>> one is geneve, for 'vxlan,stt' -> stt, for 'vxlan' -> vxlan.
>> Such behavior was my first idea.
>> 
>> But I decided that the modes could flap if there is a problem/bug in the
>> deployment tooling, since a single chassis with a wrong encap set is enough
>> to affect vxlan mode for the entire OVN cluster. Such mode flapping can
>> result in problems with tunnel id allocation.
> 
> These are valid points.
> 
>> So it seems that having an option that statically sets vxlan mode is more
>> resilient.
> 
> In general we try to avoid new config knobs.
>
>> What do you think?
>> 
> 
> But in this case it may actually be easier if we offload the work of
> determining vxlan-mode to the CMS.
> 
>> 
>>> On 3 Apr 2024, at 20:43, Ihar Hrachyshka  wrote:
>>> 
>>> Thank you Vladislav.
>>> 
>>> FYI it was reported in the past in 
>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-July/051931.html 
>>> but fell through the cracks then. Thanks for picking it up!
>>> 
>>> In your patch, you introduce a new config option to disable the 
>>> 'vxlan-mode' behavior. This will definitely work. But I wonder if we can 
>>> automatically detect this scenario by ignoring the chassis that are VTEP 
>>> from consideration? I believe ovn-controller-vtep sets `is-vtep` in 
>>> other_config, so - would it work if we modify is_vxlan_mode to consider it 
>>> too?
>>> 
>>> Thanks again for looking into this.
>>> Ihar
>>> 
>>> On Wed, Apr 3, 2024 at 6:34 AM Vladislav Odintsov <odiv...@gmail.com> wrote:
>>>> Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
>>>> on available tunnel IDs because of the lack of space in the VXLAN VNI.
>>>> In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
>>>> and 2047 logical switch ports per datapath.
>>>> 
>>>> Prior to this patch vxlan mode was enabled automatically if at least one
>>>> chassis had an encap of vxlan type.  In scenarios where one wants to use
>>>> VXLAN only for a HW VTEP (RAMP) switch, such a limitation makes no sense.
>>>> 
>>>> This patch adds support for explicitly disabling vxlan mode via the
>>>> Northbound database.
>>>> 
>>>> 1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068
>>>> 
>>>> CC: Ihar Hrachyshka <ihrac...@redhat.com>
>>>> Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
>>>> Signed-off-by: Vladislav Odintsov <odiv...@gmail.com>
>>>> ---
>>>> northd/en-global-config.c |  9 +++-
>>>> northd/northd.c   | 90 ++-
>>>> northd/northd.h   |  6 ++-
>>>> ovn-nb.xml| 12 ++
>>>> tests/ovn-northd.at   | 29 +
>>>> 5 files changed, 94 insertions(+), 52 deletions(-)
>>>> 
>>>> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
>>>> index 34e393b33..9310c4575 100644
>>>> --- a/northd/en-global-config.c
>>>> +++ b/northd/en-global-config.c
>>>> @@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void 
>>>> *data)
>>>>  config_data->svc_monitor_mac);
>>>> }
>>>> 
>>>> -cha

Re: [ovs-dev] [PATCH ovn] northd: Add support for disabling vxlan mode.

Re-sending this email because the ovs list rejected its previous content for
some reason:

Hi Ihar,

thanks for your quick reaction!
I hadn't seen the mentioned thread, but I think it is not safe enough to have
automatic detection of this scenario here.

Imagine: in the VXLAN with HW VTEP scenario, besides the VXLAN encap one must
also configure GENEVE and/or STT encap(s) for the HV chassis.

So, detection could be implemented like this:
Check all non-VTEP chassis' encaps and find the "effective encap" for each of
them. If we detect at least one chassis with "effective encap" == vxlan, then
enable vxlan mode; use normal mode otherwise.
"Effective encap" means that for 'vxlan,geneve,stt' encaps the effective one
is geneve, for 'vxlan,stt' -> stt, for 'vxlan' -> vxlan.
Such behavior was my first idea.

But I decided that the modes could flap if there is a problem/bug in the
deployment tooling, since a single chassis with a wrong encap set is enough to
affect vxlan mode for the entire OVN cluster. Such mode flapping can result in
problems with tunnel id allocation.
So it seems that having an option that statically sets vxlan mode is more
resilient.
What do you think?
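
For reference, the "effective encap" detection described above could be
sketched roughly as follows. This is only an illustration of the idea, not
code from the patch: struct chassis_sketch, its is_vtep flag (assumed to
mirror the `is-vtep` setting mentioned in the thread) and both function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified chassis record; not the real sbrec_chassis. */
struct chassis_sketch {
    bool is_vtep;           /* Assumed to reflect other_config:is-vtep. */
    const char *encaps[3];  /* Encap type names configured on the chassis. */
    size_t n_encaps;
};

/* Picks the preferred encap of a chassis, using the order given in the
 * email: geneve over stt over vxlan.  Returns NULL if none matches. */
static const char *
effective_encap(const struct chassis_sketch *ch)
{
    static const char *const preference[] = { "geneve", "stt", "vxlan" };

    for (size_t p = 0; p < sizeof preference / sizeof preference[0]; p++) {
        for (size_t i = 0; i < ch->n_encaps; i++) {
            if (!strcmp(ch->encaps[i], preference[p])) {
                return preference[p];
            }
        }
    }
    return NULL;
}

/* Returns true if at least one non-VTEP chassis ends up with vxlan as its
 * effective encap. */
static bool
detect_vxlan_mode(const struct chassis_sketch *chassis, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (chassis[i].is_vtep) {
            continue;
        }
        const char *encap = effective_encap(&chassis[i]);
        if (encap && !strcmp(encap, "vxlan")) {
            return true;
        }
    }
    return false;
}
```

The sketch also shows the weakness: a single non-VTEP chassis registered
with only a vxlan encap makes detect_vxlan_mode() return true for the whole
cluster.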


> On 3 Apr 2024, at 20:43, Ihar Hrachyshka  wrote:
> 
> Thank you Vladislav.
> 
> FYI it was reported in the past in 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-July/051931.html but 
> fell through the cracks then. Thanks for picking it up!
> 
> In your patch, you introduce a new config option to disable the 'vxlan-mode' 
> behavior. This will definitely work. But I wonder if we can automatically 
> detect this scenario by ignoring the chassis that are VTEP from 
> consideration? I believe ovn-controller-vtep sets `is-vtep` in other_config, 
> so - would it work if we modify is_vxlan_mode to consider it too?
> 
> Thanks again for looking into this.
> Ihar
> 
> On Wed, Apr 3, 2024 at 6:34 AM Vladislav Odintsov <odiv...@gmail.com> wrote:
>> Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
>> on available tunnel IDs because of the lack of space in the VXLAN VNI.
>> In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
>> and 2047 logical switch ports per datapath.
>> 
>> Prior to this patch vxlan mode was enabled automatically if at least one
>> chassis had an encap of vxlan type.  In scenarios where one wants to use
>> VXLAN only for a HW VTEP (RAMP) switch, such a limitation makes no sense.
>> 
>> This patch adds support for explicitly disabling vxlan mode via the
>> Northbound database.
>> 
>> 1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068
>> 
>> CC: Ihar Hrachyshka <ihrac...@redhat.com>
>> Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
>> Signed-off-by: Vladislav Odintsov <odiv...@gmail.com>
>> ---
>>  northd/en-global-config.c |  9 +++-
>>  northd/northd.c   | 90 ++-
>>  northd/northd.h   |  6 ++-
>>  ovn-nb.xml| 12 ++
>>  tests/ovn-northd.at   | 29 +
>>  5 files changed, 94 insertions(+), 52 deletions(-)
>> 
>> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
>> index 34e393b33..9310c4575 100644
>> --- a/northd/en-global-config.c
>> +++ b/northd/en-global-config.c
>> @@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void 
>> *data)
>>   config_data->svc_monitor_mac);
>>  }
>> 
>> -char *max_tunid = xasprintf("%d",
>> -get_ovn_max_dp_key_local(sbrec_chassis_table));
>> +init_vxlan_mode(&nb->options, sbrec_chassis_table);
>> +char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
>>  smap_replace(options, "max_tunid", max_tunid);
>>  free(max_tunid);
>> 
>> @@ -523,6 +523,11 @@ check_nb_options_out_of_sync(const struct 
>> nbrec_nb_global *nb,
>>  return true;
>>  }
>> 
>> +if (config_out_of_sync(&nb->options, &config_data->nb_options,
>> +   "disable_vxlan_mode", false)) {
>> +return true;
>> +}
>> +
>>  return false;
>>  }
>> 
>> diff --git a/northd/northd.c b/northd/northd.c
>> index c568f6360..859b233e8 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
>>   */
>>  static bool default_acl_drop;
>> 
>>

Re: [ovs-dev] [PATCH ovn] northd: Add support for disabling vxlan mode.

The failed new testcase assumes that patch [1] is applied.
Should I resend them both as a single patchset?

1: 
https://patchwork.ozlabs.org/project/ovn/patch/20240401121510.758326-1-odiv...@gmail.com/

> On 3 Apr 2024, at 13:34, Vladislav Odintsov  wrote:
> 
> Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
> on available tunnel IDs because of the lack of space in the VXLAN VNI.
> In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
> and 2047 logical switch ports per datapath.
> 
> Prior to this patch vxlan mode was enabled automatically if at least one
> chassis had an encap of vxlan type.  In scenarios where one wants to use
> VXLAN only for a HW VTEP (RAMP) switch, such a limitation makes no sense.
> 
> This patch adds support for explicitly disabling vxlan mode via the
> Northbound database.
> 
> 1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068
> 
> CC: Ihar Hrachyshka 
> Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
> Signed-off-by: Vladislav Odintsov 
> ---
> northd/en-global-config.c |  9 +++-
> northd/northd.c   | 90 ++-
> northd/northd.h   |  6 ++-
> ovn-nb.xml| 12 ++
> tests/ovn-northd.at   | 29 +
> 5 files changed, 94 insertions(+), 52 deletions(-)
> 
> diff --git a/northd/en-global-config.c b/northd/en-global-config.c
> index 34e393b33..9310c4575 100644
> --- a/northd/en-global-config.c
> +++ b/northd/en-global-config.c
> @@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void 
> *data)
>  config_data->svc_monitor_mac);
> }
> 
> -char *max_tunid = xasprintf("%d",
> -get_ovn_max_dp_key_local(sbrec_chassis_table));
> +init_vxlan_mode(&nb->options, sbrec_chassis_table);
> +char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
> smap_replace(options, "max_tunid", max_tunid);
> free(max_tunid);
> 
> @@ -523,6 +523,11 @@ check_nb_options_out_of_sync(const struct 
> nbrec_nb_global *nb,
> return true;
> }
> 
> +if (config_out_of_sync(&nb->options, &config_data->nb_options,
> +   "disable_vxlan_mode", false)) {
> +return true;
> +}
> +
> return false;
> }
> 
> diff --git a/northd/northd.c b/northd/northd.c
> index c568f6360..859b233e8 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
>  */
> static bool default_acl_drop;
> 
> +/* If this option is 'true' northd will use limited 24-bit space for datapath
> + * and ports tunnel key allocation (12 bits for each instead of default 16). 
> */
> +static bool vxlan_mode;
> +
> #define MAX_OVN_TAGS 4096
> 
> 
> @@ -875,24 +879,31 @@ join_datapaths(const struct nbrec_logical_switch_table 
> *nbrec_ls_table,
> }
> }
> 
> -static bool
> -is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
> +void
> +init_vxlan_mode(const struct smap *nb_options,
> +const struct sbrec_chassis_table *sbrec_chassis_table)
> {
> +if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
> +vxlan_mode = false;
> +return;
> +}
> +
> const struct sbrec_chassis *chassis;
> SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
> for (int i = 0; i < chassis->n_encaps; i++) {
> if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
> -return true;
> +vxlan_mode = true;
> +return;
> }
> }
> }
> -return false;
> +vxlan_mode = false;
> }
> 
> uint32_t
> -get_ovn_max_dp_key_local(const struct sbrec_chassis_table 
> *sbrec_chassis_table)
> +get_ovn_max_dp_key_local(void)
> {
> -if (is_vxlan_mode(sbrec_chassis_table)) {
> +if (vxlan_mode) {
> /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
> return OVN_MAX_DP_VXLAN_KEY;
> }
> @@ -900,15 +911,14 @@ get_ovn_max_dp_key_local(const struct 
> sbrec_chassis_table *sbrec_chassis_table)
> }
> 
> static void
> -ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
> -  struct hmap *datapaths, struct hmap *dp_tnlids,
> +ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
>   struct ovn_datapath *od, uint32_t *hint)
> {
> if (!od->tunnel_key) {
> od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "

[ovs-dev] [PATCH ovn] northd: Add support for disabling vxlan mode.

Commit [1] introduced a "vxlan mode" concept.  It brought a limitation
on available tunnel IDs because of the lack of space in the VXLAN VNI.
In vxlan mode OVN is limited to 4095 datapaths (LRs or non-transit LSs)
and 2047 logical switch ports per datapath.

Prior to this patch vxlan mode was enabled automatically if at least one
chassis had an encap of vxlan type.  In scenarios where one wants to use
VXLAN only for a HW VTEP (RAMP) switch, such a limitation makes no sense.

This patch adds support for explicitly disabling vxlan mode via the
Northbound database.
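
The intended precedence can be summarized with a small standalone sketch.
It only models the decision, under assumptions: the sketch_* names are
hypothetical, the encap types are passed as a plain string array instead of
Southbound chassis rows, and the non-vxlan maximum is a caller-supplied
placeholder rather than OVN's real default. The 4095 limit is 2^12 - 1,
i.e. the 12-bit datapath key space that vxlan mode leaves available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 12-bit datapath key space in vxlan mode: 2^12 - 1 = 4095 usable keys. */
#define SKETCH_MAX_DP_VXLAN_KEY ((1u << 12) - 1)

/* Decides whether vxlan mode is active.  The explicit option wins; only
 * when it is unset does the presence of any vxlan encap enable the mode. */
static bool
sketch_vxlan_mode(bool disable_vxlan_mode,
                  const char *const *encap_types, size_t n_encaps)
{
    if (disable_vxlan_mode) {
        return false;
    }
    for (size_t i = 0; i < n_encaps; i++) {
        if (!strcmp(encap_types[i], "vxlan")) {
            return true;
        }
    }
    return false;
}

/* Returns the largest datapath key that may be allocated.  'default_max'
 * is whatever limit applies outside vxlan mode (a placeholder here). */
static uint32_t
sketch_max_dp_key(bool vxlan_mode, uint32_t default_max)
{
    assert(default_max >= SKETCH_MAX_DP_VXLAN_KEY);
    return vxlan_mode ? SKETCH_MAX_DP_VXLAN_KEY : default_max;
}
```

Unlike the patch, which caches the result in a file-level vxlan_mode
variable, the sketch keeps both helpers free of global state so they can be
exercised in isolation.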

1: https://github.com/ovn-org/ovn/commit/b07f1bc3d068

CC: Ihar Hrachyshka 
Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Vladislav Odintsov 
---
 northd/en-global-config.c |  9 +++-
 northd/northd.c   | 90 ++-
 northd/northd.h   |  6 ++-
 ovn-nb.xml| 12 ++
 tests/ovn-northd.at   | 29 +
 5 files changed, 94 insertions(+), 52 deletions(-)

diff --git a/northd/en-global-config.c b/northd/en-global-config.c
index 34e393b33..9310c4575 100644
--- a/northd/en-global-config.c
+++ b/northd/en-global-config.c
@@ -115,8 +115,8 @@ en_global_config_run(struct engine_node *node , void *data)
  config_data->svc_monitor_mac);
 }
 
-char *max_tunid = xasprintf("%d",
-get_ovn_max_dp_key_local(sbrec_chassis_table));
+init_vxlan_mode(&nb->options, sbrec_chassis_table);
+char *max_tunid = xasprintf("%d", get_ovn_max_dp_key_local());
 smap_replace(options, "max_tunid", max_tunid);
 free(max_tunid);
 
@@ -523,6 +523,11 @@ check_nb_options_out_of_sync(const struct nbrec_nb_global 
*nb,
 return true;
 }
 
+if (config_out_of_sync(&nb->options, &config_data->nb_options,
+   "disable_vxlan_mode", false)) {
+return true;
+}
+
 return false;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index c568f6360..859b233e8 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -90,6 +90,10 @@ static bool use_ct_inv_match = true;
  */
 static bool default_acl_drop;
 
+/* If this option is 'true' northd will use limited 24-bit space for datapath
+ * and ports tunnel key allocation (12 bits for each instead of default 16). */
+static bool vxlan_mode;
+
 #define MAX_OVN_TAGS 4096
 
 
@@ -875,24 +879,31 @@ join_datapaths(const struct nbrec_logical_switch_table 
*nbrec_ls_table,
 }
 }
 
-static bool
-is_vxlan_mode(const struct sbrec_chassis_table *sbrec_chassis_table)
+void
+init_vxlan_mode(const struct smap *nb_options,
+const struct sbrec_chassis_table *sbrec_chassis_table)
 {
+if (smap_get_bool(nb_options, "disable_vxlan_mode", false)) {
+vxlan_mode = false;
+return;
+}
+
 const struct sbrec_chassis *chassis;
 SBREC_CHASSIS_TABLE_FOR_EACH (chassis, sbrec_chassis_table) {
 for (int i = 0; i < chassis->n_encaps; i++) {
 if (!strcmp(chassis->encaps[i]->type, "vxlan")) {
-return true;
+vxlan_mode = true;
+return;
 }
 }
 }
-return false;
+vxlan_mode = false;
 }
 
 uint32_t
-get_ovn_max_dp_key_local(const struct sbrec_chassis_table *sbrec_chassis_table)
+get_ovn_max_dp_key_local(void)
 {
-if (is_vxlan_mode(sbrec_chassis_table)) {
+if (vxlan_mode) {
 /* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
 return OVN_MAX_DP_VXLAN_KEY;
 }
@@ -900,15 +911,14 @@ get_ovn_max_dp_key_local(const struct sbrec_chassis_table 
*sbrec_chassis_table)
 }
 
 static void
-ovn_datapath_allocate_key(const struct sbrec_chassis_table *sbrec_ch_table,
-  struct hmap *datapaths, struct hmap *dp_tnlids,
+ovn_datapath_allocate_key(struct hmap *datapaths, struct hmap *dp_tnlids,
   struct ovn_datapath *od, uint32_t *hint)
 {
 if (!od->tunnel_key) {
 od->tunnel_key = ovn_allocate_tnlid(dp_tnlids, "datapath",
-OVN_MIN_DP_KEY_LOCAL,
-get_ovn_max_dp_key_local(sbrec_ch_table),
-hint);
+OVN_MIN_DP_KEY_LOCAL,
+get_ovn_max_dp_key_local(),
+hint);
 if (!od->tunnel_key) {
 if (od->sb) {
 sbrec_datapath_binding_delete(od->sb);
@@ -921,7 +931,6 @@ ovn_datapath_allocate_key(const struct sbrec_chassis_table 
*sbrec_ch_table,
 
 static void
 ovn_datapath_assign_requested_tnl_id(
-const struct sbrec_chassis_table *sbrec_chassis_table,
 struct hmap *dp_tnlids, struct ovn_datapath *od)
 {
 const struct smap *other_config = (od->n

[ovs-dev] [PATCH ovn] northd: Add support for disabling vxlan mode.