Re: [ovs-dev] [PATCH] datapath-windows:adjust Offset when processing packet in POP_VLAN action

2021-09-30 Thread alinserdean
Thank you for addressing the comments. I applied the incremental version and
backported it down to 2.7.

Thank you,
Alin.

-Original Message-
From: Wilson Peng  
Sent: Thursday, September 30, 2021 8:02 AM
To: Alin-Gabriel Serdean ; ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] datapath-windows:adjust Offset when processing 
packet in POP_VLAN action

Alin,

Thanks for your comments, I have sent out the updated patch, please help check.
https://patchwork.ozlabs.org/project/openvswitch/patch/20210930045626.9250-1-pweis...@vmware.com/

Regards
Wilson

On 2021/9/29, 23:21, "Alin-Gabriel Serdean" wrote:

Indeed. Thanks for pointing that out.

To be honest that looks like a bug.
We should return an error if it is not a valid VLAN frame.

-Original Message-
From: Wilson Peng  
Sent: Wednesday, September 29, 2021 5:52 PM
To: Alin-Gabriel Serdean ; ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] datapath-windows:adjust Offset when 
processing packet in POP_VLAN action

Alin,

The original code diff is to avoid the case below (it will return
NDIS_STATUS_SUCCESS before RtlMoveMemory in the function OvsPopFieldInPacketBuf).

Regards
Wilson

In function OvsPopFieldInPacketBuf(..)
 if (!bufferData) {
EthHdr *ethHdr = (EthHdr *)bufferStart;
/* If the frame is not VLAN make it a no op */
if (ethHdr->Type != ETH_TYPE_802_1PQ_NBO) {
            return NDIS_STATUS_SUCCESS;   <-- may exit here.
}
}

  RtlMoveMemory(bufferStart + shiftLength, bufferStart, shiftOffset);
  NdisAdvanceNetBufferDataStart(curNb, shiftLength, FALSE, NULL);
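
For reference, a minimal sketch of the stricter behaviour Alin suggests above
(returning an error for non-VLAN frames instead of silently succeeding).
NDIS_STATUS_INVALID_PACKET is only an assumed choice of error code here, not
necessarily what the datapath would use:

    /* Sketch only: reject non-VLAN frames rather than returning success.
     * The error code NDIS_STATUS_INVALID_PACKET is an assumption. */
    if (!bufferData) {
        EthHdr *ethHdr = (EthHdr *)bufferStart;
        if (ethHdr->Type != ETH_TYPE_802_1PQ_NBO) {
            return NDIS_STATUS_INVALID_PACKET;
        }
    }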

On 2021/9/29, 21:55, "dev on behalf of Alin-Gabriel Serdean" wrote:

From: aserd...@ovn.org

Thank you for raising awareness about this issue. You are right, the
offsets are not being updated when a pop vlan action is hit; they are
only updated on pop_mpls.

https://github.com/openvswitch/ovs/blob/master/datapath-windows/ovsext/Actions.c#L1173-L1174
Can you please update the offsets in the corresponding pop vlan
function? I.e.:
---
 datapath-windows/ovsext/Actions.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/datapath-windows/ovsext/Actions.c b/datapath-windows/ovsext/Actions.c
index e130c2f96..34aa5c5e4 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -1141,11 +1141,21 @@ OvsPopVlanInPktBuf(OvsForwardingContext *ovsFwdCtx)
      * Declare a dummy vlanTag structure since we need to compute the size
      * of shiftLength. The NDIS one is a unionized structure.
      */
+    NDIS_STATUS status;
+    OVS_PACKET_HDR_INFO* layers = &ovsFwdCtx->layers;
     NDIS_PACKET_8021Q_INFO vlanTag = {0};
     UINT32 shiftLength = sizeof(vlanTag.TagHeader);
     UINT32 shiftOffset = sizeof(DL_EUI48) + sizeof(DL_EUI48);
 
-    return OvsPopFieldInPacketBuf(ovsFwdCtx, shiftOffset, shiftLength, NULL);
+    status = OvsPopFieldInPacketBuf(ovsFwdCtx, shiftOffset, shiftLength,
+                                    NULL);
+
+    if (status == NDIS_STATUS_SUCCESS) {
+        layers->l3Offset -= (UINT16) shiftLength;
+        layers->l4Offset -= (UINT16) shiftLength;
+    }
+
+    return status;
 }


-- 
2.32.0

___
dev mailing list
d...@openvswitch.org

https://mail.openvswitch.org/mailman/listinfo/ovs-dev




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

2021-09-30 Thread Anton Ivanov

After quickly adding some more prints into the testsuite.

Test 1:

Without

  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1130
  Average (NB in msec): 620.375000
  Maximum (SB in msec): 23
  Average (SB in msec): 21.468759
  Maximum (northd-loop in msec): 6002
  Minimum (northd-loop in msec): 0
  Average (northd-loop in msec): 914.760417
  Long term average (northd-loop in msec): 104.799340

With

  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1148
  Average (NB in msec): 630.25
  Maximum (SB in msec): 24
  Average (SB in msec): 21.468744
  Maximum (northd-loop in msec): 6090
  Minimum (northd-loop in msec): 0
  Average (northd-loop in msec): 762.101565
  Long term average (northd-loop in msec): 80.735192

The metric which actually matters and which SHOULD be measured - long term 
average is better by 20%. Using short term average instead of long term in the 
test suite is actually a BUG.
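
To illustrate the distinction (a standalone sketch, not the test-suite code):
a long term average weighs every iteration equally, while a short term running
average is dominated by the last few samples, so a single slow iteration near
the end skews it badly.  The smoothing factor below is an arbitrary choice for
the example.

    #include <stdio.h>

    int
    main(void)
    {
        /* Loop times in msec; the last one is a deliberate outlier. */
        double samples[] = { 100, 105, 95, 110, 90, 100, 6000 };
        int n = sizeof samples / sizeof samples[0];
        double long_term = 0.0, short_term = 0.0;

        for (int i = 0; i < n; i++) {
            long_term = (long_term * i + samples[i]) / (i + 1);   /* all values */
            short_term = i ? 0.9 * samples[i] + 0.1 * short_term  /* mostly last */
                           : samples[i];
            printf("sample=%6.0f  long-term=%8.2f  short-term=%8.2f\n",
                   samples[i], long_term, short_term);
        }
        return 0;
    }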

Are you running yours under some sort of virtualization?

A.

On 30/09/2021 07:52, Han Zhou wrote:

Thanks Anton for checking. I am using: Intel(R) Core(TM) i9-7920X CPU @ 
2.90GHz, 24 cores.
It is weird why my result is so different. I also verified with a scale test 
script that creates a large scale NB/SB with 800 nodes of simulated k8s setup. 
And then just run:
    ovn-nbctl --print-wait-time --wait=sb sync

Without parallel:
ovn-northd completion: 7807ms

With parallel:
ovn-northd completion: 41267ms

I suspected the hmap size problem but I tried changing the initial size to 64k buckets 
and it didn't help. I will find some time to check the "perf" reports.

Thanks,
Han

On Wed, Sep 29, 2021 at 11:31 PM Anton Ivanov <anton.iva...@cambridgegreys.com> wrote:

On 30/09/2021 07:16, Anton Ivanov wrote:

Results on a Ryzen 5 3600 - 6 cores 12 threads


I will also have a look into the "maximum" measurement for multi-thread.

It does not tie up with the drop in average across the board.

A.



Without


  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1256
  Average (NB in msec): 679.463785
  Maximum (SB in msec): 25
  Average (SB in msec): 22.489798
  Maximum (northd-loop in msec): 1347
  Average (northd-loop in msec): 799.944878

  2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=no
  ---
  Maximum (NB in msec): 1956
  Average (NB in msec): 809.387285
  Maximum (SB in msec): 24
  Average (SB in msec): 21.649258
  Maximum (northd-loop in msec): 2011
  Average (northd-loop in msec): 961.718686

  5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 557
  Average (NB in msec): 474.010337
  Maximum (SB in msec): 15
  Average (SB in msec): 13.927192
  Maximum (northd-loop in msec): 1261
  Average (northd-loop in msec): 580.999122

  6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=no
  ---
  Maximum (NB in msec): 756
  Average (NB in msec): 625.614724
  Maximum (SB in msec): 15
  Average (SB in msec): 14.181048
  Maximum (northd-loop in msec): 1649
  Average (northd-loop in msec): 746.208332


With

  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1140
  Average (NB in msec): 631.125000
  Maximum (SB in msec): 24
  Average (SB in msec): 21.453609
  Maximum (northd-loop in msec): 6080
  Average (northd-loop in msec): 759.718815

  2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=no
  ---
  Maximum (NB in msec): 1210
  Average (NB in msec): 673.00
  Maximum (SB in msec): 27
  Average (SB in msec): 22.453125
  Maximum (northd-loop in msec): 6514
  Average (northd-loop in msec): 808.596842

  5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 798
  Average (NB in msec): 429.75
  Maximum (SB in msec): 15
  Average (SB in msec): 12.998533
  Maximum (northd-loop in msec): 3835
  Average (northd-loop in msec): 564.875986

  6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=no
  ---
  Maximum (NB in msec): 1074
  Average (NB in msec): 593.875000
  Maximum (SB in msec): 14
  Average (SB in msec): 13.655273
  Maximum (northd-loop in msec): 4973
  Average (

[ovs-dev] northd-ddlog slowness when adding the first regular LSP to a LS full of router ports

2021-09-30 Thread Han Zhou
Hi Ben,

I understand that we have difficulties with northd-ddlog progress, but I
still want to try it before it goes away. I tested with the latest version,
and it is super fast for most of the operations. With a large NB & SB
created, the C northd takes ~8 seconds for any change computation. For the
same DB, northd-ddlog usually takes less than 1 sec for most operations.

However, I did find an interesting problem. I tried to create one more LSP
on a LS that already has 800 gateway-routers connected to it, which means
there are already 800 LSPs on the LS of the type "router". Creating the
extra LSP (without type) took 12 sec, which is even longer than a full
compute of the C version. What's more interesting is, when I create another
LSP on the same LS, it takes only 100+ms, and same for the 3rd, 4th LSPs,
etc. When I remove any one of the extra LSPs I created, it is also fast,
just 100+ms. But when I remove the last LSP that I just created it takes 12
sec again. Then I tried creating a LSP with type=router on the same LS, it
is very fast, less than 100ms. Basically, only creating the first
non-router LSP or removing the last non-router LSP takes a very long time.

I haven't debugged yet (and not sure if I am capable of), but I think it
might be useful to report it first.

Thanks,
Han
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

2021-09-30 Thread Han Zhou
On Thu, Sep 30, 2021 at 12:08 AM Anton Ivanov <
anton.iva...@cambridgegreys.com> wrote:

> After quickly adding some more prints into the testsuite.
>
> Test 1:
>
> Without
>
>   1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>   ---
>   Maximum (NB in msec): 1130
>   Average (NB in msec): 620.375000
>   Maximum (SB in msec): 23
>   Average (SB in msec): 21.468759
>   Maximum (northd-loop in msec): 6002
>   Minimum (northd-loop in msec): 0
>   Average (northd-loop in msec): 914.760417
>   Long term average (northd-loop in msec): 104.799340
>
> With
>
>   1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>   ---
>   Maximum (NB in msec): 1148
>   Average (NB in msec): 630.25
>   Maximum (SB in msec): 24
>   Average (SB in msec): 21.468744
>   Maximum (northd-loop in msec): 6090
>   Minimum (northd-loop in msec): 0
>   Average (northd-loop in msec): 762.101565
>   Long term average (northd-loop in msec): 80.735192
>
> The metric which actually matters and which SHOULD be measured - long term
> average is better by 20%. Using short term average instead of long term in
> the test suite is actually a BUG.
>
Good catch!


> Are you running yours under some sort of virtualization?
>

No, I am testing on a bare-metal.


> A.
> On 30/09/2021 07:52, Han Zhou wrote:
>
> Thanks Anton for checking. I am using: Intel(R) Core(TM) i9-7920X CPU @
> 2.90GHz, 24 cores.
> It is weird why my result is so different. I also verified with a scale
> test script that creates a large scale NB/SB with 800 nodes of simulated
> k8s setup. And then just run:
> ovn-nbctl --print-wait-time --wait=sb sync
>
> Without parallel:
> ovn-northd completion: 7807ms
>
> With parallel:
> ovn-northd completion: 41267ms
>
> I suspected the hmap size problem but I tried changing the initial size to
> 64k buckets and it didn't help. I will find some time to check the "perf"
> reports.
>
> Thanks,
> Han
>
> On Wed, Sep 29, 2021 at 11:31 PM Anton Ivanov <
> anton.iva...@cambridgegreys.com> wrote:
>
>> On 30/09/2021 07:16, Anton Ivanov wrote:
>>
>> Results on a Ryzen 5 3600 - 6 cores 12 threads
>>
>> I will also have a look into the "maximum" measurement for multi-thread.
>>
>> It does not tie up with the drop in average across the board.
>>
>> A.
>>
>>
>> Without
>>
>>
>>   1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>   ---
>>   Maximum (NB in msec): 1256
>>   Average (NB in msec): 679.463785
>>   Maximum (SB in msec): 25
>>   Average (SB in msec): 22.489798
>>   Maximum (northd-loop in msec): 1347
>>   Average (northd-loop in msec): 799.944878
>>
>>   2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>   ---
>>   Maximum (NB in msec): 1956
>>   Average (NB in msec): 809.387285
>>   Maximum (SB in msec): 24
>>   Average (SB in msec): 21.649258
>>   Maximum (northd-loop in msec): 2011
>>   Average (northd-loop in msec): 961.718686
>>
>>   5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>   ---
>>   Maximum (NB in msec): 557
>>   Average (NB in msec): 474.010337
>>   Maximum (SB in msec): 15
>>   Average (SB in msec): 13.927192
>>   Maximum (northd-loop in msec): 1261
>>   Average (northd-loop in msec): 580.999122
>>
>>   6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>   ---
>>   Maximum (NB in msec): 756
>>   Average (NB in msec): 625.614724
>>   Maximum (SB in msec): 15
>>   Average (SB in msec): 14.181048
>>   Maximum (northd-loop in msec): 1649
>>   Average (northd-loop in msec): 746.208332
>>
>>
>> With
>>
>>   1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>   ---
>>   Maximum (NB in msec): 1140
>>   Average (NB in msec): 631.125000
>>   Maximum (SB in msec): 24
>>   Average (SB in msec): 21.453609
>>   Maximum (northd-loop in msec): 6080
>>   Average (northd-loop in msec): 759.718815
>>
>>   2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=no
>>   ---
>>   Maximum (NB in msec): 1210
>>   Average (NB in msec): 673.00
>>   Maximum (SB in msec): 27
>>   Average (SB in msec): 22.453125
>>   Maximum (northd-loop in msec): 6514
>>   Average (northd-loop in msec): 808.596842
>>
>>   5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>   ---
>>   Maximum (NB in msec): 798
>>   Average (NB in msec): 429.75
>>   Maximum (SB in msec): 15
>>   Average (SB in msec): 12.998533
>>   Maximum (northd-loop in msec): 3835
>>   Average (northd-loop in msec): 564.875986
>>
>>   6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical
>> Ports/Hypervisor -- ovn-northd -- 

Re: [ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

2021-09-30 Thread Anton Ivanov

OK,

I can dig into this later this afternoon.

There is quite a bit of dispersion in tests without parallelization on my 
system which should not be there.

I want to get down to the bottom of where it is coming from and why we are
getting different results compared to ovn-heater.

I did all the original tests with ovn-heater and they were consistently 5-10% 
better end-to-end with parallelization enabled.

As for the worker threads never reaching 100% and the northd thread regularly
being at 100%, that is unfortunately how it is. Large sections of northd
cannot be parallelized at present. The only bit which can be run in parallel
is lflow compute.

Generation of datapaths, ports, groups - everything before the lflows - cannot
be parallelized, and it is compute heavy.

Post-processing of flows once they have been generated - hash recompute,
reconciliation of databases, etc. - cannot be parallelized at present. Some of
it could be run in parallel if there were parallel macros in the OVS source,
but that would likely give only a marginal effect on performance - 1-2% at
most (see the quick sketch below).
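
The 1-2% estimate is essentially Amdahl's law: when only a small fraction of
the loop can run in parallel, the whole-loop gain stays small.  A tiny
standalone sketch (the fractions and thread count are made-up numbers for
illustration, not measurements of northd):

    #include <stdio.h>

    /* Amdahl's law: overall speedup when a fraction p of the work is
     * parallelized across n threads. */
    static double
    amdahl(double p, int n)
    {
        return 1.0 / ((1.0 - p) + p / n);
    }

    int
    main(void)
    {
        /* e.g. ~2% of the loop parallelizable over 4 threads. */
        printf("p=0.02, n=4 -> %.3fx (~%.1f%% faster)\n",
               amdahl(0.02, 4), (amdahl(0.02, 4) - 1.0) * 100);
        /* vs. a much larger parallelizable fraction such as lflow compute. */
        printf("p=0.60, n=4 -> %.3fx\n", amdahl(0.60, 4));
        return 0;
    }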

Best Regards,

A.

On 30/09/2021 08:26, Han Zhou wrote:



On Thu, Sep 30, 2021 at 12:08 AM Anton Ivanov <anton.iva...@cambridgegreys.com> wrote:

After quickly adding some more prints into the testsuite.

Test 1:

Without

  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1130
  Average (NB in msec): 620.375000
  Maximum (SB in msec): 23
  Average (SB in msec): 21.468759
  Maximum (northd-loop in msec): 6002
  Minimum (northd-loop in msec): 0
  Average (northd-loop in msec): 914.760417
  Long term average (northd-loop in msec): 104.799340

With

  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1148
  Average (NB in msec): 630.25
  Maximum (SB in msec): 24
  Average (SB in msec): 21.468744
  Maximum (northd-loop in msec): 6090
  Minimum (northd-loop in msec): 0
  Average (northd-loop in msec): 762.101565
  Long term average (northd-loop in msec): 80.735192

The metric which actually matters and which SHOULD be measured - long term 
average is better by 20%. Using short term average instead of long term in the 
test suite is actually a BUG.

Good catch!

Are you running yours under some sort of virtualization?


No, I am testing on a bare-metal.

A.

On 30/09/2021 07:52, Han Zhou wrote:

Thanks Anton for checking. I am using: Intel(R) Core(TM) i9-7920X CPU @ 
2.90GHz, 24 cores.
It is weird why my result is so different. I also verified with a scale 
test script that creates a large scale NB/SB with 800 nodes of simulated k8s 
setup. And then just run:
    ovn-nbctl --print-wait-time --wait=sb sync

Without parallel:
ovn-northd completion: 7807ms

With parallel:
ovn-northd completion: 41267ms

I suspected the hmap size problem but I tried changing the initial size to 64k 
buckets and it didn't help. I will find some time to check the "perf" reports.

Thanks,
Han

On Wed, Sep 29, 2021 at 11:31 PM Anton Ivanov <anton.iva...@cambridgegreys.com> wrote:

On 30/09/2021 07:16, Anton Ivanov wrote:

Results on a Ryzen 5 3600 - 6 cores 12 threads


I will also have a look into the "maximum" measurement for multi-thread.

It does not tie up with the drop in average across the board.

A.



Without


  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1256
  Average (NB in msec): 679.463785
  Maximum (SB in msec): 25
  Average (SB in msec): 22.489798
  Maximum (northd-loop in msec): 1347
  Average (northd-loop in msec): 799.944878

  2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=no
  ---
  Maximum (NB in msec): 1956
  Average (NB in msec): 809.387285
  Maximum (SB in msec): 24
  Average (SB in msec): 21.649258
  Maximum (northd-loop in msec): 2011
  Average (northd-loop in msec): 961.718686

  5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 557
  Average (NB in msec): 474.010337
  Maximum (SB in msec): 15
  Average (SB in msec): 13.927192
  Maximum (northd-loop in msec): 1261
  Average (northd-loop in msec): 580.999122

  6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=no
  ---
  Maximum (NB in msec): 756
  Average (NB in msec): 

[ovs-dev] [PATCH v1] dpdk-stub: Change the ERR log to DBG.

2021-09-30 Thread Kumar Amber
When DPDK is not available and the avx512 code probes for ISA support,
the function logs an error, which results in unit test failures.

By logging at debug level instead, the message still shows up in the
vswitchd.log file, but won't fail unit tests for builds that do not
have DPDK built in.

Suggested-by: Ilya Maximets 
Signed-off-by: Kumar Amber 
---
 lib/dpdk-stub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
index b7d577870..fe24f9abd 100644
--- a/lib/dpdk-stub.c
+++ b/lib/dpdk-stub.c
@@ -83,7 +83,7 @@ bool
 dpdk_get_cpu_has_isa(const char *arch OVS_UNUSED,
  const char *feature OVS_UNUSED)
 {
-VLOG_ERR_ONCE("DPDK not supported in this version of Open vSwitch, "
+VLOG_DBG_ONCE("DPDK not supported in this version of Open vSwitch, "
   "cannot use CPU flag based optimizations");
 return false;
 }
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Documentation: Change the address in userspace-tunneling.rst

2021-09-30 Thread Ferriter, Cian



> -Original Message-
> From: dev  On Behalf Of Paolo Valerio
> Sent: Wednesday 29 September 2021 21:43
> To: d...@openvswitch.org
> Cc: f...@redhat.com; i.maxim...@ovn.org
> Subject: [ovs-dev] [PATCH] Documentation: Change the address in 
> userspace-tunneling.rst
> 
> Fixes: b438493e1b03 ("doc: Add DPDK to userspace tunneling guide")
> Signed-off-by: Paolo Valerio 
> ---
>  Documentation/howto/userspace-tunneling.rst |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/howto/userspace-tunneling.rst 
> b/Documentation/howto/userspace-tunneling.rst
> index 4e23b2e0c..e96b79985 100644
> --- a/Documentation/howto/userspace-tunneling.rst
> +++ b/Documentation/howto/userspace-tunneling.rst
> @@ -175,7 +175,7 @@ If the tunnel route is missing, adding it now::
> 
>  $ ovs-appctl ovs/route/add 172.168.1.1/24 br-phy
> 
> -Repeat these steps if necessary for `host2`, but using ``192.168.1.1`` and
> +Repeat these steps if necessary for `host2`, but using ``192.168.1.2`` and
>  ``172.168.1.2`` for the VM and tunnel interface IP addresses, respectively.
> 
>  Testing
> 

Hi Paolo,

I'm just wondering if you've seen my patch changing these lines in the docs:
http://patchwork.ozlabs.org/project/openvswitch/patch/20210922120033.39221-1-cian.ferri...@intel.com/

Thanks,
Cian
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Documentation: Change the address in userspace-tunneling.rst

2021-09-30 Thread Paolo Valerio
"Ferriter, Cian"  writes:

>> -Original Message-
>> From: dev  On Behalf Of Paolo Valerio
>> Sent: Wednesday 29 September 2021 21:43
>> To: d...@openvswitch.org
>> Cc: f...@redhat.com; i.maxim...@ovn.org
>> Subject: [ovs-dev] [PATCH] Documentation: Change the address in 
>> userspace-tunneling.rst
>> 
>> Fixes: b438493e1b03 ("doc: Add DPDK to userspace tunneling guide")
>> Signed-off-by: Paolo Valerio 
>> ---
>>  Documentation/howto/userspace-tunneling.rst |2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/Documentation/howto/userspace-tunneling.rst 
>> b/Documentation/howto/userspace-tunneling.rst
>> index 4e23b2e0c..e96b79985 100644
>> --- a/Documentation/howto/userspace-tunneling.rst
>> +++ b/Documentation/howto/userspace-tunneling.rst
>> @@ -175,7 +175,7 @@ If the tunnel route is missing, adding it now::
>> 
>>  $ ovs-appctl ovs/route/add 172.168.1.1/24 br-phy
>> 
>> -Repeat these steps if necessary for `host2`, but using ``192.168.1.1`` and
>> +Repeat these steps if necessary for `host2`, but using ``192.168.1.2`` and
>>  ``172.168.1.2`` for the VM and tunnel interface IP addresses, respectively.
>> 
>>  Testing
>> 
>
> Hi Paolo,
>
> I'm just wondering if you've seen my patch changing these lines in the docs:
> http://patchwork.ozlabs.org/project/openvswitch/patch/20210922120033.39221-1-cian.ferri...@intel.com/

Hi Cian,

no, I missed it. I'll drop the patch.

Paolo

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] docs/userspace-tunneling: Fix IP addresses for host2.

2021-09-30 Thread Paolo Valerio
Cian Ferriter  writes:

> The IP addresses being recommended for the VM interface and the
> "remote_ip" on the tunnel port are wrong. The host1 values were being
> used before. Update to use the host2 values.
>
> Signed-off-by: Cian Ferriter 
>

Acked-by: Paolo Valerio 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] OVS-DPDK source MAC address corruption upon MPLS decapsulation

2021-09-30 Thread Rohit Kumar via dev

Hi Ovs-dev,

I am trying to do some testing of MPLS decapsulation in OVS-DPDK
(compiled with the MPLS encap/decap patch), and I am running into an issue
where, upon decapsulation of the MPLS headers, the two middle bytes of the
source MAC in the inner ethernet header are overwritten with
0x6558 (65:58).

The versions we've been using are:
- openvswitch-2.14.0 (also seen using 2.14.1 and 2.14.2)
- dpdk-19.11.6
- MPLS patch
(https://patchwork.ozlabs.org/project/openvswitch/patch/20210326062146.56336-1-martinvargheseno...@gmail.com/)

This is the input packet:

  Ethernet II, Src: fa:16:3e:f6:bd:6d (fa:16:3e:f6:bd:6d), Dst:
fa:03:33:50:03:43 (fa:03:33:50:03:43)
  Destination: fa:03:33:50:03:43 (fa:03:33:50:03:43)
  Source: fa:16:3e:f6:bd:6d (fa:16:3e:f6:bd:6d)
  Type: 802.1Q Virtual LAN (0x8100)
  802.1Q Virtual LAN, PRI: 0, CFI: 0, ID: 3010
  000.    = Priority: 0
  ...0    = CFI: 0
   1011 1100 0010 = ID: 3010
  Type: MPLS label switched packet (0x8847)
  MultiProtocol Label Switching Header, Label: 403, Exp: 7, S: 0,
TTL: 63
  MPLS Label: 403
  MPLS Experimental Bits: 7
  MPLS Bottom Of Label Stack: 0
  MPLS TTL: 63
  MultiProtocol Label Switching Header, Label: 589824, Exp: 7, S:
1, TTL: 63
  MPLS Label: 589824
  MPLS Experimental Bits: 7
  MPLS Bottom Of Label Stack: 1
  MPLS TTL: 63
  Ethernet II, Src: f6:33:6c:8d:e1:16 (f6:33:6c:8d:e1:16), Dst:
c6:18:76:53:96:a2 (c6:18:76:53:96:a2)
  Destination: c6:18:76:53:96:a2 (c6:18:76:53:96:a2)
  Source: f6:33:6c:8d:e1:16 (f6:33:6c:8d:e1:16)
  Type: 802.1Q Virtual LAN (0x8100)
  802.1Q Virtual LAN, PRI: 0, CFI: 0, ID: 14
  000.    = Priority: 0
  ...0    = CFI: 0
     1110 = ID: 14
  Type: 802.1Q Virtual LAN (0x8100)
  802.1Q Virtual LAN, PRI: 0, CFI: 0, ID: 100
  000.    = Priority: 0
  ...0    = CFI: 0
    0110 0100 = ID: 100
  Type: ARP (0x0806)
  Address Resolution Protocol (reply)
  Hardware type: Ethernet (0x0001)
  Protocol type: IP (0x0800)
  Hardware size: 6
  Protocol size: 4
  Opcode: reply (0x0002)
  [Is gratuitous: False]
  Sender MAC address: f6:33:6c:8d:e1:16 (f6:33:6c:8d:e1:16)
  Sender IP address: 1.1.1.1 (1.1.1.1)
  Target MAC address: c6:18:76:53:96:a2 (c6:18:76:53:96:a2)
  Target IP address: 1.2.3.4 (1.2.3.4)


This is the output packet (with source MAC address corruption):

  Ethernet II, Src: f6:33:65:58:e1:16 (f6:33:65:58:e1:16), Dst:
c6:18:76:53:96:a2 (c6:18:76:53:96:a2)
  Destination: c6:18:76:53:96:a2 (c6:18:76:53:96:a2)
  Source: f6:33:65:58:e1:16 (f6:33:65:58:e1:16)
  Type: 802.1Q Virtual LAN (0x8100)
  802.1Q Virtual LAN, PRI: 0, CFI: 0, ID: 14
  000.    = Priority: 0
  ...0    = CFI: 0
     1110 = ID: 14
  Type: 802.1Q Virtual LAN (0x8100)
  802.1Q Virtual LAN, PRI: 0, CFI: 0, ID: 100
  000.    = Priority: 0
  ...0    = CFI: 0
    0110 0100 = ID: 100
  Type: ARP (0x0806)
  Address Resolution Protocol (reply)
  Hardware type: Ethernet (0x0001)
  Protocol type: IP (0x0800)
  Hardware size: 6
  Protocol size: 4
  Opcode: reply (0x0002)
  [Is gratuitous: False]
  Sender MAC address: f6:33:6c:8d:e1:16 (f6:33:6c:8d:e1:16)
  Sender IP address: 1.1.1.1 (1.1.1.1)
  Target MAC address: c6:18:76:53:96:a2 (c6:18:76:53:96:a2)
  Target IP address: 1.2.3.4 (1.2.3.4)
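
As an aside, the corrupted bytes in the output dump are exactly 0x65 0x58,
i.e. 0x6558, the EtherType used for Transparent Ethernet Bridging, and they
land on bytes 2-3 of the inner source MAC.  A small standalone sketch that
only reproduces that arithmetic on the addresses above (it makes no claim
about where in the datapath the stray write actually happens):

    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        /* Inner Ethernet header from the input packet above. */
        uint8_t hdr[14] = {
            0xc6, 0x18, 0x76, 0x53, 0x96, 0xa2,  /* dst c6:18:76:53:96:a2 */
            0xf6, 0x33, 0x6c, 0x8d, 0xe1, 0x16,  /* src f6:33:6c:8d:e1:16 */
            0x81, 0x00                           /* 802.1Q */
        };
        uint16_t teb = 0x6558;  /* Transparent Ethernet Bridging EtherType */

        /* A 2-byte big-endian write at offset 8 (source MAC bytes 2-3)
         * reproduces the corruption seen above: f6:33:65:58:e1:16. */
        hdr[8] = teb >> 8;
        hdr[9] = teb & 0xff;

        printf("src MAC after stray write: %02x:%02x:%02x:%02x:%02x:%02x\n",
               hdr[6], hdr[7], hdr[8], hdr[9], hdr[10], hdr[11]);
        return 0;
    }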

Flow rules:
ovs-ofctl -O Openflow15 add-flow $mpls "cookie=0x10,table=0,
priority=15,in_port=2,dl_type=0x8847 actions=resubmit(,30)"
ovs-ofctl -O Openflow15 add-flow $mpls
"cookie=0x10,table=30,priority=15,dl_vlan=3010
actions=pop_vlan,resubmit(,31)"
ovs-ofctl -O Openflow15 add-flow $mpls
"cookie=0x10,table=31,priority=15,dl_type=0x8847,mpls_label=403
actions=decap(),decap(packet_type(ns=0,type=0)),output:br-mpls"

#ovs-ofctl dump-flows br-mpls -O Openflow15
cookie=0x10, duration=19.149s, table=30, n_packets=20, n_bytes=2080,
idle_age=0, priority=15,dl_vlan=3010 actions=pop_vlan,resubmit(,31)
cookie=0x10, duration=19.116s, table=31, n_packets=20, n_bytes=2080,
idle_age=0, priority=15,mpls,mpls_label=403
actions=decap(),decap(packet_type(ns=0,type=0)),LOCAL

#ovs-appctl dpif/dump-flows br-mpls
recirc_id(0x1a),in_port(4),packet_type(ns=0,id=0),eth(dst=c6:18:76:53:96:a2),eth_type(0x8100),vlan(vid=14,pcp=0),encap(eth_type(0x8100)),
packets:81, bytes:4722, used:0.163s, actions:3
recirc_id(0),in_port(4),packet_type(ns=0,id=0),eth(dst=8a:fe:76:13:2a:49/01:00:00:00:00:00),eth_type(0x8100),vlan(vid=3010,p

Re: [ovs-dev] Unit Test Failure Report to OVS ML

2021-09-30 Thread Amber, Kumar
Hi Ilya,

Pls find the replies inline.

> -Original Message-
> From: Ilya Maximets 
> Sent: Tuesday, September 28, 2021 6:22 PM
> To: Amber, Kumar ; Ilya Maximets
> ; ovs-dev@openvswitch.org;
> tony.vanderp...@alliedtelesis.co.nz
> Cc: Stokes, Ian ; Van Haaren, Harry
> 
> Subject: Re: Unit Test Failure Report to OVS ML
> 
> On 9/28/21 13:30, Amber, Kumar wrote:
> > Hi llya,
> >
> > The test-case fails with the following build command on the master branch.
> >
> > Pass:
> > ./configure --with-dpdk=static CFLAGS=""
> > Fails:
> > ./configure --with-dpdk=static CFLAGS="-msse4.2"
> >
> > Testing on ovs-master branch, running test case like this $ make check
> > TESTSUITEFLAGS="779"
> >
> > Based on the above build tests it was identified that "-msse4.2" is causing 
> > the
> unit-test to fail. Note that OVS changes its hashing implementation based on
> SSE4.2 being available at compile-time. This switches between murmur hash SW
> implementation, and SSE CRC32 instruction.
> >
> > This test seems to expect *a specific value* of a hash result, causing it to
> pass/fail based on hashing implementation selected at ./configure time.
> 
> Hi.  Thanks for details.  That make sense.
> 
> Could you try following patch:
> 
> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at index
> 12fc1ef91..cf021c0cc 100644
> --- a/tests/tunnel-push-pop.at
> +++ b/tests/tunnel-push-pop.at
> @@ -628,20 +628,22 @@ AT_CHECK([
>  AT_CHECK([ovs-vsctl -- set Interface p0 options:tx_pcap=p0.pcap])
> 
>  packet=5054000a50540009123
> -
> encap=f8bc124434b6aa55aa55080045324000401134060101025
> 80101025c83a917c1001e65587b00
> +dnl Source port is based on a packet hash, so it might be different
> +depending dnl on compiler flags and CPU.  Masked with ''.
> +encap=f8bc124434b6aa55aa5508004532400040113406010102
> 5801010
> +25c17c1001e65587b00
> 
>  dnl Output to tunnel from a int-br internal port.
>  dnl Checking that the packet arrived and it was correctly encapsulated.
>  AT_CHECK([ovs-ofctl add-flow int-br
> "in_port=LOCAL,actions=debug_slow,output:2"])
>  AT_CHECK([ovs-appctl netdev-dummy/receive int-br "${packet}4"]) -
> OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep "${encap}${packet}4" | wc -l` -
> ge 1])
> +OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | egrep "${encap}${packet}4" |
> +wc -l` -ge 1])
>  dnl Sending again to exercise the non-miss upcall path.
>  AT_CHECK([ovs-appctl netdev-dummy/receive int-br "${packet}4"]) -
> OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep "${encap}${packet}4" | wc -l` -
> ge 2])
> +OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | egrep "${encap}${packet}4" |
> +wc -l` -ge 2])
> 
>  dnl Output to tunnel from the controller.
>  AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out int-br CONTROLLER
> "debug_slow,output:2" "${packet}5"]) -OVS_WAIT_UNTIL([test `ovs-pcap
> p0.pcap | grep "${encap}${packet}5" | wc -l` -ge 1])
> +OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | egrep "${encap}${packet}5" |
> +wc -l` -ge 1])
> 
>  dnl Datapath actions should not have tunnel push action.
>  AT_CHECK([ovs-appctl dpctl/dump-flows | grep -q tnl_push], [1])
> ---
> 
> I can send it formally once I'm back from PTO.
> 
The patch works for the "-msse4.2" flag and now the test-case passes for all 
the builds.

> One other thing that I noticed is that all tests are failing with '-msse4.2'
> and without dpdk due to extra warnings from an AVX512 code.  I think we need
> to lower them down to DBG, as they are not very useful, but breaks tests and
> make issues like this drain in the flood of test failures.
> If you can fix that, that would be great.
> 
The fix has been pushed to patchwork:
 http://patchwork.ozlabs.org/project/openvswitch/list/?series=264765

Regards
Amber

> Best regards, Ilya Maximets.
> 
> >
> > Regards
> > Amber
> >
> >> -Original Message-
> >> From: Ilya Maximets 
> >> Sent: Tuesday, September 21, 2021 7:00 PM
> >> To: Amber, Kumar ; Ilya Maximets
> >> ; ovs-dev@openvswitch.org;
> >> tony.vanderp...@alliedtelesis.co.nz
> >> Cc: Stokes, Ian ; Van Haaren, Harry
> >> 
> >> Subject: Re: Unit Test Failure Report to OVS ML
> >>
> >> On 9/21/21 14:51, Amber, Kumar wrote:
> >>> Hi Ilya,
> >>>
> >>> The Test-case failure is not related to AVX512 or any patches we are
> >>> directly
> >> failing on "master" latest of OVS with no patches on top of it.
> >>> I am still trying to figure out or root cause the issue, we tested
> >>> the master on 4
> >> different servers, and all fails on the same test-case.
> >>
> >> This sounds very weird.  How do you build it?
> >>
> >>>
> >>> Regards
> >>> Amber
> >>>
>  -Original Message-
>  From: Ilya Maximets 
>  Sent: Monday, September 20, 2021 5:05 PM
>  To: Amber, Kumar ; ovs-dev@openvswitch.org;
>  i.maxim...@ovn.org; tony.vanderp...@alliedtelesis.co.nz
>  Cc: Stokes, Ian ; Van Haaren, Harry
>  
>  Subject: Re: Unit Test Failure Report to OVS ML
> 
>  On 9/

[ovs-dev] [PATCH v1 0/1] DPCLS

2021-09-30 Thread Kumar Amber
The patch depends on the dpcls-dpif test-case patch:
http://patchwork.ozlabs.org/project/openvswitch/patch/20210902172346.2389795-1-kumar.am...@intel.com/

Kumar Amber (1):
  dpcls: Change info-get function to fetch dpcls usage stats.

 Documentation/topics/dpdk/bridge.rst | 16 +++---
 lib/dpif-netdev-lookup.c | 80 ++--
 lib/dpif-netdev-lookup.h | 19 ++-
 lib/dpif-netdev.c| 30 +--
 tests/pmd.at | 16 +++---
 5 files changed, 111 insertions(+), 50 deletions(-)

-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v1 1/1] dpcls: Change info-get function to fetch dpcls usage stats.

2021-09-30 Thread Kumar Amber
Modified the dpcls info-get command output to include
the count for different dpcls implementations.

$ ovs-appctl dpif-netdev/subtable-lookup-prio-get

Available dpcls implementations:
  autovalidator (Use count: 1, Priority: 5)
  generic (Use count: 0, Priority: 1)
  avx512_gather (Use count: 0, Priority: 3)

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 
---
 Documentation/topics/dpdk/bridge.rst | 16 +++---
 lib/dpif-netdev-lookup.c | 80 ++--
 lib/dpif-netdev-lookup.h | 19 ++-
 lib/dpif-netdev.c| 30 +--
 tests/pmd.at | 16 +++---
 5 files changed, 111 insertions(+), 50 deletions(-)

diff --git a/Documentation/topics/dpdk/bridge.rst 
b/Documentation/topics/dpdk/bridge.rst
index f645b9ade..63a54da1c 100644
--- a/Documentation/topics/dpdk/bridge.rst
+++ b/Documentation/topics/dpdk/bridge.rst
@@ -156,10 +156,10 @@ OVS provides multiple implementations of dpcls. The 
following command enables
 the user to check what implementations are available in a running instance ::
 
 $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
-Available lookup functions (priority : name)
-0 : autovalidator
-1 : generic
-0 : avx512_gather
+Available dpcls implementations:
+autovalidator (Use count: 1, Priority: 5)
+generic (Use count: 0, Priority: 1)
+avx512_gather (Use count: 0, Priority: 3)
 
 To set the priority of a lookup function, run the ``prio-set`` command ::
 
@@ -172,10 +172,10 @@ function due to the command being run. To verify the 
prioritization, re-run the
 get command, note the updated priority of the ``avx512_gather`` function ::
 
 $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
-Available lookup functions (priority : name)
-0 : autovalidator
-1 : generic
-5 : avx512_gather
+Available dpcls implementations:
+autovalidator (Use count: 0, Priority: 0)
+generic (Use count: 0, Priority: 0)
+avx512_gather (Use count: 1, Priority: 5)
 
 If two lookup functions have the same priority, the first one in the list is
 chosen, and the 2nd occurance of that priority is not used. Put in logical
diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c
index bd0a99abe..c0b137299 100644
--- a/lib/dpif-netdev-lookup.c
+++ b/lib/dpif-netdev-lookup.c
@@ -36,18 +36,21 @@ static struct dpcls_subtable_lookup_info_t 
subtable_lookups[] = {
 { .prio = 0,
 #endif
   .probe = dpcls_subtable_autovalidator_probe,
-  .name = "autovalidator", },
+  .name = "autovalidator",
+  .usage_cnt = 0,},
 
 /* The default scalar C code implementation. */
 { .prio = 1,
   .probe = dpcls_subtable_generic_probe,
-  .name = "generic", },
+  .name = "generic",
+  .usage_cnt = 0, },
 
 #if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__)
 /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */
 { .prio = 0,
   .probe = dpcls_subtable_avx512_gather_probe,
-  .name = "avx512_gather", },
+  .name = "avx512_gather",
+  .usage_cnt = 0,},
 #else
 /* Disabling AVX512 at compile time, as compile time requirements not met.
  * This could be due to a number of reasons:
@@ -93,32 +96,79 @@ dpcls_subtable_set_prio(const char *name, uint8_t priority)
 }
 
 dpcls_subtable_lookup_func
-dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count)
+dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
+ dpcls_subtable_lookup_func old_func,
+ uint32_t will_use_result)
 {
 /* Iter over each subtable impl, and get highest priority one. */
 int32_t prio = -1;
+uint32_t i;
 const char *name = NULL;
+uint32_t best_idx = 0;
+uint32_t usage_cnt = 0;
 dpcls_subtable_lookup_func best_func = NULL;
 
-for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
+for (i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
 int32_t probed_prio = subtable_lookups[i].prio;
+dpcls_subtable_lookup_func probed_func;
+probed_func = subtable_lookups[i].probe(u0_bit_count,
+u1_bit_count);
+if (!probed_func) {
+continue;
+}
+
+/* Better candidate - track this to return it later. */
 if (probed_prio > prio) {
-dpcls_subtable_lookup_func probed_func;
-probed_func = subtable_lookups[i].probe(u0_bit_count,
-u1_bit_count);
-if (probed_func) {
-best_func = probed_func;
-prio = probed_prio;
-name = subtable_lookups[i].name;
-}
+best_func = probed_func;
+best_idx = i;
+prio = probed_prio;
+ 

Re: [ovs-dev] [PATCH v1 1/1] dpcls: Change info-get function to fetch dpcls usage stats.

2021-09-30 Thread 0-day Robot
Bleep bloop.  Greetings Kumar Amber, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0001 dpcls: Change info-get function to fetch dpcls usage stats.
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Patch skipped due to previous failure.

Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

2021-09-30 Thread Anton Ivanov

Summary of findings.

1. The numbers on the perf test do not align with heater which is much closer 
to a realistic load. On some tests where heater gives 5-10% end-to-end 
improvement with parallelization we get worse results with the perf-test. You 
spotted this one correctly.

Example of the northd average pulled out of the test report via grep and sed.

   127.489353
   131.509458
   116.088205
   94.721911
   119.629756
   114.896258
   124.811069
   129.679160
   106.699905
   134.490338
   112.106713
   135.957658
   132.47
   94.106849
   117.431450
   115.861592
   106.830657
   132.396905
   107.092542
   128.945760
   94.298464
   120.455510
   136.910426
   134.311765
   115.881292
   116.918458

These values are all over the place - this is not a reproducible test.

2. In the present state you need to re-run it > 30+ times and take an average. The 
standard deviation for the values for the northd loop is > 10%. Compared to that 
the reproducibility of ovn-heater is significantly better. I usually get less than 
0.5% difference between runs if there was no iteration failures. I would suggest 
using that instead if you want to do performance comparisons until we have figured 
out what affects the perf-test.

3. It is using the short term running average value in reports which is 
probably wrong because you have very significant skew from the last several 
values.

I will look into all of these.

Brgds,

On 30/09/2021 08:26, Han Zhou wrote:



On Thu, Sep 30, 2021 at 12:08 AM Anton Ivanov mailto:anton.iva...@cambridgegreys.com>> wrote:

After quickly adding some more prints into the testsuite.

Test 1:

Without

  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1130
  Average (NB in msec): 620.375000
  Maximum (SB in msec): 23
  Average (SB in msec): 21.468759
  Maximum (northd-loop in msec): 6002
  Minimum (northd-loop in msec): 0
  Average (northd-loop in msec): 914.760417
  Long term average (northd-loop in msec): 104.799340

With

  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1148
  Average (NB in msec): 630.25
  Maximum (SB in msec): 24
  Average (SB in msec): 21.468744
  Maximum (northd-loop in msec): 6090
  Minimum (northd-loop in msec): 0
  Average (northd-loop in msec): 762.101565
  Long term average (northd-loop in msec): 80.735192

The metric which actually matters and which SHOULD be measured - long term 
average is better by 20%. Using short term average instead of long term in the 
test suite is actually a BUG.

Good catch!

Are you running yours under some sort of virtualization?


No, I am testing on a bare-metal.

A.

On 30/09/2021 07:52, Han Zhou wrote:

Thanks Anton for checking. I am using: Intel(R) Core(TM) i9-7920X CPU @ 
2.90GHz, 24 cores.
It is weird why my result is so different. I also verified with a scale 
test script that creates a large scale NB/SB with 800 nodes of simulated k8s 
setup. And then just run:
    ovn-nbctl --print-wait-time --wait=sb sync

Without parallel:
ovn-northd completion: 7807ms

With parallel:
ovn-northd completion: 41267ms

I suspected the hmap size problem but I tried changing the initial size to 64k 
buckets and it didn't help. I will find some time to check the "perf" reports.

Thanks,
Han

On Wed, Sep 29, 2021 at 11:31 PM Anton Ivanov <anton.iva...@cambridgegreys.com> wrote:

On 30/09/2021 07:16, Anton Ivanov wrote:

Results on a Ryzen 5 3600 - 6 cores 12 threads


I will also have a look into the "maximum" measurement for multi-thread.

It does not tie up with the drop in average across the board.

A.



Without


  1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1256
  Average (NB in msec): 679.463785
  Maximum (SB in msec): 25
  Average (SB in msec): 22.489798
  Maximum (northd-loop in msec): 1347
  Average (northd-loop in msec): 799.944878

  2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=no
  ---
  Maximum (NB in msec): 1956
  Average (NB in msec): 809.387285
  Maximum (SB in msec): 24
  Average (SB in msec): 21.649258
  Maximum (northd-loop in msec): 2011
  Average (northd-loop in msec): 961.718686

  5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical 
Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 557
  Average (NB in msec): 474.010337
  Maximum (SB in msec): 15

Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support

2021-09-30 Thread Cpp Code
On Wed, Sep 29, 2021 at 6:19 AM Jakub Kicinski  wrote:
>
> On Wed, 29 Sep 2021 08:19:05 +0200 Nicolas Dichtel wrote:
> > > /* Insert a kernel only KEY_ATTR */
> > > #define OVS_KEY_ATTR_TUNNEL_INFO__OVS_KEY_ATTR_MAX
> > > #undef OVS_KEY_ATTR_MAX
> > > #define OVS_KEY_ATTR_MAX__OVS_KEY_ATTR_MAX
> > Following the other thread [1], this will break if a new app runs over an 
> > old
> > kernel.
>
> Good point.
>
> > Why not simply expose this attribute to userspace and throw an error if a
> > userspace app uses it?
>
> Does it matter if it's exposed or not? Either way the parsing policy
> for attrs coming from user space should have a reject for the value.
> (I say that not having looked at the code, so maybe I shouldn't...)

To remove some confusion: there are some architectural nuances if we
want to extend the code without a large refactor.
The ovs_key_attr is defined only on the kernel side; the userspace side is
generated from this file. Also, the code can be built without kernel
modules.
The code inside the OVS repository and net-next is not identical, but I
try to keep some consistency.

JFYI This is the file responsible for generating userspace part:
https://github.com/openvswitch/ovs/blob/master/build-aux/extract-odp-netlink-h
This is the how corresponding file for ovs_key_attr looks inside OVS:
https://github.com/openvswitch/ovs/blob/master/datapath/linux/compat/include/linux/openvswitch.h
one can see there are more values than in net-next version.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v6] net: openvswitch: IPv6: Add IPv6 extension header support

2021-09-30 Thread Cpp Code
On Tue, Sep 28, 2021 at 5:48 PM Jakub Kicinski  wrote:
>
> On Tue, 28 Sep 2021 12:47:27 -0700 Toms Atteka wrote:
> > diff --git a/include/uapi/linux/openvswitch.h 
> > b/include/uapi/linux/openvswitch.h
> > index a87b44cd5590..dc6eb5f6399f 100644
> > --- a/include/uapi/linux/openvswitch.h
> > +++ b/include/uapi/linux/openvswitch.h
> > @@ -346,6 +346,13 @@ enum ovs_key_attr {
> >  #ifdef __KERNEL__
> >   OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ip_tunnel_info */
> >  #endif
> > +
> > +#ifndef __KERNEL__
>
> #else
>
> > + PADDING,  /* Padding so kernel and non kernel field count would match 
> > */
>
> The name PADDING seems rather risky, collisions will be likely.
> OVS_KEY_ATTR_PADDING maybe?
>
> But maybe we don't need to define this special value and bake it into
> the uAPI, why can't we add something like this to the kernel header
> (i.e. include/linux/openvswitch.h):
>
> /* Insert a kernel only KEY_ATTR */
> #define OVS_KEY_ATTR_TUNNEL_INFO__OVS_KEY_ATTR_MAX
> #undef OVS_KEY_ATTR_MAX
> #define OVS_KEY_ATTR_MAX__OVS_KEY_ATTR_MAX
>
> > +#endif

Agree, the name should be changed; I think I will go with __OVS_KEY_ATTR_PADDING.
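
For readers following the thread, a rough sketch of the layout being discussed
(illustrative only, not the merged uAPI; the placeholder name and its placement
follow the suggestions above):

    /* Sketch: keep the attribute count identical on both sides, so that
     * __OVS_KEY_ATTR_MAX matches between kernel and userspace. */
    enum ovs_key_attr {
            OVS_KEY_ATTR_UNSPEC,
            /* ... all attributes shared by kernel and userspace ... */
    #ifdef __KERNEL__
            OVS_KEY_ATTR_TUNNEL_INFO,   /* struct ip_tunnel_info, kernel only */
    #else
            __OVS_KEY_ATTR_PADDING,     /* placeholder so the counts align */
    #endif
            __OVS_KEY_ATTR_MAX
    };

    #define OVS_KEY_ATTR_MAX (__OVS_KEY_ATTR_MAX - 1)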
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] controller: Allocate zone ids for localport port bindings.

2021-09-30 Thread Frode Nordahl
Commit d4bca93 limited ct zone allocation to VIF ports only, which
breaks OpenStack's use of localport for providing metadata to
instances.

This patch adds lports of type 'localport' to the list of lport
types to allocate zone ids for.

No test case is added as I am unable to make the error condition
occur reliably in the sandbox test environment, however I have
confirmed that the error condition is there on an unmodified OVN
installation and that the behavior changes by adding this patch.
Both for a fresh deployment where a modified package is slipped in
prior to first start of OVN and comparison of patched and
non-patched OVN after reboot of a deployed node.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2021-September/051473.html
Fixes: d4bca93 ("controller: Don't allocate zone ids for non-VIF port bindings.")
Signed-off-by: Frode Nordahl 
---
 controller/ovn-controller.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index a719beb0e..4202f32cc 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1919,8 +1919,9 @@ ct_zones_runtime_data_handler(struct engine_node *node, 
void *data)
 struct shash_node *shash_node;
 SHASH_FOR_EACH (shash_node, &tdp->lports) {
 struct tracked_lport *t_lport = shash_node->data;
-if (strcmp(t_lport->pb->type, "")) {
-/* We allocate zone-id's only to VIF lports. */
+if (strcmp(t_lport->pb->type, "")
+&& strcmp(t_lport->pb->type, "localport")) {
+/* We allocate zone-id's only to VIF and localport lports. */
 continue;
 }
 
-- 
2.32.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

2021-09-30 Thread Han Zhou
On Thu, Sep 30, 2021 at 7:34 AM Anton Ivanov <
anton.iva...@cambridgegreys.com> wrote:

> Summary of findings.
>
> 1. The numbers on the perf test do not align with heater which is much
> closer to a realistic load. On some tests where heater gives 5-10%
> end-to-end improvement with parallelization we get worse results with the
> perf-test. You spotted this one correctly.
>
> Example of the northd average pulled out of the test report via grep and
> sed.
>
>127.489353
>131.509458
>116.088205
>94.721911
>119.629756
>114.896258
>124.811069
>129.679160
>106.699905
>134.490338
>112.106713
>135.957658
>132.47
>94.106849
>117.431450
>115.861592
>106.830657
>132.396905
>107.092542
>128.945760
>94.298464
>120.455510
>136.910426
>134.311765
>115.881292
>116.918458
>
> These values are all over the place - this is not a reproducible test.
>
> 2. In the present state you need to re-run it > 30+ times and take an
> average. The standard deviation for the values for the northd loop is >
> 10%. Compared to that the reproducibility of ovn-heater is significantly
> better. I usually get less than 0.5% difference between runs if there was
> no iteration failures. I would suggest using that instead if you want to do
> performance comparisons until we have figured out what affects the
> perf-test.
>
> 3. It is using the short term running average value in reports which is
> probably wrong because you have very significant skew from the last several
> values.
>
> I will look into all of these.
>
Thanks for the summary! However, I think there is a bigger problem
(probably related to my environment) than the stability of the test (make
check-perf TESTSUITEFLAGS="--rebuild") itself. As I mentioned in an earlier
email I observed even worse results with a large scale topology closer to a
real world deployment of ovn-k8s just testing with the command:
ovn-nbctl --print-wait-time --wait=sb sync

This command simply triggers a change in the NB_Global table and waits for
northd to complete all the recompute and update the SB. It doesn't have to be
the "sync" command; any change to the NB DB produces a similar result (e.g.:
ovn-nbctl --print-wait-time --wait=sb ls-add ls1)

Without parallel:
ovn-northd completion: 7807ms

With parallel:
ovn-northd completion: 41267ms

This result is stable and consistent when repeating the command on my
machine. Would you try it on your machine as well? I understand that only
the lflow generation part can be parallelized and it doesn't solve all the
bottleneck, but I did expect it to be faster instead of slower. If your
result always shows that parallel is better, then I will have to dig it out
myself on my test machine.

Thanks,
Han

> Brgds,
> On 30/09/2021 08:26, Han Zhou wrote:
>
>
>
> On Thu, Sep 30, 2021 at 12:08 AM Anton Ivanov <
> anton.iva...@cambridgegreys.com> wrote:
>
>> After quickly adding some more prints into the testsuite.
>>
>> Test 1:
>>
>> Without
>>
>>   1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>   ---
>>   Maximum (NB in msec): 1130
>>   Average (NB in msec): 620.375000
>>   Maximum (SB in msec): 23
>>   Average (SB in msec): 21.468759
>>   Maximum (northd-loop in msec): 6002
>>   Minimum (northd-loop in msec): 0
>>   Average (northd-loop in msec): 914.760417
>>   Long term average (northd-loop in msec): 104.799340
>>
>> With
>>
>>   1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical
>> Ports/Hypervisor -- ovn-northd -- dp-groups=yes
>>   ---
>>   Maximum (NB in msec): 1148
>>   Average (NB in msec): 630.25
>>   Maximum (SB in msec): 24
>>   Average (SB in msec): 21.468744
>>   Maximum (northd-loop in msec): 6090
>>   Minimum (northd-loop in msec): 0
>>   Average (northd-loop in msec): 762.101565
>>   Long term average (northd-loop in msec): 80.735192
>>
>> The metric which actually matters and which SHOULD be measured - long
>> term average is better by 20%. Using short term average instead of long
>> term in the test suite is actually a BUG.
>>
> Good catch!
>
>
>> Are you running yours under some sort of virtualization?
>>
>
> No, I am testing on a bare-metal.
>
>
>> A.
>> On 30/09/2021 07:52, Han Zhou wrote:
>>
>> Thanks Anton for checking. I am using: Intel(R) Core(TM) i9-7920X CPU @
>> 2.90GHz, 24 cores.
>> It is weird why my result is so different. I also verified with a scale
>> test script that creates a large scale NB/SB with 800 nodes of simulated
>> k8s setup. And then just run:
>> ovn-nbctl --print-wait-time --wait=sb sync
>>
>> Without parallel:
>> ovn-northd completion: 7807ms
>>
>> With parallel:
>> ovn-northd completion: 41267ms
>>
>> I suspected the hmap size problem but I tried changing the initial size
>> to 64k buckets and it didn't help. I will find some time to check the
>> "perf" reports.
>>
>> Thanks,
>> Han
>>
>> On Wed, Sep 29, 2021 at 11:31 PM Anton Ivanov <
>>

Re: [ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

2021-09-30 Thread Anton Ivanov

On 30/09/2021 20:48, Han Zhou wrote:



On Thu, Sep 30, 2021 at 7:34 AM Anton Ivanov <anton.iva...@cambridgegreys.com> wrote:


Summary of findings.

1. The numbers on the perf test do not align with heater which is
much closer to a realistic load. On some tests where heater gives
5-10% end-to-end improvement with parallelization we get worse
results with the perf-test. You spotted this one correctly.

Example of the northd average pulled out of the test report via
grep and sed.

   127.489353
   131.509458
   116.088205
   94.721911
   119.629756
   114.896258
   124.811069
   129.679160
   106.699905
   134.490338
   112.106713
   135.957658
   132.47
   94.106849
   117.431450
   115.861592
   106.830657
   132.396905
   107.092542
   128.945760
   94.298464
   120.455510
   136.910426
   134.311765
   115.881292
   116.918458

These values are all over the place - this is not a reproducible test.

2. In the present state you need to re-run it > 30+ times and take
an average. The standard deviation for the values for the northd
loop is > 10%. Compared to that the reproducibility of ovn-heater
is significantly better. I usually get less than 0.5% difference
between runs if there was no iteration failures. I would suggest
using that instead if you want to do performance comparisons until
we have figured out what affects the perf-test.

3. It is using the short term running average value in reports
which is probably wrong because you have very significant skew
from the last several values.

I will look into all of these.

Thanks for the summary! However, I think there is a bigger problem
(probably related to my environment) than the stability of the test
(make check-perf TESTSUITEFLAGS="--rebuild") itself. As I mentioned in
an earlier email, I observed even worse results with a large scale
topology closer to a real world deployment of ovn-k8s, just testing
with the command:

    ovn-nbctl --print-wait-time --wait=sb sync

This command simply triggers a change in the NB_Global table and waits
for northd to complete the full recompute and update the SB. It doesn't
have to be the "sync" command; any change to the NB DB produces a
similar result (e.g.: ovn-nbctl --print-wait-time --wait=sb ls-add ls1)


Without parallel:
ovn-northd completion: 7807ms

With parallel:
ovn-northd completion: 41267ms


Is this with current master or prior to these patches?

1. There was an issue prior to these patches where, on the first
iteration with an existing database (i.e. when loading a large database
for the first time), the hash was not sized correctly. These numbers
sound about right for when that bug was around.


2. At present, with dp groups there should be NO DIFFERENCE in a single
compute cycle on an existing database between a run with parallel and a
run without. This is because the first cycle does not use parallel
compute: it is disabled in order to achieve the correct hash sizing for
future cycles by auto-scaling the hash.


So which exact tag/commit are you running this with, and which options
are on/off?


A.



This result is stable and consistent when repeating the command on my
machine. Would you try it on your machine as well? I understand that
only the lflow generation part can be parallelized and it doesn't
remove all the bottlenecks, but I did expect it to be faster instead of
slower. If your result always shows that parallel is better, then I
will have to dig into it myself on my test machine.


Thanks,
Han

Brgds,

On 30/09/2021 08:26, Han Zhou wrote:



On Thu, Sep 30, 2021 at 12:08 AM Anton Ivanov wrote:

After quickly adding some more prints into the testsuite.

Test 1:

Without

  1: ovn-northd basic scale test -- 200 Hypervisors, 200
Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1130
  Average (NB in msec): 620.375000
  Maximum (SB in msec): 23
  Average (SB in msec): 21.468759
  Maximum (northd-loop in msec): 6002
  Minimum (northd-loop in msec): 0
  Average (northd-loop in msec): 914.760417
  Long term average (northd-loop in msec): 104.799340

With

  1: ovn-northd basic scale test -- 200 Hypervisors, 200
Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
  ---
  Maximum (NB in msec): 1148
  Average (NB in msec): 630.25
  Maximum (SB in msec): 24
  Average (SB in msec): 21.468744
  Maximum (northd-loop in msec): 6090
  Minimum (northd-loop in msec): 0
  Average (northd-loop in msec): 762.101565
  Long term average (northd-loop in msec): 80.735192

>> The metric which actually matters and which SHOULD be measured

[ovs-dev] [PATCH v3 ovn] Fix basic multicast flows for vxlan (non-vtep) tunnels

2021-09-30 Thread Ihar Hrachyshka
The 15-bit port key range used for multicast groups can't be covered
by the 12-bit key space available for port keys in VXLAN. To make
multicast keys work, we have to transform 16-bit multicast port keys
to 12-bit keys before fanning out packets through VXLAN tunnels.
Otherwise, significant bits are not retained, and multicast / broadcast
traffic does not reach ports located on other chassis.

This patch introduces a mapping scheme between core 16-bit multicast
port keys and 12-bit key range available in VXLAN. The scheme is as
follows:

1) Before sending a packet through VXLAN tunnel, the most significant
   bit of a 16-bit port key is copied into the most significant bit of
   12-bit VXLAN key. The 11 least significant bits of a 16-bit port
   key are copied to the least significant bits of 12-bit VXLAN key.

2) When receiving a packet through VXLAN tunnel, the most significant
   bit of a VXLAN 12-bit port key is copied into the most significant
   bit of 16-bit port key used in core. The 11 least significant bits
   of a VXLAN 12-bit port key are copied into the least significant
   bits of a 16-bit port key used in core.

This change also implies that the range available for multicast port
keys is more limited and fits into 11-bit space. The same rule should
be enforced for unicast port keys, like we already do for tunnel keys
when a VXLAN encap is present in a cluster. This enforcement is
implied here but missing in master and will be implemented in a
separate patch. (The missing enforcement is an oversight of the
original patch that added support for VXLAN tunnels.)
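
To make the mapping scheme above concrete, here is a minimal standalone
sketch (the helper names and the example key value are purely illustrative;
the send direction in the patch itself is implemented by get_vxlan_port_key()
below):

#include <assert.h>
#include <stdint.h>

/* Send direction: 16-bit core port key -> 12-bit VXLAN key.  Keep the most
 * significant (multicast) bit and the 11 least significant bits. */
static uint16_t
core_to_vxlan_key(uint16_t port_key)
{
    return ((port_key & 0x8000) >> 4) | (port_key & 0x07ff);
}

/* Receive direction: 12-bit VXLAN key -> 16-bit core port key. */
static uint16_t
vxlan_to_core_key(uint16_t vxlan_key)
{
    return ((vxlan_key & 0x0800) << 4) | (vxlan_key & 0x07ff);
}

int
main(void)
{
    /* Example: multicast key 32769 (0x8001) becomes 2049 (0x801) on the
     * wire and maps back to 32769 on the receiving chassis. */
    assert(core_to_vxlan_key(0x8001) == 0x801);
    assert(vxlan_to_core_key(0x801) == 0x8001);
    return 0;
}

Because unicast port keys are constrained to the 11-bit range (see the
companion enforcement patch), applying the bit operations uniformly is
equivalent to the special-casing that the real code does for multicast keys
only.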

Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Ihar Hrachyshka 

--

v1: initial commit
v2: updated docs
v2: removed newly added but unused macros
v3: rebase
---
 controller-vtep/gateway.c |   2 +
 controller/physical.c | 101 ++
 lib/mcast-group-index.h   |   2 +
 ovn-architecture.7.xml|  13 +++--
 tests/ovn.at  |  23 +++--
 5 files changed, 111 insertions(+), 30 deletions(-)

diff --git a/controller-vtep/gateway.c b/controller-vtep/gateway.c
index e9419138b..288772dc4 100644
--- a/controller-vtep/gateway.c
+++ b/controller-vtep/gateway.c
@@ -61,6 +61,8 @@ create_chassis_rec(struct ovsdb_idl_txn *txn, const char 
*name,
 sbrec_encap_set_options(encap_rec, &options);
 sbrec_encap_set_chassis_name(encap_rec, name);
 sbrec_chassis_set_encaps(chassis_rec, &encap_rec, 1);
+const struct smap oc = SMAP_CONST1(&oc, "is-vtep", "true");
+sbrec_chassis_set_other_config(chassis_rec, &oc);
 
 return chassis_rec;
 }
diff --git a/controller/physical.c b/controller/physical.c
index 0cfb158c8..04e5f0182 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -37,6 +37,7 @@
 #include "openvswitch/ofp-parse.h"
 #include "ovn-controller.h"
 #include "lib/chassis-index.h"
+#include "lib/mcast-group-index.h"
 #include "lib/ovn-sb-idl.h"
 #include "lib/ovn-util.h"
 #include "physical.h"
@@ -63,6 +64,7 @@ static void
 load_logical_ingress_metadata(const struct sbrec_port_binding *binding,
   const struct zone_ids *zone_ids,
   struct ofpbuf *ofpacts_p);
+static int64_t get_vxlan_port_key(int64_t port_key);
 
 /* UUID to identify OF flows not associated with ovsdb rows. */
 static struct uuid *hc_uuid = NULL;
@@ -160,8 +162,9 @@ put_encapsulation(enum mf_field_id mff_ovn_geneve,
 } else if (tun->type == VXLAN) {
 uint64_t vni = datapath->tunnel_key;
 if (!is_ramp_switch) {
-/* Only some bits are used for regular tunnels. */
-vni |= (uint64_t) outport << 12;
+/* Map southbound 16-bit port key to limited 12-bit range
+ * available for VXLAN, which differs for multicast groups. */
+vni |= get_vxlan_port_key(outport) << 12;
 }
 put_load(vni, MFF_TUN_ID, 0, 24, ofpacts);
 } else {
@@ -1372,6 +1375,43 @@ out:
 }
 }
 
+static int64_t
+get_vxlan_port_key(int64_t port_key)
+{
+if (port_key >= OVN_MIN_MULTICAST) {
+/* 0b1<11 least significant bits> */
+return OVN_VXLAN_MIN_MULTICAST |
+(port_key & (OVN_VXLAN_MIN_MULTICAST - 1));
+}
+return port_key;
+}
+
+static void
+fanout_to_chassis(enum mf_field_id mff_ovn_geneve,
+  struct sset *remote_chassis,
+  const struct hmap *chassis_tunnels,
+  const struct sbrec_datapath_binding *datapath,
+  uint16_t outport, bool is_ramp_switch,
+  struct ofpbuf *remote_ofpacts)
+{
+const char *chassis_name;
+const struct chassis_tunnel *prev = NULL;
+SSET_FOR_EACH (chassis_name, remote_chassis) {
+const struct chassis_tunnel *tun
+= chassis_tunnel_find(chassis_tunnels, chassis_name, NULL);
+if (!tun) {
+continue;
+}
+
+if (!prev || tun->type != prev->type) {
+ 

[ovs-dev] [PATCH v3 ovn] Enforce datapath and port key constraints in vxlan mode

2021-09-30 Thread Ihar Hrachyshka
With vxlan enabled for in-cluster communication, the ranges available
for port and datapath keys are limited to 12 bits (including multicast
group keys). (The default range is 16 bits long.)

This means that OVN should avoid allocating, or allowing requests for,
tunnel keys for datapaths and ports that are equal to or higher than
2 << 11. This was not enforced before, and this patch adds the missing
enforcement rules.
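
To make the arithmetic explicit, a tiny sketch (the constant name mirrors the
one added to lib/mcast-group-index.h below; the reserved-top-bit rule for
unicast port keys follows from the companion multicast mapping patch):

#include <assert.h>

#define OVN_VXLAN_MIN_MULTICAST 2048    /* 1 << 11, as added below */

int
main(void)
{
    /* 12-bit VXLAN key space: requested datapath keys must stay below
     * 2 << 11 == 1 << 12 == 4096, which is what the datapath check below
     * enforces. */
    assert((2 << 11) == (1 << 12) && (1 << 12) == 4096);

    /* For ports, the top bit of the 12-bit space is reserved for multicast
     * keys, so requested unicast port keys must stay below
     * OVN_VXLAN_MIN_MULTICAST. */
    assert(OVN_VXLAN_MIN_MULTICAST == (1 << 11));
    return 0;
}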

Fixes: b07f1bc3d068 ("Add VXLAN support for non-VTEP datapath bindings")
Signed-off-by: Ihar Hrachyshka 

--

v1: initial commit
v2: fix build (added missing OVN_VXLAN_MIN_MULTICAST macro)
v3: rebase
---
 lib/mcast-group-index.h |  2 ++
 northd/northd.c | 53 -
 northd/ovn_northd.dl| 31 +---
 tests/ovn.at| 49 +
 4 files changed, 110 insertions(+), 25 deletions(-)

diff --git a/lib/mcast-group-index.h b/lib/mcast-group-index.h
index 72af117a4..5bc725451 100644
--- a/lib/mcast-group-index.h
+++ b/lib/mcast-group-index.h
@@ -23,6 +23,8 @@ struct sbrec_datapath_binding;
 #define OVN_MIN_MULTICAST 32768
 #define OVN_MAX_MULTICAST 65535
 
+#define OVN_VXLAN_MIN_MULTICAST 2048
+
 enum ovn_mcast_tunnel_keys {
 
 OVN_MCAST_FLOOD_TUNNEL_KEY = OVN_MIN_MULTICAST,
diff --git a/northd/northd.c b/northd/northd.c
index cf2467fe1..21b34a70b 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -1305,7 +1305,8 @@ ovn_datapath_allocate_key(struct northd_context *ctx,
 }
 
 static void
-ovn_datapath_assign_requested_tnl_id(struct hmap *dp_tnlids,
+ovn_datapath_assign_requested_tnl_id(struct northd_context *ctx,
+ struct hmap *dp_tnlids,
  struct ovn_datapath *od)
 {
 const struct smap *other_config = (od->nbs
@@ -1313,6 +1314,13 @@ ovn_datapath_assign_requested_tnl_id(struct hmap 
*dp_tnlids,
: &od->nbr->options);
 uint32_t tunnel_key = smap_get_int(other_config, "requested-tnl-key", 0);
 if (tunnel_key) {
+if (is_vxlan_mode(ctx->ovnsb_idl) && tunnel_key >= 1 << 12) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for datapath %s is "
+ "incompatible with VXLAN", tunnel_key,
+ od->nbs ? od->nbs->name : od->nbr->name);
+return;
+}
 if (ovn_add_tnlid(dp_tnlids, tunnel_key)) {
 od->tunnel_key = tunnel_key;
 } else {
@@ -1342,10 +1350,10 @@ build_datapaths(struct northd_context *ctx, struct hmap 
*datapaths,
 struct hmap dp_tnlids = HMAP_INITIALIZER(&dp_tnlids);
 struct ovn_datapath *od, *next;
 LIST_FOR_EACH (od, list, &both) {
-ovn_datapath_assign_requested_tnl_id(&dp_tnlids, od);
+ovn_datapath_assign_requested_tnl_id(ctx, &dp_tnlids, od);
 }
 LIST_FOR_EACH (od, list, &nb_only) {
-ovn_datapath_assign_requested_tnl_id(&dp_tnlids, od);
+ovn_datapath_assign_requested_tnl_id(ctx, &dp_tnlids, od);
 }
 
 /* Keep nonconflicting tunnel IDs that are already assigned. */
@@ -3750,27 +3758,40 @@ ovn_port_add_tnlid(struct ovn_port *op, uint32_t 
tunnel_key)
 }
 
 static void
-ovn_port_assign_requested_tnl_id(struct ovn_port *op)
+ovn_port_assign_requested_tnl_id(struct northd_context *ctx,
+ struct ovn_port *op)
 {
 const struct smap *options = (op->nbsp
   ? &op->nbsp->options
   : &op->nbrp->options);
 uint32_t tunnel_key = smap_get_int(options, "requested-tnl-key", 0);
-if (tunnel_key && !ovn_port_add_tnlid(op, tunnel_key)) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-VLOG_WARN_RL(&rl, "Logical %s port %s requests same tunnel key "
- "%"PRIu32" as another LSP or LRP",
- op->nbsp ? "switch" : "router",
- op_get_name(op), tunnel_key);
+if (tunnel_key) {
+if (is_vxlan_mode(ctx->ovnsb_idl) &&
+tunnel_key >= OVN_VXLAN_MIN_MULTICAST) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(&rl, "Tunnel key %"PRIu32" for port %s "
+ "is incompatible with VXLAN",
+ tunnel_key, op_get_name(op));
+return;
+}
+if (!ovn_port_add_tnlid(op, tunnel_key)) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(&rl, "Logical %s port %s requests same tunnel key "
+ "%"PRIu32" as another LSP or LRP",
+ op->nbsp ? "switch" : "router",
+ op_get_name(op), tunnel_key);
+}
 }
 }
 
 static void
-ovn_port_allocate_key(struct hmap *ports, struct ovn_port *op)
+ovn_port_allocate_key(struct northd_

Re: [ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

2021-09-30 Thread Han Zhou
On Thu, Sep 30, 2021 at 2:03 PM Anton Ivanov <
anton.iva...@cambridgegreys.com> wrote:

> On 30/09/2021 20:48, Han Zhou wrote:
>
>
>
> On Thu, Sep 30, 2021 at 7:34 AM Anton Ivanov <
> anton.iva...@cambridgegreys.com> wrote:
>
>> Summary of findings.
>>
>> 1. The numbers on the perf test do not align with heater which is much
>> closer to a realistic load. On some tests where heater gives 5-10%
>> end-to-end improvement with parallelization we get worse results with the
>> perf-test. You spotted this one correctly.
>>
>> Example of the northd average pulled out of the test report via grep and
>> sed.
>>
>>127.489353
>>131.509458
>>116.088205
>>94.721911
>>119.629756
>>114.896258
>>124.811069
>>129.679160
>>106.699905
>>134.490338
>>112.106713
>>135.957658
>>132.47
>>94.106849
>>117.431450
>>115.861592
>>106.830657
>>132.396905
>>107.092542
>>128.945760
>>94.298464
>>120.455510
>>136.910426
>>134.311765
>>115.881292
>>116.918458
>>
>> These values are all over the place - this is not a reproducible test.
>>
>> 2. In the present state you need to re-run it > 30+ times and take an
>> average. The standard deviation for the values for the northd loop is >
>> 10%. Compared to that the reproducibility of ovn-heater is significantly
>> better. I usually get less than 0.5% difference between runs if there was
>> no iteration failures. I would suggest using that instead if you want to do
>> performance comparisons until we have figured out what affects the
>> perf-test.
>>
>> 3. It is using the short term running average value in reports which is
>> probably wrong because you have very significant skew from the last several
>> values.
>>
>> I will look into all of these.
>>
> Thanks for the summary! However, I think there is a bigger problem
> (probably related to my environment) than the stability of the test (make
> check-perf TESTSUITEFLAGS="--rebuild") itself. As I mentioned in an earlier
> email I observed even worse results with a large scale topology closer to a
> real world deployment of ovn-k8s just testing with the command:
> ovn-nbctl --print-wait-time --wait=sb sync
>
> This command simply triggers a change in NB_Global table and wait for
> northd to complete all the recompute and update SB. It doesn't have to use
> "sync" command but any change to the NB DB produces similar result (e.g.:
> ovn-nbctl --print-wait-time --wait=sb ls-add ls1)
>
> Without parallel:
> ovn-northd completion: 7807ms
>
> With parallel:
> ovn-northd completion: 41267ms
>
> Is this with current master or prior to these patches?
>
1. There was an issue prior to these where the hash on first iteration with
> an existing database when loading a large database for the first time was
> not sized correctly. These numbers sound about right when this bug was
> around.
>
The patches are included. The commit id is 9242f27f63 as mentioned in my
first email.

> 2. There should be NO DIFFERENCE in a single compute cycle with an
> existing database between a run with parallel and without with dp groups at
> present. This is because the first cycle does not use parallel compute. It
> is disabled in order to achieve the correct hash sizings for future cycle
> by auto-scaling the hash.
>
Yes, I understand this and I did enable dp-group for the above "ovn-nbctl
sync" test, so the number I showed above for "with parallel" was for the
2nd run and onwards. For the first round the result is exactly the same as
without parallel.

I just tried disabling DP groups for the large scale "ovn-nbctl sync" test
(after taking some effort to squeeze out memory space on my desktop), and
the result shows that the parallel build performs slightly better (although
it is 3x slower than with dp-group & without parallel, which is expected).
Summarizing the results together below:

Without parallel, with dp-group:
ovn-northd completion: 7807ms

With parallel, with dp-group:
ovn-northd completion: 41267ms

without parallel, without dp-group:
ovn-northd completion: 27996ms

with parallel, without dp-group:
ovn-northd completion: 26584ms

Now the interesting part:
I implemented a POC of a hash based mutex array that replaces the rw lock
in the function do_ovn_lflow_add_pd(), and the performance is greatly
improved for the dp-group test:

with parallel, with dp-group (hash based mutex):
ovn-northd completion: 5081ms

This is 8x faster than the current parallel one and 30% faster than without
parallel. This result looks much more reasonable to me. My theory is that
when using parallel with dp-group, the rwlock contention is causing the low
CPU utilization of the threads and the overall slowness on my machine. I
will refine the POC to a formal patch and send it for review, hopefully by
tomorrow.
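
For reference, a minimal sketch of what such a hash-based mutex array can
look like (this is not the actual POC; LFLOW_LOCK_SLOTS, lflow_locks and
lflow_add_locked are made-up names): the flow's hash selects one mutex out of
a fixed, power-of-two-sized array, so writers only contend when their flows
map to the same slot instead of all serializing on a single global lock.

#include <pthread.h>
#include <stdint.h>
#include "openvswitch/hmap.h"

#define LFLOW_LOCK_SLOTS 1024      /* power of two, <= hmap bucket count */

static pthread_mutex_t lflow_locks[LFLOW_LOCK_SLOTS];

/* Called once, before the worker threads are started. */
static void
lflow_locks_init(void)
{
    for (size_t i = 0; i < LFLOW_LOCK_SLOTS; i++) {
        pthread_mutex_init(&lflow_locks[i], NULL);
    }
}

/* Insert 'node' with 'hash' into the shared lflow map.  Assumes the map has
 * been pre-sized so no expansion happens while workers run; because the slot
 * is chosen from the low bits of the hash, two nodes that land in the same
 * hmap bucket always pick the same mutex. */
static void
lflow_add_locked(struct hmap *shared_lflows, struct hmap_node *node,
                 uint32_t hash)
{
    pthread_mutex_t *lock = &lflow_locks[hash & (LFLOW_LOCK_SLOTS - 1)];

    pthread_mutex_lock(lock);
    hmap_insert_fast(shared_lflows, node, hash);
    pthread_mutex_unlock(lock);
}

Compared to one global rwlock, write-side contention drops roughly by the
number of slots, at the cost of a fixed array of mutexes.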

Thanks,
Han
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] controller: Allocate zone ids for localport port bindings.

2021-09-30 Thread Numan Siddique
On Thu, Sep 30, 2021 at 3:35 PM Frode Nordahl
 wrote:
>
> Commit d4bca93 limited ct zone allocation to VIF ports only, which
> breaks OpenStack's use of localport for providing metadata to
> instances.
>
> This patch adds lports of type 'localport' to the list of lport
> types to allocate zone ids for.
>
> No test case is added as I am unable to make the error condition
> occur reliably in the sandbox test environment; however, I have
> confirmed that the error condition is there on an unmodified OVN
> installation and that the behavior changes by adding this patch,
> both for a fresh deployment where a modified package is slipped in
> prior to the first start of OVN and for a comparison of patched and
> non-patched OVN after a reboot of a deployed node.
>
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2021-September/051473.html
> Fixes: d4bca93 ("controller: Don't allocate zone ids for non-VIF port 
> bindings. ")
> Signed-off-by: Frode Nordahl 

Thanks a lot for fixing this issue and thanks to Benjamin Reichel for
reporting it.

I was able to reproduce it locally with the below changes to the test
case "ovn-controller ct zone I-P handling".
So I added this test case to this patch and applied it to the main branch
and branch-21.09.

Also updated the commit message accordingly. Hope that's fine with you.

Thanks
Numan


---
diff --git a/tests/ovn.at b/tests/ovn.at
index 172b5c713..fc8f31d06 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -28481,6 +28481,9 @@ check as hv1 ovs-vsctl add-port br-int vif11 \
 ovn-nbctl ls-add sw0
 ovn-nbctl lsp-add sw0 sw0-port1
 ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:01 10.0.0.3"
+check ovn-nbctl lsp-add sw0 sw0-port2
+check ovn-nbctl lsp-set-addresses sw0-port2 "50:54:00:00:00:01 10.0.0.4"
+check ovn-nbctl lsp-set-type sw0-port2 localport

 ovn-nbctl lr-add lr0
 ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
@@ -28495,6 +28498,17 @@ AT_CHECK([as hv1 ovn-appctl -t ovn-controller
ct-zone-list | \
 grep sw0-port1 -c], [0], [1
 ])

+check as hv1 ovs-vsctl add-port br-int vif13 \
+-- set interface vif13 external_ids:iface-id=sw0-port2 ofport-request=13
+
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=0 | grep "in_port=13" | wc -l` -eq 1])
+
+# There should be ct zone for sw0-port2 (localport).
+AT_CHECK([as hv1 ovn-appctl -t ovn-controller ct-zone-list | \
+grep sw0-port2 -c], [0], [1
+])
+
 # There should be no ct-zone id allocated for sw0-lr0 and lr0-sw0
 AT_CHECK([as hv1 ovn-appctl -t ovn-controller ct-zone-list | \
 grep sw0-lr0], [1], [])



> ---
>  controller/ovn-controller.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index a719beb0e..4202f32cc 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -1919,8 +1919,9 @@ ct_zones_runtime_data_handler(struct engine_node *node, 
> void *data)
>  struct shash_node *shash_node;
>  SHASH_FOR_EACH (shash_node, &tdp->lports) {
>  struct tracked_lport *t_lport = shash_node->data;
> -if (strcmp(t_lport->pb->type, "")) {
> -/* We allocate zone-id's only to VIF lports. */
> +if (strcmp(t_lport->pb->type, "")
> +&& strcmp(t_lport->pb->type, "localport")) {
> +/* We allocate zone-id's only to VIF and localport lports. */
>  continue;
>  }
>
> --
> 2.32.0
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

2021-09-30 Thread Anton Ivanov

On 01/10/2021 01:32, Han Zhou wrote:



On Thu, Sep 30, 2021 at 2:03 PM Anton Ivanov wrote:


On 30/09/2021 20:48, Han Zhou wrote:



On Thu, Sep 30, 2021 at 7:34 AM Anton Ivanov wrote:

Summary of findings.

1. The numbers on the perf test do not align with heater
which is much closer to a realistic load. On some tests where
heater gives 5-10% end-to-end improvement with
parallelization we get worse results with the perf-test. You
spotted this one correctly.

Example of the northd average pulled out of the test report
via grep and sed.

   127.489353
   131.509458
   116.088205
   94.721911
   119.629756
   114.896258
   124.811069
   129.679160
   106.699905
   134.490338
   112.106713
   135.957658
   132.47
   94.106849
   117.431450
   115.861592
   106.830657
   132.396905
   107.092542
   128.945760
   94.298464
   120.455510
   136.910426
   134.311765
   115.881292
   116.918458

These values are all over the place - this is not a
reproducible test.

2. In the present state you need to re-run it > 30+ times and
take an average. The standard deviation for the values for
the northd loop is > 10%. Compared to that the
reproducibility of ovn-heater is significantly better. I
usually get less than 0.5% difference between runs if there
was no iteration failures. I would suggest using that instead
if you want to do performance comparisons until we have
figured out what affects the perf-test.

3. It is using the short term running average value in
reports which is probably wrong because you have very
significant skew from the last several values.

I will look into all of these.

Thanks for the summary! However, I think there is a bigger
problem (probably related to my environment) than the stability
of the test (make check-perf TESTSUITEFLAGS="--rebuild") itself.
As I mentioned in an earlier email I observed even worse results
with a large scale topology closer to a real world deployment of
ovn-k8s just testing with the command:
    ovn-nbctl --print-wait-time --wait=sb sync

This command simply triggers a change in NB_Global table and wait
for northd to complete all the recompute and update SB. It
doesn't have to use "sync" command but any change to the NB DB
produces similar result (e.g.: ovn-nbctl --print-wait-time
--wait=sb ls-add ls1)

Without parallel:
ovn-northd completion: 7807ms

With parallel:
ovn-northd completion: 41267ms


Is this with current master or prior to these patches?

1. There was an issue prior to these where the hash on first
iteration with an existing database when loading a large database
for the first time was not sized correctly. These numbers sound
about right when this bug was around.

The patches are included. The commit id is 9242f27f63 as mentioned in 
my first email.


2. There should be NO DIFFERENCE in a single compute cycle with an
existing database between a run with parallel and without with dp
groups at present. This is because the first cycle does not use
parallel compute. It is disabled in order to achieve the correct
hash sizings for future cycle by auto-scaling the hash.

Yes, I understand this and I did enable dp-group for the above 
"ovn-nbctl sync" test, so the number I showed above for "with 
parallel" was for the 2nd run and onwards. For the first round the 
result is exactly the same as without parallel.


I just tried disabling DP group for the large scale "ovn-nbctl sync" 
test (after taking some effort squeezing out memory spaces on my 
desktop), and the result shows that parallel build performs slightly 
better (although it is 3x slower than with dp-group & without 
parallel, which is expected). Summarize the result together below:


Without parallel, with dp-group:
ovn-northd completion: 7807ms

With parallel, with dp-group:
ovn-northd completion: 41267ms

without parallel, without dp-group:
ovn-northd completion: 27996ms

with parallel, without dp-group:
ovn-northd completion: 26584ms

Now the interesting part:
I implemented a POC of a hash based mutex array that replaces the rw 
lock in the function do_ovn_lflow_add_pd(), and the performance is 
greatly improved for the dp-group test:


with parallel, with dp-group (hash based mutex):
ovn-northd completion: 5081ms

This is 8x faster than the current parallel one and 30% faster than 
without parallel. This result looks much more reasonable to me. My 
theory is that when using parallel with dp-group, the rwlock 
contention

Re: [ovs-dev] [PATCH ovn] controller: Allocate zone ids for localport port bindings.

2021-09-30 Thread Frode Nordahl
On Fri, Oct 1, 2021 at 2:34 AM Numan Siddique  wrote:
>
> On Thu, Sep 30, 2021 at 3:35 PM Frode Nordahl
>  wrote:
> >
> > Commit d4bca93 limited ct zone allocation to VIF-ports only, this
> > breaks OpenStack's use of localport for providing metadata to
> > instances.
> >
> > This patch adds lports of type 'localport' to the list of lport
> > types to allocate zone ids for.
> >
> > No test case is added as I am unable to make the error condition
> > occur reliably in the sandbox test environment, however I have
> > confirmed that the error condition is there on an unmodified OVN
> > installation and that the behavior changes by adding this patch.
> > Both for a fresh deployment where a modified package is slipped in
> > prior to first start of OVN and comparison of patched and
> > non-patched OVN after reboot of a deployed node.
> >
> > Reported-at: 
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2021-September/051473.html
> > Fixes: d4bca93 ("controller: Don't allocate zone ids for non-VIF port 
> > bindings. ")
> > Signed-off-by: Frode Nordahl 
>
> Thanks a  lot for fixing this issue and thanks to Benjamin Reichel for
> reporting it.
>
> I was able to reproduce locally with the below changes to the test
> case - "ovn-controller ct zone I-P handling"
> So I added this test case to this patch and applied to the main branch
> and branch-21.09.
>
> Also updated the commit message accordingly. Hope that's fine with you.

That's great, thank you for helping with that.

Cheers!

-- 
Frode Nordahl

> Thanks
> Numan
>
>
> ---
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 172b5c713..fc8f31d06 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -28481,6 +28481,9 @@ check as hv1 ovs-vsctl add-port br-int vif11 \
>  ovn-nbctl ls-add sw0
>  ovn-nbctl lsp-add sw0 sw0-port1
>  ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:01 10.0.0.3"
> +check ovn-nbctl lsp-add sw0 sw0-port2
> +check ovn-nbctl lsp-set-addresses sw0-port2 "50:54:00:00:00:01 10.0.0.4"
> +check ovn-nbctl lsp-set-type sw0-port2 localport
>
>  ovn-nbctl lr-add lr0
>  ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
> @@ -28495,6 +28498,17 @@ AT_CHECK([as hv1 ovn-appctl -t ovn-controller
> ct-zone-list | \
>  grep sw0-port1 -c], [0], [1
>  ])
>
> +check as hv1 ovs-vsctl add-port br-int vif13 \
> +-- set interface vif13 external_ids:iface-id=sw0-port2 ofport-request=13
> +
> +ovn-nbctl --wait=hv sync
> +OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=0 | grep
> "in_port=13" | wc -l` -eq 1])
> +
> +# There should be ct zone for sw0-port2 (localport).
> +AT_CHECK([as hv1 ovn-appctl -t ovn-controller ct-zone-list | \
> +grep sw0-port2 -c], [0], [1
> +])
> +
>  # There should be no ct-zone id allocated for sw0-lr0 and lr0-sw0
>  AT_CHECK([as hv1 ovn-appctl -t ovn-controller ct-zone-list | \
>  grep sw0-lr0], [1], [])
> 
>
>
> > ---
> >  controller/ovn-controller.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index a719beb0e..4202f32cc 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -1919,8 +1919,9 @@ ct_zones_runtime_data_handler(struct engine_node 
> > *node, void *data)
> >  struct shash_node *shash_node;
> >  SHASH_FOR_EACH (shash_node, &tdp->lports) {
> >  struct tracked_lport *t_lport = shash_node->data;
> > -if (strcmp(t_lport->pb->type, "")) {
> > -/* We allocate zone-id's only to VIF lports. */
> > +if (strcmp(t_lport->pb->type, "")
> > +&& strcmp(t_lport->pb->type, "localport")) {
> > +/* We allocate zone-id's only to VIF and localport lports. 
> > */
> >  continue;
> >  }
> >
> > --
> > 2.32.0
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev