OK,

I can dig into this later this afternoon.

There is quite a bit of dispersion in tests without parallelization on my 
system which should not be there.

I want to get to the bottom of where it is coming from and why we are 
getting different results compared to ovn-heater.

I did all the original tests with ovn-heater and they were consistently 5-10% 
better end-to-end with parallelization enabled.

As for the worker threads never reaching 100% while the northd thread is 
regularly at 100%: that is unfortunately how it is. Large sections of northd 
cannot be parallelized at present. The only bit which can be run in parallel is 
lflow compute.

Generation of datapaths, ports and groups, all of which happens before the 
lflows, cannot be parallelized and is compute heavy.

Post-processing of flows once they have been generated (hash recompute, 
reconciliation of databases, etc.) cannot be parallelized at present. Some of 
it could be run in parallel if there were parallel macros in the OVS source, but 
that is likely to have only a marginal effect on performance, 1-2% at most.
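
Why the gain stays small even with many worker threads can be sketched with
Amdahl's law. This is an illustration only: the 15% parallel fraction below is
a hypothetical value, not a measurement of northd, and the formula ignores
locking and thread wake-up costs.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal end-to-end speedup when only part of the work runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical assumption: lflow compute is 15% of one northd iteration.
ASSUMED_PARALLEL_FRACTION = 0.15

for workers in (1, 4, 12, 24):
    speedup = amdahl_speedup(ASSUMED_PARALLEL_FRACTION, workers)
    print(f"{workers:2d} workers: {speedup:.3f}x")
```

Under that assumption, going from 4 to 24 workers changes the ideal speedup
only slightly, because the serial sections dominate the iteration.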

Best Regards,

A.

On 30/09/2021 08:26, Han Zhou wrote:


On Thu, Sep 30, 2021 at 12:08 AM Anton Ivanov <anton.iva...@cambridgegreys.com> wrote:

    After quickly adding some more prints into the testsuite.

    Test 1:

    Without

      1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
      ---
      Maximum (NB in msec): 1130
      Average (NB in msec): 620.375000
      Maximum (SB in msec): 23
      Average (SB in msec): 21.468759
      Maximum (northd-loop in msec): 6002
      Minimum (northd-loop in msec): 0
      Average (northd-loop in msec): 914.760417
      Long term average (northd-loop in msec): 104.799340

    With

      1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
      ---
      Maximum (NB in msec): 1148
      Average (NB in msec): 630.250000
      Maximum (SB in msec): 24
      Average (SB in msec): 21.468744
      Maximum (northd-loop in msec): 6090
      Minimum (northd-loop in msec): 0
      Average (northd-loop in msec): 762.101565
      Long term average (northd-loop in msec): 80.735192

    The metric which actually matters and which SHOULD be measured, the long
    term average, is better by 20%. Using the short term average instead of
    the long term one in the test suite is actually a BUG.

Good catch!
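
The two averages quoted above can be compared directly. A minimal sketch, using
only the northd-loop figures from the two runs of test 1; the helper name
`improvement_pct` is made up for this illustration.

```python
def improvement_pct(before_ms: float, after_ms: float) -> float:
    """Percentage reduction in time going from before_ms to after_ms."""
    return (before_ms - after_ms) / before_ms * 100.0


# northd-loop figures for test 1, copied from the "Without" and "With" runs.
short_term = improvement_pct(914.760417, 762.101565)
long_term = improvement_pct(104.799340, 80.735192)

print(f"short term average: {short_term:.1f}% lower with parallelization")
print(f"long term average:  {long_term:.1f}% lower with parallelization")
```

On these figures the short term average drops by roughly 17% and the long term
average by roughly 23%.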

    Are you running yours under some sort of virtualization?


No, I am testing on bare metal.

    A.

    On 30/09/2021 07:52, Han Zhou wrote:
    Thanks Anton for checking. I am using: Intel(R) Core(TM) i9-7920X CPU @
    2.90GHz, 24 cores.
    It is weird that my result is so different. I also verified with a scale
    test script that creates a large scale NB/SB with 800 nodes of simulated
    k8s setup, and then just ran:
        ovn-nbctl --print-wait-time --wait=sb sync

    Without parallel:
    ovn-northd completion: 7807ms

    With parallel:
    ovn-northd completion: 41267ms
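
To compare such runs without reading the numbers by eye, the completion line
can be parsed and turned into a ratio. A minimal sketch: it assumes the output
contains a line in exactly the form shown above and ignores anything else the
command prints; the function name is hypothetical.

```python
import re

# Matches the line format shown above, e.g. "ovn-northd completion: 7807ms".
COMPLETION_RE = re.compile(r"ovn-northd completion:\s*(\d+)\s*ms")


def northd_completion_ms(output: str) -> int:
    """Extract the ovn-northd completion time (in ms) from command output."""
    match = COMPLETION_RE.search(output)
    if match is None:
        raise ValueError("no 'ovn-northd completion' line found")
    return int(match.group(1))


# The two results quoted above.
serial = northd_completion_ms("ovn-northd completion: 7807ms")
parallel = northd_completion_ms("ovn-northd completion: 41267ms")

slowdown = parallel / serial
print(f"parallel build took {slowdown:.2f}x as long in this run")
```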

    I suspected the hmap size problem, but I tried changing the initial size
    to 64k buckets and it didn't help. I will find some time to check the
    "perf" reports.

    Thanks,
    Han

    On Wed, Sep 29, 2021 at 11:31 PM Anton Ivanov <anton.iva...@cambridgegreys.com> wrote:

        On 30/09/2021 07:16, Anton Ivanov wrote:
        Results on a Ryzen 5 3600 - 6 cores 12 threads

        I will also have a look into the "maximum" measurement for multi-thread.

        It does not tie up with the drop in average across the board.

        A.


        Without


          1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
          ---
          Maximum (NB in msec): 1256
          Average (NB in msec): 679.463785
          Maximum (SB in msec): 25
          Average (SB in msec): 22.489798
          Maximum (northd-loop in msec): 1347
          Average (northd-loop in msec): 799.944878

          2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=no
          ---
          Maximum (NB in msec): 1956
          Average (NB in msec): 809.387285
          Maximum (SB in msec): 24
          Average (SB in msec): 21.649258
          Maximum (northd-loop in msec): 2011
          Average (northd-loop in msec): 961.718686

          5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
          ---
          Maximum (NB in msec): 557
          Average (NB in msec): 474.010337
          Maximum (SB in msec): 15
          Average (SB in msec): 13.927192
          Maximum (northd-loop in msec): 1261
          Average (northd-loop in msec): 580.999122

          6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=no
          ---
          Maximum (NB in msec): 756
          Average (NB in msec): 625.614724
          Maximum (SB in msec): 15
          Average (SB in msec): 14.181048
          Maximum (northd-loop in msec): 1649
          Average (northd-loop in msec): 746.208332


        With

          1: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
          ---
          Maximum (NB in msec): 1140
          Average (NB in msec): 631.125000
          Maximum (SB in msec): 24
          Average (SB in msec): 21.453609
          Maximum (northd-loop in msec): 6080
          Average (northd-loop in msec): 759.718815

          2: ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=no
          ---
          Maximum (NB in msec): 1210
          Average (NB in msec): 673.000000
          Maximum (SB in msec): 27
          Average (SB in msec): 22.453125
          Maximum (northd-loop in msec): 6514
          Average (northd-loop in msec): 808.596842

          5: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=yes
          ---
          Maximum (NB in msec): 798
          Average (NB in msec): 429.750000
          Maximum (SB in msec): 15
          Average (SB in msec): 12.998533
          Maximum (northd-loop in msec): 3835
          Average (northd-loop in msec): 564.875986

          6: ovn-northd basic scale test -- 500 Hypervisors, 50 Logical Ports/Hypervisor -- ovn-northd -- dp-groups=no
          ---
          Maximum (NB in msec): 1074
          Average (NB in msec): 593.875000
          Maximum (SB in msec): 14
          Average (SB in msec): 13.655273
          Maximum (northd-loop in msec): 4973
          Average (northd-loop in msec): 771.102605

        The only one slower is test 6 which I will look into.

        The rest are > 5% faster.

        A.

        On 30/09/2021 00:56, Han Zhou wrote:


        On Wed, Sep 15, 2021 at 5:45 AM <anton.iva...@cambridgegreys.com> wrote:
        >
        > From: Anton Ivanov <anton.iva...@cambridgegreys.com>
        >
        > Restore parallel build with dp groups using rwlock instead
        > of per row locking as an underlying mechanism.
        >
        > This provides improvement ~ 10% end-to-end on ovn-heater
        > under virtualization despite awakening some qemu gremlin
        > which makes qemu climb to silly CPU usage. The gain on
        > bare metal is likely to be higher.
        >
        Hi Anton,

        I am trying to see the benefit of parallel_build, but encountered an
        unexpected performance result when running the perf tests with the
        command:
             make check-perf TESTSUITEFLAGS="--rebuild"

        It shows significantly worse performance than without parallel_build.
        For the dp_group = no cases it is better, but still ~30% slower than
        without parallel_build. I have 24 cores, but no thread is consuming
        much CPU except the main thread. I also tried hardcoding the number of
        threads to just 4, which ended up with slightly better results, but
        still far behind "without parallel_build".

                                      no parallel | parallel (24 pool threads) | parallel (4 pool threads)

        1: ovn-northd basic scale test -- 200 Hypervisors, 200
        ---
        Maximum (NB in msec)                 1058 |                       4269 |                      4097
        Average (NB in msec)           836.941167 |                3697.253931 |               3498.311525
        Maximum (SB in msec)                   30 |                         30 |                        28
        Average (SB in msec)            25.934011 |                  26.001840 |                 25.685091
        Maximum (northd-loop in msec)        1204 |                       4379 |                      4251
        Average (northd-loop in msec) 1005.330078 |                4233.871504 |               4022.774208

        2: ovn-northd basic scale test -- 200 Hypervisors, 200
        ---
        Maximum (NB in msec)                 1124 |                       1480 |                      1331
        Average (NB in msec)           892.403405 |                1206.189287 |               1089.378455
        Maximum (SB in msec)                   29 |                         31 |                        30
        Average (SB in msec)            26.922632 |                  26.636706 |                 25.657484
        Maximum (northd-loop in msec)        1275 |                       1639 |                      1495
        Average (northd-loop in msec) 1074.917873 |                1458.152327 |               1301.057201

        5: ovn-northd basic scale test -- 500 Hypervisors, 50 L
        ---
        Maximum (NB in msec)                  768 |                       3086 |                      2876
        Average (NB in msec)           614.491938 |                2681.688365 |               2531.255444
        Maximum (SB in msec)                   18 |                         17 |                        18
        Average (SB in msec)            16.347526 |                  15.955263 |                 16.278075
        Maximum (northd-loop in msec)         889 |                       3247 |                      3031
        Average (northd-loop in msec)  772.083572 |                3117.504297 |               2833.182361

        6: ovn-northd basic scale test -- 500 Hypervisors, 50 L
        ---
        Maximum (NB in msec)                 1046 |                       1371 |                      1262
        Average (NB in msec)           827.735852 |                1135.514228 |                970.544792
        Maximum (SB in msec)                   19 |                         18 |                        19
        Average (SB in msec)            16.828127 |                  16.083914 |                 15.602525
        Maximum (northd-loop in msec)        1163 |                       1545 |                      1411
        Average (northd-loop in msec)  972.567407 |                1328.617583 |               1207.667100
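
The per-metric slowdown in the results above can be computed mechanically from
the test suite's summary lines. A rough sketch, assuming each line has the
form `Average (northd-loop in msec): 1005.330078` as printed elsewhere in this
thread; the function names are hypothetical and only a subset of the test 1
figures is used as sample input.

```python
import re

# One summary line, e.g. "Long term average (northd-loop in msec): 104.799340".
METRIC_RE = re.compile(
    r"^\s*(\w+(?: \w+)*) \(([\w-]+) in msec\):\s*([\d.]+)\s*$"
)


def parse_metrics(report: str) -> dict:
    """Map (statistic, target) pairs to their value in msec."""
    metrics = {}
    for line in report.splitlines():
        match = METRIC_RE.match(line)
        if match:
            stat, target, value = match.groups()
            metrics[(stat, target)] = float(value)
    return metrics


def slowdown_ratios(baseline: dict, candidate: dict) -> dict:
    """Ratio candidate/baseline for every metric present in both reports."""
    return {
        key: candidate[key] / baseline[key]
        for key in baseline
        if key in candidate and baseline[key] > 0
    }


# Test 1, "no parallel" versus "parallel (24 pool threads)".
NO_PARALLEL = """
Maximum (NB in msec): 1058
Average (NB in msec): 836.941167
Maximum (northd-loop in msec): 1204
Average (northd-loop in msec): 1005.330078
"""
PARALLEL_24 = """
Maximum (NB in msec): 4269
Average (NB in msec): 3697.253931
Maximum (northd-loop in msec): 4379
Average (northd-loop in msec): 4233.871504
"""

ratios = slowdown_ratios(parse_metrics(NO_PARALLEL), parse_metrics(PARALLEL_24))
for (stat, target), ratio in ratios.items():
    print(f"{stat} ({target}): {ratio:.2f}x")
```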

        I haven't debugged it yet, but do you have any clue what could be the
        reason? I am using the upstream commit 9242f27f63, which already
        includes this patch.
        Below is my change to the perf-northd.at file just to enable
        parallel_build:

        diff --git a/tests/perf-northd.at b/tests/perf-northd.at
        index 74b69e9d4..9328c2e21 100644
        --- a/tests/perf-northd.at
        +++ b/tests/perf-northd.at
        @@ -191,6 +191,7 @@ AT_SETUP([ovn-northd basic scale test -- 200 Hypervisors, 200 Logical Ports/Hype
         PERF_RECORD_START()

         ovn_start
        +ovn-nbctl set nb_global . options:use_parallel_build=true

         BUILD_NBDB(OVN_BASIC_SCALE_CONFIG(200, 200))

        @@ -203,9 +204,10 @@ AT_SETUP([ovn-northd basic scale test -- 500 Hypervisors, 50 Logical Ports/Hyper
         PERF_RECORD_START()

         ovn_start
        +ovn-nbctl set nb_global . options:use_parallel_build=true

         BUILD_NBDB(OVN_BASIC_SCALE_CONFIG(500, 50))

        Thanks,
        Han


--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
