On 2/10/25 14:30, Ilya Maximets wrote:
> On 2/7/25 06:46, Mike Pattrick wrote:
>> On Thu, Feb 6, 2025 at 9:15 AM David Marchand <[email protected]> 
>> wrote:
>>>
>>> Hello,
>>>
>>> On Wed, Feb 5, 2025 at 1:55 PM Ilya Maximets <[email protected]> wrote:
>>>>
>>>> On 1/23/25 16:56, David Marchand wrote:
>>>>> Rather than drop all pending Tx offloads on recirculation,
>>>>> preserve inner offloads (and mark packet with outer Tx offloads)
>>>>> when parsing the packet again.
>>>>>
>>>>> Fixes: c6538b443984 ("dpif-netdev: Fix crash due to tunnel offloading on recirculation.")
>>>>> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
>>>>> Signed-off-by: David Marchand <[email protected]>
>>>>> ---
>>>>> Changes since v1:
>>>>> - rebased,
>>>>> - dropped API change on miniflow_extract(), rely on tunnel offloading
>>>>>   flag presence instead,
>>>>> - introduced dp_packet_reset_outer_offsets,
>>>>>
>>>>> ---
>>>>>  lib/dp-packet.h   | 23 +++++++++++------------
>>>>>  lib/dpif-netdev.c | 27 ---------------------------
>>>>>  lib/flow.c        | 34 ++++++++++++++++++++++++++++------
>>>>>  3 files changed, 39 insertions(+), 45 deletions(-)
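
For illustration, here is a minimal standalone sketch of the idea described in
the changelog above.  The struct layout and the body of the reset helper are
assumptions made for this sketch only, not the actual lib/dp-packet.h contents:
on recirculation the packet is parsed again, and instead of wiping every
pending Tx offload, only the outer (tunnel) offsets are forgotten so that the
inner offload requests survive the second pass.

---
#include <stdint.h>
#include <stdio.h>

#define OFS_NONE UINT16_MAX

/* Hypothetical, trimmed-down stand-in for struct dp_packet. */
struct sketch_packet {
    uint16_t l3_ofs;        /* Outer network header offset. */
    uint16_t l4_ofs;        /* Outer transport header offset. */
    uint16_t inner_l3_ofs;  /* Inner network header offset (kept). */
    uint16_t inner_l4_ofs;  /* Inner transport header offset (kept). */
};

/* Sketch of what a dp_packet_reset_outer_offsets() style helper could do:
 * forget only the outer headers; the re-parse fills them in again. */
static void
sketch_reset_outer_offsets(struct sketch_packet *p)
{
    p->l3_ofs = OFS_NONE;
    p->l4_ofs = OFS_NONE;
}

int
main(void)
{
    struct sketch_packet p = {
        .l3_ofs = 14, .l4_ofs = 34, .inner_l3_ofs = 64, .inner_l4_ofs = 84,
    };

    sketch_reset_outer_offsets(&p);
    printf("outer l3/l4: %d/%d, inner l3/l4: %d/%d\n",
           p.l3_ofs, p.l4_ofs, p.inner_l3_ofs, p.inner_l4_ofs);
    return 0;
}
---
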
>>>>
>>>> Hi, David.  Thanks for the patch!
>>>>
>>>> Did you run some performance tests with this change?  It touches the very
>>>> core of packet parsing, so we need to check how that impacts normal V2V or
>>>> PVP scenarios even without tunneling.
>>>
>>> I would be surprised if those added branches added much to the already
>>> sizable number of branches in miniflow_extract, though I can understand
>>> the concern about decreased performance.
>>>
>>>
>>> I did a "simple" test with testpmd as a traffic generator and simple
>>> port0 -> port1 and port1 -> port0 OpenFlow rules,
>>> with 1 pmd thread per port on an isolated cpu, no thread sibling.
>>>
>>> I used current main branch:
>>> 481bc0979 - (HEAD, origin/main, origin/HEAD) route-table: Allow
>>> parsing routes without nexthop. (7 days ago) <Martin Kalcok>
>>>
>>> Unexpectedly, I see a slight improvement (I repeated builds,
>>> configuration and tests a few times).
>>
>> Hello David,
>>
>> I also did a few performance tests. In all tests below, traffic was
>> generated in a VM with iperf3, traversed an OVS netdev datapath, and
>> egressed through an i40e network card. All tests were repeated 10
>> times and I restarted OVS in between some tests.
>>
>> First I tested TSO with VXLAN tunnel encapsulation.
>>
>> Without patch:
>> Mean: 6.09 Gbps
>> Stdev: 0.098
>>
>> With patch:
>> Mean: 6.20 Gbps
>> Stdev: 0.097
>>
>> From this it's clear that in the tunnel + TSO case there is a noticeable
>> improvement!
>>
>> Next I tested just a straight path from the VM, through OVS, to the NIC.
>>
>> Without patch:
>> Mean: 16.81 Gbps
>> Stdev: 0.86
>>
>> With patch:
>> Mean: 17.68 Gbps
>> Stdev: 0.91
>>
>> Again we see the small but paradoxical performance improvement with
>> the patch. There weren't a lot of samples overall, but I ran a t-test
>> and found a p-value of 0.045, suggesting significance.
>>
>> Cheers,
>> M
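
As a quick sanity check on the t-test mentioned above, here is a small
self-contained program computing Welch's t-statistic from the straight-path
numbers reported by Mike.  The per-group sample size of 10 is an assumption
taken from "repeated 10 times", and the p-value itself still has to be read
off a t-distribution table or a stats library; only the statistic and the
degrees of freedom are computed here.

---
#include <math.h>
#include <stdio.h>

int
main(void)
{
    double mean_a = 16.81, sd_a = 0.86;  /* Without patch (Gbps). */
    double mean_b = 17.68, sd_b = 0.91;  /* With patch (Gbps). */
    double n = 10.0;                     /* Assumed runs per group. */

    double var_a = sd_a * sd_a / n;
    double var_b = sd_b * sd_b / n;
    double t = (mean_b - mean_a) / sqrt(var_a + var_b);

    /* Welch-Satterthwaite approximation for the degrees of freedom. */
    double df = (var_a + var_b) * (var_a + var_b)
                / (var_a * var_a / (n - 1) + var_b * var_b / (n - 1));

    printf("t = %.2f, df = %.1f\n", t, df);  /* Roughly t ~ 2.2, df ~ 18. */
    return 0;
}
---
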
>>
>>>
>>>
>>> - testpmd (txonly) mlx5 -> mlx5 OVS mlx5 <-> mlx5 testpmd (mac)
>>> * Before patch:
>>> flow-dump from pmd on cpu core: 6
>>> ufid:5ba3b6ab-7595-4904-aeb3-410ec10f0f84,
>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(dpdk1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0,id=0),eth(src=04:3f:72:b2:c0:91/00:00:00:00:00:00,dst=04:3f:72:b2:c0:90/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=198.18.0.1/0.0.0.0,dst=198.18.0.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=9/0,dst=9/0),
>>> packets:100320113, bytes:6420487232, used:0.000s, dp:ovs,
>>> actions:dpdk0, dp-extra-info:miniflow_bits(4,1)
>>> flow-dump from pmd on cpu core: 4
>>> ufid:3627c676-e0f9-4293-b86b-6824c35f9a6c,
>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(dpdk0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0,id=0),eth(src=04:3f:72:b2:c0:90/00:00:00:00:00:00,dst=02:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=198.18.0.1/0.0.0.0,dst=198.18.0.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=9/0,dst=9/0),
>>> packets:106807423, bytes:6835675072, used:0.000s, dp:ovs,
>>> actions:dpdk1, dp-extra-info:miniflow_bits(4,1)
>>>
>>>   Rx-pps:     11367442          Rx-bps:   5820130688
>>>   Tx-pps:     11367439          Tx-bps:   5820128800
>>>
>>> * After patch:
>>> flow-dump from pmd on cpu core: 6
>>> ufid:41a51bc1-f6cb-4810-8372-4a9254a1db52,
>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(dpdk1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0,id=0),eth(src=04:3f:72:b2:c0:91/00:00:00:00:00:00,dst=04:3f:72:b2:c0:90/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=198.18.0.1/0.0.0.0,dst=198.18.0.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=9/0,dst=9/0),
>>> packets:32408002, bytes:2074112128, used:0.000s, dp:ovs,
>>> actions:dpdk0, dp-extra-info:miniflow_bits(4,1)
>>> flow-dump from pmd on cpu core: 4
>>> ufid:115e4654-1e01-467b-9360-de75eb1e872b,
>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(dpdk0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0,id=0),eth(src=04:3f:72:b2:c0:90/00:00:00:00:00:00,dst=02:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=198.18.0.1/0.0.0.0,dst=198.18.0.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=9/0,dst=9/0),
>>> packets:37689559, bytes:2412131776, used:0.000s, dp:ovs,
>>> actions:dpdk1, dp-extra-info:miniflow_bits(4,1)
>>>
>>>   Rx-pps:     12084135          Rx-bps:   6187077192
>>>   Tx-pps:     12084135          Tx-bps:   6187077192
>>>
>>>
>>> - testpmd (txonly) virtio-user -> vhost-user OVS vhost-user ->
>>> virtio-user testpmd (mac)
>>> * Before patch:
>>> flow-dump from pmd on cpu core: 6
>>> ufid:79248354-3697-4d2e-9d70-cc4df5602ff9,
>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(vhost1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0,id=0),eth(src=00:11:22:33:44:56/00:00:00:00:00:00,dst=00:11:22:33:44:55/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=198.18.0.1/0.0.0.0,dst=198.18.0.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=9/0,dst=9/0),
>>> packets:23402111, bytes:1497735104, used:0.000s, dp:ovs,
>>> actions:vhost0, dp-extra-info:miniflow_bits(4,1)
>>> flow-dump from pmd on cpu core: 4
>>> ufid:ca8974b4-2c7e-49c1-bdc6-5d90638997b6,
>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(vhost0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0,id=0),eth(src=00:11:22:33:44:55/00:00:00:00:00:00,dst=00:11:22:33:44:66/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=198.18.0.1/0.0.0.0,dst=198.18.0.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=9/0,dst=9/0),
>>> packets:23402655, bytes:1497769920, used:0.001s, dp:ovs,
>>> actions:vhost1, dp-extra-info:miniflow_bits(4,1)
>>>
>>>   Rx-pps:      6022487          Rx-bps:   3083513840
>>>   Tx-pps:      6022487          Tx-bps:   3083513840
>>>
>>> * After patch:
>>> flow-dump from pmd on cpu core: 6
>>> ufid:c2bac91a-d8a6-4a96-9d56-aee133d1f047,
>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(vhost1),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0,id=0),eth(src=00:11:22:33:44:56/00:00:00:00:00:00,dst=00:11:22:33:44:55/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=198.18.0.1/0.0.0.0,dst=198.18.0.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=9/0,dst=9/0),
>>> packets:53921535, bytes:3450978240, used:0.000s, dp:ovs,
>>> actions:vhost0, dp-extra-info:miniflow_bits(4,1)
>>> flow-dump from pmd on cpu core: 4
>>> ufid:c4989fca-2662-4645-8291-8971c00b7cb4,
>>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(vhost0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0,id=0),eth(src=00:11:22:33:44:55/00:00:00:00:00:00,dst=00:11:22:33:44:66/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=198.18.0.1/0.0.0.0,dst=198.18.0.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=9/0,dst=9/0),
>>> packets:53921887, bytes:3451000768, used:0.000s, dp:ovs,
>>> actions:vhost1, dp-extra-info:miniflow_bits(4,1)
>>>
>>>   Rx-pps:      6042410          Rx-bps:   3093714208
>>>   Tx-pps:      6042407          Tx-bps:   3093712616
>>>
> 
> Hi, Mike and David.
> 
> Thanks for the test results, but I don't think they are relevant, at least
> David's.  The datapath flows show no matches on eth addresses, which
> suggests that the simple match is in use.  And miniflow_extract is not
> called in this case, so the test doesn't really exercise the changes.  The
> variance in the test results is also concerning, as nothing should have
> changed in the datapath, yet the performance changes for some reason.
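
To make the simple-match point concrete, here is a toy, self-contained sketch
(the types and function names are hypothetical, not the real dpif-netdev
symbols): with a trivial in_port -> output ruleset the PMD can forward from a
per-port table without parsing headers at all, so miniflow_extract(), the code
this patch touches, never runs on the measured path.

---
#include <stdbool.h>
#include <stdio.h>

struct toy_packet { int in_port; };
struct toy_flow { bool valid; int out_port; };

/* Stand-in for a simple-match lookup keyed on the ingress port only. */
static struct toy_flow *
simple_match_lookup(struct toy_flow table[], struct toy_packet *pkt)
{
    return table[pkt->in_port].valid ? &table[pkt->in_port] : NULL;
}

/* Stand-in for the full header parse (miniflow_extract and friends). */
static void
full_parse(struct toy_packet *pkt)
{
    printf("port %d: full parse, then classification\n", pkt->in_port);
}

int
main(void)
{
    /* Port 0 has a simple port-to-port rule; port 1 needs a real lookup. */
    struct toy_flow table[2] = { { true, 1 }, { false, 0 } };
    struct toy_packet pkts[2] = { { .in_port = 0 }, { .in_port = 1 } };

    for (int i = 0; i < 2; i++) {
        struct toy_flow *flow = simple_match_lookup(table, &pkts[i]);

        if (flow) {
            /* Fast path: forward without ever touching packet headers. */
            printf("port %d: simple match, output to port %d\n",
                   pkts[i].in_port, flow->out_port);
        } else {
            full_parse(&pkts[i]);
        }
    }
    return 0;
}
---
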
> 
> Mike, what OpenFlow rules are you using in your setup?
> 
> 
> On my end, I did my own set of runs with and without this patch and I see
> about 0.82% performance degradation for a V2V scenario with a NORMAL
> OpenFlow rule and no real difference with simple match, which is expected.
> My numbers are:
> 
>         NORMAL            Simple match
>     patch    main        patch    main
>     7420.0   7481.6      8333.1   8333.9

These numbers are in Kpps.

> 
>         -0.82 %              -0.009 %
> 
> The numbers are averages over 14 alternating runs of each type, so the results
> should be statistically significant.  The fact that there is no difference in
> a simple match case also suggests that the difference with NORMAL is real.
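
As a quick arithmetic check, the relative differences follow directly from the
Kpps averages in the table above (the simple-match delta matches the quoted
-0.009 % up to rounding):

---
#include <stdio.h>

int
main(void)
{
    /* Averages from the table above, in Kpps. */
    double normal_patch = 7420.0, normal_main = 7481.6;
    double simple_patch = 8333.1, simple_main = 8333.9;

    /* Prints roughly -0.82 % and -0.01 %. */
    printf("NORMAL:       %+.2f %%\n",
           100.0 * (normal_patch - normal_main) / normal_main);
    printf("Simple match: %+.2f %%\n",
           100.0 * (simple_patch - simple_main) / simple_main);
    return 0;
}
---
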
> 
> Could you re-check your tests?
> 
> For the reference, my configuration is:
> 
> ---
> ./configure CFLAGS="-msse4.2 -g -Ofast -march=native" --with-dpdk=static CC=gcc
> 
> ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x14
> ovs-vsctl add-br ovsbr -- set bridge ovsbr datapath_type=netdev
> ovs-vsctl add-port ovsbr vhost0 \
>   -- set Interface vhost0 type=dpdkvhostuserclient \
>      options:vhost-server-path=/tmp/vhost0
> ovs-vsctl set Interface vhost0 other_config:pmd-rxq-affinity=0:2,1:2
> ovs-vsctl add-port ovsbr vhost1 \
>   -- set Interface vhost1 type=dpdkvhostuserclient \
>      options:vhost-server-path=/tmp/vhost1
> ovs-vsctl set Interface vhost1 other_config:pmd-rxq-affinity=0:4,1:4
> 
> ovs-vsctl set Open_vSwitch . other_config:dpdk-extra='--no-pci --single-file-segments'
> ovs-vsctl set Open_vSwitch . other_config:dpdk-init=try
> 
> ovs-ofctl del-flows ovsbr
> ovs-ofctl add-flow ovsbr actions=NORMAL
> 
> ./build-24.11/bin/dpdk-testpmd -l 12,14 -n 4 --socket-mem=1024,0 --no-pci \
>   --vdev="net_virtio_user,path=/tmp/vhost1,server=1,mac=E6:49:42:EC:67:3C,in_order=1" \
>   --in-memory --single-file-segments -- \
>   --burst=32 --txd=2048 --rxd=2048 --rxq=1 --txq=1 --nb-cores=1 \
>   --eth-peer=0,5A:90:B6:77:22:F8 --forward-mode=txonly --stats-period=5
> 
> ./build-24.11/bin/dpdk-testpmd -l 8,10 -n 4 --socket-mem=1024,0 --no-pci \
>   --vdev="net_virtio_user,path=/tmp/vhost0,server=1,mac=5A:90:B6:77:22:F8,in_order=1" \
>   --in-memory  --single-file-segments  -- \
>   --burst=32 --txd=2048 --rxd=2048 --rxq=1 --txq=1 --nb-cores=1  \
>   --eth-peer=0,E6:49:42:EC:67:3C  --forward-mode=mac --stats-period=5
> ---
> 
> Best regards, Ilya Maximets.
