On 5/22/25 1:13 PM, Ilya Maximets wrote:
> On 3/13/25 1:43 PM, David Marchand wrote:
>> Following a bug report about vhost-user port traffic with a vxlan tunnel
>> being dropped, I relooked at the offloading code to see how we could
>> switch to the non-legacy offload flags mode in the vhost-user library.
>>
>> As I dug into this, and with Ilya's recent comment (on switching to an
>> inner offload API) in mind, I wrote this series.
>>
>> The result seems promising in terms of code.
>> I have not done extensive performance testing yet, but I would expect
>> similar performance, or maybe better performance for non-offloaded
>> traffic, as a number of branches are removed in the flow extraction
>> code.
>>
>>
>> Mike recently posted a patch for removing the csum_start/csum_offset
>> handling from netdev-linux, for which I also had a patch, but since it
>> had some implications for the handling of inner checksums, I ended up
>> including my version for now.
>>
>>
>> There is still one extra step that is left for a future revision:
>> eliminate last if (tunnel) in miniflow_extract().
>>
>>
>> Comments?
>
> Thanks, David.
>
> I think this is the right approach for moving forward; it should be
> much easier to understand and maintain in the long term.
>
> For now I applied patches 1 through 6 and patch 8. I made a
> few small adjustments, listed below, per review comments from
> Mike and my own observations while looking through the set.
> Since these are mainly bug fixes and test improvements, I also
> backported them down to 3.3. Having all these tests on LTS will
> also allow us to be slightly more confident when backporting
> checksum offload fixes in the future.
>
> Patch 7 is also a bug fix, and I agree that we should be handling
> offload flags properly in the simple match case. I just want to
> run a few additional tests manually, including a performance
> check, before applying this one.
>
> List of adjustments:
> Patch 3: Re-worded a commit message.
> Patch 4: Added coverage counter checks in the test, as they were
>          missing in a couple of 'Flag as Rx good.' cases.
> Patch 6: Renamed coverage counters, removing the 'netdev_' part.
>          This module is generally named 'native_tnl' in the logs;
>          we should use the same shorter name for the counters.
>
> Will follow up on reviews for the rest of the set next week.

I replied to the patch 9 conversation, and I don't have major comments on
the other patches in the set, so it may be a good idea to respin and
continue from there.

I did some performance testing with a basic v2v setup, and it's a bit of a
roller coaster throughout the set. virtio-user ports in this setup do not
have any offloads enabled, and pure v2v performance first goes noticeably
higher after patches 9 and 10, but then it goes down and ends up lower than
the original by about 1-2% at the end of the set.

I can get back most of the performance gains seen at patch 11 by moving the
following part to the case where L4 checksums are enabled:

    /* There is no support in virtio net to offload IPv4 csum,
     * but the vhost library handles IPv4 csum offloading fine. */
    dev->hw_ol_features |= NETDEV_TX_IPV4_CKSUM_OFFLOAD;

And then wrapping the netdev_dpdk_prep_hwol_batch() call with the
if (!netdev->ol_flags) check.
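
Roughly, something like this (untested sketch; 'l4_csum_enabled' is just a
placeholder for whatever condition guards the L4 checksum offloads in the
real code, not an actual variable):

    if (l4_csum_enabled) {
        /* There is no support in virtio net to offload IPv4 csum,
         * but the vhost library handles IPv4 csum offloading fine. */
        dev->hw_ol_features |= NETDEV_TX_IPV4_CKSUM_OFFLOAD;
    }

And on the send side, skipping the preparation when the port has no Tx
offloads enabled at all:

    if (!netdev->ol_flags) {
        /* No Tx offloads enabled on this port, nothing to prepare. */
    } else {
        cnt = netdev_dpdk_prep_hwol_batch(dev, pkts, pkt_cnt);
        stats->tx_invalid_hwol_drops += pkt_cnt - cnt;
        pkt_cnt = cnt;
    }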

I'm not sure having IP checksum offload makes much sense when L4 checksums
are not supported.

I also tried the following dirty hack:

@@ -3281,6 +3283,7 @@ netdev_dpdk_common_send(struct netdev *netdev, struct dp_packet_batch *batch,
     size_t cnt, pkt_cnt = dp_packet_batch_size(batch);
     struct dp_packet *packet;
     bool need_copy = false;
+    uint64_t offloads = 0;

     memset(stats, 0, sizeof *stats);

@@ -3289,6 +3292,8 @@ netdev_dpdk_common_send(struct netdev *netdev, struct dp_packet_batch *batch,
             need_copy = true;
             break;
         }
+        offloads |= (uint64_t) packet->offloads & ALL_CSUM_MASK;
+        offloads |= pkts[i]->tso_segsz;
     }

     /* Copy dp-packets to mbufs. */
@@ -3304,9 +3309,11 @@ netdev_dpdk_common_send(struct netdev *netdev, struct dp_packet_batch *batch,
     pkt_cnt = cnt;

     /* Prepare each mbuf for hardware offloading. */
-    cnt = netdev_dpdk_prep_hwol_batch(dev, pkts, pkt_cnt);
-    stats->tx_invalid_hwol_drops += pkt_cnt - cnt;
-    pkt_cnt = cnt;
+    if (offloads || need_copy) {
+        cnt = netdev_dpdk_prep_hwol_batch(dev, pkts, pkt_cnt);
+        stats->tx_invalid_hwol_drops += pkt_cnt - cnt;
+        pkt_cnt = cnt;
+    }

     /* Apply Quality of Service policy. */
     cnt = netdev_dpdk_qos_run(dev, pkts, pkt_cnt, true);
---

It also improved performance quite a bit.

I guess there is an optimization opportunity here: accumulate the offload
flags per batch and then avoid calling the offload preparation function per
packet when none are required.
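
In a cleaner form it could look like this (untested sketch, reusing the
names from the hack above; the hack keeps 'need_copy' in the condition only
because its scan loop breaks early, while a full scan would let the check
depend on the accumulated flags alone):

    uint64_t offloads = 0;

    /* OR together the Tx offloads requested by every packet
     * while walking the batch. */
    DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
        offloads |= (uint64_t) packet->offloads & ALL_CSUM_MASK;
        offloads |= pkts[i]->tso_segsz;
    }

    /* Run the per-packet hardware offload preparation only if at
     * least one packet in the batch actually requested an offload. */
    if (offloads) {
        cnt = netdev_dpdk_prep_hwol_batch(dev, pkts, pkt_cnt);
        stats->tx_invalid_hwol_drops += pkt_cnt - cnt;
        pkt_cnt = cnt;
    }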

WDYT?

Best regards, Ilya Maximets.