On 2023/3/21 22:49, Heng Qi wrote:
On 2023/3/21 3:34 PM, Michael S. Tsirkin wrote:
On Tue, Mar 21, 2023 at 11:56:14AM +0800, Heng Qi wrote:
On 2023/3/21 3:43 AM, Michael S. Tsirkin wrote:
On Mon, Mar 20, 2023 at 07:18:40PM +0800, Heng Qi wrote:
1. Currently, a received encapsulated packet has an outer and an inner header, but the virtio device is unable to calculate the hash over the inner header. Multiple flows with the same outer header but different inner headers are steered to the same receive queue, which results in poor receive performance.

To address this limitation, a new feature, VIRTIO_NET_F_HASH_TUNNEL, has been introduced. It lets the device advertise the capability to calculate the hash over the inner packet header. Compared with the outer header hash, this achieves better receive performance.
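For reference, the hash involved here is typically the standard Toeplitz function used by RSS, fed with the inner 5-tuple bytes instead of the outer ones. Below is a minimal sketch; the function name and key length are illustrative, not taken from the proposed spec:

```c
#include <stdint.h>
#include <stddef.h>

/* Toeplitz hash as used by RSS: for every set bit of the input,
 * XOR in the 32-bit window of the secret key at that bit offset.
 * The key must be at least len + 4 bytes long.  For inner-header
 * hashing, `data` would be the inner 5-tuple rather than the outer. */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data,
                              size_t len)
{
    uint32_t hash = 0;
    /* 32-bit window of the key, sliding one bit per input bit */
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | key[3];

    for (size_t i = 0; i < len; i++) {
        for (int b = 7; b >= 0; b--) {
            if (data[i] & (1u << b))
                hash ^= window;
            /* shift in the next key bit */
            window = (window << 1) | ((key[i + 4] >> b) & 1u);
        }
    }
    return hash;
}
```

With an all-zero input no key bits are XORed in, so the hash is 0; an input whose only set bit is the very first one yields exactly the first 32 bits of the key.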
So this would be a very good argument; however, the cost would seem to be that we have to keep extending this indefinitely as new tunneling protocols come to light. But I believe in fact we don't, at least not for this argument: the standard way to address this is by propagating entropy from the inner to the outer header.
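A sketch of the entropy-propagation approach: tunnels such as VXLAN (RFC 7348) recommend deriving the outer UDP source port from a hash of the inner headers, so plain outer-header RSS already spreads inner flows. The function name is illustrative:

```c
#include <stdint.h>

/* Map a hash of the inner flow into the IANA dynamic/ephemeral UDP
 * port range for use as the outer (encapsulating) UDP source port,
 * per the recommendation in RFC 7348 for VXLAN. */
static uint16_t outer_udp_src_port(uint32_t inner_flow_hash)
{
    const uint16_t lo = 49152, hi = 65535; /* ephemeral port range */
    return (uint16_t)(lo + inner_flow_hash % (hi - lo + 1u));
}
```

Because the outer source port now varies with the inner flow, any receiver that hashes only the outer 5-tuple still distributes distinct inner flows across queues.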
Yes, we don't argue with this.
So I'd maybe reorder the commit log: give explanation 2 below first, then say "for some legacy systems, including entropy in the IP header as is done in modern protocols is not practical, resulting in bad performance under RSS".
I agree. But it is not necessarily a legacy system: some scenarios need to connect multiple tunnels, and for compatibility they will not use optional fields, or will choose an old tunnel protocol.
compatibility ... with legacy systems, no?
2. The same flow can traverse different tunnels, resulting in the encapsulated packets being spread across multiple receive queues (refer to the figure below). However, in certain scenarios it becomes necessary to direct these encapsulated packets of the same flow to a single receive queue, so that the flow is processed by the same CPU to improve performance (warm caches, less locking, etc.).
client1               client2
   |                     |
   |      +-------+      |
   +----->|tunnels|<-----+
          +-------+
            |   |
            |   |
            v   v
    +-----------------+
    | processing host |
    +-----------------+
"Necessary" is too strong a word, I feel. All this is is an optimization, and we don't really know how strong it is.

Here's how I understand it: imagine two clients, client1 and client2, talking to each other. A copy of all packets is sent to a processing host over a virtio device. The two directions of the same flow between the two clients might be encapsulated in two different tunnels; with current RSS strategies they would land on two arbitrary, unrelated queues. As an optimization, some hosts might wish to make sure both directions of the encapsulated flow land on the same queue.

Is this a good summary?
I think yes.
Now that things begin to be clearer, I kind of begin to agree with Jason's suggestion that this is extremely narrow. And what if I want one direction on queue1 and the other on queue2, e.g. adjacent numbers for

I don't understand why we need this; can you point out some usage scenarios?
If traffic is predominantly UDP, each queue can be processed in
parallel. If you need to look at the other side of the flow once
in a while, you can find it by doing ^1.
I'm not sure I follow you, but I'll try to answer. When we place traffic in one direction on a certain queue, it means we have calculated the hash; we can record the five-tuple information and the queue number. When traffic in the other direction arrives, we can match the recorded information and place it on the ^1 queue.
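The "^1" pairing discussed above can be sketched as follows. This is a hypothetical helper, not part of the proposed spec, and it assumes the flow hash is computed symmetrically (e.g. over a sorted 5-tuple) so that both directions of a flow share the same hash:

```c
#include <stdint.h>

/* Put one direction of a flow on an even queue; the reverse direction
 * then sits on the adjacent odd queue, reachable by XOR-ing the queue
 * number with 1.  Assumes nqueues is even and >= 2, and that
 * flow_hash is identical for both directions of the flow. */
static uint16_t flow_to_queue(uint32_t flow_hash, uint16_t nqueues,
                              int reverse_dir)
{
    uint16_t q = (uint16_t)((flow_hash % (nqueues / 2u)) * 2u);
    return reverse_dir ? (q ^ 1u) : q;
}
```

Each direction gets its own queue for parallel processing, yet the peer direction is always one XOR away, which is the lookup MST describes.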
the same flow? If enough people agree this is needed we can accept this, but did you at all consider using something programmable like BPF for

I think the problem is that our virtio device cannot support eBPF; we can also ask Alvaro and Parav whether their virtio devices can support eBPF offloading.
:)
This isn't eBPF, more like classic BPF: just math done on packets, no tables.

We would also really like to use simple BPF offloading, which is cool. But it still takes time, for example to support parsing of BPF instructions etc. on devices like FPGAs, which they can't do easily now. Few devices are supported right now; I only see support for the Netronome iNIC in the kernel.
# git grep XDP_SETUP_PROG_HW
drivers/net/ethernet/netronome/nfp/nfp_net_common.c:	case XDP_SETUP_PROG_HW:
drivers/net/netdevsim/bpf.c:	if (bpf->command == XDP_SETUP_PROG_HW && !ns->bpf_xdpoffload_accept) {
drivers/net/netdevsim/bpf.c:	if (bpf->command == XDP_SETUP_PROG_HW) {
drivers/net/netdevsim/bpf.c:	case XDP_SETUP_PROG_HW:
include/linux/netdevice.h:	XDP_SETUP_PROG_HW,
net/core/dev.c:	xdp.command = mode == XDP_MODE_HW ? XDP_SETUP_PROG_HW : XDP_SETUP_PROG;
Note that this is eBPF hardware offloading, which is much more complicated than what we propose now. For hash calculation, a simple classic BPF or something like P4 would be sufficient. The point is to allow the user to customize the hash calculation.

If this is too flexible for the hardware, it would still be better to consider a more general hash calculation pipeline (XOR, swap, hash masks, hash key customization) like:
https://docs.napatech.com/r/Feature-Set-N-ANL10/Hash-Value-Generation
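A fixed-function pipeline in that spirit might look roughly like the sketch below. The staging (swap, field masks, keyed mix) mirrors the kind of knobs the Napatech page describes, but the function, struct, and constants are illustrative, not from that document or the spec:

```c
#include <stdint.h>

struct tuple4 { uint32_t saddr, daddr; uint16_t sport, dport; };

/* Hypothetical three-stage hash pipeline:
 * 1. optional word swap: canonical src/dst ordering makes the hash
 *    symmetric, so both directions of a flow hash identically;
 * 2. masks select which field bits participate in the hash;
 * 3. a keyed mix step (placeholder for Toeplitz or similar). */
static uint32_t pipeline_hash(struct tuple4 t, uint32_t addr_mask,
                              uint16_t port_mask, uint32_t key)
{
    /* stage 1: sort (addr, port) pairs into canonical order */
    if (t.saddr > t.daddr || (t.saddr == t.daddr && t.sport > t.dport)) {
        uint32_t a = t.saddr; t.saddr = t.daddr; t.daddr = a;
        uint16_t p = t.sport; t.sport = t.dport; t.dport = p;
    }
    /* stage 2: apply field masks */
    t.saddr &= addr_mask; t.daddr &= addr_mask;
    t.sport &= port_mask; t.dport &= port_mask;
    /* stage 3: keyed XOR/multiply mix */
    uint32_t h = key;
    h ^= t.saddr; h *= 0x9e3779b1u;
    h ^= t.daddr; h *= 0x9e3779b1u;
    h ^= ((uint32_t)t.sport << 16) | t.dport;
    return h;
}
```

The swap stage is what makes both directions of a flow land on the same queue without any per-flow state, which addresses the two-tunnel scenario above as long as the inner tuple is what gets fed in.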
Thanks
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org