from:"Yuri Benditovich"

Re: [RFC PATCH 1/5] uapi/linux/if_tun.h: Added new ioctl for tun/tap.

2022-02-23 Thread Yuri Benditovich

Hi Jason,
We agree that the same can be done also using the old way, i.e. try to
set specific offload - if failed, probably it is not supported.
We think this is a little not scalable and we suggest adding the ioctl
that will allow us to query allo the supported features in a single
call.
We think this will make QEMU code more simple also in future.
Do I understand correctly that you suggest to skip this new ioctl and
use the old way of query for this (USO) feature and all future
extensions?

Thanks


On Wed, Feb 23, 2022 at 5:53 AM Jason Wang  wrote:
>
> On Tue, Feb 22, 2022 at 9:28 PM Andrew Melnichenko  wrote:
> >
> > Hi all,
> >
> > On Wed, Feb 9, 2022 at 6:26 AM Jason Wang  wrote:
> > >
> > >
> > > 在 2022/1/25 下午4:46, Andrew Melnychenko 写道:
> > > > Added TUNGETSUPPORTEDOFFLOADS that should allow
> > > > to get bits of supported offloads.
> > >
> > >
> > > So we don't use dedicated ioctls in the past, instead, we just probing
> > > by checking the return value of TUNSETOFFLOADS.
> > >
> > > E.g qemu has the following codes:
> > >
> > > int tap_probe_has_ufo(int fd)
> > > {
> > >  unsigned offload;
> > >
> > >  offload = TUN_F_CSUM | TUN_F_UFO;
> > >
> > >  if (ioctl(fd, TUNSETOFFLOAD, offload) < 0)
> > >  return 0;
> > >
> > >  return 1;
> > > }
> > >
> > > Any reason we can't keep using that?
> > >
> > > Thanks
> > >
> >
> > Well, even in this example. To check the ufo feature, we trying to set it.
> > What if we don't need to "enable" UFO and/or do not change its state?
>
> So at least Qemu doesn't have such a requirement since during the
> probe the virtual networking backend is not even started.
>
> > I think it's a good idea to have the ability to get supported offloads
> > without changing device behavior.
>
> Do you see a real user for this?
>
> Btw, we still need to probe this new ioctl anyway.
>
> Thanks
>
> >
> > >
> > > > Added 2 additional offlloads for USO(IPv4 & IPv6).
> > > > Separate offloads are required for Windows VM guests,
> > > > g.e. Windows may set USO rx only for IPv4.
> > > >
> > > > Signed-off-by: Andrew Melnychenko 
> > > > ---
> > > >   include/uapi/linux/if_tun.h | 3 +++
> > > >   1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
> > > > index 454ae31b93c7..07680fae6e18 100644
> > > > --- a/include/uapi/linux/if_tun.h
> > > > +++ b/include/uapi/linux/if_tun.h
> > > > @@ -61,6 +61,7 @@
> > > >   #define TUNSETFILTEREBPF _IOR('T', 225, int)
> > > >   #define TUNSETCARRIER _IOW('T', 226, int)
> > > >   #define TUNGETDEVNETNS _IO('T', 227)
> > > > +#define TUNGETSUPPORTEDOFFLOADS _IOR('T', 228, unsigned int)
> > > >
> > > >   /* TUNSETIFF ifr flags */
> > > >   #define IFF_TUN 0x0001
> > > > @@ -88,6 +89,8 @@
> > > >   #define TUN_F_TSO6  0x04/* I can handle TSO for IPv6 packets */
> > > >   #define TUN_F_TSO_ECN   0x08/* I can handle TSO with ECN 
> > > > bits. */
> > > >   #define TUN_F_UFO   0x10/* I can handle UFO packets */
> > > > +#define TUN_F_USO4   0x20/* I can handle USO for IPv4 packets */
> > > > +#define TUN_F_USO6   0x40/* I can handle USO for IPv6 packets */
> > > >
> > > >   /* Protocol info prepended to the packets (when IFF_NO_PI is not set) 
> > > > */
> > > >   #define TUN_PKT_STRIP   0x0001
> > >
> >
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 0/5] TUN/VirtioNet USO features support.

2022-01-26 Thread Yuri Benditovich

On Wed, Jan 26, 2022 at 9:54 AM Xuan Zhuo  wrote:
>
> On Tue, 25 Jan 2022 10:46:57 +0200, Andrew Melnychenko  
> wrote:
> > Added new offloads for TUN devices TUN_F_USO4 and TUN_F_USO6.
> > Technically they enable NETIF_F_GSO_UDP_L4
> > (and only if USO4 & USO6 are set simultaneously).
> > It allows to transmission of large UDP packets.
> >
> > Different features USO4 and USO6 are required for qemu where Windows guests 
> > can
> > enable disable USO receives for IPv4 and IPv6 separately.
> > On the other side, Linux can't really differentiate USO4 and USO6, for now.
> > For now, to enable USO for TUN it requires enabling USO4 and USO6 together.
> > In the future, there would be a mechanism to control UDP_L4 GSO separately.
> >
> > Test it WIP Qemu https://github.com/daynix/qemu/tree/Dev_USOv2
> >
> > New types for VirtioNet already on mailing:
> > https://lists.oasis-open.org/archives/virtio-comment/202110/msg00010.html
>
> Seems like this hasn't been upvoted yet.
>
> https://github.com/oasis-tcs/virtio-spec#use-of-github-issues

Yes, correct. This is a reason why this series of patches is RFC.

>
> Thanks.
>
> >
> > Also, there is a known issue with transmitting packages between two guests.
> > Without hacks with skb's GSO - packages are still segmented on the host's 
> > postrouting.
> >
> > Andrew Melnychenko (5):
> >   uapi/linux/if_tun.h: Added new ioctl for tun/tap.
> >   driver/net/tun: Added features for USO.
> >   uapi/linux/virtio_net.h: Added USO types.
> >   linux/virtio_net.h: Added Support for GSO_UDP_L4 offload.
> >   drivers/net/virtio_net.c: Added USO support.
> >
> >  drivers/net/tap.c   | 18 --
> >  drivers/net/tun.c   | 15 ++-
> >  drivers/net/virtio_net.c| 22 ++
> >  include/linux/virtio_net.h  | 11 +++
> >  include/uapi/linux/if_tun.h |  3 +++
> >  include/uapi/linux/virtio_net.h |  4 
> >  6 files changed, 66 insertions(+), 7 deletions(-)
> >
> > --
> > 2.34.1
> >
> > ___
> > Virtualization mailing list
> > Virtualization@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/virtualization
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 4/4] tun: indicate support for USO feature

2021-05-14 Thread Yuri Benditovich

On Fri, May 14, 2021 at 10:16 AM Jason Wang  wrote:
>
> On Fri, May 14, 2021 at 4:35 AM Willem de Bruijn
>  wrote:
> >
> > > > But surprisingly when TUN receives TUN_F_UFO it does not propagate it
> > > > anywhere, there is no corresponding NETIF flag.
> > >
> > > (It looks like I drop the community and other ccs accidentally, adding
> > > them back and sorry)
> > >
> > > Actually, there is one, NETIF_F_GSO_UDP.
> > >
> > > Kernel used to have NETIF_F_UFO, but it was removed due to bugs and
> > > the lack of real hardware support. Then we found it breaks uABI, so
> > > Willem tries to make it appear for userspace again, and then it was
> > > renamed to NETIF_F_GSO_UDP.
> > >
> > > But I think it's a bug that we don't proporate TUN_F_UFO to NETIF
> > > flag, this is a must for the driver that doesn't support
> > > VIRTIO_NET_F_GUEST_UFO. I just try to disable all offloads and
> > > mrg_rxbuf, then netperf UDP_STREAM from host to guest gives me bad
> > > length packet in the guest.
> > >
> > > Willem, I think we probably need to fix this.
> >
> > We had to add back support for the kernel to accept UFO packets from
> > userspace over tuntap.
> >
> > The kernel does not generate such packets, so a guest should never be
> > concerned of receiving UFO packets.
>
> That's my feeling as well.
>
> But when I:
>
> 1) turn off all guest gso feature and mrg rx buffers, in this case
> virtio-net will only allocate 1500 bytes for each packet
> 2) doing netperf (UDP_STREAM) from local host to guest, I see packet
> were truncated in the guest

Is it possible that the virtio-net does not disable UFO offload?
IMO it sets NETIF_F_LRO too bravely.
>
> >
> > Perhaps i'm misunderstanding the problem here.
> >
>
> I will re-check and get back to you.
> (probably need a while since I will not be online for the next week).
>
> Thanks
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 4/4] tun: indicate support for USO feature

2021-05-13 Thread Yuri Benditovich

On Thu, May 13, 2021 at 11:43 PM Willem de Bruijn
 wrote:
>
> > > > > > So the question is what to do now:
> > > > > > A)
> > > > > > Finalize patches for guest TX and respective QEMU patches
> > > > > > Prepare RFC patches for guest RX, get ack on them
> > > > > > Change the spec
> > > > > > Finalize patches for guest RX according to the spec
> > > > > >
> > > > > > B)
> > > > > > Reject the patches for guest TX
> > > > > > Prepare RFC patches for everything, get ack on them
> > > > > > Change the spec
> > > > > > Finalize patches for everything according to the spec
> > > > > >
> > > > > > I'm for A) of course :)
> > > > >
> > > > > I'm for B :)
> > > > >
> > > > > The reasons are:
> > > > >
> > > > > 1) keep the assumption of tun_set_offload() to simply the logic and
> > > > > compatibility
> > > > > 2) it's hard or tricky to touch guest TX path only (e.g the
> > > > > virtio_net_hdr_from_skb() is called in both RX and TX)
> > > >
> > > > I suspect there is _some_ misunderstanding here.
> > > > I did not touch virtio_net_hdr_from_skb at all.
> > > >
> > >
> > > Typo, actually I meant virtio_net_hdr_to_skb().
> > OK.
> > 2) tun_get_user() which is guest TX - this is covered
> > 3) tap_get_user() which is guest TX - this is covered
> > 4) {t}packet_send() which is userspace TX - this is OK, the userspace
> > does not have this feature, it will never use USO
>
> What do you mean exactly? I can certainly imagine packet socket users
> that could benefit from using udp gso.
I've just tried to understand whether we have a real functional
problem due to the fact that I define the USO feature only for guest
TX path.
This set of patches modifies virtio_net_hdr_to_skb and Jason's comment
was that this procedure is called in both guest TX and RX, there are 4
places where the virtio_net_hdr_to_skb is called, userspace TX is one
of them.
AFAIU userspace 'socket' and 'user' backends of qemu do not have any
offloads at all so they will never use USO also.
Sorry for misunderstanding if any.
>
> When adding support for a new GSO type in virtio_net_hdr, it ideally
> is supported by all users of that interface. Alternatively, if some
> users do not support the flag, a call that sets the flag has to
> (continue to) fail hard, so that we can enable it at a later time.
I agree of course. IMO this is what I've tried to do. I did not have
in the initial plan to make Linux virtio-net to use the USO at all but
this should not present any problem (if I'm not mistaken).
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 4/4] tun: indicate support for USO feature

2021-05-13 Thread Yuri Benditovich

On Thu, May 13, 2021 at 10:05 AM Jason Wang  wrote:
>
> On Thu, May 13, 2021 at 12:36 PM Yuri Benditovich
>  wrote:
> >
> > On Thu, May 13, 2021 at 5:07 AM Jason Wang  wrote:
> > >
> > > On Wed, May 12, 2021 at 6:37 PM Yuri Benditovich
> > >  wrote:
> > > >
> > > > On Wed, May 12, 2021 at 12:10 PM Jason Wang  wrote:
> > > > >
> > > > > On Wed, May 12, 2021 at 4:32 PM Yuri Benditovich
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, May 12, 2021 at 10:50 AM Jason Wang  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Wed, May 12, 2021 at 1:24 PM Yuri Benditovich
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Wed, May 12, 2021 at 4:33 AM Jason Wang 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 在 2021/5/11 下午4:33, Yuri Benditovich 写道:
> > > > > > > > > > On Tue, May 11, 2021 at 9:50 AM Jason Wang 
> > > > > > > > > >  wrote:
> > > > > > > > > >>
> > > > > > > > > >> 在 2021/5/11 下午12:42, Yuri Benditovich 写道:
> > > > > > > > > >>> Signed-off-by: Yuri Benditovich 
> > > > > > > > > >>> 
> > > > > > > > > >>> ---
> > > > > > > > > >>>drivers/net/tun.c | 2 +-
> > > > > > > > > >>>1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > > >>>
> > > > > > > > > >>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > > > > > > > > >>> index 84f832806313..a35054f9d941 100644
> > > > > > > > > >>> --- a/drivers/net/tun.c
> > > > > > > > > >>> +++ b/drivers/net/tun.c
> > > > > > > > > >>> @@ -2812,7 +2812,7 @@ static int set_offload(struct 
> > > > > > > > > >>> tun_struct *tun, unsigned long arg)
> > > > > > > > > >>>arg &= ~(TUN_F_TSO4|TUN_F_TSO6);
> > > > > > > > > >>>}
> > > > > > > > > >>>
> > > > > > > > > >>> - arg &= ~TUN_F_UFO;
> > > > > > > > > >>> + arg &= ~(TUN_F_UFO|TUN_F_USO);
> > > > > > > > > >>
> > > > > > > > > >> It looks to me kernel doesn't use "USO", so 
> > > > > > > > > >> TUN_F_UDP_GSO_L4 is a better
> > > > > > > > > >> name for this
> > > > > > > > > > No problem, I can change it in v2
> > > > > > > > > >
> > > > > > > > > >   and I guess we should toggle NETIF_F_UDP_GSO_l4 here?
> > > > > > > > > >
> > > > > > > > > > No, we do not, because this indicates only the fact that 
> > > > > > > > > > the guest can
> > > > > > > > > > send large UDP packets and have them splitted to UDP 
> > > > > > > > > > segments.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Actually the reverse. The set_offload() controls the tuntap 
> > > > > > > > > TX path
> > > > > > > > > (guest RX path).
> > > > > > > >
> > > > > > > > The set_offloads does 2 things:
> > > > > > > > 1. At the initialization time qemu probes 
> > > > > > > > set_offload(something) to
> > > > > > > > check which features are supported by TAP/TUN.
> > > > > > >
> > > > > > > Note that the probing is used for guest RX features not host RX.
> > > > > >
> > > > > > It looks like the hidden assumption (till now) is that if some 
> > > > > > feature
> > > > > > is present - it exists simultaneously for host and guest.
> > > > > > See QEMU get

Re: [PATCH 2/4] virtio-net: add support of UDP segmentation (USO) on the host

2021-05-12 Thread Yuri Benditovich

On Tue, May 11, 2021 at 8:48 PM Willem de Bruijn
 wrote:
>
> On Tue, May 11, 2021 at 12:43 AM Yuri Benditovich
>  wrote:
> >
> > Large UDP packet provided by the guest with GSO type set to
> > VIRTIO_NET_HDR_GSO_UDP_L4 will be divided to several UDP
> > packets according to the gso_size field.
> >
> > Signed-off-by: Yuri Benditovich 
> > ---
> >  include/linux/virtio_net.h | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > index b465f8f3e554..4ecf9a1ca912 100644
> > --- a/include/linux/virtio_net.h
> > +++ b/include/linux/virtio_net.h
> > @@ -51,6 +51,11 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff 
> > *skb,
> > ip_proto = IPPROTO_UDP;
> > thlen = sizeof(struct udphdr);
> > break;
> > +   case VIRTIO_NET_HDR_GSO_UDP_L4:
> > +   gso_type = SKB_GSO_UDP_L4;
> > +   ip_proto = IPPROTO_UDP;
> > +   thlen = sizeof(struct udphdr);
> > +   break;
>
> If adding a new VIRTIO_NET_HDR type I suggest adding separate IPv4 and
> IPv6 variants, analogous to VIRTIO_NET_HDR_GSO_TCPV[46]. To avoid
> having to infer protocol again, as for UDP fragmentation offload (the
> retry case below this code).

Thank you for denoting this important point of distinguishing between v4 and v6.
Let's try to take a deeper look to see what is the correct thing to do
and please correct me if I'm wrong:
1. For USO we do not need to guess the protocol as it is used with
VIRTIO_NET_HDR_F_NEEDS_CSUM (unlike UFO) and the USO packets
transmitted by the guest are under the same clause as both
VIRTIO_NET_HDR_GSO_TCP, i.e. under if (hdr->flags &
VIRTIO_NET_HDR_F_NEEDS_CSUM) {
2. If we even define VIRTIO_NET_HDR_GSO_UDPv4_L4 and
VIRTIO_NET_HDR_GSO_UDPv6_L4 - both will be translated to
SKB_GSO_UDP_L4, so this information is immediately lost (the code will
look like:
case VIRTIO_NET_HDR_GSO_UDP4_L4: case VIRTIO_NET_HDR_GSO_UDP6_L4
gso_type = SKB_GSO_UDP;

3. When we will define the respective guest features (like
VIRTIO_NET_F_HOST_USO4 VIRTIO_NET_F_HOST_USO6) we will need to
recreate the virtio_net header from the skb when both v4 and v6 have
the same SKB_GSO_UDP_L4, (see virtio_net_hdr_from_skb) and I'm not
sure whether somebody needs the exact v4 or v6 information on guest RX
path.
4. What is completely correct is that when we will start working with
the guest RX path we will need to define something like NETIF_F_USO4
and NETIF_F_USO6 and configure them according to exact guest offload
capabilities.
Do you agree?
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 4/4] tun: indicate support for USO feature

2021-05-11 Thread Yuri Benditovich

On Wed, May 12, 2021 at 4:33 AM Jason Wang  wrote:
>
>
> 在 2021/5/11 下午4:33, Yuri Benditovich 写道:
> > On Tue, May 11, 2021 at 9:50 AM Jason Wang  wrote:
> >>
> >> 在 2021/5/11 下午12:42, Yuri Benditovich 写道:
> >>> Signed-off-by: Yuri Benditovich 
> >>> ---
> >>>drivers/net/tun.c | 2 +-
> >>>1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> >>> index 84f832806313..a35054f9d941 100644
> >>> --- a/drivers/net/tun.c
> >>> +++ b/drivers/net/tun.c
> >>> @@ -2812,7 +2812,7 @@ static int set_offload(struct tun_struct *tun, 
> >>> unsigned long arg)
> >>>arg &= ~(TUN_F_TSO4|TUN_F_TSO6);
> >>>}
> >>>
> >>> - arg &= ~TUN_F_UFO;
> >>> + arg &= ~(TUN_F_UFO|TUN_F_USO);
> >>
> >> It looks to me kernel doesn't use "USO", so TUN_F_UDP_GSO_L4 is a better
> >> name for this
> > No problem, I can change it in v2
> >
> >   and I guess we should toggle NETIF_F_UDP_GSO_l4 here?
> >
> > No, we do not, because this indicates only the fact that the guest can
> > send large UDP packets and have them splitted to UDP segments.
>
>
> Actually the reverse. The set_offload() controls the tuntap TX path
> (guest RX path).

The set_offloads does 2 things:
1. At the initialization time qemu probes set_offload(something) to
check which features are supported by TAP/TUN.
2. Later it configures the guest RX path according to guest's needs/capabilities
Typical initialization sequence is (in case the QEMU supports USO feature):
TAP/TUN set offload 11 (probe for UFO support)
TAP/TUN set offload 21 (probe for USO support)
TAP/TUN set offload 0
...
TAP/TUN set offload 7 (configuration of offloads according to GUEST features)

This series of patches is for VIRTIO_NET_F_HOST_USO only, virtio-net
features like VIRTIO_NET_F_GUEST_USO_(4/6/whatever) are not defined in
the spec yet.

>
> When VIRTIO_NET_F_GUEST_XXX was not negotiated, the corresponding netdev
> features needs to be disabled. When host tries to send those packets to
> guest, it needs to do software segmentation.
>
> See virtio_net_apply_guest_offloads().
>
> There's currently no way (or not need) to prevent tuntap from receiving
> GSO packets.
>
> Thanks
>
>
> >
> >> And how about macvtap?
> > We will check how to do that for macvtap. We will send a separate
> > patch for macvtap or ask for advice.
> >
> >> Thanks
> >>
> >>
> >>>}
> >>>
> >>>/* This gives the user a way to test for new features in future by
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 4/4] tun: indicate support for USO feature

2021-05-11 Thread Yuri Benditovich

On Tue, May 11, 2021 at 11:33 AM Yuri Benditovich
 wrote:
>
> On Tue, May 11, 2021 at 9:50 AM Jason Wang  wrote:
> >
> >
> > 在 2021/5/11 下午12:42, Yuri Benditovich 写道:
> > > Signed-off-by: Yuri Benditovich 
> > > ---
> > >   drivers/net/tun.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > > index 84f832806313..a35054f9d941 100644
> > > --- a/drivers/net/tun.c
> > > +++ b/drivers/net/tun.c
> > > @@ -2812,7 +2812,7 @@ static int set_offload(struct tun_struct *tun, 
> > > unsigned long arg)
> > >   arg &= ~(TUN_F_TSO4|TUN_F_TSO6);
> > >   }
> > >
> > > - arg &= ~TUN_F_UFO;
> > > + arg &= ~(TUN_F_UFO|TUN_F_USO);
> >
> >
> > It looks to me kernel doesn't use "USO", so TUN_F_UDP_GSO_L4 is a better
> > name for this
>
> No problem, I can change it in v2
>
>  and I guess we should toggle NETIF_F_UDP_GSO_l4 here?
>
> No, we do not, because this indicates only the fact that the guest can
> send large UDP packets and have them splitted to UDP segments.
>
> >
> > And how about macvtap?
>
> We will check how to do that for macvtap. We will send a separate
> patch for macvtap or ask for advice.
>

I'll add this feature to the tap.c also (AFAIU this will enable the
USO for macvtap).
Please correct me if I'm mistaken.

> >
> > Thanks
> >
> >
> > >   }
> > >
> > >   /* This gives the user a way to test for new features in future by
> >
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 1/4] virtio-net: add definitions for host USO feature

2021-05-11 Thread Yuri Benditovich

On Tue, May 11, 2021 at 11:24 AM Jason Wang  wrote:
>
> On Tue, May 11, 2021 at 4:12 PM Yuri Benditovich
>  wrote:
> >
> > On Tue, May 11, 2021 at 9:47 AM Jason Wang  wrote:
> > >
> > >
> > > 在 2021/5/11 下午12:42, Yuri Benditovich 写道:
> > > > Define feature bit and GSO type according to the VIRTIO
> > > > specification.
> > > >
> > > > Signed-off-by: Yuri Benditovich 
> > > > ---
> > > >   include/uapi/linux/virtio_net.h | 2 ++
> > > >   1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/include/uapi/linux/virtio_net.h 
> > > > b/include/uapi/linux/virtio_net.h
> > > > index 3f55a4215f11..a556ac735d7f 100644
> > > > --- a/include/uapi/linux/virtio_net.h
> > > > +++ b/include/uapi/linux/virtio_net.h
> > > > @@ -57,6 +57,7 @@
> > > >* Steering */
> > > >   #define VIRTIO_NET_F_CTRL_MAC_ADDR 23   /* Set MAC address */
> > > >
> > > > +#define VIRTIO_NET_F_HOST_USO 56 /* Host can handle USO packets */
> >
> > This is the virtio-net feature
>
> Right, I miss this part.
>
> >
> > > >   #define VIRTIO_NET_F_HASH_REPORT  57/* Supports hash report */
> > > >   #define VIRTIO_NET_F_RSS  60/* Supports RSS RX steering */
> > > >   #define VIRTIO_NET_F_RSC_EXT  61/* extended coalescing 
> > > > info */
> > > > @@ -130,6 +131,7 @@ struct virtio_net_hdr_v1 {
> > > >   #define VIRTIO_NET_HDR_GSO_TCPV41   /* GSO frame, IPv4 TCP 
> > > > (TSO) */
> > > >   #define VIRTIO_NET_HDR_GSO_UDP  3   /* GSO frame, 
> > > > IPv4 UDP (UFO) */
> > > >   #define VIRTIO_NET_HDR_GSO_TCPV64   /* GSO frame, IPv6 TCP */
> > > > +#define VIRTIO_NET_HDR_GSO_UDP_L45   /* GSO frame, IPv4 UDP 
> > > > (USO) */
> >
> > This is respective GSO type
> >
> > >
> > >
> > > This is the gso_type not the feature actually.
> > >
> > > I wonder what's the reason for not
> > >
> > > 1) introducing a dedicated virtio-net feature bit for this
> > > (VIRTIO_NET_F_GUEST_GSO_UDP_L4.
> >
> > This series is not for GUEST's feature, it is only for host feature.
> >
> > > 2) toggle the NETIF_F_GSO_UDP_L4  feature for tuntap based on the
> > > negotiated feature.
> >
> > The NETIF_F_GSO_UDP_L4 would be required for the guest RX path.
> > The guest TX path does not require any flags to be propagated, it only
> > allows the guest to transmit large UDP packets and have them
> > automatically splitted.
> > (This is similar to HOST_UFO but does packet segmentation instead of
> > fragmentation. GUEST_UFO indeed requires a respective NETIF flag, as
> > it is unclear whether the guest is capable of receiving such packets).
>
> So I think it's better to implement TX/RX in the same series unless
> there's something missed:
>
> For Guest TX, NETIF_F_GSO_UDP_L4 needs to be enabled in the guest
> virtio-net only when VIRTIO_NET_F_HOST_USO is negotiated.

I understand that this is what should be done when this feature will
be added to Linux virtio-net driver.
But at the moment we do not have enough resources to work on it.
Currently we have a clear use case and ability to test in on Windows guest.
Respective QEMU changes are pending for kernel patches, current
reference is https://github.com/daynix/qemu/tree/uso

> For guest RX, NETIF_F_GSO_UDP_L4 needs to be enabled on the host
> tuntap only when VIRTIO_NET_F_GUEST_USO is neogiated.

Currently we are not able to use guest RX UDP GSO.
In order to do that we at least should be able to build our Windows
drivers with the most updated driver development kit (2004+).
At the moment we can't, this task is in a plan but can take several
months. So we do not have a test/use case with Windows VM.


>
> Thanks
>
> >
> > >
> > > Thanks
> > >
> > >
> > > >   #define VIRTIO_NET_HDR_GSO_ECN  0x80/* TCP has ECN 
> > > > set */
> > > >   __u8 gso_type;
> > > >   __virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
> > >
> >
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 4/4] tun: indicate support for USO feature

2021-05-11 Thread Yuri Benditovich

On Tue, May 11, 2021 at 9:50 AM Jason Wang  wrote:
>
>
> 在 2021/5/11 下午12:42, Yuri Benditovich 写道:
> > Signed-off-by: Yuri Benditovich 
> > ---
> >   drivers/net/tun.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > index 84f832806313..a35054f9d941 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -2812,7 +2812,7 @@ static int set_offload(struct tun_struct *tun, 
> > unsigned long arg)
> >   arg &= ~(TUN_F_TSO4|TUN_F_TSO6);
> >   }
> >
> > - arg &= ~TUN_F_UFO;
> > + arg &= ~(TUN_F_UFO|TUN_F_USO);
>
>
> It looks to me kernel doesn't use "USO", so TUN_F_UDP_GSO_L4 is a better
> name for this

No problem, I can change it in v2

 and I guess we should toggle NETIF_F_UDP_GSO_l4 here?

No, we do not, because this indicates only the fact that the guest can
send large UDP packets and have them splitted to UDP segments.

>
> And how about macvtap?

We will check how to do that for macvtap. We will send a separate
patch for macvtap or ask for advice.

>
> Thanks
>
>
> >   }
> >
> >   /* This gives the user a way to test for new features in future by
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 2/4] virtio-net: add support of UDP segmentation (USO) on the host

2021-05-11 Thread Yuri Benditovich

On Tue, May 11, 2021 at 9:47 AM Jason Wang  wrote:
>
>
> 在 2021/5/11 下午12:42, Yuri Benditovich 写道:
> > Large UDP packet provided by the guest with GSO type set to
> > VIRTIO_NET_HDR_GSO_UDP_L4 will be divided to several UDP
> > packets according to the gso_size field.
> >
> > Signed-off-by: Yuri Benditovich 
> > ---
> >   include/linux/virtio_net.h | 5 +
> >   1 file changed, 5 insertions(+)
> >
> > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > index b465f8f3e554..4ecf9a1ca912 100644
> > --- a/include/linux/virtio_net.h
> > +++ b/include/linux/virtio_net.h
> > @@ -51,6 +51,11 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff 
> > *skb,
> >   ip_proto = IPPROTO_UDP;
> >   thlen = sizeof(struct udphdr);
> >   break;
> > + case VIRTIO_NET_HDR_GSO_UDP_L4:
> > + gso_type = SKB_GSO_UDP_L4;
> > + ip_proto = IPPROTO_UDP;
> > + thlen = sizeof(struct udphdr);
> > + break;
>
>
> This is only for rx, how about tx?

In terms of the guest this is only for TX.
Guest RX is a different thing, this is actually coalescing of
segmented UDP packets into a large one.
This feature is not defined in the virtio spec yet and the support of
it first of all depends on the OS.
For example: TCP LSO (guest TX) is supported almost by all the
versions of Windows.
TCP RSC (coalescing of TCP segments) is supported by Win 8 / Server 2012 and up.
UDP segmentation is supported by Windows kernels 1903+
UDP coalescing is defined by Windows kernels 2004+ and not supported
by the driver yet.

>
> Thanks
>
>
>
> >   default:
> >   return -EINVAL;
> >   }
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 1/4] virtio-net: add definitions for host USO feature

2021-05-11 Thread Yuri Benditovich

On Tue, May 11, 2021 at 9:47 AM Jason Wang  wrote:
>
>
> 在 2021/5/11 下午12:42, Yuri Benditovich 写道:
> > Define feature bit and GSO type according to the VIRTIO
> > specification.
> >
> > Signed-off-by: Yuri Benditovich 
> > ---
> >   include/uapi/linux/virtio_net.h | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/include/uapi/linux/virtio_net.h 
> > b/include/uapi/linux/virtio_net.h
> > index 3f55a4215f11..a556ac735d7f 100644
> > --- a/include/uapi/linux/virtio_net.h
> > +++ b/include/uapi/linux/virtio_net.h
> > @@ -57,6 +57,7 @@
> >* Steering */
> >   #define VIRTIO_NET_F_CTRL_MAC_ADDR 23   /* Set MAC address */
> >
> > +#define VIRTIO_NET_F_HOST_USO 56 /* Host can handle USO packets */

This is the virtio-net feature

> >   #define VIRTIO_NET_F_HASH_REPORT  57/* Supports hash report */
> >   #define VIRTIO_NET_F_RSS  60/* Supports RSS RX steering */
> >   #define VIRTIO_NET_F_RSC_EXT  61/* extended coalescing info */
> > @@ -130,6 +131,7 @@ struct virtio_net_hdr_v1 {
> >   #define VIRTIO_NET_HDR_GSO_TCPV41   /* GSO frame, IPv4 TCP (TSO) 
> > */
> >   #define VIRTIO_NET_HDR_GSO_UDP  3   /* GSO frame, IPv4 
> > UDP (UFO) */
> >   #define VIRTIO_NET_HDR_GSO_TCPV64   /* GSO frame, IPv6 TCP */
> > +#define VIRTIO_NET_HDR_GSO_UDP_L45   /* GSO frame, IPv4 UDP (USO) 
> > */

This is respective GSO type

>
>
> This is the gso_type not the feature actually.
>
> I wonder what's the reason for not
>
> 1) introducing a dedicated virtio-net feature bit for this
> (VIRTIO_NET_F_GUEST_GSO_UDP_L4.

This series is not for GUEST's feature, it is only for host feature.

> 2) toggle the NETIF_F_GSO_UDP_L4  feature for tuntap based on the
> negotiated feature.

The NETIF_F_GSO_UDP_L4 would be required for the guest RX path.
The guest TX path does not require any flags to be propagated, it only
allows the guest to transmit large UDP packets and have them
automatically splitted.
(This is similar to HOST_UFO but does packet segmentation instead of
fragmentation. GUEST_UFO indeed requires a respective NETIF flag, as
it is unclear whether the guest is capable of receiving such packets).

>
> Thanks
>
>
> >   #define VIRTIO_NET_HDR_GSO_ECN  0x80/* TCP has ECN set */
> >   __u8 gso_type;
> >   __virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 4/4] tun: indicate support for USO feature

2021-05-10 Thread Yuri Benditovich

Signed-off-by: Yuri Benditovich 
---
 drivers/net/tun.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 84f832806313..a35054f9d941 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2812,7 +2812,7 @@ static int set_offload(struct tun_struct *tun, unsigned 
long arg)
arg &= ~(TUN_F_TSO4|TUN_F_TSO6);
}
 
-   arg &= ~TUN_F_UFO;
+   arg &= ~(TUN_F_UFO|TUN_F_USO);
}
 
/* This gives the user a way to test for new features in future by
-- 
2.26.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 1/4] virtio-net: add definitions for host USO feature

2021-05-10 Thread Yuri Benditovich

Define feature bit and GSO type according to the VIRTIO
specification.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 3f55a4215f11..a556ac735d7f 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_HOST_USO 56   /* Host can handle USO packets */
 #define VIRTIO_NET_F_HASH_REPORT  57   /* Supports hash report */
 #define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
@@ -130,6 +131,7 @@ struct virtio_net_hdr_v1 {
 #define VIRTIO_NET_HDR_GSO_TCPV4   1   /* GSO frame, IPv4 TCP (TSO) */
 #define VIRTIO_NET_HDR_GSO_UDP 3   /* GSO frame, IPv4 UDP (UFO) */
 #define VIRTIO_NET_HDR_GSO_TCPV6   4   /* GSO frame, IPv6 TCP */
+#define VIRTIO_NET_HDR_GSO_UDP_L4  5   /* GSO frame, IPv4 UDP (USO) */
 #define VIRTIO_NET_HDR_GSO_ECN 0x80/* TCP has ECN set */
__u8 gso_type;
__virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
-- 
2.26.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 3/4] tun: define feature bit for USO support

2021-05-10 Thread Yuri Benditovich

User mode software can probe this bit to check whether the
USO feature is supported by TUN/TAP device.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/if_tun.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 454ae31b93c7..24f246920dd5 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -88,6 +88,7 @@
 #define TUN_F_TSO6 0x04/* I can handle TSO for IPv6 packets */
 #define TUN_F_TSO_ECN  0x08/* I can handle TSO with ECN bits. */
 #define TUN_F_UFO  0x10/* I can handle UFO packets */
+#define TUN_F_USO  0x20/* I can handle USO packets */
 
 /* Protocol info prepended to the packets (when IFF_NO_PI is not set) */
 #define TUN_PKT_STRIP  0x0001
-- 
2.26.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 0/4] Add host USO support to TUN device

2021-05-10 Thread Yuri Benditovich

This series adds support for UDP segmentation offload feature
in TUN device according to the VIRTIO specification

Yuri Benditovich (4):
  virtio-net: add definitions for host USO feature
  virtio-net: add support of UDP segmentation (USO) on the host
  tun: define feature bit for USO support
  tun: indicate support for USO feature

 drivers/net/tun.c   | 2 +-
 include/linux/virtio_net.h  | 5 +
 include/uapi/linux/if_tun.h | 1 +
 include/uapi/linux/virtio_net.h | 2 ++
 4 files changed, 9 insertions(+), 1 deletion(-)

-- 
2.26.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 3/7] tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type

2021-01-24 Thread Yuri Benditovich

On Wed, Jan 20, 2021 at 8:45 PM Alexei Starovoitov
 wrote:
>
> On Tue, Jan 12, 2021 at 12:55 PM Yuri Benditovich
>  wrote:
> >
> > On Tue, Jan 12, 2021 at 10:40 PM Yuri Benditovich
> >  wrote:
> > >
> > > On Tue, Jan 12, 2021 at 9:42 PM Yuri Benditovich
> > >  wrote:
> > > >
> > > > This program type can set skb hash value. It will be useful
> > > > when the tun will support hash reporting feature if virtio-net.
> > > >
> > > > Signed-off-by: Yuri Benditovich 
> > > > ---
> > > >  drivers/net/tun.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > > > index 7959b5c2d11f..455f7afc1f36 100644
> > > > --- a/drivers/net/tun.c
> > > > +++ b/drivers/net/tun.c
> > > > @@ -2981,6 +2981,8 @@ static int tun_set_ebpf(struct tun_struct *tun, 
> > > > struct tun_prog __rcu **prog_p,
> > > > prog = NULL;
> > > > } else {
> > > > prog = bpf_prog_get_type(fd, 
> > > > BPF_PROG_TYPE_SOCKET_FILTER);
> > > > +   if (IS_ERR(prog))
> > > > +   prog = bpf_prog_get_type(fd, 
> > > > BPF_PROG_TYPE_SCHED_CLS);
> > > > if (IS_ERR(prog))
> > > > return PTR_ERR(prog);
> > > > }
> > >
> > > Comment from Alexei Starovoitov:
> > > Patches 1 and 2 are missing for me, so I couldn't review properly,
> > > but this diff looks odd.
> > > It allows sched_cls prog type to attach to tun.
> > > That means everything that sched_cls progs can do will be done from tun 
> > > hook?
> >
> > We do not have an intention to modify the packet in this steering eBPF.
>
> The intent is irrelevant. Using SCHED_CLS here will let users modify the 
> packet
> and some users will do so. Hence the tun code has to support it.
>
> > There is just one function that unavailable for BPF_PROG_TYPE_SOCKET_FILTER
> > that the eBPF needs to make possible to deliver the hash to the guest
> > VM - it is 'bpf_set_hash'
> >
> > Does it mean that we need to define a new eBPF type for socket filter
> > operations + set_hash?
> >
> > Our problem is that the eBPF calculates 32-bit hash, 16-bit queue
> > index and 8-bit of hash type.
> > But it is able to return only 32-bit integer, so in this set of
> > patches the eBPF returns
> > queue index and hash type and saves the hash in skb->hash using 
> > bpf_set_hash().
>
> bpf prog can only return a 32-bit integer. That's true.
> But the prog can use helpers to set any number of bits and variables.
> bpf_set_hash_v2() with hash, queue and index arguments could fit this purpose,
> but if you allow it for SCHED_CLS type,

Do I understand correctly that this means:
1. Creation of new helper like
https://lists.linuxfoundation.org/pipermail/bridge/2020-July/013036.html
2. Validation on tun side that the BPF uses only limited subset of
helpers available for SCHED_CLS

> tc side of the code should be ready to deal with that too and this extended
> helper should be meaningful for both tc and tun.
>
> In general if the purpose of the prog is to compute three values they better 
> be
> grouped together. Returned two of them via ORed 32-bit integer and
> returning 32-bit via bpf_set_hash is an awkward api.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 3/7] tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type

2021-01-18 Thread Yuri Benditovich

Hello Alexei,

Can you please answer the questions in the last email of this thread?
Your comment will be extremely helpful.

Thanks

On Tue, Jan 12, 2021 at 10:55 PM Yuri Benditovich
 wrote:
>
> On Tue, Jan 12, 2021 at 10:40 PM Yuri Benditovich
>  wrote:
> >
> > On Tue, Jan 12, 2021 at 9:42 PM Yuri Benditovich
> >  wrote:
> > >
> > > This program type can set skb hash value. It will be useful
> > > when the tun will support hash reporting feature if virtio-net.
> > >
> > > Signed-off-by: Yuri Benditovich 
> > > ---
> > >  drivers/net/tun.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > > index 7959b5c2d11f..455f7afc1f36 100644
> > > --- a/drivers/net/tun.c
> > > +++ b/drivers/net/tun.c
> > > @@ -2981,6 +2981,8 @@ static int tun_set_ebpf(struct tun_struct *tun, 
> > > struct tun_prog __rcu **prog_p,
> > > prog = NULL;
> > > } else {
> > > prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SOCKET_FILTER);
> > > +   if (IS_ERR(prog))
> > > +   prog = bpf_prog_get_type(fd, 
> > > BPF_PROG_TYPE_SCHED_CLS);
> > > if (IS_ERR(prog))
> > > return PTR_ERR(prog);
> > > }
> >
> > Comment from Alexei Starovoitov:
> > Patches 1 and 2 are missing for me, so I couldn't review properly,
> > but this diff looks odd.
> > It allows sched_cls prog type to attach to tun.
> > That means everything that sched_cls progs can do will be done from tun 
> > hook?
>
> We do not have an intention to modify the packet in this steering eBPF.
> There is just one function that unavailable for BPF_PROG_TYPE_SOCKET_FILTER
> that the eBPF needs to make possible to deliver the hash to the guest
> VM - it is 'bpf_set_hash'
>
> Does it mean that we need to define a new eBPF type for socket filter
> operations + set_hash?
>
> Our problem is that the eBPF calculates 32-bit hash, 16-bit queue
> index and 8-bit of hash type.
> But it is able to return only 32-bit integer, so in this set of
> patches the eBPF returns
> queue index and hash type and saves the hash in skb->hash using 
> bpf_set_hash().
>
> If this is unacceptable, can you please recommend a better solution?
>
> > sched_cls assumes l2 and can modify the packet.
>
> The steering eBPF in TUN module also assumes l2.
>
> > I think crashes are inevitable.
> >
> > > --
> > > 2.17.1
> > >
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 0/7] Support for virtio-net hash reporting

2021-01-18 Thread Yuri Benditovich

On Mon, Jan 18, 2021 at 4:46 AM Jason Wang  wrote:
>
>
> On 2021/1/17 下午3:57, Yuri Benditovich wrote:
> > On Thu, Jan 14, 2021 at 5:39 AM Jason Wang  wrote:
> >>
> >> On 2021/1/13 下午10:33, Willem de Bruijn wrote:
> >>> On Tue, Jan 12, 2021 at 11:11 PM Jason Wang  wrote:
> >>>> On 2021/1/13 上午7:47, Willem de Bruijn wrote:
> >>>>> On Tue, Jan 12, 2021 at 3:29 PM Yuri Benditovich
> >>>>>  wrote:
> >>>>>> On Tue, Jan 12, 2021 at 9:49 PM Yuri Benditovich
> >>>>>>  wrote:
> >>>>>>> On Tue, Jan 12, 2021 at 9:41 PM Yuri Benditovich
> >>>>>>>  wrote:
> >>>>>>>> Existing TUN module is able to use provided "steering eBPF" to
> >>>>>>>> calculate per-packet hash and derive the destination queue to
> >>>>>>>> place the packet to. The eBPF uses mapped configuration data
> >>>>>>>> containing a key for hash calculation and indirection table
> >>>>>>>> with array of queues' indices.
> >>>>>>>>
> >>>>>>>> This series of patches adds support for virtio-net hash reporting
> >>>>>>>> feature as defined in virtio specification. It extends the TUN module
> >>>>>>>> and the "steering eBPF" as follows:
> >>>>>>>>
> >>>>>>>> Extended steering eBPF calculates the hash value and hash type, keeps
> >>>>>>>> hash value in the skb->hash and returns index of destination 
> >>>>>>>> virtqueue
> >>>>>>>> and the type of the hash. TUN module keeps returned hash type in
> >>>>>>>> (currently unused) field of the skb.
> >>>>>>>> skb->__unused renamed to 'hash_report_type'.
> >>>>>>>>
> >>>>>>>> When TUN module is called later to allocate and fill the virtio-net
> >>>>>>>> header and push it to destination virtqueue it populates the hash
> >>>>>>>> and the hash type into virtio-net header.
> >>>>>>>>
> >>>>>>>> VHOST driver is made aware of respective virtio-net feature that
> >>>>>>>> extends the virtio-net header to report the hash value and hash 
> >>>>>>>> report
> >>>>>>>> type.
> >>>>>>> Comment from Willem de Bruijn:
> >>>>>>>
> >>>>>>> Skbuff fields are in short supply. I don't think we need to add one
> >>>>>>> just for this narrow path entirely internal to the tun device.
> >>>>>>>
> >>>>>> We understand that and try to minimize the impact by using an already
> >>>>>> existing unused field of skb.
> >>>>> Not anymore. It was repurposed as a flags field very recently.
> >>>>>
> >>>>> This use case is also very narrow in scope. And a very short path from
> >>>>> data producer to consumer. So I don't think it needs to claim scarce
> >>>>> bits in the skb.
> >>>>>
> >>>>> tun_ebpf_select_queue stores the field, tun_put_user reads it and
> >>>>> converts it to the virtio_net_hdr in the descriptor.
> >>>>>
> >>>>> tun_ebpf_select_queue is called from .ndo_select_queue.  Storing the
> >>>>> field in skb->cb is fragile, as in theory some code could overwrite
> >>>>> that between field between ndo_select_queue and
> >>>>> ndo_start_xmit/tun_net_xmit, from which point it is fully under tun
> >>>>> control again. But in practice, I don't believe anything does.
> >>>>>
> >>>>> Alternatively an existing skb field that is used only on disjoint
> >>>>> datapaths, such as ingress-only, could be viable.
> >>>> A question here. We had metadata support in XDP for cooperation between
> >>>> eBPF programs. Do we have something similar in the skb?
> >>>>
> >>>> E.g in the RSS, if we want to pass some metadata information between
> >>>> eBPF program and the logic that generates the vnet header (either hard
> >>>> logic in the kernel or another eBPF program). Is there any way that can
> >>>> avoid the possible

Re: [RFC PATCH 0/7] Support for virtio-net hash reporting

2021-01-16 Thread Yuri Benditovich

On Thu, Jan 14, 2021 at 5:39 AM Jason Wang  wrote:
>
>
> On 2021/1/13 下午10:33, Willem de Bruijn wrote:
> > On Tue, Jan 12, 2021 at 11:11 PM Jason Wang  wrote:
> >>
> >> On 2021/1/13 上午7:47, Willem de Bruijn wrote:
> >>> On Tue, Jan 12, 2021 at 3:29 PM Yuri Benditovich
> >>>  wrote:
> >>>> On Tue, Jan 12, 2021 at 9:49 PM Yuri Benditovich
> >>>>  wrote:
> >>>>> On Tue, Jan 12, 2021 at 9:41 PM Yuri Benditovich
> >>>>>  wrote:
> >>>>>> Existing TUN module is able to use provided "steering eBPF" to
> >>>>>> calculate per-packet hash and derive the destination queue to
> >>>>>> place the packet to. The eBPF uses mapped configuration data
> >>>>>> containing a key for hash calculation and indirection table
> >>>>>> with array of queues' indices.
> >>>>>>
> >>>>>> This series of patches adds support for virtio-net hash reporting
> >>>>>> feature as defined in virtio specification. It extends the TUN module
> >>>>>> and the "steering eBPF" as follows:
> >>>>>>
> >>>>>> Extended steering eBPF calculates the hash value and hash type, keeps
> >>>>>> hash value in the skb->hash and returns index of destination virtqueue
> >>>>>> and the type of the hash. TUN module keeps returned hash type in
> >>>>>> (currently unused) field of the skb.
> >>>>>> skb->__unused renamed to 'hash_report_type'.
> >>>>>>
> >>>>>> When TUN module is called later to allocate and fill the virtio-net
> >>>>>> header and push it to destination virtqueue it populates the hash
> >>>>>> and the hash type into virtio-net header.
> >>>>>>
> >>>>>> VHOST driver is made aware of respective virtio-net feature that
> >>>>>> extends the virtio-net header to report the hash value and hash report
> >>>>>> type.
> >>>>> Comment from Willem de Bruijn:
> >>>>>
> >>>>> Skbuff fields are in short supply. I don't think we need to add one
> >>>>> just for this narrow path entirely internal to the tun device.
> >>>>>
> >>>> We understand that and try to minimize the impact by using an already
> >>>> existing unused field of skb.
> >>> Not anymore. It was repurposed as a flags field very recently.
> >>>
> >>> This use case is also very narrow in scope. And a very short path from
> >>> data producer to consumer. So I don't think it needs to claim scarce
> >>> bits in the skb.
> >>>
> >>> tun_ebpf_select_queue stores the field, tun_put_user reads it and
> >>> converts it to the virtio_net_hdr in the descriptor.
> >>>
> >>> tun_ebpf_select_queue is called from .ndo_select_queue.  Storing the
> >>> field in skb->cb is fragile, as in theory some code could overwrite
> >>> that between field between ndo_select_queue and
> >>> ndo_start_xmit/tun_net_xmit, from which point it is fully under tun
> >>> control again. But in practice, I don't believe anything does.
> >>>
> >>> Alternatively an existing skb field that is used only on disjoint
> >>> datapaths, such as ingress-only, could be viable.
> >>
> >> A question here. We had metadata support in XDP for cooperation between
> >> eBPF programs. Do we have something similar in the skb?
> >>
> >> E.g in the RSS, if we want to pass some metadata information between
> >> eBPF program and the logic that generates the vnet header (either hard
> >> logic in the kernel or another eBPF program). Is there any way that can
> >> avoid the possible conflicts of qdiscs?
> > Not that I am aware of. The closest thing is cb[].
> >
> > It'll have to aliase a field like that, that is known unused for the given 
> > path.
>
>
> Right, we need to make sure cb is not used by other ones. I'm not sure
> how hard to achieve that consider Qemu installs the eBPF program but it
> doesn't deal with networking configurations.
>
>
> >
> > One other approach that has been used within linear call stacks is out
> > of band. Like percpu variables softnet_data.xmit.more and
> > mirred_rec_level. But that is perhaps a bit overwrought for this use
> > case.
>
>
> Ye

Re: [RFC PATCH 3/7] tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type

2021-01-12 Thread Yuri Benditovich

On Tue, Jan 12, 2021 at 10:40 PM Yuri Benditovich
 wrote:
>
> On Tue, Jan 12, 2021 at 9:42 PM Yuri Benditovich
>  wrote:
> >
> > This program type can set skb hash value. It will be useful
> > when the tun will support hash reporting feature if virtio-net.
> >
> > Signed-off-by: Yuri Benditovich 
> > ---
> >  drivers/net/tun.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > index 7959b5c2d11f..455f7afc1f36 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -2981,6 +2981,8 @@ static int tun_set_ebpf(struct tun_struct *tun, 
> > struct tun_prog __rcu **prog_p,
> > prog = NULL;
> > } else {
> > prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SOCKET_FILTER);
> > +   if (IS_ERR(prog))
> > +   prog = bpf_prog_get_type(fd, 
> > BPF_PROG_TYPE_SCHED_CLS);
> > if (IS_ERR(prog))
> > return PTR_ERR(prog);
> > }
>
> Comment from Alexei Starovoitov:
> Patches 1 and 2 are missing for me, so I couldn't review properly,
> but this diff looks odd.
> It allows sched_cls prog type to attach to tun.
> That means everything that sched_cls progs can do will be done from tun hook?

We do not have an intention to modify the packet in this steering eBPF.
There is just one function that unavailable for BPF_PROG_TYPE_SOCKET_FILTER
that the eBPF needs to make possible to deliver the hash to the guest
VM - it is 'bpf_set_hash'

Does it mean that we need to define a new eBPF type for socket filter
operations + set_hash?

Our problem is that the eBPF calculates 32-bit hash, 16-bit queue
index and 8-bit of hash type.
But it is able to return only 32-bit integer, so in this set of
patches the eBPF returns
queue index and hash type and saves the hash in skb->hash using bpf_set_hash().

If this is unacceptable, can you please recommend a better solution?

> sched_cls assumes l2 and can modify the packet.

The steering eBPF in TUN module also assumes l2.

> I think crashes are inevitable.
>
> > --
> > 2.17.1
> >
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 3/7] tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type

2021-01-12 Thread Yuri Benditovich

On Tue, Jan 12, 2021 at 9:42 PM Yuri Benditovich
 wrote:
>
> This program type can set skb hash value. It will be useful
> when the tun will support hash reporting feature if virtio-net.
>
> Signed-off-by: Yuri Benditovich 
> ---
>  drivers/net/tun.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 7959b5c2d11f..455f7afc1f36 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -2981,6 +2981,8 @@ static int tun_set_ebpf(struct tun_struct *tun, struct 
> tun_prog __rcu **prog_p,
> prog = NULL;
> } else {
> prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SOCKET_FILTER);
> +   if (IS_ERR(prog))
> +   prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SCHED_CLS);
> if (IS_ERR(prog))
> return PTR_ERR(prog);
> }

Comment from Alexei Starovoitov:
Patches 1 and 2 are missing for me, so I couldn't review properly,
but this diff looks odd.
It allows sched_cls prog type to attach to tun.
That means everything that sched_cls progs can do will be done from tun hook?
sched_cls assumes l2 and can modify the packet.
I think crashes are inevitable.

> --
> 2.17.1
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 0/7] Support for virtio-net hash reporting

2021-01-12 Thread Yuri Benditovich

On Tue, Jan 12, 2021 at 9:49 PM Yuri Benditovich
 wrote:
>
> On Tue, Jan 12, 2021 at 9:41 PM Yuri Benditovich
>  wrote:
> >
> > Existing TUN module is able to use provided "steering eBPF" to
> > calculate per-packet hash and derive the destination queue to
> > place the packet to. The eBPF uses mapped configuration data
> > containing a key for hash calculation and indirection table
> > with array of queues' indices.
> >
> > This series of patches adds support for virtio-net hash reporting
> > feature as defined in virtio specification. It extends the TUN module
> > and the "steering eBPF" as follows:
> >
> > Extended steering eBPF calculates the hash value and hash type, keeps
> > hash value in the skb->hash and returns index of destination virtqueue
> > and the type of the hash. TUN module keeps returned hash type in
> > (currently unused) field of the skb.
> > skb->__unused renamed to 'hash_report_type'.
> >
> > When TUN module is called later to allocate and fill the virtio-net
> > header and push it to destination virtqueue it populates the hash
> > and the hash type into virtio-net header.
> >
> > VHOST driver is made aware of respective virtio-net feature that
> > extends the virtio-net header to report the hash value and hash report
> > type.
>
> Comment from Willem de Bruijn:
>
> Skbuff fields are in short supply. I don't think we need to add one
> just for this narrow path entirely internal to the tun device.
>

We understand that and try to minimize the impact by using an already
existing unused field of skb.


> Instead, you could just run the flow_dissector in tun_put_user if the
> feature is negotiated. Indeed, the flow dissector seems more apt to me
> than BPF here. Note that the flow dissector internally can be
> overridden by a BPF program if the admin so chooses.
>
When this set of patches is related to hash delivery in the virtio-net
packet in general,
it was prepared in context of RSS feature implementation as defined in
virtio spec [1]
In case of RSS it is not enough to run the flow_dissector in tun_put_user:
in tun_ebpf_select_queue the TUN calls eBPF to calculate the hash,
hash type and queue index
according to the (mapped) parameters (key, hash types, indirection
table) received from the guest.
Our intention is to keep the hash and hash type in the skb to populate them
into a virtio-net header later in tun_put_user.
Note that in this case the type of calculated hash is selected not
only from flow dissections
but also from limitations provided by the guest.
This is already implemented in qemu (for case of vhost=off), see [2]
(virtio_net_process_rss)
For case of vhost=on there are WIP for qemu to load eBPF and attach it to TUN.

Note that exact way of selecting rx virtqueue depends on the guest,
it could be automatic steering (typical for Linux VM), RSS (typical
for Windows VM) or
any other steering mechanism implemented in loadable TUN steering BPF with
or without hash calculation.

[1] https://github.com/oasis-tcs/virtio-spec/blob/master/content.tex#L3740
[2] https://github.com/qemu/qemu/blob/master/hw/net/virtio-net.c#L1591

> This also hits on a deeper point with the choice of hash values, that
> I also noticed in my RFC patchset to implement the inverse [1][2]. It
> is much more detailed than skb->hash + skb->l4_hash currently offers,
> and that can be gotten for free from most hardware.

Unfortunately in the case of RSS we can't get this hash from the hardware as
this requires configuration of the NIC's hardware with key and hash types for
Toeplitz hash calculation.

> In most practical
> cases, that information suffices. I added less specific fields
> VIRTIO_NET_HASH_REPORT_L4, VIRTIO_NET_HASH_REPORT_OTHER that work
> without explicit flow dissection. I understand that the existing
> fields are part of the standard. Just curious, what is their purpose
> beyond 4-tuple based flow hashing?

The hash is used in combination with the indirection table to select
destination rx virtqueue.
The hash and hash type are to be reported in virtio-net header, if requested.
For Windows VM - in case the device does not report the hash (even if
it calculated it to
schedule the packet to a proper queue), the driver must do that for each packet
(this is a certification requirement).

>
> [1] https://patchwork.kernel.org/project/netdevbpf/list/?series=406859=*
> [2] 
> https://github.com/wdebruij/linux/commit/0f77febf22cd6ffc242a575807fa8382a26e511e
> >
> > Yuri Benditovich (7):
> >   skbuff: define field for hash report type
> >   vhost: support for hash report virtio-net feature
> >   tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type
> >   tun: free bpf_program by bpf_prog_p

Re: [RFC PATCH 3/7] tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type

2021-01-12 Thread Yuri Benditovich

On Tue, Jan 12, 2021 at 9:46 PM Alexei Starovoitov
 wrote:
>
> On Tue, Jan 12, 2021 at 11:42 AM Yuri Benditovich
>  wrote:
> >
> > This program type can set skb hash value. It will be useful
> > when the tun will support hash reporting feature if virtio-net.
> >
> > Signed-off-by: Yuri Benditovich 
> > ---
> >  drivers/net/tun.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > index 7959b5c2d11f..455f7afc1f36 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -2981,6 +2981,8 @@ static int tun_set_ebpf(struct tun_struct *tun, 
> > struct tun_prog __rcu **prog_p,
> > prog = NULL;
> > } else {
> > prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SOCKET_FILTER);
> > +   if (IS_ERR(prog))
> > +   prog = bpf_prog_get_type(fd, 
> > BPF_PROG_TYPE_SCHED_CLS);
>
> You've ignored the feedback and just resend? what for?

No, I do not. Some patches did not reach relevant people at all, so I
just resent _all_ the patches to all the people.
I will copy your earlier comment to this patch and will address it in
the discussion.
Sorry for misunderstanding and some redundant noise.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 7/7] tun: report new tun feature IFF_HASH

2021-01-12 Thread Yuri Benditovich

IFF_HASH feature indicates that the tun supports
TUNSETHASHPOPULATION ioctl and can propagate the hash
data to the virtio-net packet.

Signed-off-by: Yuri Benditovich 
---
 drivers/net/tun.c   | 2 +-
 include/uapi/linux/if_tun.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 214feb0b16fb..b46aa8941a9d 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -88,7 +88,7 @@ static void tun_default_link_ksettings(struct net_device *dev,
 #define TUN_VNET_LE 0x8000
 #define TUN_VNET_BE 0x4000
 
-#define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
+#define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | IFF_HASH |\
  IFF_MULTI_QUEUE | IFF_NAPI | IFF_NAPI_FRAGS)
 
 #define GOODCOPY_LEN 128
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 0fd43533da26..116b84ede3a0 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -73,6 +73,7 @@
 #define IFF_ONE_QUEUE  0x2000
 #define IFF_VNET_HDR   0x4000
 #define IFF_TUN_EXCL   0x8000
+#define IFF_HASH   0x0080
 #define IFF_MULTI_QUEUE 0x0100
 #define IFF_ATTACH_QUEUE 0x0200
 #define IFF_DETACH_QUEUE 0x0400
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH 0/7] Support for virtio-net hash reporting

2021-01-12 Thread Yuri Benditovich

On Tue, Jan 12, 2021 at 9:41 PM Yuri Benditovich
 wrote:
>
> Existing TUN module is able to use provided "steering eBPF" to
> calculate per-packet hash and derive the destination queue to
> place the packet to. The eBPF uses mapped configuration data
> containing a key for hash calculation and indirection table
> with array of queues' indices.
>
> This series of patches adds support for virtio-net hash reporting
> feature as defined in virtio specification. It extends the TUN module
> and the "steering eBPF" as follows:
>
> Extended steering eBPF calculates the hash value and hash type, keeps
> hash value in the skb->hash and returns index of destination virtqueue
> and the type of the hash. TUN module keeps returned hash type in
> (currently unused) field of the skb.
> skb->__unused renamed to 'hash_report_type'.
>
> When TUN module is called later to allocate and fill the virtio-net
> header and push it to destination virtqueue it populates the hash
> and the hash type into virtio-net header.
>
> VHOST driver is made aware of respective virtio-net feature that
> extends the virtio-net header to report the hash value and hash report
> type.

Comment from Willem de Bruijn:

Skbuff fields are in short supply. I don't think we need to add one
just for this narrow path entirely internal to the tun device.

Instead, you could just run the flow_dissector in tun_put_user if the
feature is negotiated. Indeed, the flow dissector seems more apt to me
than BPF here. Note that the flow dissector internally can be
overridden by a BPF program if the admin so chooses.

This also hits on a deeper point with the choice of hash values, that
I also noticed in my RFC patchset to implement the inverse [1][2]. It
is much more detailed than skb->hash + skb->l4_hash currently offers,
and that can be gotten for free from most hardware. In most practical
cases, that information suffices. I added less specific fields
VIRTIO_NET_HASH_REPORT_L4, VIRTIO_NET_HASH_REPORT_OTHER that work
without explicit flow dissection. I understand that the existing
fields are part of the standard. Just curious, what is their purpose
beyond 4-tuple based flow hashing?

[1] https://patchwork.kernel.org/project/netdevbpf/list/?series=406859=*
[2] 
https://github.com/wdebruij/linux/commit/0f77febf22cd6ffc242a575807fa8382a26e511e
>
> Yuri Benditovich (7):
>   skbuff: define field for hash report type
>   vhost: support for hash report virtio-net feature
>   tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type
>   tun: free bpf_program by bpf_prog_put instead of bpf_prog_destroy
>   tun: add ioctl code TUNSETHASHPOPULATION
>   tun: populate hash in virtio-net header when needed
>   tun: report new tun feature IFF_HASH
>
>  drivers/net/tun.c   | 43 +++--
>  drivers/vhost/net.c | 37 ---
>  include/linux/skbuff.h  |  7 +-
>  include/uapi/linux/if_tun.h |  2 ++
>  4 files changed, 74 insertions(+), 15 deletions(-)
>
> --
> 2.17.1
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 4/7] tun: free bpf_program by bpf_prog_put instead of bpf_prog_destroy

2021-01-12 Thread Yuri Benditovich

The module never creates the bpf program with bpf_prog_create
so it shouldn't free it with bpf_prog_destroy.
The program is obtained by bpf_prog_get and should be freed
by bpf_prog_put. For BPF_PROG_TYPE_SOCKET_FILTER both
methods do the same but for other program types they don't.

Signed-off-by: Yuri Benditovich 
---
 drivers/net/tun.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 455f7afc1f36..18c1baf1a6c1 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2218,7 +2218,7 @@ static void tun_prog_free(struct rcu_head *rcu)
 {
struct tun_prog *prog = container_of(rcu, struct tun_prog, rcu);
 
-   bpf_prog_destroy(prog->prog);
+   bpf_prog_put(prog->prog);
kfree(prog);
 }
 
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 2/7] vhost: support for hash report virtio-net feature

2021-01-12 Thread Yuri Benditovich

According to the virtio specification if VIRTIO_NET_F_HASH_REPORT
feature acked the virtio-net header is extended to hold the hash
value and hash report type.

Signed-off-by: Yuri Benditovich 
---
 drivers/vhost/net.c | 37 +
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 531a00d703cd..31a894b9a992 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -73,7 +73,8 @@ enum {
VHOST_NET_FEATURES = VHOST_FEATURES |
 (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
 (1ULL << VIRTIO_NET_F_MRG_RXBUF) |
-(1ULL << VIRTIO_F_ACCESS_PLATFORM)
+(1ULL << VIRTIO_F_ACCESS_PLATFORM) |
+(1ULL << VIRTIO_NET_F_HASH_REPORT)
 };
 
 enum {
@@ -1108,14 +1109,16 @@ static void handle_rx(struct vhost_net *net)
.msg_controllen = 0,
.msg_flags = MSG_DONTWAIT,
};
-   struct virtio_net_hdr hdr = {
-   .flags = 0,
-   .gso_type = VIRTIO_NET_HDR_GSO_NONE
+   struct virtio_net_hdr_v1_hash hdrv1 = {
+   {
+   .flags = 0,
+   .gso_type = VIRTIO_NET_HDR_GSO_NONE
+   }
};
size_t total_len = 0;
int err, mergeable;
s16 headcount;
-   size_t vhost_hlen, sock_hlen;
+   size_t vhost_hlen, sock_hlen, extra_hlen;
size_t vhost_len, sock_len;
bool busyloop_intr = false;
struct socket *sock;
@@ -1137,9 +1140,12 @@ static void handle_rx(struct vhost_net *net)
vhost_hlen = nvq->vhost_hlen;
sock_hlen = nvq->sock_hlen;
 
+
vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
vq->log : NULL;
mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
+   extra_hlen = vhost_has_feature(vq, VIRTIO_NET_F_HASH_REPORT) ?
+   sizeof(hdrv1) - sizeof(hdrv1.hdr) : 0;
 
do {
sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
@@ -1201,8 +1207,8 @@ static void handle_rx(struct vhost_net *net)
}
/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
if (unlikely(vhost_hlen)) {
-   if (copy_to_iter(, sizeof(hdr),
-) != sizeof(hdr)) {
+   if (copy_to_iter(, sizeof(struct virtio_net_hdr),
+) != sizeof(struct 
virtio_net_hdr)) {
vq_err(vq, "Unable to write vnet_hdr "
   "at addr %p\n", vq->iov->iov_base);
goto out;
@@ -1211,7 +1217,7 @@ static void handle_rx(struct vhost_net *net)
/* Header came from socket; we'll need to patch
 * ->num_buffers over if VIRTIO_NET_F_MRG_RXBUF
 */
-   iov_iter_advance(, sizeof(hdr));
+   iov_iter_advance(, sizeof(struct virtio_net_hdr));
}
/* TODO: Should check and handle checksum. */
 
@@ -1223,6 +1229,18 @@ static void handle_rx(struct vhost_net *net)
vhost_discard_vq_desc(vq, headcount);
goto out;
}
+   if (unlikely(extra_hlen)) {
+   if (unlikely(vhost_hlen)) {
+   if (copy_to_iter(_value, extra_hlen,
+   ) != extra_hlen) {
+   vq_err(vq, "Unable to write extra_hdr "
+   "at addr %p\n", vq->iov->iov_base);
+   goto out;
+   }
+   } else {
+   iov_iter_advance(, extra_hlen);
+   }
+   }
nvq->done_idx += headcount;
if (nvq->done_idx > VHOST_NET_BATCH)
vhost_net_signal_used(nvq);
@@ -1624,6 +1642,9 @@ static int vhost_net_set_features(struct vhost_net *n, 
u64 features)
   (1ULL << VIRTIO_F_VERSION_1))) ?
sizeof(struct virtio_net_hdr_mrg_rxbuf) :
sizeof(struct virtio_net_hdr);
+   if (features & (1ULL << VIRTIO_NET_F_HASH_REPORT)) {
+   hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
+   }
if (features & (1 << VHOST_NET_F_VIRTIO_NET_HDR)) {
/* vhost provides vnet_hdr */
vhost_hlen = hdr_len;
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 3/7] tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type

2021-01-12 Thread Yuri Benditovich

This program type can set skb hash value. It will be useful
when the tun will support hash reporting feature if virtio-net.

Signed-off-by: Yuri Benditovich 
---
 drivers/net/tun.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7959b5c2d11f..455f7afc1f36 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2981,6 +2981,8 @@ static int tun_set_ebpf(struct tun_struct *tun, struct 
tun_prog __rcu **prog_p,
prog = NULL;
} else {
prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SOCKET_FILTER);
+   if (IS_ERR(prog))
+   prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SCHED_CLS);
if (IS_ERR(prog))
return PTR_ERR(prog);
}
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 6/7] tun: populate hash in virtio-net header when needed

2021-01-12 Thread Yuri Benditovich

If the BPF program populated the hash in the skb the tun
propagates the hash value and hash report type to the
respective fields of virtio-net header.

Signed-off-by: Yuri Benditovich 
---
 drivers/net/tun.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 45f4f04a4a3e..214feb0b16fb 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -556,15 +556,20 @@ static u16 tun_ebpf_select_queue(struct tun_struct *tun, 
struct sk_buff *skb)
 {
struct tun_prog *prog;
u32 numqueues;
-   u16 ret = 0;
+   u32 ret = 0;
 
numqueues = READ_ONCE(tun->numqueues);
if (!numqueues)
return 0;
 
prog = rcu_dereference(tun->steering_prog);
-   if (prog)
+   if (prog) {
ret = bpf_prog_run_clear_cb(prog->prog, skb);
+   if (tun->bpf_populates_hash) {
+   *skb_hash_report_type(skb) = (__u8)(ret >> 16);
+   ret &= 0x;
+   }
+   }
 
return ret % numqueues;
 }
@@ -2062,6 +2067,7 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 
if (vnet_hdr_sz) {
struct virtio_net_hdr gso;
+   __u16 extra_copy = 0;
 
if (iov_iter_count(iter) < vnet_hdr_sz)
return -EINVAL;
@@ -2085,7 +2091,20 @@ static ssize_t tun_put_user(struct tun_struct *tun,
if (copy_to_iter(, sizeof(gso), iter) != sizeof(gso))
return -EFAULT;
 
-   iov_iter_advance(iter, vnet_hdr_sz - sizeof(gso));
+   if (tun->bpf_populates_hash &&
+   vnet_hdr_sz >= sizeof(struct virtio_net_hdr_v1_hash)) {
+   struct virtio_net_hdr_v1_hash hdr;
+
+   hdr.hdr.num_buffers = 0;
+   hdr.hash_value = cpu_to_le32(skb_get_hash(skb));
+   hdr.hash_report = 
cpu_to_le16(*skb_hash_report_type(skb));
+   hdr.padding = 0;
+   extra_copy = sizeof(hdr) - sizeof(gso);
+   if (copy_to_iter(_buffers, extra_copy, 
iter) != extra_copy)
+   return -EFAULT;
+   }
+
+   iov_iter_advance(iter, vnet_hdr_sz - sizeof(gso) - extra_copy);
}
 
if (vlan_hlen) {
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 5/7] tun: add ioctl code TUNSETHASHPOPULATION

2021-01-12 Thread Yuri Benditovich

User mode program calls this ioctl before loading of
BPF program to inform the tun that the BPF program has
extended functionality, i.e. sets hash value and returns
the virtqueue number in the lower 16 bits and the type
of the hash report in the upper 16 bits.

Signed-off-by: Yuri Benditovich 
---
 drivers/net/tun.c   | 12 +++-
 include/uapi/linux/if_tun.h |  1 +
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 18c1baf1a6c1..45f4f04a4a3e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -197,6 +197,7 @@ struct tun_struct {
struct sock_fprog   fprog;
/* protected by rtnl lock */
boolfilter_attached;
+   boolbpf_populates_hash;
u32 msg_enable;
spinlock_t lock;
struct hlist_head flows[TUN_NUM_FLOW_ENTRIES];
@@ -2765,6 +2766,7 @@ static int tun_set_iff(struct net *net, struct file 
*file, struct ifreq *ifr)
 
tun->align = NET_SKB_PAD;
tun->filter_attached = false;
+   tun->bpf_populates_hash = false;
tun->sndbuf = tfile->socket.sk->sk_sndbuf;
tun->rx_batched = 0;
RCU_INIT_POINTER(tun->steering_prog, NULL);
@@ -2997,7 +2999,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned 
int cmd,
struct net *net = sock_net(>sk);
struct tun_struct *tun;
void __user* argp = (void __user*)arg;
-   unsigned int ifindex, carrier;
+   unsigned int ifindex, carrier, populate_hash;
struct ifreq ifr;
kuid_t owner;
kgid_t group;
@@ -3298,6 +3300,14 @@ static long __tun_chr_ioctl(struct file *file, unsigned 
int cmd,
ret = open_related_ns(>ns, get_net_ns);
break;
 
+   case TUNSETHASHPOPULATION:
+   ret = -EFAULT;
+   if (copy_from_user(_hash, argp, sizeof(populate_hash)))
+   goto unlock;
+   tun->bpf_populates_hash = !!populate_hash;
+   ret = 0;
+   break;
+
default:
ret = -EINVAL;
break;
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 454ae31b93c7..0fd43533da26 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -61,6 +61,7 @@
 #define TUNSETFILTEREBPF _IOR('T', 225, int)
 #define TUNSETCARRIER _IOW('T', 226, int)
 #define TUNGETDEVNETNS _IO('T', 227)
+#define TUNSETHASHPOPULATION _IOR('T', 228, int)
 
 /* TUNSETIFF ifr flags */
 #define IFF_TUN0x0001
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 1/7] skbuff: define field for hash report type

2021-01-12 Thread Yuri Benditovich

Used by virtio-net receive side scaling

Signed-off-by: Yuri Benditovich 
---
 include/linux/skbuff.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 416bf95cd5f2..36cf40ec0259 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -510,7 +510,7 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct 
sk_buff *skb,
  * the end of the header data, ie. at skb->end.
  */
 struct skb_shared_info {
-   __u8__unused;
+   __u8hash_report_type; /* virtio-net rss */
__u8meta_len;
__u8nr_frags;
__u8tx_flags;
@@ -1430,6 +1430,11 @@ static inline struct skb_shared_hwtstamps 
*skb_hwtstamps(struct sk_buff *skb)
return _shinfo(skb)->hwtstamps;
 }
 
+static inline __u8 *skb_hash_report_type(struct sk_buff *skb)
+{
+   return _shinfo(skb)->hash_report_type;
+}
+
 static inline struct ubuf_info *skb_zcopy(struct sk_buff *skb)
 {
bool is_zcopy = skb && skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY;
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 0/7] Support for virtio-net hash reporting

2021-01-12 Thread Yuri Benditovich

Existing TUN module is able to use provided "steering eBPF" to
calculate per-packet hash and derive the destination queue to
place the packet to. The eBPF uses mapped configuration data
containing a key for hash calculation and indirection table
with array of queues' indices.

This series of patches adds support for virtio-net hash reporting
feature as defined in virtio specification. It extends the TUN module
and the "steering eBPF" as follows:

Extended steering eBPF calculates the hash value and hash type, keeps
hash value in the skb->hash and returns index of destination virtqueue
and the type of the hash. TUN module keeps returned hash type in
(currently unused) field of the skb. 
skb->__unused renamed to 'hash_report_type'.

When TUN module is called later to allocate and fill the virtio-net
header and push it to destination virtqueue it populates the hash
and the hash type into virtio-net header.

VHOST driver is made aware of respective virtio-net feature that
extends the virtio-net header to report the hash value and hash report
type.

Yuri Benditovich (7):
  skbuff: define field for hash report type
  vhost: support for hash report virtio-net feature
  tun: allow use of BPF_PROG_TYPE_SCHED_CLS program type
  tun: free bpf_program by bpf_prog_put instead of bpf_prog_destroy
  tun: add ioctl code TUNSETHASHPOPULATION
  tun: populate hash in virtio-net header when needed
  tun: report new tun feature IFF_HASH

 drivers/net/tun.c   | 43 +++--
 drivers/vhost/net.c | 37 ---
 include/linux/skbuff.h  |  7 +-
 include/uapi/linux/if_tun.h |  2 ++
 4 files changed, 74 insertions(+), 15 deletions(-)

-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC PATCH 2/7] vhost: support for hash report virtio-net feature

2021-01-05 Thread Yuri Benditovich

According to the virtio specification if VIRTIO_NET_F_HASH_REPORT
feature acked the virtio-net header is extended to hold the hash
value and hash report type.

Signed-off-by: Yuri Benditovich 
---
 drivers/vhost/net.c | 37 +
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 531a00d703cd..31a894b9a992 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -73,7 +73,8 @@ enum {
VHOST_NET_FEATURES = VHOST_FEATURES |
 (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
 (1ULL << VIRTIO_NET_F_MRG_RXBUF) |
-(1ULL << VIRTIO_F_ACCESS_PLATFORM)
+(1ULL << VIRTIO_F_ACCESS_PLATFORM) |
+(1ULL << VIRTIO_NET_F_HASH_REPORT)
 };
 
 enum {
@@ -1108,14 +1109,16 @@ static void handle_rx(struct vhost_net *net)
.msg_controllen = 0,
.msg_flags = MSG_DONTWAIT,
};
-   struct virtio_net_hdr hdr = {
-   .flags = 0,
-   .gso_type = VIRTIO_NET_HDR_GSO_NONE
+   struct virtio_net_hdr_v1_hash hdrv1 = {
+   {
+   .flags = 0,
+   .gso_type = VIRTIO_NET_HDR_GSO_NONE
+   }
};
size_t total_len = 0;
int err, mergeable;
s16 headcount;
-   size_t vhost_hlen, sock_hlen;
+   size_t vhost_hlen, sock_hlen, extra_hlen;
size_t vhost_len, sock_len;
bool busyloop_intr = false;
struct socket *sock;
@@ -1137,9 +1140,12 @@ static void handle_rx(struct vhost_net *net)
vhost_hlen = nvq->vhost_hlen;
sock_hlen = nvq->sock_hlen;
 
+
vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
vq->log : NULL;
mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
+   extra_hlen = vhost_has_feature(vq, VIRTIO_NET_F_HASH_REPORT) ?
+   sizeof(hdrv1) - sizeof(hdrv1.hdr) : 0;
 
do {
sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
@@ -1201,8 +1207,8 @@ static void handle_rx(struct vhost_net *net)
}
/* Supply virtio_net_hdr if VHOST_NET_F_VIRTIO_NET_HDR */
if (unlikely(vhost_hlen)) {
-   if (copy_to_iter(, sizeof(hdr),
-) != sizeof(hdr)) {
+   if (copy_to_iter(, sizeof(struct virtio_net_hdr),
+) != sizeof(struct 
virtio_net_hdr)) {
vq_err(vq, "Unable to write vnet_hdr "
   "at addr %p\n", vq->iov->iov_base);
goto out;
@@ -1211,7 +1217,7 @@ static void handle_rx(struct vhost_net *net)
/* Header came from socket; we'll need to patch
 * ->num_buffers over if VIRTIO_NET_F_MRG_RXBUF
 */
-   iov_iter_advance(, sizeof(hdr));
+   iov_iter_advance(, sizeof(struct virtio_net_hdr));
}
/* TODO: Should check and handle checksum. */
 
@@ -1223,6 +1229,18 @@ static void handle_rx(struct vhost_net *net)
vhost_discard_vq_desc(vq, headcount);
goto out;
}
+   if (unlikely(extra_hlen)) {
+   if (unlikely(vhost_hlen)) {
+   if (copy_to_iter(_value, extra_hlen,
+   ) != extra_hlen) {
+   vq_err(vq, "Unable to write extra_hdr "
+   "at addr %p\n", vq->iov->iov_base);
+   goto out;
+   }
+   } else {
+   iov_iter_advance(, extra_hlen);
+   }
+   }
nvq->done_idx += headcount;
if (nvq->done_idx > VHOST_NET_BATCH)
vhost_net_signal_used(nvq);
@@ -1624,6 +1642,9 @@ static int vhost_net_set_features(struct vhost_net *n, 
u64 features)
   (1ULL << VIRTIO_F_VERSION_1))) ?
sizeof(struct virtio_net_hdr_mrg_rxbuf) :
sizeof(struct virtio_net_hdr);
+   if (features & (1ULL << VIRTIO_NET_F_HASH_REPORT)) {
+   hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
+   }
if (features & (1 << VHOST_NET_F_VIRTIO_NET_HDR)) {
/* vhost provides vnet_hdr */
vhost_hlen = hdr_len;
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v4 0/3] virtio-net: introduce features defined in the spec

2020-03-02 Thread Yuri Benditovich

This series introduce virtio-net features VIRTIO_NET_F_RSC_EXT,
VIRTIO_NET_F_RSS and VIRTIO_NET_F_HASH_REPORT.

Changes from v3: reformatted structure in patch 1

Yuri Benditovich (3):
  virtio-net: Introduce extended RSC feature
  virtio-net: Introduce RSS receive steering feature
  virtio-net: Introduce hash report feature

 include/uapi/linux/virtio_net.h | 102 ++--
 1 file changed, 98 insertions(+), 4 deletions(-)

-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v4 1/3] virtio-net: Introduce extended RSC feature

2020-03-02 Thread Yuri Benditovich

VIRTIO_NET_F_RSC_EXT feature bit indicates that the device
is able to provide extended RSC information. When the feature
is negotiatede and 'gso_type' field in received packet is not
GSO_NONE, the device reports number of coalesced packets in
'csum_start' field and number of duplicated acks in 'csum_offset'
field and sets VIRTIO_NET_HDR_F_RSC_INFO in 'flags' field.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index a3715a3224c1..6466c5979a93 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
 */
@@ -104,6 +105,7 @@ struct virtio_net_config {
 struct virtio_net_hdr_v1 {
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM1   /* Use csum_start, csum_offset 
*/
 #define VIRTIO_NET_HDR_F_DATA_VALID2   /* Csum is valid */
+#define VIRTIO_NET_HDR_F_RSC_INFO  4   /* rsc info in csum_ fields */
__u8 flags;
 #define VIRTIO_NET_HDR_GSO_NONE0   /* Not a GSO frame */
 #define VIRTIO_NET_HDR_GSO_TCPV4   1   /* GSO frame, IPv4 TCP (TSO) */
@@ -113,8 +115,26 @@ struct virtio_net_hdr_v1 {
__u8 gso_type;
__virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
__virtio16 gso_size;/* Bytes to append to hdr_len per frame */
-   __virtio16 csum_start;  /* Position to start checksumming from */
-   __virtio16 csum_offset; /* Offset after that to place checksum */
+   union {
+   struct {
+   __virtio16 csum_start;
+   __virtio16 csum_offset;
+   };
+   /* Checksum calculation */
+   struct {
+   /* Position to start checksumming from */
+   __virtio16 start;
+   /* Offset after that to place checksum */
+   __virtio16 offset;
+   } csum;
+   /* Receive Segment Coalescing */
+   struct {
+   /* Number of coalesced segments */
+   __le16 segments;
+   /* Number of duplicated acks */
+   __le16 dup_acks;
+   } rsc;
+   };
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v4 2/3] virtio-net: Introduce RSS receive steering feature

2020-03-02 Thread Yuri Benditovich

RSS (Receive-side scaling) defines hash calculation
rules and decision on receive virtqueue according to
the calculated hash, provided mask to apply and
provided indirection table containing indices of
receive virqueues. The driver sends the control
command to enable multiqueue and provide parameters
for receive steering.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 42 +++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 6466c5979a93..aec6fac3666a 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
@@ -70,6 +71,17 @@
 #define VIRTIO_NET_S_LINK_UP   1   /* Link is up */
 #define VIRTIO_NET_S_ANNOUNCE  2   /* Announcement is needed */
 
+/* supported/enabled hash types */
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv4  (1 << 0)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv4 (1 << 1)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv4 (1 << 2)
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv6  (1 << 3)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv6 (1 << 4)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv6 (1 << 5)
+#define VIRTIO_NET_RSS_HASH_TYPE_IP_EX (1 << 6)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCP_EX(1 << 7)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDP_EX(1 << 8)
+
 struct virtio_net_config {
/* The config defining mac address (if VIRTIO_NET_F_MAC) */
__u8 mac[ETH_ALEN];
@@ -93,6 +105,12 @@ struct virtio_net_config {
 * Any other value stands for unknown.
 */
__u8 duplex;
+   /* maximum size of RSS key */
+   __u8 rss_max_key_size;
+   /* maximum number of indirection table entries */
+   __le16 rss_max_indirection_table_length;
+   /* bitmask of supported VIRTIO_NET_RSS_HASH_ types */
+   __le32 supported_hash_types;
 } __attribute__((packed));
 
 /*
@@ -248,7 +266,9 @@ struct virtio_net_ctrl_mac {
 
 /*
  * Control Receive Flow Steering
- *
+ */
+#define VIRTIO_NET_CTRL_MQ   4
+/*
  * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
  * enables Receive Flow Steering, specifying the number of the transmit and
  * receive queues that will be used. After the command is consumed and acked by
@@ -261,11 +281,29 @@ struct virtio_net_ctrl_mq {
__virtio16 virtqueue_pairs;
 };
 
-#define VIRTIO_NET_CTRL_MQ   4
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET0
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN1
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX0x8000
 
+/*
+ * The command VIRTIO_NET_CTRL_MQ_RSS_CONFIG has the same effect as
+ * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET does and additionally configures
+ * the receive steering to use a hash calculated for incoming packet
+ * to decide on receive virtqueue to place the packet. The command
+ * also provides parameters to calculate a hash and receive virtqueue.
+ */
+struct virtio_net_rss_config {
+   __le32 hash_types;
+   __le16 indirection_table_mask;
+   __le16 unclassified_queue;
+   __le16 indirection_table[1/* + indirection_table_mask */];
+   __le16 max_tx_vq;
+   __u8 hash_key_length;
+   __u8 hash_key_data[/* hash_key_length */];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
+
 /*
  * Control network offloads
  *
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v4 3/3] virtio-net: Introduce hash report feature

2020-03-02 Thread Yuri Benditovich

The feature VIRTIO_NET_F_HASH_REPORT extends the
layout of the packet and requests the device to
calculate hash on incoming packets and report it
in the packet header.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 36 +
 1 file changed, 36 insertions(+)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index aec6fac3666a..19d23e5baa4e 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_HASH_REPORT  57   /* Supports hash report */
 #define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
@@ -156,6 +157,23 @@ struct virtio_net_hdr_v1 {
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
+struct virtio_net_hdr_v1_hash {
+   struct virtio_net_hdr_v1 hdr;
+   __le32 hash_value;
+#define VIRTIO_NET_HASH_REPORT_NONE0
+#define VIRTIO_NET_HASH_REPORT_IPv41
+#define VIRTIO_NET_HASH_REPORT_TCPv4   2
+#define VIRTIO_NET_HASH_REPORT_UDPv4   3
+#define VIRTIO_NET_HASH_REPORT_IPv64
+#define VIRTIO_NET_HASH_REPORT_TCPv6   5
+#define VIRTIO_NET_HASH_REPORT_UDPv6   6
+#define VIRTIO_NET_HASH_REPORT_IPv6_EX 7
+#define VIRTIO_NET_HASH_REPORT_TCPv6_EX8
+#define VIRTIO_NET_HASH_REPORT_UDPv6_EX9
+   __le16 hash_report;
+   __le16 padding;
+};
+
 #ifndef VIRTIO_NET_NO_LEGACY
 /* This header comes first in the scatter-gather list.
  * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it must
@@ -304,6 +322,24 @@ struct virtio_net_rss_config {
 
  #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
 
+/*
+ * The command VIRTIO_NET_CTRL_MQ_HASH_CONFIG requests the device
+ * to include in the virtio header of the packet the value of the
+ * calculated hash and the report type of hash. It also provides
+ * parameters for hash calculation. The command requires feature
+ * VIRTIO_NET_F_HASH_REPORT to be negotiated to extend the
+ * layout of virtio header as defined in virtio_net_hdr_v1_hash.
+ */
+struct virtio_net_hash_config {
+   __le32 hash_types;
+   /* for compatibility with virtio_net_rss_config */
+   __le16 reserved[4];
+   __u8 hash_key_length;
+   __u8 hash_key_data[/* hash_key_length */];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_HASH_CONFIG 2
+
 /*
  * Control network offloads
  *
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 2/3] virtio-net: Introduce RSS receive steering feature

2020-03-02 Thread Yuri Benditovich

All the classes of commands are defined without indentation.
All the commands are defined with indentation of 1 space.
Only the last one (VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET at the end of
the file) does not have an indentation.

On Mon, Mar 2, 2020 at 12:54 PM Michael S. Tsirkin  wrote:
>
> On Mon, Mar 02, 2020 at 10:53:14AM +0200, Yuri Benditovich wrote:
> > On Sun, Mar 1, 2020 at 9:58 PM Michael S. Tsirkin  wrote:
> > >
> > > On Sun, Mar 01, 2020 at 04:33:01PM +0200, Yuri Benditovich wrote:
> > > > RSS (Receive-side scaling) defines hash calculation
> > > > rules and decision on receive virtqueue according to
> > > > the calculated hash, provided mask to apply and
> > > > provided indirection table containing indices of
> > > > receive virqueues. The driver sends the control
> > > > command to enable multiqueue and provide parameters
> > > > for receive steering.
> > > >
> > > > Signed-off-by: Yuri Benditovich 
> > > > ---
> > > >  include/uapi/linux/virtio_net.h | 42 +++--
> > > >  1 file changed, 40 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/include/uapi/linux/virtio_net.h 
> > > > b/include/uapi/linux/virtio_net.h
> > > > index 19e76b3e3a64..188ad3eecdc8 100644
> > > > --- a/include/uapi/linux/virtio_net.h
> > > > +++ b/include/uapi/linux/virtio_net.h
> > > > @@ -57,6 +57,7 @@
> > > >* Steering */
> > > >  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23/* Set MAC address */
> > > >
> > > > +#define VIRTIO_NET_F_RSS   60/* Supports RSS RX steering */
> > > >  #define VIRTIO_NET_F_RSC_EXT   61/* extended coalescing info */
> > > >  #define VIRTIO_NET_F_STANDBY   62/* Act as standby for another 
> > > > device
> > > >* with the same MAC.
> > > > @@ -70,6 +71,17 @@
> > > >  #define VIRTIO_NET_S_LINK_UP 1   /* Link is up */
> > > >  #define VIRTIO_NET_S_ANNOUNCE2   /* Announcement is needed 
> > > > */
> > > >
> > > > +/* supported/enabled hash types */
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_IPv4  (1 << 0)
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_TCPv4 (1 << 1)
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_UDPv4 (1 << 2)
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_IPv6  (1 << 3)
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_TCPv6 (1 << 4)
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_UDPv6 (1 << 5)
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_IP_EX (1 << 6)
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_TCP_EX(1 << 7)
> > > > +#define VIRTIO_NET_RSS_HASH_TYPE_UDP_EX(1 << 8)
> > > > +
> > > >  struct virtio_net_config {
> > > >   /* The config defining mac address (if VIRTIO_NET_F_MAC) */
> > > >   __u8 mac[ETH_ALEN];
> > > > @@ -93,6 +105,12 @@ struct virtio_net_config {
> > > >* Any other value stands for unknown.
> > > >*/
> > > >   __u8 duplex;
> > > > + /* maximum size of RSS key */
> > > > + __u8 rss_max_key_size;
> > > > + /* maximum number of indirection table entries */
> > > > + __le16 rss_max_indirection_table_length;
> > > > + /* bitmask of supported VIRTIO_NET_RSS_HASH_ types */
> > > > + __le32 supported_hash_types;
> > > >  } __attribute__((packed));
> > > >
> > > >  /*
> > > > @@ -246,7 +264,9 @@ struct virtio_net_ctrl_mac {
> > > >
> > > >  /*
> > > >   * Control Receive Flow Steering
> > > > - *
> > > > + */
> > > > +#define VIRTIO_NET_CTRL_MQ   4
> > > > +/*
> > > >   * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
> > > >   * enables Receive Flow Steering, specifying the number of the 
> > > > transmit and
> > > >   * receive queues that will be used. After the command is consumed and 
> > > > acked by
> > > > @@ -259,11 +279,29 @@ struct virtio_net_ctrl_mq {
> > > >   __virtio16 virtqueue_pairs;
> > > >  };
> > > >
> > > > -#define VIRTIO_NET_CTRL_MQ   4
> > > >   #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET0
> &

Re: [PATCH v3 2/3] virtio-net: Introduce RSS receive steering feature

2020-03-02 Thread Yuri Benditovich

On Sun, Mar 1, 2020 at 9:58 PM Michael S. Tsirkin  wrote:
>
> On Sun, Mar 01, 2020 at 04:33:01PM +0200, Yuri Benditovich wrote:
> > RSS (Receive-side scaling) defines hash calculation
> > rules and decision on receive virtqueue according to
> > the calculated hash, provided mask to apply and
> > provided indirection table containing indices of
> > receive virqueues. The driver sends the control
> > command to enable multiqueue and provide parameters
> > for receive steering.
> >
> > Signed-off-by: Yuri Benditovich 
> > ---
> >  include/uapi/linux/virtio_net.h | 42 +++--
> >  1 file changed, 40 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/virtio_net.h 
> > b/include/uapi/linux/virtio_net.h
> > index 19e76b3e3a64..188ad3eecdc8 100644
> > --- a/include/uapi/linux/virtio_net.h
> > +++ b/include/uapi/linux/virtio_net.h
> > @@ -57,6 +57,7 @@
> >* Steering */
> >  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23/* Set MAC address */
> >
> > +#define VIRTIO_NET_F_RSS   60/* Supports RSS RX steering */
> >  #define VIRTIO_NET_F_RSC_EXT   61/* extended coalescing info */
> >  #define VIRTIO_NET_F_STANDBY   62/* Act as standby for another device
> >* with the same MAC.
> > @@ -70,6 +71,17 @@
> >  #define VIRTIO_NET_S_LINK_UP 1   /* Link is up */
> >  #define VIRTIO_NET_S_ANNOUNCE2   /* Announcement is needed */
> >
> > +/* supported/enabled hash types */
> > +#define VIRTIO_NET_RSS_HASH_TYPE_IPv4  (1 << 0)
> > +#define VIRTIO_NET_RSS_HASH_TYPE_TCPv4 (1 << 1)
> > +#define VIRTIO_NET_RSS_HASH_TYPE_UDPv4 (1 << 2)
> > +#define VIRTIO_NET_RSS_HASH_TYPE_IPv6  (1 << 3)
> > +#define VIRTIO_NET_RSS_HASH_TYPE_TCPv6 (1 << 4)
> > +#define VIRTIO_NET_RSS_HASH_TYPE_UDPv6 (1 << 5)
> > +#define VIRTIO_NET_RSS_HASH_TYPE_IP_EX (1 << 6)
> > +#define VIRTIO_NET_RSS_HASH_TYPE_TCP_EX(1 << 7)
> > +#define VIRTIO_NET_RSS_HASH_TYPE_UDP_EX(1 << 8)
> > +
> >  struct virtio_net_config {
> >   /* The config defining mac address (if VIRTIO_NET_F_MAC) */
> >   __u8 mac[ETH_ALEN];
> > @@ -93,6 +105,12 @@ struct virtio_net_config {
> >* Any other value stands for unknown.
> >*/
> >   __u8 duplex;
> > + /* maximum size of RSS key */
> > + __u8 rss_max_key_size;
> > + /* maximum number of indirection table entries */
> > + __le16 rss_max_indirection_table_length;
> > + /* bitmask of supported VIRTIO_NET_RSS_HASH_ types */
> > + __le32 supported_hash_types;
> >  } __attribute__((packed));
> >
> >  /*
> > @@ -246,7 +264,9 @@ struct virtio_net_ctrl_mac {
> >
> >  /*
> >   * Control Receive Flow Steering
> > - *
> > + */
> > +#define VIRTIO_NET_CTRL_MQ   4
> > +/*
> >   * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
> >   * enables Receive Flow Steering, specifying the number of the transmit and
> >   * receive queues that will be used. After the command is consumed and 
> > acked by
> > @@ -259,11 +279,29 @@ struct virtio_net_ctrl_mq {
> >   __virtio16 virtqueue_pairs;
> >  };
> >
> > -#define VIRTIO_NET_CTRL_MQ   4
> >   #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET0
> >   #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN1
> >   #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX0x8000
> >
> > +/*
> > + * The command VIRTIO_NET_CTRL_MQ_RSS_CONFIG has the same effect as
> > + * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET does and additionally configures
> > + * the receive steering to use a hash calculated for incoming packet
> > + * to decide on receive virtqueue to place the packet. The command
> > + * also provides parameters to calculate a hash and receive virtqueue.
> > + */
> > +struct virtio_net_rss_config {
> > + __le32 hash_types;
> > + __le16 indirection_table_mask;
> > + __le16 unclassified_queue;
> > + __le16 indirection_table[1/* + indirection_table_mask */];
> > + __le16 max_tx_vq;
> > + __u8 hash_key_length;
> > + __u8 hash_key_data[/* hash_key_length */];
> > +};
> > +
> > + #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
> > +
>
>
> Extra space here.

Where exactly you want to remove the empty line?
The format here is exactly as in other places:
comment - structure - space - command - space


>
> >  /*
> >   * Control network offloads
> >   *
> > --
> > 2.17.1
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v3 3/3] virtio-net: Introduce hash report feature

2020-03-01 Thread Yuri Benditovich

The feature VIRTIO_NET_F_HASH_REPORT extends the
layout of the packet and requests the device to
calculate hash on incoming packets and report it
in the packet header.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 36 +
 1 file changed, 36 insertions(+)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 188ad3eecdc8..4bd940501ba6 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_HASH_REPORT  57   /* Supports hash report */
 #define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
@@ -154,6 +155,23 @@ struct virtio_net_hdr_v1 {
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
+struct virtio_net_hdr_v1_hash {
+   struct virtio_net_hdr_v1 hdr;
+   __le32 hash_value;
+#define VIRTIO_NET_HASH_REPORT_NONE0
+#define VIRTIO_NET_HASH_REPORT_IPv41
+#define VIRTIO_NET_HASH_REPORT_TCPv4   2
+#define VIRTIO_NET_HASH_REPORT_UDPv4   3
+#define VIRTIO_NET_HASH_REPORT_IPv64
+#define VIRTIO_NET_HASH_REPORT_TCPv6   5
+#define VIRTIO_NET_HASH_REPORT_UDPv6   6
+#define VIRTIO_NET_HASH_REPORT_IPv6_EX 7
+#define VIRTIO_NET_HASH_REPORT_TCPv6_EX8
+#define VIRTIO_NET_HASH_REPORT_UDPv6_EX9
+   __le16 hash_report;
+   __le16 padding;
+};
+
 #ifndef VIRTIO_NET_NO_LEGACY
 /* This header comes first in the scatter-gather list.
  * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it must
@@ -302,6 +320,24 @@ struct virtio_net_rss_config {
 
  #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
 
+/*
+ * The command VIRTIO_NET_CTRL_MQ_HASH_CONFIG requests the device
+ * to include in the virtio header of the packet the value of the
+ * calculated hash and the report type of hash. It also provides
+ * parameters for hash calculation. The command requires feature
+ * VIRTIO_NET_F_HASH_REPORT to be negotiated to extend the
+ * layout of virtio header as defined in virtio_net_hdr_v1_hash.
+ */
+struct virtio_net_hash_config {
+   __le32 hash_types;
+   /* for compatibility with virtio_net_rss_config */
+   __le16 reserved[4];
+   __u8 hash_key_length;
+   __u8 hash_key_data[/* hash_key_length */];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_HASH_CONFIG 2
+
 /*
  * Control network offloads
  *
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v3 0/3] virtio-net: introduce features defined in the spec

2020-03-01 Thread Yuri Benditovich

This series introduce virtio-net features VIRTIO_NET_F_RSC_EXT,
VIRTIO_NET_F_RSS and VIRTIO_NET_F_HASH_REPORT.

Changes from v2: reformatted structure in patch 1

Yuri Benditovich (3):
  virtio-net: Introduce extended RSC feature
  virtio-net: Introduce RSS receive steering feature
  virtio-net: Introduce hash report feature

 include/uapi/linux/virtio_net.h | 100 ++--
 1 file changed, 96 insertions(+), 4 deletions(-)

-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v3 1/3] virtio-net: Introduce extended RSC feature

2020-03-01 Thread Yuri Benditovich

VIRTIO_NET_F_RSC_EXT feature bit indicates that the device
is able to provide extended RSC information. When the feature
is negotiatede and 'gso_type' field in received packet is not
GSO_NONE, the device reports number of coalesced packets in
'csum_start' field and number of duplicated acks in 'csum_offset'
field and sets VIRTIO_NET_HDR_F_RSC_INFO in 'flags' field.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index a3715a3224c1..19e76b3e3a64 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
 */
@@ -104,6 +105,7 @@ struct virtio_net_config {
 struct virtio_net_hdr_v1 {
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM1   /* Use csum_start, csum_offset 
*/
 #define VIRTIO_NET_HDR_F_DATA_VALID2   /* Csum is valid */
+#define VIRTIO_NET_HDR_F_RSC_INFO  4   /* rsc info in csum_ fields */
__u8 flags;
 #define VIRTIO_NET_HDR_GSO_NONE0   /* Not a GSO frame */
 #define VIRTIO_NET_HDR_GSO_TCPV4   1   /* GSO frame, IPv4 TCP (TSO) */
@@ -113,8 +115,24 @@ struct virtio_net_hdr_v1 {
__u8 gso_type;
__virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
__virtio16 gso_size;/* Bytes to append to hdr_len per frame */
-   __virtio16 csum_start;  /* Position to start checksumming from */
-   __virtio16 csum_offset; /* Offset after that to place checksum */
+   union {
+   struct {
+   __virtio16 csum_start;
+   __virtio16 csum_offset;
+   };
+   struct {
+   /* Position to start checksumming from */
+   __virtio16 start;
+   /* Offset after that to place checksum */
+   __virtio16 offset;
+   } csum;
+   struct {
+   /* num of coalesced packets */
+   __le16 packets;
+   /* num of duplicated acks */
+   __le16 dup_acks;
+   } rsc;
+   };
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v3 2/3] virtio-net: Introduce RSS receive steering feature

2020-03-01 Thread Yuri Benditovich

RSS (Receive-side scaling) defines hash calculation
rules and decision on receive virtqueue according to
the calculated hash, provided mask to apply and
provided indirection table containing indices of
receive virqueues. The driver sends the control
command to enable multiqueue and provide parameters
for receive steering.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 42 +++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 19e76b3e3a64..188ad3eecdc8 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
@@ -70,6 +71,17 @@
 #define VIRTIO_NET_S_LINK_UP   1   /* Link is up */
 #define VIRTIO_NET_S_ANNOUNCE  2   /* Announcement is needed */
 
+/* supported/enabled hash types */
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv4  (1 << 0)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv4 (1 << 1)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv4 (1 << 2)
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv6  (1 << 3)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv6 (1 << 4)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv6 (1 << 5)
+#define VIRTIO_NET_RSS_HASH_TYPE_IP_EX (1 << 6)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCP_EX(1 << 7)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDP_EX(1 << 8)
+
 struct virtio_net_config {
/* The config defining mac address (if VIRTIO_NET_F_MAC) */
__u8 mac[ETH_ALEN];
@@ -93,6 +105,12 @@ struct virtio_net_config {
 * Any other value stands for unknown.
 */
__u8 duplex;
+   /* maximum size of RSS key */
+   __u8 rss_max_key_size;
+   /* maximum number of indirection table entries */
+   __le16 rss_max_indirection_table_length;
+   /* bitmask of supported VIRTIO_NET_RSS_HASH_ types */
+   __le32 supported_hash_types;
 } __attribute__((packed));
 
 /*
@@ -246,7 +264,9 @@ struct virtio_net_ctrl_mac {
 
 /*
  * Control Receive Flow Steering
- *
+ */
+#define VIRTIO_NET_CTRL_MQ   4
+/*
  * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
  * enables Receive Flow Steering, specifying the number of the transmit and
  * receive queues that will be used. After the command is consumed and acked by
@@ -259,11 +279,29 @@ struct virtio_net_ctrl_mq {
__virtio16 virtqueue_pairs;
 };
 
-#define VIRTIO_NET_CTRL_MQ   4
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET0
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN1
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX0x8000
 
+/*
+ * The command VIRTIO_NET_CTRL_MQ_RSS_CONFIG has the same effect as
+ * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET does and additionally configures
+ * the receive steering to use a hash calculated for incoming packet
+ * to decide on receive virtqueue to place the packet. The command
+ * also provides parameters to calculate a hash and receive virtqueue.
+ */
+struct virtio_net_rss_config {
+   __le32 hash_types;
+   __le16 indirection_table_mask;
+   __le16 unclassified_queue;
+   __le16 indirection_table[1/* + indirection_table_mask */];
+   __le16 max_tx_vq;
+   __u8 hash_key_length;
+   __u8 hash_key_data[/* hash_key_length */];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
+
 /*
  * Control network offloads
  *
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2 1/3] virtio-net: Introduce extended RSC feature

2020-03-01 Thread Yuri Benditovich

No problem, I'll send an update

On Sun, Mar 1, 2020 at 1:31 PM Michael S. Tsirkin  wrote:
>
> On Sun, Mar 01, 2020 at 01:07:31PM +0200, Yuri Benditovich wrote:
> > VIRTIO_NET_F_RSC_EXT feature bit indicates that the device
> > is able to provide extended RSC information. When the feature
> > is negotiatede and 'gso_type' field in received packet is not
> > GSO_NONE, the device reports number of coalesced packets in
> > 'csum_start' field and number of duplicated acks in 'csum_offset'
> > field and sets VIRTIO_NET_HDR_F_RSC_INFO in 'flags' field.
> >
> > Signed-off-by: Yuri Benditovich 
> > ---
> >  include/uapi/linux/virtio_net.h | 12 ++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/virtio_net.h 
> > b/include/uapi/linux/virtio_net.h
> > index a3715a3224c1..536152fad3c4 100644
> > --- a/include/uapi/linux/virtio_net.h
> > +++ b/include/uapi/linux/virtio_net.h
> > @@ -57,6 +57,7 @@
> >* Steering */
> >  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23/* Set MAC address */
> >
> > +#define VIRTIO_NET_F_RSC_EXT   61/* extended coalescing info */
> >  #define VIRTIO_NET_F_STANDBY   62/* Act as standby for another device
> >* with the same MAC.
> >*/
> > @@ -104,6 +105,7 @@ struct virtio_net_config {
> >  struct virtio_net_hdr_v1 {
> >  #define VIRTIO_NET_HDR_F_NEEDS_CSUM  1   /* Use csum_start, 
> > csum_offset */
> >  #define VIRTIO_NET_HDR_F_DATA_VALID  2   /* Csum is valid */
> > +#define VIRTIO_NET_HDR_F_RSC_INFO4   /* rsc info in csum_ fields */
> >   __u8 flags;
> >  #define VIRTIO_NET_HDR_GSO_NONE  0   /* Not a GSO frame */
> >  #define VIRTIO_NET_HDR_GSO_TCPV4 1   /* GSO frame, IPv4 TCP (TSO) 
> > */
> > @@ -113,8 +115,14 @@ struct virtio_net_hdr_v1 {
> >   __u8 gso_type;
> >   __virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
> >   __virtio16 gso_size;/* Bytes to append to hdr_len per frame */
> > - __virtio16 csum_start;  /* Position to start checksumming from */
> > - __virtio16 csum_offset; /* Offset after that to place checksum */
> > + union {
> > + __virtio16 csum_start;  /* Position to start checksumming 
> > from */
> > + __le16 rsc_ext_num_packets; /* num of coalesced packets */
> > + };
> > + union {
> > + __virtio16 csum_offset; /* Offset after that to place 
> > checksum */
> > + __le16 rsc_ext_num_dupacks; /* num of duplicated acks */
>
> dupacks -> dup_acks ?
>
> Also wouldn't it be cleaner to have an rsc struct? And "num" is kind of
> extraneous, right?
> So how about we group the fields:
>
> union {
> /* Unnamed struct for compatiblity. */
> struct {
> csum_start
> csum_offset
> };
> struct {
> virtio16 start;
> virtio16 offset;
> } csum;
> struct {
> le16 packets;
> le16 dup_acks;
> } rsc;
> };
>
>
> what do you think?
>
>
> > + };
> >   __virtio16 num_buffers; /* Number of merged rx buffers */
> >  };
>
>
>
> > --
> > 2.17.1
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2 0/3] virtio-net: introduce features defined in the spec

2020-03-01 Thread Yuri Benditovich

On Sun, Mar 1, 2020 at 1:32 PM Michael S. Tsirkin  wrote:
>
> On Sun, Mar 01, 2020 at 01:07:30PM +0200, Yuri Benditovich wrote:
> > This series introduce virtio-net features VIRTIO_NET_F_RSC_EXT,
> > VIRTIO_NET_F_RSS and VIRTIO_NET_F_HASH_REPORT.
> >
> > Changes from v1:
> > __virtio -> __le
> > maximal -> maximum
> > minor style fixes
>
>
> Looks good to me - sent a bit of consmetics.
>
> But as any virtio UAPI change, please CC virtio-dev as virtio TC maintains the
> interface. Thanks!

Probably 'virtio-dev' should be added to maintainers file.

>
> > Yuri Benditovich (3):
> >   virtio-net: Introduce extended RSC feature
> >   virtio-net: Introduce RSS receive steering feature
> >   virtio-net: Introduce hash report feature
> >
> >  include/uapi/linux/virtio_net.h | 90 +++--
> >  1 file changed, 86 insertions(+), 4 deletions(-)
> >
> > --
> > 2.17.1
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v2 0/3] virtio-net: introduce features defined in the spec

2020-03-01 Thread Yuri Benditovich

This series introduce virtio-net features VIRTIO_NET_F_RSC_EXT,
VIRTIO_NET_F_RSS and VIRTIO_NET_F_HASH_REPORT.

Changes from v1:
__virtio -> __le
maximal -> maximum
minor style fixes

Yuri Benditovich (3):
  virtio-net: Introduce extended RSC feature
  virtio-net: Introduce RSS receive steering feature
  virtio-net: Introduce hash report feature

 include/uapi/linux/virtio_net.h | 90 +++--
 1 file changed, 86 insertions(+), 4 deletions(-)

-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v2 3/3] virtio-net: Introduce hash report feature

2020-03-01 Thread Yuri Benditovich

The feature VIRTIO_NET_F_HASH_REPORT extends the
layout of the packet and requests the device to
calculate hash on incoming packets and report it
in the packet header.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 36 +
 1 file changed, 36 insertions(+)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 6b309fe23671..c9ca62dd77a4 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_HASH_REPORT  57   /* Supports hash report */
 #define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
@@ -144,6 +145,23 @@ struct virtio_net_hdr_v1 {
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
+struct virtio_net_hdr_v1_hash {
+   struct virtio_net_hdr_v1 hdr;
+   __le32 hash_value;
+#define VIRTIO_NET_HASH_REPORT_NONE0
+#define VIRTIO_NET_HASH_REPORT_IPv41
+#define VIRTIO_NET_HASH_REPORT_TCPv4   2
+#define VIRTIO_NET_HASH_REPORT_UDPv4   3
+#define VIRTIO_NET_HASH_REPORT_IPv64
+#define VIRTIO_NET_HASH_REPORT_TCPv6   5
+#define VIRTIO_NET_HASH_REPORT_UDPv6   6
+#define VIRTIO_NET_HASH_REPORT_IPv6_EX 7
+#define VIRTIO_NET_HASH_REPORT_TCPv6_EX8
+#define VIRTIO_NET_HASH_REPORT_UDPv6_EX9
+   __le16 hash_report;
+   __le16 padding;
+};
+
 #ifndef VIRTIO_NET_NO_LEGACY
 /* This header comes first in the scatter-gather list.
  * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it must
@@ -292,6 +310,24 @@ struct virtio_net_rss_config {
 
  #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
 
+/*
+ * The command VIRTIO_NET_CTRL_MQ_HASH_CONFIG requests the device
+ * to include in the virtio header of the packet the value of the
+ * calculated hash and the report type of hash. It also provides
+ * parameters for hash calculation. The command requires feature
+ * VIRTIO_NET_F_HASH_REPORT to be negotiated to extend the
+ * layout of virtio header as defined in virtio_net_hdr_v1_hash.
+ */
+struct virtio_net_hash_config {
+   __le32 hash_types;
+   /* for compatibility with virtio_net_rss_config */
+   __le16 reserved[4];
+   __u8 hash_key_length;
+   __u8 hash_key_data[/* hash_key_length */];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_HASH_CONFIG 2
+
 /*
  * Control network offloads
  *
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v2 2/3] virtio-net: Introduce RSS receive steering feature

2020-03-01 Thread Yuri Benditovich

RSS (Receive-side scaling) defines hash calculation
rules and decision on receive virtqueue according to
the calculated hash, provided mask to apply and
provided indirection table containing indices of
receive virqueues. The driver sends the control
command to enable multiqueue and provide parameters
for receive steering.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 42 +++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 536152fad3c4..6b309fe23671 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
@@ -70,6 +71,17 @@
 #define VIRTIO_NET_S_LINK_UP   1   /* Link is up */
 #define VIRTIO_NET_S_ANNOUNCE  2   /* Announcement is needed */
 
+/* supported/enabled hash types */
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv4  (1 << 0)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv4 (1 << 1)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv4 (1 << 2)
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv6  (1 << 3)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv6 (1 << 4)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv6 (1 << 5)
+#define VIRTIO_NET_RSS_HASH_TYPE_IP_EX (1 << 6)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCP_EX(1 << 7)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDP_EX(1 << 8)
+
 struct virtio_net_config {
/* The config defining mac address (if VIRTIO_NET_F_MAC) */
__u8 mac[ETH_ALEN];
@@ -93,6 +105,12 @@ struct virtio_net_config {
 * Any other value stands for unknown.
 */
__u8 duplex;
+   /* maximum size of RSS key */
+   __u8 rss_max_key_size;
+   /* maximum number of indirection table entries */
+   __le16 rss_max_indirection_table_length;
+   /* bitmask of supported VIRTIO_NET_RSS_HASH_ types */
+   __le32 supported_hash_types;
 } __attribute__((packed));
 
 /*
@@ -236,7 +254,9 @@ struct virtio_net_ctrl_mac {
 
 /*
  * Control Receive Flow Steering
- *
+ */
+#define VIRTIO_NET_CTRL_MQ   4
+/*
  * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
  * enables Receive Flow Steering, specifying the number of the transmit and
  * receive queues that will be used. After the command is consumed and acked by
@@ -249,11 +269,29 @@ struct virtio_net_ctrl_mq {
__virtio16 virtqueue_pairs;
 };
 
-#define VIRTIO_NET_CTRL_MQ   4
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET0
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN1
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX0x8000
 
+/*
+ * The command VIRTIO_NET_CTRL_MQ_RSS_CONFIG has the same effect as
+ * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET does and additionally configures
+ * the receive steering to use a hash calculated for incoming packet
+ * to decide on receive virtqueue to place the packet. The command
+ * also provides parameters to calculate a hash and receive virtqueue.
+ */
+struct virtio_net_rss_config {
+   __le32 hash_types;
+   __le16 indirection_table_mask;
+   __le16 unclassified_queue;
+   __le16 indirection_table[1/* + indirection_table_mask */];
+   __le16 max_tx_vq;
+   __u8 hash_key_length;
+   __u8 hash_key_data[/* hash_key_length */];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
+
 /*
  * Control network offloads
  *
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v2 1/3] virtio-net: Introduce extended RSC feature

2020-03-01 Thread Yuri Benditovich

VIRTIO_NET_F_RSC_EXT feature bit indicates that the device
is able to provide extended RSC information. When the feature
is negotiatede and 'gso_type' field in received packet is not
GSO_NONE, the device reports number of coalesced packets in
'csum_start' field and number of duplicated acks in 'csum_offset'
field and sets VIRTIO_NET_HDR_F_RSC_INFO in 'flags' field.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index a3715a3224c1..536152fad3c4 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
 */
@@ -104,6 +105,7 @@ struct virtio_net_config {
 struct virtio_net_hdr_v1 {
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM1   /* Use csum_start, csum_offset 
*/
 #define VIRTIO_NET_HDR_F_DATA_VALID2   /* Csum is valid */
+#define VIRTIO_NET_HDR_F_RSC_INFO  4   /* rsc info in csum_ fields */
__u8 flags;
 #define VIRTIO_NET_HDR_GSO_NONE0   /* Not a GSO frame */
 #define VIRTIO_NET_HDR_GSO_TCPV4   1   /* GSO frame, IPv4 TCP (TSO) */
@@ -113,8 +115,14 @@ struct virtio_net_hdr_v1 {
__u8 gso_type;
__virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
__virtio16 gso_size;/* Bytes to append to hdr_len per frame */
-   __virtio16 csum_start;  /* Position to start checksumming from */
-   __virtio16 csum_offset; /* Offset after that to place checksum */
+   union {
+   __virtio16 csum_start;  /* Position to start checksumming from 
*/
+   __le16 rsc_ext_num_packets; /* num of coalesced packets */
+   };
+   union {
+   __virtio16 csum_offset; /* Offset after that to place checksum 
*/
+   __le16 rsc_ext_num_dupacks; /* num of duplicated acks */
+   };
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 1/3] virtio-net: Introduce extended RSC feature

2020-02-29 Thread Yuri Benditovich

VIRTIO_NET_F_RSC_EXT feature bit indicates that the device
is able to provide extended RSC information. When the feature
is negotiatede and 'gso_type' field in received packet is not
GSO_NONE, the device reports number of coalesced packets in
'csum_start' field and number of duplicated acks in 'csum_offset'
field and sets VIRTIO_NET_HDR_F_RSC_INFO in 'flags' field.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index a3715a3224c1..6e26a2cc6ad0 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
 */
@@ -104,6 +105,7 @@ struct virtio_net_config {
 struct virtio_net_hdr_v1 {
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM1   /* Use csum_start, csum_offset 
*/
 #define VIRTIO_NET_HDR_F_DATA_VALID2   /* Csum is valid */
+#define VIRTIO_NET_HDR_F_RSC_INFO  4   /* rsc info in csum_ fields */
__u8 flags;
 #define VIRTIO_NET_HDR_GSO_NONE0   /* Not a GSO frame */
 #define VIRTIO_NET_HDR_GSO_TCPV4   1   /* GSO frame, IPv4 TCP (TSO) */
@@ -113,8 +115,14 @@ struct virtio_net_hdr_v1 {
__u8 gso_type;
__virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
__virtio16 gso_size;/* Bytes to append to hdr_len per frame */
-   __virtio16 csum_start;  /* Position to start checksumming from */
-   __virtio16 csum_offset; /* Offset after that to place checksum */
+   union {
+   __virtio16 csum_start;  /* Position to start checksumming from 
*/
+   __virtio16 rsc_ext_num_packets; /* num of coalesced packets */
+   };
+   union {
+   __virtio16 csum_offset; /* Offset after that to place checksum 
*/
+   __virtio16 rsc_ext_num_dupacks; /* num of duplicated acks */
+   };
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 3/3] virtio-net: Introduce hash report feature

2020-02-29 Thread Yuri Benditovich

The feature VIRTIO_NET_F_HASH_REPORT extends the
layout of the packet and requests the device to
calculate hash on incoming packets and report it
in the packet header.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 36 +
 1 file changed, 36 insertions(+)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 7a342657fb6c..6d7230469c57 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_HASH_REPORT  57   /* Supports hash report */
 #define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
@@ -144,6 +145,23 @@ struct virtio_net_hdr_v1 {
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
+struct virtio_net_hdr_v1_hash {
+   struct virtio_net_hdr_v1 hdr;
+   __virtio32 hash_value;
+#define VIRTIO_NET_HASH_REPORT_NONE0
+#define VIRTIO_NET_HASH_REPORT_IPv41
+#define VIRTIO_NET_HASH_REPORT_TCPv4   2
+#define VIRTIO_NET_HASH_REPORT_UDPv4   3
+#define VIRTIO_NET_HASH_REPORT_IPv64
+#define VIRTIO_NET_HASH_REPORT_TCPv6   5
+#define VIRTIO_NET_HASH_REPORT_UDPv6   6
+#define VIRTIO_NET_HASH_REPORT_IPv6_EX 7
+#define VIRTIO_NET_HASH_REPORT_TCPv6_EX8
+#define VIRTIO_NET_HASH_REPORT_UDPv6_EX9
+   __virtio16 hash_report;
+   __virtio16 padding;
+};
+
 #ifndef VIRTIO_NET_NO_LEGACY
 /* This header comes first in the scatter-gather list.
  * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it must
@@ -292,6 +310,24 @@ struct virtio_net_rss_config {
 
  #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
 
+/*
+ * The command VIRTIO_NET_CTRL_MQ_HASH_CONFIG requests the device
+ * to include in the virtio header of the packet the value of the
+ * calculated hash and the report type of hash. It also provides
+ * parameters for hash calculation. The command requires feature
+ * VIRTIO_NET_F_HASH_REPORT to be negotiated to extend the
+ * layout of virtio header as defined in virtio_net_hdr_v1_hash.
+ */
+struct virtio_net_hash_config {
+   __virtio32 hash_types;
+   /* for compatibility with virtio_net_rss_config */
+   __virtio16 reserved[4];
+   __u8 hash_key_length;
+   __u8 hash_key_data[/*hash_key_length*/];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_HASH_CONFIG 2
+
 /*
  * Control network offloads
  *
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 2/3] virtio-net: Introduce RSS receive steering feature

2020-02-29 Thread Yuri Benditovich

RSS (Receive-side scaling) defines hash calculation
rules and decision on receive virtqueue according to
the calculated hash, provided mask to apply and
provided indirection table containing indices of
receive virqueues. The driver sends the control
command to enable multiqueue and provide parameters
for receive steering.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 42 +++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 6e26a2cc6ad0..7a342657fb6c 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_RSS 60/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61/* extended coalescing info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
@@ -70,6 +71,17 @@
 #define VIRTIO_NET_S_LINK_UP   1   /* Link is up */
 #define VIRTIO_NET_S_ANNOUNCE  2   /* Announcement is needed */
 
+/* supported/enabled hash types */
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv4  (1 << 0)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv4 (1 << 1)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv4 (1 << 2)
+#define VIRTIO_NET_RSS_HASH_TYPE_IPv6  (1 << 3)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCPv6 (1 << 4)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDPv6 (1 << 5)
+#define VIRTIO_NET_RSS_HASH_TYPE_IP_EX (1 << 6)
+#define VIRTIO_NET_RSS_HASH_TYPE_TCP_EX(1 << 7)
+#define VIRTIO_NET_RSS_HASH_TYPE_UDP_EX(1 << 8)
+
 struct virtio_net_config {
/* The config defining mac address (if VIRTIO_NET_F_MAC) */
__u8 mac[ETH_ALEN];
@@ -93,6 +105,12 @@ struct virtio_net_config {
 * Any other value stands for unknown.
 */
__u8 duplex;
+   /* maximal size of RSS key */
+   __u8 rss_max_key_size;
+   /* maximal number of indirection table entries */
+   __u16 rss_max_indirection_table_length;
+   /* bitmask of supported VIRTIO_NET_RSS_HASH_ types */
+   __u32 supported_hash_types;
 } __attribute__((packed));
 
 /*
@@ -236,7 +254,9 @@ struct virtio_net_ctrl_mac {
 
 /*
  * Control Receive Flow Steering
- *
+ */
+#define VIRTIO_NET_CTRL_MQ   4
+/*
  * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
  * enables Receive Flow Steering, specifying the number of the transmit and
  * receive queues that will be used. After the command is consumed and acked by
@@ -249,11 +269,29 @@ struct virtio_net_ctrl_mq {
__virtio16 virtqueue_pairs;
 };
 
-#define VIRTIO_NET_CTRL_MQ   4
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET0
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN1
  #define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX0x8000
 
+/*
+ * The command VIRTIO_NET_CTRL_MQ_RSS_CONFIG has the same effect as
+ * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET does and additionally configures
+ * the receive steering to use a hash calculated for incoming packet
+ * to decide on receive virtqueue to place the packet. The command
+ * also provides parameters to calculate a hash and receive virtqueue.
+ */
+struct virtio_net_rss_config {
+   __virtio32 hash_types;
+   __virtio16 indirection_table_mask;
+   __virtio16 unclassified_queue;
+   __virtio16 indirection_table[1/* + indirection_table_mask*/];
+   __virtio16 max_tx_vq;
+   __u8 hash_key_length;
+   __u8 hash_key_data[/*hash_key_length*/];
+};
+
+ #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG  1
+
 /*
  * Control network offloads
  *
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 0/3] virtio-net: introduce features defined in the spec

2020-02-29 Thread Yuri Benditovich

This series introduce virtio-net features VIRTIO_NET_F_RSC_EXT,
VIRTIO_NET_F_RSS and VIRTIO_NET_F_HASH_REPORT.

Yuri Benditovich (3):
  virtio-net: Introduce extended RSC feature
  virtio-net: Introduce RSS receive steering feature
  virtio-net: Introduce hash report feature

 include/uapi/linux/virtio_net.h | 90 +++--
 1 file changed, 86 insertions(+), 4 deletions(-)

-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2] virtio-net: Introduce extended RSC feature

2020-01-13 Thread Yuri Benditovich

On Mon, Jan 13, 2020 at 1:08 PM Jason Wang  wrote:
>
>
> On 2020/1/13 下午4:17, Yuri Benditovich wrote:
> > VIRTIO_NET_F_RSC_EXT feature bit indicates that the device
> > is able to provide extended RSC information. When the feature
> > is negotiatede and 'gso_type' field in received packet is not
> > GSO_NONE, the device reports number of coalesced packets in
> > 'csum_start' field and number of duplicated acks in 'csum_offset'
> > field and sets VIRTIO_NET_HDR_F_RSC_INFO in 'flags' field.
> >
> > Signed-off-by: Yuri Benditovich 
>
>
> Hi Yuri:
>
> Is the feature used by Linux? If yes, it's better to include the real user.
>

It is not used by Linux. Mainly needed for certification under Windows.

>
> > ---
> >   include/uapi/linux/virtio_net.h | 10 +-
> >   1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/virtio_net.h 
> > b/include/uapi/linux/virtio_net.h
> > index a3715a3224c1..2bdd26f8a4ed 100644
> > --- a/include/uapi/linux/virtio_net.h
> > +++ b/include/uapi/linux/virtio_net.h
> > @@ -56,7 +56,7 @@
> >   #define VIRTIO_NET_F_MQ 22  /* Device supports Receive Flow
> >* Steering */
> >   #define VIRTIO_NET_F_CTRL_MAC_ADDR 23   /* Set MAC address */
> > -
> > +#define VIRTIO_NET_F_RSC_EXT   61/* Provides extended RSC info */
>
>
> Is this better to keep the newline around?

No problem, let's wait until the rest is accepted.

>
>
> >   #define VIRTIO_NET_F_STANDBY  62/* Act as standby for another 
> > device
> >* with the same MAC.
> >*/
> > @@ -104,6 +104,7 @@ struct virtio_net_config {
> >   struct virtio_net_hdr_v1 {
> >   #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1   /* Use csum_start, 
> > csum_offset */
> >   #define VIRTIO_NET_HDR_F_DATA_VALID 2   /* Csum is valid */
> > +#define VIRTIO_NET_HDR_F_RSC_INFO4   /* rsc_ext data in csum_ 
> > fields */
> >   __u8 flags;
> >   #define VIRTIO_NET_HDR_GSO_NONE 0   /* Not a GSO frame */
> >   #define VIRTIO_NET_HDR_GSO_TCPV41   /* GSO frame, IPv4 TCP (TSO) 
> > */
> > @@ -118,6 +119,13 @@ struct virtio_net_hdr_v1 {
> >   __virtio16 num_buffers; /* Number of merged rx buffers */
> >   };
> >
> > +/*
> > + * if VIRTIO_NET_F_RSC_EXT feature has been negotiated and
> > + * VIRTIO_NET_HDR_F_RSC_INFO is set in RX packet
> > + */
> > +#define virtio_net_rsc_ext_num_packets   csum_start
> > +#define virtio_net_rsc_ext_num_dupacks   csum_offset
>
>
> This looks sub-optimal, it looks to me union is better?

This was discussed in v1, MST decided the define is better.

>
> Thanks
>
>
> > +
> >   #ifndef VIRTIO_NET_NO_LEGACY
> >   /* This header comes first in the scatter-gather list.
> >* For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it must
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v2] virtio-net: Introduce extended RSC feature

2020-01-13 Thread Yuri Benditovich

VIRTIO_NET_F_RSC_EXT feature bit indicates that the device
is able to provide extended RSC information. When the feature
is negotiatede and 'gso_type' field in received packet is not
GSO_NONE, the device reports number of coalesced packets in
'csum_start' field and number of duplicated acks in 'csum_offset'
field and sets VIRTIO_NET_HDR_F_RSC_INFO in 'flags' field.

Signed-off-by: Yuri Benditovich 
---
 include/uapi/linux/virtio_net.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index a3715a3224c1..2bdd26f8a4ed 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -56,7 +56,7 @@
 #define VIRTIO_NET_F_MQ22  /* Device supports Receive Flow
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
-
+#define VIRTIO_NET_F_RSC_EXT 61/* Provides extended RSC info */
 #define VIRTIO_NET_F_STANDBY 62/* Act as standby for another device
 * with the same MAC.
 */
@@ -104,6 +104,7 @@ struct virtio_net_config {
 struct virtio_net_hdr_v1 {
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM1   /* Use csum_start, csum_offset 
*/
 #define VIRTIO_NET_HDR_F_DATA_VALID2   /* Csum is valid */
+#define VIRTIO_NET_HDR_F_RSC_INFO  4   /* rsc_ext data in csum_ fields 
*/
__u8 flags;
 #define VIRTIO_NET_HDR_GSO_NONE0   /* Not a GSO frame */
 #define VIRTIO_NET_HDR_GSO_TCPV4   1   /* GSO frame, IPv4 TCP (TSO) */
@@ -118,6 +119,13 @@ struct virtio_net_hdr_v1 {
__virtio16 num_buffers; /* Number of merged rx buffers */
 };
 
+/*
+ * if VIRTIO_NET_F_RSC_EXT feature has been negotiated and
+ * VIRTIO_NET_HDR_F_RSC_INFO is set in RX packet
+ */
+#define virtio_net_rsc_ext_num_packets csum_start
+#define virtio_net_rsc_ext_num_dupacks csum_offset
+
 #ifndef VIRTIO_NET_NO_LEGACY
 /* This header comes first in the scatter-gather list.
  * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it must
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

56 matches

Mail list logo