[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-23 Thread Michael S. Tsirkin
On Tue, May 23, 2023 at 09:32:41PM +0000, Parav Pandit wrote:
> 
> 
> > From: Michael S. Tsirkin 
> > Sent: Tuesday, May 23, 2023 2:17 PM
> > 
> > On Mon, May 15, 2023 at 02:30:55PM +0000, Parav Pandit wrote:
> > >
> > > > From: Michael S. Tsirkin 
> > > > Sent: Monday, May 15, 2023 6:09 AM
> > > > To be more specific, device config often has the same layout under
> > > > legacy and modern. Thus if getting an offset within device config,
> > > > same decoding logic should be reusable between legacy and modern.
> > > Modern device registers are directly accessible from the guest driver 
> > > without
> > mediation for VF,SF,SIOV.
> > > Hence, there is no need to have commands for modern register access over
> > some VQ.
> > 
> > Yes, there's need, and some of it you were pushing yourself.
> Yes, there is a need, without the mediation of AQ.
> The cfgq idea was discussed in depth, with its trade-offs.
> 
> > AQ will allow reduced memory with SRIOV, and support for SIOV.
> Already discussed in v0 and v1: mediation via AQ is not the way for SRIOV.
> SIOV is starting with the single goal of backward compatibility only; hence,
> AQ doesn't look future-proof either.
> 
> And as for your summary email, we parked it aside for now.

I am frankly lost at this point, this thread is too big.

So where do you want to go for this project?  I would like something
specific, email threads are growing too big.
I think the reason is that there is actually a host of small
related subprojects. So when you propose just this small
feature, everyone jumps in with their own pet project
proposing to address that too.

For example:
- SIOV guys were adding transport over VQ. It would seem that
  can include support for legacy.
- For migration we need ability to report member events to owner.
  It would seem that can be used to support INT#x.
- Ability to stick capabilities in extended config space
  is a good idea, for a variety of reasons.

and so on.  Understandably frustrating, and it is easy to go overboard
with generality.

If you feel your path for progress is clear, that's good.
If not, here are ideas for getting this unstuck:
- start a TODO.md file with a list of things we need to address,
  and for each one a list of options
- everyone will add to this file then we get a higher level
  picture and can discuss "this will be addressed by A, that by B"
- have a chat at the upstream developers meeting (May 31)
  so far it's about vdpa blk
- anyone coming to the kvm forum to have a chat there?

Hope this helps.
-- 
MST





[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [Proposal] Relationship between XDP and rx-csum in virtio-net

2023-05-23 Thread Jason Wang
On Wed, May 24, 2023 at 1:07 PM Heng Qi  wrote:
>
> On Wed, May 24, 2023 at 12:09:22PM +0800, Jason Wang wrote:
> > On Tue, May 23, 2023 at 9:51 PM Heng Qi  wrote:
> > >
> > > On Tue, May 23, 2023 at 09:30:28AM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, May 23, 2023 at 05:18:20PM +0800, Heng Qi wrote:
> > > > > On Tue, May 23, 2023 at 03:15:37AM -0400, Michael S. Tsirkin wrote:
> > > > > > On Tue, May 23, 2023 at 10:41:18AM +0800, Heng Qi wrote:
> > > > > > > On Mon, May 22, 2023 at 03:10:05PM -0400, Michael S. Tsirkin 
> > > > > > > wrote:
> > > > > > > > On Mon, May 22, 2023 at 08:12:00PM +0800, Heng Qi wrote:
> > > > > > > > > 1) Add a feature bit to the virtio specification to tell the 
> > > > > > > > > sender that a fully
> > > > > > > > > csumed packet must be sent.
> > > > > > > >
> > > > > > > > Who is the sender in this picture? The driver?
> > > > > > >
> > > > > > > The device or the driver.
> > > > > > >
> > > > > > > When the device is hw, the sender is more likely to be a device.
> > > > > > > When the device is sw, the sender can be a device or a driver.
> > > > > > >
> > > > > > > But in general, this feature is inclined to constrain the 
> > > > > > > behavior of the device and
> > > > > > > the driver from the receiving side.
> > > > > >
> > > > > > Based on above I am guessing you are talking about driver getting
> > > > > > packets from device, I wish you used terms from virtio spec.
> > > > >
> > > > > Yes, I'm going to use the terminology of the virtio spec.
> > > > >
> > > > > >
> > > > > > > For example:
> > > > > > > VIRTIO_NET_F_UNNECESSARY_CSUM : The driver tells the device that 
> > > > > > > you must send me a fully csumed packet.
> > > > > > >
> > > > > > > Then the specific implementation can be
> > > > > > >
> > > > > > > (1) the sender sends a fully csumed packet;
> > > > > > > (2) the receiver receives a CHECKSUM_PARTIAL packet, and the 
> > > > > > > device helps calculate the fully csum
> > > > > > > (because the two parties in the communication are located on 
> > > > > > > the same host, the packet is trusted.).
> > > > > > >
> > > > > > > In summary, if VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the 
> > > > > > > driver will no longer receive any packets marked CHECKSUM_PARTIAL.
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > This is what clearing VIRTIO_NET_F_GUEST_CSUM does.
> > > > >
> > > > > Yes, but with VIRTIO_NET_F_GUEST_CSUM cleared, although the device can
> > > > > receive a fully checksummed packet, we can no longer enjoy
> > > > > the device's ability to validate the packet checksum. That is, the 
> > > > > value
> > > > > of \field{flags} in the virtio_net_hdr structure is set to 0, which 
> > > > > means
> > > > > that the packet received by the driver will not be marked as
> > > > > VIRTIO_NET_HDR_F_DATA_VALID.
> > > > >
> > > > > So, we need a feature bit (let's say VIRTIO_NET_F_UNNECESSARY_CSUM).
> > > > > If VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the device must give 
> > > > > the
> > > > > driver a fully checksummed packet, and the packet is validated by the
> > > > > device with \field{flags} set to VIRTIO_NET_HDR_F_DATA_VALID.
> > > > >
> > > > > >
> > > > > > I feel you are trying to say that clearing VIRTIO_NET_F_GUEST_CSUM
> > > > > > disables all offloads but you want to keep some of them?
> > > > > >
> > > > >
> > > > > No, what I mean is that a feature VIRTIO_NET_F_UNNECESSARY_CSUM is 
> > > > > needed
> > > > > in addition to VIRTIO_NET_F_GUEST_CSUM, if both features are 
> > > > > negotiated,
> > > > > then the driver may always receive packets marked as
> > > > > VIRTIO_NET_HDR_F_DATA_VALID, which means that we can now load XDP at 
> > > > > the
> > > > > same time.
> > > >
> > > > Makes no sense to me. VIRTIO_NET_F_GUEST_CSUM set already allows
> > > > VIRTIO_NET_HDR_F_DATA_VALID:
> > >
> > > We need to focus on what happens when the XDP program is loaded:
> > >
> > > The current virtio-net needs to turn off the VIRTIO_NET_F_GUEST_CSUM
> > > feature when loading XDP. The main reason for doing this is because
> > > VIRTIO_NET_F_GUEST_CSUM allows to receive packets marked as
> > > VIRTIO_NET_HDR_F_NEEDS_CSUM. Such packets are not compatible with
> > > XDP programs, because we cannot guarantee that the csum_{start, offset}
> > > fields are correct after XDP modifies the packets.
> >
> > We can try to fix or workaround this. A rough idea is for example by
> > using a flow dissector?
>
> How can we fix this with a flow dissector? Could you please explain in
> detail?

I mean you can use a flow dissector to probe the csum_start and csum_offset.
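
A minimal sketch of that idea, assuming plain IPv4 over Ethernet; a real
implementation would use the kernel's flow dissector rather than this
hand-rolled parse, and the names here are illustrative:

#include <stddef.h>
#include <stdint.h>

struct csum_loc { uint16_t start, offset; };

/* Re-derive csum_start/csum_offset from the actual headers after XDP may
 * have rewritten the packet, instead of trusting the stale vnet header.
 * Returns 0 on success, -1 if this is not an IPv4 TCP/UDP frame. */
static int probe_csum_loc(const uint8_t *pkt, size_t len, struct csum_loc *loc)
{
    if (len < 14 + 20)                       /* Ethernet + minimal IPv4 */
        return -1;
    if (pkt[12] != 0x08 || pkt[13] != 0x00)  /* EtherType: IPv4 only */
        return -1;
    size_t l4 = 14 + (size_t)(pkt[14] & 0x0f) * 4;  /* 14 + IHL * 4 */
    if (l4 > len)
        return -1;
    loc->start = (uint16_t)l4;               /* checksum covers the L4 part */
    switch (pkt[23]) {                       /* IPv4 protocol field */
    case 6:  loc->offset = 16; return 0;     /* TCP checksum field */
    case 17: loc->offset = 6;  return 0;     /* UDP checksum field */
    }
    return -1;
}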

>
> > Anyhow the GSO of virtio-net was marked as
> > dodgy which means the csum_start/offset needs to be validated again in
> > the xmit path. Or we can monitor if the packet is modified or not, if
> > not, we don't need to zero vnet header?
>
> Yes, this is a way, and we've thought about this solution.
> A relatively simple solution is to add a new flag to the XDP 

[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [Proposal] Relationship between XDP and rx-csum in virtio-net

2023-05-23 Thread Jason Wang
On Tue, May 23, 2023 at 9:51 PM Heng Qi  wrote:
>
> On Tue, May 23, 2023 at 09:30:28AM -0400, Michael S. Tsirkin wrote:
> > On Tue, May 23, 2023 at 05:18:20PM +0800, Heng Qi wrote:
> > > On Tue, May 23, 2023 at 03:15:37AM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, May 23, 2023 at 10:41:18AM +0800, Heng Qi wrote:
> > > > > On Mon, May 22, 2023 at 03:10:05PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Mon, May 22, 2023 at 08:12:00PM +0800, Heng Qi wrote:
> > > > > > > 1) Add a feature bit to the virtio specification to tell the 
> > > > > > > sender that a fully
> > > > > > > csumed packet must be sent.
> > > > > >
> > > > > > Who is the sender in this picture? The driver?
> > > > >
> > > > > The device or the driver.
> > > > >
> > > > > When the device is hw, the sender is more likely to be a device.
> > > > > When the device is sw, the sender can be a device or a driver.
> > > > >
> > > > > But in general, this feature is inclined to constrain the behavior of 
> > > > > the device and
> > > > > the driver from the receiving side.
> > > >
> > > > Based on above I am guessing you are talking about driver getting
> > > > packets from device, I wish you used terms from virtio spec.
> > >
> > > Yes, I'm going to use the terminology of the virtio spec.
> > >
> > > >
> > > > > For example:
> > > > > VIRTIO_NET_F_UNNECESSARY_CSUM : The driver tells the device that you 
> > > > > must send me a fully csumed packet.
> > > > >
> > > > > Then the specific implementation can be
> > > > >
> > > > > (1) the sender sends a fully csumed packet;
> > > > > (2) the receiver receives a CHECKSUM_PARTIAL packet, and the device 
> > > > > helps calculate the fully csum
> > > > > (because the two parties in the communication are located on the 
> > > > > same host, the packet is trusted.).
> > > > >
> > > > > In summary, if VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the 
> > > > > driver will no longer receive any packets marked CHECKSUM_PARTIAL.
> > > > >
> > > > > Thanks.
> > > >
> > > > This is what clearing VIRTIO_NET_F_GUEST_CSUM does.
> > >
> > > Yes, but with VIRTIO_NET_F_GUEST_CSUM cleared, although the device can
> > > receive a fully checksummed packet, we can no longer enjoy
> > > the device's ability to validate the packet checksum. That is, the value
> > > of \field{flags} in the virtio_net_hdr structure is set to 0, which means
> > > that the packet received by the driver will not be marked as
> > > VIRTIO_NET_HDR_F_DATA_VALID.
> > >
> > > So, we need a feature bit (let's say VIRTIO_NET_F_UNNECESSARY_CSUM).
> > > If VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the device must give the
> > > driver a fully checksummed packet, and the packet is validated by the
> > > device with \field{flags} set to VIRTIO_NET_HDR_F_DATA_VALID.
> > >
> > > >
> > > > I feel you are trying to say that clearing VIRTIO_NET_F_GUEST_CSUM
> > > > disables all offloads but you want to keep some of them?
> > > >
> > >
> > > No, what I mean is that a feature VIRTIO_NET_F_UNNECESSARY_CSUM is needed
> > > in addition to VIRTIO_NET_F_GUEST_CSUM, if both features are negotiated,
> > > then the driver may always receive packets marked as
> > > VIRTIO_NET_HDR_F_DATA_VALID, which means that we can now load XDP at the
> > > same time.
> >
> > Makes no sense to me. VIRTIO_NET_F_GUEST_CSUM set already allows
> > VIRTIO_NET_HDR_F_DATA_VALID:
>
> We need to focus on what happens when the XDP program is loaded:
>
> The current virtio-net needs to turn off the VIRTIO_NET_F_GUEST_CSUM
> feature when loading XDP. The main reason for doing this is because
> VIRTIO_NET_F_GUEST_CSUM allows to receive packets marked as
> VIRTIO_NET_HDR_F_NEEDS_CSUM. Such packets are not compatible with
> XDP programs, because we cannot guarantee that the csum_{start, offset}
> fields are correct after XDP modifies the packets.

We can try to fix or workaround this. A rough idea is for example by
using a flow dissector? Anyhow the GSO of virtio-net was marked as
dodgy which means the csum_start/offset needs to be validated again in
the xmit path. Or we can monitor if the packet is modified or not, if
not, we don't need to zero vnet header?

Thanks

>
> So in order for the driver to continue to enjoy the device's ability to
> validate packets while XDP is loaded, we need a new feature to tell the
> device not to deliver packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM (in
> other words, the device can only deliver packets marked as
> VIRTIO_NET_HDR_F_DATA_VALID).
>
> Thanks.
>
> > \item If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
> >   VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} can be
> >   set: if so, the packet checksum at offset \field{csum_offset}
> >   from \field{csum_start} and any preceding checksums
> >   have been validated.  The checksum on the packet is incomplete and
> >   if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags},
> >   then \field{csum_start} and \field{csum_offset} indicate how to calculate 
> > it
> >   (see 

[virtio-dev] Re: [RFC 1/4] vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag

2023-05-23 Thread Jason Wang
On Tue, May 23, 2023 at 1:55 PM Eugenio Perez Martin
 wrote:
>
> On Mon, May 22, 2023 at 11:22 PM Shannon Nelson  
> wrote:
> >
> > On 5/22/23 1:07 PM, Eugenio Perez Martin wrote:
> > >
> > > On Mon, May 22, 2023 at 9:23 PM Michael S. Tsirkin  
> > > wrote:
> > >>
> > >> On Mon, May 22, 2023 at 08:57:27PM +0200, Eugenio Pérez wrote:
> > >>> Signed-off-by: Eugenio Pérez 
> > >>> ---
> > >>>   include/uapi/linux/vhost_types.h | 2 ++
> > >>>   1 file changed, 2 insertions(+)
> > >>>
> > >>> diff --git a/include/uapi/linux/vhost_types.h 
> > >>> b/include/uapi/linux/vhost_types.h
> > >>> index c5690a8992d8..a1b316f49b38 100644
> > >>> --- a/include/uapi/linux/vhost_types.h
> > >>> +++ b/include/uapi/linux/vhost_types.h
> > >>> @@ -165,5 +165,7 @@ struct vhost_vdpa_iova_range {
> > >>>   #define VHOST_BACKEND_F_SUSPEND  0x4
> > >>>   /* Device can be resumed */
> > >>>   #define VHOST_BACKEND_F_RESUME  0x5
> > >>> +/* Device can enable virtqueues after DRIVER_OK */
> > >>
> > >> A bit more detail on what this means?
> > >> It's normally the driver that enables VQs, not the device itself ...
> > >>
> > >
> > > I agree, it is not well explained.
> > >
> > > What about "Device supports the driver to enable virtqueues after 
> > > DRIVER_OK"?
> > >
> > >>> +#define VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK  0x6
> >
> > Does this mean it is possible only after DRIVER_OK, or does it mean in
> > addition to before DRIVER_OK?  It isn't clear to me.
> >
> > Maybe something like
> > "Device supports virtqueue enable only after DRIVER_OK"
> > or
> > "Device supports virtqueue enable both before and after DRIVER_OK"
> >
>
> I agree too,
>
> With the previous suggestion it would be:
>
> "Device supports the driver to enable virtqueues both before and after
> DRIVER_OK"

Btw, I think it might be worth clarifying whether or not this is
supported by the current virtio spec. Spec seems to be unclear on
this:

1) The driver MUST NOT write a 0 to queue_enable.
2) if reset is supported, the driver can write 1 to re-enable a virtqueue

But it doesn't say if the driver can write 1 after DRIVER_OK without reset.

Thanks


>
> Does it look better?
>
> Thanks!
>
> > sln
> >
> >
> > >>>
> > >>>   #endif
> > >>> --
> > >>> 2.31.1
> > >>
> > >
> >
>


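
For concreteness, a rough userspace sketch of how the proposed flag could gate
enabling a virtqueue late, using the existing vhost-vdpa ioctls; only the
feature bit itself comes from this RFC, and error handling is elided:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

#define VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK 0x6  /* from this RFC */

/* Enable virtqueue qidx after DRIVER_OK, but only if the backend
 * advertises support for doing so. */
static int enable_vq_after_driver_ok(int vdpa_fd, unsigned int qidx)
{
    uint64_t features;
    struct vhost_vring_state state = { .index = qidx, .num = 1 };

    if (ioctl(vdpa_fd, VHOST_GET_BACKEND_FEATURES, &features) < 0)
        return -1;
    if (!(features & (1ULL << VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
        return -1;  /* backend cannot enable vqs after DRIVER_OK */
    return ioctl(vdpa_fd, VHOST_VDPA_SET_VRING_ENABLE, &state);
}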



[virtio-dev] Re: [virtio-comment] RE: [PATCH v2 1/2] transport-pci: Introduce legacy registers access commands

2023-05-23 Thread Jason Wang
Hi Parav:

On Wed, May 24, 2023 at 6:22 AM Parav Pandit  wrote:
>
>
> > From: Michael S. Tsirkin 
> > Sent: Tuesday, May 23, 2023 2:49 PM
> > > > iokit is from my previous work experience not virtio.
> > > > For virtio we had a lot of trouble with hacks like this on windows.
> > > I don't know why a Windows driver needed to talk to the PF driver until
> > > now, when the AQ didn't exist.
> >
> > What is it you want to know?
> >
> Why does a VF driver need to talk to a PF driver in Windows without AQ?
>
> > > > That's an important platform.
> > > >
> > > I am not sure how important it is.
> > > What I learnt is that a decade has passed and even now virtio drivers
> > > are not part of the OS distro.
> >
> > Because with QEMU they are so easy to side-load. So we never bothered.
> >
> I don't understand your comment.
> I was saying that, to my knowledge, Windows does not have virtio drivers in
> its distro, which you said is important.
>
> It doesn't need to, because the guest is going to do whatever it does anyway.
> >
> > yes. But hypervisor can do work-arounds or detect broken functionality and
> > stop cleanly.
> >
> You mentioned this several times.
> The requirement below still holds fine.
>
> > > The clear requirement is: not to fix legacy but to support legacy.
> > > Fixing legacy is the role of 1.x, not the role of the legacy interface
> > > itself using AQ/MMIO.
> >
> > > And for the purpose of BAR mapping, such a BAR in the VF can be mapped too.
> > > You didn't answer why the VF BAR is a problem while suggesting the PF BAR.
> >
> > VF BAR by itself is not a problem. Discovering VF BAR mapping offset 
> > through PF
> > is.
> >
> At the end of the email you write that "AQ is fine", but here you write that
> discovering it through the PF AQ is a problem.
> How should I interpret your comment?
>
> > > If notification for the VF is in the VF -> hence other config registers
> > > are also in the VF: such blanket statements do not help.
> >
> > not other config registers. what's needed in VF driver should be retrievable
> > through VF.
> >
> This applies to the modern device; for legacy emulation, I don't see this
> as mandatory.
>
> > > Your comments and suggestions are helpful.
> > > Some of them follow the loop that doesn't help.
> > > Again, thanks a lot for the inputs and reviews.
> > >
> > > The design discussion helped to produce two solutions here beyond what
> > > was started in v0.
> >
> > OK good. Now I suggest you make an effort to come up with some compromise
> > that will make this thread stop. I really can't say I even remember what we 
> > said
> > in the beginning.
>
> > You had this idea of starting with a problem statement document how about
> > doing that? Because I feel every time people bring up some idea and a 
> > problem
> > this idea would solve, you go "this is out of scope". Let's decide what the 
> > scope
> > is?
> >
> The requirements are simple and straightforward at the start of the cover
> letter. For quick reference, in short: a PCI VF passed through to a guest VM,
> supporting legacy + 1.x.
>
> I propose:
> Method-1:
> 1. VF legacy registers access over AQ of PF
> 2. VF driver notifications using MMIO of VF
>
> Method-2:
> 1. legacy registers access using two MMIO registers (data and addr)
> 2. VF driver notifications using MMIO of VF
>

Based on your above summary (and a more detailed one in another
thread), it's not hard to see that the tricky part is the attempt to
provide/mediate modern features for legacy. Assuming we all agree:

1) We want transitional devices but not legacy only devices
2) Mediation is a must for legacy

Then finding a way to reuse a modern interface is much better than
building a new path back to the old legacy maze. I really think we
should go with a way to mediate on top of a modern device. This
probably requires some new feature bit(s) like
VIRTIO_NET_F_LEGACY_HEADER but it has the lowest cost.

1) no new hardware interface like registers/capabilities etc
2) no datapath mediation; the hypervisor will negotiate
VIRTIO_NET_F_LEGACY_HEADER for legacy guest drivers
3) endianness is fine, the hypervisor knows it's modern so it's little-endian
4) reset is fine, the hypervisor knows it's modern so it can do proper mediation
5) notification is fine, the hypervisor knows it's modern so it can use
modern notification areas
6) all the behaviour is deterministic, with no undocumented or code-defined
legacy behaviours

Anything that prevents you from adding things like VIRTIO_NET_F_LEGACY_HEADER?
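
For reference, a sketch of the header difference such a bit would have to
cover; the struct names below are illustrative, not from the spec. The legacy
header lacks num_buffers unless VIRTIO_NET_F_MRG_RXBUF is negotiated, while
under VIRTIO_F_VERSION_1 it is always present, and endianness differs as in
point 3 above:

#include <stdint.h>

struct virtio_net_hdr_legacy {          /* legacy, no MRG_RXBUF: 10 bytes */
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;                   /* guest-native endian on legacy */
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

struct virtio_net_hdr_modern {          /* VIRTIO_F_VERSION_1: 12 bytes */
    struct virtio_net_hdr_legacy base;  /* all fields little-endian here */
    uint16_t num_buffers;               /* always present under VERSION_1 */
};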

You wrote the following in v1:

"""
There is virtio PCI VF.
The user who attaches this VF to the VM, it doesn’t know if its guest
is to run legacy driver or 1.x
Hence hypervisor doesn’t want to run special stack when guest may
(likely) be 1.x and device also supports 1.x.
"""

I don't understand this actually. For "special stack", are you
referring to the mediation path for legacy? I don't see how
VIRTIO_NET_F_LEGACY_HEADER differs from new hardware interfaces like
admin virtqueue in this regard. If management stack wants a
transitional device, Hypervisor just 

[virtio-dev] RE: [virtio-comment] RE: [PATCH v2 1/2] transport-pci: Introduce legacy registers access commands

2023-05-23 Thread Parav Pandit


> From: Michael S. Tsirkin 
> Sent: Tuesday, May 23, 2023 2:49 PM
> > > iokit is from my previous work experience not virtio.
> > > For virtio we had a lot of trouble with hacks like this on windows.
> > I don't know why a Windows driver needed to talk to the PF driver until
> > now, when the AQ didn't exist.
> 
> What is it you want to know?
>
Why does a VF driver need to talk to a PF driver in Windows without AQ?
 
> > > That's an important platform.
> > >
> > I am not sure how important it is.
> > What I learnt is that a decade has passed and even now virtio drivers
> > are not part of the OS distro.
> 
> Because with QEMU they are so easy to side-load. So we never bothered.
> 
I don't understand your comment.
I was saying that, to my knowledge, Windows does not have virtio drivers in
its distro, which you said is important.

> > It doesn't need to, because the guest is going to do whatever it does anyway.
> 
> yes. But hypervisor can do work-arounds or detect broken functionality and
> stop cleanly.
>
You mentioned this several times.
The requirement below still holds fine.
 
> > The clear requirement is: not to fix legacy but to support legacy.
> > Fixing legacy is the role of 1.x, not the role of the legacy interface
> > itself using AQ/MMIO.
>
> > And for the purpose of BAR mapping, such a BAR in the VF can be mapped too.
> > You didn't answer why the VF BAR is a problem while suggesting the PF BAR.
> 
> VF BAR by itself is not a problem. Discovering VF BAR mapping offset through 
> PF
> is.
> 
At the end of the email you write that "AQ is fine", but here you write that
discovering it through the PF AQ is a problem.
How should I interpret your comment?

> > If notification for the VF is in the VF -> hence other config registers
> > are also in the VF: such blanket statements do not help.
> 
> not other config registers. what's needed in VF driver should be retrievable
> through VF.
>
This applies to the modern device; for legacy emulation, I don't see this as
mandatory.

> > Your comments and suggestions are helpful.
> > Some of them follow the loop that doesn't help.
> > Again, thanks a lot for the inputs and reviews.
> >
> > The design discussion helped to produce two solutions here beyond what was
> > started in v0.
> 
> OK good. Now I suggest you make an effort to come up with some compromise
> that will make this thread stop. I really can't say I even remember what we 
> said
> in the beginning.

> You had this idea of starting with a problem statement document how about
> doing that? Because I feel every time people bring up some idea and a problem
> this idea would solve, you go "this is out of scope". Let's decide what the 
> scope
> is?
>
The requirements are simple and straightforward at the start of the cover
letter. For quick reference, in short: a PCI VF passed through to a guest VM,
supporting legacy + 1.x.

I propose:
Method-1:
1. VF legacy registers access over AQ of PF
2. VF driver notifications using MMIO of VF

Method-2:
1. legacy registers access using two MMIO registers (data and addr)
2. VF driver notifications using MMIO of VF
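
To make the two methods concrete, a hedged sketch; the command and register
layouts below are illustrative only, not the actual definitions from the
patches:

#include <stdint.h>

/* Method-1: legacy register access tunneled through the owner PF's AQ. */
struct virtio_admin_cmd_legacy_reg_wr {    /* hypothetical layout */
    uint8_t offset;        /* offset into the member VF's legacy registers */
    uint8_t reserved[7];
    uint8_t data[];        /* bytes to write starting at offset */
};

/* Method-2: a two-register MMIO window on the member VF itself. */
struct virtio_legacy_mmio_window {         /* hypothetical layout */
    uint32_t addr;         /* little-endian; selects a legacy register */
    uint32_t data;         /* read/write access to the selected register */
};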

> > I am asking: a 1.x PCI device already has the VF notification BAR, so why
> > don't we use it; why build an additional one in the device?
> 
> Following this logic, a 1.x PCI device also has modern common config.
> All you really truly need is a flag to make it behave like legacy, from POV of
> device config and VQs.
> 
A dual definition of the same area in the device is not an elegant way to
achieve it.
We already discussed that some of the area is RW, some is not, and it moves
location.

And it still doesn't solve the problem of reset.
Hence, either the AQ or the 2-register method would work fine.

It also keeps legacy isolated from the mainstream path.

> So let's try to keep drivers self-contained as much as possible.
>
You said "AQ is fine" at the end of email.
Hard to understand your mixed suggestion in context of sending v3.

> > And notification also works well by reusing the transitional device's
> > notification.
> 
> I don't know what these are. You had some weird
> +struct virtio_admin_cmd_lq_notify_query_result {
> +   u8 bar; /* PCI BAR number of the member VF */
> +   u8 reserved[7];
> +   le64 offset; /* Byte offset within the BAR */
> +};
> if you want to re-use the normal notifications for VFs we already describe 
> them
> using capabilities.
>
That capability is intermixed with the 1.x register queue_notify_off.
So, I see the following options:
a. create a new legacy-specific extended capability, or
b. say queue_notify_off doesn't apply to legacy notifications because this
register doesn't exist for legacy, or
c. exchange it via the AQ like the struct above.

> However, this creates a weird mix where there's this command that mostly
> follows legacy pci config layout except notifications which are there in the
> legacy pci config are actually somewhere else.  And this is just inconsistent 
> and
> confusing, and the reason for this is implementation defined.
Depends on how you see it.
It is at the same level of inconsistency as 1.x for a 

[virtio-dev] RE: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-23 Thread Parav Pandit



> From: Michael S. Tsirkin 
> Sent: Tuesday, May 23, 2023 2:17 PM
> 
> On Mon, May 15, 2023 at 02:30:55PM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin 
> > > Sent: Monday, May 15, 2023 6:09 AM
> > > To be more specific, device config often has the same layout under
> > > legacy and modern. Thus if getting an offset within device config,
> > > same decoding logic should be reusable between legacy and modern.
> > Modern device registers are directly accessible from the guest driver 
> > without
> mediation for VF,SF,SIOV.
> > Hence, there is no need to have commands for modern register access over
> some VQ.
> 
> Yes, there's need, and some of it you were pushing yourself.
Yes, there is a need, without the mediation of AQ.
The cfgq idea was discussed in depth, with its trade-offs.

> AQ will allow reduced memory with SRIOV, and support for SIOV.
Already discussed in v0 and v1: mediation via AQ is not the way for SRIOV.
SIOV is starting with the single goal of backward compatibility only; hence,
AQ doesn't look future-proof either.

And as for your summary email, we parked it aside for now.




[virtio-dev] Re: [virtio-comment] RE: [PATCH v2 1/2] transport-pci: Introduce legacy registers access commands

2023-05-23 Thread Michael S. Tsirkin
On Tue, May 23, 2023 at 05:13:54PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin 
> > Sent: Monday, May 22, 2023 5:35 PM
> 
> > 
> > iokit is from my previous work experience not virtio.
> > For virtio we had a lot of trouble with hacks like this on windows.
> I don't know why a Windows driver needed to talk to the PF driver until now,
> when the AQ didn't exist.

What is it you want to know?

> > That's an important platform.
> > 
> I am not sure how important it is.
> What I learnt is that a decade has passed and even now virtio drivers are
> not part of the OS distro.

Because with QEMU they are so easy to side-load. So we never bothered.

> > > > If you are mapping VF BAR, this has to be done by VF driver, and
> > > > please just avoid pain for everyone by exposing the necessary info
> > > > through the VF itself.
> > > > A capability seems perfect for this.
> > > >
> > > Why not the MMIO region as proposed in v0?
> > 
> > Are you asking what were issues with v0? The issue with v0 was not MMIO as
> > such at least for me.  The issue was that it was not giving us any way to 
> > work
> > around bugs in the legacy interface. AQ was a fix for that.
> > 
> It doesn't need to, because the guest is going to do whatever it does anyway.

yes. But hypervisor can do work-arounds or detect broken functionality
and stop cleanly.

> The clear requirement is: not to fix legacy but to support legacy.
> Fixing legacy is the role of 1.x, not the role of the legacy interface itself
> using AQ/MMIO.
> 
> And even if one desires to coalesce MMIO writes, it can still be done for
> the mac address.
> We discussed this.
> We also discussed that kernels as old as 2.6.32 use the CVQ for the mac
> address.
> 
> And even the device can apply the mac address once the count of bytes written
> reaches 6, if it wishes to.
> > 
> > Capability-based read/write requires extra redirection; it is less optimal
> > than MMIO.
> > >
> > > > An alternative is not to map VF BAR at all, steal space from PF BAR 
> > > > instead.
> > > How come an iokit will be able to access this PF BAR without binding to it?
> > 
> > I don't get what you are saying here.  You have to bind a PF driver.
> > Otherwise you can not send AQ command. That driver can map PF BAR into
> > applications.
> > 
> This is the usual model that we follow. When the PF driver is serving and
> emulating, it provides the shared platform for other devices for legacy
> functionality.
> It is for legacy, not for any other newer interface.
> 
> And for the purpose of BAR mapping, such a BAR in the VF can be mapped too.
> You didn't answer why the VF BAR is a problem while suggesting the PF BAR.

VF BAR by itself is not a problem. Discovering VF BAR mapping offset
through PF is.


> Such a VF BAR already exists, and asking for new resource creation doesn't
> fit your "simple" scheme suggestion either.
> 
> > 
> > > > Guest is not accessing this anyway. So I really do not see why not
> > > > from software point of view. There's a hardware reason? Could you
> > > > talk to hardware guys about this? You objected to AQ too then
> > > > hardware guys told you it is not a big deal.
> > > I didn't object to AQ for a hw reason.
> > > I explained AQ is not the best compared to MMIO because, by the laws of
> > > physics, it is slower: it needs to move more bits than MMIO.
> > > And so the capability-based tunneling is also slower than MMIO.
> > > >
> > > > > > So we have legacy emulation send commands to VF or to PF.  Okay.
> > > > > > But let us avoid the need for VF driver to send commands to PF to
> > initialize.
> > > > > > Just get all information it needs from VF itself.
> > > > > >
> > > > > >
> > > > > > Maybe it's a good idea to reuse existing notification
> > > > > > capability, or maybe a new one, but let's avoid making VF driver
> > > > > > depend on PF
> > > > commands.
> > > > > >
> > > > > We agreed in v1 on Jason's suggestion to have the AQ command and
> > > > > yet you and Jason hinder this in v2 with this exact repeated question.
> > > > Let's please avoid this and move forward.
> > > >
> > > > How was I supposed to raise this issue on v1 if it was not there?
> > > > Stop arguing in circles and things will move forward.
> > > v1 version clearly had AQ commands.
> > >
> > > Please revisit your comments.
> > > v0 -> you suggested:
> > >  - "why not AQ, like transport vq, it can coalesce partial mac writes".
> > >
> > > v1-> over AQ, you suggested:
> > >  - to "drop the size"
> > > - summarized to split 2 commands to 4 commands to split device and config
> > area.
> > >
> > > After that summary, you propose, why not self-contained in VF itself? -> 
> > > circle
> > back to v0 method.
> > > (Instead of mmio, suggesting capability, but not explaining why mmio is 
> > > not
> > fine. Which is better than cap).
> > >
> > > Then now you propose "have BAR in the PF which is owned by other PF
> > driver", with least reasoning.
> > > And it convolutes with your suggestion of PF driver access...
> > >
> > > So, I am bit lost 

[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-23 Thread Michael S. Tsirkin
On Mon, May 15, 2023 at 02:30:55PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin 
> > Sent: Monday, May 15, 2023 6:09 AM
> > To be more specific, device config often has the same layout under legacy 
> > and
> > modern. Thus if getting an offset within device config, same decoding logic
> > should be reusable between legacy and modern.
> Modern device registers are directly accessible from the guest driver without 
> mediation for VF,SF,SIOV.
> Hence, there is no need to have commands for modern register access over some 
> VQ.

Yes, there's need, and some of it you were pushing yourself.
AQ will allow reduced memory with SRIOV, and support for SIOV.

> Will reply to the rest of Jason's comments in some time.





[virtio-dev] RE: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-23 Thread Parav Pandit

> From: Michael S. Tsirkin 
> Sent: Tuesday, May 23, 2023 2:39 AM

> but I would like us to better
> document how and when the device switches to and from legacy mode, and for
> the switch to be somehow reusable for solutions that are closer to vdpa.
> 
To my knowledge vdpa does not have such a switch, nor is there such a switch
in the spec, nor in the vdpa or PCI device.

> Does enabling legacy commands cause the switch?  Or should we add a
> LEGACY_PCI command to enable/disable? I kind of like the second option ...

Some devices may find it helpful to issue this command on the AQ, to say
"enable your legacy emulation and allocate the needed resources".
I guess before getting distracted by this side topic, let's close on the AQ or
two-register MMIO approach in the other email thread.
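
If the second option were pursued, it could look roughly like the following;
the names and opcodes are purely illustrative, nothing like them exists in the
spec today:

/* Hypothetical admin command opcodes to switch a member device's legacy
 * emulation on and off explicitly, allocating/freeing device resources. */
enum {
    VIRTIO_ADMIN_CMD_LEGACY_MODE_ENABLE  = 0x1000,  /* illustrative */
    VIRTIO_ADMIN_CMD_LEGACY_MODE_DISABLE = 0x1001,  /* illustrative */
};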


[virtio-dev] RE: [virtio-comment] RE: [PATCH v2 1/2] transport-pci: Introduce legacy registers access commands

2023-05-23 Thread Parav Pandit


> From: Michael S. Tsirkin 
> Sent: Monday, May 22, 2023 5:35 PM

> 
> iokit is from my previous work experience not virtio.
> For virtio we had a lot of trouble with hacks like this on windows.
I don't know why a Windows driver needed to talk to the PF driver until now,
when the AQ didn't exist.
> That's an important platform.
> 
I am not sure how important it is.
What I learnt is that a decade has passed and even now virtio drivers are
not part of the OS distro.

> > > If you are mapping VF BAR, this has to be done by VF driver, and
> > > please just avoid pain for everyone by exposing the necessary info
> > > through the VF itself.
> > > A capability seems perfect for this.
> > >
> > Why not the MMIO region as proposed in v0?
> 
> Are you asking what were issues with v0? The issue with v0 was not MMIO as
> such at least for me.  The issue was that it was not giving us any way to work
> around bugs in the legacy interface. AQ was a fix for that.
> 
It doesn't need to, because the guest is going to do whatever it does anyway.
The clear requirement is: not to fix legacy but to support legacy.
Fixing legacy is the role of 1.x, not the role of the legacy interface itself
using AQ/MMIO.

And even if one desires to coalesce MMIO writes, it can still be done for the
mac address.
We discussed this.
We also discussed that kernels as old as 2.6.32 use the CVQ for the mac address.

And even the device can apply the mac address once the count of bytes written
reaches 6, if it wishes to.
> 
> > Capability-based read/write requires extra redirection; it is less optimal
> > than MMIO.
> >
> > > An alternative is not to map VF BAR at all, steal space from PF BAR 
> > > instead.
> > How come an iokit will be able to access this PF BAR without binding to it?
> 
> I don't get what you are saying here.  You have to bind a PF driver.
> Otherwise you can not send AQ command. That driver can map PF BAR into
> applications.
> 
This is the usual model that we follow. When the PF driver is serving and
emulating, it provides the shared platform for other devices for legacy
functionality.
It is for legacy, not for any other newer interface.

And for the purpose of BAR mapping, such a BAR in the VF can be mapped too.
You didn't answer why the VF BAR is a problem while suggesting the PF BAR.
Such a VF BAR already exists, and asking for new resource creation doesn't fit
your "simple" scheme suggestion either.

> 
> > > Guest is not accessing this anyway. So I really do not see why not
> > > from software point of view. There's a hardware reason? Could you
> > > talk to hardware guys about this? You objected to AQ too then
> > > hardware guys told you it is not a big deal.
> > I didn't object to AQ for a hw reason.
> > I explained AQ is not the best compared to MMIO because, by the laws of
> > physics, it is slower: it needs to move more bits than MMIO.
> > And so the capability-based tunneling is also slower than MMIO.
> > >
> > > > > So we have legacy emulation send commands to VF or to PF.  Okay.
> > > > > But let us avoid the need for VF driver to send commands to PF to
> initialize.
> > > > > Just get all information it needs from VF itself.
> > > > >
> > > > >
> > > > > Maybe it's a good idea to reuse existing notification
> > > > > capability, or maybe a new one, but let's avoid making VF driver
> > > > > depend on PF
> > > commands.
> > > > >
> > > > We agreed in v1 on Jason's suggestion to have the AQ command and
> > > > yet you and Jason hinder this in v2 with this exact repeated question.
> > > Let's please avoid this and move forward.
> > >
> > > How was I supposed to raise this issue on v1 if it was not there?
> > > Stop arguing in circles and things will move forward.
> > v1 version clearly had AQ commands.
> >
> > Please revisit your comments.
> > v0 -> you suggested:
> >  - "why not AQ, like transport vq, it can coalesce partial mac writes".
> >
> > v1-> over AQ, you suggested:
> >  - to "drop the size"
> > - summarized to split 2 commands to 4 commands to split device and config
> area.
> >
> > After that summary, you propose, why not self-contained in VF itself? -> 
> > circle
> back to v0 method.
> > (Instead of mmio, suggesting capability, but not explaining why mmio is not
> fine. Which is better than cap).
> >
> > Then now you propose "have BAR in the PF which is owned by other PF
> driver", with least reasoning.
> > And it convolutes with your suggestion of PF driver access...
> >
> > So, I am a bit lost on your circular proposal; or maybe it was just a
> > probe question... Not sure.
> 
> You did move things to AQ however not completely.  You kept notification in
> MMIO. If everything was in AQ we would be done as far as I am concerned
> (though Jason still does not like duplication of functionality with transport 
> vq,
> but that is a separate issue).
> 
It is a wrong design point to ask to "move everything to AQ", because AQ
cannot be the data link layer for everything.
One can then ask: why don't I put my 1 GB of data inside the AQ itself? AQ
should do 

[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [Proposal] Relationship between XDP and rx-csum in virtio-net

2023-05-23 Thread Heng Qi
On Tue, May 23, 2023 at 09:30:28AM -0400, Michael S. Tsirkin wrote:
> On Tue, May 23, 2023 at 05:18:20PM +0800, Heng Qi wrote:
> > On Tue, May 23, 2023 at 03:15:37AM -0400, Michael S. Tsirkin wrote:
> > > On Tue, May 23, 2023 at 10:41:18AM +0800, Heng Qi wrote:
> > > > On Mon, May 22, 2023 at 03:10:05PM -0400, Michael S. Tsirkin wrote:
> > > > > On Mon, May 22, 2023 at 08:12:00PM +0800, Heng Qi wrote:
> > > > > > 1) Add a feature bit to the virtio specification to tell the sender 
> > > > > > that a fully
> > > > > > csumed packet must be sent.
> > > > > 
> > > > > Who is the sender in this picture? The driver?
> > > > 
> > > > The device or the driver.
> > > > 
> > > > When the device is hw, the sender is more likely to be a device.
> > > > When the device is sw, the sender can be a device or a driver.
> > > >
> > > > But in general, this feature is inclined to constrain the behavior of 
> > > > the device and
> > > > the driver from the receiving side.
> > > 
> > > Based on above I am guessing you are talking about driver getting
> > > packets from device, I wish you used terms from virtio spec.
> > 
> > Yes, I'm going to use the terminology of the virtio spec.
> > 
> > > 
> > > > For example: 
> > > > VIRTIO_NET_F_UNNECESSARY_CSUM : The driver tells the device that you 
> > > > must send me a fully csumed packet.
> > > > 
> > > > Then the specific implementation can be
> > > > 
> > > > (1) the sender sends a fully csumed packet;
> > > > (2) the receiver receives a CHECKSUM_PARTIAL packet, and the device 
> > > > helps calculate the fully csum
> > > > (because the two parties in the communication are located on the 
> > > > same host, the packet is trusted.).
> > > > 
> > > > In summary, if VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the driver 
> > > > will no longer receive any packets marked CHECKSUM_PARTIAL.
> > > > 
> > > > Thanks.
> > > 
> > > This is what clearing VIRTIO_NET_F_GUEST_CSUM does.
> > 
> > Yes, but with VIRTIO_NET_F_GUEST_CSUM cleared, although the device can
> > receive a fully checksummed packet, we can no longer enjoy
> > the device's ability to validate the packet checksum. That is, the value
> > of \field{flags} in the virtio_net_hdr structure is set to 0, which means
> > that the packet received by the driver will not be marked as
> > VIRTIO_NET_HDR_F_DATA_VALID.
> > 
> > So, we need a feature bit (let's say VIRTIO_NET_F_UNNECESSARY_CSUM).
> > If VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the device must give the
> > driver a fully checksummed packet, and the packet is validated by the
> > device with \field{flags} set to VIRTIO_NET_HDR_F_DATA_VALID.
> > 
> > > 
> > > I feel you are trying to say that clearing VIRTIO_NET_F_GUEST_CSUM
> > > disables all offloads but you want to keep some of them?
> > > 
> > 
> > No, what I mean is that a feature VIRTIO_NET_F_UNNECESSARY_CSUM is needed
> > in addition to VIRTIO_NET_F_GUEST_CSUM, if both features are negotiated,
> > then the driver may always receive packets marked as
> > VIRTIO_NET_HDR_F_DATA_VALID, which means that we can now load XDP at the
> > same time.
> 
> Makes no sense to me. VIRTIO_NET_F_GUEST_CSUM set already allows
> VIRTIO_NET_HDR_F_DATA_VALID:

We need to focus on what happens when the XDP program is loaded:

The current virtio-net needs to turn off the VIRTIO_NET_F_GUEST_CSUM
feature when loading XDP. The main reason for doing this is because
VIRTIO_NET_F_GUEST_CSUM allows to receive packets marked as
VIRTIO_NET_HDR_F_NEEDS_CSUM. Such packets are not compatible with
XDP programs, because we cannot guarantee that the csum_{start, offset}
fields are correct after XDP modifies the packets.

So in order for the driver to continue to enjoy the device's ability to
validate packets while XDP is loaded, we need a new feature to tell the
device not to deliver packets marked as VIRTIO_NET_HDR_F_NEEDS_CSUM (in
other words, the device can only deliver packets marked as
VIRTIO_NET_HDR_F_DATA_VALID).

Thanks.
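
A small sketch of the driver-side contract being proposed, assuming a
hypothetical (unassigned) VIRTIO_NET_F_UNNECESSARY_CSUM bit; the header flags
are the existing ones from linux/virtio_net.h:

#include <stdbool.h>
#include <linux/virtio_net.h>   /* struct virtio_net_hdr, flag bits */

#define VIRTIO_NET_F_UNNECESSARY_CSUM 62   /* hypothetical, unassigned bit */

/* With GUEST_CSUM and the proposed bit both negotiated, the device must
 * never hand the driver a NEEDS_CSUM packet: every received packet is
 * either fully checksummed and validated (DATA_VALID) or carries no
 * checksum metadata at all, so XDP can safely rewrite it. */
static bool rx_hdr_safe_for_xdp(const struct virtio_net_hdr *hdr)
{
    return !(hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM);
}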

> \item If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
>   VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} can be
>   set: if so, the packet checksum at offset \field{csum_offset}
>   from \field{csum_start} and any preceding checksums
>   have been validated.  The checksum on the packet is incomplete and
>   if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags}, 
>   then \field{csum_start} and \field{csum_offset} indicate how to calculate it
>   (see Packet Transmission point 1).
> 
> Did you maybe mean if either feature is negotiated?
> 
> 
> 
> > > Again please use virtio terminology not Linux. to help you out,
> > > in current linux, VIRTIO_NET_HDR_F_NEEDS_CSUM and 
> > > VIRTIO_NET_HDR_F_DATA_VALID
> > > will set CHECKSUM_PARTIAL and CHECKSUM_UNNECESSARY respectively.
> > > 
> > 
> > Sure. Will do as you suggested.
> > 
> > Thanks.
> > 
> > > 
> > > -- 
> > > MST
> > > 
> > > 
> > > -
> > > To unsubscribe, e-mail: 

Re: [virtio-dev] [RFC PATCH 1/1] can: virtio: Initial virtio CAN driver.

2023-05-23 Thread Harald Mommer

Hello Vincent,

On 15.05.23 07:58, Vincent Mailhol wrote:

Hi Harald,

On Fri. 12 May 2023 at 22:19, Harald Mommer
  wrote:

Hello Vincent,

I searched for the old e-mail; this was one of those that slipped through.
Too many of those.

On 05.11.22 10:21, Vincent Mailhol wrote:

On Fri. 4 Nov 2022 at 20:13, Arnd Bergmann  wrote:

On Thu, Nov 3, 2022, at 13:26, Harald Mommer wrote:

On 25.08.22 20:21, Arnd Bergmann wrote:

...

The messages are not necessarily processed in sequence by the CAN stack.
CAN is priority based. The lower the CAN ID the higher the priority. So
a message with CAN ID 0x100 can surpass a message with ID 0x123 if the
hardware is not just simple basic CAN controller using a single TX
mailbox with a FIFO queue on top of it.

Really? I acknowledge that it is priority based *on the bus*, i.e. if
two devices A and B on the same bus try to send CAN ID 0x100 and 0x123
at the same time, then device A will win the CAN arbitration.
However, I am not aware of any devices which reorder their own stack
according to the CAN IDs. If I first send CAN ID 0x123 and then ID
0x100 on the device stack, 0x123 would still go out first, right?

The CAN hardware may be a basic CAN hardware: Single mailbox only with a
TX FIFO on top of this.

No reordering takes place, the CAN hardware will try to arbitrate the
CAN bus with a low priority CAN message (big CAN ID) while some high
priority CAN message (small CAN ID) is waiting in the FIFO. This is
called "internal priority inversion", a property of basic CAN hardware.
A basic CAN hardware does exactly what you describe.

Should the FIFO be in software, it's a bad idea to try to improve this
by doing some software sorting; the processing time needed is likely to
make things even worse. Therefore no software does this, or at least it's
not recommended to do this.

But the hardware may also be a better one. No FIFO but a lot of TX
mailboxes. A full CAN hardware tries to arbitrate the bus using the
highest priority waiting CAN message considering all hardware TX
mailboxes. Such a better (full CAN) hardware does not cause "internal
priority inversion" but tries to arbitrate the bus in the correct order
given by the message IDs.

At the level we are at with our virtio-can device, we don't know which CAN
hardware is actually used or how it's used. We are using SocketCAN; no
information about the properties of the underlying hardware is provided
at any API. It may be basic CAN using a FIFO and a single TX mailbox, or
full CAN using a lot of TX mailboxes in parallel.

On the bus it's always guaranteed that the sender with the lowest CAN ID
wins regardless of which hardware is used; the only difference is whether
we have "internal priority inversion" or not.

If I look at the CAN stack = Software + hardware (and not only software)
it's correct: The hardware device may re-order if it's a better (full
CAN) one and thus the actual sending on the bus is not done in the same
sequence as the messages were provided internally (e.g. at some socket).

OK. Thanks for the clarification.

So, you are using scatterlist to be able to interface with the
different CAN mailboxes. But then, it means that all the heuristics to
manage those mailboxes are done in the virtio host.


There is some heuristic when VIRTIO_CAN_F_LATE_TX_ACK is supported on 
the device side. The feature means that the host marks a TX message as 
done not at the moment when it's scheduled for sending but when it has 
been really sent on the bus.


To do that, SocketCAN needs to be configured to receive its own sent
messages. On RX the device identifies the message which has been sent on
the bus. The heuristic goes through the list of pending messages,
checks CAN ID and payload, and marks the respective message as done.
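
The matching heuristic described above, as a minimal sketch; the pending-list
structure is illustrative, and real code would also bound the payload
comparison by the frame's DLC:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct pending_tx {
    uint32_t can_id;
    uint8_t  len;
    uint8_t  data[8];
    bool     done;
};

/* An own-sent frame came back from SocketCAN: find the matching pending
 * TX entry (same CAN ID and payload) and mark it done, confirming that
 * the frame really went out on the bus. */
static int ack_sent_frame(struct pending_tx *list, size_t n,
                          uint32_t can_id, const uint8_t *data, uint8_t len)
{
    for (size_t i = 0; i < n; i++) {
        struct pending_tx *p = &list[i];
        if (!p->done && p->can_id == can_id && p->len == len &&
            memcmp(p->data, data, len) == 0) {
            p->done = true;
            return 0;
        }
    }
    return -1;  /* lost own-sent frame: the stuck-TX case described below */
}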


Problem with SocketCAN: There is a load case (full sending without any 
delay in both directions) where it seems that own sent messages are 
getting lost in the software stack. Thus we get in a state where the 
list of pending messages gets full and TX gets stuck.


The feature flag is not offered in the open source device, it is only 
experimental in our proprietary device and normally disabled.


Without this feature there is no heuristic, just send to SocketCAN and 
put immediately as used (done). But for an AUTOSAR CAN driver this means 
CanIf_TxConfirmation() came too early, not late when the message "has 
been transmitted on the CAN network" but already earlier when the 
message is put to SocketCAN scheduled for transmission.



Did you consider exposing the number of supported mailboxes as a
configuration parameter and let the virtio guest manage these? In
Linux, it is supported since below commit:

   commit 038709071328 ("can: dev: enable multi-queue for SocketCAN devices")
   

Re: [virtio-dev] Re: [Proposal] Relationship between XDP and rx-csum in virtio-net

2023-05-23 Thread Michael S. Tsirkin
On Tue, May 23, 2023 at 05:18:20PM +0800, Heng Qi wrote:
> On Tue, May 23, 2023 at 03:15:37AM -0400, Michael S. Tsirkin wrote:
> > On Tue, May 23, 2023 at 10:41:18AM +0800, Heng Qi wrote:
> > > On Mon, May 22, 2023 at 03:10:05PM -0400, Michael S. Tsirkin wrote:
> > > > On Mon, May 22, 2023 at 08:12:00PM +0800, Heng Qi wrote:
> > > > > 1) Add a feature bit to the virtio specification to tell the sender 
> > > > > that a fully
> > > > > csumed packet must be sent.
> > > > 
> > > > Who is the sender in this picture? The driver?
> > > 
> > > The device or the driver.
> > > 
> > > When the device is hw, the sender is more likely to be a device.
> > > When the device is sw, the sender can be a device or a driver.
> > >
> > > But in general, this feature is inclined to constrain the behavior of the 
> > > device and
> > > the driver from the receiving side.
> > 
> > Based on above I am guessing you are talking about driver getting
> > packets from device, I wish you used terms from virtio spec.
> 
> Yes, I'm going to use the terminology of the virtio spec.
> 
> > 
> > > For example: 
> > > VIRTIO_NET_F_UNNECESSARY_CSUM : The driver tells the device that you must 
> > > send me a fully csumed packet.
> > > 
> > > Then the specific implementation can be
> > > 
> > > (1) the sender sends a fully csumed packet;
> > > (2) the receiver receives a CHECKSUM_PARTIAL packet, and the device helps 
> > > calculate the fully csum
> > > (because the two parties in the communication are located on the same 
> > > host, the packet is trusted.).
> > > 
> > > In summary, if VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the driver 
> > > will no longer receive any packets marked CHECKSUM_PARTIAL.
> > > 
> > > Thanks.
> > 
> > This is what clearing VIRTIO_NET_F_GUEST_CSUM does.
> 
> Yes, but with VIRTIO_NET_F_GUEST_CSUM cleared, although the device can
> receive a fully checksummed packet, we can no longer enjoy
> the device's ability to validate the packet checksum. That is, the value
> of \field{flags} in the virtio_net_hdr structure is set to 0, which means
> that the packet received by the driver will not be marked as
> VIRTIO_NET_HDR_F_DATA_VALID.
> 
> So, we need a feature bit (let's say VIRTIO_NET_F_UNNECESSARY_CSUM).
> If VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the device must give the
> driver a fully checksummed packet, and the packet is validated by the
> device with \field{flags} set to VIRTIO_NET_HDR_F_DATA_VALID.
> 
> > 
> > I feel you are trying to say that clearing VIRTIO_NET_F_GUEST_CSUM
> > disables all offloads but you want to keep some of them?
> > 
> 
> No, what I mean is that a feature VIRTIO_NET_F_UNNECESSARY_CSUM is needed
> in addition to VIRTIO_NET_F_GUEST_CSUM, if both features are negotiated,
> then the driver may always receive packets marked as
> VIRTIO_NET_HDR_F_DATA_VALID, which means that we can now load XDP at the
> same time.

Makes no sense to me. VIRTIO_NET_F_GUEST_CSUM set already allows
VIRTIO_NET_HDR_F_DATA_VALID:
\item If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
  VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags} can be
  set: if so, the packet checksum at offset \field{csum_offset}
  from \field{csum_start} and any preceding checksums
  have been validated.  The checksum on the packet is incomplete and
  if bit VIRTIO_NET_HDR_F_RSC_INFO is not set in \field{flags}, 
  then \field{csum_start} and \field{csum_offset} indicate how to calculate it
  (see Packet Transmission point 1).

Did you maybe mean if either feature is negotiated?
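
For concreteness, a sketch of the computation the quoted text asks the
receiver to perform under NEEDS_CSUM, assuming the sender seeded the checksum
field with the pseudo-header sum, as is usual for partial checksums:

#include <stddef.h>
#include <stdint.h>

static void complete_partial_csum(uint8_t *pkt, size_t len,
                                  uint16_t csum_start, uint16_t csum_offset)
{
    uint32_t sum = 0;
    size_t i;

    for (i = csum_start; i + 1 < len; i += 2)  /* 16-bit words, big-endian */
        sum += (uint32_t)(pkt[i] << 8) | pkt[i + 1];
    if ((len - csum_start) & 1)                /* odd trailing byte */
        sum += (uint32_t)pkt[len - 1] << 8;
    while (sum >> 16)                          /* fold the carries */
        sum = (sum & 0xffff) + (sum >> 16);

    uint16_t csum = (uint16_t)~sum;
    pkt[csum_start + csum_offset]     = csum >> 8;
    pkt[csum_start + csum_offset + 1] = (uint8_t)(csum & 0xff);
}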



> > Again please use virtio terminology not Linux. to help you out,
> > in current linux, VIRTIO_NET_HDR_F_NEEDS_CSUM and 
> > VIRTIO_NET_HDR_F_DATA_VALID
> > will set CHECKSUM_PARTIAL and CHECKSUM_UNNECESSARY respectively.
> > 
> 
> Sure. Will do as you suggested.
> 
> Thanks.
> 
> > 
> > -- 
> > MST
> > 
> > 
> > -
> > To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org





Re: [virtio-dev] Re: [Proposal] Relationship between XDP and rx-csum in virtio-net

2023-05-23 Thread Heng Qi
On Tue, May 23, 2023 at 03:15:37AM -0400, Michael S. Tsirkin wrote:
> On Tue, May 23, 2023 at 10:41:18AM +0800, Heng Qi wrote:
> > On Mon, May 22, 2023 at 03:10:05PM -0400, Michael S. Tsirkin wrote:
> > > On Mon, May 22, 2023 at 08:12:00PM +0800, Heng Qi wrote:
> > > > 1) Add a feature bit to the virtio specification to tell the sender 
> > > > that a fully
> > > > csumed packet must be sent.
> > > 
> > > Who is the sender in this picture? The driver?
> > 
> > The device or the driver.
> > 
> > When the device is hw, the sender is more likely to be a device.
> > When the device is sw, the sender can be a device or a driver.
> >
> > But in general, this feature is inclined to constrain the behavior of the 
> > device and
> > the driver from the receiving side.
> 
> Based on above I am guessing you are talking about driver getting
> packets from device, I wish you used terms from virtio spec.

Yes, I'm going to use the terminology of the virtio spec.

> 
> > For example: 
> > VIRTIO_NET_F_UNNECESSARY_CSUM : The driver tells the device that you must 
> > send me a fully csumed packet.
> > 
> > Then the specific implementation can be
> > 
> > (1) the sender sends a fully csumed packet;
> > (2) the receiver receives a CHECKSUM_PARTIAL packet, and the device helps 
> > calculate the fully csum
> > (because the two parties in the communication are located on the same 
> > host, the packet is trusted.).
> > 
> > In summary, if VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the driver will 
> > no longer receive any packets marked CHECKSUM_PARTIAL.
> > 
> > Thanks.
> 
> This is what clearing VIRTIO_NET_F_GUEST_CSUM does.

Yes, but with VIRTIO_NET_F_GUEST_CSUM cleared, although the device can
receive a fully checksummed packet, we can no longer enjoy
the device's ability to validate the packet checksum. That is, the value
of \field{flags} in the virtio_net_hdr structure is set to 0, which means
that the packet received by the driver will not be marked as
VIRTIO_NET_HDR_F_DATA_VALID.

So, we need a feature bit (let's say VIRTIO_NET_F_UNNECESSARY_CSUM).
If VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the device must give the
driver a fully checksummed packet, and the packet is validated by the
device with \field{flags} set to VIRTIO_NET_HDR_F_DATA_VALID.

> 
> I feel you are trying to say that clearing VIRTIO_NET_F_GUEST_CSUM
> disables all offloads but you want to keep some of them?
> 

No, what I mean is that a feature VIRTIO_NET_F_UNNECESSARY_CSUM is needed
in addition to VIRTIO_NET_F_GUEST_CSUM, if both features are negotiated,
then the driver may always receive packets marked as
VIRTIO_NET_HDR_F_DATA_VALID, which means that we can now load XDP at the
same time.

> Again please use virtio terminology not Linux. to help you out,
> in current linux, VIRTIO_NET_HDR_F_NEEDS_CSUM and VIRTIO_NET_HDR_F_DATA_VALID
> will set CHECKSUM_PARTIAL and CHECKSUM_UNNECESSARY respectively.
> 

Sure. Will do as you suggested.

Thanks.

> 
> -- 
> MST
> 
> 
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org




[virtio-dev] Re: [Proposal] Relationship between XDP and rx-csum in virtio-net

2023-05-23 Thread Michael S. Tsirkin
On Tue, May 23, 2023 at 10:41:18AM +0800, Heng Qi wrote:
> On Mon, May 22, 2023 at 03:10:05PM -0400, Michael S. Tsirkin wrote:
> > On Mon, May 22, 2023 at 08:12:00PM +0800, Heng Qi wrote:
> > > 1) Add a feature bit to the virtio specification to tell the sender that 
> > > a fully
> > > csumed packet must be sent.
> > 
> > Who is the sender in this picture? The driver?
> 
> The device or the driver.
> 
> When the device is hw, the sender is more likely to be a device.
> When the device is sw, the sender can be a device or a driver.
>
> But in general, this feature is inclined to constrain the behavior of the 
> device and
> the driver from the receiving side.

Based on above I am guessing you are talking about driver getting
packets from device, I wish you used terms from virtio spec.

> For example: 
> VIRTIO_NET_F_UNNECESSARY_CSUM : The driver tells the device that you must 
> send me a fully csumed packet.
> 
> Then the specific implementation can be
> 
> (1) the sender sends a fully csumed packet;
> (2) the receiver receives a CHECKSUM_PARTIAL packet, and the device helps 
> calculate the fully csum
> (because the two parties in the communication are located on the same 
> host, the packet is trusted.).
> 
> In summary, if VIRTIO_NET_F_UNNECESSARY_CSUM is negotiated, the driver will 
> no longer receive any packets marked CHECKSUM_PARTIAL.
> 
> Thanks.

This is what clearing VIRTIO_NET_F_GUEST_CSUM does.

I feel you are trying to say that clearing VIRTIO_NET_F_GUEST_CSUM
disables all offloads but you want to keep some of them?

Again please use virtio terminology not Linux. to help you out,
in current linux, VIRTIO_NET_HDR_F_NEEDS_CSUM and VIRTIO_NET_HDR_F_DATA_VALID
will set CHECKSUM_PARTIAL and CHECKSUM_UNNECESSARY respectively.


-- 
MST





[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-23 Thread Michael S. Tsirkin
On Sat, May 06, 2023 at 03:01:33AM +0300, Parav Pandit wrote:
> This short series introduces legacy registers access commands for the owner
> group member PCI PF to access the legacy registers of the member VFs.
> 
> If in future any SIOV devices to support legacy registers, they
> can be easily supported using same commands by using the group
> member identifiers of the future SIOV devices.
> 
> More details as overview, motivation, use case are further described
> below.
> 
> Patch summary:
> --
> patch-1 adds administrative virtqueue commands
> patch-2 adds its conformance section
> 
> This short series is on top of latest work [1] from Michael.
> It uses the newly introduced administrative virtqueue facility with 3 new
> commands which use the existing virtio_admin_cmd.
> 
> [1] https://lists.oasis-open.org/archives/virtio-comment/202305/msg00112.html
> 
> Usecase:
> 
> 1. A hypervisor/system needs to provide transitional
>    virtio devices to the guest VM at a scale of thousands,
>    typically one to eight devices per VM.
> 
> 2. A hypervisor/system needs to provide such devices using a
>    vendor agnostic driver in the hypervisor system.
> 
> 3. A hypervisor system prefers to have a single stack regardless of
>    virtio device type (net/blk) and be future compatible with a
>    single vfio stack using SR-IOV or other scalable device
>    virtualization technology to map PCI devices to the guest VM
>    (as transitional or otherwise).
> 
> Motivation/Background:
> --
> The existing virtio transitional PCI device is missing support for
> PCI SR-IOV based devices. Currently it does not work beyond
> PCI PF, or as software emulated device in reality. Currently it
> has below cited system level limitations:
> 
> [a] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.
> 
> [b] cpu arch citation:
> Intel 64 and IA-32 Architectures Software Developer’s Manual:
> The processor’s I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
> 
> [c] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range will be
> aligned to a 4 KB boundary.
> 
> Overview:
> -
> Above usecase requirements can be solved by PCI PF group owner accessing
> its group member PCI VFs legacy registers using an admin virtqueue of
> the group owner PCI PF.
> 
> Two new admin virtqueue commands are added which read/write PCI VF
> registers.
> 
> The third command suggested by Jason queries the VF device's driver
> notification region.
> 
> Software usage example:
> ---
> One way to use and map to the guest VM is by using vfio driver
> framework in Linux kernel.
> 
> +--+
> |pci_dev_id = 0x100X   |
> +---|pci_rev_id = 0x0  |-+
> |vfio device|BAR0 = I/O region | |
> |   |Other attributes  | |
> |   +--+ |
> ||
> +   +--+ +-+ |
> |   |I/O BAR to AQ | | Other vfio  | |
> |   |rd/wr mapper  | | functionalities | |
> |   +--+ +-+ |
> ||
> +--+-+---+
>| |
>   +++   +++
>   | +-+ |   | PCI VF device A |
>   | | AQ  |-+>+-+ |
>   | +-+ |   |   | | legacy regs | |
>   | PCI PF device   |   |   | +-+ |
>   +-+   |   +-+
> |
> |   +++
> |   | PCI VF device N |
> +>+-+ |
> | | legacy regs | |
> | +-+ |
> +-+
> 
> 2. Virtio pci driver to bind to the listed device id and
>    use it as a native device in the host.
> 
> 3. Use it in a lightweight hypervisor to run a bare-metal OS.
> 
> Please review.
> 
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/167
> Signed-off-by: Parav Pandit 


Just thought of an issue: one use-case this can't address is SVQ.
Specifically, legacy has:

/* A 32-bit r/w PFN for the currently selected queue */
#define VIRTIO_PCI_QUEUE_PFN        8


this can only address up to 44 bits (a 32-bit PFN of 4 KiB pages), which
might be ok for legacy guests, but if instead we want the queue to live in
host memory (this is what SVQ does) then this does not work, as the host is
likely to have more than 2^44 bytes of memory.
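
The arithmetic behind that limitation; VIRTIO_PCI_QUEUE_ADDR_SHIFT is the
existing constant from the Linux legacy virtio-pci header:

#include <stdint.h>

#define VIRTIO_PCI_QUEUE_ADDR_SHIFT 12  /* legacy queues are 4 KiB aligned */

/* Highest queue address a legacy driver can express through the 32-bit
 * VIRTIO_PCI_QUEUE_PFN register: 0xffffffff << 12, just under 2^44.
 * A shadow virtqueue living in host memory above this is unreachable. */
static const uint64_t max_legacy_queue_addr =
        (uint64_t)UINT32_MAX << VIRTIO_PCI_QUEUE_ADDR_SHIFT;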

Further, one of the advantages of modern is that the used ring can live
in host memory with aliasing when the available ring is passed through directly