[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash

2023-05-10 Thread Michael S. Tsirkin
On Wed, May 10, 2023 at 05:15:37PM +0800, Heng Qi wrote:
> 
> 
> > On 2023/5/9 11:15 PM, Michael S. Tsirkin wrote:
> > On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:
> > > 
> > > > On 2023/5/5 10:56 PM, Michael S. Tsirkin wrote:
> > > > On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:
> > > > > On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:
> > > > > > > > On 2023/4/26 10:48 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:
> > > > > > > > > This does not mean that every device needs to implement and 
> > > > > > > > > support all of
> > > > > > > > > these, they can choose to support some protocols they want.
> > > > > > > > > 
> > > > > > > > > I add these because we have scale application scenarios for 
> > > > > > > > > modern protocols
> > > > > > > > > VXLAN-GPE/GENEVE:
> > > > > > > > > 
> > > > > > > > > +\item In scenarios where the same flow passing through 
> > > > > > > > > different tunnels is expected to be received in the same 
> > > > > > > > > queue,
> > > > > > > > > +  warm caches, less locking, etc. can be exploited to 
> > > > > > > > > improve receive performance.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a 
> > > > > > > > > little crossover.
> > > > > > > > > 
> > > > > > > > > Thanks.
> > > > > > > > But VXLAN-GPE/GENEVE can use source port for entropy.
> > > > > > > > 
> > > > > > > > It is recommended that the UDP source port number
> > > > > > > >  be calculated using a hash of fields from the inner 
> > > > > > > > packet
> > > > > > > > 
> > > > > > > > That is best because
> > > > > > > > it allows end to end control and is protocol agnostic.
> > > > > > > Yes, I agree with this; I don't think we have an argument on this 
> > > > > > > point right now. :)
> > > > > > > 
> > > > > > > For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have 
> > > > > > > to deal
> > > > > > > with
> > > > > > > scenarios where the same flow passes through different tunnels.
> > > > > > > 
> > > > > > > Having them hashed to the same RX queue is hard to do via the 
> > > > > > > outer headers.
> > > > > > > > All that is missing is symmetric Toepliz and all is well?
> > > > > > > The scenarios above or in the commit log also require inner 
> > > > > > > headers.
> > > > > > Hmm I am not sure I get it 100%.
> > > > > > Could you show an example with inner header hash in the port #,
> > > > > > hash is symmetric, and you still have trouble?
> > > > > > 
> > > > > > 
> > > > > > It kind of sounds like not enough entropy is not the problem
> > > > > > at this point.
> > > > > Sorry for the late reply. :)
> > > > > 
> > > > > For modern tunneling protocols, yes.
> > > > > 
> > > > > > You now want to drop everything from the header
> > > > > > except the UDP source port. Is that a fair summary?
> > > > > > 
> > > > > For example, for the same flow passing through different VXLAN 
> > > > > tunnels,
> > > > > packets in this flow have the same inner header and different outer
> > > > > headers. Sometimes these packets of the flow need to be hashed to the
> > > > > same rxq, then we can use the inner header as the hash input.
> > > > > 
> > > > > Thanks!
> > > > So, they will have the same source port yes?
> > > Yes. The outer source port can be calculated using the 5-tuple of the
> > > original packet,
> > > and the outer ports are the same but the outer IPs are different after
> > > different directions of the same flow pass through different tunnels.
> > > > Any way to use that
> > > We use it in monitoring, firewall and other scenarios.
> > > 
> > > > so we don't depend on a specific protocol?
> > > Yes, selected tunneling protocols can be used in this scenario like this.
> > > 
> > > Thanks.
> > > 
> > No, the question was - can we generalize this somehow then?
> > For example, a flag to ignore source IP when hashing?
> > Or maybe just for UDP packets?
> 
> 1. I think the common solution is based on the inner header, so that
> GRE/IPIP tunnels can also enjoy inner symmetric hashing.
> 
> 2. The VXLAN spec does not say that the outer source port in both
> directions of the same flow must be the same [1]
> (although the outer source port is calculated based on the consistent hash
> in the kernel; the consistent hash will sort the five-tuple before
> calculating the hash),
> but it is best not to assume that consistent hashing is used in all VXLAN
> implementations.

I agree, best not to assume if it's not in the spec.
The requirement to hash both sides to the same queue might
not be necessary for everyone though, right?

> The GENEVE spec uses "SHOULD" [2].

What about other tunnels? Could you summarize please?
SHOULD means "if you ignore this, things will work, but not well".
You mentioned concerns such as worse performance;
that is fine with SHOULD. Is inner

[virtio-dev] RE: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Parav Pandit
> From: Michael S. Tsirkin 
> Sent: Wednesday, May 10, 2023 5:33 PM
> 
> I think some kind of "enable" command for VFs might have value. No?

Unlike SIOV devices, the VF life cycle is under PF control; I am not sure how the 
enable command can help.
After SR-IOV is enabled, a VF is sort of enabled; it may not be ready, but that 
is guarded by the device reset flow.




[virtio-dev] RE: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Parav Pandit
Hi Jason, Michael,

> From: virtio-comm...@lists.oasis-open.org On Behalf Of Parav Pandit
> Sent: Wednesday, May 10, 2023 1:34 PM
> 
> > From: Michael S. Tsirkin 
> > Sent: Wednesday, May 10, 2023 12:16 PM
> >
> > On Wed, May 10, 2023 at 12:11:50PM -0400, Parav Pandit wrote:
> > >
> > >
> > > On 5/10/2023 2:04 AM, Michael S. Tsirkin wrote:
> > > > On Mon, May 08, 2023 at 10:23:39AM +0800, Jason Wang wrote:
> > > > > > I thought so too originally. Unfortunately I now think that
> > > > > > no, legacy is not going to be a byproduct of transport
> > > > > > virtqueue for modern - it is different enough that it needs 
> > > > > > dedicated
> commands.
> > > > >
> > > > > If you mean the transport virtqueue, I think some dedicated
> > > > > commands for legacy are needed. Then it would be a transport
> > > > > that supports transitional devices. It would be much better than
> > > > > having commands for a partial transport like this patch did.
> > > >
> > > > OK I am beginning to get what you are saying.  So your criticism
> > > > is
> > > > this: what if device supports vq transport for modern, and we want
> > > > to build a transitional device on top.  how will that look. yes?
> > > > A reasonable thing to include at least in the commit log. Parav?
> > > >
> > > I am still trying to understand what "vq transport for modern" is.
> > > Do you mean transporting currently defined config space access over a vq?
> > > If so, does this VQ belong to the guest or the hypervisor?
> >
> > https://lore.kernel.org/all/20220826100034.200432-2-
> > lingshan.zhu%40intel.com/t.mbox.gz
> 
> The gz link is not accessible.
> But I got the right link [1].
> 
> [1] https://lore.kernel.org/all/20220826100034.200432-2-
> lingshan@intel.com/
> 
> 1. The above patch cover letter [1] is missing the basic objective/problem
> statement, i.e., why is a transport virtqueue needed?
> But I probably get the idea of [1], as we did the AQ.
> 
> 2. The commit log talks about:
> a. querying resources of the management device (aka the group owner in the AQ now),
> b. creating and destroying the managed device (aka the group owner creating group
> member devices),
> c. configuring the managed device (aka the group owner configuring/composing group
> member devices such as VFs, SFs, SIOV).
> 
> So, all of 2.a to 2.c belong to the admin group owner and the group
> management commands, like how they are defined in the AQ proposal.
> 
> So, 3 out of the 4 motivations are achieved by the AQ proposal.
> This AQ belongs to the hypervisor. I am clear on this part.
> 
> The 4th point in the cover letter is: "config virtqueues of the managed device".
> 
> This work belongs to direct driver -> device communication using a queue
> from the driver to the device.
> So, I imagine this work can be done using a queue by the guest driver and
> serviced by the device, like how a guest driver configures a queue today
> without any mediation.
> For the PCI and MMIO transports, surely this can be done by the PCI device
> directly, be it a PF, VF or SIOV device
> (instead of config registers, using a new queue interface). Looks fine to me.
> 
> Can this new cfg queue be mediated in software like the CVQ is? Maybe yes.
> Should it always be mediated when it belongs to a VF or SIOV device? Mostly no,
> because it is yet another VQ for the PF, VF, SIOV.
> 
> I have yet to parse the rest of the 4 patches; please give me some time to
> review them.

I went over the past work of [1], [2].

[1] https://lore.kernel.org/all/20220826100034.200432-2-lingshan@intel.com/
[2] https://lists.oasis-open.org/archives/virtio-comment/202208/msg00141.html

The "virtio q as transport" in [2] is bit misleading as its only role is to 
transport the _registers_ of the SIOV_R1 device through its parent PF.
Rest of the work is the pure management work to manage the life cycle of SIOV 
devices (create/delete/configure/compose).

And the motivation is also clear is to provide composing a virtio device for 
the guest VM for the backward compatibility for 1.x part of the specification.

It seems fine and indeed orthogonal to me that: it is for backward 
compatibility for already defined config fields for existing guest VM driver.

It does not conflict with the cfgq/cmdq idea whose main purpose is for the new 
config fields, new use cases that doesn't require any mediation.
Such cfgq works across PF, VF, SF/SIOV devices in uniform way without mediation.
It also scales device register memory also very well in predictable way.

The registers and feature bits access described in [2] can certainly be done 
using new, non-legacy AQ commands.
Legacy and non-legacy commands have different semantics, as Michael 
mentioned.

The AQ semantics that Michael defined, as opposed to "virtqueue as transport", 
fit well for the use case described in [1].

There are ongoing changes in the MSI-X area and SIOV, so you might want to 
wait for them.
Or the proposed commands in [1] should be tagged as siov_r1; then things will 
be cleaner.

With that, I don't see the 3 legacy commands conflicting with [1] in any way.
Some 

[virtio-dev] RE: [PATCH v3 0/2] transport-pci: msix register desc improve

2023-05-10 Thread Parav Pandit
> From: Parav Pandit 
> Sent: Friday, April 28, 2023 12:10 AM
> To: Parav Pandit ; Michael S. Tsirkin 
> 
> Hi,
> 
> > From: virtio-comm...@lists.oasis-open.org On Behalf Of Parav Pandit
> > Sent: Monday, April 24, 2023 11:15 AM
> 
> > > LGTM.
> > > Given the amount of back and forth, I guess we should vote.
> > >
> > OK, created a github issue at [1].
> > Will wait a few more days for any other review comments before asking for a vote.
> >
> > [1] https://github.com/oasis-tcs/virtio-spec/issues/169
> 
> Can you please raise the ballot for this?
Can you please merge the patches in the tree?




[virtio-dev] RE: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Parav Pandit



> From: Michael S. Tsirkin 
> Sent: Wednesday, May 10, 2023 12:16 PM
> 
> On Wed, May 10, 2023 at 12:11:50PM -0400, Parav Pandit wrote:
> >
> >
> > On 5/10/2023 2:04 AM, Michael S. Tsirkin wrote:
> > > On Mon, May 08, 2023 at 10:23:39AM +0800, Jason Wang wrote:
> > > > > I thought so too originally. Unfortunately I now think that no,
> > > > > legacy is not going to be a byproduct of transport virtqueue for
> > > > > modern - it is different enough that it needs dedicated commands.
> > > >
> > > > If you mean the transport virtqueue, I think some dedicated
> > > > commands for legacy are needed. Then it would be a transport that
> > > > supports transitional devices. It would be much better than having
> > > > commands for a partial transport like this patch did.
> > >
> > > OK I am beginning to get what you are saying.  So your criticism is
> > > this: what if device supports vq transport for modern, and we want
> > > to build a transitional device on top.  how will that look. yes?
> > > A reasonable thing to include at least in the commit log. Parav?
> > >
> > I am still trying to understand what "vq transport for modern" is.
> > Do you mean transporting currently defined config space access over a vq?
> > If so, does this VQ belong to the guest or the hypervisor?
> 
> https://lore.kernel.org/all/20220826100034.200432-2-
> lingshan.zhu%40intel.com/t.mbox.gz

The gz link is not accessible.
But I got the right link [1].

[1] https://lore.kernel.org/all/20220826100034.200432-2-lingshan@intel.com/

1. The above patch cover letter [1] is missing the basic objective/problem 
statement, i.e., why is a transport virtqueue needed?
But I probably get the idea of [1], as we did the AQ.

2. The commit log talks about:
a. querying resources of the management device (aka the group owner in the AQ now),
b. creating and destroying the managed device (aka the group owner creating group 
member devices),
c. configuring the managed device (aka the group owner configuring/composing group 
member devices such as VFs, SFs, SIOV).

So, all of 2.a to 2.c belong to the admin group owner and the group management 
commands, like how they are defined in the AQ proposal.

So, 3 out of the 4 motivations are achieved by the AQ proposal.
This AQ belongs to the hypervisor. I am clear on this part.

The 4th point in the cover letter is: "config virtqueues of the managed device".

This work belongs to direct driver -> device communication using a queue 
from the driver to the device.
So, I imagine this work can be done using a queue by the guest driver and 
serviced by the device, like how a guest driver configures a queue today 
without any mediation.
For the PCI and MMIO transports, surely this can be done by the PCI device 
directly, be it a PF, VF or SIOV device
(instead of config registers, using a new queue interface). Looks fine to me.

Can this new cfg queue be mediated in software like the CVQ is? Maybe yes.
Should it always be mediated when it belongs to a VF or SIOV device? Mostly no, 
because it is yet another VQ for the PF, VF, SIOV.

I have yet to parse the rest of the 4 patches; please give me some time to review them.




[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Michael S. Tsirkin
On Wed, May 10, 2023 at 12:11:50PM -0400, Parav Pandit wrote:
> 
> 
> On 5/10/2023 2:04 AM, Michael S. Tsirkin wrote:
> > On Mon, May 08, 2023 at 10:23:39AM +0800, Jason Wang wrote:
> > > > I thought so too originally. Unfortunately I now think that no, legacy 
> > > > is not
> > > > going to be a byproduct of transport virtqueue for modern -
> > > > it is different enough that it needs dedicated commands.
> > > 
> > > If you mean the transport virtqueue, I think some dedicated commands
> > > for legacy are needed. Then it would be a transport that supports
> > > transitional devices. It would be much better than having commands for
> > > a partial transport like this patch did.
> > 
> > OK I am beginning to get what you are saying.  So your criticism is
> > this: what if device supports vq transport for modern, and we want to
> > build a transitional device on top.  how will that look. yes?
> > A reasonable thing to include at least in the commit log. Parav?
> > 
> I am still trying to understand what "vq transport for modern" is.
> Do you mean transporting currently defined config space access over a vq?
> If so, does this VQ belong to the guest or the hypervisor?

https://lore.kernel.org/all/20220826100034.200432-2-lingshan.zhu%40intel.com/t.mbox.gz

-- 
MST





[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Parav Pandit




On 5/10/2023 3:43 AM, Michael S. Tsirkin wrote:

On Wed, May 10, 2023 at 03:01:25PM +0800, Jason Wang wrote:

On Wed, May 10, 2023 at 2:05 PM Michael S. Tsirkin  wrote:


On Mon, May 08, 2023 at 10:23:39AM +0800, Jason Wang wrote:

I thought so too originally. Unfortunately I now think that no, legacy is not
going to be a byproduct of transport virtqueue for modern -
it is different enough that it needs dedicated commands.


If you mean the transport virtqueue, I think some dedicated commands
for legacy are needed. Then it would be a transport that supports
transitional devices. It would be much better than having commands for
a partial transport like this patch did.


OK I am beginning to get what you are saying.  So your criticism is
this: what if device supports vq transport for modern, and we want to
build a transitional device on top.  how will that look. yes?


Yes. I think it needs to be done through the transport virtqueue
otherwise the transport is not self-contained.


I mean, any feature can be done over transport vq.

But there is value in adding legacy commands to an otherwise
modern device without reworking it completely to
switch to a different transport.


Yes.




A reasonable thing to include at least in the commit log. Parav?

You are also asking what if the device uses transport vq,
and we want transitional on top of that.
It's a fair question but I don't exactly get why would
this legacy support feature be wanted for the vq transport
and not for other transports.


Not sure I get the question, but all the existing transports support
legacy; if we want to have another, should legacy support be a
must or not?


This specific proposal is for tunneling legacy over admin vq.
It can live alongside a normal modern VF, with hypervisor
combining these to create a transitional device.


True.




[virtio-dev] Re: [PATCH v13 00/10] Introduce device group and device management

2023-05-10 Thread Michael S. Tsirkin
On Wed, May 10, 2023 at 02:33:52PM +, Parav Pandit wrote:
> 
> > From: Jiri Pirko 
> > Sent: Wednesday, May 10, 2023 10:30 AM
> 
> > 
> > Michael, not sure if it is a problem of this patchset or on my side, but
> > makepdf fails to build this. With the master branch, it works fine.
> 
> This patch fixes it.
> https://lore.kernel.org/virtio-comment/20230505163712.589460-1-pa...@nvidia.com/
> 
> You can use b4 now to get this patch.

Thanks a lot, I will fold your patchset in when applying.

-- 
MST





[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Parav Pandit




On 5/10/2023 2:04 AM, Michael S. Tsirkin wrote:

On Mon, May 08, 2023 at 10:23:39AM +0800, Jason Wang wrote:

I thought so too originally. Unfortunately I now think that no, legacy is not
going to be a byproduct of transport virtqueue for modern -
it is different enough that it needs dedicated commands.


If you mean the transport virtqueue, I think some dedicated commands
for legacy are needed. Then it would be a transport that supports
transitional devices. It would be much better than having commands for
a partial transport like this patch did.


OK I am beginning to get what you are saying.  So your criticism is
this: what if device supports vq transport for modern, and we want to
build a transitional device on top.  how will that look. yes?
A reasonable thing to include at least in the commit log. Parav?


I am still trying to understand what "vq transport for modern" is.
Do you mean transporting currently defined config space access over a vq?
If so, does this VQ belong to the guest or the hypervisor?




[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Parav Pandit




On 5/10/2023 12:22 AM, Jason Wang wrote:

On Wed, May 10, 2023 at 11:51 AM Jason Wang  wrote:




Just to make sure we are on the same page.

1) if the hardware has a configq and we need to make it work for the current
virtio-pci driver, the hypervisor needs to trap guest PCI accesses and
translate them into configq commands. This means the hypervisor needs to
hide the configq from guests. In this case the configq needs a
dedicated DMA address space, which is what PASID can provide.



2) if the hardware can report the device states, unless we want to
migrate an L2 guest, the hypervisor should hide it from L1, so PASID is
required to isolate the DMA for guest traffic from that for device state.


A configq negotiates and discovers new fields for PCI PF, VF, SF/SIOV 
devices over PCIe or other transports.

So there is no need to hide/mediate for hw-based devices; it is just like the 
cvq and the data vqs.

For a vdpa kind of use case it can work like cvq mediation, which I 
believe happens without PASID today.





[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Parav Pandit




On 5/9/2023 11:51 PM, Jason Wang wrote:

On Tue, May 9, 2023 at 11:56 AM Parav Pandit  wrote:


SIOV does not require mediation.
Composition is optional like VFs.


This is not what I read from SIOV spec.

The SIOV R1 spec was defined by _a_ vendor, and it is undergoing an upgrade 
toward more practical use now. So we should let them finish that work.


It is hard to comment in this forum about it.





Hence, the transport we built keeps this in mind for the
coming future.


For transport virtqueue, it's not specific to PCI. It could be used in a much
broader use case.


May be.
More below.


So if each VF has its own configq or cmdq, it totally makes sense to
me as a bootstrap interface to transport the existing config space interface.
The problem is: it is not backward compatible; hence a device has no
way of knowing whether to support both or only the new configq.


Providing compatibility in software is much simpler than inventing new
hardware interfaces, isn't it? (E.g., if we want to provide compatibility for a VF
on a SIOV device.) And inventing a new hardware interface for compatibility might
not always work; it may break the advantages of the new hardware (like
scalability).


What I proposed below is a new hardware interface for new features,
not for compatibility.


I'm confused; the proposal is for legacy drivers, so it's for
compatibility for sure. No?

This proposal is to get legacy devices to work over a PCI VF.

The configq/cmdq discussion is about new features for PF, VF, and SF/SIOV 
devices working in a _non_ backward compatible way, because for new 
features there is no such thing as backward compatibility between the guest 
driver and the device.





Compatibility comes at a cost: it does not scale.
If you attempt to scale, it breaks the future path forward to go without 
mediation.



So ever-growing fields, optionally placed on a configq,
don't really help a device builder build efficiently
(there is not much predictability).


A config queue is not the only choice; we have a lot of other choices (for example,
PASID may help to reduce the on-chip resources).


PASID is a security construct for process isolation; it is not for transport.


I meant with PASID you don't even need a BAR.

Not sure I fully understand, but my guess is that this is coming from the 
mediation thought process.



Likewise, SIOV and VFs do not prefer mediation going forward in the use cases we 
come across.


Unless you don't want to provide VF compatibility for SIOV.

I do not understand what compatibility this is, and between which two elements.
A VF has its own identity currently, and SIOV will have its own identity in the future.



Just to be sure we're on the same page: both your proposal and mine are
based on the adminq for the PF, not the VF. The reason is obvious:
an adminq per VF won't work without PASID, since it would have security issues.


The current proposal for legacy should be seen as separate, not intermixed 
with a per-VF configuration queue.

A per-VF or per-SIOV-device adminvq/config/control vq will work without PASID 
(unrelated to legacy),
because it is like any other queue, such as the txq, rxq, cvq. All of these 
queues are non-mediated today for PF and VF devices.
(And so an additional configq follows the natural model for config space access.)


This is only true if:

1) you don't want to provide any backward compatibility for current
PCI transport, 

There is no concept of backward compatibility for new features.
There will be a new driver in the guest anyway.

Yes, I don't see a point in mediating the 1.x config space for existing 
drivers, as it requires mediation.



this means you need to use new drivers in the guest
2) you don't want to do live migration


Not sure how you assert this.

Live migration at the VF level for a passthrough device can be just fine 
without any mediation, as long as all the parameters on the src and dst sides 
match.
We already discussed this in v0, so I prefer not to repeat it and to focus on 
the patch in discussion.



If you need one of the above, PASID is a must.


A PCI VF passthrough device LM without mediation is achievable without 
PASID for all the existing defined features using config space.


It is also possible when new features are communicated over a configq/cmdq.

Transporting the existing config space using some additional queue is _not_ 
the objective, because building a scalable system where the device has zero 
knowledge of the needed backward compatibility is hard.
I previously gave the example of 100 VFs with 30-byte and (30+40)-byte config 
layouts.
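
To put rough numbers on that example (the 30-byte and 70-byte layouts are the
hypothetical figures from above, not real device data): with plain registers
the device must provision the worst case for every VF up front, while a
configq only spends per-VF state on what is actually negotiated. A minimal
sketch:

#include <stdio.h>

int main(void)
{
        const int num_vfs = 100;
        const int old_layout = 30;      /* bytes per VF: existing fields */
        const int new_layout = 30 + 40; /* bytes per VF: existing + new fields */

        /* Register-based config: worst case reserved for every VF,
         * whether or not a guest ever touches the new fields. */
        printf("registers, worst case: %d bytes\n", num_vfs * new_layout);

        /* configq/cmdq: fields exchanged on demand; only the baseline
         * needs static provisioning. */
        printf("configq baseline:      %d bytes\n", num_vfs * old_layout);
        return 0;
}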


Yet we keep repeating ourselves and diverging from the discussion; we should 
not intermix:

1. How to access legacy registers of the VF
vs.
2. How to access existing or new configuration of the VF

because the two have very different requirements.




[virtio-dev] Re: [PATCH v13 00/10] Introduce device group and device management

2023-05-10 Thread Jiri Pirko
Wed, May 10, 2023 at 04:33:52PM CEST, pa...@nvidia.com wrote:
>
>> From: Jiri Pirko 
>> Sent: Wednesday, May 10, 2023 10:30 AM
>
>> 
>> Michael, not sure if it is a problem of this patchset or on my side, but makepdf
>> fails to build this. With the master branch, it works fine.
>
>This patch fixes it.
>https://lore.kernel.org/virtio-comment/20230505163712.589460-1-pa...@nvidia.com/
>
>You can use b4 now to get this patch.

Indeed, that helped. Thanks!




[virtio-dev] RE: [PATCH v13 00/10] Introduce device group and device management

2023-05-10 Thread Parav Pandit


> From: Jiri Pirko 
> Sent: Wednesday, May 10, 2023 10:30 AM

> 
> Michael, not sure if it is a problem of this patchset or on my side, but makepdf
> fails to build this. With the master branch, it works fine.

This patch fixes it.
https://lore.kernel.org/virtio-comment/20230505163712.589460-1-pa...@nvidia.com/

You can use b4 now to get this patch.




[virtio-dev] Re: [PATCH v13 00/10] Introduce device group and device management

2023-05-10 Thread Jiri Pirko
Fri, May 05, 2023 at 05:40:33PM CEST, m...@redhat.com wrote:
>
>
>
>Change log:
>
>since 13:
>   command specific data is u8 again
>   exclude admin queues in blk's num_queues
>   minor other tweaks
>
>since 11:
>   addressed lots of comments, all minor. consistency with
>   outstanding number->index and queue->enqueue work
>   i did not intentionally drop any reviewed-by tags
>   as all changes are minor - if yours is missing it is
>   because I forgot to record it, sorry
>
>   one "breaking" change in response to stefan's comment:
>   in patch 5, num_queues has been specified not to include admin
>   queues: just regular ones.
>
>since v10:
>   addressed lots of comments by Jiri, Stefan, Cornelia, Lingshan, Parav, 
> Max
>
>since v9:
>   addressed comments by Parav, Max, Cornelia, David and Zhu Lingshan:
>   added link to errno header from Linux
>   rename _MEM to _MEMBER
>   admin vq num is zero based
>   clarify who sends commands where
>   minor english tweaks
>   clarify command length
>   specify interaction with sriov capability
>   correct commit log - NumVFs can be 0
>
>   i could not decide what should happen when VFs are
>   disabled. for now did not specify.
>
>since v8:
>   addressed comments by Cornelia - as we agreed on list
>   
>since v7:
>   make high level error codes match linux, with virtio specific codes
>   in a separate field
>   renamed _ACCEPT to _USE since that's what it does
>   clarified forward compatibility and non pci transports
>   support multiple admin vqs
>   conformance statements
>   lots of changes all over the place so I changed author from Max
>   to myself. Don't need to take credit but also don't want
>   to blame Max for my mistakes.
>
>since v6:
>
>   - removed some extensions intended for future use.
> We'll do them when we get there.
>
>   - brought back command list query from v5 in a simplified form -
> it's here to address the case where a single parent
> can address multiple groups, such as PF addressing
> transport vq and sriov vfs.
>
>   - attempt to make terminology more formal.
>   In particular a term for whoever controls the group.
>   I am still going back
>   and forth between "parent" and "owner" - owner might
>   be better after all since it will work if we ever
>   have a self group. For now it's parent.
>
>TODO (maybe?) - probably ok to defer until this part is upstream:
>
>   Add "all members" member id.
>
>   Add commands for MSI, feature discovery.
>
>   Add commands for transport vq.
>
>
>My intent is to try and support both SR-IOV and SIOV
>use cases with the same structure and maybe even the same
>VQ.
>
>For example, it might make sense to split creating/destroying
>SIOV devices from the transport passing data from the guest - the
>driver would then not negotiate VIRTIO_F_SR_IOV (which
>then means auto-provisioning).
>
>More ideas for use-cases:
>virtio VF features query and configuration space provisioning
>virtio VF resource (queues, msix vectors count) provisioning
>
>
>Future directions (shouldn't block this patch)
>- aborting commands - left for later. or is vq reset enough?
>- should we rename structures from admin to group admin?
>
>
>Michael S. Tsirkin (10):
>  virtio: document forward compatibility guarantees
>  admin: introduce device group and related concepts
>  admin: introduce group administration commands
>  admin: introduce virtio admin virtqueues
>  pci: add admin vq registers to virtio over pci
>  mmio: document ADMIN_VQ as reserved
>  ccw: document ADMIN_VQ as reserved
>  admin: command list discovery
>  admin: conformance clauses
>  ccw: document more reserved features
>
> admin.tex| 584 +++
> content.tex  |  62 +++-
> device-types/blk/description.tex |   2 +-
> introduction.tex |   3 +
> transport-ccw.tex|  14 +
> transport-mmio.tex   |  12 +
> transport-pci.tex|  33 ++

Michael, not sure if it is a problem of this patchset or on my side,
but makepdf fails to build this. With the master branch, it works fine.




[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] [PATCH v13] virtio-net: support inner header hash

2023-05-10 Thread Heng Qi




On 2023/5/9 11:15 PM, Michael S. Tsirkin wrote:

On Tue, May 09, 2023 at 10:22:19PM +0800, Heng Qi wrote:


On 2023/5/5 10:56 PM, Michael S. Tsirkin wrote:

On Fri, May 05, 2023 at 09:51:15PM +0800, Heng Qi wrote:

On Thu, Apr 27, 2023 at 01:13:29PM -0400, Michael S. Tsirkin wrote:

On Thu, Apr 27, 2023 at 10:28:29AM +0800, Heng Qi wrote:

On 2023/4/26 10:48 PM, Michael S. Tsirkin wrote:

On Wed, Apr 26, 2023 at 10:14:30PM +0800, Heng Qi wrote:

This does not mean that every device needs to implement and support all of
these, they can choose to support some protocols they want.

I add these because we have scale application scenarios for modern protocols
VXLAN-GPE/GENEVE:

+\item In scenarios where the same flow passing through different tunnels is 
expected to be received in the same queue,
+  warm caches, less locking, etc. can be exploited to improve receive 
performance.


Maybe the legacy GRE, VXLAN-GPE and GENEVE? But it has a little crossover.

Thanks.

But VXLAN-GPE/GENEVE can use source port for entropy.

It is recommended that the UDP source port number
 be calculated using a hash of fields from the inner packet

That is best because
it allows end to end control and is protocol agnostic.

Yes, I agree with this; I don't think we have an argument on this point
right now. :)

For VXLAN-GPE/GENEVE or other modern tunneling protocols, we have to deal
with
scenarios where the same flow passes through different tunnels.

Having them hashed to the same RX queue is hard to do via the outer headers.

All that is missing is symmetric Toepliz and all is well?

The scenarios above or in the commit log also require inner headers.

Hmm I am not sure I get it 100%.
Could you show an example with inner header hash in the port #,
hash is symmetric, and you still have trouble?


It kind of sounds like not enough entropy is not the problem
at this point.

Sorry for the late reply. :)

For modern tunneling protocols, yes.


You now want to drop everything from the header
except the UDP source port. Is that a fair summary?


For example, for the same flow passing through different VXLAN tunnels,
packets in this flow have the same inner header and different outer
headers. Sometimes these packets of the flow need to be hashed to the
same rxq, then we can use the inner header as the hash input.

Thanks!

So, they will have the same source port yes?

Yes. The outer source port can be calculated using the 5-tuple of the
original packet,
and the outer ports are the same but the outer IPs are different after
different directions of the same flow pass through different tunnels.

Any way to use that

We use it in monitoring, firewall and other scenarios.


so we don't depend on a specific protocol?

Yes, selected tunneling protocols can be used in this scenario like this.

Thanks.


No, the question was - can we generalize this somehow then?
For example, a flag to ignore source IP when hashing?
Or maybe just for UDP packets?


1. I think the common solution is based on the inner header, so that 
GRE/IPIP tunnels can also enjoy inner symmetric hashing.


2. The VXLAN spec does not say that the outer source port in both 
directions of the same flow must be the same [1]
(although the outer source port is calculated based on the consistent 
hash in the kernel; the consistent hash will sort the five-tuple before 
calculating the hash),
but it is best not to assume that consistent hashing is used in all 
VXLAN implementations. The GENEVE spec uses "SHOULD" [2].


3. How should we generalize? The device would use a feature to advertise all 
the tunnel types it supports and hash packets of these tunnel types using the 
outer source port,
and then we would still have to list the specific tunneling protocols 
supported by the device, just like we do now.
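
To illustrate the sort-before-hash behavior mentioned in point 2, here is a
minimal sketch of a symmetric hash over an inner IPv4/L4 5-tuple. The FNV-1a
mixing function and the field layout are simplified stand-ins (a real device
would use, e.g., symmetric Toeplitz with its configured RSS key), not the
kernel's or any device's actual implementation. Because the tuple is
canonicalized before hashing, both directions of a flow, and the same flow
seen through different tunnels once only inner headers are hashed, produce
the same value, which can then pick an RX queue or seed an outer UDP source
port in the RECOMMENDED 49152-65535 range:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct tuple5 {
        uint32_t saddr, daddr;  /* inner IPv4 addresses */
        uint16_t sport, dport;  /* inner L4 ports */
        uint8_t  proto;
};

/* Canonicalize the tuple so that (A->B) and (B->A) hash identically. */
static void make_symmetric(struct tuple5 *t)
{
        if (t->saddr > t->daddr ||
            (t->saddr == t->daddr && t->sport > t->dport)) {
                uint32_t a = t->saddr; t->saddr = t->daddr; t->daddr = a;
                uint16_t p = t->sport; t->sport = t->dport; t->dport = p;
        }
}

/* Stand-in mixing function (FNV-1a). */
static uint32_t mix(const uint8_t *p, size_t len)
{
        uint32_t h = 2166136261u;

        while (len--) {
                h ^= *p++;
                h *= 16777619u;
        }
        return h;
}

static uint32_t inner_flow_hash(struct tuple5 t)
{
        uint8_t buf[13];

        make_symmetric(&t);
        /* Serialize field by field to avoid hashing struct padding. */
        memcpy(buf + 0, &t.saddr, 4);
        memcpy(buf + 4, &t.daddr, 4);
        memcpy(buf + 8, &t.sport, 2);
        memcpy(buf + 10, &t.dport, 2);
        buf[12] = t.proto;
        return mix(buf, sizeof(buf));
}

int main(void)
{
        /* The same inner flow, seen in both directions. */
        struct tuple5 fwd = { 0x0a000001, 0x0a000002, 1234, 80, 17 };
        struct tuple5 rev = { 0x0a000002, 0x0a000001, 80, 1234, 17 };
        uint32_t h = inner_flow_hash(fwd);

        printf("fwd %08x rev %08x\n", (unsigned)h,
               (unsigned)inner_flow_hash(rev));
        printf("rx queue %u, outer source port %u\n",
               (unsigned)(h % 8), (unsigned)(49152u + (h % 16384u)));
        return 0;
}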


[1] "Source Port: It is recommended that the UDP source port number be 
calculated using a hash of fields from the inner packet -- one example
being a hash of the inner Ethernet frame's headers. This is to enable a 
level of entropy for the ECMP/load-balancing of the VM-to-VM traffic across
the VXLAN overlay. When calculating the UDP source port number in this 
manner, it is RECOMMENDED that the value be in the dynamic/private

port range 49152-65535 [RFC6335] "

[2] "Source Port: A source port selected by the originating tunnel 
endpoint. This source port SHOULD be the same for all packets belonging to a
single encapsulated flow to prevent reordering due to the use of 
different paths. To encourage an even distribution of flows across 
multiple links,
the source port SHOULD be calculated using a hash of the encapsulated 
packet headers using, for example, a traditional 5-tuple. Since the port
represents a flow identifier rather than a true UDP connection, the 
entire 16-bit range MAY be used to maximize entropy. In addition to 
setting the
source port, for IPv6, the flow label MAY also be used for providing 
entropy. For an example of using the IPv6 flow label for tunnel use 
cases, see [RFC6438

[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Michael S. Tsirkin
On Wed, May 10, 2023 at 03:01:25PM +0800, Jason Wang wrote:
> On Wed, May 10, 2023 at 2:05 PM Michael S. Tsirkin  wrote:
> >
> > On Mon, May 08, 2023 at 10:23:39AM +0800, Jason Wang wrote:
> > > > I thought so too originally. Unfortunately I now think that no, legacy 
> > > > is not
> > > > going to be a byproduct of transport virtqueue for modern -
> > > > it is different enough that it needs dedicated commands.
> > >
> > > If you mean the transport virtqueue, I think some dedicated commands
> > > for legacy are needed. Then it would be a transport that supports
> > > transitional devices. It would be much better than having commands for
> > > a partial transport like this patch did.
> >
> > OK I am beginning to get what you are saying.  So your criticism is
> > this: what if device supports vq transport for modern, and we want to
> > build a transitional device on top.  how will that look. yes?
> 
> Yes. I think it needs to be done through the transport virtqueue
> otherwise the transport is not self-contained.

I mean, any feature can be done over transport vq.

But there is value in adding legacy commands to an otherwise
modern device without reworking it completely to
switch to a different transport.


> > A reasonable thing to include at least in the commit log. Parav?
> >
> > You are also asking what if the device uses transport vq,
> > and we want transitional on top of that.
> > It's a fair question but I don't exactly get why would
> > this legacy support feature be wanted for the vq transport
> > and not for other transports.
> 
> Not sure I get the question, but all the existing transports support
> legacy; if we want to have another, should legacy support be a
> must or not?

This specific proposal is for tunneling legacy over admin vq.
It can live alongside a normal modern VF, with hypervisor
combining these to create a transitional device.
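
For concreteness, a command for tunneling a legacy register access over the
admin VQ could be shaped roughly as below. This is only an illustrative
sketch; the names, opcodes and layout are hypothetical, not the structures
defined by the patch under discussion:

/* Hypothetical shape of a legacy-register-access admin command;
 * opcodes, names and layout are illustrative only. */
#include <stdint.h>

#define ADMIN_CMD_LEGACY_REG_WRITE 0x100 /* hypothetical opcode */
#define ADMIN_CMD_LEGACY_REG_READ  0x101 /* hypothetical opcode */

struct admin_cmd_legacy_reg_access {
        uint64_t group_member_id; /* which VF of the PF is targeted */
        uint16_t offset;          /* offset into the legacy register block */
        uint16_t length;          /* in bytes; note a mediator could coalesce
                                   * the guest's byte-wise mac writes into a
                                   * single 6-byte write here */
        uint8_t  data[];          /* data to write, or buffer space to read */
};

The hypervisor traps the guest's legacy accesses, issues such commands on the
owner PF's admin queue, and pairs the result with an otherwise modern VF to
present a single transitional device to the guest.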

> >
> >
> >
> >
> > > > Consider simplest case, multibyte fields. Legacy needs multibyte write,
> > > > modern does not even need multibyte read.
> > >
> > > I'm not sure I will get here,
> >
> > What does this mean?
> 
> I meant I don't get what the issue is if "modern does not even need
> multibyte read".

parse error again. reword?

> >
> > > since we can't expose admin vq to
> > > guests, it means we need some software mediation. So if we just
> > > implement what PCI allows us, then everything would be fine (even if
> > > some method is not used).
> > >
> > > Thanks
> >
> > To repeat discussion on one of the previous versions, no it will not be
> > fine because legacy virtio abuses pci in fundamentally broken ways.
> > So yes you need a mediator on the host but even giving this
> > mediator a chance to be robust on top of hardware
> > means the hardware interface can not simply mirror legacy
> > to hardware.
> >
> > For example, host mediator needs to trap writes into mac,
> > buffer them and then send a 6 byte write.
> > Now, PCI actually does allow 6-byte writes, but legacy
> > guests instead do 6 single-byte writes for portability reasons.
> 
> It's a known race condition, so PCI over adminq doesn't make it worse.

It can, however, make it better: you can do a single 6-byte write command.

> The mediator can just mirror what guests write over the admin
> commands.

and this "just" just isn't good enough, or we end up with hacks
in hardware.
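
As a sketch of that mac case: the host mediator can shadow the guest's
single-byte writes and emit one multi-byte command only once the field is
complete. This assumes a 6-byte mac field and a coalescing legacy-write
command as discussed above; it is not code from any existing hypervisor:

#include <stdint.h>

#define MAC_LEN 6
#define MAC_OFFSET 0 /* assumed offset of mac in the legacy net config */

struct mac_shadow {
        uint8_t bytes[MAC_LEN];
        uint8_t dirty; /* bitmap of bytes the guest has written so far */
};

/* send_legacy_write() stands in for issuing the admin-VQ legacy write
 * command toward the device. */
void send_legacy_write(uint16_t offset, const uint8_t *buf, uint16_t len);

/* Called when the mediator traps one single-byte guest write into the
 * mac field. Instead of forwarding six racy one-byte writes to the
 * device, it buffers them and flushes a single 6-byte write once the
 * guest has touched every byte. */
void trap_mac_byte_write(struct mac_shadow *s, unsigned int off, uint8_t val)
{
        s->bytes[off] = val;
        s->dirty |= 1u << off;
        if (s->dirty == (1u << MAC_LEN) - 1) {
                send_legacy_write(MAC_OFFSET, s->bytes, MAC_LEN);
                s->dirty = 0;
        }
}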

> Thanks
> 
> > --
> > MST
> >





[virtio-dev] Re: [PATCH v2 0/2] transport-pci: Introduce legacy registers access using AQ

2023-05-10 Thread Jason Wang
On Wed, May 10, 2023 at 2:05 PM Michael S. Tsirkin  wrote:
>
> On Mon, May 08, 2023 at 10:23:39AM +0800, Jason Wang wrote:
> > > I thought so too originally. Unfortunately I now think that no, legacy is 
> > > not
> > > going to be a byproduct of transport virtqueue for modern -
> > > it is different enough that it needs dedicated commands.
> >
> > If you mean the transport virtqueue, I think some dedicated commands
> > for legacy are needed. Then it would be a transport that supports
> > transitional devices. It would be much better than having commands for
> > a partial transport like this patch did.
>
> OK I am beginning to get what you are saying.  So your criticism is
> this: what if device supports vq transport for modern, and we want to
> build a transitional device on top.  how will that look. yes?

Yes. I think it needs to be done through the transport virtqueue
otherwise the transport is not self-contained.

> A reasonable thing to include at least in the commit log. Parav?
>
> You are also asking what if the device uses transport vq,
> and we want transitional on top of that.
> It's a fair question but I don't exactly get why would
> this legacy support feature be wanted for the vq transport
> and not for other transports.

Not sure I get the question, but all the existing transports support
legacy; if we want to have another, should legacy support be a
must or not?

>
>
>
>
> > > Consider simplest case, multibyte fields. Legacy needs multibyte write,
> > > modern does not even need multibyte read.
> >
> > I'm not sure I will get here,
>
> What does this mean?

I meant I don't get what the issue is if "modern does not even need
multibyte read".

>
> > since we can't expose admin vq to
> > guests, it means we need some software mediation. So if we just
> > implement what PCI allows us, then everything would be fine (even if
> > some method is not used).
> >
> > Thanks
>
> To repeat discussion on one of the previous versions, no it will not be
> fine because legacy virtio abuses pci in fundamentally broken ways.
> So yes you need a mediator on the host but even giving this
> mediator a chance to be robust on top of hardware
> means the hardware interface can not simply mirror legacy
> to hardware.
>
> For example, host mediator needs to trap writes into mac,
> buffer them and then send a 6 byte write.
> Now, PCI actually does allow 6-byte writes, but legacy
> guests instead do 6 single-byte writes for portability reasons.

It's a known race condition, so PCI over adminq doesn't make it worse.
The mediator can just mirror what guests write over the admin
commands.

Thanks

> --
> MST
>

