[virtio-dev] Re: [PATCH v19] virtio-net: support inner header hash
On 2023/6/28 18:10, Michael S. Tsirkin wrote:
On Wed, Jun 28, 2023 at 11:46:22AM +0800, Jason Wang wrote:
On Wed, Jun 28, 2023 at 12:35 AM Heng Qi wrote:

1. Currently, a received encapsulated packet has an outer and an inner header, but the virtio device is unable to calculate the hash for the inner header. The same flow can traverse different tunnels, resulting in the encapsulated packets being spread across multiple receive queues (refer to the figure below). However, in certain scenarios, we may need to direct these encapsulated packets of the same flow to a single receive queue. This facilitates the processing of the flow by the same CPU to improve performance (warm caches, less locking, etc.).

    client1         client2
       |               |
       +--->tunnels<---+
             |   |
             v   v
     +-----------------+
     | monitoring host |
     +-----------------+

To achieve this, the device can calculate a symmetric hash based on the inner headers of the same flow.

2. Legacy systems may lack the entropy fields that modern protocols carry in the outer header, resulting in multiple flows with the same outer header but different inner headers being directed to the same receive queue. This results in poor receive performance. To address this limitation, inner header hash can be used to enable the device to advertise the capability to calculate the hash for the inner packet, regaining better receive performance.

Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
Signed-off-by: Heng Qi
Reviewed-by: Xuan Zhuo
Reviewed-by: Parav Pandit
---
v18->v19:
1. Have a single structure instead of two. @Michael S. Tsirkin
2. Some small rewrites. @Michael S. Tsirkin
3. Rebase to master.

v17->v18:
1. Some rewording suggestions from Michael (Thanks!).
2. Use 0 to disable inner header hash and remove VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.

v16->v17:
1. Some small rewrites. @Parav Pandit
2. Add Parav's Reviewed-by tag (Thanks!).

v15->v16:
1. Remove the hash_option.
In order to delimit the inner header hash and RSS configuration, the ability to configure the outer src udp port hash is given to RSS. This is orthogonal to inner header hash, which will be done in the RSS capability extension topic (considered as an RSS extension together with the symmetric toeplitz hash algorithm, etc.). @Parav Pandit @Michael S. Tsirkin
2. Fix a 'field' typo. @Parav Pandit

v14->v15:
1. Add tunnel hash option suggested by @Michael S. Tsirkin
2. Adjust some descriptions.

v13->v14:
1. Move supported_hash_tunnel_types from config space into cvq command. @Parav Pandit

I may have missed some discussions, but this complicates the provisioning a lot. Having it in the config space, then a type-agnostic provisioning through config space + feature bits, just works fine. If we move it only via cvq, we need a device-specific provisioning interface.

Thanks

Yea, that's what I said too. Debugging too. I think we should build a consistent solution that allows accessing config space through DMA, separately from this effort. Parav, do you think you can live with this approach so this specific proposal can move forward?

We can probably go another way: invent a new device configuration space capability with a fixed size, like the PCI configuration access capability?

struct virtio_pci_cfg_cap {
        struct virtio_pci_cap cap;
        u8 dev_cfg_data[4]; /* Data for device configuration space access. */
};

So it won't grow as the size of device configuration space grows.

Thanks

-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
[virtio-dev] Re: [virtio-comment] [RFC PATCH] admin-queue: bind the group member to the device
On Wed, Jun 28, 2023 at 11:55 PM Michael S. Tsirkin wrote: > > On Wed, Jun 28, 2023 at 02:06:32PM +0800, Xuan Zhuo wrote: > > On Wed, 28 Jun 2023 10:49:45 +0800, Jason Wang wrote: > > > On Tue, Jun 27, 2023 at 6:54 PM Xuan Zhuo > > > wrote: > > > > > > > > On Tue, 27 Jun 2023 17:00:06 +0800, Jason Wang > > > > wrote: > > > > > On Tue, Jun 27, 2023 at 4:28 PM Xuan Zhuo > > > > > wrote: > > > > > > > > > > > > > > > > > > Thanks Parav for pointing it out. We may have some gaps on the case. > > > > > > > > > > > > Let me introduce our case, which I think it is simple and should be > > > > > > easy to > > > > > > understand. > > > > > > > > > > > > First, the user (customer) purchased a bare metal machine. > > > > > > > > > > > > ## Bare metal machine > > > > > > > > > > > > Let me briefly explain the characteristics of a bare metal machine. > > > > > > It is not a > > > > > > virtual machine, it is a physical machine, and the difference > > > > > > between it and a > > > > > > general physical machine is that its PCI is connected to a device > > > > > > similar to a > > > > > > DPU. This DPU provides devices such as virtio-blk/net to the host > > > > > > through PCI. > > > > > > These devices are managed by the vendor, and must be created and > > > > > > purchased > > > > > > on the vendor's management platform. > > > > > > > > > > > > ## DPU > > > > > > > > > > > > There is a software implementation in the DPU, which will respond > > > > > > to PCI > > > > > > operations. But as mentioned above, resources such as network cards > > > > > > must be > > > > > > purchased and created before they can exist. So users can create > > > > > > VF, which is > > > > > > just a pci-level operation, but there may not be a corresponding > > > > > > backend. > > > > > > > > > > > > ## Management Platform > > > > > > > > > > > > The creation and configuration of devices is realized on the > > > > > > management > > > > > > platform. 
> > > > > > > > > > > > After the user completed the purchase on the management platform > > > > > > (this is an > > > > > > independent platform provided by the vendor and has nothing to do > > > > > > with > > > > > > virtio), then there will be a corresponding device implementation > > > > > > in the DPU. > > > > > > This includes some user configurations, available bandwidth > > > > > > resources and other > > > > > > information. > > > > > > > > > > > > ## Usage > > > > > > > > > > > > Since the user is directly on the HOST, the user can create VMs, > > > > > > passthrough PF > > > > > > or VF into the VM. Or users can create a large number of dockers, > > > > > > all of which > > > > > > use a separate virtio-net device for performance. > > > > > > > > > > > > The reason why users use vf is that we need to use a large number > > > > > > of virtio-net > > > > > > devices. This number reaches 1k+. > > > > > > > > > > > > Based on this scenario, we need to bind vf to the backend device. > > > > > > Because, we > > > > > > cannot automatically complete the creation of the virtio-net > > > > > > backend device when > > > > > > the user creates a vf. > > > > > > > > > > > > ## Migration > > > > > > > > > > > > In addition, let's consider another scenario of migration. If a vm > > > > > > is migrated > > > > > > from another host, of course its corresponding virtio device is > > > > > > also migrated to > > > > > > the DPU. At this time, our newly created vf can only be used by the > > > > > > vm after it > > > > > > is bound to the migrated device. We do not want this vf to be a > > > > > > brand new > > > > > > device. > > > > > > > > > > > > ## Abstraction > > > > > > > > > > > > So, this is how I understand the process of creating vf: > > > > > > > > > > > > 1. Create a PCI VF, at this time there may be no backend virtio > > > > > > device, or there > > > > > > is only a default backend. It does not fully meet our > > > > > > expectations. > > > > > > 2. 
Create device or migrate device > > > > > > 3. Bind the backend virtio device to the vf > > > > > > > > > > 3) should come before 2)? > > > > > > > > > > Who is going to do 3) btw, is it the user? If yes, for example, if a > > > > > user wants another 4 queue virtio-net devices, after purchase, how > > > > > does the user know its id? > > > > > > > > Got the id from the management platform. > > > > > > So it can do the binding via that management platform which this > > > became a cloud vendor specific interface. > > > > In our scenario, this is bound by the user using this id and vf id in the > > os. > > > > > > > > > > > > > > > > > > > > > > > > > > In most scenarios, the first step may be enough. We can make some > > > > > > fine-tuning on > > > > > > this default device, such as modifying its mac. In the future, we > > > > > > can use admin > > > > > > queue to modify its msix vector and other configurations. > > > > > > > > > > > > But we should allow, we bind a backend virtio device to a certain
[virtio-dev] Re: [PATCH v19] virtio-net: support inner header hash
On Thu, Jun 29, 2023 at 12:00 AM Parav Pandit wrote:
> > From: Jason Wang
> > Sent: Wednesday, June 28, 2023 1:38 AM
> > [..]
> > > Provisioning is a far simpler thing to do in a device-specific way than
> > > asking the device to store this value in an onchip area which is rarely accessed.
> >
> > Are you suggesting to not place any new fields in the config space?
> >
> Yes. It's not the fault of config space itself, but the way to access the config space, more below.
>
> > struct virtio_net_config {
> >         u8 mac[6];
> >         le16 status;
> >         le16 max_virtqueue_pairs;
> >         le16 mtu;
> >         le32 speed;
> >         u8 duplex;
> >         u8 rss_max_key_size;
> >         le16 rss_max_indirection_table_length;
> >         le32 supported_hash_types;
> > };
> >
> > Which of the above do you think can be accessed frequently, and which part of
> > the spec says it must be stored in the onchip area?
> >
> Most are not accessed frequently.
> The fact that they are in MMIO means a device needs to place them in memory with a tight
> latency budget.

This really depends on the implementation and vendor architectures. For example:

1) The MSI BAR might require much more resources than a simple device configuration space
2) I was told by some vendors that the virtqueue is much more valuable than MMIO
3) We can introduce new ways to access the device configuration space

> Spec is not going to talk about the onchip area; it is the reflection of the spec that
> forces certain inefficient implementations.

Exactly, it's implementation specific, so config space is fine; we just need to invent new methods to access it that fit your hardware.

Thanks
[virtio-dev] Re: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
On 2023/6/29 9:56 AM, Parav Pandit wrote:
From: Michael S. Tsirkin
Sent: Wednesday, June 28, 2023 3:45 PM

Maybe I get it. You want to use the new features as a carrot to force drivers to implement DMA? You suspect they will ignore the spec requirement just because things seem to work?

Right, because it is not a must normative.

Well, SHOULD also does not mean "ok to just ignore". This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

RECOMMENDED and SHOULD force the device to support MMIO, which is not good. So rather, a good design is that the device tells the starting offset of the extended config space, and the extended config space MUST be accessed using DMA. With this, sw can have infinite-size MMIO and a hw device forces DMA based on its implementation of where to start DMA from. This also gives the ability to keep the current config as MMIO for backward compatibility.

There's some logic here, for sure. You just might be right. However, surely we can discuss this small tweak in the 1.4 timeframe?

Sure, if we prefer the DMA approach I don't have a problem with adding one temporary field to config space. I propose to add a line to the spec "Device Configuration Space" section, something like:

Note: Any new device configuration space field addition MUST consider accessing such fields via a DMA interface.

And this will guide the new patches on what to do, instead of a last-moment rush.

Yea, except again I'd probably make it a SHOULD: e.g. I can see how switching to MMIO might be an option for qemu, helping us debug DMA issues.

There are too many queues whose debugging is needed, and MMIO is likely not the way to debug.

The time to discuss this detail would be around when the proposal for DMA access to config space is on the list, though: I feel this SHOULD vs MUST is a small enough detail.
From an implementation POV it is certainly a critical and good step forward to optimize the virtio interface.

Going back to inner hash. If we move supported_tunnels back to config space, do you feel we still need GET, or can we just drop it? I note we do not have GET for either hash or rss config.

For hash and rss config, debugging is missing. :)

Yes, we can drop the GET after switching supported_tunnels to struct virtio_net_hash_config.

Great! Glad to hear this!

And if we no longer have GET, is there still a reason for a separate command as opposed to a field in virtio_net_hash_config? I know this was done in v11, but there it was misaligned. We went with a command because we needed it for supported_tunnels, but now that is no longer the case and there are reserved words in virtio_net_hash_config ... Let me know how you feel about that; it's not critical for me.

struct virtio_net_hash_config reserved is fine.

+1. Inner header hash is orthogonal to RSS, and it's fine to have its own structure and commands. There is no need to send additional RSS fields when we configure inner header hash.

Thanks.
[virtio-dev] RE: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
> From: Michael S. Tsirkin
> Sent: Wednesday, June 28, 2023 3:45 PM
> > > Maybe I get it. You want to use the new features as a carrot to
> > > force drivers to implement DMA? You suspect they will ignore the
> > > spec requirement just because things seem to work?
> >
> > Right because it is not a must normative.
>
> Well SHOULD also does not mean "ok to just ignore".
> This word, or the adjective "RECOMMENDED", mean that there
> may exist valid reasons in particular circumstances to ignore a
> particular item, but the full implications must be understood and
> carefully weighed before choosing a different course.
>

RECOMMENDED and SHOULD force the device to support MMIO, which is not good. So rather, a good design is that the device tells the starting offset of the extended config space, and the extended config space MUST be accessed using DMA. With this, sw can have infinite-size MMIO and a hw device forces DMA based on its implementation of where to start DMA from. This also gives the ability to keep the current config as MMIO for backward compatibility.

> > > There's some logic here, for sure. you just might be right.
> > >
> > > However, surely we can discuss this small tweak in 1.4 timeframe?
> >
> > Sure, if we prefer the DMA approach I don't have a problem with adding
> > one temporary field to config space.
> >
> > I propose to add a line to the spec "Device Configuration Space"
> > section, something like,
> >
> > Note: Any new device configuration space field addition MUST consider
> > accessing such fields via a DMA interface.
> >
> > And this will guide the new patches of what to do instead of a
> > last-moment rush.
>
> Yea, except again I'd probably make it a SHOULD: e.g. I can see how switching to
> MMIO might be an option for qemu helping us debug DMA issues.
>

There are too many queues whose debugging is needed, and MMIO is likely not the way to debug.
> The time to discuss this detail would be around when the proposal for DMA
> access to config space is on the list though: I feel this SHOULD vs MUST is a
> small enough detail.
>

From an implementation POV it is certainly a critical and good step forward to optimize the virtio interface.

> Going back to inner hash. If we move supported_tunnels back to config space,
> do you feel we still need GET or just drop it? I note we do not have GET for
> either hash or rss config.
>

For hash and rss config, debugging is missing. :)

Yes, we can drop the GET after switching supported_tunnels to struct virtio_net_hash_config.

> And if we no longer have GET is there still a reason for a separate command as
> opposed to a field in virtio_net_hash_config?
> I know this was done in v11 but there it was misaligned.
> We went with a command because we needed it for supported_tunnels but
> now that is no longer the case and there are reserved words in
> virtio_net_hash_config ...
>
> Let me know how you feel about that, not critical for me.

struct virtio_net_hash_config reserved is fine.
[virtio-dev] Re: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
On Wed, Jun 28, 2023 at 05:38:29PM +, Parav Pandit wrote: > > > From: Michael S. Tsirkin > > Sent: Wednesday, June 28, 2023 1:24 PM > > > > > Because when device has two ways to access config space, it always have to > > account and build the interface that, hey some driver will not use DMA. > > > Hence have it always in the MMIO accessible area. > > > > Maybe I get it. You want to use the new features as a carrot to force > > drivers to > > implement DMA? You suspect they will ignore the spec requirement just > > because things seem to work? > > > Right because it is not a must normative. Well SHOULD also does not mean "ok to just ignore". This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course. > > There's some logic here, for sure. you just might be right. > > > > However, surely we can discuss this small tweak in 1.4 timeframe? > > Sure, if we prefer the DMA approach I don't have a problem in adding > temporary one field to config space. > > I propose to add a line to the spec " Device Configuration Space" section, > something like, > > Note: Any new device configuration space fields additional MUST consider > accessing such fields via a DMA interface. > > And this will guide the new patches of what to do instead of last moment rush. Yea, except again I'd probably make it a SHOULD: e.g. I can see how switching to MMIO might be an option for qemu helping us debug DMA issues. The time to discuss this detail would be around when proposal for the DMA access to config space is on list though: I feel this SHOULD vs MUST is a small enough detail. Going back to inner hash. If we move supported_tunnels back to config space, do you feel we still need GET or just drop it? I note we do not have GET for either hash or rss config. 
And if we no longer have GET, is there still a reason for a separate command as opposed to a field in virtio_net_hash_config? I know this was done in v11, but there it was misaligned. We went with a command because we needed it for supported_tunnels, but now that is no longer the case and there are reserved words in virtio_net_hash_config ... Let me know how you feel about that, not critical for me.

--
MST
[virtio-dev] RE: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
> From: Michael S. Tsirkin
> Sent: Wednesday, June 28, 2023 1:24 PM
> > Because when device has two ways to access config space, it always has to
> > account for and build the interface that, hey, some driver will not use DMA.
> > Hence have it always in the MMIO accessible area.
>
> Maybe I get it. You want to use the new features as a carrot to force drivers to
> implement DMA? You suspect they will ignore the spec requirement just
> because things seem to work?
>

Right, because it is not a must normative.

> There's some logic here, for sure. you just might be right.
>
> However, surely we can discuss this small tweak in 1.4 timeframe?

Sure, if we prefer the DMA approach I don't have a problem with adding one temporary field to config space.

I propose to add a line to the spec "Device Configuration Space" section, something like:

Note: Any new device configuration space field addition MUST consider accessing such fields via a DMA interface.

And this will guide the new patches on what to do, instead of a last-moment rush.
[virtio-dev] RE: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
> From: Michael S. Tsirkin
> Sent: Wednesday, June 28, 2023 1:16 PM
> > 1. Because then the device is 100% sure that it does not fulfill MMIO needs
> > 2. Single interface to access extended config space, which is what you
> > asked for.
> > 3. Single driver code because there is a single way to get it.
>
> Right. But I think there's a better way.
> Just say that any feature in MMIO MUST be also in DMA.
> So any modern driver will have no reason at all to use MMIO - DMA is a
> superset.

How does that relax the need of MMIO for the device?

> If we say that drivers should use DMA, they will. If we additionally explain that
> some features might not be in MMIO, no sane driver will use MMIO if it does not
> have to.
> Or if it does then it has violated the spec and will get less features?

So then why have it in MMIO anyway? Why not say the driver must use DMA?

> > This demands two ways of access, and that conflicts with your stated desire
> > to have a single way.
> > I proposed a single way, extended config space via a DMA interface; it is really
> > as simple as that.
>
> My desire is to have a single way that works for everything.
> We have legacy so we will have legacy ways too; these might not work for
> everything.

Here is what can work well: a device tells the driver the offset of the extended config space. Any field starting from that offset must be accessed via a DMA interface. This way, the driver must support DMA and all new features come from there. If a device chooses to use MMIO (say, a sw device), it can still do so.
[virtio-dev] Re: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
On Wed, Jun 28, 2023 at 05:06:40PM +, Parav Pandit wrote:
> > > > Please let people just focus on fixing config space instead of
> > > > temporary cvq hacks.
> > >
> > > The ask is to have predictable sizing for the existing defined config space
> > > fields and not keep infinitely growing via two interfaces.
> >
> > I don't know what "predictable sizing" is or why it matters.
> >
> Because when a device has two ways to access config space, it always has to
> account for and build the interface such that some driver will not use DMA.
> Hence it has to always be in the MMIO accessible area.

Maybe I get it. You want to use the new features as a carrot to force drivers to implement DMA? You suspect they will ignore the spec requirement just because things seem to work?

There's some logic here, for sure. You just might be right.

However, surely we can discuss this small tweak in the 1.4 timeframe?

--
MST
[virtio-dev] Re: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
On Wed, Jun 28, 2023 at 05:06:40PM +, Parav Pandit wrote: > > > From: Michael S. Tsirkin > > Sent: Wednesday, June 28, 2023 12:46 PM > > > Actually, my preblem is exactly if there's no one single interface that can > > do > > everything. > > > Extended config space reading via single DMA interface is just enough. > > > > > Two interfaces that do the same might seem redundant but there could be good > > reasons for this, in this case legacy. > > > But legacy should not force it to newer interface. More below. > > > See the difference? > > > > > > I didn't propose two interfaces. Let me explain again. > > > > > > I proposed, > > > a. all new fields to go via vq (single interface) b. all existing > > > fields to stay in cfg space (single interface) c. if one wants to > > > optimize optionally existing fields can utilize (same flexibility how you > > > asked > > for PF and VF notification). > > > > > > > > > > We already need config space with provisioning. > > > Provisioning does provision features, config space and control parameters, > > msix vectors and more. > > > Provisioning does _not_ need config space. > > > > I mean it needs to provision it. > > > No matter there is config space of dma, a device to be provisioned. > > > > > In fact e.g. on Linux we want drivers to keep doing virtio_cread() > > > > and all the DMA tricks should be hidden behind the scenes. You don't > > > > like it that config space is on-chip? Build an interface for it not > > > > to be on chip. There is simply no downside to that. > > > > > > > This is the ask to build software interface between driver and device to > > > not > > demand on-chip config space more importantly _predictably_. > > > > Absolutely. So we would say that driver SHOULD use the DMA interface if it > > is > > present. > > > I am asking for driver MUST use the DMA interface for _extended_ config space. > Why? > For two reasons. > > 1. Because than device is 100% sure that it does not fulfill MMIO needs > 2. 
Single interface to access extended config space, which is what you asked > for. > 3. Single driver code because there is single way to get it. > Right. But I think there's a better way. Just say that any feature in MMIO MUST be also in DMA. So any modern driver will have no reason at all to use MMIO - DMA is a superset. > > > > I came back after a small break and I still see this discussion that > > > > keeps playing out as a big distraction. > > > > In particular where are you doing to store the things that are for DMA? > > > > In system RAM? We don't *have* anywhere in system RAM to store this, > > > > we will need an interface to allocate system RAM and pass it to > > > > device for this idea to work. So in 1.3 where we won't have this, > > > > there is much less of an excuse to use a vq. > > > To store what? > > > CVQ is already there to do GET and SET operation. > > > > To store whatever value CVQ is requesting. Where is it going to come from? > > > It is going to come from the device memory that does not have hard > requirements of MMIO accessibility. > For example slow flash memory. > > > Asynchronous is just harder for software. Look at the hoops virtio net has > > to > > jump through to do control path over cvq and weep. Compare to single liner > > virtio_cread. > > > > I happen to maintain Linux drivers too, so I care. > > > It is harder for first time if no such construct exists; and it is clearly > not the case. > This patch is already adding CVQ get and set command. I expect get to mostly go unused. > Async behavior already built into the virtio spec. > Extending it for extended config space just make device more cheaper to build. > Software cost is paid only once at device init time just like CVQ, AQ and > more. > > > > > > > > > Please let people just focus on fixing config space instead of > > > > temporary cvq hacks. 
> > > > > > The ask is to have predictable sizing for existing defined config space > > > field and > > not keep infinitely growing by two interfaces. > > > > I don't know what "predictable sizing" is or why does it matter. > > > Because when device has two ways to access config space, it always have to > account and build the interface that, hey some driver will not use DMA. > Hence have it always in the MMIO accessible area. If we say that drivers should use DMA, they will. If we additionally explain that some features might not be in MMIO no sane driver will use MMIO if it does not have to. Or if it does then it has violated the spec and will get less features? > > I get the value of reduced MMIO register map. > > I get that the only way to get that appears to be to pass values around > > using > > DMA. > > > Ok. great. > > > I don't think we need to throw away config space - just use DMA to access > > it. > > > > And "existing defined" here seems to be at a completely random point in > > time. > > If someone wants to have fields in MMIO, I see no reason not to. > >
[virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
On Wed, Jun 28, 2023 at 04:46:04PM +, Parav Pandit wrote: > > > From: Michael S. Tsirkin > > Sent: Wednesday, June 28, 2023 6:41 AM > > > > > So just quickly add the new capability in the spec and then the > > > > number of linux releases that will have the new feature but not > > > > config command or whatever that is will be too small for vendors to > > > > care. > > > > > > > I didn't follow this suggestion. > > > > It is very simple though. 1.3 has inner hash feature. Imagine that instead > > of > > endless flamewars rehashing same arguments, immediately post > > For sure if you call this discussion a war, I didn't start it and I didn't > delay it to this point either :) Yea sorry I'm being harsh, the threads here grow too much. We all need to make an effort not to repeat ourselves. > Patch added GET and SET command, GET returns device static and dynamic value, > without burdening the device in consistent way. > Will moving to config space, take away the GET command? No? > > Then what did one gain other than extra complexity of additional config > space? Just more device memory burning. I don't think Linux will ever use GET. It's there for completeness, I'm fine with getting rid of it too. > > 1.3 we vote on an interface to access config space through DMA (not sure > > it's a > > VQ BTW but that's a separate discussion). This should include a way to > > expose > > a subset of DMA features through MMIO, for compatibility. > > I think you are missing the main point that I try to highlight but presumably > it didn't come across. :( > > Having an optional DMA interface does _not_ help device to optimize, because > device must be support old drivers. Yea, I don't get it, sorry. So where's this old driver that uses the inner hash feature? It maybe will be released down the road right? Why can't it at the same time support DMA for access to this field? > The proposal I explained in previous email is: > a. 
All new fields via vq > This means it is _guaranteed_ to be offchip and zero reason for backward > compatibility, because there is no backward compat of non existing fields. > > b. all existing fields stay in cfg space > > c. Optionally existing fields can also be queried via cfg space And what I say is even simpler: all config space is accessible through DMA some of it optionally in MMIO which fields to have in MMIO will be up to device. Research what do drivers you care about use, and include that. > > If we are lucky guest support can land in the same Linux release that has > > inner > > hash. Drivers do not need to care: they just do virtio_cread and that is > > it. > > > virtio_cread for extended config space accessible only via DMA or VQ will > surely be hidden inside th virtio_cread(). Easy. > > If we agree on this approach, great, one additional field like this can be > added to config space. I think we are pretty close then, great! > > So a vendor that builds device with inner hash, can expose inner hash only > > through DMA and not through MMIO. > > > > I conclude that there's no reason to block this feature just because it > > uses config > > space. > > > I must repeat that I am not blocking, I replied to way back on both the > approaches in v14 or even before that. > And also explained part of the proposal. > > This is why one should discuss the design and not keep asking for patches.. I certainly don't prevent anyone from posting design sketches. Problem is this: even spec patches are often hard to understand, agrammatical, etc. Not sure what to do here, but it's understandable when people only start reviewing in earnest when the text gets half way readable. > > > > > > > > > > > > > A good implementation of virtio_cread can abstract that easily > > > > > > so we don't need to change drivers. > > > > > > > > > > There is no backward compat issue for the GET command being new. > > > > > > > > It's just a shortcut replacing what we really want. 
As long as a > > > > shortcut is available people will keep using exactly that. So I > > > > fully expect more proposals for such GET commands on the pretext > > > > that one is there so why not another one. Adding more tech debt for > > > > whoever finally gets around to building a config space access gateway. > > > > > > > Not really. as suggested, the first addition of new field to the config > > > space in > > 1.4-time frame, should add the cfgvq, and not follow the previous example. > > > Because this is being thought through now, it is not at all hard for any > > > new > > things to follow the guideline. > > > > Really. Oh great, so there will be 3 ways to provision things! > > > Not sure how you concluded 3 ways... > Provision is via single command via AQ by the owner device. heh but what does this command get? if everything device specific is in config space it can just get config space layout exactly and we have no extra work. if there's stuff not in config
[virtio-dev] RE: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
> From: Michael S. Tsirkin > Sent: Wednesday, June 28, 2023 12:46 PM > Actually, my problem is exactly if there's no one single interface that can do > everything. > Extended config space reading via a single DMA interface is just enough. > > Two interfaces that do the same might seem redundant but there could be good > reasons for this, in this case legacy. > But legacy should not force it to the newer interface. More below. > See the difference? > > > I didn't propose two interfaces. Let me explain again. > > > > I proposed, > > a. all new fields to go via vq (single interface) b. all existing > > fields to stay in cfg space (single interface) c. if one wants to > > optimize optionally existing fields can utilize (same flexibility how you > asked > for PF and VF notification). > > > > > > > We already need config space with provisioning. > > Provisioning does provision features, config space and control parameters, > msix vectors and more. > > Provisioning does _not_ need config space. > > I mean it needs to provision it. > No matter whether config space is MMIO or DMA, a device has to be provisioned. > > > In fact e.g. on Linux we want drivers to keep doing virtio_cread() > > > and all the DMA tricks should be hidden behind the scenes. You don't > > > like it that config space is on-chip? Build an interface for it not > > > to be on chip. There is simply no downside to that. > > > > > This is the ask to build software interface between driver and device to not > demand on-chip config space more importantly _predictably_. > > Absolutely. So we would say that driver SHOULD use the DMA interface if it is > present. > I am asking for driver MUST use the DMA interface for _extended_ config space. Why? For three reasons. 1. Because then the device is 100% sure that it does not have to fulfill MMIO needs. 2. Single interface to access extended config space, which is what you asked for. 3. Single driver code because there is a single way to get it.
> > > I came back after a small break and I still see this discussion that > > > keeps playing out as a big distraction. > > > In particular where are you going to store the things that are for DMA? > > > In system RAM? We don't *have* anywhere in system RAM to store this, > > > we will need an interface to allocate system RAM and pass it to > > > device for this idea to work. So in 1.3 where we won't have this, > > > there is much less of an excuse to use a vq. > > To store what? > > CVQ is already there to do GET and SET operation. > > To store whatever value CVQ is requesting. Where is it going to come from? > It is going to come from the device memory that does not have hard requirements of MMIO accessibility. For example slow flash memory. > Asynchronous is just harder for software. Look at the hoops virtio net has to > jump through to do control path over cvq and weep. Compare to single-liner > virtio_cread. > > I happen to maintain Linux drivers too, so I care. > It is harder the first time if no such construct exists; and that is clearly not the case here. This patch is already adding CVQ get and set commands. Async behavior is already built into the virtio spec. Extending it for extended config space just makes the device cheaper to build. Software cost is paid only once at device init time just like CVQ, AQ and more. > > > > > > Please let people just focus on fixing config space instead of > > > temporary cvq hacks. > > > > The ask is to have predictable sizing for existing defined config space > > field and > not keep infinitely growing by two interfaces. > > I don't know what "predictable sizing" is or why does it matter. > Because when the device has two ways to access config space, it always has to account for the driver that will not use DMA, and hence keep the fields in the MMIO-accessible area anyway. > I get the value of reduced MMIO register map. > I get that the only way to get that appears to be to pass values around using > DMA. > Ok. great.
> I don't think we need to throw away config space - just use DMA to access it. > > And "existing defined" here seems to be at a completely random point in time. > If someone wants to have fields in MMIO, I see no reason not to. This demands two ways of access, and that conflicts with your own stated desire for a single way. I proposed a single way, extended config space via a DMA interface; it is really as simple as that. - To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
[virtio-dev] RE: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
> From: Michael S. Tsirkin > Sent: Wednesday, June 28, 2023 6:41 AM > > > So just quickly add the new capability in the spec and then the > > > number of linux releases that will have the new feature but not > > > config command or whatever that is will be too small for vendors to care. > > > > > I didn't follow this suggestion. > > It is very simple though. 1.3 has inner hash feature. Imagine that instead of > endless flamewars rehashing same arguments, immediately post For sure if you call this discussion a war, I didn't start it and I didn't delay it to this point either :) Patch added GET and SET command, GET returns device static and dynamic value, without burdening the device in a consistent way. Will moving to config space, take away the GET command? No? Then what did one gain other than extra complexity of additional config space? Just more device memory burning. > 1.3 we vote on an interface to access config space through DMA (not sure it's > a > VQ BTW but that's a separate discussion). This should include a way to expose > a subset of DMA features through MMIO, for compatibility. I think you are missing the main point that I try to highlight but presumably it didn't come across. :( Having an optional DMA interface does _not_ help device to optimize, because device must support old drivers. The proposal I explained in previous email is: a. All new fields via vq This means it is _guaranteed_ to be offchip and zero reason for backward compatibility, because there is no backward compat of non existing fields. b. all existing fields stay in cfg space c. Optionally existing fields can also be queried via cfg space > If we are lucky guest support can land in the same Linux release that has > inner > hash. Drivers do not need to care: they just do virtio_cread and that is it. > virtio_cread for extended config space accessible only via DMA or VQ will surely be hidden inside the virtio_cread(). Easy.
If we agree on this approach, great, one additional field like this can be added to config space. > So a vendor that builds device with inner hash, can expose inner hash only > through DMA and not through MMIO. > > I conclude that there's no reason to block this feature just because it uses > config > space. > I must repeat that I am not blocking, I replied way back on both the approaches in v14 or even before that. And also explained part of the proposal. This is why one should discuss the design and not keep asking for patches.. > > > > > > > > > > A good implementation of virtio_cread can abstract that easily > > > > > so we don't need to change drivers. > > > > > > > > There is no backward compat issue for the GET command being new. > > > > > > It's just a shortcut replacing what we really want. As long as a > > > shortcut is available people will keep using exactly that. So I > > > fully expect more proposals for such GET commands on the pretext > > > that one is there so why not another one. Adding more tech debt for > > > whoever finally gets around to building a config space access gateway. > > > > > Not really. as suggested, the first addition of new field to the config > > space in > 1.4-time frame, should add the cfgvq, and not follow the previous example. > > Because this is being thought through now, it is not at all hard for any new > things to follow the guideline. > > Really. Oh great, so there will be 3 ways to provision things! > Not sure how you concluded 3 ways... Provision is via single command via AQ by the owner device. New config fields via VQ of member device itself, again single way. Existing config fields access via config space, single way. > I have not seen this patch yet. And how long will this take to materialize? > I don't > believe all TC work must be blocked until this happens, or alternatively use > ad-hoc hacks.
It is surely not the right way to ask for the patch when doing the design discussion and claiming that it is not ready. > > I get it, you want to save on-chip memory. So work on a consistent solution for > this. > > All config space accesses should go through DMA. This is not predictable due to backward compat need. Hence the ask is that all *new* config space accesses go through DMA. > Thus no special guidelines should be necessary, and drivers can just keep > doing > virtio_cread like they always did. > This can be very easily achieved in sw by knowing the extended config space offset to use DMA. > To add to that, it will allow cheaper devices as some existing config space > will be > able to move quickly. This is possible only when the device side can know early enough, through a new feature bit and new registers for DMA, which will pay off for a larger extended config space region. I want to summarize my humble input: 1. all new extended config space fields to be accessed only via DMA/VQ by member device itself. 2. all existing fields to stay in config space without any interface change 3.
[virtio-dev] Re: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
On Wed, Jun 28, 2023 at 04:18:02PM +, Parav Pandit wrote: > > > > From: Michael S. Tsirkin > > Sent: Wednesday, June 28, 2023 6:27 AM > > > > On Wed, Jun 28, 2023 at 04:23:09AM +, Parav Pandit wrote: > > > > > > > From: Jason Wang > > > > Sent: Tuesday, June 27, 2023 11:46 PM > > > > > > > > Having it in the config space, then a type agnostic provisioning > > > > through config space + feature bits just works fine. > > > > > > > Provisioning is far simpler thing to do in device specific way than > > > asking device > > to store this value in onchip area which is rarely accessed. > > > > I thought a lot about this over the weekend. I just do not see this working > > out, > > sorry. Two things is never easier than one. > > You proposed two things and making it mandatory. > I am glad that you realized it can be done using one interface. > > In different context, you proposed two interfaces for driver notification on > PF and/or VF BAR in separate thread for flexibility. I asked for one. > Somehow you promoted flexibility (and also complexity that no vendor wanted), > and here you lean towards a single interface... > > So more explanation of the rationale will help to understand. OK I will try. Actually, my problem is exactly if there's no one single interface that can do everything. Two interfaces that do the same might seem redundant but there could be good reasons for this, in this case legacy. See the difference? > I didn't propose two interfaces. Let me explain again. > > I proposed, > a. all new fields to go via vq (single interface) > b. all existing fields to stay in cfg space (single interface) > c. if one wants to optimize optionally existing fields can utilize (same > flexibility how you asked for PF and VF notification). > > > > We already need config space with provisioning. > Provisioning does provision features, config space and control parameters, > msix vectors and more. > Provisioning does _not_ need config space. I mean it needs to provision it. > > In fact e.g.
on Linux we want drivers to keep doing virtio_cread() > > and all the DMA tricks should be hidden behind the scenes. You don't like > > it that > > config space is on-chip? Build an interface for it not to be on chip. There > > is > > simply no downside to that. > > > This is the ask to build software interface between driver and device to not > demand on-chip config space more importantly _predictably_. Absolutely. So we would say that driver SHOULD use the DMA interface if it is present. > > I came back after a small break and I still see this discussion that keeps > > playing > > out as a big distraction. > > In particular where are you going to store the things that are for DMA? > > In system RAM? We don't *have* anywhere in system RAM to store this, we will > > need an interface to allocate system RAM and pass it to device for this > > idea to > > work. So in 1.3 where we won't have this, there is much less of an excuse > > to use > > a vq. > To store what? > CVQ is already there to do GET and SET operation. To store whatever value CVQ is requesting. Where is it going to come from? > > > > > > > > > > > There are many fields of many features for 1.4 being discussed including > > > this > > one. All of these capabilities just cannot be stored in config space. > > > > config space is a synchronous interface, vq is asynchronous. Setup is just > > too > > messy to do asynchronously. So yes, config space. > > > Huh, I disagree. The whole virtio spec has got the VQs all over and you say > that VQ is messy. > Doesn't make any sense at all. Asynchronous is just harder for software. Look at the hoops virtio net has to jump through to do control path over cvq and weep. Compare to single-liner virtio_cread. I happen to maintain Linux drivers too, so I care. > > > > Please let people just focus on fixing config space instead of temporary cvq > > hacks.
> > The ask is to have predictable sizing for existing defined config space field > and not keep infinitely growing by two interfaces. I don't know what "predictable sizing" is or why does it matter. I get the value of reduced MMIO register map. I get that the only way to get that appears to be to pass values around using DMA. I don't think we need to throw away config space - just use DMA to access it. And "existing defined" here seems to be at a completely random point in time. If someone wants to have fields in MMIO, I see no reason not to. -- MST
[virtio-dev] RE: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
> From: Michael S. Tsirkin > Sent: Wednesday, June 28, 2023 6:27 AM > > On Wed, Jun 28, 2023 at 04:23:09AM +, Parav Pandit wrote: > > > > > From: Jason Wang > > > Sent: Tuesday, June 27, 2023 11:46 PM > > > > > > Having it in the config space, then a type agnostic provisioning > > > through config space + feature bits just works fine. > > > > > Provisioning is far simpler thing to do in device specific way than asking > > device > to store this value in onchip area which is rarely accessed. > > I thought a lot about this over the weekend. I just do not see this working > out, > sorry. Two things is never easier than one. You proposed two things and making it mandatory. I am glad that you realized it can be done using one interface. In different context, you proposed two interfaces for driver notification on PF and/or VF BAR in separate thread for flexibility. I asked for one. Somehow you promoted flexibility (and also complexity that no vendor wanted), and here you lean towards a single interface... So more explanation of the rationale will help to understand. I didn't propose two interfaces. Let me explain again. I proposed, a. all new fields to go via vq (single interface) b. all existing fields to stay in cfg space (single interface) c. if one wants to optimize optionally existing fields can utilize (same flexibility how you asked for PF and VF notification). > We already need config space with provisioning. Provisioning does provision features, config space and control parameters, msix vectors and more. Provisioning does _not_ need config space. > In fact e.g. on Linux we want drivers to keep doing virtio_cread() > and all the DMA tricks should be hidden behind the scenes. You don't like it > that > config space is on-chip? Build an interface for it not to be on chip. There is > simply no downside to that. > This is the ask to build software interface between driver and device to not demand on-chip config space more importantly _predictably_.
> I came back after a small break and I still see this discussion that keeps > playing > out as a big distraction. > In particular where are you going to store the things that are for DMA? > In system RAM? We don't *have* anywhere in system RAM to store this, we will > need an interface to allocate system RAM and pass it to device for this idea > to > work. So in 1.3 where we won't have this, there is much less of an excuse to > use > a vq. To store what? CVQ is already there to do GET and SET operation. > > > > > > There are many fields of many features for 1.4 being discussed including this > one. All of these capabilities just cannot be stored in config space. > > config space is a synchronous interface, vq is asynchronous. Setup is just too > messy to do asynchronously. So yes, config space. > Huh, I disagree. The whole virtio spec has got the VQs all over and you say that VQ is messy. Doesn't make any sense at all. > > Please let people just focus on fixing config space instead of temporary cvq > hacks. The ask is to have predictable sizing for existing defined config space field and not keep infinitely growing by two interfaces.
[virtio-dev] RE: [PATCH v19] virtio-net: support inner header hash
> From: Jason Wang > Sent: Wednesday, June 28, 2023 1:38 AM [..] > > Provisioning is far simpler thing to do in device specific way than asking > > device > to store this value in onchip area which is rarely accessed. > > Are you suggesting to not place any new fields in the config space? > Yes. > struct virtio_net_config { > u8 mac[6]; > le16 status; > le16 max_virtqueue_pairs; > le16 mtu; > le32 speed; > u8 duplex; > u8 rss_max_key_size; > le16 rss_max_indirection_table_length; > le32 supported_hash_types; > }; > > Which of the above do you think can be accessed frequently and which part of > the spec says it must be stored in the onchip area? > Most are not accessed frequently. But the fact that they are in MMIO means a device needs to place them in memory with a tight latency budget. The spec is not going to talk about the on-chip area; it is the reflection of the spec that forces certain inefficient implementations.
[virtio-dev] Re: [virtio-comment] [RFC PATCH] admin-queue: bind the group member to the device
On Wed, Jun 28, 2023 at 02:06:32PM +0800, Xuan Zhuo wrote: > On Wed, 28 Jun 2023 10:49:45 +0800, Jason Wang wrote: > > On Tue, Jun 27, 2023 at 6:54 PM Xuan Zhuo > > wrote: > > > > > > On Tue, 27 Jun 2023 17:00:06 +0800, Jason Wang > > > wrote: > > > > On Tue, Jun 27, 2023 at 4:28 PM Xuan Zhuo > > > > wrote: > > > > > > > > > > > > > > > Thanks Parav for pointing it out. We may have some gaps on the case. > > > > > > > > > > Let me introduce our case, which I think is simple and should be > > > > > easy to > > > > > understand. > > > > > > > > > > First, the user (customer) purchased a bare metal machine. > > > > > > > > > > ## Bare metal machine > > > > > > > > > > Let me briefly explain the characteristics of a bare metal machine. > > > > > It is not a > > > > > virtual machine, it is a physical machine, and the difference between > > > > > it and a > > > > > general physical machine is that its PCI is connected to a device > > > > > similar to a > > > > > DPU. This DPU provides devices such as virtio-blk/net to the host > > > > > through PCI. > > > > > These devices are managed by the vendor, and must be created and > > > > > purchased > > > > > on the vendor's management platform. > > > > > > > > > > ## DPU > > > > > > > > > > There is a software implementation in the DPU, which will respond to > > > > > PCI > > > > > operations. But as mentioned above, resources such as network cards > > > > > must be > > > > > purchased and created before they can exist. So users can create a VF, > > > > > which is > > > > > just a pci-level operation, but there may not be a corresponding > > > > > backend. > > > > > > > > > > ## Management Platform > > > > > > > > > > The creation and configuration of devices is realized on the > > > > > management > > > > > platform.
> > > > > > > > > > After the user completed the purchase on the management platform > > > > > (this is an > > > > > independent platform provided by the vendor and has nothing to do with > > > > > virtio), then there will be a corresponding device implementation in > > > > > the DPU. > > > > > This includes some user configurations, available bandwidth resources > > > > > and other > > > > > information. > > > > > > > > > > ## Usage > > > > > > > > > > Since the user is directly on the HOST, the user can create VMs, > > > > > passthrough PF > > > > > or VF into the VM. Or users can create a large number of dockers, all > > > > > of which > > > > > use a separate virtio-net device for performance. > > > > > > > > > > The reason why users use vf is that we need to use a large number of > > > > > virtio-net > > > > > devices. This number reaches 1k+. > > > > > > > > > > Based on this scenario, we need to bind vf to the backend device. > > > > > Because, we > > > > > cannot automatically complete the creation of the virtio-net backend > > > > > device when > > > > > the user creates a vf. > > > > > > > > > > ## Migration > > > > > > > > > > In addition, let's consider another scenario of migration. If a vm is > > > > > migrated > > > > > from another host, of course its corresponding virtio device is also > > > > > migrated to > > > > > the DPU. At this time, our newly created vf can only be used by the > > > > > vm after it > > > > > is bound to the migrated device. We do not want this vf to be a brand > > > > > new > > > > > device. > > > > > > > > > > ## Abstraction > > > > > > > > > > So, this is how I understand the process of creating vf: > > > > > > > > > > 1. Create a PCI VF, at this time there may be no backend virtio > > > > > device, or there > > > > > is only a default backend. It does not fully meet our > > > > > expectations. > > > > > 2. Create device or migrate device > > > > > 3. 
Bind the backend virtio device to the vf > > > > > > > > 3) should come before 2)? > > > > > > > > Who is going to do 3) btw, is it the user? If yes, for example, if a > > > > user wants another 4 queue virtio-net devices, after purchase, how > > > > does the user know its id? > > > > > > Got the id from the management platform. > > > > So it can do the binding via that management platform which this > > became a cloud vendor specific interface. > > In our scenario, this is bound by the user using this id and vf id in the os. > > > > > > > > > > > > > > > > > > > > In most scenarios, the first step may be enough. We can make some > > > > > fine-tuning on > > > > > this default device, such as modifying its mac. In the future, we can > > > > > use admin > > > > > queue to modify its msix vector and other configurations. > > > > > > > > > > But we should allow, we bind a backend virtio device to a certain vf. > > > > > This is > > > > > useful for live migration and virtio devices with special > > > > > configurations. > > > > > > > > All of these could be addressed if a dynamic provisioning model is > > > > implemented (SIOV or transport virtqueue). Trying to have a workaround > > > > in SR-IOV might be tricky. > > > > >
[virtio-dev] Re: [virtio-comment] [RFC PATCH] admin-queue: bind the group member to the device
On Wed, Jun 28, 2023 at 10:21:04AM +0800, Xuan Zhuo wrote: > On Tue, 27 Jun 2023 12:02:41 -0400, "Michael S. Tsirkin" > wrote: > > On Tue, Jun 27, 2023 at 04:23:05PM +0800, Xuan Zhuo wrote: > > > So, this is how I understand the process of creating vf: > > > > > > 1. Create a PCI VF, at this time there may be no backend virtio device, > > > or there > > > is only a default backend. It does not fully meet our expectations. > > > 2. Create device or migrate device > > > 3. Bind the backend virtio device to the vf > > > > I can see this making sense as a feature bit that says VFs are not > > initialized by default and must first be setup through an admin command. > > This will likely need to be a feature bit because it's changing > > behaviour outside of admin commands. > > > > Then, we can have: > > > > ADMIN_SETUP VF# > > ADMIN_CLEANUP VF# > > > > I like this because this generalizes CREATE/DESTROY that SIOV guys proposed. > > Great!! > > > > > > > Why do we need an id as a level of indirection though? What is wrong > > with just using VF# directly? > > I think VF# is ok. I also need to use it. But we need an ID for virtio device > not the transport(PF, VF). > > What I want to emphasize is that PCI(pf or vf) is a transport, it is only used > to connect the virtio driver and the virtio device. Right? > > The virtio device does not necessarily exist depending on PCI. For example, a > virtio device is migrated from another DPU, and it is not associated with any > PCI. What I have always wanted to say is that this device(not PCI) must have > its > own ID, which has nothing to do with the transport. > > Now we want to use this migrated device and connect it to the corresponding > vm (migrated from the same host). We can passthrough vf to this vm. But how > do I > tell our DPU to bind this migrated device with this vf? > We can specify the VF by the VF#, but how can we specify the virtio device? > Maybe there are two migrated virtio device. 
> > Maybe we should compare it to the case of using PF directly before. The > biggest > difference here is that PF is directly hot-plugged by the DPU, so when a > user(custom) buys a virtio-net device, the DPU directly inserts a new PCI > device > on the host. Now the vf is created by the user, and only the user knows how to > use each one. We cannot directly bind the device purchased or migrated by the > user to a certain vf. We need the user in the host to bind the vf(transport) > to > a certain virtio device. > > Thanks. > > Maybe I didn't have enough coffee today but I can't figure out what all this means :( For example what does "exist depending" mean? What does "device is migrated" mean? What does it mean to be "migrated from the same host"? What is "any PCI"? E.g. I know people did vm migration moving virtio from a hardware device to a software implementation. Is that "not associated with any PCI" ? What is "user(custom)"? how is "the vf is created by the user"? what does it mean to "bind the device .. to a certain vf"? It looks like Parav understands what's going on though, so maybe it's my fault. But fundamentally, the problem is that the spec patch doesn't do anything useful by itself, it relies on some out of spec interface to make these id values make sense. So the TC is reduced to guessing: we could just trust you that it's useful somehow and at the same time serves the purpose whatever it is. But it would be better if instead of trying to explain what that does in vague terms, you post a spec patch that allows doing whatever needs doing for these IDs to make sense through e.g. admin commands. -- MST
[virtio-dev] RE: [virtio-comment] [RFC PATCH] admin-queue: bind the group member to the device
> From: Xuan Zhuo > Sent: Tuesday, June 27, 2023 10:21 PM > > I can see this making sense as a feature bit that says VFs are not > > initialized by default and must first be setup through an admin command. > > This will likely need to be a feature bit because it's changing > > behaviour outside of admin commands. > > > > Then, we can have: > > > > ADMIN_SETUP VF# > > ADMIN_CLEANUP VF# > > > > I like this because this generalizes CREATE/DESTROY that SIOV guys proposed. > What does this command actually do or is it expected to do on the device? > Great!! > > > > > > > Why do we need an id as a level of indirection though? What is wrong > > with just using VF# directly? > > I think VF# is ok. I also need to use it. But we need an ID for virtio device > not the > transport(PF, VF). > > What I want to emphasize is that PCI(pf or vf) is a transport, it is only > used to > connect the virtio driver and the virtio device. Right? > > The virtio device does not necessarily exist depending on PCI. For example, a > virtio device is migrated from another DPU, and it is not associated with any > PCI. > What I have always wanted to say is that this device(not PCI) must have its > own > ID, which has nothing to do with the transport. > A virtio device can have an id visible to itself and to the group owner device. > Now we want to use this migrated device and connect it to the corresponding > vm (migrated from the same host). We can passthrough vf to this vm. But how > do I tell our DPU to bind this migrated device with this vf? When a virtio device state is set on a specific VF on the compute side (not on the dpu side), this directly indicates to the dpu side which virtio device is attached to which VF. > We can specify the VF by the VF#, but how can we specify the virtio device? > Maybe there are two migrated virtio device. > The virtio device state setting itself will contain the device identifier.
[virtio-dev] Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
On Thu, Jun 22, 2023 at 08:27:48PM +, Parav Pandit wrote: > > > > From: Michael S. Tsirkin > > Sent: Thursday, June 22, 2023 3:02 PM > > > > > Existing fields used by existing drivers must be in MMIO. > > > Device does not have a choice. > > > > They can if they like mask some less important features reducing the > > footprint. > How to do that? How would the device know to not have existing fields on MMIO? As you said yourself, feature bits also have cost. They must be accessed with DMA too, not through MMIO. This set of features can have more bits than the ones in MMIO. > > This will be the cost/benefit analysis each vendor does separately. > > > > Case in point, if inner hash support is in 1.3, and linux drivers don't > > really use > > that at all for a year or two because it's only used by alibaba in their > > stack, and > > by the time linux starts using them it also supports the new commands, then > > practically devices do not care and can just mask the feature bit in MMIO. > > > Once it lands in config space, it becomes an existing field. What's in which version of the spec is immaterial. If all guests using it also support config space through DMA then it does not matter that technically there was a spec version which had inner hash but not DMA. > More below. > > > > > > > If they want to cut at 1.3 time, they can, but they also can cut it > > > > at 1.2 time, this means some features will not be accessible through > > > > MMIO only through the new commands. > > > > > > > New commands service new fields for the newer driver. > > > The newer driver can call the new command to get both new and old fields. > > > > yes. > > > > > So 1.3 and 1.4 devices cannot optimize for older drivers. > > > > > > I don't know what this means. > > > I guess next below question answers it. I still don't know what you meant by "So 1.3 and 1.4 devices cannot optimize for older drivers". If it's immaterial, fine.
-- MST - To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
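The masking idea discussed above can be sketched in a few lines of C. This is purely illustrative: the macro names, bit positions, and function names are hypothetical and not taken from the virtio specification. The point is that the device keeps its full feature set internally, but advertises through the costly MMIO window only a chosen subset, while the full set is visible via the DMA-based interface.

```c
#include <stdint.h>

/* Illustrative feature bits -- NOT real virtio bit assignments. */
#define F_MRG_RXBUF   (1ull << 15)  /* a widely-used "old" feature      */
#define F_INNER_HASH  (1ull << 52)  /* a "new" feature, rarely used yet */

/* Subset the device is willing to pay MMIO footprint for. */
static const uint64_t mmio_feature_mask = F_MRG_RXBUF;

/* What an old driver sees through the MMIO window. */
static uint64_t features_via_mmio(uint64_t all_features)
{
    return all_features & mmio_feature_mask;
}

/* What a new driver fetches through the DMA interface: the full set. */
static uint64_t features_via_dma(uint64_t all_features)
{
    return all_features;
}
```

An old driver that only probes MMIO simply never sees `F_INNER_HASH`; a new driver using the DMA path sees everything, so the device never has to keep the new bits in on-chip MMIO space.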
[virtio-dev] Re: [virtio-comment] Re: [PATCH v18] virtio-net: support inner header hash
On Thu, Jun 22, 2023 at 05:58:29PM +, Parav Pandit wrote:
> > > > > > > So what is the solution proposed for this?
> > > > > >
> > > > > > 1. Query member device capabilities via admin command
> > > >
> > > > But that's not 1.3 material.
> > > >
> > > > > > Yes the current migration is broken in many ways but that's what
> > > > > > we have. Let's build something better sure but that is not 1.3
> > > > > > material.
> > > > >
> > > > > True, it is not 1.3 material, hence the proposal was to have the GET command.
> > > > > Once/if we reach agreement that no new fields are to be added to config space
> > > > > starting 1.4 and should be queried using a non-intercepted cfgvq, it
> > > > > makes sense to let this go in config space.
> > > > > Else the GET command seems the elegant and right approach.
> > > >
> > > > I expect no such agreement at all. Instead, I expect that we'll have
> > > > an alternative way to access config space. The guest virtio core then
> > > > needs to learn both ways, and devices can support one or both.
> > > >
> > > Yeah, we disagree.
> > > Because the alternative way that you propose is not a predictable way to build
> > > the device efficiently.
> > > It always needs to account for supporting old drivers.
> > > This is clearly sub-optimal as the capabilities grow.
> >
> > So just quickly add the new capability in the spec and then the number of linux
> > releases that will have the new feature but not the config command or whatever
> > that is will be too small for vendors to care.
> >
> I didn't follow this suggestion.

It is very simple though. 1.3 has the inner hash feature. Imagine that instead
of endless flamewars rehashing the same arguments, immediately post 1.3 we vote
on an interface to access config space through DMA (not sure it's a VQ BTW, but
that's a separate discussion). This should include a way to expose a subset of
DMA features through MMIO, for compatibility. If we are lucky, guest support
can land in the same Linux release that has inner hash.

Drivers do not need to care: they just do virtio_cread and that is it. So a
vendor that builds a device with inner hash can expose inner hash only through
DMA and not through MMIO.

I conclude that there's no reason to block this feature just because it uses
config space.

> > > > A good implementation of virtio_cread can abstract that easily so we
> > > > don't need to change drivers.
> > >
> > > There is no backward compat issue for the GET command being new.
> >
> > It's just a shortcut replacing what we really want. As long as a shortcut is
> > available people will keep using exactly that. So I fully expect more proposals
> > for such GET commands on the pretext that one is there so why not another
> > one. Adding more tech debt for whoever finally gets around to building a
> > config space access gateway.
> >
> Not really. As suggested, the first addition of a new field to the config space
> in the 1.4 time frame should add the cfgvq, and not follow the previous example.
> Because this is being thought through now, it is not at all hard for any new
> things to follow the guideline. Really.

Oh great, so there will be 3 ways to provision things! I have not seen this
patch yet. And how long will this take to materialize? I don't believe all TC
work must be blocked until this happens, or alternatively use ad-hoc hacks.

I get it that you want to save on chip memory. So work on a consistent solution
for this. All config space accesses should go through DMA. Thus no special
guidelines should be necessary, and drivers can just keep doing virtio_cread
like they always did. To add to that, it will allow cheaper devices as some
existing config space will be able to move quickly.

-- 
MST
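The "drivers just do virtio_cread" argument above can be made concrete with a small sketch. This is a hypothetical model, not the actual Linux implementation: the structure layout, field names, and the `fetch_cfg_via_dma()` helper are invented for illustration. It shows how a single config-read helper could hide whether a field lives in the MMIO config window or is fetched through a DMA-based command, so driver code never changes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical device handle -- not the real struct virtio_device. */
struct vdev {
    int     cfg_via_dma;   /* device advertises DMA config access  */
    uint8_t mmio_cfg[64];  /* legacy MMIO config window            */
    uint8_t dma_cfg[256];  /* shadow filled in via an admin command */
};

/* Stand-in for issuing an admin command and DMAing the result back. */
static void fetch_cfg_via_dma(struct vdev *vd, unsigned int off,
                              void *buf, size_t len)
{
    memcpy(buf, vd->dma_cfg + off, len);
}

/* The driver keeps calling one helper; the transport choice is hidden. */
static uint32_t vdev_cread_le32(struct vdev *vd, unsigned int off)
{
    uint32_t v;

    if (vd->cfg_via_dma)
        fetch_cfg_via_dma(vd, off, &v, sizeof(v));
    else
        memcpy(&v, vd->mmio_cfg + off, sizeof(v));
    return v; /* a real driver would le32_to_cpu() here */
}
```

With this shape, a device that exposes `supported_hash_tunnel_types` only through DMA needs no driver-visible change: the helper picks the path, exactly as the thread argues.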
[virtio-dev] Re: [virtio-comment] RE: [PATCH v19] virtio-net: support inner header hash
On Wed, Jun 28, 2023 at 04:23:09AM +, Parav Pandit wrote:
> > From: Jason Wang
> > Sent: Tuesday, June 27, 2023 11:46 PM
> >
> > Having it in the config space, then a type agnostic provisioning through
> > config space + feature bits just works fine.
> >
> Provisioning is a far simpler thing to do in a device specific way than asking
> the device to store this value in an onchip area which is rarely accessed.

I thought a lot about this over the weekend. I just do not see this working
out, sorry. Two things is never easier than one. We already need config space
with provisioning. In fact, e.g. on Linux we want drivers to keep doing
virtio_cread() and all the DMA tricks should be hidden behind the scenes.

You don't like it that config space is on-chip? Build an interface for it not
to be on chip. There is simply no downside to that.

I came back after a small break and I still see this discussion that keeps
playing out as a big distraction.

In particular, where are you going to store the things that are for DMA? In
system RAM? We don't *have* anywhere in system RAM to store this; we will need
an interface to allocate system RAM and pass it to the device for this idea to
work. So in 1.3, where we won't have this, there is much less of an excuse to
use a vq.

> There are many fields of many features for 1.4 being discussed, including
> this one. All of these capabilities just cannot be stored in config space.

Config space is a synchronous interface; a vq is asynchronous. Setup is just
too messy to do asynchronously. So yes, config space. Please let people just
focus on fixing config space instead of temporary cvq hacks.

-- 
MST
[virtio-dev] Re: [PATCH v19] virtio-net: support inner header hash
On Wed, Jun 28, 2023 at 11:46:22AM +0800, Jason Wang wrote:
> On Wed, Jun 28, 2023 at 12:35 AM Heng Qi wrote:
> >
> > 1. Currently, a received encapsulated packet has an outer and an inner
> > header, but the virtio device is unable to calculate the hash for the
> > inner header. The same flow can traverse through different tunnels,
> > resulting in the encapsulated packets being spread across multiple
> > receive queues (refer to the figure below). However, in certain
> > scenarios, we may need to direct these encapsulated packets of the same
> > flow to a single receive queue. This facilitates the processing of the
> > flow by the same CPU to improve performance (warm caches, less locking,
> > etc.).
> >
> >       client1                  client2
> >          |       +-------+       |
> >          +------>|tunnels|<------+
> >                  +-------+
> >                    |   |
> >                    v   v
> >            +-----------------+
> >            | monitoring host |
> >            +-----------------+
> >
> > To achieve this, the device can calculate a symmetric hash based on the
> > inner headers of the same flow.
> >
> > 2. For legacy systems, they may lack entropy fields which modern
> > protocols have in the outer header, resulting in multiple flows with the
> > same outer header but different inner headers being directed to the same
> > receive queue. This results in poor receive performance.
> >
> > To address this limitation, inner header hash can be used to enable the
> > device to advertise the capability to calculate the hash for the inner
> > packet, regaining better receive performance.
> >
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/173
> > Signed-off-by: Heng Qi
> > Reviewed-by: Xuan Zhuo
> > Reviewed-by: Parav Pandit
> > ---
> > v18->v19:
> > 1. Have a single structure instead of two. @Michael S . Tsirkin
> > 2. Some small rewrites. @Michael S . Tsirkin
> > 3. Rebase to master.
> >
> > v17->v18:
> > 1. Some rewording suggestions from Michael (Thanks!).
> > 2. Use 0 to disable inner header hash and remove
> >    VIRTIO_NET_HASH_TUNNEL_TYPE_NONE.
> >
> > v16->v17:
> > 1. Some small rewrites. @Parav Pandit
> > 2. Add Parav's Reviewed-by tag (Thanks!).
> >
> > v15->v16:
> > 1. Remove the hash_option. In order to delimit the inner header hash and
> >    RSS configuration, the ability to configure the outer src udp port
> >    hash is given to RSS. This is orthogonal to inner header hash, which
> >    will be done in the RSS capability extension topic (considered as an
> >    RSS extension together with the symmetric toeplitz hash algorithm,
> >    etc.). @Parav Pandit @Michael S . Tsirkin
> > 2. Fix a 'field' typo. @Parav Pandit
> >
> > v14->v15:
> > 1. Add tunnel hash option suggested by @Michael S . Tsirkin
> > 2. Adjust some descriptions.
> >
> > v13->v14:
> > 1. Move supported_hash_tunnel_types from config space into cvq
> >    command. @Parav Pandit
>
> I may miss some discussions, but this complicates the provisioning a lot.
>
> Having it in the config space, then a type agnostic provisioning
> through config space + feature bits just works fine.
>
> If we move it only via cvq, we need device specific provisioning interface.
>
> Thanks

Yea that's what I said too. Debugging too. I think we should build a
consistent solution that allows accessing config space through DMA,
separately from this effort. Parav do you think you can live with this
approach so this specific proposal can move forward?

-- 
MST
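The "symmetric hash based on the inner headers" idea from the cover letter can be modeled with a tiny sketch. To be clear, this is an illustrative model only: the real device uses the negotiated hash function (e.g. Toeplitz with a suitable key), and the struct and function names here are invented. The essential property shown is symmetry, so both directions of a flow, and the same inner flow seen through different tunnels, produce the same hash and therefore select the same receive queue.

```c
#include <stdint.h>

/* Hypothetical inner IPv4/L4 tuple extracted after tunnel decapsulation. */
struct inner_tuple {
    uint32_t saddr, daddr;  /* inner IPv4 addresses */
    uint16_t sport, dport;  /* inner L4 ports       */
};

/* XOR-folding the endpoints makes the hash invariant under a src/dst
 * swap, which is what "symmetric" means in the cover letter. */
static uint32_t symmetric_inner_hash(const struct inner_tuple *t)
{
    uint32_t addrs = t->saddr ^ t->daddr;
    uint32_t ports = (uint32_t)(t->sport ^ t->dport);

    /* Arbitrary mixing constants; any queue-spreading mix works here. */
    return addrs * 2654435761u ^ ports * 40503u;
}
```

A device would then index the indirection table with this value (e.g. `hash % num_queues`), so every packet of the flow, whatever tunnel it arrived through, lands on one queue and one CPU.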
Re: [virtio-dev][PATCH 2/2] virtio-spi: add the device specification
Hi @jrreinh...@google.com,

Thank you very much for your helpful comments. I missed the delay and
cs_change_delay parameters. I will add both of them, although cs_change_delay
cannot be set from userspace; it can be set in kernel space.

For CS delays such as PRE_DELAY/POST_DELAY, it seems that they are defined in
structure spi_device:

@cs_setup: delay to be introduced by the controller after CS is asserted.
@cs_hold: delay to be introduced by the controller before CS is deasserted.
@cs_inactive: delay to be introduced by the controller after CS is deasserted.

If @cs_change_delay is used from @spi_transfer, then the two delays will be
added up. They show the SPI controller's ability to control the CS
assertion/deassertion timing and should not be changed for each transfer
(because they can be updated by setting structure spi_transfer or structure
spi_ioc_transfer).

I think it is better to define these parameters in the host OS rather than in
the guest OS, since it is the host OS that operates the hardware SPI
controller directly. Besides, it can also avoid passing the same values from
guest to host time after time.

What's your opinion on this topic? Again, thank you very much.

Best Regards
Haixu Cui

On 6/28/2023 1:06 AM, jrreinh...@google.com wrote:

Hi Haixu,

+The \field{word_delay} defines how long to wait between words within one SPI transfer,
+in ns unit.

I'm a little surprised to see a word_delay but no frame_delay or
transfer_delay. For example, many SPI peripherals require a delay after CS is
asserted, but before the first SCLK edge, allowing them to prepare to send
data. (E.g. an ADC might begin taking a sample.)

The linux struct spi_transfer has three delay fields:

* @cs_change_delay: delay between cs deassert and assert when
  @cs_change is set and @spi_transfer is not the last in @spi_message
* @delay: delay to be introduced after this transfer before
  (optionally) changing the chipselect status, then starting the
  next transfer or completing this @spi_message.
* @word_delay: inter word delay to be introduced after each word size
  (set by bits_per_word) transmission.

The userspace spidev.h API has only two:

* @delay_usecs: If nonzero, how long to delay after the last bit
  transfer before optionally deselecting the device before the
  next transfer.
* @word_delay_usecs: If nonzero, how long to wait between words within
  one transfer. This property needs explicit support in the SPI
  controller, otherwise it is silently ignored.

The NXP i.MX RT500 SPI (master) controller, for example, has four
configurable delays:
- PRE_DELAY: After CS assertion, before first SCLK edge.
- POST_DELAY: After a transfer, before CS deassertion.
- FRAME_DELAY: Between frames (which are arbitrarily defined by software).
- TRANSFER_DELAY: Minimum CS deassertion time between transfers.

The NVIDIA Tegra114 SPI controller has:
- nvidia,cs-setup-clk-count: CS setup timing parameter.
- nvidia,cs-hold-clk-count: CS hold timing parameter.

I understand that accurately representing all possible delays might be
difficult or futile, but I'm curious to understand why, of all the possible
delays, inter-word delay was chosen for inclusion.

In a microcontroller, delays around CS (de)assertion can be customized by
using a GPIO -- rather than the hardware SPI block -- to drive the CS signal.
This way, delays can be inserted where needed. To do so with a virtualized SPI
controller might prove difficult however, as the virtual channel carrying a CS
GPIO signal might not be synchronized to the channel carrying the SPI data.

Curious to hear your thoughts.

Thanks,
Jonathon Reinhart
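The delay split discussed above can be summarized in a small model. This is only a sketch of the proposal in the thread, not anything the virtio-spi draft defines: field names follow Linux's `struct spi_device`/`struct spi_transfer` conventions, and the combining rule encodes the point that a per-transfer `cs_change_delay` adds to, rather than replaces, the controller's own CS-inactive time.

```c
#include <stdint.h>

/* Per-device CS timing, configured once by the host OS (which drives
 * the hardware controller), mirroring spi_device's cs_setup/cs_hold/
 * cs_inactive. Hypothetical layout for discussion only. */
struct cs_timing {
    uint32_t cs_setup_ns;    /* after CS assert, before first SCLK  */
    uint32_t cs_hold_ns;     /* before CS deassert                  */
    uint32_t cs_inactive_ns; /* after CS deassert                   */
};

/* Per-transfer delays supplied by the guest, mirroring spi_transfer. */
struct xfer_delays {
    uint32_t word_delay_ns;      /* between words within one transfer  */
    uint32_t cs_change_delay_ns; /* between CS deassert and re-assert  */
};

/* Per the thread: when cs_change_delay is used, "the two delays will
 * be added up" -- the gap between transfers is the controller's
 * cs_inactive time plus the per-transfer delay. */
static uint32_t effective_cs_gap_ns(const struct cs_timing *dev,
                                    const struct xfer_delays *x)
{
    return dev->cs_inactive_ns + x->cs_change_delay_ns;
}
```

Keeping `cs_setup`/`cs_hold`/`cs_inactive` host-side avoids retransmitting the same values on every transfer, while the guest still controls the per-transfer `word_delay` and `cs_change_delay`.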