> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jerin Jacob
> Sent: Sunday, 4 July 2021 09.43
>
> On Sat, Jul 3, 2021 at 5:54 PM Morten Brørup
> <m...@smartsharesystems.com> wrote:
> >
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Saturday, 3 July 2021 11.09
> > >
> > > On Sat, Jul 3, 2021 at 2:23 PM Morten Brørup
> > > <m...@smartsharesystems.com> wrote:
> > > >
> > > > > From: fengchengwen [mailto:fengcheng...@huawei.com]
> > > > > Sent: Saturday, 3 July 2021 02.32
> > > > >
> > > > > On 2021/7/2 22:57, Morten Brørup wrote:
> > > > > >> In the DPDK framework, many data-plane API names contain
> > > > > >> queues. e.g. eventdev/crypto..
> > > > > >> The concept of virt queues has continuity.
> > > > > >
> > > > > > I was also wondering about the name "virtual queue".
> > > > > >
> > > > > > Usually, something "virtual" would be an abstraction of
> > > > > > something physical, e.g. a software layer on top of something
> > > > > > physical.
> > > > > >
> > > > > > Back in the days, a "DMA channel" used to mean a DMA engine on
> > > > > > a CPU. If a CPU had 2 DMA channels, they could both be set up
> > > > > > simultaneously.
> > > > > >
> > > > > > The current design has the "dmadev" representing a CPU or other
> > > > > > chip, which has one or more "HW-queues" representing DMA
> > > > > > channels (of the same type), and then "virt-queue" as a
> > > > > > software abstraction on top, for using a DMA channel in
> > > > > > different ways through individually configured contexts
> > > > > > (virt-queues).
> > > > > >
> > > > > > It makes sense to me, although I would consider renaming
> > > > > > "HW-queue" to "channel" and perhaps "virt-queue" to "queue".
> > > > >
> > > > > The 'DMA channel' is more used than 'DMA queue', at least google
> > > > > show that there are at least 20+ times more.
> > > > >
> > > > > It's a good idea to build the abstraction layer: queue <> channel
> > > > > <> dma-controller.
> > > > > In this way, the meaning of each layer is relatively easy to
> > > > > distinguish literally.
> > > > >
> > > > > will fix in V2
> > > > >
> > > >
> > > > After re-reading all the mails in this thread, I have found one
> > > > more important high level detail still not decided:
> > > >
> > > > Bruce had suggested flattening the DMA channels, so each dmadev
> > > > represents a DMA channel. And DMA controllers with multiple DMA
> > > > channels will have to instantiate multiple dmadevs, one for each
> > > > DMA channel.
> > > >
> > > > Just like a four port NIC instantiates four ethdevs.
> > > >
> > > > Then, like ethdevs, there would only be two abstraction layers:
> > > > dmadev <> queue, where a dmadev is a DMA channel on a DMA
> > > > controller.
> > > >
> > > > However, this assumes that the fast path functions on the
> > > > individual DMA channels of a DMA controller can be accessed
> > > > completely independently and simultaneously by multiple threads.
> > > > (Otherwise, the driver would need to implement critical regions or
> > > > locking around accessing the common registers in the DMA
> > > > controller shared by the DMA channels.)
> > > >
> > > > Unless any of the DMA controller vendors claim that this
> > > > assumption about independence of the DMA channels is wrong, I
> > > > strongly support Bruce's flattening suggestion.
> > >
> > > It is wrong, at least from the octeontx2_dma PoV.
> > >
> > > # The PCI device is the DMA controller where the driver/device is
> > > mapped. (As the device driver is based on the PCI bus, we don't want
> > > to have a vdev for this.)
> > > # The PCI device has HW queue(s).
> > > # Each HW queue has different channels.
> > >
> > > In the current configuration, we have only one queue per device and
> > > it has 4 channels. The 4 channels are not thread-safe, as they are
> > > based on a single queue.
> >
> > Please clarify "current configuration": Is that a configuration
> > modifiable by changing some software/driver, or is it the chip that
> > was built that way in the RTL code?
>
> We have 8 queues per SoC. On some HW versions, it can be configured
> as (a) or (b) using FW settings:
> a) One PCI device with 8 queues
> b) 8 PCI devices, each with one queue
>
> Everyone is using mode (b), as it lets 8 different applications use
> DMA: if one application binds the PCI device, other applications
> cannot use the same PCI device.
> If one application needs 8 queues, 8 dmadevs can be bound to a single
> application with mode (b).
>
> I think, in the above way, we can flatten to
> <device> <> <channel/queue>
>
> > >
> > > I think, if we need to flatten it, it makes sense to have
> > > dmadev <> channel (and each channel can have a thread-safe
> > > capability based on how it is mapped on HW queues, based on the
> > > device driver capability).
> >
> > The key question is how many threads can independently call
> > data-plane dmadev functions (rte_dma_copy() etc.) simultaneously. If
> > I understand your explanation correctly, only one - because you only
> > have one DMA device, and all access to it goes through a single
> > hardware queue.
> >
> > I just realized that although you only have one DMA Controller with
> > only one HW queue, your four DMA channels allow four sequentially
> > initiated transactions to be running simultaneously. Does the
> > application have any benefit by knowing that the dmadev can have
> > multiple ongoing transactions, or can the fast-path dmadev API hide
> > that ability?
>
> In my view it is better to hide it, and I have a similar proposal at
> http://mails.dpdk.org/archives/dev/2021-July/213141.html
> --------------
> > 7) Because data-plane APIs are not thread-safe, and user could
> > determine virt-queue to HW-queue's map (at the queue-setup stage),
> > so it is user's duty to ensure thread-safe.
>
> +1.
> But I am not sure how easy it is for the fast-path application to
> have this logic.
> Instead, I think it is better for the driver to report the capability
> of the queue, and in the channel configuration the application can
> state its requirement (whether multiple producers enqueue to the same
> HW queue or not).
> Based on the request, the implementation can pick the correct
> function pointer for enqueue (lock vs. lockless version, if the HW
> does not support lockless).

+1 to that!

> ------------------------
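
The capability-plus-request idea quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions: every name in it (DMA_CAPA_MT_ENQ, struct dma_chan, dma_chan_setup(), dma_copy(), enq_locked(), enq_lockless()) is hypothetical and not part of any proposed dmadev API, and the "descriptor ring" is reduced to a counter. The driver reports whether the HW queue accepts concurrent producers, the application states whether it needs multi-producer enqueue, and the setup code picks the enqueue function pointer once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is the proposed dmadev API. */

/* Capability a driver could report: HW queue accepts concurrent producers. */
#define DMA_CAPA_MT_ENQ (1u << 0)

struct dma_chan;
typedef int (*dma_enq_fn)(struct dma_chan *ch, uint64_t src, uint64_t dst,
                          uint32_t len);

struct dma_chan {
    dma_enq_fn enq;   /* selected once, at channel setup */
    atomic_flag lock; /* used only by the locked variant */
    uint32_t tail;    /* stand-in for a HW descriptor ring tail */
};

/* Plain enqueue: safe with one producer, or when the HW itself arbitrates. */
static int enq_lockless(struct dma_chan *ch, uint64_t src, uint64_t dst,
                        uint32_t len)
{
    (void)src; (void)dst; (void)len;
    ch->tail++; /* a real driver would write a descriptor here */
    return 0;
}

/* Same enqueue, serialized with a spinlock for multiple producers. */
static int enq_locked(struct dma_chan *ch, uint64_t src, uint64_t dst,
                      uint32_t len)
{
    int rc;

    while (atomic_flag_test_and_set_explicit(&ch->lock, memory_order_acquire))
        ; /* spin until the flag is released */
    rc = enq_lockless(ch, src, dst, len);
    atomic_flag_clear_explicit(&ch->lock, memory_order_release);
    return rc;
}

/* Pick the enqueue variant from the driver capability and the app request. */
static int dma_chan_setup(struct dma_chan *ch, uint32_t drv_capa,
                          bool want_mt_enq)
{
    assert(ch != NULL);
    ch->tail = 0;
    atomic_flag_clear(&ch->lock);
    /* Lock only if the app wants MT enqueue and the HW cannot do it. */
    if (want_mt_enq && !(drv_capa & DMA_CAPA_MT_ENQ))
        ch->enq = enq_locked;
    else
        ch->enq = enq_lockless;
    return 0;
}

/* Fast path: one indirect call, no capability checks per operation. */
static inline int dma_copy(struct dma_chan *ch, uint64_t src, uint64_t dst,
                           uint32_t len)
{
    return ch->enq(ch, src, dst, len);
}
```

With this shape, an application that calls dma_chan_setup(&ch, 0, true) gets the locked variant, while one that has a single producer, or HW that reports DMA_CAPA_MT_ENQ, pays no locking cost in dma_copy().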