On Fri, Apr 17, 2020 at 2:56 PM Fu, Patrick <patrick...@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Maxime Coquelin <maxime.coque...@redhat.com>
> > Sent: Friday, April 17, 2020 4:40 PM
> > To: Fu, Patrick <patrick...@intel.com>; Jerin Jacob <jerinjac...@gmail.com>
> > Cc: dev@dpdk.org; Ye, Xiaolong <xiaolong...@intel.com>; Hu, Jiayu
> > <jiayu...@intel.com>; Wang, Zhihong <zhihong.w...@intel.com>; Liang,
> > Cunming <cunming.li...@intel.com>
> > Subject: Re: [dpdk-dev] [RFC] Accelerating Data Movement for DPDK vHost
> > with DMA Engines
> >
> >
> >
> > On 4/17/20 10:29 AM, Fu, Patrick wrote:
> > > Hi Jerin,
> > >
> > >> -----Original Message-----
> > >> From: Jerin Jacob <jerinjac...@gmail.com>
> > >> Sent: Friday, April 17, 2020 4:02 PM
> > >> To: Fu, Patrick <patrick...@intel.com>
> > >> Cc: dev@dpdk.org; Maxime Coquelin <maxime.coque...@redhat.com>;
> > >> Ye,
> > >> Xiaolong <xiaolong...@intel.com>; Hu, Jiayu <jiayu...@intel.com>;
> > >> Wang, Zhihong <zhihong.w...@intel.com>; Liang, Cunming
> > >> <cunming.li...@intel.com>
> > >> Subject: Re: [dpdk-dev] [RFC] Accelerating Data Movement for DPDK
> > >> vHost with DMA Engines
> > >>
> > >> On Fri, Apr 17, 2020 at 12:56 PM Fu, Patrick <patrick...@intel.com> 
> > >> wrote:
> > >>>
> > >>> Background
> > >>> ====================================
> > >>> DPDK vhost library implements a user-space VirtIO net backend
> > >>> allowing host applications to directly communicate with VirtIO
> > >>> front-ends in VMs and containers. However, every vhost
> > >>> enqueue/dequeue operation requires copying packet buffers between
> > >>> guest and host memory. The overhead of copying large amounts of
> > >>> data makes the vhost backend the I/O bottleneck. DMA engines,
> > >>> including uncore DMA accelerators like Crystal Beach DMA (CBDMA)
> > >>> and Data Streaming Accelerator (DSA), and discrete-card
> > >>> general-purpose DMA, are extremely efficient at data movement
> > >>> within system memory. Therefore, we propose a set of asynchronous
> > >>> DMA data movement APIs in the vhost library for DMA acceleration.
> > >>> By offloading packet copies in the vhost data path from the CPU to
> > >>> the DMA engine, we can not only accelerate data transfers but also
> > >>> save precious CPU core resources.
> > >>>
> > >>> New API Overview
> > >>> ====================================
> > >>> The proposed APIs in the vhost library support various DMA engines
> > >>> to accelerate data transfers in the data path. For higher
> > >>> performance, DMA engines work in an asynchronous manner, where DMA
> > >>> data transfers and CPU computations are executed in parallel. The
> > >>> proposed API consists of control path APIs and data path APIs. The
> > >>> control path APIs include the registration API and the DMA
> > >>> operation callbacks, and the data path APIs include the
> > >>> asynchronous APIs. To remove the dependency on vendor-specific DMA
> > >>> engines, the DMA operation callbacks provide generic DMA data
> > >>> transfer abstractions. To support asynchronous DMA data movement,
> > >>> the new async APIs provide asynchronous ring operation semantics in
> > >>> the data path. To enable/disable DMA acceleration for virtqueues,
> > >>> users use the registration API to register/unregister DMA callback
> > >>> implementations to the vhost library and bind DMA channels to
> > >>> virtqueues. The DMA channels used by virtqueues are provided by
> > >>> DPDK applications, and are backed by virtual or physical DMA
> > >>> devices.
> > >>> The proposed APIs consist of 3 sub-sets:
> > >>> 1. DMA Registration APIs
> > >>> 2. DMA Operation Callbacks
> > >>> 3. Async Data APIs
> > >>>
> > >>> DMA Registration APIs
> > >>> ====================================
> > >>> DMA acceleration is on a per-queue basis. DPDK applications need
> > >>> to explicitly decide whether a virtqueue needs DMA acceleration and
> > >>> which DMA channel to use. In addition, a DMA channel is dedicated
> > >>> to a virtqueue and a DMA channel cannot be bound to multiple
> > >>> virtqueues at the same time. To enable DMA acceleration for a
> > >>> virtqueue, DPDK applications need to implement DMA operation
> > >>> callbacks for a specific DMA type (e.g. CBDMA) first, then register
> > >>> the callbacks to the vhost library and bind a DMA channel to the
> > >>> virtqueue, and finally use the new async APIs to perform data-path
> > >>> operations on the virtqueue.
> > >>> The definitions of the registration APIs are shown below:
> > >>> int rte_vhost_async_channel_register(int vid, uint16_t queue_id,
> > >>>                                      struct rte_vdma_device_ops *ops);
> > >>>
> > >>> int rte_vhost_async_channel_unregister(int vid, uint16_t queue_id);
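> > >>>
> > >>> As a minimal illustration only (this RFC does not define the fields
> > >>> of struct rte_vdma_device_ops, so the callback names below are
> > >>> hypothetical), registration could look like:
> > >>>
> > >>> /* callbacks implemented by the application for one DMA engine
> > >>>  * type, e.g. CBDMA; field names are placeholders */
> > >>> static struct rte_vdma_device_ops cbdma_ops = {
> > >>>         .transfer_data          = cbdma_transfer_data,
> > >>>         .check_completed_copies = cbdma_check_completed_copies,
> > >>> };
> > >>>
> > >>> /* bind a DMA channel to virtqueue queue_id of vhost device vid;
> > >>>  * on failure, keep using the existing CPU-copy data path */
> > >>> if (rte_vhost_async_channel_register(vid, queue_id, &cbdma_ops) != 0)
> > >>>         RTE_LOG(WARNING, USER1, "DMA acceleration not enabled\n");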
> > >>
> > >> We already have multiple DMA implementations over rawdev.
> > >> Why not make a new dmadev class for DMA acceleration, to be used by
> > >> virtio and any other clients?
> > >
> > > I believe it doesn't conflict. The purpose of this RFC is to create
> > > an async data path in vhost-user and provide a way for applications
> > > to work with this new path. dmadev is another topic which could be
> > > discussed separately. If we do have dmadev available in the future,
> > > this vhost async data path could certainly be backed by the new DMA
> > > abstraction without major interface changes.
> >
> > Maybe one advantage of a dmadev class is that it would be easier and
> > more transparent for the application to consume.
> >
> > The application would register some DMA devices, pass them to the Vhost
> > library, and then rte_vhost_submit_enqueue_burst and
> > rte_vhost_poll_enqueue_completed would call the dmadev callbacks directly.
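> >
> > Roughly (just a sketch, with illustrative signatures):
> >
> > /* submit the enqueue burst; copies are handed to the DMA engine */
> > n = rte_vhost_submit_enqueue_burst(vid, queue_id, pkts, nb_pkts);
> >
> > /* later, reap the packets whose DMA copies have completed */
> > n_done = rte_vhost_poll_enqueue_completed(vid, queue_id, done_pkts,
> >                                           MAX_BURST);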
> >
> > Do you think that could work?
> >
> Yes, this is a workable model. As I said in my previous reply, I have no
> objection to making dmadev. However, what we currently want to do is
> create the async data path for vhost, and we actually have no preference
> for the underlying DMA device model. I believe our current design of the
> API prototypes/data structures is quite common for various DMA
> acceleration solutions, and there is no blocker for any new DMA device
> to adapt to these APIs or extend them into new ones.

IMO, as a driver writer, we should not be writing TWO DMA drivers: one
for vhost and another for rawdev.
If vhost is the first consumer that needs DMA, then I think it makes
sense to add dmadev first.
The rawdev-DMA-driver to dmadev-DMA-driver conversion will be the
driver owner's job.
I think it makes sense to define the dmadev API first and then have
virtio consume it, to avoid integration issues.
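
For illustration only (a hypothetical sketch; none of these names exist
today), the kind of generic dmadev class I mean:

/* a generic memory-to-memory DMA channel abstraction, written once
 * per device and shared by all consumers */
struct rte_dmadev_ops {
        /* enqueue one copy job; returns 0 on success or -errno */
        int (*copy)(uint16_t dev_id, rte_iova_t src, rte_iova_t dst,
                    uint32_t length);
        /* poll completions; returns the number of finished jobs */
        int (*completed)(uint16_t dev_id, uint16_t max_jobs);
};

Then vhost, and any other library needing bulk copies, consumes the same
driver through this single interface.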



>
> Thanks,
>
> Patrick
>
