From: David Marchand [mailto:david.march...@6wind.com]
Sent: Wednesday, February 25, 2015 6:22 PM
To: Zhou, Danny
Cc: dev at dpdk.org; Liang, Cunming
Subject: Re: [dpdk-dev] [PATCH v5 5/6] eal: add per rx queue interrupt handling 
based on VFIO

Hello Danny,

On Wed, Feb 25, 2015 at 7:58 AM, Zhou, Danny <danny.zhou at 
intel.com<mailto:danny.zhou at intel.com>> wrote:

+int
+rte_intr_wait_rx_pkt(struct rte_intr_handle *intr_handle, uint8_t queue_id)
+{
+       struct epoll_event ev;
+       unsigned numfds = 0;
+
+       if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+               return -1;
+       if (queue_id >= VFIO_MAX_QUEUE_ID)
+               return -1;
+
+       /* create epoll fd */
+       int pfd = epoll_create(1);
+       if (pfd < 0) {
+               RTE_LOG(ERR, EAL, "Cannot create epoll instance\n");
+               return -1;
+       }

Why recreate the epoll instance at each call to this function ?

DZ: To avoid recreating the epoll instance for each queue, the struct 
rte_intr_handle(or a new structure added to ethdev)
should be extended by adding fields storing per-queue pfd. This way, it could 
reduce user/kernel context  switch overhead
when calling epoll_create() each time.

Sounds good?

You don't need a epfd per queue. And hardcoding epfd == eventfd will give a not 
very usable api.

Plus, epoll is something linux-specific, so you can't move it out of eal/linux.
I suppose you need an abstraction here (and in the future we could add 
something for bsd ?).

DZ: libev provides abstraction layer which is a good candidate to integrate, 
rather than
reinventing one I think. The BSD support can be implemented in the files under
lib\librte_eal\bsdapp\eal folder by calling BSD specific APIs. Maybe it is a 
good idea to introduce
a separated component like OS Adaption Layer into EAL in the future once DPDK 
is widely adopted in
BSD as well as Windows, then all DPDK components invoke Linux specific APIs 
could instead calling abstraction APIs.

Adding an abstraction here specifically for epoll does not resolve all the 
porting/migration problem in my mind.


Looking at this patchset, I think there is a design issue.
eal does not need to know about portid neither queueid.

eal can provide an api to retrieve the interrupt fds, configure an epoll 
instance, wait on an epoll instance etc...
ethdev is then responsible to setup the mapping between port id / queue id and 
interrupt fds by asking the eal about those fds.

This would result in an eal api even simpler and we could add other fds in a 
single epoll fd for other uses.

DZ: The queueid is just an index to the queue related eventfd array stored in 
EAL. If this array is still in the EAL and ethdev can apply for it and setup 
mapping for certain queue, there
might be issue for multiple-process use case where the fd resources allocated 
for secondary process are not freed if the secondary process exits unexpectedly.

Not sure I follow you.
If a secondary process exits, the eventfds created in primary process should 
still be valid and reusable.
Why would you need to free them ? Something to do with vfio ?

DZ: See below.

Probably we can setup the eventfd array inside ethdev,  and we just need EAL 
API to wait for ethdev?fd. So application invokes ethdev API with portid and 
queueid, and ethdev calls eal
API to wait on a ethdev fd which correlates with the specified portid and 
queueid.

Sounds ok to you?

eventfds creation can not be handled by ethdev, since it needs infrastructure 
and informations from within the eal/linux.
Again, do we need an abstraction ?

ethdev must be the one that does the mappings between port/queue and eventfds 
(or any object that represents a way to wake up for a given port/queue).

DZ: agreed after revisiting code. Let us follow your direction to create a 
ethdev API, similar to 
rte_eth_dev_rx_queue_intr_enable()/rte_eth_dev_rx_queue_intr_disable(), to use 
portiid and queueid as arguments. Then this ethdev API uses the mapped eventfds 
to invoke corresponding EAL API, waiting for interrupt event notification from 
kernel.  A V6 patchset will be created for this.

--
David Marchand

Reply via email to