Matias Elo(matiaselo) replied on github web page: platform/linux-generic/pktio/socket.c line 161 @@ -855,6 +937,126 @@ static int sock_init_global(void) return 0; } +static int sock_recv_mq_tmo_select(pktio_entry_t * const *entry, + const int index[], + unsigned num_q, unsigned *from, + odp_packet_t packets[], int num, + uint64_t usecs, fd_set *readfds, + int maxfd) +{ + struct timeval timeout; + unsigned i; + int ret; + + timeout.tv_sec = usecs / 1000 / 1000; + timeout.tv_usec = usecs % (1000 * 1000); + + for (i = 0; i < num_q; i++) { + ret = entry[i]->s.ops->recv(entry[i], index[i], packets, num); + + if (ret > 0 && from) + *from = i; + + if (ret != 0) + return ret; + } + + if (select(maxfd + 1, readfds, NULL, NULL, + usecs == ODP_PKTIN_WAIT ? NULL : &timeout) == 0) + return 0; + + for (i = 0; i < num_q; i++) { + ret = entry[i]->s.ops->recv(entry[i], index[i], packets, num); + + if (ret > 0 && from) + *from = i; + + if (ret != 0) + return ret; + } + + return 0; +} + +int sock_recv_mq_tmo_try_int_driven(const struct odp_pktin_queue_t queues[], + unsigned num_q, unsigned *from, + odp_packet_t packets[], int num, + uint64_t usecs, int *trial_successful) +{ + unsigned i; + pktio_entry_t *entry[num_q]; + int index[num_q]; + fd_set readfds; + int maxfd = -1; + int (*impl)(pktio_entry_t *entry[], int index[], int num_q, + odp_packet_t packets[], int num, unsigned *from, + uint64_t wait_usecs) = NULL; + int impl_set = 0; + + /* First, we get pktio entries and queue indices. We then see if the + implementation function pointers are the same. If they are the + same, impl will be set to non-NULL; otherwise it will be NULL. */ + + for (i = 0; i < num_q; i++) { + entry[i] = get_pktio_entry(queues[i].pktio); + index[i] = queues[i].index; + if (entry[i] == NULL) { + ODP_DBG("pktio entry %d does not exist\n", + queues[i].pktio); + *trial_successful = 0; + return -1; + }
Comment: By checking both ops here you can return right away and minimize overhead for pktios not supporting interrupt driven mode. E.g. ``` if (!entry[i]->s.ops->recv_mq_tmo && !entry[i]->s.ops->fd_set) { *trial_successful = 0; return -1; } ``` > Juha-Matti Tilli(jmtilli) wrote: > I'm somewhat worried that somebody could adjust SLEEP_NSEC and then the > entire thing fails. What about this: > > <pre> > #if SLEEP_NSEC == 1000 > uint64_t usecs = wait; > #else > uint64_t usecs = (wait * SLEEP_NSEC + 999) / 1000; > #endif > </pre> > > For the return value, you mean <code>odp_pktin_wait_time()</code> and not > <code>odp_pktin_recv_mq_tmo()</code>, right? Because I don't see how the > return value of the latter could depend on <code>SLEEP_NSEC</code> definition. > > Also, if we modify <code>odp_pktin_wait_time()</code> to always return > microseconds, then <code>odp_pktin_recv_mq_tmo()</code> suddenly starts to > work incorrectly with <code>SLEEP_NSEC != 1000</code> as there is a > <code>wait--</code> line in there that does this just before sleeping one > microsecond, exactly. >> Matias Elo(matiaselo) wrote: >> Same thing with the loop order than with netmap. >>> Matias Elo(matiaselo) wrote: >>> Couple comment lines describing the different stages of this function >>> wouldn't hurt when someone is reading this code later on. >>>> Matias Elo(matiaselo) wrote: >>>> Same thing with the loop order than with netmap. >>>>> Matias Elo(matiaselo) wrote: >>>>> Since this loop is a performance optimization I would move it before the >>>>> loop which sets the read descriptors. No use to set the descriptors if >>>>> packets are already available. >>>>>> Matias Elo(matiaselo) wrote: >>>>>> The motivation for a separate `odp_pktin_wait_time()` function was to >>>>>> avoid having to do time conversions in the fast path functions. As >>>>>> `odp_pktin_wait_time()` already returns the time in microseconds no >>>>>> additional conversions are needed here. Same thing for >>>>>> `odp_pktin_recv_mq_tmo()`. >>>>>> >>>>>> Currently, the value returned by `odp_pktin_recv_mq_tmo()` depends on >>>>>> `SLEEP_NSEC` definition. This should be modified so that the returned >>>>>> value is always in microseconds. >>>>>>> Matias Elo(matiaselo) wrote: >>>>>>> Every pktio type we currenly have uses microseconds for timeout, so you >>>>>>> could change `recv_tmo()` and `recv_mq_tmo()` to use microseconds >>>>>>> instead of nanoseconds. This removes the need for per operation >>>>>>> conversions. >>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>> > Are you entirely sure about this? This is a function pointer, and if >>>>>>>> > two pktio devices have the same function pointer for recv_mq_tmo(), >>>>>>>> > the other function pointers including the recv() function pointer >>>>>>>> > should be identical as well. We do pass the device (entry[i]) to the >>>>>>>> > function pointed to by recv_mq_tmo as well, so it should be able to >>>>>>>> > update the correct counters. The ops structure is const, after all... >>>>>>>> >>>>>>>> I stand corrected, didn't notice the pktio_entry_t array parameter. A >>>>>>>> bit confusing as `recv_mq_tmo()` behaves differently than pretty much >>>>>>>> every other pktio op. >>>>>>>>> muvarov wrote >>>>>>>>> just pass nsec to drivers or you will have double calculation like in >>>>>>>>> netmap code. >>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>> Are you entirely sure about this? This is a function pointer, and if >>>>>>>>>> two pktio devices have the same function pointer for recv_mq_tmo(), >>>>>>>>>> the other function pointers including the recv() function pointer >>>>>>>>>> should be identical as well. We do pass the device (entry[i]) to the >>>>>>>>>> function pointed to by recv_mq_tmo as well, so it should be able to >>>>>>>>>> update the correct counters. The ops structure is const, after all... >>>>>>>>>> >>>>>>>>>> So, when receiving packets from a particular device using netmap, we >>>>>>>>>> do call netmap_recv() and pass to it the pktio_entry[i] and also >>>>>>>>>> index[i] for the queue index. This is the same that the >>>>>>>>>> odp_pktin_recv_mq_tmo() function already did. It calls >>>>>>>>>> odp_pktin_recv() in a loop, and the latter function calls >>>>>>>>>> entry->s.ops->recv(entry, queue.index, packets, num); which to me >>>>>>>>>> seems to be exactly the same thing -- you pass exactly the same >>>>>>>>>> entry and the same queue.index. Same function, same arguments, I >>>>>>>>>> can't see how the behavior would be different... >>>>>>>>>> >>>>>>>>>> Now, if somebody opposes the .recv_mq_tmo function pointer due to >>>>>>>>>> added code line count, I might perhaps agree. But it seems to be the >>>>>>>>>> only way the DPDK support for genuine sleep could be added, so I see >>>>>>>>>> some value in it. >>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>> Several options: (1) I could implement a binary search algorithm >>>>>>>>>>> that incrementally checks various usec values and uses >>>>>>>>>>> odp_pktin_wait_time() to check which one of these trial attempts >>>>>>>>>>> result in the same sleep wait value, but that would be slow; (2) I >>>>>>>>>>> could implement odp_pktin_wait_time_reverse() that does the >>>>>>>>>>> conversion in the other direction, but that would be an API change; >>>>>>>>>>> (3) I could just do the conversion and pass usec values instead of >>>>>>>>>>> sleep wait values. I carefully considered each and selected (3), >>>>>>>>>>> but do you think (2) or (1) would be better? >>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>> Will fix. >>>>>>>>>>>>> Juha-Matti Tilli(jmtilli) wrote: >>>>>>>>>>>>> I disagree. SLEEP_NSEC is a private #define in odp_packet_io.c so >>>>>>>>>>>>> what the wait parameter actually means cannot be interpreted in >>>>>>>>>>>>> the drivers. (Believe me, I tried it!) Thus, the conversion from >>>>>>>>>>>>> the sleep wait parameter to microseconds is needed. >>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>> As I noted in the previous comment, I think this check isn't >>>>>>>>>>>>>> enough. The input queues would have to be from the same pktio >>>>>>>>>>>>>> device for this to work. >>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>> Internal functions should not use odp_ prefix. >>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>> You should pass the `wait` parameter to >>>>>>>>>>>>>>>> `entry->s.ops->recv_tmo()` and not do any conversions here. >>>>>>>>>>>>>>>>> Matias Elo(matiaselo) wrote: >>>>>>>>>>>>>>>>> ODP_TIME_SEC_IN_NS would work here. https://github.com/Linaro/odp/pull/341#discussion_r158223173 updated_at 2017-12-21 11:23:20