On 29 May 2015 at 13:55, Zoltan Kiss <zoltan.k...@linaro.org> wrote:
>
> On 28/05/15 17:40, Ola Liljedahl wrote:
>
>> On 28 May 2015 at 17:23, Zoltan Kiss <zoltan.k...@linaro.org> wrote:
>>
>>     On 28/05/15 16:00, Ola Liljedahl wrote:
>>
>>         I disapprove of this solution. TX completion processing
>>         (cleaning TX descriptor rings after transmission has completed)
>>         is an implementation (hardware) aspect and should be hidden
>>         from the application.
>>
>>     Unfortunately you can't hide it if you want your pktio application
>>     to work with poll mode drivers. In that case the TX completion
>>     interrupt can be disabled and the application has to take care of
>>     completion itself. In the case of DPDK you just call the send
>>     function (with 0 packets, if you don't have anything to send at the
>>     time).
>>
>> Why do you have to retire transmitted packets if you are not
>> transmitting new packets (and thus don't need those descriptors in the
>> TX ring)?
>>
> Because otherwise they are a memory leak.
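As an illustration of the zero-packet "nudge" described above, here is a
minimal sketch at the DPDK level. rte_eth_tx_burst() is the real DPDK burst
API, but the port/queue ids are placeholders, and whether a zero-length
burst actually reclaims TX descriptors depends on the PMD and TX path in
use; treat this as a sketch, not a guarantee.

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    /* Give the driver a chance to scan its TX ring and free mbufs whose
     * transmission has already completed, even when there is nothing new
     * to send. PMD-dependent: some TX paths only clean up when real
     * packets are enqueued. */
    static void nudge_tx_completion(uint16_t port_id, uint16_t queue_id)
    {
            struct rte_mbuf *none[1] = { NULL };

            (void)rte_eth_tx_burst(port_id, queue_id, none, 0);
    }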
They are not leaked! They are still in the TX ring, just waiting to be
retired.

> Those buffers might be needed somewhere else. If they are only released
> the next time you send/receive packets, you are in trouble, because that
> might never happen. Especially when that event is blocked because your
> TX ring is full of unreleased packets.

Having too few buffers is always a problem. You don't want too large RX/TX
rings either, because that just increases buffering and latency (the
"buffer bloat" problem).

>
>> Does the application have too few packets in the pool so that reception
>> will suffer?
>
> Let me approach the problem from a different angle: the current
> workaround is that you have to allocate a pool with _loooads_ of buffers,
> so you have a good chance of never running out of free buffers. Probably.
> Because it still doesn't guarantee that there will be a next send/receive
> event on that interface to release the packets.
>
>>         There isn't any corresponding call that refills the RX
>>         descriptor rings with fresh buffers.
>>
>>     You can do that in the receive function, I think that's how the
>>     drivers generally do it.
>>
>>         The completion processing can be performed from any ODP call,
>>         not necessarily odp_pktio_send().
>>
>>     I think "any" is not specific enough. Which one?
>>
>> odp_pktio_recv, odp_schedule. Wherever the application blocks or busy
>> waits for more packets.
>
> We do that already in odp_pktio_recv. It doesn't help, because you can
> only release the buffers held in the current interface's TX ring. You
> can't do anything about other interfaces.

Why not? There is no guarantee that the application thread calling
odp_pktio_recv() on one interface is the only one transmitting on that
specific egress interface. In the general case, all threads may be using
all pktio interfaces for both reception and transmission.

> I mean, you could trigger TX completion on every interface every time you
> receive on one, but that would be a scalability nightmare.

Maybe not every time. I expect a more creative solution than this. Perhaps
when you run out of buffers in the pool? (See the sketch at the end of this
mail.)

>
>>     Can you provide a vague draft of how you would fix the l2fwd example
>>     below?
>>
>> I don't think anything needs fixing on the application level.
>
> Wrong. odp_l2fwd uses one packet pool, receives from pktio_src and then,
> if there is anything received, sends it out on pktio_dst.

This specific application has this specific behavior. Are you sure this is
a general solution? I am not.

> Let's say the pool has 576 elements and the interfaces use 256 RX and
> 256 TX descriptors. You start with 2*256 buffers kept in the two RX
> rings. Let's say you receive the first 64 packets; you refill the RX ring
> immediately, so now you're out of buffers. You can send out those 64, but
> in the next iteration odp_pktio_recv() will return 0 because it can't
> refill the RX descriptors (and the driver won't give you back any buffer
> unless you can refill it). And now you are in an infinite loop: recv will
> always return 0, because you never release the packets.

The size of the pool should somehow be correlated with the sizes of the RX
and TX rings for "best performance" (whatever that means). But I also
think that the system should function regardless of RX/TX ring sizes and
pool size, "function" meaning not deadlock.

> There are several ways to fix this:
> - tell the application writer that if you see deadlocks, increase the
>   number of elements in the buffer pool.
>   I doubt anyone would ever use ODP for anything serious after seeing
>   such a thing.
> - you can't really give anything more specific than in the previous
>   point, because details such as RX/TX descriptor counts are abstracted
>   away, intentionally. And your platform can't autotune them, because it
>   doesn't know how many elements you have in the pool used for TX. In
>   fact, it could be more than just one pool.
> - make sure that you run odp_pktio_send even if pkts == 0. In the case of
>   ODP-DPDK that can help, because it actually triggers TX completion.
>   Actually, we can make odp_pktio_send_complete() == odp_pktio_send(len=0),
>   so we don't have to introduce a new function. But that doesn't change
>   the fact that we have to call TX completion periodically to make sure
>   nothing is blocked.

So why doesn't the ODP-for-DPDK implementation call TX completion
"periodically", or at some other suitable times?

> - or we can just do what I proposed in the patch, which is very similar
>   to the previous point, but articulates the importance of TX completion
>   more.

Which is a platform specific problem and exactly the kind of thing that the
ODP API should hide and not expose.

>
>> -- Ola
>>
>> On 28 May 2015 at 16:38, Zoltan Kiss <zoltan.k...@linaro.org> wrote:
>>
>>     A pktio interface can be used with poll mode drivers, where TX
>>     completion often has to be done manually. This turned up as a
>>     problem with ODP-DPDK and odp_l2fwd:
>>
>>         while (!exit_threads) {
>>                 pkts = odp_pktio_recv(pktio_src,...);
>>                 if (pkts <= 0)
>>                         continue;
>>                 ...
>>                 if (pkts_ok > 0)
>>                         odp_pktio_send(pktio_dst, pkt_tbl, pkts_ok);
>>                 ...
>>         }
>>
>>     In this example we never call odp_pktio_send() on pktio_dst if there
>>     weren't any new packets received on pktio_src. DPDK needs manual TX
>>     completion. The above example should have an
>>     odp_pktio_send_complete(pktio_dst) right at the beginning of the
>>     loop.
>>
>>     Signed-off-by: Zoltan Kiss <zoltan.k...@linaro.org>
>>     ---
>>      include/odp/api/packet_io.h | 16 ++++++++++++++++
>>      1 file changed, 16 insertions(+)
>>
>>     diff --git a/include/odp/api/packet_io.h b/include/odp/api/packet_io.h
>>     index b97b2b8..3a4054c 100644
>>     --- a/include/odp/api/packet_io.h
>>     +++ b/include/odp/api/packet_io.h
>>     @@ -119,6 +119,22 @@ int odp_pktio_recv(odp_pktio_t pktio, odp_packet_t pkt_table[], int len);
>>      int odp_pktio_send(odp_pktio_t pktio, odp_packet_t pkt_table[], int len);
>>
>>      /**
>>     + * Release sent packets
>>     + *
>>     + * This function should be called after sending on a pktio. If the platform
>>     + * doesn't implement send completion in other ways, this function should call
>>     + * odp_packet_free() on packets where transmission is already completed. It can
>>     + * be a no-op if the platform guarantees that the packets will be released upon
>>     + * completion, but the application must call it periodically after send to make
>>     + * sure packets are released.
>>     + *
>>     + * @param pktio  ODP packet IO handle
>>     + *
>>     + * @retval <0 on failure
>>     + */
>>     +int odp_pktio_send_complete(odp_pktio_t pktio);
>>     +
>>     +/**
>>      * Set the default input queue to be associated with a pktio handle
>>      *
>>      * @param pktio  ODP packet IO handle
>>     --
>>     1.9.1
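To make the commit message concrete, here is a sketch of the odp_l2fwd loop
amended as suggested, with the proposed odp_pktio_send_complete() called at
the top of each iteration. It assumes the API proposed in the patch above
(not something that exists today), and that pktio_src, pktio_dst, pkt_tbl,
MAX_PKT_BURST and drop_err_pkts() are as in the existing odp_l2fwd example.

    while (!exit_threads) {
            /* Retire packets already sent on pktio_dst so their buffers
             * go back to the pool even when nothing new arrives. With a
             * 576-element pool and 256-descriptor RX/TX rings this is
             * what lets odp_pktio_recv() refill its ring again once the
             * remaining free buffers are all parked in the TX ring. */
            odp_pktio_send_complete(pktio_dst);

            pkts = odp_pktio_recv(pktio_src, pkt_tbl, MAX_PKT_BURST);
            if (pkts <= 0)
                    continue;

            /* Drop packets with errors; pkts_ok is the number kept. */
            pkts_ok = drop_err_pkts(pkt_tbl, pkts);
            if (pkts_ok > 0)
                    odp_pktio_send(pktio_dst, pkt_tbl, pkts_ok);
    }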
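For completeness, a rough sketch of the implementation-side alternative
argued for above: ODP-for-DPDK reclaiming TX buffers itself when a pool
runs dry, so the application never has to know. Everything below is
hypothetical implementation detail (the pktio registry, pool_alloc(),
odp_pktio_send_complete_internal()); only odp_packet_alloc()'s signature
and ODP_PACKET_INVALID come from the ODP API.

    #include <odp.h>

    #define MAX_PKTIOS 64                       /* made-up bound */

    /* Hypothetical registry of open interfaces, filled in by the
     * implementation when a pktio is opened. */
    static odp_pktio_t open_pktios[MAX_PKTIOS];
    static int num_open_pktios;

    /* Walk all open interfaces and retire completed TX descriptors so
     * their buffers return to the pools. odp_pktio_send_complete_internal()
     * is a hypothetical implementation-internal helper. */
    static void reclaim_tx_buffers_all(void)
    {
            int i;

            for (i = 0; i < num_open_pktios; i++)
                    odp_pktio_send_complete_internal(open_pktios[i]);
    }

    odp_packet_t odp_packet_alloc(odp_pool_t pool, uint32_t len)
    {
            odp_packet_t pkt = pool_alloc(pool, len);  /* hypothetical */

            if (pkt == ODP_PACKET_INVALID) {
                    /* Pool exhausted: reclaim sent buffers everywhere,
                     * then retry once. */
                    reclaim_tx_buffers_all();
                    pkt = pool_alloc(pool, len);
            }
            return pkt;
    }

The same hook could equally live in odp_pktio_recv() or odp_schedule(),
which is the "wherever the application blocks" point made earlier in the
thread.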