On Fri, May 29, 2015 at 04:21:03PM +0200, Ola Liljedahl wrote:
> On 29 May 2015 at 13:55, Zoltan Kiss <zoltan.k...@linaro.org> wrote:
> 
> >
> >
> > On 28/05/15 17:40, Ola Liljedahl wrote:
> >
> >> On 28 May 2015 at 17:23, Zoltan Kiss <zoltan.k...@linaro.org> wrote:
> >>
> >>
> >>
> >>     On 28/05/15 16:00, Ola Liljedahl wrote:
> >>
> >>         I disapprove of this solution. TX completion processing (cleaning TX
> >>         descriptor rings after transmission completes) is an implementation
> >>         (hardware) aspect and should be hidden from the application.
> >>
> >>
> >>     Unfortunately you can't, if you want your pktio application to work
> >>     with poll mode drivers. In that case the TX completion interrupt can
> >>     be disabled and the application has to control that as well. In the
> >>     case of DPDK you just call the send function (with 0 packets, if you
> >>     don't have anything to send at the time).
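To make the DPDK point above concrete, here is a minimal sketch of such a poll
loop. This is only an illustration: the port/queue variables and burst size are
placeholders, and how aggressively a zero-length TX burst reclaims descriptors
depends on the PMD and its tx_free_thresh configuration.

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    /* Sketch of a poll-mode loop that nudges TX completion even when idle.
     * Calling the TX burst function with zero packets gives the PMD a
     * chance to reclaim already-transmitted mbufs from its TX ring. */
    static void poll_loop(uint16_t port_id, uint16_t queue_id)
    {
            struct rte_mbuf *pkts[32];

            for (;;) {
                    uint16_t n = rte_eth_rx_burst(port_id, queue_id, pkts, 32);

                    if (n == 0) {
                            /* Nothing received: still poke the TX path so
                             * completed buffers go back to the pool. */
                            rte_eth_tx_burst(port_id, queue_id, pkts, 0);
                            continue;
                    }
                    /* ... process and forward the n received packets ... */
            }
    }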
> >>
> >> Why do you have to retire transmitted packet if you are not transmitting
> >> new packets (and need those descriptors in the TX ring)?
> >>
> > Because otherwise they are a memory leak.
> 
> They are not leaked! They are still in the TX ring, just waiting to get
> retired.
> 
> 
> > Those buffers might be needed somewhere else. If they are only released
> > when you send/receive packets out next time, you are in trouble, because
> > that might never happen. Especially when that event is blocked because your
> > TX ring is full of unreleased packets.
> 
> Having too few buffers is always a problem. You don't want to have too large
> RX/TX rings because that just increases buffering and latency ("buffer
> bloat" problem).
> 
> 
> >
> >> Does the application have too few packets in the pool so that reception
> >> will suffer?
> >>
> > Let me approach the problem from a different angle: the current workaround
> > is that you have to allocate a pool with _loooads_ of buffers, so you have
> > a good chance you never run out of free buffers. Probably. Because it still
> > doesn't guarantee that there will be a next send/receive event on that
> > interface to release the packets.
>
> >>
> >>
> >>         There isn't any corresponding call that refills the RX descriptor
> >>         rings with fresh buffers.
> >>
> >>     You can do that in the receive function, I think that's how the
> >>     drivers are doing it generally.
> >>
> >>
> >>         The completion processing can be performed from any ODP call, not
> >>         necessary odp_pktio_send().
> >>
> >>
> >>     I think "any" is not specific enough. Which one?
> >>
> >> odp_pktio_recv, odp_schedule. Wherever the application blocks or
> >> busy-waits for more packets.
> >>
> > We do that already on odp_pktio_recv. It doesn't help, because you can
> > only release the buffers held in the current interface's TX ring. You can't
> > do anything about other interfaces.
> >
> Why not?
> 
> There is no guarantee that the application thread calling odp_pktio_recv()
> on one interface is the only one transmitting on that specific egress
> interface. In the general case, all threads may be using all pktio
> interfaces for both reception and transmission.
> 
> > I mean, you could trigger TX completion on every interface every time you
> > receive on one, but that would be a scalability nightmare.
> 
> Maybe not every time. I expect a more creative solution than this. Perhaps
> when you run out of buffers in the pool?
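One way to read that suggestion is sketched very roughly below. All of the
names here (pktio_tbl, num_pktio, reclaim_tx_completed) are hypothetical, not
existing ODP internals; the idea is just that the implementation's allocation
path could sweep TX rings when the pool runs dry.

    #include <odp.h>

    /* Hypothetical implementation-internal state, for illustration only. */
    extern int num_pktio;
    extern odp_pktio_t pktio_tbl[];
    void reclaim_tx_completed(odp_pktio_t pktio);   /* hypothetical helper */

    static odp_packet_t alloc_with_reclaim(odp_pool_t pool, uint32_t len)
    {
            odp_packet_t pkt = odp_packet_alloc(pool, len);

            if (pkt == ODP_PACKET_INVALID) {
                    /* Pool ran dry: reclaim completed TX buffers on every
                     * interface, then retry the allocation once. */
                    for (int i = 0; i < num_pktio; i++)
                            reclaim_tx_completed(pktio_tbl[i]);
                    pkt = odp_packet_alloc(pool, len);
            }
            return pkt;
    }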
>
> >>
> >>
> >>     Can you provide a vague draft of how you would fix the l2fwd example
> >>     below?
> >>
> >> I don't think anything needs fixing on the application level.
> >>
> >
> > Wrong. odp_l2fwd uses one packet pool, receives from pktio_src and then if
> > there is anything received, it sends it out on pktio_dst.
> 
> This specific application has this specific behavior. Are you sure this is
> a general solution? I am not.
> 
> 
> > Let's say the pool has 576 elements, and the interfaces use 256 RX and
> > 256 TX descriptors. You start with 2*256 buffers kept in the two RX rings.
> > Let's say you receive the first 64 packets; you refill the RX ring
> > immediately, so now you're out of buffers. You can send out those 64, but in
> > the next iteration odp_pktio_recv() will return 0 because it can't refill
> > the RX descriptors (and the driver won't give you back any buffers unless
> > you can refill it). And now you are in an infinite loop: recv will always
> > return 0, because you never release the packets.
> >
> The size of the pool should somehow be correlated with the size of the RX
> and TX rings for "best performance" (whatever this means). But I also think
> that the system should function regardless of RX/TX ring sizes and pool
> size, "function" meaning not deadlock.
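As a rough rule of thumb (my assumption, not something the API promises), the
pool has to cover at least every descriptor that can be holding a buffer at
the same time, plus some headroom for packets in flight in the application:

    /* Rough sizing heuristic for the example above (an assumption, not an
     * API guarantee): cover all RX and TX descriptors plus a burst of
     * in-flight packets. */
    #define NUM_IF        2
    #define RX_RING_SIZE  256
    #define TX_RING_SIZE  256
    #define BURST_SIZE    64
    #define MIN_POOL_SIZE (NUM_IF * (RX_RING_SIZE + TX_RING_SIZE) + BURST_SIZE)
    /* = 2 * (256 + 256) + 64 = 1088, well above the 576 that deadlocks. */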
> 
> > There are several ways to fix this:
> > - tell the application writer that if you see deadlocks, increase the
> > number of buffers in the pool. I doubt anyone would ever use ODP for
> > anything serious after seeing such a thing.
> > - you can't really give anything more specific than the previous point,
> > because details such as RX/TX descriptor numbers are abstracted away,
> > intentionally. And your platform can't autotune them, because it doesn't
> > know how many elements you have in the pool used for TX. In fact, it could
> > be more than just one pool.
> > - make sure that you run odp_pktio_send even if pkts == 0. In case of
> > ODP-DPDK it can help because that actually triggers TX completion.
> > Actually, we can make odp_pktio_send_complete() == odp_pktio_send(len=0),
> > so we don't have to introduce a new function (see the sketch just below).
> > But that doesn't change the fact that we have to call TX completion
> > periodically to make sure nothing is blocked.
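If the equivalence in that last point holds, the new API could be little more
than a wrapper. A sketch under that assumption (it presumes the platform's
odp_pktio_send() tolerates a NULL table with len == 0 and uses such a call to
run TX completion):

    #include <odp.h>

    /* Sketch only: delegate the proposed call to a zero-length send, which
     * on ODP-DPDK is assumed to trigger TX completion as described above. */
    int odp_pktio_send_complete(odp_pktio_t pktio)
    {
            return odp_pktio_send(pktio, NULL, 0);
    }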
> >
> So why doesn't the ODP-for-DPDK implementation call TX completion
> "periodically" or at some other suitable times?
> 
> 
> > - or we can just do what I proposed in the patch, which is very similar to
> > the previous point but articulates the importance of TX completion more.
> >
> Which is a platform-specific problem and exactly the kind of thing that
> the ODP API should hide and not expose.

I agree. Is it possible to dedicate "core 0"/"any core" in the ODP-DPDK
implementation to do the housekeeping job? If we are planning the ODP-DPDK
implementation as just a wrapper over the DPDK API, then there will not be any
added value in using the ODP API. At least from my experience, we have changed
our SDK "a lot" to fit into the ODP model. IMO that kind of effort will be
required for a useful ODP-DPDK port.
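To make the dedicated-core idea slightly more concrete, here is a minimal
sketch of such a housekeeping loop, assuming the odp_pktio_send_complete()
proposed in the patch below and hypothetical exit_threads/num_pktio/pktio_tbl
globals (core pinning itself is omitted):

    #include <odp.h>
    #include <unistd.h>

    /* Hypothetical globals shared with the worker threads. */
    extern volatile int exit_threads;
    extern int num_pktio;
    extern odp_pktio_t pktio_tbl[];

    /* Housekeeping loop meant to run on a dedicated core: periodically
     * reclaim completed TX buffers on every interface so that workers
     * never deadlock on an exhausted pool. */
    static void *housekeeping_thread(void *arg)
    {
            (void)arg;
            while (!exit_threads) {
                    for (int i = 0; i < num_pktio; i++)
                            odp_pktio_send_complete(pktio_tbl[i]);
                    usleep(100); /* back off briefly between sweeps */
            }
            return NULL;
    }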

>
> >>
> >>         -- Ola
> >>
> >>
> >>         On 28 May 2015 at 16:38, Zoltan Kiss <zoltan.k...@linaro.org>
> >>         wrote:
> >>
> >>              A pktio interface can be used with poll mode drivers, where
> >>              TX completion often has to be done manually. This turned up
> >>              as a problem with ODP-DPDK and odp_l2fwd:
> >>
> >>              while (!exit_threads) {
> >>                       pkts = odp_pktio_recv(pktio_src,...);
> >>                       if (pkts <= 0)
> >>                               continue;
> >>              ...
> >>                       if (pkts_ok > 0)
> >>                               odp_pktio_send(pktio_dst, pkt_tbl, pkts_ok);
> >>              ...
> >>              }
> >>
> >>              In this example we never call odp_pktio_send() on pktio_dst
> >>              if there weren't any new packets received on pktio_src. DPDK
> >>              needs manual TX completion. The above example should have an
> >>              odp_pktio_send_complete(pktio_dst) right at the beginning of
> >>              the loop.
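For concreteness, the adjusted loop would look roughly like this (a sketch
that mirrors the elided example above; only the send-complete call at the top
is new):

    while (!exit_threads) {
            /* Reclaim buffers already transmitted on pktio_dst, even if
             * nothing new arrives on pktio_src in this iteration. */
            odp_pktio_send_complete(pktio_dst);

            pkts = odp_pktio_recv(pktio_src,...);
            if (pkts <= 0)
                    continue;
    ...
            if (pkts_ok > 0)
                    odp_pktio_send(pktio_dst, pkt_tbl, pkts_ok);
    ...
    }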
> >>
> >>              Signed-off-by: Zoltan Kiss <zoltan.k...@linaro.org>
> >>
> >>              ---
> >>                include/odp/api/packet_io.h | 16 ++++++++++++++++
> >>                1 file changed, 16 insertions(+)
> >>
> >>              diff --git a/include/odp/api/packet_io.h b/include/odp/api/packet_io.h
> >>              index b97b2b8..3a4054c 100644
> >>              --- a/include/odp/api/packet_io.h
> >>              +++ b/include/odp/api/packet_io.h
> >>              @@ -119,6 +119,22 @@ int odp_pktio_recv(odp_pktio_t pktio, odp_packet_t pkt_table[], int len);
> >>               int odp_pktio_send(odp_pktio_t pktio, odp_packet_t pkt_table[], int len);
> >>
> >>               /**
> >>              + * Release sent packets
> >>              + *
> >>              + * This function should be called after sending on a pktio. If the platform
> >>              + * doesn't implement send completion in other ways, this function should call
> >>              + * odp_packet_free() on packets where transmission is already completed. It can
> >>              + * be a no-op if the platform guarantees that the packets will be released upon
> >>              + * completion, but the application must call it periodically after send to make
> >>              + * sure packets are released.
> >>              + *
> >>              + * @param pktio        ODP packet IO handle
> >>              + *
> >>              + * @retval <0 on failure
> >>              + */
> >>              +int odp_pktio_send_complete(odp_pktio_t pktio);
> >>              +
> >>              +/**
> >>                * Set the default input queue to be associated with a pktio handle
> >>                *
> >>                * @param pktio                ODP packet IO handle
> >>              --
> >>              1.9.1
> >>
