Thanks, Ola. I need to think about this and respond more carefully, but in the meantime could you propose the syntax/semantics of odp_queue_end(), odp_queue_end_multi(), and odp_schedule_release() in a bit more detail?
These seem to be new APIs and we need to be clear about their proposed semantics and intended use.

Thanks.

Bill

On Mon, Oct 20, 2014 at 9:40 AM, Ola Liljedahl <ola.liljed...@linaro.org> wrote:

> On 17 October 2014 10:01, Alexandru Badicioiu <alexandru.badici...@linaro.org> wrote:
>
>> Hi Bill, check my thoughts inline.
>> Thanks,
>> Alex
>>
>> On 17 October 2014 03:31, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>>
>>> Based on discussions we had yesterday and today, I'd like to outline the
>>> open issues regarding queues and synchronization/scheduling models. We'd
>>> like to get consensus on this in time for next week's Tuesday call.
>>>
>>> ODP identifies three different synchronization/scheduling models for
>>> queues: Parallel, Atomic, and Ordered. Here are my current understandings
>>> of what these mean:
>>>
>>>    - Parallel: Buffers on a parallel queue can be dequeued by the
>>>      scheduler for any caller without restriction. This permits maximum
>>>      scale-out and concurrency for events that are truly independent.
>>>
>>>    - Atomic: Buffers on an atomic queue can be dequeued by the
>>>      scheduler for any caller. However, only one buffer from an atomic
>>>      queue may be in process at any given time. When the scheduler
>>>      dequeues a buffer from an atomic queue, the queue is locked and
>>>      cannot dequeue further buffers until it is released. Releasing an
>>>      atomic queue can occur in two ways:
>>>
>>>       - The dequeued buffer is enqueued to another queue via an
>>>         odp_queue_enq() call. This action implicitly unlocks the atomic
>>>         queue the buffer was sourced from. Note that this is the most
>>>         common way in which atomic queues are unlocked.
>>>
>>>       - A call is made to odp_schedule_release_atomic() for the locked
>>>         queue. This tells the scheduler that the queue's atomicity
>>>         guarantee is deemed satisfied by the application and the queue
>>>         is free to dequeue items to other scheduler callers.
>>>         This method MUST be used if the caller consumes the buffer
>>>         (e.g., frees it instead of enqueuing it to another queue) and
>>>         MAY be used as a performance optimization if the caller is done
>>>         with any references to data that was serialized by the queue
>>>         (e.g., the queue's context). It is an application programming
>>>         error to release a queue prematurely as references subsequent
>>>         to the release will not be synchronized.
>>>
>> Third way - odp_buffer_free() called on a buffer which was dequeued from
>> an atomic queue.
>>
> odp_queue_end(): implicit release of atomic queue
> odp_queue_end_multi(): ? I assume multi-calls cannot be used with atomic
> scheduling
> odp_buffer_free(): (buffer is consumed) implicit release of atomic queue
> odp_schedule_release_atomic(): explicit release of atomic queue
>
>>
>>>    - Ordered: Buffers on an ordered queue can be dequeued by the
>>>      scheduler for any caller, however buffers on an ordered queue
>>>      retain knowledge of their sequence on their source queue and this
>>>      sequence will be restored whenever they are enqueued to a
>>>      subsequent ordered queue. That is, if ordered queue A contains
>>>      buffers A1, A2, and A3, and these are dequeued for processing by
>>>      three different threads, then when they are subsequently enqueued
>>>      to another ordered queue B by these threads, they will appear on B
>>>      as B1, B2, and B3 regardless of the order in which their
>>>      processing threads issued odp_queue_enq() calls to place them on B.
>>>
>> Why does ordering have to be restored only if the destination is an
>> ordered queue? I think it should be restored regardless of the
>> destination queue type. Also it should be restored if there are multiple
>> destination queues, if the implementation supports it.
>>
> Atomic queues are also ordered queues.
>
> If you enqueue a packet from an atomic or ordered queue onto a parallel
> queue, ordering would normally be lost.
> If there was ever only one thread
> scheduling packets from this parallel queue (how do you guarantee that?),
> then you could argue that ordering is still maintained. But this seems like
> a fragile situation.
>
> Nothing would stop an implementation from always restoring order for
> packets scheduled from ordered queues when enqueuing them.
>
>>> Implicit in these definitions is the fact that all queues associated with
>>> odp_pktio_t objects are ordered queues.
>>>
>> Why is this implicit? The order restoration happens when buffers are
>> enqueued to the destination queue(s). Aren't the queues associated with
>> odp_pktio_t the first queues seen by a packet? If a pktio queue is
>> parallel, there is no requirement at all to ensure any ordering at the
>> next enqueue.
>>
> Packet I/O ingress queues may be atomic and thus also ordered in some
> sense.
> I think the application should decide what ordering requirements there are
> for packet I/O ingress and egress queues.
>
>>> First question: Are these definitions accurate and complete? If not,
>>> then what are the correct definitions for these types we wish to define?
>>> Assuming these are correct, then there are several areas in ODP API that
>>> seem to need refinement:
>>>
>>>    - It seems that ODP buffers need at least two additional pieces of
>>>      system meta data that are missing:
>>>
>>>       - Buffers need to have a last_queue that is the odp_queue_t of the
>>>         last queue they were dequeued from. Needed so that
>>>         odp_queue_enq() can unlock a previous atomic queue.
>>>
>>>       - Buffers need to retain sequence knowledge from the first ordered
>>>         queue they were sourced from (i.e., their ingress odp_pktio_t)
>>>         so this can be used for order restoration as they are enqueued
>>>         to downstream ordered queues.
>>>
>>>    - odp_schedule_release_atomic() currently takes a void argument and
>>>      this is ambiguous.
>>>      It should either take an odp_queue_t which is the atomic queue
>>>      that is to be unlocked or else an odp_buffer_t and use that
>>>      buffer's last_queue to find the queue to be unlocked. Taking an
>>>      odp_buffer_t argument would seem to be more consistent.
>>>
> The idea is probably that the ODP implementation knows which queue is
> associated with each thread. Better that the implementation keeps track of
> this than the application. As both odp_queue_end() and odp_buffer_free()
> take the buffer as a parameter, I think it makes sense to specify the buffer
> also to the odp_schedule_release() call (use the same call for atomic and
> ordered scheduling).
>
> odp_schedule.h:
> void odp_schedule_release(odp_buffer_t buf);
>
>> I think any argument is superfluous for odp_schedule_release_atomic, if
>> the application really needs to continue the buffer processing outside the
>> atomic context. Otherwise the context release happens on free. There can be
>> only one atomic context associated with a thread, at any given moment, and
>> the scheduler implementation has to keep track of it. Of course when
>> freeing the buffer the free call has to know that the atomic context of
>> that buffer has been previously released.
>>
> An application could store a packet on some internal data structure and
> continue processing later. Thus the packet would be "consumed" from a
> scheduler perspective and the application wants to inform the scheduler of
> this and unlock the corresponding queue.
>
>>>    - Given the above definition of ordered queues, it would seem that
>>>      there needs to be some ordered equivalent to
>>>      odp_schedule_release_atomic() to handle the case where a buffer
>>>      from an ordered queue is consumed rather than propagated. This is
>>>      to avoid creating un-fillable gaps in the ordering sequence
>>>      downstream.
>>>
>> Any consumption will end up with calling odp_buffer_free() and free has
>> to inform the order restoration logic.
>> Do we really need to extract a
>> buffer from the ordered flow before the consumption?
>>
> You have no idea when the buffer will be freed. We don't want to pause
> scheduling (for the involved queue) for an indeterminate time.
>
>>> Second question: If an application generates packets (via
>>> odp_packet_alloc()), how are these sequenced with respect to other packets
>>> on downstream ordered queues? Same question for cloned/copied packets.
>>>
>> Clones/copies share the same sequencing information. The same for
>> fragments. If locally generated packets have to be inserted in an ordered
>> flow they have to explicitly request sequencing information relevant to
>> the source queue at the moment of insertion. Then they can be
>> enqueued/freed in the same way as the other buffers in the flow.
>>
>>> Third question: If an application takes packets from a source ordered
>>> queue and enqueues them to different target ordered queues, how is
>>> sequencing handled here since by definition there are gaps in the
>>> individual downstream ordered queues. The simplest example is a switch or
>>> router that takes packets from one odp_pktio_t and sends them to multiple
>>> target odp_pktio_ts. A similar question arises for packets sourced from
>>> multiple input ordered queues going to the same target ordered queue.
>>>
>> I think that order is important to be maintained per flow, not the
>> absolute one between unrelated flows.
>> I think the type of the destination queues and their number also may have
>> no importance. Ordering should not be associated with a particular
>> destination, it is rather a source-defined thing. Ordering works between
>> two points - definition point (where the order is observed) and
>> restoration point, which can be logical (e.g. next enqueue) rather than
>> physical (a given queue). Order definition point is usually a queue.
>> If it's feasible
>> for the HW that the order can be observed across multiple queues, then
>> ordering can work this way too. Maybe we can view the API this way - define
>> order definition/restoration points and associate queues with them.
>>
> Why would you maintain order across multiple queues? I don't think we are
> interested in global packet ordering, just per-flow. And a queue is a proxy
> for a flow.
>
>>> Fourth question: How does this intersect with flow identification from
>>> the classifier? It would seem that the classifier should override the raw
>>> packet sequence and re-write this information as a flow sequence which
>>> would be honored as the ordering sequence by subsequent downstream ordered
>>> queues. Note that flows would still have the same downstream gap issues if
>>> they are enqueued to multiple downstream ordered queues. This would
>>> definitely arise in multihoming support for SCTP, though this example is
>>> not an ODP v1.0 consideration.
>>>
> End-to-end packet order is not guaranteed anyway so termination software
> at the end must always handle misordered (and missing) packets.
>
>>> We may not have a complete set of answers to all of these questions for
>>> ODP v1.0 but we need to be precise about what is and is not done for them
>>> in ODP v1.0 so that we can do accurate testing for v1.0 as well as identify
>>> the areas that need further work next year as we move beyond v1.0.
>>>
>>> Thanks for your thoughts and suggestions on these. If there are
>>> additional questions along these lines feel free to add them, but I'd
>>> really like to scope this discussion to what is in ODP v1.0.
>>>
>>> Bill
>>>
>>> _______________________________________________
>>> lng-odp mailing list
>>> lng-odp@lists.linaro.org
>>> http://lists.linaro.org/mailman/listinfo/lng-odp
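
As a reading aid for the atomic-queue release rules debated in this thread, the following is a minimal single-threaded C sketch. It is only a toy model under stated assumptions: every toy_* name is hypothetical and none of this is ODP code or a proposed ODP signature. It models a per-buffer last_queue field, an implicit release on enqueue to another queue, an implicit release on free, and an explicit release call. A real scheduler would also need proper synchronization between threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every toy_* name is hypothetical, not an ODP API. */
#define TOY_DEPTH 8

typedef struct {
    int    items[TOY_DEPTH];
    size_t head, tail;   /* free-running counters; fill = tail - head */
    bool   atomic;       /* atomic queue: one buffer in process at a time */
    bool   locked;       /* true while a dequeued buffer is still in process */
} toy_queue_t;

typedef struct {
    int          value;
    toy_queue_t *last_queue; /* atomic queue still held for this buffer */
} toy_buf_t;

void toy_queue_init(toy_queue_t *q, bool atomic)
{
    assert(q != NULL);
    q->head = q->tail = 0;
    q->atomic = atomic;
    q->locked = false;
}

/* Append a value; returns false when the queue is full. */
bool toy_push(toy_queue_t *q, int value)
{
    if (q->tail - q->head == TOY_DEPTH)
        return false;
    q->items[q->tail++ % TOY_DEPTH] = value;
    return true;
}

/* Scheduler dequeue: hands out nothing while an atomic queue is locked. */
bool toy_schedule(toy_queue_t *q, toy_buf_t *buf)
{
    if (q->head == q->tail || (q->atomic && q->locked))
        return false;
    buf->value = q->items[q->head++ % TOY_DEPTH];
    buf->last_queue = q->atomic ? q : NULL;
    q->locked = q->atomic;
    return true;
}

/* Explicit release; harmless if repeated because last_queue is cleared. */
void toy_release(toy_buf_t *buf)
{
    if (buf->last_queue != NULL) {
        buf->last_queue->locked = false;
        buf->last_queue = NULL;
    }
}

/* Enqueue to the next queue: implicitly releases the source atomic queue. */
bool toy_enq(toy_queue_t *dst, toy_buf_t *buf)
{
    toy_release(buf);
    return toy_push(dst, buf->value);
}

/* Consuming the buffer also releases the source atomic queue implicitly. */
void toy_free(toy_buf_t *buf)
{
    toy_release(buf);
}
```

In this model a second toy_schedule() on a locked atomic queue fails until the first buffer is enqueued elsewhere, freed, or explicitly released. Keying the release on the buffer rather than on the queue is one of the options raised above, not a settled design.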
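
To illustrate the ordered-queue discussion (sequence knowledge carried from the source queue, and the need to release a consumed buffer so that no un-fillable gap appears downstream), here is a minimal C sketch of a sequence-number reorder window. This is an assumption about one possible model, not how any ODP implementation works: all toy_* names, the window size, and the idea of a single per-source sequence counter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: every toy_* name is hypothetical, not an ODP API. */
#define TOY_WINDOW  8   /* max sequence numbers in flight at once */
#define TOY_OUT_MAX 32  /* capacity of the modelled destination queue */

typedef struct {
    unsigned next_seq;             /* next sequence number allowed out */
    bool     present[TOY_WINDOW];  /* slot resolved (enqueued or consumed) */
    bool     consumed[TOY_WINDOW]; /* slot resolved without a buffer */
    int      value[TOY_WINDOW];
    int      out[TOY_OUT_MAX];     /* destination queue, in restored order */
    size_t   out_len;
} toy_reorder_t;

void toy_reorder_init(toy_reorder_t *r)
{
    memset(r, 0, sizeof(*r));
}

/* Resolve one sequence number, then emit everything that is now in order. */
bool toy_resolve(toy_reorder_t *r, unsigned seq, int value, bool consumed)
{
    size_t slot = seq % TOY_WINDOW;

    /* Reject stale, duplicate, or too-far-ahead sequence numbers. */
    if (seq - r->next_seq >= TOY_WINDOW || r->present[slot])
        return false;

    r->present[slot] = true;
    r->consumed[slot] = consumed;
    r->value[slot] = value;

    while (r->present[r->next_seq % TOY_WINDOW]) {
        slot = r->next_seq % TOY_WINDOW;
        if (!r->consumed[slot]) {
            assert(r->out_len < TOY_OUT_MAX);
            r->out[r->out_len++] = r->value[slot];
        }
        r->present[slot] = false;
        r->next_seq++;
    }
    return true;
}

/* Buffer with source sequence number seq is enqueued downstream. */
bool toy_ordered_enq(toy_reorder_t *r, unsigned seq, int value)
{
    return toy_resolve(r, seq, value, false);
}

/* Buffer was consumed: give up its slot so later buffers are not stalled. */
bool toy_ordered_release(toy_reorder_t *r, unsigned seq)
{
    return toy_resolve(r, seq, 0, true);
}
```

With buffers sequenced 0, 1, 2 on the source queue, enqueuing them in the order 2, 0, 1 still yields 0, 1, 2 on the destination. If buffer 3 is held by the application, buffer 4 stays parked until toy_ordered_release() is called for 3, which is the stall the thread wants to avoid by not waiting for an eventual free.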