On Tue, Apr 18, 2017 at 4:05 AM, Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia-bell-labs.com> wrote:

>
>
> > -----Original Message-----
> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> > Bill Fischofer
> > Sent: Thursday, April 13, 2017 3:59 PM
> > To: Elo, Matias (Nokia - FI/Espoo) <matias....@nokia-bell-labs.com>
> > Cc: lng-odp-forward <lng-odp@lists.linaro.org>
> > Subject: Re: [lng-odp] [API-NEXT PATCH 0/4] add maximum packet counts to
> > pool info
> >
> > On Thu, Apr 13, 2017 at 2:17 AM, Elo, Matias (Nokia - FI/Espoo)
> > <matias....@nokia-bell-labs.com> wrote:
> > >
> > >> On 12 Apr 2017, at 22:52, Bill Fischofer <bill.fischo...@linaro.org>
> > >> wrote:
> > >>
> > >> On Wed, Apr 12, 2017 at 7:58 AM, Matias Elo <matias....@nokia.com>
> > >> wrote:
> > >>> On some packet pool implementations the number of available
> > >>> packets may vary depending on the packet length (examples below).
> > >>> E.g. a packet pool may include smaller sub-pools for different
> > >>> packet length ranges.
> > >>
> > >> I'm not sure what the motivation is for this proposed change, but the
> > >> pkt.num specified on odp_pool_create() sets the maximum number of
> > >> packets that can be allocated from the pool, not the minimum. This is
> > >> combined with the pkt.len field to say that pkt.num packets of len
> > >> pkt.len can be allocated.
> > >
> > > The need for this information originally came up when fixing an issue
> > > in pool_create(). pool_create() didn't allocate enough packets if
> > > packets of the requested length (pkt.len) were segmented. As a result
> > > of this fix it is now possible to allocate more than pkt.num packets
> > > from the pool, if the requested packet length fits into a single
> > > segment (or fewer segments than the original pkt.len).
> >
> > This sounds like trying to fix an implementation bug with an API
> > change. The implementation has all the info it needs with pkt.num,
> > pkt.len, and pkt.max_len to realize the specified semantics. If there
> > is segmentation involved in pkt.len it can take that into account in
> > configuring the pool.
>
>
> It's a fact that most HW and SW pool implementations deal with a fixed
> segment size. A pool is initialized with N segments from which packets
> are built. The number of segments per packet depends on the packet
> length. When all *segments* are consumed the pool is empty. Currently, if
> pkt.len is larger than the segment size, a pool is not allowed to return
> N short packets (< pkt.len); instead it would need to count and cap those
> to pkt.num (e.g. N/2).
>

In the case of odp-linux, since the implementation picks a segment size
based on the odp_pool_param_t input, this is not an issue, so I assume
you're referring to some actual HW implementation. I'd like to be sure that
implementation reviews any proposed API changes in this area.
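
For concreteness, this is roughly how an application states its
requirements today; a minimal sketch against the api-next
odp_pool_param_t (all values illustrative), leaving segment sizing to
the implementation:

#include <odp_api.h>

/* Minimal sketch: the application states packet count and length
 * requirements; the implementation is free to pick segment sizes. */
static odp_pool_t create_pkt_pool(void)
{
    odp_pool_param_t param;

    odp_pool_param_init(&param);
    param.type        = ODP_POOL_PACKET;
    param.pkt.num     = 2048; /* packets of pkt.len or smaller */
    param.pkt.len     = 1518; /* length that must be allocatable */
    param.pkt.max_len = 9000; /* largest packet ever requested */
    param.pkt.seg_len = 128;  /* min data bytes in first segment */

    return odp_pool_create("pkt_pool", &param);
}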


>
> This adds overhead in either
> * packet counting - in addition to segments, packets also need to be
> counted, or
> * memory usage - to avoid additional packet counting, the segment size
> must be increased to >= pkt.len
>

Packet pools allocate packets, not segments. Packets are comprised of one
or more segments, but segments are not themselves allocatable objects in
the current API (i.e., there is no such thing, from an ODP API perspective,
as a segment that is not part of some packet). We've talked about allowing
segment allocation for use by DDF but we've not advanced much beyond that
yet.
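
A small sketch of that point (assuming the standard packet APIs): the
application can only allocate packets, and then observe whatever
segmentation it was given:

#include <odp_api.h>
#include <stdio.h>

/* Allocate one packet and inspect how the implementation chose to
 * segment it. Segments are visible, but never allocated directly. */
static void inspect_segments(odp_pool_t pool, uint32_t len)
{
    odp_packet_t pkt = odp_packet_alloc(pool, len);

    if (pkt == ODP_PACKET_INVALID)
        return;

    printf("len %u -> %i segment(s), first seg %u bytes\n",
           len, odp_packet_num_segs(pkt), odp_packet_seg_len(pkt));

    odp_packet_free(pkt);
}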

This is an important distinction because each packet carries system and
user metadata and pkt.num is essentially a bound on how much metadata the
pool needs to be able to support. To date we have not defined segment-level
metadata (nor, IMO, should we given that actual segment sizes are not under
application control), so an implementation that uses segments as "pure
data" and maintains the packet-level metadata separately is perfectly valid.


>
>
> The proposed API change adds implementation flexibility. There can be
> more short packets than pkt.len-sized packets.
>

Again, pkt.num is a statement about packet metadata, which is a logically
separate issue from how many (data) segments are used as "backing store"
for a packet. This is particularly relevant to user metadata, since that
can be of arbitrary size in some implementations (max_uarea_size == 0).
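
An application can check what the platform supports before sizing its
user areas; a sketch using odp_pool_capability():

#include <odp_api.h>
#include <stdio.h>

/* Query the per-packet user area limit before creating the pool */
static void print_uarea_limit(void)
{
    odp_pool_capability_t capa;

    if (odp_pool_capability(&capa) == 0)
        printf("pkt.max_uarea_size = %u\n", capa.pkt.max_uarea_size);
}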


>
> For example, the current spec would not allow a HW pool implementation
> which has a small fixed segment size (e.g. 256 bytes which cannot be
> changed) and which does not do packet-level counting (only segment
> level). Also some HW


However the underlying HW is organized, the ODP packet semantics still
apply. Again, ODP packet pools allocate ODP packets, not segments. Segments
are an aspect of implementations that applications need to be aware of, but
are essentially out of their control (other than requiring a minimum
segment size). Does this mean that some implementations may implement the
ODP packet semantics more efficiently than others? Yes, but that's the same
across all ODP APIs. There are plenty of other ODP packet APIs (split /
concat, references, etc.) that are going to have varying levels of
implementation efficiency across different implementations.
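
For instance (a sketch using the references API): a static reference
hands the application a second packet handle that aliases the original
data, and its cost varies by platform just as segmentation does:

#include <odp_api.h>

/* Second handle aliasing pkt's data; the shared data must not be
 * modified while both handles exist. */
static odp_packet_t share_payload(odp_packet_t pkt)
{
    return odp_packet_ref_static(pkt);
}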


> pools may be implemented with a couple of fixed segment sizes underneath
> (e.g. N small segs, M large segs), where you would have a pool of N small
> packets or N/2 mid packets or M large packets: max num packets of any size
> would be N.


Is this a theoretical or an actual issue? Again, we earlier discussed this
possibility in the context of "pool groups", which was rejected as being
too complex to be usefully portable. On a platform that supports multiple
HW segment sizes, each pool would simply use the most appropriate size
based on the odp_pool_param_t input info. Currently the ODP API is silent
on whether the segments contained within a multi-segment packet are of
uniform size, so if an implementation wished to make use of variable-sized
segments it's free to do so. But again, I don't see why the ODP API needs
to be adjusted for this.
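
A sketch of that approach: constrain only what the application needs
and let the implementation map it onto whatever HW segment sizes exist,
clamping pkt.num to the reported capability:

#include <odp_api.h>

static odp_pool_t create_pool_capa(uint32_t num, uint32_t len)
{
    odp_pool_capability_t capa;
    odp_pool_param_t param;

    if (odp_pool_capability(&capa))
        return ODP_POOL_INVALID;

    /* capa.pkt.max_num == 0 means no implementation limit */
    if (capa.pkt.max_num && num > capa.pkt.max_num)
        num = capa.pkt.max_num;

    odp_pool_param_init(&param);
    param.type    = ODP_POOL_PACKET;
    param.pkt.num = num;
    param.pkt.len = len;

    return odp_pool_create("capa_pool", &param);
}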


>
>
> >
> > >
> > > The spec of pkt.num is changed to "The exact number of 'len' byte
> > > packets that the pool must provide.".
> >
> > The current spec says this is "The number of packets that the pool must
> > provide that are packet length 'len' bytes or smaller". That seems
> > perfectly unambiguous. If a given implementation doesn't do that,
> > that's a bug in that implementation and it should be fixed. The
> > validation tests should be testing this, so if that's not being done
> > then that's a bug in the validation test which should also be fixed.
>
> The "or smaller" is an issue for implementation efficiency / flexibility.
> A HW implementation with small fixed seg size and no packet level counting
> cannot full fill that requirement.
>

Which specific platform has this problem? As I recall, you argued
convincingly earlier that it's important for applications to be able to
control how many packets can be "in flight" within a given pool, and that
things like packet references count against this limit, which is why the
spec is written the way it is.
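
That guarantee is straightforward to probe; a validation-style sketch
(drain and count, with an illustrative array bound) whose result is
exactly what this thread is debating, i.e. whether it may exceed
pkt.num for short packets:

#include <odp_api.h>

#define MAX_PKTS 4096 /* illustrative upper bound for the test */

/* Drain the pool with len-byte allocations and report how many
 * packets were obtained before the first failure. */
static uint32_t drain_pool(odp_pool_t pool, uint32_t len)
{
    odp_packet_t pkt[MAX_PKTS];
    uint32_t i, num = 0;

    while (num < MAX_PKTS) {
        pkt[num] = odp_packet_alloc(pool, len);
        if (pkt[num] == ODP_PACKET_INVALID)
            break;
        num++;
    }

    for (i = 0; i < num; i++)
        odp_packet_free(pkt[i]);

    return num; /* compare against pkt.num used at pool create */
}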


>
> >
> > >
> > >>  If the application allocates packets larger
> > >> than this size (up to pkt.max_len) then the actual number of packets
> > >> that can be successfully allocated from the pool may be lower than
> > >> pkt.num, but it will never be greater.
> > >
> > > Depending on the pool implementation this may not always be the case
> > > (not likely) and the ASCII pictures in the cover letter tried to
> > > visualise this possibility. For example, a pool may consist of
> > > smaller HW sub-pools for different packet length ranges and these
> > > sub-pools may be different in size.
> >
> > How an implementation realizes the specified API semantics is not
> > defined by ODP. It should never be possible to allocate more than
> > pkt.num packets from a pool, and it should be possible to allocate
> > exactly pkt.num packets of len pkt.len. That's what the configuration
> > parameters control.
>
> The application can set len == max_len, or seg_len == len, or seg_len ==
> len == max_len, which in practice would limit the number of short/long
> packets. When it does not care, it can read the max number of those from
> the info struct.
>
>
> >
> > >
> > >>
> > >> The reason for wanting the maximum number is to bound the number of
> > >> "in flight" packets for QoS purposes. As a pool reaches this limit,
> > >> RED and similar algorithms kick in to start dropping packets at RX. If
> > >> there is no fixed maximum number of packets, QoS processing becomes
> > >> erratic.
> > >>
> > >
> > > There is still a maximum and an application can query it with
> > > odp_pool_info() (odp_pool_info_t.pkt.max_num). RED and similar
> > > algorithms work as before, as there is no API for configuring them
> > > in ODP.
> >
> > True today, but we have discussed adding watermarks on pools that can
> > be used to trigger such behavior.
>
>
> RED is a property of packet input HW and should be configured through
> pktin config. One pool may be used by multiple interfaces, and per
> interface RED counting would guarantee that one interface would not cause
> others to drop in their first allocs.
>

Yes, which is why pools have watermarks to trigger such events. It's pool
utilization levels that drive these actions, because packets going into a
pool need storage within it.
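
Whatever mechanism does the dropping, the bound it keys off is the pool's
configured capacity, which an application can already read back; a sketch
using odp_pool_info():

#include <odp_api.h>

/* Read back the packet count the pool was created with */
static uint32_t pool_pkt_bound(odp_pool_t pool)
{
    odp_pool_info_t info;

    if (odp_pool_info(pool, &info))
        return 0;

    return info.params.pkt.num;
}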


>
>
> >
> > >
> > >> We had previously floated the idea of allowing pool groups to be
> > >> defined, which would accommodate the practice where packets get
> > >> assigned to different sub-pools based on their size; however, this
> > >> was seen to be overly complicated, especially on platforms that have
> > >> hardware buffer managers.
> > >
> > > True, this gets complicated fast. The updated odp_pool_info_t
> > > includes new maximum packet counts to provide more information to the
> > > application without going into the specifics of the underlying pool
> > > implementation.
> > >
> > >> Do you have a specific use case that can't be handled by the current
> > >> structures?
> > >
> > > Described above.
> >
> > It seems you've described problems with some implementations rather
> > than application needs. The spec was designed to allow applications to
> > specify their requirements on pool capacity, which it seems to do.
>
>
> The flat number of packets from 0 bytes to pkt.len is problematic for all
> implementations in practice.
>

The number of packets within a pool is a determinate number at any given
time, so I'm not sure why this is a problem.


>
> It would also be problematic to allow the application to control multiple
> points in the num/len graph. So, the compromise is to allow control at
> only one point and add num information, so that the application can e.g.
> allocate per-packet context memory.
>

The concern here seems to be about efficiency on some unnamed
implementation with less concern for other unnamed implementations that
might see benefit from multiple control points. That doesn't seem
consistent.
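
For reference, Petri's per-packet context use case would look roughly
like this; a sketch against the *proposed* odp_pool_info_t.pkt.max_num
field from this patch series (not in the released API):

#include <odp_api.h>
#include <stdint.h>
#include <stdlib.h>

struct pkt_ctx {
    uint64_t flow_id; /* illustrative application state */
};

/* Size a context table from the pool's maximum packet count */
static struct pkt_ctx *alloc_ctx_table(odp_pool_t pool)
{
    odp_pool_info_t info;

    if (odp_pool_info(pool, &info))
        return NULL;

    /* pkt.max_num is the field proposed in this patch set */
    return calloc(info.pkt.max_num, sizeof(struct pkt_ctx));
}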


>
> -Petri
>
>
>
