On Tue, Jan 17, 2017 at 02:38:20PM +0100, Olivier Matz wrote:
> Hi Bruce,
> 
> Maybe it's worth checking the impact. The size check could be done only
> once per bulk, so it may not cost that much.
> 
> It's also possible to have a particular case for pointer size, and
> use a memcpy for other sizes.
> 
> 
<snip> 
> I think having a performance test showing storing the elt size in the
> ring structure has a deep impact would help to reach a consensus
> faster :)
> 
> 
Hi Olivier,

I did a quick prototype using a switch statement for three data element
sizes: 8, 16, and 32 bytes. The performance difference was neglible to
none. In most cases, with ring_perf_autotest on my system, there was a
small degradation, of less than 1 cycle per packet, and a few were
slightly faster, probably due to the natural variation in results
between runs. I did not test with any memcpy calls in the datapath, all
assignments were done using uint64_t's or vectors of the appropriate
sizes.

Therefore it looks like some kind of solution without macros and using a
stored element size is possible. However, I think there is a third
alternative too. It is outlined below as option 3.

        1. Use macros as in original RFC

        2. Update rte_ring like I did for tests described above so that
           create takes the size parameter, and the switch statment in
           enqueue and dequeue looks that up at runtime.
           This means that rte_ring becomes the type used for all
           transfers of all sizes. Also, enqueues/dequeue functions take
           void * or const void * obj_table parameters rather than void
           ** and void * const * obj_table. 
           Downside, this would change the ring API and ABI, and the
           ring maintains no type information

        3. Update rte_ring as above but rename it to rte_common_ring,
           and have the element size parameter passed to enqueue and
           dequeue functions too - allowing the compiler to optimise the
           switch out. Then we update the existing rte_ring to use the
           rte_common_ring calls, passing in sizeof(void *) as parameter
           to each common call. An event-ring type, or any other ring
           types can similarly be written using common ring code, and
           present the appropriate type information on enqueue/dequeue
           to the apps using them.
           Downside: more code to maintain, and more specialised APIs.

Personally, because I like having type-specialised code, I prefer the
third option. It also gives us the ability to change the common code
without affecting the API/ABI of the rings [which could be updated later
after a proper deprecation period, if we want].

An example of a change I have in mind for this common code would be some
rework around the watermarks support. While the watermarks support is
useful, for the event_rings we actually need more information provided
from enqueue. To that end, I would see the common_rings code changed so
that the enqueue function returns an additional parameter of the
amount of space left in the ring. This information is computed by the
function anyway, and can therefore be efficiently returned by the calls.
For sp_enqueue, this extra parameter would allow the app to know the
minimum number of elements which can be successfully enqueued to the
ring in a subsequent call. The existing rte_ring code can use the return
value to calculate itself if the watermark is exceeded and return a
value as it does now. Other ring types can then decide themselves if
they want to provide watermark functionality, or what their API set
would be - though it's probably best to keep the APIs consistent.

Further thoughts?
/Bruce

Reply via email to