Pool performance is optimized by using a ring as the global buffer storage. The IPC build is disabled, since it needs large modifications due to its dependency on pool internals. The old pool implementation was based on locks and a linked list of buffer headers. The new implementation maintains a ring of buffer handles, which enables fast, burst-based allocs and frees. A ring also scales better with the number of CPUs than a list, since enq and deq operations update opposite ends of the ring.

Segmentation is not implemented yet, to limit the patch set size. Instead, each packet is created from a single, fixed-size (64 kB) buffer. Segmentation and more efficient packet operations (data pointer, etc.) will follow.
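The ring idea described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from odp_ring_internal.h: all names (sketch_ring_t, sketch_ring_enq_multi, sketch_ring_deq_multi, SKETCH_RING_SIZE) are hypothetical, it uses plain C11 atomics with default sequentially consistent ordering for simplicity, and it assumes the ring is sized larger than the number of handles it can ever hold, so the enqueue side does not check for a full ring.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical size: a power of two, larger than the number of handles. */
#define SKETCH_RING_SIZE 8u
#define SKETCH_RING_MASK (SKETCH_RING_SIZE - 1u)

typedef struct {
	_Atomic uint32_t w_head; /* last slot reserved by a writer */
	_Atomic uint32_t w_tail; /* last slot fully written */
	_Atomic uint32_t r_head; /* last slot claimed by a reader */
	_Atomic uint32_t r_tail; /* last slot fully read */
	uint32_t data[SKETCH_RING_SIZE]; /* buffer handles */
} sketch_ring_t;

static void sketch_ring_init(sketch_ring_t *ring)
{
	assert((SKETCH_RING_SIZE & SKETCH_RING_MASK) == 0);
	atomic_init(&ring->w_head, 0);
	atomic_init(&ring->w_tail, 0);
	atomic_init(&ring->r_head, 0);
	atomic_init(&ring->r_tail, 0);
}

/* Free a burst of handles: only the writer end of the ring is updated. */
static void sketch_ring_enq_multi(sketch_ring_t *ring, const uint32_t in[],
				  uint32_t num)
{
	/* Reserve num slots with a single atomic add */
	uint32_t old_head = atomic_fetch_add(&ring->w_head, num);
	uint32_t i;

	for (i = 0; i < num; i++)
		ring->data[(old_head + 1 + i) & SKETCH_RING_MASK] = in[i];

	/* Wait until earlier writers have published their slots */
	while (atomic_load(&ring->w_tail) != old_head)
		;

	atomic_store(&ring->w_tail, old_head + num);
}

/* Allocate up to max handles: only the reader end of the ring is updated.
 * Returns the number of handles stored into out[]. */
static uint32_t sketch_ring_deq_multi(sketch_ring_t *ring, uint32_t out[],
				      uint32_t max)
{
	uint32_t head = atomic_load(&ring->r_head);
	uint32_t tail, num, i;

	/* Claim a burst of slots by moving the reader head */
	do {
		tail = atomic_load(&ring->w_tail);
		num = tail - head; /* handles currently available */

		if (num == 0)
			return 0;
		if (num > max)
			num = max;
	} while (!atomic_compare_exchange_weak(&ring->r_head, &head,
					       head + num));

	for (i = 0; i < num; i++)
		out[i] = ring->data[(head + 1 + i) & SKETCH_RING_MASK];

	/* Wait until earlier readers have finished with their slots */
	while (atomic_load(&ring->r_tail) != head)
		;

	atomic_store(&ring->r_tail, head + num);
	return num;
}
```

In this sketch an alloc burst maps to one sketch_ring_deq_multi() call and a free burst to one sketch_ring_enq_multi() call, so a whole burst costs a single reservation on one end of the ring instead of one lock round trip per buffer.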

L2fwd link rate (%), 2 x 40GE, 64 byte packets

        direct-                 parallel-               atomic-
cpus    orig  direct    diff    orig  parall    diff    orig  atomic    diff
   1     7 %     8 %     1 %     6 %     6 %     2 %   5.4 %   5.6 %     4 %
   2    14 %    15 %     7 %     9 %     9 %     5 %     8 %     9 %     8 %
   4    28 %    30 %     6 %    13 %    14 %    13 %    12 %    15 %    19 %
   6    42 %    44 %     6 %    16 %    19 %    19 %     8 %    20 %   150 %
   8    46 %    59 %    28 %    19 %    23 %    26 %    18 %    24 %    34 %
  10    55 %    57 %     3 %    20 %    27 %    37 %     8 %    28 %   264 %
  12    56 %    56 %    -1 %    22 %    31 %    43 %     7 %    32 %   357 %

The maximum packet rate of the NICs is reached with 10-12 CPUs in direct mode. Otherwise, all cases improved. In particular, the scheduler-driven cases had suffered from poor pool scalability.

changes in v2:
* rebased to api-next branch
* added a comment that ring size must be larger than number of items in it
* fixed clang build issue
* added parens in align macro

v1 reviews:
Reviewed-by: Brian Brooks <brian.bro...@linaro.org>

Petri Savolainen (10):
  linux-gen: ipc: disable build of ipc pktio
  linux-gen: pktio: do not free zero packets
  linux-gen: ring: created common ring implementation
  linux-gen: align: added round up power of two
  linux-gen: pool: reimplement pool with ring
  linux-gen: ring: added multi enq and deq
  linux-gen: pool: use ring multi enq and deq operations
  linux-gen: pool: optimize buffer alloc
  linux-gen: pool: clean up pool inlines functions
  linux-gen: pool: ptr instead of hdl in buffer_alloc_multi

 platform/linux-generic/Makefile.am                 |    3 +-
 .../include/odp/api/plat/pool_types.h              |    6 -
 .../linux-generic/include/odp_align_internal.h     |   34 +-
 .../linux-generic/include/odp_buffer_inlines.h     |  160 +--
 .../linux-generic/include/odp_buffer_internal.h    |  101 +-
 .../include/odp_classification_datamodel.h         |    2 +-
 .../linux-generic/include/odp_config_internal.h    |   34 +-
 .../linux-generic/include/odp_packet_internal.h    |   15 +-
 platform/linux-generic/include/odp_pool_internal.h |  290 +---
 platform/linux-generic/include/odp_ring_internal.h |  176 +++
 .../linux-generic/include/odp_timer_internal.h     |    4 -
 platform/linux-generic/odp_buffer.c                |   14 +-
 platform/linux-generic/odp_classification.c        |   25 +-
 platform/linux-generic/odp_crypto.c                |    4 +-
 platform/linux-generic/odp_packet.c                |  109 +-
 platform/linux-generic/odp_packet_io.c             |    2 +-
 platform/linux-generic/odp_pool.c                  | 1467 ++++++++------------
 platform/linux-generic/odp_queue.c                 |    4 +-
 platform/linux-generic/odp_schedule.c              |  102 +-
 platform/linux-generic/odp_schedule_ordered.c      |    4 +-
 platform/linux-generic/odp_timer.c                 |    3 +-
 platform/linux-generic/pktio/dpdk.c                |   10 +-
 platform/linux-generic/pktio/loop.c                |    2 +-
 platform/linux-generic/pktio/netmap.c              |   14 +-
 platform/linux-generic/pktio/socket.c              |   16 +-
 platform/linux-generic/pktio/socket_mmap.c         |   10 +-
 test/common_plat/performance/odp_pktio_perf.c      |    2 +-
 test/common_plat/performance/odp_scheduling.c      |    8 +-
 test/common_plat/validation/api/packet/packet.c    |    8 +-
 29 files changed, 1015 insertions(+), 1614 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_ring_internal.h

--
2.8.1