Pool performance is optimized by using a ring as the global buffer storage.
The IPC build is disabled, since it needs large modifications due to its
dependency on pool internals. The old pool implementation was based on locks
and a linked list of buffer headers. The new implementation maintains a ring
of buffer handles, which enables fast, burst-based allocs and frees. A ring
also scales better with the number of cpus than a list (enq and deq
operations update opposite ends of the pool).
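
To illustrate the idea only, here is a minimal single-threaded sketch of a
ring of buffer handles with burst enq/deq. It is not the code in
odp_ring_internal.h: all names (sketch_ring_t, sketch_ring_enq_multi, ...) and
the ring size are hypothetical, and the atomic head/tail updates needed for
multi-cpu use are left out.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* hypothetical; must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

typedef struct {
	uint32_t head;               /* read position, advanced by deq (alloc) */
	uint32_t tail;               /* write position, advanced by enq (free) */
	uint32_t handle[RING_SIZE];  /* buffer handles stored in the ring */
} sketch_ring_t;

static void sketch_ring_init(sketch_ring_t *ring)
{
	/* Masking with RING_MASK only works for power-of-two sizes */
	assert((RING_SIZE & RING_MASK) == 0);
	ring->head = 0;
	ring->tail = 0;
}

/* Free a burst: store up to num handles at the tail end.
 * Returns the number of handles actually stored. */
static uint32_t sketch_ring_enq_multi(sketch_ring_t *ring,
				      const uint32_t hdl[], uint32_t num)
{
	uint32_t space = RING_SIZE - (ring->tail - ring->head);
	uint32_t i;

	if (num > space)
		num = space;

	for (i = 0; i < num; i++)
		ring->handle[(ring->tail + i) & RING_MASK] = hdl[i];

	ring->tail += num;
	return num;
}

/* Allocate a burst: read up to num handles from the head end.
 * Returns the number of handles actually read. */
static uint32_t sketch_ring_deq_multi(sketch_ring_t *ring,
				      uint32_t hdl[], uint32_t num)
{
	uint32_t avail = ring->tail - ring->head;
	uint32_t i;

	if (num > avail)
		num = avail;

	for (i = 0; i < num; i++)
		hdl[i] = ring->handle[(ring->head + i) & RING_MASK];

	ring->head += num;
	return num;
}
```

Enq touches only the tail and deq only the head, so a burst moves several
handles with a single index update per operation.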

L2fwd link rate (%), 2 x 40GE, 64 byte packets

        direct-                 parallel-               atomic-
cpus    orig    direct  diff    orig    parall  diff    orig    atomic  diff
1       7 %     8 %     1 %     6 %     6 %     2 %     5.4 %   5.6 %   4 %
2       14 %    15 %    7 %     9 %     9 %     5 %     8 %     9 %     8 %
4       28 %    30 %    6 %     13 %    14 %    13 %    12 %    15 %    19 %
6       42 %    44 %    6 %     16 %    19 %    19 %    8 %     20 %    150 %
8       46 %    59 %    28 %    19 %    23 %    26 %    18 %    24 %    34 %
10      55 %    57 %    3 %     20 %    27 %    37 %    8 %     28 %    264 %
12      56 %    56 %    -1 %    22 %    31 %    43 %    7 %     32 %    357 %

The maximum packet rate of the NICs is reached with 10-12 cpus in direct mode.
Otherwise, all cases improved. The scheduler-driven cases in particular had
suffered from poor pool scalability.

changed in v3:
* rebased
* ipc disabled with #ifdef
* added support for multi-segment packets
* API: added explicit limits for packet length in alloc calls
* corrected validation test and example application bugs found during
  segmentation implementation

changed in v2:
* rebased to api-next branch
* added a comment that ring size must be larger than number of items in it
* fixed clang build issue
* added parens in align macro
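
As background for the align-related items, a minimal sketch of a fully
parenthesized round-up macro and a round-up-to-power-of-two helper. The names
(SKETCH_ALIGN_ROUNDUP, sketch_roundup_pow2_u32) are hypothetical and are not
taken from odp_align_internal.h.

```c
#include <stdint.h>

/* Round x up to the next multiple of align. Every argument is wrapped in
 * parentheses so that expressions such as (a + b) expand correctly. */
#define SKETCH_ALIGN_ROUNDUP(x, align) \
	((((x) + (align) - 1) / (align)) * (align))

/* Round x up to the next power of two (returns x if it already is one).
 * Returns 1 for x == 0 and wraps to 0 for x > 2^31. */
static uint32_t sketch_roundup_pow2_u32(uint32_t x)
{
	if (x <= 1)
		return 1;

	/* Smear the highest set bit of (x - 1) into all lower bits */
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;

	return x + 1;
}
```

A power-of-two ring size lets an index be wrapped with a mask instead of a
modulo operation.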

v1 reviews:
Reviewed-by: Brian Brooks <brian.bro...@linaro.org>



Petri Savolainen (19):
  linux-gen: ipc: disable build of ipc pktio
  linux-gen: pktio: do not free zero packets
  linux-gen: ring: created common ring implementation
  linux-gen: align: added round up power of two
  linux-gen: pool: reimplement pool with ring
  linux-gen: ring: added multi enq and deq
  linux-gen: pool: use ring multi enq and deq operations
  linux-gen: pool: optimize buffer alloc
  linux-gen: pool: clean up pool inlines functions
  linux-gen: pool: ptr instead of hdl in buffer_alloc_multi
  test: validation: buf: test alignment
  test: performance: crypto: use capability to select max packet
  test: correctly initialize pool parameters
  test: validation: packet: fix bugs in tailroom and concat tests
  linux-gen: packet: added support for segmented packets
  test: validation: packet: improved multi-segment alloc test
  api: packet: added limits for packet len on alloc
  linux-gen: packet: remove zero len support from alloc
  linux-gen: packet: enable multi-segment packets

 example/generator/odp_generator.c                  |    2 +-
 include/odp/api/spec/packet.h                      |    9 +-
 include/odp/api/spec/pool.h                        |    6 +
 platform/linux-generic/Makefile.am                 |    1 +
 .../include/odp/api/plat/packet_types.h            |    6 +-
 .../include/odp/api/plat/pool_types.h              |    6 -
 .../linux-generic/include/odp_align_internal.h     |   34 +-
 .../linux-generic/include/odp_buffer_inlines.h     |  167 +--
 .../linux-generic/include/odp_buffer_internal.h    |  120 +-
 .../include/odp_classification_datamodel.h         |    2 +-
 .../linux-generic/include/odp_config_internal.h    |   55 +-
 .../linux-generic/include/odp_packet_internal.h    |   87 +-
 platform/linux-generic/include/odp_pool_internal.h |  289 +---
 platform/linux-generic/include/odp_ring_internal.h |  176 +++
 .../linux-generic/include/odp_timer_internal.h     |    4 -
 platform/linux-generic/odp_buffer.c                |   22 +-
 platform/linux-generic/odp_classification.c        |   25 +-
 platform/linux-generic/odp_crypto.c                |   12 +-
 platform/linux-generic/odp_packet.c                |  717 ++++++++--
 platform/linux-generic/odp_packet_io.c             |    2 +-
 platform/linux-generic/odp_pool.c                  | 1440 ++++++++------------
 platform/linux-generic/odp_queue.c                 |    4 +-
 platform/linux-generic/odp_schedule.c              |  102 +-
 platform/linux-generic/odp_schedule_ordered.c      |    4 +-
 platform/linux-generic/odp_timer.c                 |    3 +-
 platform/linux-generic/pktio/dpdk.c                |   10 +-
 platform/linux-generic/pktio/ipc.c                 |    3 +-
 platform/linux-generic/pktio/loop.c                |    2 +-
 platform/linux-generic/pktio/netmap.c              |   14 +-
 platform/linux-generic/pktio/socket.c              |   17 +-
 platform/linux-generic/pktio/socket_mmap.c         |   10 +-
 test/common_plat/performance/odp_crypto.c          |   47 +-
 test/common_plat/performance/odp_pktio_perf.c      |    2 +-
 test/common_plat/performance/odp_scheduling.c      |    8 +-
 test/common_plat/validation/api/buffer/buffer.c    |  113 +-
 test/common_plat/validation/api/crypto/crypto.c    |    2 +-
 test/common_plat/validation/api/packet/packet.c    |   96 +-
 test/common_plat/validation/api/pktio/pktio.c      |   21 +-
 38 files changed, 1745 insertions(+), 1895 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_ring_internal.h

-- 
2.8.1
