Re: [lng-odp] [API-NEXT PATCH v2 3/5] linux-gen: sched: add internal API for max number of ordered locks per queue

2016-12-01 Thread Elo, Matias (Nokia - FI/Espoo)


Good to see this, however include/odp_schedule_if.h still has
SCHEDULE_ORDERED_LOCKS_PER_QUEUE defined at 2 and you have
MAX_ORDERED_LOCKS_PER_QUEUE defined at 1 below. This is confusing.
Better to use this one value here throughout that should be > 1 to
illustrate generality.

Good catch, the SCHEDULE_ORDERED_LOCKS_PER_QUEUE is old unused code I had 
missed. I'll remove it.

This name should also make it clear that this
is a per-queue limit, not a total limit on number of ordered locks in
the system. Perhaps change this to CONFIG_MAX_ORDERED_LOCKS_PER_QUEUE?

I prefer the current name as the documentation already states the per queue 
part and the name is already rather long.

The actual number of ordered locks for any given queue is set at
odp_queue_create() subject to this maximum value.


The CONFIG_QUEUE_MAX_ORD_LOCKS define is used only by the implementation, for 
example to size arrays inside structs. When calling odp_queue_create(), the 
max lock count is scheduler dependent. The scheduler implementation has to make 
sure not to use larger values than CONFIG_QUEUE_MAX_ORD_LOCKS. I’ve added 
static asserts to both scheduler implementations to make sure of this.
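
As a rough illustration of the relationship described above (a sketch only, 
not the code from the patch): the global define sizes the array, and each 
scheduler checks its own limit against it at build time. SCHED_MAX_ORD_LOCKS, 
demo_queue_t and demo_lock_slots() are made-up names for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Implementation-wide upper bound, used only to size arrays. */
#define CONFIG_QUEUE_MAX_ORD_LOCKS 4

/* Hypothetical per-scheduler limit; it may be smaller than the bound. */
#define SCHED_MAX_ORD_LOCKS 2

/* The build fails if a scheduler advertises more locks than fit. */
static_assert(SCHED_MAX_ORD_LOCKS <= CONFIG_QUEUE_MAX_ORD_LOCKS,
	      "Too_many_ordered_locks");

/* Stand-in for a queue entry: the array is sized by the global bound,
 * independent of which scheduler is active. */
typedef struct {
	uint64_t lock[CONFIG_QUEUE_MAX_ORD_LOCKS];
} demo_queue_t;

/* Number of lock slots physically present in a queue entry. */
static size_t demo_lock_slots(const demo_queue_t *q)
{
	return sizeof(q->lock) / sizeof(q->lock[0]);
}
```

With this arrangement a scheduler can lower its own limit freely, but raising 
it past the array size is caught by the compiler rather than at run time.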

-Matias



Re: [lng-odp] [API-NEXT PATCH v2 4/5] linux-gen: sched: new ordered queue implementation

2016-12-01 Thread Elo, Matias (Nokia - FI/Espoo)


/*
@@ -658,9 +768,21 @@ static int do_schedule(odp_queue_t *out_queue, odp_event_t 
out_ev[],
   ret = copy_events(out_ev, max_num);


This update doesn't modify the preceding code:

ordered = sched_cb_queue_is_ordered(qi);

/* Do not cache ordered events locally to improve
* parallelism. Ordered context can only be released
* when the local cache is empty. */
if (ordered && max_num < MAX_DEQ)
max_deq = max_num;

num = sched_cb_queue_deq_multi(qi, sched_local.ev_stash,
 max_deq);

which seems incorrect since this code appears to do the opposite of
what the comment claims--assign multiple consecutive ordered events to
the same thread meaning that they cannot be processed in parallel.

In my opinion it is the application's choice how many events it wants to 
schedule and the implementation should not force single event operation for 
ordered queues. If the application wants the maximal parallelism it can always 
request a single event at a time.

This if clause was added to make sure that ordered events aren't saved in the 
thread local scheduler cache, which improves parallelism. It also enables 
implementing the ordered locks using atomic counters, which is done in the 
following patch.
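
To make the effect of the if clause concrete, here is a minimal sketch of the 
clamping logic in isolation. It assumes max_deq starts out as MAX_DEQ, which 
is not visible in the quoted hunk; the value 4 and the name demo_max_deq() 
are made up for the example.

```c
#include <assert.h>

#define MAX_DEQ 4 /* hypothetical burst size */

/* Decide how many events to dequeue into the thread-local stash. */
static int demo_max_deq(int ordered, int max_num)
{
	int max_deq = MAX_DEQ; /* assumed default */

	assert(max_num > 0);

	/* For ordered queues never dequeue more than the caller asked
	 * for, so that no ordered event stays behind in the stash. */
	if (ordered && max_num < MAX_DEQ)
		max_deq = max_num;

	return max_deq;
}
```

So an application asking for one event from an ordered queue gets exactly one 
dequeued, while other queue types may still be dequeued in full bursts.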



static void order_unlock(void)
@@ -795,6 +925,15 @@ static void order_unlock(void)

static void schedule_order_lock(unsigned lock_index ODP_UNUSED)
{
+   queue_entry_t *queue;
+
+   queue = sched_local.ordered.src_queue;
+   if (!queue || lock_index >= queue->s.param.sched.lock_count) {
+   ODP_ERR("Invalid ordered lock usage\n");
+   return;
+   }
+

The previous versions of schedule_order_lock() and
schedule_order_unlock() had ODP_ASSERTs() to guard against stale
lock/unlock requests. These should be retained as they are useful for
debugging and ODP_ASSERT() compiles away in non-debug builds.

OK, will change this.
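
For anyone following along, the "compiles away" behaviour can be sketched 
like this. It is a generic debug-assert pattern, not a copy of the ODP macro; 
DEMO_DEBUG, DEMO_ASSERT and demo_order_lock() are names invented for the 
example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* DEMO_DEBUG is a hypothetical build-time switch (1 = debug build). */
#ifndef DEMO_DEBUG
#define DEMO_DEBUG 0
#endif

static_assert(DEMO_DEBUG == 0 || DEMO_DEBUG == 1,
	      "DEMO_DEBUG must be 0 or 1");

/* With DEMO_DEBUG == 0 the condition is constant false, so the
 * compiler is free to drop the whole check. */
#define DEMO_ASSERT(cond) \
	do { \
		if (DEMO_DEBUG && !(cond)) { \
			fprintf(stderr, "Assertion failed: %s\n", #cond); \
			abort(); \
		} \
	} while (0)

static unsigned demo_lock_calls;

/* Stand-in for an ordered lock call guarded by the debug assert. */
static void demo_order_lock(unsigned lock_index, unsigned lock_count)
{
	DEMO_ASSERT(lock_index < lock_count);
	demo_lock_calls++;
}
```

In a non-debug build a stale or out-of-range call passes through silently, 
which is the trade-off against the explicit error return used in v2.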

-Matias



[lng-odp] [API-NEXT PATCH v3] api: ipsec: added IPSEC API

2016-12-01 Thread Petri Savolainen
Added definitions for a look-aside IPSEC offload API. In addition to
IPSEC packet transformations, it also supports:
* inbound SA look up
* outbound IP fragmentation

Signed-off-by: Petri Savolainen 
---

Changes in v3:
* Reword packet ordering specification

Changes in v2:
* Specify that synchronous calls cannot process all packets
  if output.num_pkt < input.num_pkt
* Specify that resulting event must be freed before calling using packets
* Added soft/hard sec limit capability
* Improved packet order specification

Changes in v1:
* renamed odp_ipsec_proto_t to odp_ipsec_protocol_t
* specify that lifetime sec limit is from the SA creation
* added odp_ipsec_sa_context()
* pool for output packets is the same as packet input pool
* added antireplay check and protocol error codes
* specified which input / output packet offsets and flags are set
* moved sync/async mode selection to global config (odp_ipsec_config())
* added IPSEC capability to aid mode selection
* specify that also packet user area is copied from input to output packet
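
A possible way for an application to use the capability for mode selection is
sketched here. The types are local stand-ins that only mirror the
op_mode_sync / op_mode_async fields and their 0/1/2 values from the patch;
demo_select_mode() and the tie-break towards sync are assumptions made for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
	DEMO_OP_MODE_SYNC = 0,
	DEMO_OP_MODE_ASYNC
} demo_op_mode_t;

/* 0: not supported, 1: supported, 2: supported and preferred */
typedef struct {
	uint8_t op_mode_sync;
	uint8_t op_mode_async;
} demo_capability_t;

/* Pick the mode with the higher capability value.
 * Returns 0 and sets *mode, or -1 if neither mode is supported. */
static int demo_select_mode(const demo_capability_t *capa,
			    demo_op_mode_t *mode)
{
	assert(capa != NULL && mode != NULL);

	if (capa->op_mode_sync == 0 && capa->op_mode_async == 0)
		return -1;

	/* Ties go to sync; this is an arbitrary choice for the sketch. */
	if (capa->op_mode_async > capa->op_mode_sync)
		*mode = DEMO_OP_MODE_ASYNC;
	else
		*mode = DEMO_OP_MODE_SYNC;

	return 0;
}
```

The selected mode would then be written to the global configuration before
any IPSEC operations are started.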


 include/odp/api/spec/event.h   |   2 +-
 include/odp/api/spec/ipsec.h   | 883 +
 include/odp_api.h  |   1 +
 platform/Makefile.inc  |   1 +
 platform/linux-generic/Makefile.am |   2 +
 platform/linux-generic/include/odp/api/ipsec.h |  36 +
 .../include/odp/api/plat/event_types.h |   1 +
 .../include/odp/api/plat/ipsec_types.h |  39 +
 8 files changed, 964 insertions(+), 1 deletion(-)
 create mode 100644 include/odp/api/spec/ipsec.h
 create mode 100644 platform/linux-generic/include/odp/api/ipsec.h
 create mode 100644 platform/linux-generic/include/odp/api/plat/ipsec_types.h

diff --git a/include/odp/api/spec/event.h b/include/odp/api/spec/event.h
index fdfa52d..75c0bbc 100644
--- a/include/odp/api/spec/event.h
+++ b/include/odp/api/spec/event.h
@@ -39,7 +39,7 @@ extern "C" {
  * @typedef odp_event_type_t
  * ODP event types:
  * ODP_EVENT_BUFFER, ODP_EVENT_PACKET, ODP_EVENT_TIMEOUT,
- * ODP_EVENT_CRYPTO_COMPL
+ * ODP_EVENT_CRYPTO_COMPL, ODP_EVENT_IPSEC_RESULT
  */
 
 /**
diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h
new file mode 100644
index 000..86f66e6
--- /dev/null
+++ b/include/odp/api/spec/ipsec.h
@@ -0,0 +1,883 @@
+/* Copyright (c) 2016, Linaro Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+/**
+ * @file
+ *
+ * ODP IPSEC API
+ */
+
+#ifndef ODP_API_IPSEC_H_
+#define ODP_API_IPSEC_H_
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include 
+
+/** @defgroup odp_ipsec ODP IPSEC
+ *  Operations of IPSEC API.
+ *  @{
+ */
+
+/**
+ * @typedef odp_ipsec_sa_t
+ * IPSEC Security Association (SA)
+ */
+
+ /**
+ * @def ODP_IPSEC_SA_INVALID
+ * Invalid IPSEC SA
+ */
+
+/**
+ * IPSEC operation mode
+ */
+typedef enum odp_ipsec_op_mode_t {
+   /** Synchronous IPSEC operation
+ *
+ * Application uses synchronous IPSEC operations,
+ * which output all results on function return.
+ */
+   ODP_IPSEC_OP_MODE_SYNC = 0,
+
+   /** Asynchronous IPSEC operation
+ *
+ * Application uses asynchronous IPSEC operations,
+ * which return results via events.
+ */
+   ODP_IPSEC_OP_MODE_ASYNC
+
+} odp_ipsec_op_mode_t;
+
+/**
+ * IPSEC capability
+ */
+typedef struct odp_ipsec_capability_t {
+   /** Maximum number of IPSEC SAs */
+   uint32_t max_num_sa;
+
+   /** Synchronous IPSEC operation mode (ODP_IPSEC_OP_MODE_SYNC) support
+*
+*  0: Synchronous mode is not supported
+*  1: Synchronous mode is supported
+*  2: Synchronous mode is supported and preferred
+*/
+   uint8_t op_mode_sync;
+
+   /** Asynchronous IPSEC operation mode (ODP_IPSEC_OP_MODE_ASYNC) support
+*
+*  0: Asynchronous mode is not supported
+*  1: Asynchronous mode is supported
+*  2: Asynchronous mode is supported and preferred
+*/
+   uint8_t op_mode_async;
+
+   /** Soft expiry limit in seconds support
+*
+*  0: Limit is not supported
+*  1: Limit is supported
+*/
+   uint8_t soft_limit_sec;
+
+   /** Hard expiry limit in seconds support
+*
+*  0: Limit is not supported
+*  1: Limit is supported
+*/
+   uint8_t hard_limit_sec;
+
+   /** Supported cipher algorithms */
+   odp_crypto_cipher_algos_t ciphers;
+
+   /** Supported authentication algorithms */
+   odp_crypto_auth_algos_t   auths;
+
+} odp_ipsec_capability_t;
+
+/**
+ * IPSEC configuration options
+ */
+typedef struct odp_ipsec_config_t {
+   /** IPSEC operation mode. Application selects which mode (sync or async)
+*  will be used for IPSEC operations.
+*
+*  @see odp_ipsec_in(), odp_ipsec_in_enq()
+*/
+   

[lng-odp] [API-NEXT PATCH v3 0/5] new ordered queue implementation

2016-12-01 Thread Matias Elo
V3:
- Removed old SCHEDULE_ORDERED_LOCKS_PER_QUEUE define (Bill)
- Replaced error checks with asserts in ordered lock/unlock (Bill)

V2:
- Support for multiple ordered locks (Bill)
- New ordered lock implementation

Add new implementation for ordered queues. Compared to the old
implementation this is much simpler and improves performance ~1-4x
depending on the test case. Some performance numbers are provided below.

The implementation is based on an atomic ordered context, which only a
single thread may possess at a time. Only the thread owning the atomic
context may do enqueue(s) from the ordered queue. All other threads put
their enqueued events to a thread local enqueue stash (ordered_stash_t).
All stashed enqueue operations will be performed in the original order when
the thread acquires the ordered context. If the ordered stash becomes full,
the enqueue blocks. At the latest, a thread blocks when its ev_stash is
empty and it tries to release the ordered context.
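
The scheme described above can be sketched roughly as follows. This is a
simplified model for illustration, not the code from the series: all demo_*
names and sizes are invented, destination queues are plain arrays, and the
places where the real scheduler would block return -1 instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STASH_SIZE 8  /* hypothetical stash depth */
#define DEMO_DST_SIZE   16

/* Destination queue: a plain array is enough for the illustration. */
typedef struct {
	int events[DEMO_DST_SIZE];
	int num;
} demo_dst_t;

/* Ordered source queue: 'ctx' is the context id allowed to enqueue
 * now, 'next_ctx' is the next id handed out at schedule time. */
typedef struct {
	_Atomic uint64_t ctx;
	_Atomic uint64_t next_ctx;
} demo_src_t;

/* One stashed enqueue operation. */
typedef struct {
	demo_dst_t *dst;
	int ev;
} demo_stash_t;

/* Per-thread ordered state. */
typedef struct {
	demo_src_t *src;
	uint64_t ctx;
	demo_stash_t stash[DEMO_STASH_SIZE];
	int stash_num;
} demo_local_t;

static void demo_src_init(demo_src_t *src)
{
	atomic_init(&src->ctx, 0);
	atomic_init(&src->next_ctx, 0);
}

/* Scheduling an ordered event hands the thread the next context id. */
static void demo_schedule(demo_local_t *l, demo_src_t *src)
{
	l->src = src;
	l->ctx = atomic_fetch_add(&src->next_ctx, 1);
	l->stash_num = 0;
}

static int demo_own_turn(const demo_local_t *l)
{
	return atomic_load_explicit(&l->src->ctx,
				    memory_order_acquire) == l->ctx;
}

static void demo_dst_put(demo_dst_t *dst, int ev)
{
	assert(dst->num < DEMO_DST_SIZE);
	dst->events[dst->num++] = ev;
}

/* Perform all stashed enqueues in their original order. */
static void demo_stash_flush(demo_local_t *l)
{
	for (int i = 0; i < l->stash_num; i++)
		demo_dst_put(l->stash[i].dst, l->stash[i].ev);
	l->stash_num = 0;
}

/* Enqueue directly when holding the ordered context, otherwise stash.
 * Returns -1 when the stash is full, where real code would block. */
static int demo_enqueue(demo_local_t *l, demo_dst_t *dst, int ev)
{
	if (demo_own_turn(l)) {
		demo_stash_flush(l);
		demo_dst_put(dst, ev);
		return 0;
	}
	if (l->stash_num == DEMO_STASH_SIZE)
		return -1;
	l->stash[l->stash_num].dst = dst;
	l->stash[l->stash_num].ev = ev;
	l->stash_num++;
	return 0;
}

/* Release the ordered context. Returns -1 when it is not yet this
 * thread's turn, where real code would wait. */
static int demo_release(demo_local_t *l)
{
	if (!demo_own_turn(l))
		return -1;
	demo_stash_flush(l);
	atomic_store_explicit(&l->src->ctx, l->ctx + 1,
			      memory_order_release);
	l->src = NULL;
	return 0;
}
```

Once every thread has released, ctx equals next_ctx again, which is the same
condition the series checks before allowing an ordered queue to be destroyed.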


The patch set also resolves the following bug:
https://bugs.linaro.org/show_bug.cgi?id=2644


Performance benchmarks:

odp_l2fwd (64B packets)

Throughput (Gbps)
Cores   Old     New     Gain (%)

1:      3.0     7.0     136
2:      3.2     11.1    244
4:      5.0     17.6    252
6:      5.9     23.0    286
8:      7.0     28.6    307
10:     8.0     33.6    321
12:     8.7     38.2    340


odp_pktio_ordered (64B packets)

Throughput (Gbps)
Cores   Old     New     Gain (%)

1:      1.2     1.6     33
2:      1.1     1.8     64
4:      1.4     2.6     78
6:      1.3     2.9     125
8:      1.4     3.3     141
10:     1.3     3.5     175
12:     1.2     3.8     213

Matias Elo (5):
  linux-gen: sched: add internal APIs for locking/unlocking ordered
processing
  linux-gen: sched: remove old ordered queue implementation
  linux-gen: sched: add internal API for max number of ordered locks per
queue
  linux-gen: sched: new ordered queue implementation
  linux-gen: sched: new ordered lock implementation

 platform/linux-generic/Makefile.am |   3 -
 .../linux-generic/include/odp_buffer_internal.h|   7 -
 .../linux-generic/include/odp_config_internal.h|   5 +
 .../linux-generic/include/odp_packet_io_queue.h|   5 +-
 .../linux-generic/include/odp_queue_internal.h |  33 +-
 platform/linux-generic/include/odp_schedule_if.h   |  15 +-
 .../linux-generic/include/odp_schedule_internal.h  |  50 --
 .../include/odp_schedule_ordered_internal.h|  25 -
 platform/linux-generic/odp_packet_io.c |  17 +-
 platform/linux-generic/odp_queue.c |  76 +-
 platform/linux-generic/odp_schedule.c  | 281 ++-
 platform/linux-generic/odp_schedule_ordered.c  | 818 -
 platform/linux-generic/odp_schedule_sp.c   |  25 +-
 platform/linux-generic/odp_traffic_mngr.c  |  28 +-
 platform/linux-generic/pktio/loop.c|   2 +-
 15 files changed, 360 insertions(+), 1030 deletions(-)
 delete mode 100644 platform/linux-generic/include/odp_schedule_internal.h
 delete mode 100644 
platform/linux-generic/include/odp_schedule_ordered_internal.h
 delete mode 100644 platform/linux-generic/odp_schedule_ordered.c

-- 
2.7.4



[lng-odp] [API-NEXT PATCH v3 1/5] linux-gen: sched: add internal APIs for locking/unlocking ordered processing

2016-12-01 Thread Matias Elo
The internal ordered processing locking functions can be more streamlined
compared to the public API functions.

Signed-off-by: Matias Elo 
---
 platform/linux-generic/include/odp_schedule_if.h |  4 
 platform/linux-generic/odp_schedule.c| 12 +++-
 platform/linux-generic/odp_schedule_sp.c | 12 +++-
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/platform/linux-generic/include/odp_schedule_if.h 
b/platform/linux-generic/include/odp_schedule_if.h
index df73e70..37f88a4 100644
--- a/platform/linux-generic/include/odp_schedule_if.h
+++ b/platform/linux-generic/include/odp_schedule_if.h
@@ -37,6 +37,8 @@ typedef int (*schedule_init_global_fn_t)(void);
 typedef int (*schedule_term_global_fn_t)(void);
 typedef int (*schedule_init_local_fn_t)(void);
 typedef int (*schedule_term_local_fn_t)(void);
+typedef void (*schedule_order_lock_fn_t)(void);
+typedef void (*schedule_order_unlock_fn_t)(void);
 
 typedef struct schedule_fn_t {
schedule_pktio_start_fn_t   pktio_start;
@@ -51,6 +53,8 @@ typedef struct schedule_fn_t {
schedule_term_global_fn_t   term_global;
schedule_init_local_fn_tinit_local;
schedule_term_local_fn_tterm_local;
+   schedule_order_lock_fn_torder_lock;
+   schedule_order_unlock_fn_t  order_unlock;
 } schedule_fn_t;
 
 /* Interface towards the scheduler */
diff --git a/platform/linux-generic/odp_schedule.c 
b/platform/linux-generic/odp_schedule.c
index dfc9555..cab68a3 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -755,6 +755,14 @@ static int schedule_multi(odp_queue_t *out_queue, uint64_t 
wait,
return schedule_loop(out_queue, wait, events, num);
 }
 
+static void order_lock(void)
+{
+}
+
+static void order_unlock(void)
+{
+}
+
 static void schedule_pause(void)
 {
sched_local.pause = 1;
@@ -991,7 +999,9 @@ const schedule_fn_t schedule_default_fn = {
.init_global = schedule_init_global,
.term_global = schedule_term_global,
.init_local  = schedule_init_local,
-   .term_local  = schedule_term_local
+   .term_local  = schedule_term_local,
+   .order_lock = order_lock,
+   .order_unlock = order_unlock
 };
 
 /* Fill in scheduler API calls */
diff --git a/platform/linux-generic/odp_schedule_sp.c 
b/platform/linux-generic/odp_schedule_sp.c
index 8b355da..5090a5c 100644
--- a/platform/linux-generic/odp_schedule_sp.c
+++ b/platform/linux-generic/odp_schedule_sp.c
@@ -660,6 +660,14 @@ static void schedule_order_unlock(unsigned lock_index)
(void)lock_index;
 }
 
+static void order_lock(void)
+{
+}
+
+static void order_unlock(void)
+{
+}
+
 /* Fill in scheduler interface */
 const schedule_fn_t schedule_sp_fn = {
.pktio_start   = pktio_start,
@@ -673,7 +681,9 @@ const schedule_fn_t schedule_sp_fn = {
.init_global   = init_global,
.term_global   = term_global,
.init_local= init_local,
-   .term_local= term_local
+   .term_local= term_local,
+   .order_lock =order_lock,
+   .order_unlock =  order_unlock
 };
 
 /* Fill in scheduler API calls */
-- 
2.7.4



[lng-odp] [API-NEXT PATCH v3 5/5] linux-gen: sched: new ordered lock implementation

2016-12-01 Thread Matias Elo
Implement ordered locks using per lock atomic counters. The counter values
are compared to the queue’s atomic context to guarantee ordered locking.
Compared to the previous implementation this enables parallel processing of
ordered events outside of the lock context.
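
The idea can be sketched in isolation as follows. This is a simplified model
using C11 atomics instead of the ODP atomic API; all demo_* names are
invented, the ordered context id is passed in explicitly, and the unlock step
(store ctx + 1 with release semantics) is an assumption modelled on what
release_ordered() does in this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_MAX_ORD_LOCKS 4 /* hypothetical array bound */

typedef struct {
	_Atomic uint64_t lock[DEMO_MAX_ORD_LOCKS];
	unsigned lock_count;
} demo_queue_t;

static void demo_queue_init(demo_queue_t *q, unsigned lock_count)
{
	assert(lock_count <= DEMO_MAX_ORD_LOCKS);
	q->lock_count = lock_count;
	for (unsigned i = 0; i < lock_count; i++)
		atomic_init(&q->lock[i], 0);
}

/* Non-blocking probe: 1 if context 'ctx' may take lock 'idx' now. */
static int demo_order_trylock(demo_queue_t *q, unsigned idx, uint64_t ctx)
{
	assert(idx < q->lock_count);
	return atomic_load_explicit(&q->lock[idx],
				    memory_order_acquire) == ctx;
}

/* Spin until the per-lock counter reaches this context's id. */
static void demo_order_lock(demo_queue_t *q, unsigned idx, uint64_t ctx)
{
	while (!demo_order_trylock(q, idx, ctx))
		; /* busy wait; real code would add a CPU pause hint */
}

/* Hand the lock to the next context in order. */
static void demo_order_unlock(demo_queue_t *q, unsigned idx, uint64_t ctx)
{
	assert(idx < q->lock_count);
	atomic_store_explicit(&q->lock[idx], ctx + 1,
			      memory_order_release);
}
```

Because each lock has its own counter, a thread holding context N + 1 only
waits at the point where it actually takes a lock that context N has not yet
passed; work outside the locked section can overlap.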

Signed-off-by: Matias Elo 
---
 .../linux-generic/include/odp_queue_internal.h |  2 +
 platform/linux-generic/odp_queue.c |  6 +++
 platform/linux-generic/odp_schedule.c  | 49 --
 3 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/platform/linux-generic/include/odp_queue_internal.h 
b/platform/linux-generic/include/odp_queue_internal.h
index b905bd8..8b55de1 100644
--- a/platform/linux-generic/include/odp_queue_internal.h
+++ b/platform/linux-generic/include/odp_queue_internal.h
@@ -59,6 +59,8 @@ struct queue_entry_s {
struct {
odp_atomic_u64_t  ctx; /**< Current ordered context id */
odp_atomic_u64_t  next_ctx; /**< Next unallocated context id */
+   /** Array of ordered locks */
+   odp_atomic_u64_t  lock[CONFIG_QUEUE_MAX_ORD_LOCKS];
} ordered ODP_ALIGNED_CACHE;
 
enq_func_t   enqueue ODP_ALIGNED_CACHE;
diff --git a/platform/linux-generic/odp_queue.c 
b/platform/linux-generic/odp_queue.c
index 4c7f497..d9cb9f3 100644
--- a/platform/linux-generic/odp_queue.c
+++ b/platform/linux-generic/odp_queue.c
@@ -77,8 +77,14 @@ static int queue_init(queue_entry_t *queue, const char *name,
queue->s.param.deq_mode = ODP_QUEUE_OP_DISABLED;
 
if (param->sched.sync == ODP_SCHED_SYNC_ORDERED) {
+   unsigned i;
+
odp_atomic_init_u64(&queue->s.ordered.ctx, 0);
odp_atomic_init_u64(&queue->s.ordered.next_ctx, 0);
+
+   for (i = 0; i < queue->s.param.sched.lock_count; i++)
+   odp_atomic_init_u64(&queue->s.ordered.lock[i],
+   0);
}
}
queue->s.type = queue->s.param.type;
diff --git a/platform/linux-generic/odp_schedule.c 
b/platform/linux-generic/odp_schedule.c
index 4b33513..c628142 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -126,6 +126,15 @@ typedef struct {
int num;
 } ordered_stash_t;
 
+/* Ordered lock states */
+typedef union {
+   uint8_t u8[CONFIG_QUEUE_MAX_ORD_LOCKS];
+   uint32_t all;
+} lock_called_t;
+
+ODP_STATIC_ASSERT(sizeof(lock_called_t) == sizeof(uint32_t),
+ "Lock_called_values_do_not_fit_in_uint32");
+
 /* Scheduler local data */
 typedef struct {
int thr;
@@ -145,6 +154,7 @@ typedef struct {
ordered_stash_t stash[MAX_ORDERED_STASH];
int stash_num; /**< Number of stashed enqueue operations */
uint8_t in_order; /**< Order status */
+   lock_called_t lock_called; /**< States of ordered locks */
} ordered;
 
 } sched_local_t;
@@ -553,12 +563,21 @@ static inline void ordered_stash_release(void)
 
 static inline void release_ordered(void)
 {
+   unsigned i;
queue_entry_t *queue;
 
queue = sched_local.ordered.src_queue;
 
wait_for_order(queue);
 
+   /* Release all ordered locks */
+   for (i = 0; i < queue->s.param.sched.lock_count; i++) {
+   if (!sched_local.ordered.lock_called.u8[i])
+   odp_atomic_store_rel_u64(&queue->s.ordered.lock[i],
+sched_local.ordered.ctx + 1);
+   }
+
+   sched_local.ordered.lock_called.all = 0;
sched_local.ordered.src_queue = NULL;
sched_local.ordered.in_order = 0;
 
@@ -923,19 +942,43 @@ static void order_unlock(void)
 {
 }
 
-static void schedule_order_lock(unsigned lock_index ODP_UNUSED)
+static void schedule_order_lock(unsigned lock_index)
 {
+   odp_atomic_u64_t *ord_lock;
queue_entry_t *queue;
 
queue = sched_local.ordered.src_queue;
 
ODP_ASSERT(queue && lock_index < queue->s.param.sched.lock_count);
 
-   wait_for_order(queue);
+   ord_lock = &queue->s.ordered.lock[lock_index];
+
+   /* Busy loop to synchronize ordered processing */
+   while (1) {
+   uint64_t lock_seq;
+
+   lock_seq = odp_atomic_load_acq_u64(ord_lock);
+
+   if (lock_seq == sched_local.ordered.ctx) {
+   sched_local.ordered.lock_called.u8[lock_index] = 1;
+   return;
+   }
+   odp_cpu_pause();
+   }
 }
 
-static void schedule_order_unlock(unsigned lock_index ODP_UNUSED)
+static void schedule_order_unlock(unsigned lock_index)
 {
+   odp_atomic_u64_t *ord_lock;
+   queue_entry_t *queue;
+
+   queue = sched_local.ordered.src_queue;
+
+   ODP_ASSERT(queue && lock_index < queue->s.param.sched.lock_count);
+
+   o

[lng-odp] [API-NEXT PATCH v3 3/5] linux-gen: sched: add internal API for max number of ordered locks per queue

2016-12-01 Thread Matias Elo
The number of supported ordered locks may vary between scheduler
implementations. Add an internal scheduler API call for fetching the
maximum value from the currently active scheduler.

Add an internal definition CONFIG_QUEUE_MAX_ORD_LOCKS for the scheduler
independent maximum value.
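
A minimal sketch of the pattern, for illustration only: the queue code asks
the active scheduler for its limit through a function pointer instead of
using a compile-time constant. The demo_* names and the limits 2 and 1 are
made up; only the max_ordered_locks callback mirrors the patch.

```c
#include <assert.h>

#define CONFIG_QUEUE_MAX_ORD_LOCKS 4

typedef unsigned (*demo_max_ordered_locks_fn_t)(void);

/* Cut-down scheduler interface with just the new callback. */
typedef struct {
	demo_max_ordered_locks_fn_t max_ordered_locks;
} demo_schedule_fn_t;

/* Two hypothetical schedulers with different limits. */
static unsigned demo_default_max_locks(void)
{
	return 2;
}

static unsigned demo_sp_max_locks(void)
{
	return 1;
}

static const demo_schedule_fn_t demo_default_fn = {
	.max_ordered_locks = demo_default_max_locks
};

static const demo_schedule_fn_t demo_sp_fn = {
	.max_ordered_locks = demo_sp_max_locks
};

/* Queue-side check: 0 if the requested lock count is acceptable for
 * the given scheduler, -1 otherwise. */
static int demo_check_lock_count(const demo_schedule_fn_t *sched_fn,
				 unsigned lock_count)
{
	unsigned max = sched_fn->max_ordered_locks();

	/* No scheduler may exceed the scheduler independent bound. */
	assert(max <= CONFIG_QUEUE_MAX_ORD_LOCKS);

	return lock_count > max ? -1 : 0;
}
```

The same request can therefore succeed with one scheduler and fail with
another, which is why the capability value has to come from the callback.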

Signed-off-by: Matias Elo 
---
 platform/linux-generic/include/odp_config_internal.h |  5 +
 platform/linux-generic/include/odp_schedule_if.h |  8 ++--
 platform/linux-generic/odp_queue.c   |  5 ++---
 platform/linux-generic/odp_schedule.c| 14 +-
 platform/linux-generic/odp_schedule_sp.c | 12 +++-
 5 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/platform/linux-generic/include/odp_config_internal.h 
b/platform/linux-generic/include/odp_config_internal.h
index 8818cda..c494660 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -22,6 +22,11 @@ extern "C" {
 #define ODP_CONFIG_QUEUES 1024
 
 /*
+ * Maximum number of ordered locks per queue
+ */
+#define CONFIG_QUEUE_MAX_ORD_LOCKS 4
+
+/*
  * Maximum number of packet IO resources
  */
 #define ODP_CONFIG_PKTIO_ENTRIES 64
diff --git a/platform/linux-generic/include/odp_schedule_if.h 
b/platform/linux-generic/include/odp_schedule_if.h
index 72af01e..6c2b050 100644
--- a/platform/linux-generic/include/odp_schedule_if.h
+++ b/platform/linux-generic/include/odp_schedule_if.h
@@ -14,12 +14,6 @@ extern "C" {
 #include 
 #include 
 
-/* Constants defined by the scheduler. These should be converted into interface
- * functions. */
-
-/* Number of ordered locks per queue */
-#define SCHEDULE_ORDERED_LOCKS_PER_QUEUE 2
-
 typedef void (*schedule_pktio_start_fn_t)(int pktio_index, int num_in_queue,
  int in_queue_idx[]);
 typedef int (*schedule_thr_add_fn_t)(odp_schedule_group_t group, int thr);
@@ -38,6 +32,7 @@ typedef int (*schedule_init_local_fn_t)(void);
 typedef int (*schedule_term_local_fn_t)(void);
 typedef void (*schedule_order_lock_fn_t)(void);
 typedef void (*schedule_order_unlock_fn_t)(void);
+typedef unsigned (*schedule_max_ordered_locks_fn_t)(void);
 
 typedef struct schedule_fn_t {
schedule_pktio_start_fn_t   pktio_start;
@@ -54,6 +49,7 @@ typedef struct schedule_fn_t {
schedule_term_local_fn_tterm_local;
schedule_order_lock_fn_torder_lock;
schedule_order_unlock_fn_t  order_unlock;
+   schedule_max_ordered_locks_fn_t max_ordered_locks;
 } schedule_fn_t;
 
 /* Interface towards the scheduler */
diff --git a/platform/linux-generic/odp_queue.c 
b/platform/linux-generic/odp_queue.c
index 74f384d..99c91e7 100644
--- a/platform/linux-generic/odp_queue.c
+++ b/platform/linux-generic/odp_queue.c
@@ -70,8 +70,7 @@ static int queue_init(queue_entry_t *queue, const char *name,
queue->s.name[ODP_QUEUE_NAME_LEN - 1] = 0;
}
memcpy(&queue->s.param, param, sizeof(odp_queue_param_t));
-   if (queue->s.param.sched.lock_count >
-   SCHEDULE_ORDERED_LOCKS_PER_QUEUE)
+   if (queue->s.param.sched.lock_count > sched_fn->max_ordered_locks())
return -1;
 
if (param->type == ODP_QUEUE_TYPE_SCHED)
@@ -162,7 +161,7 @@ int odp_queue_capability(odp_queue_capability_t *capa)
 
/* Reserve some queues for internal use */
capa->max_queues= ODP_CONFIG_QUEUES - NUM_INTERNAL_QUEUES;
-   capa->max_ordered_locks = SCHEDULE_ORDERED_LOCKS_PER_QUEUE;
+   capa->max_ordered_locks = sched_fn->max_ordered_locks();
capa->max_sched_groups  = sched_fn->num_grps();
capa->sched_prios   = odp_schedule_num_prio();
 
diff --git a/platform/linux-generic/odp_schedule.c 
b/platform/linux-generic/odp_schedule.c
index 50639ff..5bc274f 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -110,6 +110,12 @@ ODP_STATIC_ASSERT((8 * sizeof(pri_mask_t)) >= 
QUEUES_PER_PRIO,
 /* Maximum number of dequeues */
 #define MAX_DEQ CONFIG_BURST_SIZE
 
+/* Maximum number of ordered locks per queue */
+#define MAX_ORDERED_LOCKS_PER_QUEUE 1
+
+ODP_STATIC_ASSERT(MAX_ORDERED_LOCKS_PER_QUEUE <= CONFIG_QUEUE_MAX_ORD_LOCKS,
+ "Too_many_ordered_locks");
+
 /* Scheduler local data */
 typedef struct {
int thr;
@@ -323,6 +329,11 @@ static int schedule_term_local(void)
return 0;
 }
 
+static unsigned schedule_max_ordered_locks(void)
+{
+   return MAX_ORDERED_LOCKS_PER_QUEUE;
+}
+
 static inline int queue_per_prio(uint32_t queue_index)
 {
return ((QUEUES_PER_PRIO - 1) & queue_index);
@@ -1026,7 +1037,8 @@ const schedule_fn_t schedule_default_fn = {
.init_local  = schedule_init_local,
.term_local  = schedule_term_local,
.order_lock = order_lock,
-   .order_unlock = order_unlock
+   .order_unlock = order_unlock,
+   .max_ordered_locks = schedule_max_ordered_locks
 };
 
 /*

[lng-odp] [API-NEXT PATCH v3 4/5] linux-gen: sched: new ordered queue implementation

2016-12-01 Thread Matias Elo
Add new implementation for ordered queues. Compared to the old
implementation this is much simpler and improves performance ~1-4x
depending on the test case.

The implementation is based on an atomic ordered context, which only a
single thread may possess at a time. Only the thread owning the atomic
context may do enqueue(s) from the ordered queue. All other threads put
their enqueued events to a thread local enqueue stash (ordered_stash_t).
All stashed enqueue operations will be performed in the original order when
the thread acquires the ordered context. If the ordered stash becomes full,
the enqueue blocks. At the latest, a thread blocks when its ev_stash is
empty and it tries to release the ordered context.

Signed-off-by: Matias Elo 
---
 .../linux-generic/include/odp_queue_internal.h |   5 +
 platform/linux-generic/odp_queue.c |  14 +-
 platform/linux-generic/odp_schedule.c  | 171 +++--
 3 files changed, 172 insertions(+), 18 deletions(-)

diff --git a/platform/linux-generic/include/odp_queue_internal.h 
b/platform/linux-generic/include/odp_queue_internal.h
index df36b76..b905bd8 100644
--- a/platform/linux-generic/include/odp_queue_internal.h
+++ b/platform/linux-generic/include/odp_queue_internal.h
@@ -56,6 +56,11 @@ struct queue_entry_s {
odp_buffer_hdr_t *tail;
int   status;
 
+   struct {
+   odp_atomic_u64_t  ctx; /**< Current ordered context id */
+   odp_atomic_u64_t  next_ctx; /**< Next unallocated context id */
+   } ordered ODP_ALIGNED_CACHE;
+
enq_func_t   enqueue ODP_ALIGNED_CACHE;
deq_func_t   dequeue;
enq_multi_func_t enqueue_multi;
diff --git a/platform/linux-generic/odp_queue.c 
b/platform/linux-generic/odp_queue.c
index 99c91e7..4c7f497 100644
--- a/platform/linux-generic/odp_queue.c
+++ b/platform/linux-generic/odp_queue.c
@@ -73,9 +73,14 @@ static int queue_init(queue_entry_t *queue, const char *name,
if (queue->s.param.sched.lock_count > sched_fn->max_ordered_locks())
return -1;
 
-   if (param->type == ODP_QUEUE_TYPE_SCHED)
+   if (param->type == ODP_QUEUE_TYPE_SCHED) {
queue->s.param.deq_mode = ODP_QUEUE_OP_DISABLED;
 
+   if (param->sched.sync == ODP_SCHED_SYNC_ORDERED) {
+   odp_atomic_init_u64(&queue->s.ordered.ctx, 0);
+   odp_atomic_init_u64(&queue->s.ordered.next_ctx, 0);
+   }
+   }
queue->s.type = queue->s.param.type;
 
queue->s.enqueue = queue_enq;
@@ -301,6 +306,13 @@ int odp_queue_destroy(odp_queue_t handle)
ODP_ERR("queue \"%s\" not empty\n", queue->s.name);
return -1;
}
+   if (queue_is_ordered(queue) &&
+   odp_atomic_load_u64(&queue->s.ordered.ctx) !=
+   odp_atomic_load_u64(&queue->s.ordered.next_ctx)) {
+   UNLOCK(&queue->s.lock);
+   ODP_ERR("queue \"%s\" reorder incomplete\n", queue->s.name);
+   return -1;
+   }
 
switch (queue->s.status) {
case QUEUE_STATUS_READY:
diff --git a/platform/linux-generic/odp_schedule.c 
b/platform/linux-generic/odp_schedule.c
index 5bc274f..4b33513 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -111,11 +111,21 @@ ODP_STATIC_ASSERT((8 * sizeof(pri_mask_t)) >= 
QUEUES_PER_PRIO,
 #define MAX_DEQ CONFIG_BURST_SIZE
 
 /* Maximum number of ordered locks per queue */
-#define MAX_ORDERED_LOCKS_PER_QUEUE 1
+#define MAX_ORDERED_LOCKS_PER_QUEUE 2
 
 ODP_STATIC_ASSERT(MAX_ORDERED_LOCKS_PER_QUEUE <= CONFIG_QUEUE_MAX_ORD_LOCKS,
  "Too_many_ordered_locks");
 
+/* Ordered stash size */
+#define MAX_ORDERED_STASH 512
+
+/* Storage for stashed enqueue operation arguments */
+typedef struct {
+   odp_buffer_hdr_t *buf_hdr[QUEUE_MULTI_MAX];
+   queue_entry_t *queue;
+   int num;
+} ordered_stash_t;
+
 /* Scheduler local data */
 typedef struct {
int thr;
@@ -128,7 +138,15 @@ typedef struct {
uint32_t queue_index;
odp_queue_t queue;
odp_event_t ev_stash[MAX_DEQ];
-   void *queue_entry;
+   struct {
+   queue_entry_t *src_queue; /**< Source queue entry */
+   uint64_t ctx; /**< Ordered context id */
+   /** Storage for stashed enqueue operations */
+   ordered_stash_t stash[MAX_ORDERED_STASH];
+   int stash_num; /**< Number of stashed enqueue operations */
+   uint8_t in_order; /**< Order status */
+   } ordered;
+
 } sched_local_t;
 
 /* Priority queue */
@@ -491,17 +509,81 @@ static void schedule_release_atomic(void)
}
 }
 
+static inline int ordered_own_turn(queue_entry_t *queue)
+{
+   uint64_t ctx;
+
+   ctx = odp_atomic_load_acq_u64(&queue->s.ordered.ctx);
+
+   return ctx == sched_local.ordered.ctx;
+}
+
+static inline void wait_f

[lng-odp] [API-NEXT PATCH v3 2/5] linux-gen: sched: remove old ordered queue implementation

2016-12-01 Thread Matias Elo
Remove old ordered queue code. Replaced temporarily by atomic handling.

Signed-off-by: Matias Elo 
---
 platform/linux-generic/Makefile.am |   3 -
 .../linux-generic/include/odp_buffer_internal.h|   7 -
 .../linux-generic/include/odp_packet_io_queue.h|   5 +-
 .../linux-generic/include/odp_queue_internal.h |  26 +-
 platform/linux-generic/include/odp_schedule_if.h   |   3 +-
 .../linux-generic/include/odp_schedule_internal.h  |  50 --
 .../include/odp_schedule_ordered_internal.h|  25 -
 platform/linux-generic/odp_packet_io.c |  17 +-
 platform/linux-generic/odp_queue.c |  57 +-
 platform/linux-generic/odp_schedule.c  |  83 ++-
 platform/linux-generic/odp_schedule_ordered.c  | 818 -
 platform/linux-generic/odp_schedule_sp.c   |   3 +-
 platform/linux-generic/odp_traffic_mngr.c  |  28 +-
 platform/linux-generic/pktio/loop.c|   2 +-
 14 files changed, 103 insertions(+), 1024 deletions(-)
 delete mode 100644 platform/linux-generic/include/odp_schedule_internal.h
 delete mode 100644 
platform/linux-generic/include/odp_schedule_ordered_internal.h
 delete mode 100644 platform/linux-generic/odp_schedule_ordered.c

diff --git a/platform/linux-generic/Makefile.am 
b/platform/linux-generic/Makefile.am
index b60eacb..adbe24d 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -153,8 +153,6 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_queue_internal.h \
  ${srcdir}/include/odp_ring_internal.h \
  ${srcdir}/include/odp_schedule_if.h \
- ${srcdir}/include/odp_schedule_internal.h \
- ${srcdir}/include/odp_schedule_ordered_internal.h \
  ${srcdir}/include/odp_sorted_list_internal.h \
  ${srcdir}/include/odp_shm_internal.h \
  ${srcdir}/include/odp_timer_internal.h \
@@ -208,7 +206,6 @@ __LIB__libodp_linux_la_SOURCES = \
   odp_rwlock_recursive.c \
   odp_schedule.c \
   odp_schedule_if.c \
-  odp_schedule_ordered.c \
   odp_schedule_sp.c \
   odp_shared_memory.c \
   odp_sorted_list.c \
diff --git a/platform/linux-generic/include/odp_buffer_internal.h 
b/platform/linux-generic/include/odp_buffer_internal.h
index 4e75908..2064f7c 100644
--- a/platform/linux-generic/include/odp_buffer_internal.h
+++ b/platform/linux-generic/include/odp_buffer_internal.h
@@ -79,7 +79,6 @@ struct odp_buffer_hdr_t {
uint32_t all;
struct {
uint32_t hdrdata:1;  /* Data is in buffer hdr */
-   uint32_t sustain:1;  /* Sustain order */
};
} flags;
 
@@ -95,12 +94,6 @@ struct odp_buffer_hdr_t {
uint32_t uarea_size; /* size of user area */
uint32_t segcount;   /* segment count */
uint32_t segsize;/* segment size */
-   uint64_t order;  /* sequence for ordered queues */
-   queue_entry_t   *origin_qe;  /* ordered queue origin */
-   union {
-   queue_entry_t   *target_qe;  /* ordered queue target */
-   uint64_t sync[SCHEDULE_ORDERED_LOCKS_PER_QUEUE];
-   };
 #ifdef _ODP_PKTIO_IPC
/* ipc mapped process can not walk over pointers,
 * offset has to be used */
diff --git a/platform/linux-generic/include/odp_packet_io_queue.h 
b/platform/linux-generic/include/odp_packet_io_queue.h
index 13b79f3..d1d4b22 100644
--- a/platform/linux-generic/include/odp_packet_io_queue.h
+++ b/platform/linux-generic/include/odp_packet_io_queue.h
@@ -28,11 +28,10 @@ extern "C" {
 ODP_STATIC_ASSERT(ODP_PKTIN_QUEUE_MAX_BURST >= QUEUE_MULTI_MAX,
  "ODP_PKTIN_DEQ_MULTI_MAX_ERROR");
 
-int pktin_enqueue(queue_entry_t *queue, odp_buffer_hdr_t *buf_hdr, int 
sustain);
+int pktin_enqueue(queue_entry_t *queue, odp_buffer_hdr_t *buf_hdr);
 odp_buffer_hdr_t *pktin_dequeue(queue_entry_t *queue);
 
-int pktin_enq_multi(queue_entry_t *queue, odp_buffer_hdr_t *buf_hdr[], int num,
-   int sustain);
+int pktin_enq_multi(queue_entry_t *queue, odp_buffer_hdr_t *buf_hdr[], int 
num);
 int pktin_deq_multi(queue_entry_t *queue, odp_buffer_hdr_t *buf_hdr[], int 
num);
 
 
diff --git a/platform/linux-generic/include/odp_queue_internal.h 
b/platform/linux-generic/include/odp_queue_internal.h
index e223d9f..df36b76 100644
--- a/platform/linux-generic/include/odp_queue_internal.h
+++ b/platform/linux-generic/include/odp_queue_internal.h
@@ -41,11 +41,11 @@ extern "C" {
 /* forward declaration */
 union queue_entry_u;
 
-typedef int (*enq_func_t)(union queue_entry_u *, odp_buffer_hdr_t *, int);
+typedef int (*enq_func_t)(union queue_entry_u *, odp_b

[lng-odp] [PATCH v3 2/2] test: pktio_ordered: add test script

2016-12-01 Thread Matias Elo
Enable application testing using pcap pktio.

Signed-off-by: Matias Elo 
---
 test/common_plat/performance/Makefile.am   |   1 +
 .../performance/odp_pktio_ordered_run.sh   |  28 +
 test/common_plat/performance/udp64.pcap| Bin 0 -> 7624 bytes
 3 files changed, 29 insertions(+)
 create mode 100755 test/common_plat/performance/odp_pktio_ordered_run.sh
 create mode 100644 test/common_plat/performance/udp64.pcap

diff --git a/test/common_plat/performance/Makefile.am b/test/common_plat/performance/Makefile.am
index 790ddae..2de07aa 100644
--- a/test/common_plat/performance/Makefile.am
+++ b/test/common_plat/performance/Makefile.am
@@ -10,6 +10,7 @@ COMPILE_ONLY = odp_l2fwd$(EXEEXT) \
   odp_scheduling$(EXEEXT)
 
 TESTSCRIPTS = odp_l2fwd_run.sh \
+ odp_pktio_ordered_run.sh \
  odp_sched_latency_run.sh \
  odp_scheduling_run.sh
 
diff --git a/test/common_plat/performance/odp_pktio_ordered_run.sh b/test/common_plat/performance/odp_pktio_ordered_run.sh
new file mode 100755
index 000..c31eed5
--- /dev/null
+++ b/test/common_plat/performance/odp_pktio_ordered_run.sh
@@ -0,0 +1,28 @@
+#!/bin/bash
+#
+# Copyright (c) 2016, Linaro Limited
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+#
+
+PCAP_IN=`find . ${TEST_DIR} $(dirname $0) -name udp64.pcap -print -quit`
+echo "using PCAP_IN = ${PCAP_IN}"
+
+./odp_pktio_ordered -i pcap:in=${PCAP_IN},pcap:out=pcapout.pcap -t 2 &
+wait $!
+STATUS=$?
+
+if [ "$STATUS" -ne 0 ]; then
+  echo "Error: status was: $STATUS, expected 0"
+  exit 1
+fi
+
+if [ `stat -c %s pcapout.pcap` -ne `stat -c %s  ${PCAP_IN}` ]; then
+  echo "File sizes disagree"
+  exit 1
+fi
+
+rm -f pcapout.pcap
+
+exit 0
diff --git a/test/common_plat/performance/udp64.pcap b/test/common_plat/performance/udp64.pcap
new file mode 100644
index 
..45f9d6e6341a331125e1e3e49ab8ad1e71b20712
GIT binary patch
literal 7624
zcmca|c+)~A1{MYw_+QV!zzF1AIDRVZQX4OW4Ui4OOdthJV3Lu8!IgnQ52VaNFl`SP
zPy-M%&2gOL#31#rG%+bTB{eNQBQq;ICpRy@ps;AvtkLiqO%tPeXtbOdEel8Mj?wyY
zv^_D}W*Ti5GK{vPNBat+eXG&_<7gilID#

[lng-odp] [PATCH v3 1/2] test: perf: add new ordered pktio application

2016-12-01 Thread Matias Elo
Add a new test application for ordered queue functionality and performance
validation. The application assigns sequence numbers to incoming packets using
ordered pktin queues and ordered context locks. After being tagged, packets
are enqueued to atomic queues based on flow hash (IPv4 5-tuple). In atomic
flow processing the sequence number is validated and the packet is sent to
the selected output interface.

Main options:
-m: Input queue type can be changed to atomic or parallel to enable easy
    performance comparison. With parallel input queues the packet order is
    not maintained.
-r: Number of input queues per interface
-f: Number of atomic flow queues per interface
-e: Number of extra input processing rounds. This can be used to simulate
    "fat pipe" traffic processing.

Signed-off-by: Matias Elo 
---

V3:
- Fixed clang build error

 test/common_plat/performance/.gitignore  |1 +
 test/common_plat/performance/Makefile.am |7 +-
 test/common_plat/performance/dummy_crc.h |  493 
 test/common_plat/performance/odp_pktio_ordered.c | 1337 ++
 4 files changed, 1837 insertions(+), 1 deletion(-)
 create mode 100644 test/common_plat/performance/dummy_crc.h
 create mode 100644 test/common_plat/performance/odp_pktio_ordered.c

diff --git a/test/common_plat/performance/.gitignore b/test/common_plat/performance/.gitignore
index 1527d25..8bb18f5 100644
--- a/test/common_plat/performance/.gitignore
+++ b/test/common_plat/performance/.gitignore
@@ -3,6 +3,7 @@
 odp_atomic
 odp_crypto
 odp_l2fwd
+odp_pktio_ordered
 odp_pktio_perf
 odp_sched_latency
 odp_scheduling
diff --git a/test/common_plat/performance/Makefile.am b/test/common_plat/performance/Makefile.am
index f184609..790ddae 100644
--- a/test/common_plat/performance/Makefile.am
+++ b/test/common_plat/performance/Makefile.am
@@ -5,6 +5,7 @@ TESTS_ENVIRONMENT += TEST_DIR=${builddir}
 EXECUTABLES = odp_crypto$(EXEEXT) odp_pktio_perf$(EXEEXT)
 
 COMPILE_ONLY = odp_l2fwd$(EXEEXT) \
+  odp_pktio_ordered$(EXEEXT) \
   odp_sched_latency$(EXEEXT) \
   odp_scheduling$(EXEEXT)
 
@@ -22,15 +23,19 @@ bin_PROGRAMS = $(EXECUTABLES) $(COMPILE_ONLY)
 
 odp_crypto_LDFLAGS = $(AM_LDFLAGS) -static
 odp_crypto_CFLAGS = $(AM_CFLAGS) -I${top_srcdir}/test
+odp_pktio_ordered_LDFLAGS = $(AM_LDFLAGS) -static
+odp_pktio_ordered_CFLAGS = $(AM_CFLAGS) -I${top_srcdir}/test
 odp_sched_latency_LDFLAGS = $(AM_LDFLAGS) -static
 odp_sched_latency_CFLAGS = $(AM_CFLAGS) -I${top_srcdir}/test
 odp_scheduling_LDFLAGS = $(AM_LDFLAGS) -static
 odp_scheduling_CFLAGS = $(AM_CFLAGS) -I${top_srcdir}/test
 
 noinst_HEADERS = \
- $(top_srcdir)/test/test_debug.h
+ $(top_srcdir)/test/test_debug.h \
+ dummy_crc.h
 
 dist_odp_crypto_SOURCES = odp_crypto.c
+dist_odp_pktio_ordered_SOURCES = odp_pktio_ordered.c
 dist_odp_sched_latency_SOURCES = odp_sched_latency.c
 dist_odp_scheduling_SOURCES = odp_scheduling.c
 dist_odp_pktio_perf_SOURCES = odp_pktio_perf.c
diff --git a/test/common_plat/performance/dummy_crc.h b/test/common_plat/performance/dummy_crc.h
new file mode 100644
index 000..38da444
--- /dev/null
+++ b/test/common_plat/performance/dummy_crc.h
@@ -0,0 +1,493 @@
+/* Copyright (c) 2016, Linaro Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2013 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN 

Re: [lng-odp] [PATCHv10 3/3] configure.ac update version numbers

2016-12-01 Thread Maxim Uvarov

On 12/01/16 01:09, Anders Roxell wrote:

On 30 November 2016 at 22:02, Mike Holmes  wrote:

I just CC'ed you in Steve.

My head is spinning but I think we have this straight now. Perhaps you have
time to sync with Maxim, and possibly Anders if he has time, to check this
from your Debian background?

I think if we can get the next couple of releases out correctly the pattern
will be established and it will be easier by the time we get to TigerMoth.

Mike

On 30 November 2016 at 15:32, Maxim Uvarov  wrote:


Default is ABI compat mode and all interface functions changed,
so increase the first number of the .so version.

Signed-off-by: Maxim Uvarov 
---
  configure.ac | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index b460a65..fe7e47d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3,7 +3,7 @@ AC_PREREQ([2.5])
  # Set correct API version
  
##
  m4_define([odpapi_generation_version], [1])
-m4_define([odpapi_major_version], [11])
+m4_define([odpapi_major_version], [12])
  m4_define([odpapi_minor_version], [0])
  m4_define([odpapi_point_version], [0])
  m4_define([odpapi_version],
@@ -30,10 +30,10 @@ AM_SILENT_RULES([yes])
  
##
  # Set correct platform library version
  
##
-ODP_LIBSO_VERSION=111:0:0
+ODP_LIBSO_VERSION=112:0:0
  AC_SUBST(ODP_LIBSO_VERSION)

-ODPHELPER_LIBSO_VERSION=110:0:1
+ODPHELPER_LIBSO_VERSION=110:1:2

Since the ABI hasn't changed, we shouldn't bump the age, only the revision.
The curl project [1] describes the rules in an easier way.

Cheers,
Anders
[1] https://github.com/curl/curl/blob/master/lib/Makefile.am#L95


The curl project is not official documentation for autotools, so we can take
it into account but cannot just follow it.


We have 2 official links describing those numbers:

1. https://autotools.io/libtool/version.html
2. https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html


From link 1:
"Always increase the revision value."

From link 2:
"If the library source code has changed at all since the last update,
then increment revision (‘c:r:a’ becomes ‘c:r+1:a’)."



So I think we understood the documentation correctly.

Maxim.


  AC_SUBST(ODPHELPER_LIBSO_VERSION)

  # Checks for programs.
--
2.7.1.250.gff4ea60




--
Mike Holmes
Program Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"




Re: [lng-odp] [PATCHv10 3/3] configure.ac update version numbers

2016-12-01 Thread Steve McIntyre
On Thu, Dec 01, 2016 at 04:05:14PM +0300, Maxim Uvarov wrote:
>On 12/01/16 01:09, Anders Roxell wrote:
>> Since the ABI isn't changed we shouldn't bump the age only the revision.
>> The curl project [1] describes the rules in a easier way.
>>
>>[1] https://github.com/curl/curl/blob/master/lib/Makefile.am#L95
>
>curl project is not official documentation for autotools. So we can take this
>under account but can not just follow it.
>
>So we have 2 official links describing that numbers:
>
>1. https://autotools.io/libtool/version.html
>2. 
>https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
>
>From link 1:
>"Always increase the revision value. "
>
>From link 2:
>"If the library source code has changed at all since the last update, then
>increment revision (‘c:r:a’ becomes ‘c:/r+1/:a’). "

Hi guys,

Two things here...

1. Maxim's two docs say exactly what the curl doc says - just in
   different language. Also from link 1:

   """
   Warning

   A common mistake is to assume that the three values passed to
   -version-info map directly into the three numbers at the end of the
   library name. This is not the case, and indeed, current, revision
   and age are applied differently depending on the operating system
   that one is using.
   """

   The libtool -version-info stuff is *horrendously* confusing for
   many people precisely because of this awful mismatch :-( WTF
   they've defined things this way I have no idea...

2. That just describes the *revision*, however. You've also increased
   the *age* by 2, and that's what Anders was complaining about. From
   the doc you have referenced here (link 1), increasing the *age* but
   not touching *current* makes no sense:

   * Increase the current value whenever an interface has been added,
     removed or changed.
   * Increase the age value only if the changes made to the ABI are
     backward compatible.

   The curl doc again agrees with that.

Do these two points make sense to people?

Cheers,
-- 
Steve McIntyre steve.mcint...@linaro.org
 Linaro.org | Open source software for ARM SoCs



Re: [lng-odp] [PATCHv10 3/3] configure.ac update version numbers

2016-12-01 Thread Mike Holmes
Thanks Steve

In an effort to close on this

Steve and Anders both say it should be

-ODPHELPER_LIBSO_VERSION=110:0:1
+ODPHELPER_LIBSO_VERSION=110:1:1

They are our best resource, so I consider myself educated :)
Maxim / Anders, can we take the curl text and add any extra explanation we
need before adding it to our docs?

Mike


-- 
Mike Holmes
Program Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"


Re: [lng-odp] [PATCHv10 3/3] configure.ac update version numbers

2016-12-01 Thread Maxim Uvarov

On 12/01/16 17:25, Mike Holmes wrote:

Thanks Steve

In an effort to close on this

Steve and Anders both say it should be

-ODPHELPER_LIBSO_VERSION=110:0:1
+ODPHELPER_LIBSO_VERSION=110:1:1

They are our best resource, so I consider myself educated :)
Maxim / Anders, can we take the curl text and add any extra
explanation we need before adding it to our docs?


Mike


OK, if the first two patches are fine, please add your Reviewed-by and I will
respin the last one with those changes and the comment from curl.


Maxim.








[lng-odp] [Bug 2652] drvshmem_main fails and segfaults

2016-12-01 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2652

Mike Holmes  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Mike Holmes  ---
b5781ee

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[lng-odp] [Bug 2595] vlan insertion test

2016-12-01 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2595

Mike Holmes  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mike.hol...@linaro.org
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Mike Holmes  ---
783ca69

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[lng-odp] [Bug 2498] Coverty bugs found with licenced copy of the tool

2016-12-01 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2498

Mike Holmes  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CONFIRMED                   |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #4 from Mike Holmes  ---
We don't have access to the full tool and the code base has moved on.
Patches would be welcome, but closing.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [lng-odp] [PATCHv3] platform: linux-generic: reading cpu affinity from cpuset

2016-12-01 Thread Krishna Garapati
ping

/Krishna

On 28 November 2016 at 15:34, Balakrishna Garapati <
balakrishna.garap...@linaro.org> wrote:

> With this change the CPU affinity is read correctly, especially
> when using cgroups; otherwise the wrong CPU mask is set.
>
> Fixes bug: https://bugs.linaro.org/show_bug.cgi?id=2472
>
> Signed-off-by: Balakrishna Garapati 
> ---
>
>  v1 to v2: added Description of the issue to the patch commit log.
>  v2 to v3: Resending the patch adding the log change from v1 to v2
>
>  platform/linux-generic/odp_cpumask.c | 69 +-
> --
>  1 file changed, 16 insertions(+), 53 deletions(-)
>
> diff --git a/platform/linux-generic/odp_cpumask.c
> b/platform/linux-generic/odp_cpumask.c
> index 6bf2632..7b0d80a 100644
> --- a/platform/linux-generic/odp_cpumask.c
> +++ b/platform/linux-generic/odp_cpumask.c
> @@ -227,71 +227,34 @@ int odp_cpumask_next(const odp_cpumask_t *mask, int
> cpu)
>   */
>  static int get_installed_cpus(void)
>  {
> -   char *numptr;
> -   char *endptr;
> -   long int cpu_idnum;
> -   DIR  *d;
> -   struct dirent *dir;
> +   int cpu_idnum;
> +   cpu_set_t cpuset;
> +   int ret;
>
> /* Clear the global cpumasks for control and worker CPUs */
> odp_cpumask_zero(&odp_global_data.control_cpus);
> odp_cpumask_zero(&odp_global_data.worker_cpus);
>
> -   /*
> -* Scan the /sysfs pseudo-filesystem for CPU info directories.
> -* There should be one subdirectory for each installed logical CPU
> -*/
> -   d = opendir("/sys/devices/system/cpu");
> -   if (d) {
> -   while ((dir = readdir(d)) != NULL) {
> -   cpu_idnum = CPU_SETSIZE;
> -
> -   /*
> -* If the current directory entry doesn't represent
> -* a CPU info subdirectory then skip to the next
> entry.
> -*/
> -   if (dir->d_type == DT_DIR) {
> -   if (!strncmp(dir->d_name, "cpu", 3)) {
> -   /*
> -* Directory name starts with
> "cpu"...
> -* Try to extract a CPU ID number
> -* from the remainder of the
> dirname.
> -*/
> -   errno = 0;
> -   numptr = dir->d_name;
> -   numptr += 3;
> -   cpu_idnum = strtol(numptr, &endptr,
> -  10);
> -   if (errno || (endptr == numptr))
> -   continue;
> -   } else {
> -   continue;
> -   }
> -   } else {
> -   continue;
> -   }
> -   /*
> -* If we get here the current directory entry
> specifies
> -* a CPU info subdir for the CPU indexed by
> cpu_idnum.
> -*/
> +   CPU_ZERO(&cpuset);
> +   ret = sched_getaffinity(0, sizeof(cpuset), &cpuset);
>
> -   /* Track number of logical CPUs discovered */
> -   if (odp_global_data.num_cpus_installed <
> -   (int)(cpu_idnum + 1))
> -   odp_global_data.num_cpus_installed =
> -   (int)(cpu_idnum + 1);
> +   if (ret < 0) {
> +   ODP_ERR("Failed to get cpu affinity");
> +   return -1;
> +   }
>
> +   for (cpu_idnum = 0; cpu_idnum < CPU_SETSIZE - 1; cpu_idnum++) {
> +   if (CPU_ISSET(cpu_idnum, &cpuset)) {
> +   odp_global_data.num_cpus_installed++;
> /* Add the CPU to our default cpumasks */
> odp_cpumask_set(&odp_global_data.control_cpus,
> -   (int)cpu_idnum);
> +   (int)cpu_idnum);
> odp_cpumask_set(&odp_global_data.worker_cpus,
> -   (int)cpu_idnum);
> +   (int)cpu_idnum);
> }
> -   closedir(d);
> -   return 0;
> -   } else {
> -   return -1;
> }
> +
> +   return 0;
>  }
>
>  /*
> --
> 1.9.1
>
>


[lng-odp] [Bug 2429] validation thread_main fails process mode

2016-12-01 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2429

Mike Holmes  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|lng-odp@lists.linaro.org    |rizwan.ans...@linaro.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

[lng-odp] [Bug 2428] validation timer_main fails process mode

2016-12-01 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2428

Mike Holmes  changed:

   What|Removed |Added

   Assignee|lng-odp@lists.linaro.org|rizwan.ans...@linaro.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.

[lng-odp] [Bug 2309] Timer validation test failure

2016-12-01 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2309

Mike Holmes  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |NON REPRODUCIBLE
             Status|IN_PROGRESS                 |RESOLVED

--- Comment #10 from Mike Holmes  ---
We don't have HW capable of showing the issue, so closing.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.

[lng-odp] [API-NEXT PATCH] linux-gen: pool add missing eof for error prints

2016-12-01 Thread Maxim Uvarov
While debugging, found that error prints were missing end-of-line characters.

Signed-off-by: Maxim Uvarov 
---
 platform/linux-generic/odp_pool.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/platform/linux-generic/odp_pool.c b/platform/linux-generic/odp_pool.c
index 4be3827..5ae37e1 100644
--- a/platform/linux-generic/odp_pool.c
+++ b/platform/linux-generic/odp_pool.c
@@ -281,11 +281,11 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
uint32_t max_len, max_seg_len;
uint32_t ring_size;
int name_len;
-   const char *postfix = "_uarea";
+   const char *postfix = "_uarea\n";
char uarea_name[ODP_POOL_NAME_LEN + sizeof(postfix)];
 
if (params == NULL) {
-   ODP_ERR("No params");
+   ODP_ERR("No params\n");
return ODP_POOL_INVALID;
}
 
@@ -300,7 +300,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
/* Validate requested buffer alignment */
if (align > ODP_CONFIG_BUFFER_ALIGN_MAX ||
align != ODP_ALIGN_ROUNDDOWN_POWER_2(align, align)) {
-   ODP_ERR("Bad align requirement");
+   ODP_ERR("Bad align requirement\n");
return ODP_POOL_INVALID;
}
 
@@ -332,7 +332,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
break;
 
default:
-   ODP_ERR("Bad pool type");
+   ODP_ERR("Bad pool type\n");
return ODP_POOL_INVALID;
}
 
@@ -342,7 +342,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
pool = reserve_pool();
 
if (pool == NULL) {
-   ODP_ERR("No more free pools");
+   ODP_ERR("No more free pools\n");
return ODP_POOL_INVALID;
}
 
@@ -390,7 +390,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
pool->shm = shm;
 
if (shm == ODP_SHM_INVALID) {
-   ODP_ERR("Shm reserve failed");
+   ODP_ERR("Shm reserve failed\n");
goto error;
}
 
@@ -404,7 +404,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
pool->uarea_shm = shm;
 
if (shm == ODP_SHM_INVALID) {
-   ODP_ERR("Shm reserve failed (uarea)");
+   ODP_ERR("Shm reserve failed (uarea)\n");
goto error;
}
 
-- 
2.7.1.250.gff4ea60



Re: [lng-odp] [API-NEXT PATCH] linux-gen: pool add missing eof for error prints

2016-12-01 Thread Maxim Uvarov

On 12/01/16 18:26, Maxim Uvarov wrote:

While debugging, found that error prints were missing end-of-line characters.

Signed-off-by: Maxim Uvarov 
---
  platform/linux-generic/odp_pool.c | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/platform/linux-generic/odp_pool.c b/platform/linux-generic/odp_pool.c
index 4be3827..5ae37e1 100644
--- a/platform/linux-generic/odp_pool.c
+++ b/platform/linux-generic/odp_pool.c
@@ -281,11 +281,11 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
uint32_t max_len, max_seg_len;
uint32_t ring_size;
int name_len;
-   const char *postfix = "_uarea";
+   const char *postfix = "_uarea\n";

Oops, will send v2.

Maxim.


[lng-odp] [API-NEXT PATCHv2] linux-gen: pool add missing eof for error prints

2016-12-01 Thread Maxim Uvarov
While debugging, found that error prints were missing end-of-line characters.

Signed-off-by: Maxim Uvarov 
---
 platform/linux-generic/odp_pool.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/platform/linux-generic/odp_pool.c b/platform/linux-generic/odp_pool.c
index 4be3827..f63efb6 100644
--- a/platform/linux-generic/odp_pool.c
+++ b/platform/linux-generic/odp_pool.c
@@ -285,7 +285,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
char uarea_name[ODP_POOL_NAME_LEN + sizeof(postfix)];
 
if (params == NULL) {
-   ODP_ERR("No params");
+   ODP_ERR("No params\n");
return ODP_POOL_INVALID;
}
 
@@ -300,7 +300,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
/* Validate requested buffer alignment */
if (align > ODP_CONFIG_BUFFER_ALIGN_MAX ||
align != ODP_ALIGN_ROUNDDOWN_POWER_2(align, align)) {
-   ODP_ERR("Bad align requirement");
+   ODP_ERR("Bad align requirement\n");
return ODP_POOL_INVALID;
}
 
@@ -332,7 +332,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
break;
 
default:
-   ODP_ERR("Bad pool type");
+   ODP_ERR("Bad pool type\n");
return ODP_POOL_INVALID;
}
 
@@ -342,7 +342,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
pool = reserve_pool();
 
if (pool == NULL) {
-   ODP_ERR("No more free pools");
+   ODP_ERR("No more free pools\n");
return ODP_POOL_INVALID;
}
 
@@ -390,7 +390,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
pool->shm = shm;
 
if (shm == ODP_SHM_INVALID) {
-   ODP_ERR("Shm reserve failed");
+   ODP_ERR("Shm reserve failed\n");
goto error;
}
 
@@ -404,7 +404,7 @@ static odp_pool_t pool_create(const char *name, odp_pool_param_t *params,
pool->uarea_shm = shm;
 
if (shm == ODP_SHM_INVALID) {
-   ODP_ERR("Shm reserve failed (uarea)");
+   ODP_ERR("Shm reserve failed (uarea)\n");
goto error;
}
 
-- 
2.7.1.250.gff4ea60



[lng-odp] [Bug 2670] New: ./configure --disable-abi-compat fails make distcheck

2016-12-01 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2670

            Bug ID: 2670
           Summary: ./configure --disable-abi-compat fails make distcheck
           Product: OpenDataPlane - linux-generic reference
           Version: v1.11.0.0
          Hardware: Other
                OS: Linux
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: ---
         Component: Build system
          Assignee: maxim.uva...@linaro.org
          Reporter: mike.hol...@linaro.org
                CC: lng-odp@lists.linaro.org
  Target Milestone: ---

Even when you build a distribution targeted at an embedded case where
performance trumps having a stable ABI, I think the distribution tests are
still expected to pass.

./bootstrap 
./configure --disable-abi-compat
 make distcheck
..
make[4]: Entering directory
'/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance'
  CC   odp_crypto-odp_crypto.o
  CCLD odp_crypto
  CC   odp_pktio_perf.o
  CCLD odp_pktio_perf
odp_pktio_perf.o: In function `run_test_single':
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:617:
undefined reference to `odp_atomic_store_u32'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:651:
undefined reference to `odp_atomic_store_u32'
odp_pktio_perf.o: In function `run_thread_rx':
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:458:
undefined reference to `odp_atomic_load_u32'
odp_pktio_perf.o: In function `pktio_create_packet':
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:199:
undefined reference to `odp_cpu_to_be_16'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:205:
undefined reference to `odp_cpu_to_be_32'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:206:
undefined reference to `odp_cpu_to_be_32'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:208:
undefined reference to `odp_cpu_to_be_16'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:212:
undefined reference to `odp_atomic_fetch_inc_u32'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:213:
undefined reference to `odp_cpu_to_be_16'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:221:
undefined reference to `odp_cpu_to_be_16'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:222:
undefined reference to `odp_cpu_to_be_16'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:223:
undefined reference to `odp_cpu_to_be_16'
odp_pktio_perf.o: In function `test_init':
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:753:
undefined reference to `odp_atomic_init_u32'
/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance/../../../../test/common_plat/performance/odp_pktio_perf.c:754:
undefined reference to `odp_atomic_init_u32'
collect2: error: ld returned 1 exit status
Makefile:745: recipe for target 'odp_pktio_perf' failed
make[4]: *** [odp_pktio_perf] Error 1
make[4]: Leaving directory
'/root/odp/opendataplane-1.12.0.0/_build/test/common_plat/performance'
Makefile:420: recipe for target 'all-recursive' failed
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory
'/root/odp/opendataplane-1.12.0.0/_build/test/common_plat'
Makefile:419: recipe for target 'all-recursive' failed
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory '/root/odp/opendataplane-1.12.0.0/_build/test'
Makefile:477: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/root/odp/opendataplane-1.12.0.0/_build'
Makefile:683: recipe for target 'distcheck' failed
make: *** [distcheck] Error 1


Re: [lng-odp] [PATCHv10 2/3] changelog: summary of changes for odp v1.12.0.0

2016-12-01 Thread Mike Holmes
On 30 November 2016 at 15:32, Maxim Uvarov  wrote:

> From: Bill Fischofer 
>
> Signed-off-by: Bill Fischofer 
> Signed-off-by: Maxim Uvarov 
>

Reviewed-by: Mike Holmes 



> ---
>  CHANGELOG | 177 ++
> 
>  1 file changed, 177 insertions(+)
>
> diff --git a/CHANGELOG b/CHANGELOG
> index 1d652a8..17afe44 100644
> --- a/CHANGELOG
> +++ b/CHANGELOG
> @@ -1,3 +1,180 @@
> +== OpenDataPlane (1.12.0.0)
> +
> +=== New Features
> +
> + APIs
> +ODP v1.12.0.0 has no API changes from previous v1.11.0 Monarch LTS.
> Version
> +is increased in current development release to make room for Monarch
> updates
> +numbers.
> +
> + Application Binary Interface (ABI) Support
> +Support is added to enable ODP applications to be binary compatible across
> +different implementations of ODP sharing the same Instruction Set
> Architecture
> +(ISA). This support introduces a new `configure` option:
> +
> +`no abi disable option`::
> +This is the default and specifies that the ODP library is to be built to
> +support ABI compatibility mode. In this mode ODP APIs are never inlined.
> ABI
> +compatibility ensures maximum application portability in cloud
> environments.
> +
> +`--disable-abi-compat`::
> +Specify this option to enable the inlining of ODP APIs. This may result in
> +improved performance at the cost of ABI compatibility and is suitable for
> +applications running in embedded environments.
> +
> +Note that ODP applications retain source code portability between ODP
> +implementations regardless of the ABI mode chosen. To move to a different
> ODP
> +application running on a different ISA, code need simply be recompiled
> against
> +that target ODP implementation.
> +
> + SCTP Parsing Support
> +The ODP classifier adds support for recognizing Stream Control
> Transmission
> +Protocol (SCTP) packets. The APIs for this were previously not
> implemented.
> +
> +=== Packaging and Implementation Refinements
> +
> + Remove dependency on Linux headers
> +ODP no longer has a dependency on Linux headers. This will help make the
> +odp-linux reference implementation more easily portable to non-Linux
> +environments.
> +
> + Remove dependency on helpers
> +The odp-linux implementation has been made independent of the helper
> library
> +to avoid circular dependency issues with packaging. Helper functions may
> use
> +ODP APIs, however ODP implementations should not use helper functions.
> +
> + Reorganization of `test` directory
> +The `test` directory has been reorganized to better support a unified
> approach
> +to ODP component testing. API tests now live in
> +`test/common_plat/validation/api` instead of the former
> +`test/validation`. With this change performance and validation tests, as
> well
> +as common and platform-specific tests can all be part of a unified test
> +hierarchy.
> +
> +The resulting test tree now looks like:
> +
> +.New `test` directory hierarchy
> +-
> +test
> +├── common_plat
> +│   ├── common
> +│   ├── m4
> +│   ├── miscellaneous
> +│   ├── performance
> +│   └── validation
> +│   └── api
> +│   ├── atomic
> +│   ├── barrier
> +│   ├── buffer
> +│   ├── classification
> +│   ├── cpumask
> +│   ├── crypto
> +│   ├── errno
> +│   ├── hash
> +│   ├── init
> +│   ├── lock
> +│   ├── packet
> +│   ├── pktio
> +│   ├── pool
> +│   ├── queue
> +│   ├── random
> +│   ├── scheduler
> +│   ├── shmem
> +│   ├── std_clib
> +│   ├── system
> +│   ├── thread
> +│   ├── time
> +│   ├── timer
> +│   └── traffic_mngr
> +├── linux-generic
> +│   ├── m4
> +│   ├── mmap_vlan_ins
> +│   ├── performance
> +│   ├── pktio_ipc
> +│   ├── ring
> +│   └── validation
> +│   └── api
> +│   ├── pktio
> +│   └── shmem
> +└── m4
> +-
> +
> + Pools
> +The maximum number of pools that may be created in the odp-linux reference
> +implementation has been raised from 16 to 64.
> +
> + Upgrade to DPDK 16.07
> +The DPDK pktio support in odp-linux has been upgraded to work with DPDK
> 16.07.
> +A number of miscellaneous fixes and performance improvements in this
> support
> +are also present.
> +
> + PktIO TAP Interface Classifier Support
> +Packet I/O interfaces operating in TAP mode now can feed packets to the
> ODP
> +classifier the same as other pktio modes can do.
> +
> +=== Performance Improvements
> +
> + Burst-mode buffer allocation
> +The scheduler and pktio components have been reworked to use burst-mode
> +buffer allocation/deallocation, yielding a measurable performance gain in
> +almost all cases.
> +
> + Burst-mode queue operations
> +ODP queues internally now attempt to use burst-mode enq/deq operations to
> +accelerate performance where applicable.
> +
> + Ring-based Scheduler Priority Queues
> +The ODP schedule

Re: [lng-odp] [PATCHv10 1/3] configure.ac: disable shared library for non abi compat mode

2016-12-01 Thread Mike Holmes
On 30 November 2016 at 15:32, Maxim Uvarov  wrote:

> original configure.ac enables abi compat mode by default,
> even without --enable-abi-compat provided. And has broken
> logic to disable abi compat mode. Correct logic to build abi
> compat mode and option to disable it. Shared library is not
> needed for non abi compat mode, so turn it off.
>
> Signed-off-by: Maxim Uvarov 
>

Reviewed-by: Mike Holmes 


> ---
>  configure.ac | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index be5a292..b460a65 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -241,13 +241,11 @@ ODP_CFLAGS="$ODP_CFLAGS -DODP_DEBUG=$ODP_DEBUG"
>  ODP_ABI_COMPAT=1
>  abi_compat=yes
>  AC_ARG_ENABLE([abi-compat],
> -[  --enable-abi-compat build all targets in ABI compatible mode
> (default=yes)],
> -[if test "x$enableval" = "xyes"; then
> -   ODP_ABI_COMPAT=1
> -   abi_compat=yes
> - else
> +[  --disable-abi-compat disables ABI compatible mode, enables
> inline code in header files],
> +[if test "x$enableval" = "xno"; then
> ODP_ABI_COMPAT=0
> abi_compat=no
> +   enable_shared=no
>  fi])
>  AC_SUBST(ODP_ABI_COMPAT)
>
> @@ -336,6 +334,7 @@ AC_MSG_RESULT([
> static libraries:   ${enable_static}
> shared libraries:   ${enable_shared}
> ABI compatible: ${abi_compat}
> +   ODP_ABI_COMPAT: ${ODP_ABI_COMPAT}
> cunit:  ${cunit_support}
> test_vald:  ${test_vald}
> test_perf:  ${test_perf}
> --
> 2.7.1.250.gff4ea60
>
>


-- 
Mike Holmes
Program Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"


Re: [lng-odp] [PATCHv10 3/3] configure.ac update version numbers

2016-12-01 Thread Anders Roxell
On 2016-12-01 17:29, Maxim Uvarov wrote:
> On 12/01/16 17:25, Mike Holmes wrote:
> >Thanks Steve
> >
> >In an effort to close on this
> >
> >Steve and Anders both say it should be
> >
> >-ODPHELPER_LIBSO_VERSION=110:0:1
> >+ODPHELPER_LIBSO_VERSION=110:1:1
> >
> >And they are our best resource, so I consider myself educated :)
> >Maxim / Anders, can we take the curl text and add any extra explanation we
> >need before adding it to our docs?
> >
> >Mike
> 
> OK, if the first two patches are fine, then please add your Reviewed-by and I
> will respin the latest with those changes and the comment from curl.

Please add a link to the GNU documentation [1], which describes it as well.
[1] 
https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
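
For reference, the update rules from that page can be sketched in C roughly as
follows. This is only an illustration: the struct, enum and function names are
made up here and are not part of ODP or libtool, and "changed" interfaces are
folded into the "removed" case, as the GNU manual does.

```c
/* Illustrative only: these types and names are hypothetical, not ODP/libtool API. */
struct libtool_version {
	unsigned current;
	unsigned revision;
	unsigned age;
};

enum abi_change {
	SOURCE_ONLY,        /* code changed, interfaces untouched */
	INTERFACES_ADDED,   /* backward compatible additions only */
	INTERFACES_REMOVED  /* interfaces removed or changed */
};

/* Return the -version-info triplet to use for the next release. */
static struct libtool_version next_version(struct libtool_version v,
					   enum abi_change change)
{
	switch (change) {
	case SOURCE_ONLY:
		v.revision++;        /* C:R+1:A */
		break;
	case INTERFACES_ADDED:
		v.current++;         /* C+1:0:A+1 */
		v.revision = 0;
		v.age++;
		break;
	case INTERFACES_REMOVED:
		v.current++;         /* C+1:0:0 */
		v.revision = 0;
		v.age = 0;
		break;
	}
	return v;
}
```

With these rules, a source-only change to 110:0:1 gives 110:1:1, which is the
helper library value proposed earlier in this thread.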

Cheers,
Anders

> 
> Maxim.
> 
> >
> >On 1 December 2016 at 09:17, Steve McIntyre wrote:
> >
> >On Thu, Dec 01, 2016 at 04:05:14PM +0300, Maxim Uvarov wrote:
> >>On 12/01/16 01:09, Anders Roxell wrote:
> >>> Since the ABI isn't changed we shouldn't bump the age only the 
> > revision.
> >>> The curl project [1] describes the rules in an easier way.
> >>>
> >>>[1]https://github.com/curl/curl/blob/master/lib/Makefile.am#L95
> >
> >>
> >>The curl project is not official documentation for autotools, so we can take
> >>it into account but cannot just follow it.
> >>
> >>So we have 2 official links describing those numbers:
> >>
> >>1. https://autotools.io/libtool/version.html
> >>2. https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
> >>
> >>From link 1:
> >>"Always increase the revision value. "
> >>
> >>From link 2:
> >>"If the library source code has changed at all since the last update, then
> >>increment revision (‘c:r:a’ becomes ‘c:r+1:a’)."
> >
> >Hi guys,
> >
> >Two things here...
> >
> >1. Maxim's two docs say exactly what the curl doc says - just in
> >   different language. Also from link 1:
> >
> >   """
> >   Warning
> >
> >   A common mistake is to assume that the three values passed to
> >   -version-info map directly into the three numbers at the end of the
> >   library name. This is not the case, and indeed, current, revision
> >   and age are applied differently depending on the operating system
> >   that one is using.
> >   """
> >
> >   The libtool -version-info stuff is *horrendously* confusing for
> >   many people precisely because of this awful mismatch :-( WTF
> >   they've defined things this way I have no idea...
> >
> >2. That just describes the *revision*, however. You've also increased
> >   the *age* by 2, and that's what Anders was complaining about. From
> >   the doc you have referenced here (link 1), increasing the *age* but
> >   not touching *current* makes no sense:
> >
> >   * Increase the current value whenever an interface has been
> >added, removed or changed.
> >   * Increase the age value only if the changes made to the ABI
> >are backward compatible.
> >
> >   The curl doc again agrees with that.
> >
> >Do these two points make sense to people?
> >
> >Cheers,
> >--
> >Steve McIntyre steve.mcint...@linaro.org
> >
> > Linaro.org | Open source software for ARM
> >SoCs
> >
> >
> >
> >
> >-- 
> >Mike Holmes
> >Program Manager - Linaro Networking Group
> >Linaro.org │ Open source software for ARM SoCs
> >"Work should be fun and collaborative, the rest follows"
> >
> 

-- 
Anders Roxell
anders.rox...@linaro.org
M: +46 709 71 42 85 | IRC: roxell


Re: [lng-odp] [PATCH] linux-gen: _fdserver: request sigterm if parent dies

2016-12-01 Thread Mike Holmes
On 25 November 2016 at 09:01, Christophe Milard <
christophe.mil...@linaro.org> wrote:

> _fdserver now requests SIGTERM if its parent process (the ODP instantiation
> process) dies, so that it does not become an orphan reparented to the
> init process.
>
> Signed-off-by: Christophe Milard 
>

Reviewed-by: Mike Holmes 
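
For anyone following along, the pattern can be sketched outside of ODP roughly
as below. This is a minimal, Linux-only illustration under my own assumptions:
the helper name and its parent_pid argument are hypothetical and not part of
the patch, and the getppid() check is an extra guard for the case where the
parent exits before prctl() takes effect.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, meant to be called in the child right after fork().
 * parent_pid is the parent's PID as sampled before forking.
 * Returns 0 on success, -1 if prctl() failed or the parent is already gone.
 */
static int request_sigterm_on_parent_death(pid_t parent_pid)
{
	/* Ask the kernel to send SIGTERM to this process when its parent dies */
	if (prctl(PR_SET_PDEATHSIG, SIGTERM) < 0)
		return -1;

	/* The parent may have exited before the request was registered */
	if (getppid() != parent_pid)
		return -1;

	return 0;
}
```

A child would typically call request_sigterm_on_parent_death(ppid) with ppid
taken from getpid() in the parent before fork(), and exit if it returns -1.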


> ---
>  platform/linux-generic/_fdserver.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/platform/linux-generic/_fdserver.c b/platform/linux-generic/_fdserver.c
> index 41a630b..9aed7a9 100644
> --- a/platform/linux-generic/_fdserver.c
> +++ b/platform/linux-generic/_fdserver.c
> @@ -41,6 +41,8 @@
>  #include 
>  #include 
>  #include <_fdserver_internal.h>
> +#include 
> +#include 
>
>  #include 
>  #include 
> @@ -622,6 +624,10 @@ int _odp_fdserver_init_global(void)
> /* TODO: pin the server on appropriate service cpu mask */
> /* when (if) we can agree on the usage of service mask  */
>
> +   /* request to be killed if parent dies, hence avoiding  */
> +   /* orphans being "adopted" by the init process...   */
> +   prctl(PR_SET_PDEATHSIG, SIGTERM);
> +
> /* allocate the space for the file descriptor<->key table:
> */
> fd_table = malloc(FDSERVER_MAX_ENTRIES *
> sizeof(fdentry_t));
> if (!fd_table) {
> --
> 2.7.4
>
>


-- 
Mike Holmes
Program Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"


[lng-odp] [PATCH] configure.ac: fix builds from raw git tar

2016-12-01 Thread Mike Holmes
Signed-off-by: Mike Holmes 
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 48fe0be..5c7ddd0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1,5 +1,5 @@
 AC_PREREQ([2.5])
-AC_INIT([OpenDataPlane], m4_esyscmd(./scripts/git_hash.sh .), 
[lng-odp@lists.linaro.org])
+AC_INIT([OpenDataPlane], m4_esyscmd_s(./scripts/git_hash.sh .), 
[lng-odp@lists.linaro.org])
 AM_INIT_AUTOMAKE([1.9 tar-pax subdir-objects])
 AC_CONFIG_SRCDIR([helper/config.h.in])
 AM_CONFIG_HEADER([helper/config.h])
-- 
2.9.3



[lng-odp] [MONARCH PATCH] configure.ac: fix builds from raw git tar

2016-12-01 Thread Mike Holmes
Signed-off-by: Mike Holmes 
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 48fe0be..5c7ddd0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1,5 +1,5 @@
 AC_PREREQ([2.5])
-AC_INIT([OpenDataPlane], m4_esyscmd(./scripts/git_hash.sh .), 
[lng-odp@lists.linaro.org])
+AC_INIT([OpenDataPlane], m4_esyscmd_s(./scripts/git_hash.sh .), 
[lng-odp@lists.linaro.org])
 AM_INIT_AUTOMAKE([1.9 tar-pax subdir-objects])
 AC_CONFIG_SRCDIR([helper/config.h.in])
 AM_CONFIG_HEADER([helper/config.h])
-- 
2.9.3



Re: [lng-odp] [PATCH] configure.ac: fix builds from raw git tar

2016-12-01 Thread Mike Holmes
I missed the prefix for MONARCH - resending

On 1 December 2016 at 14:21, Mike Holmes  wrote:

> Signed-off-by: Mike Holmes 
> ---
>  configure.ac | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 48fe0be..5c7ddd0 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1,5 +1,5 @@
>  AC_PREREQ([2.5])
> -AC_INIT([OpenDataPlane], m4_esyscmd(./scripts/git_hash.sh .), [
> lng-odp@lists.linaro.org])
> +AC_INIT([OpenDataPlane], m4_esyscmd_s(./scripts/git_hash.sh .), [
> lng-odp@lists.linaro.org])
>  AM_INIT_AUTOMAKE([1.9 tar-pax subdir-objects])
>  AC_CONFIG_SRCDIR([helper/config.h.in])
>  AM_CONFIG_HEADER([helper/config.h])
> --
> 2.9.3
>
>


-- 
Mike Holmes
Program Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"


Re: [lng-odp] [PATCHv3] platform: linux-generic: reading cpu affinity from cpuset

2016-12-01 Thread Brian Brooks
On 11/28 15:34:06, Balakrishna Garapati wrote:
> With this new proposal, CPU affinity is read correctly, especially
> when using cgroups; otherwise the wrong CPU mask is set.
> 
> Fixes bug: https://bugs.linaro.org/show_bug.cgi?id=2472
> 
> Signed-off-by: Balakrishna Garapati 
> ---
> 
>  v1 to v2: added Description of the issue to the patch commit log.
>  v2 to v3: Resending the patch adding the log change from v1 to v2
> 
>  platform/linux-generic/odp_cpumask.c | 69 
> +---
>  1 file changed, 16 insertions(+), 53 deletions(-)
> 
> diff --git a/platform/linux-generic/odp_cpumask.c 
> b/platform/linux-generic/odp_cpumask.c
> index 6bf2632..7b0d80a 100644
> --- a/platform/linux-generic/odp_cpumask.c
> +++ b/platform/linux-generic/odp_cpumask.c
> @@ -227,71 +227,34 @@ int odp_cpumask_next(const odp_cpumask_t *mask, int cpu)
>   */
>  static int get_installed_cpus(void)

Should this function be renamed to something like get_available_cpus,
since it returns the set of CPUs on which the calling thread is eligible
to run rather than the set of CPUs in the entire system?

>  {
> - char *numptr;
> - char *endptr;
> - long int cpu_idnum;
> - DIR  *d;
> - struct dirent *dir;
> + int cpu_idnum;
> + cpu_set_t cpuset;
> + int ret;
> 
>   /* Clear the global cpumasks for control and worker CPUs */
>   odp_cpumask_zero(&odp_global_data.control_cpus);
>   odp_cpumask_zero(&odp_global_data.worker_cpus);
> 
> - /*
> -  * Scan the /sysfs pseudo-filesystem for CPU info directories.
> -  * There should be one subdirectory for each installed logical CPU
> -  */
> - d = opendir("/sys/devices/system/cpu");
> - if (d) {
> - while ((dir = readdir(d)) != NULL) {
> - cpu_idnum = CPU_SETSIZE;
> -
> - /*
> -  * If the current directory entry doesn't represent
> -  * a CPU info subdirectory then skip to the next entry.
> -  */
> - if (dir->d_type == DT_DIR) {
> - if (!strncmp(dir->d_name, "cpu", 3)) {
> - /*
> -  * Directory name starts with "cpu"...
> -  * Try to extract a CPU ID number
> -  * from the remainder of the dirname.
> -  */
> - errno = 0;
> - numptr = dir->d_name;
> - numptr += 3;
> - cpu_idnum = strtol(numptr, &endptr,
> -10);
> - if (errno || (endptr == numptr))
> - continue;
> - } else {
> - continue;
> - }
> - } else {
> - continue;
> - }
> - /*
> -  * If we get here the current directory entry specifies
> -  * a CPU info subdir for the CPU indexed by cpu_idnum.
> -  */
> + CPU_ZERO(&cpuset);
> + ret = sched_getaffinity(0, sizeof(cpuset), &cpuset);

It would be great to add a note in the ODP spec that the application thread
calling odp_global_init() must double check its cpuset. E.g. if called from
a control plane thread that has already affinitized to control plane CPUs,
once odp_global_init() is called will the underlying ODP implementation
(and ODP helper) only know about the cpuset of the control plane cores?
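
To illustrate the concern, here is a minimal standalone sketch (not ODP code;
both helper names are made up for this example) showing that
sched_getaffinity(0, ...) reports only the calling thread's mask. After the
thread pins itself to a single CPU, the count drops to one, which is what an
implementation initialized from that thread would presumably see. The scan
bound of CPU_SETSIZE is my choice for the sketch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for sched_getaffinity() and the CPU_* macros */
#endif
#include <sched.h>

/* Hypothetical helper: number of CPUs the calling thread may run on, or -1 */
static int count_available_cpus(void)
{
	cpu_set_t cpuset;

	CPU_ZERO(&cpuset);
	if (sched_getaffinity(0, sizeof(cpuset), &cpuset) < 0)
		return -1;

	return CPU_COUNT(&cpuset);
}

/* Hypothetical helper: pin the calling thread to its lowest allowed CPU */
static int pin_to_first_available_cpu(void)
{
	cpu_set_t cpuset, one;
	int cpu;

	CPU_ZERO(&cpuset);
	if (sched_getaffinity(0, sizeof(cpuset), &cpuset) < 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, &cpuset)) {
			CPU_ZERO(&one);
			CPU_SET(cpu, &one);
			return sched_setaffinity(0, sizeof(one), &one);
		}
	}

	return -1; /* empty mask, should not happen */
}
```

Calling count_available_cpus() before and after pin_to_first_available_cpu()
shows the mask shrinking from whatever the thread inherited down to one CPU.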

> - /* Track number of logical CPUs discovered */
> - if (odp_global_data.num_cpus_installed <
> - (int)(cpu_idnum + 1))
> - odp_global_data.num_cpus_installed =
> - (int)(cpu_idnum + 1);
> + if (ret < 0) {
> + ODP_ERR("Failed to get cpu affinity");
> + return -1;
> + }
> 
> + for (cpu_idnum = 0; cpu_idnum < CPU_SETSIZE - 1; cpu_idnum++) {
> + if (CPU_ISSET(cpu_idnum, &cpuset)) {
> + odp_global_data.num_cpus_installed++;
>   /* Add the CPU to our default cpumasks */
>   odp_cpumask_set(&odp_global_data.control_cpus,
> - (int)cpu_idnum);
> + (int)cpu_idnum);
>   odp_cpumask_set(&odp_global_data.worker_cpus,
> - (int)cpu_idnum);
> + (int)cpu_idnum);
>   }
> - closedir(d);
> - return 0;
> - } else {
> -

Re: [lng-odp] [PATCHv10 1/3] configure.ac: disable shared library for non abi compat mode

2016-12-01 Thread Maxim Uvarov
Merged.

On 12/01/16 21:37, Mike Holmes wrote:
> 
> 
> On 30 November 2016 at 15:32, Maxim Uvarov wrote:
> 
> original configure.ac enables abi compat mode by default,
> even without --enable-abi-compat provided. And has broken
> logic to disable abi compat mode. Correct logic to build abi
> compat mode and option to disable it. Shared library is not
> needed for non abi compat mode, so turn it off.
> 
> Signed-off-by: Maxim Uvarov
> 
> 
> Reviewed-by: Mike Holmes
>  
> 
> ---
>  configure.ac | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index be5a292..b460a65 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -241,13 +241,11 @@ ODP_CFLAGS="$ODP_CFLAGS -DODP_DEBUG=$ODP_DEBUG"
>  ODP_ABI_COMPAT=1
>  abi_compat=yes
>  AC_ARG_ENABLE([abi-compat],
> -[  --enable-abi-compat build all targets in ABI compatible
> mode (default=yes)],
> -[if test "x$enableval" = "xyes"; then
> -   ODP_ABI_COMPAT=1
> -   abi_compat=yes
> - else
> +[  --disable-abi-compat disables ABI compatible mode,
> enables inline code in header files],
> +[if test "x$enableval" = "xno"; then
> ODP_ABI_COMPAT=0
> abi_compat=no
> +   enable_shared=no
>  fi])
>  AC_SUBST(ODP_ABI_COMPAT)
> 
> @@ -336,6 +334,7 @@ AC_MSG_RESULT([
> static libraries:   ${enable_static}
> shared libraries:   ${enable_shared}
> ABI compatible: ${abi_compat}
> +   ODP_ABI_COMPAT: ${ODP_ABI_COMPAT}
> cunit:  ${cunit_support}
> test_vald:  ${test_vald}
> test_perf:  ${test_perf}
> --
> 2.7.1.250.gff4ea60
> 
> 
> 
> 
> -- 
> Mike Holmes
> Program Manager - Linaro Networking Group
> Linaro.org │ Open source software for ARM SoCs
> "Work should be fun and collaborative, the rest follows"
> 
> 
> 



Re: [lng-odp] [PATCHv10 2/3] changelog: summary of changes for odp v1.12.0.0

2016-12-01 Thread Maxim Uvarov
Merged.

On 12/01/16 21:36, Mike Holmes wrote:
> 
> 
> On 30 November 2016 at 15:32, Maxim Uvarov wrote:
> 
> From: Bill Fischofer
> 
> Signed-off-by: Bill Fischofer
> Signed-off-by: Maxim Uvarov
> 
> 
> Reviewed-by: Mike Holmes
> 
>  
> 
> ---
>  CHANGELOG | 177
> ++
>  1 file changed, 177 insertions(+)
> 
> diff --git a/CHANGELOG b/CHANGELOG
> index 1d652a8..17afe44 100644
> --- a/CHANGELOG
> +++ b/CHANGELOG
> @@ -1,3 +1,180 @@
> +== OpenDataPlane (1.12.0.0)
> +
> +=== New Features
> +
> + APIs
> +ODP v1.12.0.0 has no API changes from previous v1.11.0 Monarch LTS.
> Version
> +is increased in current development release to make room for
> Monarch updates
> +numbers.
> +
> + Application Binary Interface (ABI) Support
> +Support is added to enable ODP applications to be binary compatible
> across
> +different implementations of ODP sharing the same Instruction Set
> Architecture
> +(ISA). This support introduces a new `configure` option:
> +
> +`no abi disable option`::
> +This is the default and specifies that the ODP library is to be
> built to
> +support ABI compatibility mode. In this mode ODP APIs are never
> inlined. ABI
> +compatibility ensures maximum application portability in cloud
> environments.
> +
> +`--disable-abi-compat`::
> +Specify this option to enable the inlining of ODP APIs. This may
> result in
> +improved performance at the cost of ABI compatibility and is
> suitable for
> +applications running in embedded environments.
> +
> +Note that ODP applications retain source code portability between ODP
> +implementations regardless of the ABI mode chosen. To move to a
> different ODP
> +application running on a different ISA, code need simply be
> recompiled against
> +that target ODP implementation.
> +
> + SCTP Parsing Support
> +The ODP classifier adds support for recognizing Stream Control
> Transmission
> +Protocol (SCTP) packets. The APIs for this were previously not
> implemented.
> +
> +=== Packaging and Implementation Refinements
> +
> + Remove dependency on Linux headers
> +ODP no longer has a dependency on Linux headers. This will help
> make the
> +odp-linux reference implementation more easily portable to non-Linux
> +environments.
> +
> + Remove dependency on helpers
> +The odp-linux implementation has been made independent of the
> helper library
> +to avoid circular dependency issues with packaging. Helper
> functions may use
> +ODP APIs, however ODP implementations should not use helper functions.
> +
> + Reorganization of `test` directory
> +The `test` directory has been reorganized to better support a
> unified approach
> +to ODP component testing. API tests now live in
> +`test/common_plat/validation/api` instead of the former
> +`test/validation`. With this change performance and validation
> tests, as well
> +as common and platform-specific tests can all be part of a unified test
> +hierarchy.
> +
> +The resulting test tree now looks like:
> +
> +.New `test` directory hierarchy
> +-
> +test
> +├── common_plat
> +│   ├── common
> +│   ├── m4
> +│   ├── miscellaneous
> +│   ├── performance
> +│   └── validation
> +│   └── api
> +│   ├── atomic
> +│   ├── barrier
> +│   ├── buffer
> +│   ├── classification
> +│   ├── cpumask
> +│   ├── crypto
> +│   ├── errno
> +│   ├── hash
> +│   ├── init
> +│   ├── lock
> +│   ├── packet
> +│   ├── pktio
> +│   ├── pool
> +│   ├── queue
> +│   ├── random
> +│   ├── scheduler
> +│   ├── shmem
> +│   ├── std_clib
> +│   ├── system
> +│   ├── thread
> +│   ├── time
> +│   ├── timer
> +│   └── traffic_mngr
> +├── linux-generic
> +│   ├── m4
> +│   ├── mmap_vlan_ins
> +│   ├── performance
> +│   ├── pktio_ipc
> +│   ├── ring
> +│   └── validation
> +│   └── api
> +│   ├── pktio
> +│   └── shmem
> +└── m4
> +-
> +
> + Pools
> +The maximum number of pools that may be created in the odp-linux
> reference
> +implementation has been raised from 16 to 64.
> +
> + Upgrade to DPDK 16.07
> +The DPDK pktio support in od

[lng-odp] [PATCH] configure.ac update version numbers

2016-12-01 Thread Maxim Uvarov
Update numbers for .so and add short description.

Signed-off-by: Maxim Uvarov 
---
 configure.ac | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index b460a65..3e89b0a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3,7 +3,7 @@ AC_PREREQ([2.5])
 # Set correct API version
 ##
 m4_define([odpapi_generation_version], [1])
-m4_define([odpapi_major_version], [11])
+m4_define([odpapi_major_version], [12])
 m4_define([odpapi_minor_version], [0])
 m4_define([odpapi_point_version], [0])
 m4_define([odpapi_version],
@@ -28,12 +28,20 @@ AC_CONFIG_MACRO_DIR([m4])
 AM_SILENT_RULES([yes])
 
 ##
-# Set correct platform library version
+# Set platform library version
+#
+# Follow version rules described here:
+# https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
+# Version is Current:Revision:Age
+# 1. if there are only source changes, use C:R+1:A
+# 2. if interfaces were added use C+1:0:A+1
+# 3. if interfaces were removed, then use C+1:0:0
 ##
-ODP_LIBSO_VERSION=111:0:0
+
+ODP_LIBSO_VERSION=112:0:0
 AC_SUBST(ODP_LIBSO_VERSION)
 
-ODPHELPER_LIBSO_VERSION=110:0:1
+ODPHELPER_LIBSO_VERSION=110:1:1
 AC_SUBST(ODPHELPER_LIBSO_VERSION)
 
 # Checks for programs.
-- 
2.7.1.250.gff4ea60



Re: [lng-odp] [API-NEXT PATCHv2] linux-generic: pool: reset origin_qe on buffer allocation

2016-12-01 Thread Maxim Uvarov
Merged,
Maxim.

On 12/01/16 06:25, Yi He wrote:
> Yes, I agree this fix helps make check pass in recent development, and
> possibly in future bisects.
> 
> Reviewed-and-tested-by: Yi He 
> 
> On 1 December 2016 at 07:08, Bill Fischofer 
> wrote:
> 
>> Resolve bug https://bugs.linaro.org/show_bug.cgi?id=2622 by
>> re-initializing origin_qe to NULL when a buffer is allocated. This step
>> was omitted in the switch to ring pool allocation introduced in
>> commit ID c8cf1d87783d4b4c628f219803b78731b8d4ade4
>>
>> Signed-off-by: Bill Fischofer 
>> ---
>> Changes in v2:
>> - Review comments from Maxim. Move init to earlier loops for completeness
>> and
>>   efficiency.
>>
>>  platform/linux-generic/odp_pool.c | 17 ++---
>>  1 file changed, 10 insertions(+), 7 deletions(-)
>>
>> diff --git a/platform/linux-generic/odp_pool.c
>> b/platform/linux-generic/odp_pool.c
>> index 4be3827..8c38c93 100644
>> --- a/platform/linux-generic/odp_pool.c
>> +++ b/platform/linux-generic/odp_pool.c
>> @@ -588,6 +588,7 @@ int buffer_alloc_multi(pool_t *pool, odp_buffer_t
>> buf[],
>> uint32_t mask, i;
>> pool_cache_t *cache;
>> uint32_t cache_num, num_ch, num_deq, burst;
>> +   odp_buffer_hdr_t *hdr;
>>
>> ring  = &pool->ring.hdr;
>> mask  = pool->ring_mask;
>> @@ -608,8 +609,13 @@ int buffer_alloc_multi(pool_t *pool, odp_buffer_t
>> buf[],
>> }
>>
>> /* Get buffers from the cache */
>> -   for (i = 0; i < num_ch; i++)
>> +   for (i = 0; i < num_ch; i++) {
>> buf[i] = cache->buf[cache_num - num_ch + i];
>> +   hdr = buf_hdl_to_hdr(buf[i]);
>> +   hdr->origin_qe = NULL;
>> +   if (buf_hdr)
>> +   buf_hdr[i] = hdr;
>> +   }
>>
>> /* If needed, get more from the global pool */
>> if (odp_unlikely(num_deq)) {
>> @@ -629,9 +635,11 @@ int buffer_alloc_multi(pool_t *pool, odp_buffer_t
>> buf[],
>> uint32_t idx = num_ch + i;
>>
>> buf[idx] = (odp_buffer_t)(uintptr_t)data[i];
>> +   hdr = buf_hdl_to_hdr(buf[idx]);
>> +   hdr->origin_qe = NULL;
>>
>> if (buf_hdr) {
>> -   buf_hdr[idx] = buf_hdl_to_hdr(buf[idx]);
>> +   buf_hdr[idx] = hdr;
>> /* Prefetch newly allocated and soon to be
>> used
>>  * buffer headers. */
>> odp_prefetch(buf_hdr[idx]);
>> @@ -648,11 +656,6 @@ int buffer_alloc_multi(pool_t *pool, odp_buffer_t
>> buf[],
>> cache->num = cache_num - num_ch;
>> }
>>
>> -   if (buf_hdr) {
>> -   for (i = 0; i < num_ch; i++)
>> -   buf_hdr[i] = buf_hdl_to_hdr(buf[i]);
>> -   }
>> -
>> return num_ch + num_deq;
>>  }
>>
>> --
>> 2.7.4
>>
>>



[lng-odp] [RFC API-NEXT PATCHv2 2/4] api: timer: add odp_timer_pool_from_timer()

2016-12-01 Thread Brian Brooks
Signed-off-by: Brian Brooks 
---
 include/odp/api/spec/timer.h  | 9 +
 platform/linux-generic/odp_timer.c| 5 +
 test/common_plat/validation/api/timer/timer.c | 2 ++
 3 files changed, 16 insertions(+)

diff --git a/include/odp/api/spec/timer.h b/include/odp/api/spec/timer.h
index 2e59ace..43f2d58 100644
--- a/include/odp/api/spec/timer.h
+++ b/include/odp/api/spec/timer.h
@@ -200,6 +200,15 @@ int odp_timer_pool_info(odp_timer_pool_t tpid,
 uint64_t odp_timer_pool_resolution(odp_timer_pool_t tpid);
 
 /**
+ * Get timer pool from timer
+ *
+ * @param tim Timer handle
+ *
+ * @return Timer pool handle
+ */
+odp_timer_pool_t odp_timer_pool_from_timer(odp_timer_t tim);
+
+/**
  * Allocate a timer
  *
  * Create a timer (allocating all necessary resources e.g. timeout event) from
diff --git a/platform/linux-generic/odp_timer.c 
b/platform/linux-generic/odp_timer.c
index 2c14c93..89e0f52 100644
--- a/platform/linux-generic/odp_timer.c
+++ b/platform/linux-generic/odp_timer.c
@@ -842,6 +842,11 @@ uint64_t odp_timer_pool_resolution(odp_timer_pool_t tpid)
return tpid->param.res_ns;
 }
 
+odp_timer_pool_t odp_timer_pool_from_timer(odp_timer_t tim)
+{
+   return handle_to_tp(tim);
+}
+
 uint64_t odp_timer_pool_to_u64(odp_timer_pool_t tpid)
 {
return _odp_pri(tpid);
diff --git a/test/common_plat/validation/api/timer/timer.c 
b/test/common_plat/validation/api/timer/timer.c
index 0d0514a..0dff939 100644
--- a/test/common_plat/validation/api/timer/timer.c
+++ b/test/common_plat/validation/api/timer/timer.c
@@ -231,6 +231,8 @@ static void handle_tmo(odp_event_t ev, bool stale, uint64_t 
prev_tick)
uint64_t tick = odp_timeout_tick(tmo);
struct test_timer *ttp = odp_timeout_user_ptr(tmo);
 
+   CU_ASSERT(odp_timer_pool_from_timer(tim) == tp);
+
if (tim == ODP_TIMER_INVALID)
CU_FAIL("odp_timeout_timer() invalid timer");
if (!ttp)
-- 
2.7.4



[lng-odp] [RFC API-NEXT PATCHv2 1/4] api: timer: add odp_timer_pool_resolution()

2016-12-01 Thread Brian Brooks
Signed-off-by: Brian Brooks 
---
 include/odp/api/spec/timer.h  | 9 +
 platform/linux-generic/odp_timer.c| 5 +
 test/common_plat/validation/api/timer/timer.c | 2 ++
 3 files changed, 16 insertions(+)

diff --git a/include/odp/api/spec/timer.h b/include/odp/api/spec/timer.h
index 3f8fdd4..2e59ace 100644
--- a/include/odp/api/spec/timer.h
+++ b/include/odp/api/spec/timer.h
@@ -191,6 +191,15 @@ int odp_timer_pool_info(odp_timer_pool_t tpid,
odp_timer_pool_info_t *info);
 
 /**
+ * Get resolution from timer pool
+ *
+ * @param tpid Timer pool identifier
+ *
+ * @return Timeout resolution in nanoseconds
+ */
+uint64_t odp_timer_pool_resolution(odp_timer_pool_t tpid);
+
+/**
  * Allocate a timer
  *
  * Create a timer (allocating all necessary resources e.g. timeout event) from
diff --git a/platform/linux-generic/odp_timer.c 
b/platform/linux-generic/odp_timer.c
index ee4c4c0..2c14c93 100644
--- a/platform/linux-generic/odp_timer.c
+++ b/platform/linux-generic/odp_timer.c
@@ -837,6 +837,11 @@ int odp_timer_pool_info(odp_timer_pool_t tpid,
return 0;
 }
 
+uint64_t odp_timer_pool_resolution(odp_timer_pool_t tpid)
+{
+   return tpid->param.res_ns;
+}
+
 uint64_t odp_timer_pool_to_u64(odp_timer_pool_t tpid)
 {
return _odp_pri(tpid);
diff --git a/test/common_plat/validation/api/timer/timer.c 
b/test/common_plat/validation/api/timer/timer.c
index 0007639..0d0514a 100644
--- a/test/common_plat/validation/api/timer/timer.c
+++ b/test/common_plat/validation/api/timer/timer.c
@@ -529,6 +529,8 @@ void timer_test_odp_timer_all(void)
CU_ASSERT(tpinfo.param.max_tmo == MAX);
CU_ASSERT(strcmp(tpinfo.name, NAME) == 0);
 
+   CU_ASSERT(odp_timer_pool_resolution(tp) == RES);
+
LOG_DBG("Timer pool handle: %" PRIu64 "\n", odp_timer_pool_to_u64(tp));
LOG_DBG("#timers..: %u\n", NTIMERS);
LOG_DBG("Tmo range: %u ms (%" PRIu64 " ticks)\n", RANGE_MS,
-- 
2.7.4



[lng-odp] [RFC API-NEXT PATCHv2 4/4] timers: poll-mode timers

2016-12-01 Thread Brian Brooks
--enable-polled-timers runs timer expiration processing directly inside
odp_schedule() and avoids the use of timer pool background threads and
itimers.

odp_timers stress test shows improvements for test cases with higher resolution
and lower latency timers. The most noticeable improvement is in scenarios where
the number of timer pools is greater than the number of control plane cores.
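
The rate limiting this relies on can be sketched roughly as follows. This is a
minimal illustration and not the code in this patch: poll_state_t,
maybe_run_timers() and TIMER_RUN_NSEC are hypothetical names, and the caller
passes in the current time in nanoseconds instead of reading a cycle counter.

```c
#include <stdint.h>

/* Hypothetical minimum gap between two expiration passes, in nanoseconds. */
#define TIMER_RUN_NSEC 250u

/* Per-thread polling state (illustrative only). */
typedef struct {
	uint64_t last_run_ns; /* time of the last expiration pass */
	unsigned runs;        /* number of passes performed so far */
} poll_state_t;

/*
 * Called once per scheduling loop iteration. Runs timer expiration
 * processing only if at least TIMER_RUN_NSEC have elapsed since the
 * previous pass. Returns 1 if a pass was run, 0 otherwise.
 */
static int maybe_run_timers(poll_state_t *st, uint64_t now_ns)
{
	if (now_ns - st->last_run_ns < TIMER_RUN_NSEC)
		return 0;

	st->last_run_ns = now_ns;
	st->runs++; /* a real implementation would expire timers here */
	return 1;
}
```

Keeping the state per thread means no synchronisation is needed for the rate
check itself; only the expiration pass has to deal with shared timer pools.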

For further explanation please see:
https://docs.google.com/a/linaro.org/document/d/1AI0TFlIb3QJFAd3mJz74kPzMrLmiQgwmuOx31LlffZA/edit?usp=sharing

Signed-off-by: Brian Brooks 
---
 configure.ac   | 13 +
 .../linux-generic/include/odp_config_internal.h|  3 +
 platform/linux-generic/include/odp_time_internal.h |  2 +
 .../linux-generic/include/odp_timer_internal.h |  2 +
 platform/linux-generic/odp_schedule.c  | 40 +++--
 platform/linux-generic/odp_time.c  | 60 +++-
 platform/linux-generic/odp_timer.c | 65 +-
 7 files changed, 165 insertions(+), 20 deletions(-)

diff --git a/configure.ac b/configure.ac
index 4f6cc18..4f0d276 100644
--- a/configure.ac
+++ b/configure.ac
@@ -232,6 +232,19 @@ AC_ARG_ENABLE([tracing-timers],
 ODP_CFLAGS="$ODP_CFLAGS -DTRACING_TIMERS=$TRACING_TIMERS"
 
 ##
+# Enable/disable ODP_POLLED_TIMERS
+##
+ODP_POLLED_TIMERS=0
+AC_ARG_ENABLE([polled-timers],
+[  --enable-polled-timers  polled timers],
+[if test "x$enableval" = "xyes"; then
+   ODP_POLLED_TIMERS=1
+ else
+   ODP_POLLED_TIMERS=0
+fi])
+ODP_CFLAGS="$ODP_CFLAGS -DODP_POLLED_TIMERS=$ODP_POLLED_TIMERS"
+
+##
 # Enable/disable ODP_DEBUG_PRINT
 ##
 ODP_DEBUG_PRINT=0
diff --git a/platform/linux-generic/include/odp_config_internal.h 
b/platform/linux-generic/include/odp_config_internal.h
index b7ff610..eec9dac 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -118,6 +118,9 @@ extern "C" {
  */
 #define CONFIG_BURST_SIZE 16
 
+/* Value used to rate limit timer pool expiration processing. */
+#define ODP_CONFIG_TIMER_RUN_NSEC  (250)
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/platform/linux-generic/include/odp_time_internal.h 
b/platform/linux-generic/include/odp_time_internal.h
index 5a0bc75..1185f58 100644
--- a/platform/linux-generic/include/odp_time_internal.h
+++ b/platform/linux-generic/include/odp_time_internal.h
@@ -24,4 +24,6 @@ static inline uint64_t core_tick(void)
 #endif
 }
 
+uint64_t core_tick_diff_ns(uint64_t before, uint64_t after);
+
 #endif
diff --git a/platform/linux-generic/include/odp_timer_internal.h 
b/platform/linux-generic/include/odp_timer_internal.h
index b1cd73f..51959c0 100644
--- a/platform/linux-generic/include/odp_timer_internal.h
+++ b/platform/linux-generic/include/odp_timer_internal.h
@@ -39,4 +39,6 @@ typedef struct odp_timeout_hdr_stride {
uint8_t pad[ODP_CACHE_LINE_SIZE_ROUNDUP(sizeof(odp_timeout_hdr_t))];
 } odp_timeout_hdr_stride;
 
+int timer_run(void);
+
 #endif
diff --git a/platform/linux-generic/odp_schedule.c 
b/platform/linux-generic/odp_schedule.c
index 81e79c9..197fd72 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -22,6 +22,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 /* Number of priority levels  */
@@ -788,19 +790,31 @@ static int do_schedule(odp_queue_t *out_queue, 
odp_event_t out_ev[],
return 0;
 }
 
+#ifdef ODP_POLLED_TIMERS
+static __thread uint64_t last_timer_run;
+#endif
 
 static int schedule_loop(odp_queue_t *out_queue, uint64_t wait,
 odp_event_t out_ev[],
 unsigned int max_num)
 {
-   odp_time_t next, wtime;
-   int first = 1;
-   int ret;
-
-   while (1) {
-   ret = do_schedule(out_queue, out_ev, max_num);
+   uint64_t start, now;
+   int nr_events;
+
+   start = core_tick();
+
+   for (;;) {
+#ifdef ODP_POLLED_TIMERS
+   now = core_tick();
+   if (ODP_CONFIG_TIMER_RUN_NSEC <=
+   core_tick_diff_ns(last_timer_run, now)) {
+   last_timer_run = now;
+   (void)timer_run();
+   }
+#endif
+   nr_events = do_schedule(out_queue, out_ev, max_num);
 
-   if (ret)
+   if (nr_events)
break;
 
if (wait == ODP_SCHED_WAIT)
@@ -809,18 +823,12 @@ static int schedule_loop(odp_queue_t *out_queue, uint64_t 
wait,
if (wait == ODP_SCHED_NO_WAIT)
break;
 
-   if (first) {
-   

[lng-odp] [RFC API-NEXT PATCHv2 3/4] test: performance: add odp_timers

2016-12-01 Thread Brian Brooks
Add a timers stress test. Timer pool resolution, number of timers pools, number
of timers, queue type, and number of threads may be specified for each test
case.

Timestamps are used to ensure timeout events have been received by the program
no later than they should be. Timestamp statistics are printed.

Signed-off-by: Brian Brooks 
---
 configure.ac   |  22 +
 platform/linux-generic/Makefile.am |   1 +
 platform/linux-generic/include/odp_time_internal.h |  27 +
 platform/linux-generic/m4/odp_pthread.m4   |   2 +-
 platform/linux-generic/odp_timer.c |   7 +
 test/common_plat/performance/Makefile.am   |   6 +-
 test/common_plat/performance/odp_timers.c  | 913 +
 7 files changed, 976 insertions(+), 2 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_time_internal.h
 create mode 100644 test/common_plat/performance/odp_timers.c

diff --git a/configure.ac b/configure.ac
index b460a65..4f6cc18 100644
--- a/configure.ac
+++ b/configure.ac
@@ -71,6 +71,10 @@ AC_TYPE_INT32_T
 AC_TYPE_UINT32_T
 AC_TYPE_UINT64_T
 
+AC_CHECK_LIB([m], [cos])
+AC_CHECK_LIB([gslcblas], [cblas_dgemm])
+AC_CHECK_LIB([gsl], [gsl_hypot])
+
 #
 # Get GCC version
 #
@@ -210,6 +214,24 @@ DX_INIT_DOXYGEN($PACKAGE_NAME,
${builddir}/doc/platform-api-guide/output)
 
 ##
+# Event tracing
+##
+
+# Checks for --enable-tracing-timers and defines TRACING_TIMERS if found.
+#
+# This is experimental and stores tracing info inside the user-supplied
+# context associated with a timeout event.
+TRACING_TIMERS=0
+AC_ARG_ENABLE([tracing-timers],
+[  --enable-tracing-timers  trace timeout event scheduling],
+[if test "x$enableval" = "xyes"; then
+TRACING_TIMERS=1
+ else
+TRACING_TIMERS=0
+ fi])
+ODP_CFLAGS="$ODP_CFLAGS -DTRACING_TIMERS=$TRACING_TIMERS"
+
+##
 # Enable/disable ODP_DEBUG_PRINT
 ##
 ODP_DEBUG_PRINT=0
diff --git a/platform/linux-generic/Makefile.am 
b/platform/linux-generic/Makefile.am
index 22cf6f3..070ddcf 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -128,6 +128,7 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_schedule_ordered_internal.h \
  ${srcdir}/include/odp_sorted_list_internal.h \
  ${srcdir}/include/odp_shm_internal.h \
+ ${srcdir}/include/odp_time_internal.h \
  ${srcdir}/include/odp_timer_internal.h \
  ${srcdir}/include/odp_timer_wheel_internal.h \
  ${srcdir}/include/odp_traffic_mngr_internal.h \
diff --git a/platform/linux-generic/include/odp_time_internal.h 
b/platform/linux-generic/include/odp_time_internal.h
new file mode 100644
index 000..5a0bc75
--- /dev/null
+++ b/platform/linux-generic/include/odp_time_internal.h
@@ -0,0 +1,27 @@
+/* Copyright (c) 2016, Linaro Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef ODP_TIME_INTERNAL_H_
+#define ODP_TIME_INTERNAL_H_
+
+static inline uint64_t core_tick(void)
+{
+#if defined(__aarch64__)
+   uint64_t vct;
+   /* __asm__ volatile("isb" : : : "memory"); */
+   __asm__ volatile("mrs %0, cntvct_el0" : "=r"(vct));
+   return vct;
+#elif defined(__x86_64__)
+   uint64_t hi, lo;
+   /* __asm__ volatile("mfence" : : : "memory"); */
+   __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
+   return (hi << 32) | lo;
+#else
+#error Please add support for your core in odp_time_internal.h
+#endif
+}
+
+#endif
diff --git a/platform/linux-generic/m4/odp_pthread.m4 
b/platform/linux-generic/m4/odp_pthread.m4
index 7f39103..b5705b2 100644
--- a/platform/linux-generic/m4/odp_pthread.m4
+++ b/platform/linux-generic/m4/odp_pthread.m4
@@ -10,4 +10,4 @@ LIBS="$PTHREAD_LIBS $LIBS"
 AM_CFLAGS="$AM_CFLAGS $PTHREAD_CFLAGS"
 AM_LDFLAGS="$AM_LDFLAGS $PTHREAD_LDFLAGS"
 
-AM_LDFLAGS="$AM_LDFLAGS -pthread -lrt"
+AM_LDFLAGS="$AM_LDFLAGS -pthread -lrt -lm"
diff --git a/platform/linux-generic/odp_timer.c 
b/platform/linux-generic/odp_timer.c
index 89e0f52..ad44ede 100644
--- a/platform/linux-generic/odp_timer.c
+++ b/platform/linux-generic/odp_timer.c
@@ -51,6 +51,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #define TMO_UNUSED   ((uint64_t)0x)
@@ -627,6 +628,12 @@ static unsigned timer_expire(odp_timer_pool *tp, uint32_t 
idx, uint64_t tick)
}
/* Else ignore events of other types */
/* Post the timeout to

Re: [lng-odp] [API-NEXT PATCHv2] linux-gen: pool add missing eof for error prints

2016-12-01 Thread Bill Fischofer
On Thu, Dec 1, 2016 at 9:59 AM, Maxim Uvarov  wrote:
> During debug found missing end of lines in debug prints.
>
> Signed-off-by: Maxim Uvarov 

Reviewed-by: Bill Fischofer 

> ---
>  platform/linux-generic/odp_pool.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/platform/linux-generic/odp_pool.c 
> b/platform/linux-generic/odp_pool.c
> index 4be3827..f63efb6 100644
> --- a/platform/linux-generic/odp_pool.c
> +++ b/platform/linux-generic/odp_pool.c
> @@ -285,7 +285,7 @@ static odp_pool_t pool_create(const char *name, 
> odp_pool_param_t *params,
> char uarea_name[ODP_POOL_NAME_LEN + sizeof(postfix)];
>
> if (params == NULL) {
> -   ODP_ERR("No params");
> +   ODP_ERR("No params\n");
> return ODP_POOL_INVALID;
> }
>
> @@ -300,7 +300,7 @@ static odp_pool_t pool_create(const char *name, 
> odp_pool_param_t *params,
> /* Validate requested buffer alignment */
> if (align > ODP_CONFIG_BUFFER_ALIGN_MAX ||
> align != ODP_ALIGN_ROUNDDOWN_POWER_2(align, align)) {
> -   ODP_ERR("Bad align requirement");
> +   ODP_ERR("Bad align requirement\n");
> return ODP_POOL_INVALID;
> }
>
> @@ -332,7 +332,7 @@ static odp_pool_t pool_create(const char *name, 
> odp_pool_param_t *params,
> break;
>
> default:
> -   ODP_ERR("Bad pool type");
> +   ODP_ERR("Bad pool type\n");
> return ODP_POOL_INVALID;
> }
>
> @@ -342,7 +342,7 @@ static odp_pool_t pool_create(const char *name, 
> odp_pool_param_t *params,
> pool = reserve_pool();
>
> if (pool == NULL) {
> -   ODP_ERR("No more free pools");
> +   ODP_ERR("No more free pools\n");
> return ODP_POOL_INVALID;
> }
>
> @@ -390,7 +390,7 @@ static odp_pool_t pool_create(const char *name, 
> odp_pool_param_t *params,
> pool->shm = shm;
>
> if (shm == ODP_SHM_INVALID) {
> -   ODP_ERR("Shm reserve failed");
> +   ODP_ERR("Shm reserve failed\n");
> goto error;
> }
>
> @@ -404,7 +404,7 @@ static odp_pool_t pool_create(const char *name, 
> odp_pool_param_t *params,
> pool->uarea_shm = shm;
>
> if (shm == ODP_SHM_INVALID) {
> -   ODP_ERR("Shm reserve failed (uarea)");
> +   ODP_ERR("Shm reserve failed (uarea)\n");
> goto error;
> }
>
> --
> 2.7.1.250.gff4ea60
>


Re: [lng-odp] [API-NEXT PATCH v3 5/5] linux-gen: sched: new ordered lock implementation

2016-12-01 Thread Bill Fischofer
On Thu, Dec 1, 2016 at 5:37 AM, Matias Elo  wrote:
> Implement ordered locks using per lock atomic counters. The counter values
> are compared to the queue’s atomic context to guarantee ordered locking.
> Compared to the previous implementation this enables parallel processing of
> ordered events outside of the lock context.
>
> Signed-off-by: Matias Elo 
> ---
>  .../linux-generic/include/odp_queue_internal.h |  2 +
>  platform/linux-generic/odp_queue.c |  6 +++
>  platform/linux-generic/odp_schedule.c  | 49 
> --
>  3 files changed, 54 insertions(+), 3 deletions(-)
>
> diff --git a/platform/linux-generic/include/odp_queue_internal.h 
> b/platform/linux-generic/include/odp_queue_internal.h
> index b905bd8..8b55de1 100644
> --- a/platform/linux-generic/include/odp_queue_internal.h
> +++ b/platform/linux-generic/include/odp_queue_internal.h
> @@ -59,6 +59,8 @@ struct queue_entry_s {
> struct {
> odp_atomic_u64_t  ctx; /**< Current ordered context id */
> odp_atomic_u64_t  next_ctx; /**< Next unallocated context id 
> */
> +   /** Array of ordered locks */
> +   odp_atomic_u64_t  lock[CONFIG_QUEUE_MAX_ORD_LOCKS];
> } ordered ODP_ALIGNED_CACHE;
>
> enq_func_t   enqueue ODP_ALIGNED_CACHE;
> diff --git a/platform/linux-generic/odp_queue.c 
> b/platform/linux-generic/odp_queue.c
> index 4c7f497..d9cb9f3 100644
> --- a/platform/linux-generic/odp_queue.c
> +++ b/platform/linux-generic/odp_queue.c
> @@ -77,8 +77,14 @@ static int queue_init(queue_entry_t *queue, const char 
> *name,
> queue->s.param.deq_mode = ODP_QUEUE_OP_DISABLED;
>
> if (param->sched.sync == ODP_SCHED_SYNC_ORDERED) {
> +   unsigned i;
> +
> odp_atomic_init_u64(&queue->s.ordered.ctx, 0);
> odp_atomic_init_u64(&queue->s.ordered.next_ctx, 0);
> +
> +   for (i = 0; i < queue->s.param.sched.lock_count; i++)
> +   odp_atomic_init_u64(&queue->s.ordered.lock[i],
> +   0);
> }
> }
> queue->s.type = queue->s.param.type;
> diff --git a/platform/linux-generic/odp_schedule.c 
> b/platform/linux-generic/odp_schedule.c
> index 4b33513..c628142 100644
> --- a/platform/linux-generic/odp_schedule.c
> +++ b/platform/linux-generic/odp_schedule.c
> @@ -126,6 +126,15 @@ typedef struct {
> int num;
>  } ordered_stash_t;
>
> +/* Ordered lock states */
> +typedef union {
> +   uint8_t u8[CONFIG_QUEUE_MAX_ORD_LOCKS];
> +   uint32_t all;
> +} lock_called_t;
> +
> +ODP_STATIC_ASSERT(sizeof(lock_called_t) == sizeof(uint32_t),
> + "Lock_called_values_do_not_fit_in_uint32");
> +
>  /* Scheduler local data */
>  typedef struct {
> int thr;
> @@ -145,6 +154,7 @@ typedef struct {
> ordered_stash_t stash[MAX_ORDERED_STASH];
> int stash_num; /**< Number of stashed enqueue operations */
> uint8_t in_order; /**< Order status */
> +   lock_called_t lock_called; /**< States of ordered locks */
> } ordered;
>
>  } sched_local_t;
> @@ -553,12 +563,21 @@ static inline void ordered_stash_release(void)
>
>  static inline void release_ordered(void)
>  {
> +   unsigned i;
> queue_entry_t *queue;
>
> queue = sched_local.ordered.src_queue;
>
> wait_for_order(queue);
>
> +   /* Release all ordered locks */
> +   for (i = 0; i < queue->s.param.sched.lock_count; i++) {
> +   if (!sched_local.ordered.lock_called.u8[i])
> +   odp_atomic_store_rel_u64(&queue->s.ordered.lock[i],
> +sched_local.ordered.ctx + 1);
> +   }
> +
> +   sched_local.ordered.lock_called.all = 0;
> sched_local.ordered.src_queue = NULL;
> sched_local.ordered.in_order = 0;
>
> @@ -923,19 +942,43 @@ static void order_unlock(void)
>  {
>  }
>
> -static void schedule_order_lock(unsigned lock_index ODP_UNUSED)
> +static void schedule_order_lock(unsigned lock_index)
>  {
> +   odp_atomic_u64_t *ord_lock;
> queue_entry_t *queue;
>
> queue = sched_local.ordered.src_queue;
>
> ODP_ASSERT(queue && lock_index <= queue->s.param.sched.lock_count);

Sorry, I should have been more precise. The staleness test I was
referring to was to verify that the lock had not been previously used
in this ordered context. In the current code that's the following
assert (in odp_schedule_order.c)

sync = sched_local.sync[lock_index];
sync_out = odp_atomic_load_u64(&origin_qe->s.sync_out[lock_index]);
ODP_ASSERT(sync >= sync_out);

The test above should be open code since it's a validity check on the
call. The current code treats odp_schedule_order_lock/unlock as no-ops
if we aren't running in an ordered context as that permits queue 

Re: [lng-odp] [API-NEXT PATCH v3 0/5] new ordered queue implementation

2016-12-01 Thread Bill Fischofer
Given that my fix to Bug 2622 has been merged into API-NEXT, this will
need to be rebased against those changes as well to put the
buffer_alloc() code back to the way it was.

On Thu, Dec 1, 2016 at 5:37 AM, Matias Elo  wrote:
> V3:
> - Removed old SCHEDULE_ORDERED_LOCKS_PER_QUEUE define (Bill)
> - Replaced error checks with asserts in ordered lock/unlock (Bill)
>
> V2:
> - Support for multiple ordered locks (Bill)
> - New ordered lock implementation
>
> Add new implementation for ordered queues. Compared to the old
> implementation this is much simpler and improves performance ~1-4x
> depending on the test case. Some performance numbers are provided below.
>
> The implementation is based on an atomic ordered context, which only a
> single thread may possess at a time. Only the thread owning the atomic
> context may do enqueue(s) from the ordered queue. All other threads put
> their enqueued events to a thread local enqueue stash (ordered_stash_t).
> All stashed enqueue operations will be performed in the original order when
> the thread acquires the ordered context. If the ordered stash becomes full,
> the enqueue blocks. At the latest a thread blocks when the ev_stash is
> empty and the thread tries to release the order context.
>
>
> The patch set also resolves the following bug:
> https://bugs.linaro.org/show_bug.cgi?id=2644
>
>
> Performance benchmarks:
>
> odp_l2fwd (64B packets)
>
> Throughput (Gbps)
> Cores   Old     New     Gain (%)
> 
> 1:      3.0     7.0     136
> 2:      3.2     11.1    244
> 4:      5.0     17.6    252
> 6:      5.9     23.0    286
> 8:      7.0     28.6    307
> 10:     8.0     33.6    321
> 12:     8.7     38.2    340
>
>
> odp_pktio_ordered (64B packets)
>
> Throughput (Gbps)
> Cores   Old     New     Gain (%)
> 
> 1:      1.2     1.6     33
> 2:      1.1     1.8     64
> 4:      1.4     2.6     78
> 6:      1.3     2.9     125
> 8:      1.4     3.3     141
> 10:     1.3     3.5     175
> 12:     1.2     3.8     213
>
> Matias Elo (5):
>   linux-gen: sched: add internal APIs for locking/unlocking ordered
> processing
>   linux-gen: sched: remove old ordered queue implementation
>   linux-gen: sched: add internal API for max number of ordered locks per
> queue
>   linux-gen: sched: new ordered queue implementation
>   linux-gen: sched: new ordered lock implementation
>
>  platform/linux-generic/Makefile.am |   3 -
>  .../linux-generic/include/odp_buffer_internal.h|   7 -
>  .../linux-generic/include/odp_config_internal.h|   5 +
>  .../linux-generic/include/odp_packet_io_queue.h|   5 +-
>  .../linux-generic/include/odp_queue_internal.h |  33 +-
>  platform/linux-generic/include/odp_schedule_if.h   |  15 +-
>  .../linux-generic/include/odp_schedule_internal.h  |  50 --
>  .../include/odp_schedule_ordered_internal.h|  25 -
>  platform/linux-generic/odp_packet_io.c |  17 +-
>  platform/linux-generic/odp_queue.c |  76 +-
>  platform/linux-generic/odp_schedule.c  | 281 ++-
>  platform/linux-generic/odp_schedule_ordered.c  | 818 
> -
>  platform/linux-generic/odp_schedule_sp.c   |  25 +-
>  platform/linux-generic/odp_traffic_mngr.c  |  28 +-
>  platform/linux-generic/pktio/loop.c|   2 +-
>  15 files changed, 360 insertions(+), 1030 deletions(-)
>  delete mode 100644 platform/linux-generic/include/odp_schedule_internal.h
>  delete mode 100644 
> platform/linux-generic/include/odp_schedule_ordered_internal.h
>  delete mode 100644 platform/linux-generic/odp_schedule_ordered.c
>
> --
> 2.7.4
>


Re: [lng-odp] [API-NEXT PATCH v3 4/5] linux-gen: sched: new ordered queue implementation

2016-12-01 Thread Bill Fischofer
On Thu, Dec 1, 2016 at 5:37 AM, Matias Elo  wrote:
> Add new implementation for ordered queues. Compared to the old
> implementation this is much simpler and improves performance ~1-4x
> depending on the test case.
>
> The implementation is based on an atomic ordered context, which only a
> single thread may possess at a time. Only the thread owning the atomic
> context may do enqueue(s) from the ordered queue. All other threads put
> their enqueued events to a thread local enqueue stash (ordered_stash_t).
> All stashed enqueue operations will be performed in the original order when
> the thread acquires the ordered context. If the ordered stash becomes full,
> the enqueue blocks. At the latest a thread blocks when the ev_stash is
> empty and the thread tries to release the order context.
>
> Signed-off-by: Matias Elo 
> ---
>  .../linux-generic/include/odp_queue_internal.h |   5 +
>  platform/linux-generic/odp_queue.c |  14 +-
>  platform/linux-generic/odp_schedule.c  | 171 
> +++--
>  3 files changed, 172 insertions(+), 18 deletions(-)
>
> diff --git a/platform/linux-generic/include/odp_queue_internal.h 
> b/platform/linux-generic/include/odp_queue_internal.h
> index df36b76..b905bd8 100644
> --- a/platform/linux-generic/include/odp_queue_internal.h
> +++ b/platform/linux-generic/include/odp_queue_internal.h
> @@ -56,6 +56,11 @@ struct queue_entry_s {
> odp_buffer_hdr_t *tail;
> int   status;
>
> +   struct {
> +   odp_atomic_u64_t  ctx; /**< Current ordered context id */
> +   odp_atomic_u64_t  next_ctx; /**< Next unallocated context id 
> */
> +   } ordered ODP_ALIGNED_CACHE;
> +
> enq_func_t   enqueue ODP_ALIGNED_CACHE;
> deq_func_t   dequeue;
> enq_multi_func_t enqueue_multi;
> diff --git a/platform/linux-generic/odp_queue.c 
> b/platform/linux-generic/odp_queue.c
> index 99c91e7..4c7f497 100644
> --- a/platform/linux-generic/odp_queue.c
> +++ b/platform/linux-generic/odp_queue.c
> @@ -73,9 +73,14 @@ static int queue_init(queue_entry_t *queue, const char 
> *name,
> if (queue->s.param.sched.lock_count > sched_fn->max_ordered_locks())
> return -1;
>
> -   if (param->type == ODP_QUEUE_TYPE_SCHED)
> +   if (param->type == ODP_QUEUE_TYPE_SCHED) {
> queue->s.param.deq_mode = ODP_QUEUE_OP_DISABLED;
>
> +   if (param->sched.sync == ODP_SCHED_SYNC_ORDERED) {
> +   odp_atomic_init_u64(&queue->s.ordered.ctx, 0);
> +   odp_atomic_init_u64(&queue->s.ordered.next_ctx, 0);
> +   }
> +   }
> queue->s.type = queue->s.param.type;
>
> queue->s.enqueue = queue_enq;
> @@ -301,6 +306,13 @@ int odp_queue_destroy(odp_queue_t handle)
> ODP_ERR("queue \"%s\" not empty\n", queue->s.name);
> return -1;
> }
> +   if (queue_is_ordered(queue) &&
> +   odp_atomic_load_u64(&queue->s.ordered.ctx) !=
> +   odp_atomic_load_u64(&queue->s.ordered.next_ctx)) {
> +   UNLOCK(&queue->s.lock);
> +   ODP_ERR("queue \"%s\" reorder incomplete\n", queue->s.name);
> +   return -1;
> +   }
>
> switch (queue->s.status) {
> case QUEUE_STATUS_READY:
> diff --git a/platform/linux-generic/odp_schedule.c 
> b/platform/linux-generic/odp_schedule.c
> index 5bc274f..4b33513 100644
> --- a/platform/linux-generic/odp_schedule.c
> +++ b/platform/linux-generic/odp_schedule.c
> @@ -111,11 +111,21 @@ ODP_STATIC_ASSERT((8 * sizeof(pri_mask_t)) >= 
> QUEUES_PER_PRIO,
>  #define MAX_DEQ CONFIG_BURST_SIZE
>
>  /* Maximum number of ordered locks per queue */
> -#define MAX_ORDERED_LOCKS_PER_QUEUE 1
> +#define MAX_ORDERED_LOCKS_PER_QUEUE 2
>
>  ODP_STATIC_ASSERT(MAX_ORDERED_LOCKS_PER_QUEUE <= CONFIG_QUEUE_MAX_ORD_LOCKS,
>   "Too_many_ordered_locks");
>
> +/* Ordered stash size */
> +#define MAX_ORDERED_STASH 512
> +
> +/* Storage for stashed enqueue operation arguments */
> +typedef struct {
> +   odp_buffer_hdr_t *buf_hdr[QUEUE_MULTI_MAX];
> +   queue_entry_t *queue;
> +   int num;
> +} ordered_stash_t;
> +
>  /* Scheduler local data */
>  typedef struct {
> int thr;
> @@ -128,7 +138,15 @@ typedef struct {
> uint32_t queue_index;
> odp_queue_t queue;
> odp_event_t ev_stash[MAX_DEQ];
> -   void *queue_entry;
> +   struct {
> +   queue_entry_t *src_queue; /**< Source queue entry */
> +   uint64_t ctx; /**< Ordered context id */
> +   /** Storage for stashed enqueue operations */
> +   ordered_stash_t stash[MAX_ORDERED_STASH];
> +   int stash_num; /**< Number of stashed enqueue operations */
> +   uint8_t in_order; /**< Order status */
> +   } ordered;
> +
>  } sched_local_t;
>
>  /* Priority queue */
> @@ -491,17 +509,81 @@ static 

Re: [lng-odp] A topic related to bug 2622

2016-12-01 Thread Bill Fischofer
This is a good discussion. The key is to recognize that the ordering
provided by ordered queues is of contexts, not events. When the
scheduler dispatches an event from an ordered queue, it creates an
ordered context that the receiving thread now runs in. That context is
ordered with respect to all other contexts originating from the same
ordered queue and persists until the next odp_schedule() call or until
an explicit odp_schedule_release_order() call is made, though the
latter call is defined as simply a "hint" that may or may not actually
release the ordered context.

An ordered context protects all enqueues made by the thread running in
that ordered context. That means that no matter how few or how many
odp_queue_enq() calls that thread makes, all of these calls will be
ordered with respect to all such calls made by other threads running
in other ordered contexts originating from the same ordered queue.

Within the thread itself, the relative order of the odp_queue_enq()
calls it makes determines the order of events generated by this
ordered context since, as a single thread, it can only make one such
call at a time.
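
One way to picture this is to treat each ordered context as a ticket. The
sketch below is only an illustration under that assumption: the field names
ctx and next_ctx are borrowed from the queue_entry_s changes in Matias'
patch, while ordered_queue_t and the three helper functions are hypothetical
and use C11 atomics rather than the ODP atomic API.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Per-queue ordering state (illustrative only). */
typedef struct {
	_Atomic uint64_t ctx;      /* context id currently allowed to enqueue */
	_Atomic uint64_t next_ctx; /* next context id to hand out */
} ordered_queue_t;

/* Scheduler side: dispatching an event creates a new ordered context. */
static uint64_t ordered_ctx_create(ordered_queue_t *q)
{
	return atomic_fetch_add(&q->next_ctx, 1);
}

/* A context is "in order" once all earlier contexts have been released. */
static int ordered_ctx_in_order(ordered_queue_t *q, uint64_t my_ctx)
{
	return atomic_load_explicit(&q->ctx, memory_order_acquire) == my_ctx;
}

/*
 * Releasing a context passes the order on to the next context id.
 * Assumed to be called only by the context that is currently in order.
 */
static void ordered_ctx_release(ordered_queue_t *q, uint64_t my_ctx)
{
	atomic_store_explicit(&q->ctx, my_ctx + 1, memory_order_release);
}
```

With this picture, every enqueue made between create and release belongs to
the same ticket, which is why the number of enqueues per context does not
affect the ordering guarantee.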

Ordered locks add an interesting wrinkle to this because they make it
possible for ordered critical sections to exist within the overall
parallel processing of these ordered contexts. If an ordered queue has
N ordered locks then there are N possible ordered synchronization
points that can be entered within the lifespan of a given ordered
context. Note that there is no requirement that the thread actually
"visit" any of these points. If another thread in a later ordered
context is waiting on ordered lock i within its context then it will
be delayed until such time as either this thread enters and then exits
ordered lock i, or it releases its ordered context altogether, since
this would provide definitive proof that it can never attempt to enter
that lock at some later point.
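
The lock behaviour described above can be sketched with one counter per
ordered lock. This is a simplified, hypothetical model (ord_locks_t,
ord_ctx_t, MAX_ORD_LOCKS and the function names are made up for
illustration), it uses a non-blocking trylock where the real call would
wait, and it assumes ord_release() is only called once the context is in
order.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MAX_ORD_LOCKS 2 /* hypothetical per-queue limit */

/* Per-queue lock state: the context id each lock will admit next. */
typedef struct {
	_Atomic uint64_t lock[MAX_ORD_LOCKS];
} ord_locks_t;

/* Per-context record of which locks this thread has already entered. */
typedef struct {
	uint64_t ctx;
	uint8_t  used[MAX_ORD_LOCKS];
} ord_ctx_t;

/* Non-blocking "lock": succeeds only when it is this context's turn. */
static int ord_trylock(ord_locks_t *l, ord_ctx_t *c, unsigned i)
{
	if (atomic_load_explicit(&l->lock[i], memory_order_acquire) != c->ctx)
		return 0;
	c->used[i] = 1;
	return 1;
}

/* Unlock hands lock i over to the next context. */
static void ord_unlock(ord_locks_t *l, ord_ctx_t *c, unsigned i)
{
	atomic_store_explicit(&l->lock[i], c->ctx + 1, memory_order_release);
}

/* Releasing the context also passes on every lock it never visited. */
static void ord_release(ord_locks_t *l, ord_ctx_t *c)
{
	for (unsigned i = 0; i < MAX_ORD_LOCKS; i++)
		if (!c->used[i])
			atomic_store_explicit(&l->lock[i], c->ctx + 1,
					      memory_order_release);
}
```

A later context waiting on lock i is therefore unblocked either by an
explicit unlock or by the release of the earlier context, matching the two
cases above.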

One of the key differences between the current ordered queue code and
Matias' patch is the question of when the internal structure (queue
element) is inspected when performing an odp_queue_enq() call. Because
queue elements are shared objects, they are protected by an internal
lock that must be acquired prior to manipulating that structure. The
current code acquires that lock as part of the entry processing to
odp_queue_enq() and then determines whether this thread's context
represents the current queue order to determine whether the enqueue
operation can be completed. If not, then the event is placed in the
ordered queue's reorder queue where it will be processed at some later
time when the queue's order "catches up" to this thread's order. In
Matias' code, by contrast, odp_queue_enq() first determines if its
order is current and if not it does not lock the queue element but
rather temporarily stores the event into a local stash. The advantage
here is that we avoid locking the queue element if we can't really
complete the operation. The disadvantage is that the thread cannot
release its order until its stash has been emptied, which means that
it must stall rather than being able to continue processing another
ordered context when attempting to release its current ordered
context.
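
To make the stash approach concrete, here is a minimal sketch of it. It is
not Matias' code: dst_queue_t, stash_t, the size limits and the function
names are hypothetical, events are reduced to plain integers, and a full
stash returns an error where the real enqueue would block.

```c
#include <stddef.h>

#define STASH_MAX 8  /* hypothetical stash capacity */
#define DST_MAX   32 /* hypothetical destination queue capacity */

/* Destination queue, reduced to an array of event ids. */
typedef struct {
	int ev[DST_MAX];
	size_t num;
} dst_queue_t;

/* Thread-local stash of enqueues made while out of order. */
typedef struct {
	int ev[STASH_MAX];
	size_t num;
} stash_t;

/* Replay stashed enqueues in call order once the thread is in order. */
static int stash_flush(dst_queue_t *q, stash_t *s)
{
	if (q->num + s->num > DST_MAX)
		return -1;
	for (size_t i = 0; i < s->num; i++)
		q->ev[q->num++] = s->ev[i];
	s->num = 0;
	return 0;
}

/*
 * Enqueue from an ordered context. Out of order: stash locally without
 * touching the shared queue. In order: flush the stash first, then enqueue.
 * Returns 0 on success, -1 if the stash or the destination is full.
 */
static int ordered_enq(dst_queue_t *q, stash_t *s, int in_order, int ev)
{
	if (!in_order) {
		if (s->num == STASH_MAX)
			return -1;
		s->ev[s->num++] = ev;
		return 0;
	}
	if (stash_flush(q, s) != 0 || q->num == DST_MAX)
		return -1;
	q->ev[q->num++] = ev;
	return 0;
}
```

The shared queue is only touched on the in-order path, which is the
advantage described above; the cost is that the stash has to be drained
before the context can be released.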

I suspect further performance improvement can be made by combining
these two ideas such that out of order enqueues are stashed locally
and then flushed to the origin ordered queue's reorder queue upon
order release as needed to avoid context release blockage. The real
question is how many enqueues are expected within a given ordered
context? The assumption has been that the average would be close to
one, but obviously a stash becomes more beneficial the larger that
number becomes.

On Thu, Dec 1, 2016 at 1:32 AM, Elo, Matias (Nokia - FI/Espoo)
 wrote:
>
> On 1 Dec 2016, at 6:58, Yi He  wrote:
>
> Hi, thanks Matias
>
> The example is very helpful, one question follow it:
>
> Previously I understand atomic queue like multiple producer/single consumer
> behaviour, so producers (enqueue threads) can still run in parallel?
>
>
> Yes. This actually touches another subject which is queue thread safety.
> odp_queue_create() takes an 'odp_queue_param_t *param' argument, which defines
> enqueue and dequeue modes (ODP_QUEUE_OP_MT (default) /
> ODP_QUEUE_OP_MT_UNSAFE / ODP_QUEUE_OP_DISABLED). Currently, there is always
> internal locking when doing enqueue in linux-generic so the argument has no
> real effect.
>
>
> But in the example the two producer threads behaves like sequentially while
> enqueueing atomic queue?
>
> A1, A2 (new allocated), A0 (original), B1, B2 (new allocated), B0 (original)
>
>
>
> The threads processing events from an ordered queue have to synchronise
> their enqueue operations somehow to guarantee event order in the atomic
> queue. How this is actually done, is ODP implementation dependent.
>
> For example in the new ordered queue implementation I try to avoid
> synchronising/blocking as long as possible by locally caching enqueue
> op

Re: [lng-odp] [PATCHv3] platform: linux-generic: reading cpu affinity from cpuset

2016-12-01 Thread Yi He
The original code reads the whole system's cpuset and provides no command line
parameters to specify cpuset resources when launching an ODP application.
This can be an obstacle to running multiple ODP applications side by side.

The new code implies that the ODP application launcher should prepare the
resource arrangements (cpuset etc.) before launching the ODP applications, which
favours the coexistence of multiple ODP applications. This patch looks good to me.
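
As a rough illustration of what reading the affinity mask gives the
application, the sketch below counts the CPUs the calling thread is allowed
to run on. It is Linux-specific, count_available_cpus() is a hypothetical
helper and not a function from the patch, and it only uses
sched_getaffinity() together with the CPU_ZERO/CPU_COUNT macros from glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for sched_getaffinity() and CPU_* macros */
#endif
#include <sched.h>

/*
 * Count the CPUs the calling thread may run on.
 * Returns the count, or -1 if the affinity mask cannot be read.
 */
static int count_available_cpus(void)
{
	cpu_set_t cpuset;

	CPU_ZERO(&cpuset);
	if (sched_getaffinity(0, sizeof(cpuset), &cpuset) != 0)
		return -1;

	return CPU_COUNT(&cpuset);
}
```

If the launcher restricts the process with a cpuset or taskset before it
starts, this count reflects that restriction instead of the number of CPUs
installed in the system.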

I agree with Brian's two comments:

Should this function be renamed to something like get_available_cpus?
In the documentation, do we have a chapter that covers the "Run ODP Applications"
topic? That can be a separate task.

Thanks and Best Regards, Yi



On 2 December 2016 at 03:38, Brian Brooks  wrote:

> On 11/28 15:34:06, Balakrishna Garapati wrote:
> > With this new proposal cpu affinity is read correctly especially
> > when using cgroups otherwise wrong cpu mask is set.
> >
> > Fixes bug: https://bugs.linaro.org/show_bug.cgi?id=2472
> >
> > Signed-off-by: Balakrishna Garapati 
> > ---
> >
> >  v1 to v2: added Description of the issue to the patch commit log.
> >  v2 to v3: Resending the patch adding the log change from v1 to v2
> >
> >  platform/linux-generic/odp_cpumask.c | 69
> +---
> >  1 file changed, 16 insertions(+), 53 deletions(-)
> >
> > diff --git a/platform/linux-generic/odp_cpumask.c
> b/platform/linux-generic/odp_cpumask.c
> > index 6bf2632..7b0d80a 100644
> > --- a/platform/linux-generic/odp_cpumask.c
> > +++ b/platform/linux-generic/odp_cpumask.c
> > @@ -227,71 +227,34 @@ int odp_cpumask_next(const odp_cpumask_t *mask,
> int cpu)
> >   */
> >  static int get_installed_cpus(void)
>
> Should this function be renamed to something like get_available_cpus
> since it returns the set of CPUs on which the calling thread is eligible
> to run on instead of the set of CPUs in the entire system?
>
> >  {
> > - char *numptr;
> > - char *endptr;
> > - long int cpu_idnum;
> > - DIR  *d;
> > - struct dirent *dir;
> > + int cpu_idnum;
> > + cpu_set_t cpuset;
> > + int ret;
> >
> >   /* Clear the global cpumasks for control and worker CPUs */
> >   odp_cpumask_zero(&odp_global_data.control_cpus);
> >   odp_cpumask_zero(&odp_global_data.worker_cpus);
> >
> > - /*
> > -  * Scan the /sysfs pseudo-filesystem for CPU info directories.
> > -  * There should be one subdirectory for each installed logical CPU
> > -  */
> > - d = opendir("/sys/devices/system/cpu");
> > - if (d) {
> > - while ((dir = readdir(d)) != NULL) {
> > - cpu_idnum = CPU_SETSIZE;
> > -
> > - /*
> > -  * If the current directory entry doesn't represent
> > -  * a CPU info subdirectory then skip to the next entry.
> > -  */
> > - if (dir->d_type == DT_DIR) {
> > - if (!strncmp(dir->d_name, "cpu", 3)) {
> > - /*
> > -  * Directory name starts with "cpu"...
> > -  * Try to extract a CPU ID number
> > -  * from the remainder of the dirname.
> > -  */
> > - errno = 0;
> > - numptr = dir->d_name;
> > - numptr += 3;
> > - cpu_idnum = strtol(numptr, &endptr,
> > -10);
> > - if (errno || (endptr == numptr))
> > - continue;
> > - } else {
> > - continue;
> > - }
> > - } else {
> > - continue;
> > - }
> > - /*
> > -  * If we get here the current directory entry specifies
> > -  * a CPU info subdir for the CPU indexed by cpu_idnum.
> > -  */
> > + CPU_ZERO(&cpuset);
> > + ret = sched_getaffinity(0, sizeof(cpuset), &cpuset);
>
> It would be great to add a note in the ODP spec that the application thread
> calling odp_global_init() must double check its cpuset. E.g. if called from
> a control plane thread that has already affinitized to control plane CPUs,
> once odp_global_init() is called will the underlying ODP implementation
> (and ODP helper) only know about the cpuset of the control plane cores?
>
> > - /* Track number of logical CPUs discovered */
> > - if (odp_global_data.num_cpus_installed <
> > - (int)(cpu_idnum + 1))
> > - odp_global_data.num_cpus_installed =
> > - 

Re: [lng-odp] A topic related to bug 2622

2016-12-01 Thread Yi He
On 2 December 2016 at 11:18, Bill Fischofer 
wrote:

> This is a good discussion. The key is to recognize that the ordering
> provided by ordered queues is of contexts, not events. When the
> scheduler dispatches an event from an ordered queue, it creates an
> ordered context that the receiving thread now runs in. That context is
> ordered with respect to all other contexts originating from the same
> ordered queue and persists until the next odp_schedule() call or until
> an explicit odp_schedule_release_order() call is made, though the
> latter call is defined as simply a "hint" that may or may not actually
> release the ordered context.
>
> An ordered context protects all enqueues made by the thread running in
> that ordered context. That means that no matter how few or how many
> odp_queue_enq() calls that thread makes, all of these calls will be
> ordered with respect to all such calls made by other threads running
> in other ordered contexts originating from the same ordered queue.
>
> Within the thread itself, the relative order of the odp_queue_enq()
> calls it makes determines the order of events generated by this
> ordered context since as a single thread it can only make one such
> call at a time.
>
> Ordered locks add an interesting wrinkle to this because they make it
> possible for ordered critical sections to exist within the overall
> parallel processing of these ordered contexts. If an ordered queue has
> N ordered locks then there are N possible ordered synchronization
> points that can be entered within the lifespan of a given ordered
> context. Note that there is no requirement that the thread actually
> "visit" any of these points. If another thread in a later ordered
> context is waiting on ordered lock i within its context then it will
> be delayed until such time as either this thread enters and then exits
> ordered lock i, or it releases its ordered context altogether, since
> this would provide definitive proof that it can never attempt to enter
> that lock at some later point.
>
> One of the key differences between the current ordered queue code and
> Matias' patch is the question of when the internal structure (queue
> element) is inspected when performing an odp_queue_enq() call. Because
> queue elements are shared objects, they are protected by an internal
> lock that must be acquired prior to manipulating that structure. The
> current code acquires that lock as part of the entry processing to
> odp_queue_enq() and then determines whether this thread's context
> represents the current queue order to determine whether the enqueue
> operation can be completed. If not, then the event is placed in the
> ordered queue's reorder queue where it will be processed at some later
> time when the queue's order "catches up" to this thread's order. In
> Matias' code, by contrast, odp_queue_enq() first determines if its
> order is current and if not it does not lock the queue element but
> rather temporarily stores the event into a local stash. The advantage
> here is that we avoid locking the queue element if we can't really
> complete the operation.
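
The ordered lock rule quoted above (a later context waiting on lock i may
proceed once every earlier context has either entered and exited lock i or
released its ordered context altogether) can be modelled with a small,
single-threaded toy. This is only a sketch of the rule as described, not the
linux-generic implementation; all names and the flag-based bookkeeping are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CTX   8 /* hypothetical: contexts tracked per ordered queue */
#define TOY_NUM_LOCKS 2 /* hypothetical: ordered locks on this queue */

/* State of one ordered context, indexed by its order in the origin queue. */
typedef struct {
    bool released;                 /* ordered context has been released */
    bool lock_done[TOY_NUM_LOCKS]; /* entered and exited ordered lock i */
} toy_order_ctx_t;

/*
 * Returns true if the context with the given order may enter ordered lock
 * 'lock_idx': every earlier context must either have passed that lock or
 * have released its order (so it can never enter the lock later).
 */
static bool toy_can_enter_lock(const toy_order_ctx_t ctx[], int order,
                               int lock_idx)
{
    int prev;

    assert(order >= 0 && order < TOY_MAX_CTX);
    assert(lock_idx >= 0 && lock_idx < TOY_NUM_LOCKS);

    for (prev = 0; prev < order; prev++) {
        if (!ctx[prev].released && !ctx[prev].lock_done[lock_idx])
            return false;
    }

    return true;
}
```

Note that an earlier context is never required to visit a lock: releasing its
order is enough to unblock later contexts on every lock index.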


Thanks, Bill, for the explanation; it helped me learn and understand
Matias' new implementation. Caching events in a thread-local stash to avoid
queue locking/unlocking is a good idea.


> The disadvantage is that the thread cannot
> release its order until its stash has been emptied, which means that
> it must stall rather than being able to continue processing another
> ordered context when attempting to release its current ordered
> context.
>

Yes, this is also my concern after reviewing the patchset: a thread that
schedules events from one ordered queue puts itself into a "line" and stalls.
I will not repeat the point in my reply since Bill has already commented.


>
> I suspect further performance improvement can be made by combining
> these two ideas such that out of order enqueues are stashed locally
> and then flushed to the origin ordered queue's reorder queue upon
> order release as needed to avoid context release blockage. The real
> question is how many enqueues are expected within a given ordered
> context? The assumption has been that the average would be close to
> one, but obviously a stash becomes more beneficial the larger that
> number becomes.
>
> On Thu, Dec 1, 2016 at 1:32 AM, Elo, Matias (Nokia - FI/Espoo)
>  wrote:
> >
> > On 1 Dec 2016, at 6:58, Yi He  wrote:
> >
> > Hi, thanks Matias
> >
> > The example is very helpful. One follow-up question:
> >
> > Previously I understood an atomic queue to behave like multiple
> > producer/single consumer, so producers (enqueue threads) can still run
> > in parallel?
> >
> >
> > Yes. This actually touches another subject which is queue thread safety.
> > odp_queue_create() takes an 'odp_queue_param_t *param' argument, which defines
> > enqueue and dequeue modes (ODP_QUEUE_OP_MT (default) /
> > ODP_QUEUE_OP_MT_UNSAFE / ODP_QUEUE_OP_DISABLED). Currently, there is always
> > internal locking when doing en