Re: [lng-odp] Classification Queue Group

2016-11-10 Thread Brian Brooks
On 11/10 15:17:15, Bala Manoharan wrote:
> On 10 November 2016 at 13:26, Brian Brooks  wrote:
> > On 11/07 16:46:12, Bala Manoharan wrote:
> >> Hi,
> >
> > Hiya
> >
> >> This mail thread discusses the design of classification queue group
> >> RFC. The same can be found in the google doc whose link is given
> >> below.
> >> Users can provide their comments either in this mail thread or in the
> >> google doc as per their convenience.
> >>
> >> https://docs.google.com/document/d/1fOoG9WDR0lMpVjgMAsx8QsMr0YFK9slR93LZ8VXqM2o/edit?usp=sharing
> >>
> >> The basic issues with queues as being a single target for a CoS are twofold:
> >>
> >> Queues must be created and deleted individually. This imposes a
> >> significant burden when queues are used to represent individual flows
> >> since the application may need to process thousands (or millions) of
> >> flows.
> >
> > Wondering why there is an issue with creating and deleting queues 
> > individually
> > if queue objects represent millions of flows..
> 
> The queue groups are mainly required for hashing the incoming packets
> to multiple flows based on the hash configuration.
> So from the application's point of view, it just needs a queue to hold
> packets belonging to the same flow, with packets belonging to different
> flows placed in different queues. It does not matter who creates the
> flow/queue.

When the application receives an event from odp_schedule() call, how does it
know whether the odp_queue_t was previously created by the application from
odp_queue_create() or whether it was created by the implementation?

> It is actually simpler if the implementation creates a flow, since in
> that case the implementation need not accumulate metadata for all
> possible hash values in a queue group; a queue can be created when
> traffic arrives in that particular flow.
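
To make my earlier question concrete: on the receive side I would expect the
application loop to look roughly like the sketch below, whether or not the
source queue was created explicitly by the application (process_flow_packet()
is a made-up application helper, everything else is existing API):

    odp_event_t ev;
    odp_queue_t src_q;

    while (1) {
            ev = odp_schedule(&src_q, ODP_SCHED_WAIT);
            if (odp_event_type(ev) != ODP_EVENT_PACKET)
                    continue;
            /* src_q identifies the flow; per-flow state can be keyed on
             * the queue handle, regardless of who created the queue. */
            process_flow_packet(src_q, odp_packet_from_event(ev));
    }
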
> 
> >
> > Could an application ever call odp_schedule() and receive an event (e.g. 
> > packet)
> > from a queue (of opaque type odp_queue_t) and that queue has never been 
> > created
> > by the application (via odp_queue_create())? Could that ever happen from the
> > hardware, and could the application ever handle that?
> 
> No. All the queues in the system are created by the application, either
> directly or indirectly.
> In case of queue groups, the queues are indirectly created by the
> application by configuring a queue group.
> 
> > Or, is it related to memory usage? The reference implementation
> > struct queue_entry_s is 320 bytes on a 64-bit machine.
> >
> >   2^28 ~= 268,435,456 queues -> 81.920 GB
> >   2^26 ~=  67,108,864 queues -> 20.480 GB
> >   2^22 ~=   4,194,304 queues ->  1.280 GB
> >
> > Forget about 320 bytes per queue, if each queue was represented by a 32-bit
> > integer (4 bytes!) the usage would be:
> >
> >   2^28 ~= 268,435,456 queues ->  1.024 GB
> >   2^26 ~=  67,108,864 queues ->    256 MB
> >   2^22 ~=   4,194,304 queues ->     16 MB
> >
> > That still might be a lot of usage if the application must explicitly create
> > every queue (before it is used) and require an ODP implementation to map
> > between every ODP queue object (opaque type) and the internal queue.
> >
> > Let's say the ODP API has two classes of handles: 1) pointers, 2) integers.
> > An opaque pointer is used to point to some other software object. This
> > object should be larger than 64 bits (or 32 bits on a chip in 32-bit
> > pointer mode), otherwise it could just be represented in a 64-bit (or
> > 32-bit) integer type value!
> >
> > To support millions of queues (flows), should odp_queue_t be an integer
> > type in the API? A software-only implementation may still use 320 bytes
> > per queue and use that integer as an index into an array or as a key for
> > a lookup operation on a data structure containing queues. An
> > implementation with hardware assist may use this integer value directly
> > when interfacing with hardware!
> 
> I believe I have answered this question based on the explanation above.
> Please feel free to point out if something is not clear.
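
For concreteness, here is roughly the shape I had in mind; this is a sketch
only (queue_entry_s internals elided, NUM_QUEUES is a stand-in), not a
proposal for the reference implementation:

    /* If odp_queue_t were a plain integer handle... */
    typedef uint32_t odp_queue_t;

    /* A software implementation can keep its ~320-byte queue_entry_s and
     * use the handle as a table index, while a HW-assisted implementation
     * could pass the same value straight to hardware. */
    static struct queue_entry_s queue_tbl[NUM_QUEUES];

    static inline struct queue_entry_s *queue_from_handle(odp_queue_t q)
    {
            return &queue_tbl[q];   /* handle doubles as the index */
    }
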
> 
> >
> > Would it still be necessary to assign a "name" to each queue (flow)?
> 
> "name" per queue might not be required since it would mean a character
> based lookup across millions of items.
> 
> >
> > Would a queue (flow) also require an "op type" to explicitly specify whether
> > access to the queue (flow) is threadsafe? Atomic queues are threadsafe since
> > only 1 core at any given time can receive from it. Parallel queues are also
> > threadsafe. Are all ODP APIs threadsafe?
> 
> There are two types of queue enqueue operation: ODP_QUEUE_OP_MT and
> ODP_QUEUE_OP_MT_UNSAFE.
> The rest of the ODP APIs are multi-thread safe, since in ODP there is no
> defined way in which a single packet can be given to more than one
> core at the same time; packets move across different modules
> through queues.
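
For reference, the enqueue/dequeue mode is already selected per queue at
create time; a minimal sketch with the existing API (error handling omitted):

    odp_queue_param_t qparam;
    odp_queue_t q;

    odp_queue_param_init(&qparam);
    qparam.type     = ODP_QUEUE_TYPE_PLAIN;
    qparam.enq_mode = ODP_QUEUE_OP_MT_UNSAFE; /* single producer thread */
    qparam.deq_mode = ODP_QUEUE_OP_MT;        /* any thread may dequeue */
    q = odp_queue_create("flow_q", &qparam);
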
> 
> >
> >> A single PMR can only match a packet to a single queue associated with
> >> a target CoS. This 

Re: [lng-odp] continuous memory allocation for drivers

2016-11-10 Thread Christophe Milard
I hoped such a kernel module would already exist, but I am surprised
that DPDK would not rely on it. Maybe there is a limitation I cannot
see. I'll keep searching. Francois, maybe you know the answer?

On 10 November 2016 at 19:10, Mike Holmes  wrote:
>
>
> On 10 November 2016 at 12:52, Christophe Milard
>  wrote:
>>
>> Hi,
>>
>> My hope was that packet segments would all be smaller than one page
>> (either normal pages or huge pages) to guarantee physical memory
>> contiguity, which is needed by some drivers (read: non-VFIO drivers for
>> PCI).
>>
>> Francois Ozog's experience (with DPDK) shows that this hope will fail
>> in some cases: not all platforms support the required huge page size.
>> And it would be nice to be able to run even in the absence of huge
>> pages.
>>
>> I am therefore planning to expand drvshm to include a flag requesting
>> contiguous physical memory. But sadly, from user space, this is
>> nothing we can guarantee... So when this flag is set, the allocator
>> will allocate until physical memory "happens to be contiguous".
>> This is a bit like the DPDK approach (trial & error), which I dislike,
>> but there aren't many alternatives from user space. This would be
>> triggered only in case huge page allocation fails, or if the
>> requested size exceeds the HP size.
>>
>> Last alternative would be to have a kernel module to do this kind of
>> allocation, but I guess we don't really want to depend on that...
>>
>> Any comment?
>
>
> Would that module be launched from the implementation's global init, or be an
> independent expectation on some other support lib that exists and has
> installed the module?
> Feels like an external support lib would be reusable by all implementations
> and not need recoding.
>
>
>
>
> --
> Mike Holmes
> Program Manager - Linaro Networking Group
> Linaro.org │ Open source software for ARM SoCs
> "Work should be fun and collaborative, the rest follows"
>
>


[lng-odp] [RFC] example: traffic_mgmt: enhancement with multiple pktios

2016-11-10 Thread forrest.shi
From: Xuelin Shi 

This patch made the following changes:
- receive packets from multiple pktios instead of generating packets
- identify the TM user by IP, with an individual class of service
- print the user service while starting the program
- print the packet counts every 10 seconds for each user

Signed-off-by: Xuelin Shi 
---
 example/traffic_mgmt/Makefile.am |   7 +-
 example/traffic_mgmt/odp_traffic_mgmt.c  | 476 +-
 example/traffic_mgmt/odp_traffic_mgmt.h  |  47 ++
 example/traffic_mgmt/odp_traffic_pktio.c | 794 +++
 4 files changed, 1091 insertions(+), 233 deletions(-)
 create mode 100644 example/traffic_mgmt/odp_traffic_mgmt.h
 create mode 100644 example/traffic_mgmt/odp_traffic_pktio.c

diff --git a/example/traffic_mgmt/Makefile.am b/example/traffic_mgmt/Makefile.am
index c8ff797..d2c7929 100644
--- a/example/traffic_mgmt/Makefile.am
+++ b/example/traffic_mgmt/Makefile.am
@@ -2,8 +2,9 @@ include $(top_srcdir)/example/Makefile.inc
 
 bin_PROGRAMS = odp_traffic_mgmt$(EXEEXT)
 odp_traffic_mgmt_LDFLAGS = $(AM_LDFLAGS) -static
-odp_traffic_mgmt_CFLAGS = $(AM_CFLAGS) -I${top_srcdir}/example
+odp_traffic_mgmt_CFLAGS = $(AM_CFLAGS) -I${top_srcdir}/example -I${top_srcdir}/test
 
-noinst_HEADERS = $(top_srcdir)/example/example_debug.h
+noinst_HEADERS = $(top_srcdir)/example/example_debug.h \
+$(top_srcdir)/example/traffic_mgmt/odp_traffic_mgmt.h
 
-dist_odp_traffic_mgmt_SOURCES = odp_traffic_mgmt.c
+dist_odp_traffic_mgmt_SOURCES = odp_traffic_mgmt.c odp_traffic_pktio.c
diff --git a/example/traffic_mgmt/odp_traffic_mgmt.c 
b/example/traffic_mgmt/odp_traffic_mgmt.c
index c4f5356..f0fe198 100644
--- a/example/traffic_mgmt/odp_traffic_mgmt.c
+++ b/example/traffic_mgmt/odp_traffic_mgmt.c
@@ -9,16 +9,22 @@
 #define _GNU_SOURCE
 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
+#include "odp_traffic_mgmt.h"
+
+#define TM_USER_IP_START 0x01010101
 #define NUM_SVC_CLASSES 4
 #define USERS_PER_SVC_CLASS 2
-#define APPS_PER_USER   2
-#define TM_QUEUES_PER_APP   2
+#define APPS_PER_USER   1
+#define TM_QUEUES_PER_APP   1
 #define NUM_USERS   (USERS_PER_SVC_CLASS * NUM_SVC_CLASSES)
 #define NUM_TM_QUEUES   (NUM_USERS * APPS_PER_USER * TM_QUEUES_PER_APP)
 #define TM_QUEUES_PER_USER  (TM_QUEUES_PER_APP * APPS_PER_USER)
@@ -49,11 +55,6 @@ typedef struct {
odp_tm_wred_t  wred_profiles[ODP_NUM_PACKET_COLORS];
 } profile_set_t;
 
-static const odp_init_t ODP_INIT_PARAMS = {
-   .log_fn   = odp_override_log,
-   .abort_fn = odp_override_abort
-};
-
 static profile_params_set_t COMPANY_PROFILE_PARAMS = {
.shaper_params = {
.commit_bps = 50  * MBPS,  .commit_burst  = 100,
@@ -161,8 +162,8 @@ static profile_params_set_t COS2_PROFILE_PARAMS = {
},
 
.threshold_params = {
-   .max_pkts  = 1000,.enable_max_pkts  = TRUE,
-   .max_bytes = 10,  .enable_max_bytes = TRUE
+   .max_pkts  = 1000,.enable_max_pkts  = FALSE,
+   .max_bytes = 10,  .enable_max_bytes = FALSE
},
 
.wred_params = {
@@ -194,8 +195,8 @@ static profile_params_set_t COS3_PROFILE_PARAMS = {
},
 
.threshold_params = {
-   .max_pkts  = 400,.enable_max_pkts  = TRUE,
-   .max_bytes = 6,  .enable_max_bytes = TRUE
+   .max_pkts  = 400,.enable_max_pkts  = FALSE,
+   .max_bytes = 6,  .enable_max_bytes = FALSE
},
 
.wred_params = {
@@ -226,19 +227,46 @@ static profile_set_t 
APP_PROFILE_SETS[NUM_SVC_CLASSES][APPS_PER_USER];
 
 static odp_tm_t odp_tm_test;
 
-static odp_pool_t odp_pool;
-
 static odp_tm_queue_t queue_num_tbls[NUM_SVC_CLASSES][TM_QUEUES_PER_CLASS + 1];
 static uint32_t   next_queue_nums[NUM_SVC_CLASSES];
 
 static uint8_t  random_buf[RANDOM_BUF_LEN];
 static uint32_t next_rand_byte;
 
-static odp_atomic_u32_t atomic_pkts_into_tm;
-static odp_atomic_u32_t atomic_pkts_from_tm;
+typedef struct {
+   struct {
+   odp_atomic_u64_t tm_pkts_in;
+   odp_atomic_u64_t tm_bytes_in;
+   odp_atomic_u64_t tm_pkts_out;
+   odp_atomic_u64_t tm_bytes_out;
+   } s;/* statistics info for each user of tm */
+   uint64_t max_pkts_limit; /* max packets allowed for this user */
+   uint64_t max_bytes_limit; /* max bytes allowed for this user */
+   odp_bool_t enable_pkts_limit;   /* max packets limit is valid */
+   odp_bool_t enable_bytes_limit;  /* max bytes limit is valid */
+   uint32_t ipaddr;/* user ipaddr */
+   int svc;/* service class */
+} tm_user_t;
+
+/* statistics value info, used for param pass */
+typedef struct {
+   uint64_t pkts_in;
+   uint64_t pkts_out;
+   uint64_t 

Re: [lng-odp] [API-NEXT PATCH 1/3] drv: shm: function to query for physical addresses

2016-11-10 Thread Francois Ozog
On 10 November 2016 at 20:05, Maxim Uvarov  wrote:

> On 11/10/16 20:21, Christophe Milard wrote:
>
>> The capability "can_getphy" is introduced and tells whether physical
>> address queries are available.
>> The function odpdrv_getphy() is added to query for physical address
>> (from virtual address)
>>
>> Signed-off-by: Christophe Milard 
>> ---
>>   include/odp/drv/spec/shm.h | 35 +++
>>   1 file changed, 35 insertions(+)
>>
>> diff --git a/include/odp/drv/spec/shm.h b/include/odp/drv/spec/shm.h
>> index ef64f5d..ca3aae2 100644
>> --- a/include/odp/drv/spec/shm.h
>> +++ b/include/odp/drv/spec/shm.h
>> @@ -43,6 +43,12 @@ extern "C" {
>>   #define ODPDRV_SHM_LOCK   0x02 /**< Memory shall be locked
>> (no swap) */
>> /**
>> + * @typedef odpdrv_phys_addr_t
>> + * A physical memory address
>> + *
>> + */
>> +
>> +/**
>>* Shared memory block info
>>*/
>>   typedef struct odpdrv_shm_info_t {
>> @@ -75,6 +81,13 @@ typedef struct odpdrv_shm_capability_t {
>>  * available memory size. */
>> uint64_t max_align;
>>   + /** Can retrieve physical addresses
>> +*
>> +* A true (non-zero) value means that odpdrv_getphy() can be
>> +* used to retrieve physical memory addresses (as well as
>> +* the debug function odpdrv_dumpphy()). */
>> +   int can_getphy;
>>
> int phys_addr:1;
>
>> +
>>   } odpdrv_shm_capability_t;
>> /**
>> @@ -220,6 +233,28 @@ int odpdrv_shm_print_all(const char *title);
>>   uint64_t odpdrv_shm_to_u64(odpdrv_shm_t hdl);
>> /**
>> + * Get physical address from virtual address
>> + *
>> + * @addr   virtual address.
>> + * @return Physical address or NULL
>> + *
>> + * @note This routine will work only if capability "can_getphy" is true.
>> + */
>> +odpdrv_phys_addr_t odpdrv_getphy(const void *addr);
>>
>
> naming is bad. By 'phy' I usually understand a physical-layer (PHY) chip,
> not physical memory.
> Something similar to linux kernel might be:
>
> odpdrv_virt_to_phys()
> odpdrv_phys_to_virt()
>
>
> and (I don't know how writes are done from user space), but you probably need
> an odp version of ioremap()
>
> [FF]: I agree on the naming. "Physical memory" is then handled through
either the UIO or VFIO frameworks, which deal with IO particularities; we
should only use APIs from them (ioremap is a kernel API). Drivers may have to
implement one or the other or both physical access methods. Those
frameworks provide the required functions to prepare DMAs properly (on some
architectures "DMA from device" and "DMA to device" require preparation
even in bare metal). VFIO is the preferred method because it is secure, but
it is not yet widely available. For instance, the DPDK virtio-net guest and
host drivers use UIO. VFIO support is embryonic and provides poor
performance at the moment.

>
> +
>> +/**
>> + * Print physical address mapping
>> + *
>>
> print where?
>
>> + * @addr   virtual address.
>>
> what is len?
>
> + * @return Physicall address or NULL
>> + *
>>
>
> it's void, no return.
>
> + * @note This routine will work only if capability "can_getphy" is true.
>> + * @note This routine is intended to be used for diagnostic purposes,
>> + * and uses ODP_DBG printouts
>> + */
>> +void odpdrv_dumpphy(void *addr, uint64_t len);
>>
>
> Maybe odpdrv_memmap_print()? to print current mappings by driver?
>
> should driver handle be passed as argument?
>
>
> +
>> +/**
>>* @}
>>*/
>>
>>
>
>


-- 
François-Frédéric Ozog | Director Linaro Networking Group
T: +33.67221.6485
francois.o...@linaro.org | Skype: ffozog


Re: [lng-odp] [API-NEXT PATCH 1/3] drv: shm: function to query for physical addresses

2016-11-10 Thread Maxim Uvarov

On 11/10/16 20:21, Christophe Milard wrote:

The capability "can_getphy" is introduced and tells whether physical
address queries are available.
The function odpdrv_getphy() is added to query for physical address
(from virtual address)

Signed-off-by: Christophe Milard 
---
  include/odp/drv/spec/shm.h | 35 +++
  1 file changed, 35 insertions(+)

diff --git a/include/odp/drv/spec/shm.h b/include/odp/drv/spec/shm.h
index ef64f5d..ca3aae2 100644
--- a/include/odp/drv/spec/shm.h
+++ b/include/odp/drv/spec/shm.h
@@ -43,6 +43,12 @@ extern "C" {
  #define ODPDRV_SHM_LOCK   0x02 /**< Memory shall be locked (no 
swap) */
  
  /**

+ * @typedef odpdrv_phys_addr_t
+ * A physical memory address
+ *
+ */
+
+/**
   * Shared memory block info
   */
  typedef struct odpdrv_shm_info_t {
@@ -75,6 +81,13 @@ typedef struct odpdrv_shm_capability_t {
 * available memory size. */
uint64_t max_align;
  
+	/** Can retrieve physical addresses

+*
+* A true (non-zero) value means that odpdrv_getphy() can be
+* used to retrieve physical memory addresses, (as well as
+* the debug function odp_drv_dumpphy() */
+   int can_getphy;

int phys_addr:1;

+
  } odpdrv_shm_capability_t;
  
  /**

@@ -220,6 +233,28 @@ int odpdrv_shm_print_all(const char *title);
  uint64_t odpdrv_shm_to_u64(odpdrv_shm_t hdl);
  
  /**

+ * Get physical address from virtual address
+ *
+ * @addr   virtual address.
+ * @return Physical address or NULL
+ *
+ * @note This routine will work only if capability "can_getphy" is true.
+ */
+odpdrv_phys_addr_t odpdrv_getphy(const void *addr);


naming is bad. By 'phy' I usually understand a physical-layer (PHY) chip, not
physical memory.

Something similar to linux kernel might be:

odpdrv_virt_to_phys()
odpdrv_phys_to_virt()


and (I don't know how writes are done from user space), but you probably
need an odp version of ioremap()




+
+/**
+ * Print physical address mapping
+ *

print where?

+ * @addr   virtual address.

what is len?


+ * @return Physical address or NULL
+ *


it's void, no return.


+ * @note This routine will work only if capability "can_getphy" is true.
+ * @note This routine is intended to be used for diagnostic purposes,
+ * and uses ODP_DBG printouts
+ */
+void odpdrv_dumpphy(void *addr, uint64_t len);


Maybe odpdrv_memmap_print()? to print current mappings by driver?

should driver handle be passed as argument?


+
+/**
   * @}
   */
  




Re: [lng-odp] Classification Queue Group

2016-11-10 Thread Mike Holmes
On 10 November 2016 at 04:14, Bala Manoharan 
wrote:

> Regards,
> Bala
>
>
> On 10 November 2016 at 07:34, Bill Fischofer 
> wrote:
> >
> >
> > On Mon, Nov 7, 2016 at 5:16 AM, Bala Manoharan <
> bala.manoha...@linaro.org>
> > wrote:
> >>
> >> Hi,
> >>
> >> This mail thread discusses the design of classification queue group
> >> RFC. The same can be found in the google doc whose link is given
> >> below.
> >> Users can provide their comments either in this mail thread or in the
> >> google doc as per their convenience.
> >>
> >>
> >> https://docs.google.com/document/d/1fOoG9WDR0lMpVjgMAsx8QsMr0YFK9slR93LZ8VXqM2o/edit?usp=sharing
> >>
> >> The basic issues with queues as being a single target for a CoS are twofold:
> >>
> >> Queues must be created and deleted individually. This imposes a
> >> significant burden when queues are used to represent individual flows
> >> since the application may need to process thousands (or millions) of
> >> flows.
> >> A single PMR can only match a packet to a single queue associated with
> >> a target CoS. This prohibits efficient capture of subfield
> >> classification.
> >> To solve these issues, Tiger Moth introduces the concept of a queue
> >> group. A queue group is an extension to the existing queue
> >> specification in a Class of Service.
> >>
> >> Queue groups solve the classification issues associated with
> >> individual queues in three ways:
> >>
> >> * The odp_queue_group_create() API can create a large number of
> >> related queues with a single call.
> >> * A single PMR can spread traffic to many queues associated with the
> >> same CoS by assigning packets matching the PMR to a queue group rather
> >> than a queue.
> >> * A hashed PMR subfield is used to distribute individual queues within
> >> a queue group for scheduling purposes.
> >>
> >>
> >> diff --git a/include/odp/api/spec/classification.h
> >> b/include/odp/api/spec/classification.h
> >> index 6eca9ab..cf56852 100644
> >> --- a/include/odp/api/spec/classification.h
> >> +++ b/include/odp/api/spec/classification.h
> >> @@ -126,6 +126,12 @@ typedef struct odp_cls_capability_t {
> >>
> >> /** A Boolean to denote support of PMR range */
> >> odp_bool_t pmr_range_supported;
> >> +
> >> + /** A Boolean to denote support of queue group */
> >> + odp_bool_t queue_group_supported;
> >
> >
> > To date we've not introduced optional APIs into ODP so I'm not sure if
> we'd
> > want to start here. If we are adding queue groups, all ODP
> implementations
> > should be expected to support queue groups, so this flag shouldn't be
> > needed. Limits on the support (e.g., max number of queue groups
> supported,
> > etc.) are appropriate, but there shouldn't be an option to not support
> them
> > at all.
> >
> >>
> >> +
> >> + /** A Boolean to denote support of queue */
> >> + odp_bool_t queue_supported;
> >
> >
> > Not sure what the intent is here. Is this anticipating that some
> > implementations might only support sending flows to queue groups and not
> to
> > individual queues? That would be a serious functional regression and not
> > something we'd want to encourage.
>
> The idea here is that some implementations might have limitations
> where they could support either a queue or a queue group attached to a CoS
> object, but not both. This limitation is only for queues created within
> a queue group and not for a general odp_queue_t.
>
> This could be useful when some low-end hardware does not support
> hashing after classification in the HW; in that case supporting a
> queue group, which requires hashing, might be costly.
>

I am wary of "could be useful"; unless we know it is a use case, let's fix
that case when we have it, especially if it makes things simpler now.



>
> The reason being that the internal handles of a packet flow might have
> to be re-aligned so that they can be exposed as either a queue or a queue
> group. So this capability is useful if an implementation supports only
> queue groups and not queues, or vice versa. I did not see much harm in
> the above boolean since it is added only in the capability struct.
>
> >
> >>
> >> } odp_cls_capability_t;
> >>
> >>
> >> /**
> >> @@ -162,7 +168,18 @@ typedef enum {
> >>  * Used to communicate class of service creation options
> >>  */
> >> typedef struct odp_cls_cos_param {
> >> - odp_queue_t queue; /**< Queue associated with CoS */
> >> + /** If type is ODP_QUEUE_T, odp_queue_t is linked with CoS,
> >> + * if type is ODP_QUEUE_GROUP_T, odp_queue_group_t is linked with CoS.
> >
> >
> > Perhaps not the best choice of discriminator names. Perhaps
> > ODP_COS_TYPE_QUEUE and ODP_COS_TYPE_GROUP might be simpler?
> >
> >>
> >> + */
> >> + odp_queue_type_e type;
> >
> >
> > We already have an odp_queue_type_t defined for the queue APIs so this
> name
> > would be confusingly similar. We're really identifying what the type of
> the
> > CoS is so perhaps odp_cos_type_t might be better here? That would be
> > 

Re: [lng-odp] continuous memory allocation for drivers

2016-11-10 Thread Mike Holmes
On 10 November 2016 at 12:52, Christophe Milard <
christophe.mil...@linaro.org> wrote:

> Hi,
>
> My hope was that packet segments would all be smaller than one page
> (either normal pages or huge pages) to guarantee physical memory
> contiguity, which is needed by some drivers (read: non-VFIO drivers for
> PCI).
>
> Francois Ozog's experience (with DPDK) shows that this hope will fail
> in some cases: not all platforms support the required huge page size.
> And it would be nice to be able to run even in the absence of huge
> pages.
>
> I am therefore planning to expand drvshm to include a flag requesting
> contiguous physical memory. But sadly, from user space, this is
> nothing we can guarantee... So when this flag is set, the allocator
> will allocate until physical memory "happens to be contiguous".
> This is a bit like the DPDK approach (trial & error), which I dislike,
> but there aren't many alternatives from user space. This would be
> triggered only in case huge page allocation fails, or if the
> requested size exceeds the HP size.
>
> Last alternative would be to have a kernel module to do this kind of
> allocation, but I guess we don't really want to depend on that...
>
> Any comment?
>

Would that module be launched from the implementation's global init, or be
an independent expectation on some other support lib that exists and has
installed the module?
Feels like an external support lib would be reusable by all
implementations and not need recoding.




-- 
Mike Holmes
Program Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"


[lng-odp] continuous memory allocation for drivers

2016-11-10 Thread Christophe Milard
Hi,

My hope was that packet segments would all be smaller than one page
(either normal pages or huge pages) to guarantee physical memory
contiguity, which is needed by some drivers (read: non-VFIO drivers for
PCI).

Francois Ozog's experience (with DPDK) shows that this hope will fail
in some cases: not all platforms support the required huge page size.
And it would be nice to be able to run even in the absence of huge
pages.

I am therefore planning to expand drvshm to include a flag requesting
contiguous physical memory. But sadly, from user space, this is
nothing we can guarantee... So when this flag is set, the allocator
will allocate until physical memory "happens to be contiguous".
This is a bit like the DPDK approach (trial & error), which I dislike,
but there aren't many alternatives from user space. This would be
triggered only in case huge page allocation fails, or if the
requested size exceeds the HP size.

Last alternative would be to have a kernel module to do this kind of
allocation, but I guess we don't really want to depend on that...

Any comment?
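
To make the proposal concrete, a reserve call could look something like the
sketch below. ODPDRV_SHM_PHYS_CONTIG is just a name I am making up here for
the new flag; odpdrv_shm_reserve(), ODPDRV_SHM_LOCK and ODPDRV_SHM_INVALID
are the existing drv shm API (len/align declarations omitted):

    /* Request locked, physically contiguous memory, e.g. for a non-VFIO
     * PCI driver. The allocator would retry until the pages it gets
     * "happen to be contiguous", or fail. */
    odpdrv_shm_t shm;

    shm = odpdrv_shm_reserve("dma_area", len, align,
                             ODPDRV_SHM_LOCK | ODPDRV_SHM_PHYS_CONTIG);
    if (shm == ODPDRV_SHM_INVALID) {
            /* fall back to huge pages or per-segment allocation */
    }
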


[lng-odp] [API-NEXT PATCH 2/3] linux-gen: drv: functions to query for physical addresses

2016-11-10 Thread Christophe Milard
The capability "can_getphy" is introduced and tells whether physical
address queries are available.
The function odpdrv_getphy() is added to query for physical address
(from virtual address)

Signed-off-by: Christophe Milard 
---
 platform/linux-generic/drv_shm.c| 12 
 platform/linux-generic/include/_ishmphy_internal.h  |  3 ++-
 platform/linux-generic/include/odp/drv/plat/shm_types.h |  3 +++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/platform/linux-generic/drv_shm.c b/platform/linux-generic/drv_shm.c
index 9b2560d..831c325 100644
--- a/platform/linux-generic/drv_shm.c
+++ b/platform/linux-generic/drv_shm.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include <_ishm_internal.h>
+#include <_ishmphy_internal.h>
 
 static inline uint32_t from_handle(odpdrv_shm_t shm)
 {
@@ -24,6 +25,7 @@ int odpdrv_shm_capability(odpdrv_shm_capability_t *capa)
capa->max_blocks = ODPDRV_CONFIG_SHM_BLOCKS;
capa->max_size = 0;
capa->max_align = 0;
+   capa->can_getphy = _odp_ishmphy_can_getphy();
 
return 0;
 }
@@ -100,3 +102,13 @@ int odpdrv_shm_print_all(const char *title)
 {
return _odp_ishm_status(title);
 }
+
+odpdrv_phys_addr_t odpdrv_getphy(const void *addr)
+{
+   return (odpdrv_phys_addr_t)_odp_ishmphy_getphy(addr);
+}
+
+void odpdrv_dumpphy(void *addr, uint64_t len)
+{
+   _odp_ishmphy_dumpphy(addr, len);
+}
diff --git a/platform/linux-generic/include/_ishmphy_internal.h 
b/platform/linux-generic/include/_ishmphy_internal.h
index bb61ab4..1b56a64 100644
--- a/platform/linux-generic/include/_ishmphy_internal.h
+++ b/platform/linux-generic/include/_ishmphy_internal.h
@@ -12,8 +12,9 @@ extern "C" {
 #endif
 
 #include 
+#include 
 
-typedef uint64_t phys_addr_t; /* Physical address definition. */
+typedef odpdrv_phys_addr_t phys_addr_t; /* Physical address definition. */
 #define PHYS_ADDR_INVALID ((phys_addr_t)-1)
 
 void *_odp_ishmphy_book_va(uintptr_t len, intptr_t align);
diff --git a/platform/linux-generic/include/odp/drv/plat/shm_types.h 
b/platform/linux-generic/include/odp/drv/plat/shm_types.h
index c48eeca..a54d806 100644
--- a/platform/linux-generic/include/odp/drv/plat/shm_types.h
+++ b/platform/linux-generic/include/odp/drv/plat/shm_types.h
@@ -29,6 +29,9 @@ typedef ODPDRV_HANDLE_T(odpdrv_shm_t);
 
 #define ODPDRV_SHM_INVALID _odpdrv_cast_scalar(odpdrv_shm_t, 0)
 
+/* a physical address: */
+typedef uint64_t odpdrv_phys_addr_t;
+
 /** Get printable format of odpdrv_shm_t */
 static inline uint64_t odpdrv_shm_to_u64(odpdrv_shm_t hdl)
 {
-- 
2.7.4



[lng-odp] [API-NEXT PATCH 3/3] test: drv: shm: adding test for physical address queries

2016-11-10 Thread Christophe Milard
Signed-off-by: Christophe Milard 
---
 .../common_plat/validation/drv/drvshmem/drvshmem.c | 37 ++
 .../common_plat/validation/drv/drvshmem/drvshmem.h |  1 +
 2 files changed, 38 insertions(+)

diff --git a/test/common_plat/validation/drv/drvshmem/drvshmem.c 
b/test/common_plat/validation/drv/drvshmem/drvshmem.c
index 559c55d..274aeb2 100644
--- a/test/common_plat/validation/drv/drvshmem/drvshmem.c
+++ b/test/common_plat/validation/drv/drvshmem/drvshmem.c
@@ -764,11 +764,48 @@ void drvshmem_test_stress(void)
CU_ASSERT(odpdrv_shm_print_all("After stress tests") == base);
 }
 
+/*
+ * test physical address query functions (capability, getphy, dumpphy)
+ */
+void drvshmem_test_physical(void)
+{
+   shared_test_data_big_t *shared_test_data;
+   odpdrv_shm_t shm;
+   odpdrv_shm_capability_t capa;
+   odpdrv_phys_addr_t phy_addr;
+
+   /* query_capabilities: */
+   CU_ASSERT(odpdrv_shm_capability(&capa) == 0);
+
+   /* if we cannot handle phy queries, end of story... */
+   if (!capa.can_getphy)
+   return;
+
+   /* allocate shared mem (locked, for real physical memory) */
+   shm = odpdrv_shm_reserve(MEM_NAME,
+sizeof(shared_test_data_big_t),
+ALIGN_SIZE, ODPDRV_SHM_LOCK);
+   CU_ASSERT(ODPDRV_SHM_INVALID != shm);
+
+   /* query for the physical address and check its validity as we can: */
+   shared_test_data = odpdrv_shm_addr(shm);
+
+   phy_addr = odpdrv_getphy((void *)shared_test_data);
+   CU_ASSERT(phy_addr != 0);
+
+   /* Also run the debug dump function: */
+   odpdrv_dumpphy((void *)shared_test_data,
+  sizeof(shared_test_data_big_t));
+
+   CU_ASSERT(0 == odpdrv_shm_free_by_handle(shm));
+}
+
 odp_testinfo_t drvshmem_suite[] = {
ODP_TEST_INFO(drvshmem_test_basic),
ODP_TEST_INFO(drvshmem_test_reserve_after_fork),
ODP_TEST_INFO(drvshmem_test_singleva_after_fork),
ODP_TEST_INFO(drvshmem_test_stress),
+   ODP_TEST_INFO(drvshmem_test_physical),
ODP_TEST_INFO_NULL,
 };
 
diff --git a/test/common_plat/validation/drv/drvshmem/drvshmem.h 
b/test/common_plat/validation/drv/drvshmem/drvshmem.h
index f4c26a1..056c0de 100644
--- a/test/common_plat/validation/drv/drvshmem/drvshmem.h
+++ b/test/common_plat/validation/drv/drvshmem/drvshmem.h
@@ -14,6 +14,7 @@ void drvshmem_test_basic(void);
 void drvshmem_test_reserve_after_fork(void);
 void drvshmem_test_singleva_after_fork(void);
 void drvshmem_test_stress(void);
+void drvshmem_test_physical(void);
 
 /* test arrays: */
 extern odp_testinfo_t drvshmem_suite[];
-- 
2.7.4



[lng-odp] [API-NEXT PATCH 1/3] drv: shm: function to query for physical addresses

2016-11-10 Thread Christophe Milard
The capability "can_getphy" is introduced and tells whether physical
address queries are available.
The function odpdrv_getphy() is added to query for physical address
(from virtual address)

Signed-off-by: Christophe Milard 
---
 include/odp/drv/spec/shm.h | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/include/odp/drv/spec/shm.h b/include/odp/drv/spec/shm.h
index ef64f5d..ca3aae2 100644
--- a/include/odp/drv/spec/shm.h
+++ b/include/odp/drv/spec/shm.h
@@ -43,6 +43,12 @@ extern "C" {
 #define ODPDRV_SHM_LOCK0x02 /**< Memory shall be locked (no 
swap) */
 
 /**
+ * @typedef odpdrv_phys_addr_t
+ * A physical memory address
+ *
+ */
+
+/**
  * Shared memory block info
  */
 typedef struct odpdrv_shm_info_t {
@@ -75,6 +81,13 @@ typedef struct odpdrv_shm_capability_t {
 * available memory size. */
uint64_t max_align;
 
+   /** Can retrieve physical addresses
+*
+* A true (non-zero) value means that odpdrv_getphy() can be
+* used to retrieve physical memory addresses (as well as
+* the debug function odpdrv_dumpphy()). */
+   int can_getphy;
+
 } odpdrv_shm_capability_t;
 
 /**
@@ -220,6 +233,28 @@ int odpdrv_shm_print_all(const char *title);
 uint64_t odpdrv_shm_to_u64(odpdrv_shm_t hdl);
 
 /**
+ * Get physical address from virtual address
+ *
+ * @addr   virtual address.
+ * @return Physical address or NULL
+ *
+ * @note This routine will work only if capability "can_getphy" is true.
+ */
+odpdrv_phys_addr_t odpdrv_getphy(const void *addr);
+
+/**
+ * Print physical address mapping
+ *
+ * @addr   virtual address.
+ * @return Physical address or NULL
+ *
+ * @note This routine will work only if capability "can_getphy" is true.
+ * @note This routine is intended to be used for diagnostic purposes,
+ * and uses ODP_DBG printouts
+ */
+void odpdrv_dumpphy(void *addr, uint64_t len);
+
+/**
  * @}
  */
 
-- 
2.7.4



[lng-odp] [API-NEXT PATCH 0/3] physical address query on DRV interface

2016-11-10 Thread Christophe Milard
NOTE: Must be applied on top of "getting physical addresses from _ishmphy"

Brings the physical address query functions on the driver interface.

Christophe Milard (3):
  drv: shm: function to query for physical addresses
  linux-gen: drv: functions to query for physical addresses
  test: drv: shm: adding test for physical address queries

 include/odp/drv/spec/shm.h | 35 
 platform/linux-generic/drv_shm.c   | 12 +++
 platform/linux-generic/include/_ishmphy_internal.h |  3 +-
 .../linux-generic/include/odp/drv/plat/shm_types.h |  3 ++
 .../common_plat/validation/drv/drvshmem/drvshmem.c | 37 ++
 .../common_plat/validation/drv/drvshmem/drvshmem.h |  1 +
 6 files changed, 90 insertions(+), 1 deletion(-)

-- 
2.7.4



[lng-odp] [Bug 2437] LAVA: FAIL: odp_timer_simple Juno

2016-11-10 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2437

Mike Holmes  changed:

   What|Removed |Added

 Resolution|--- |NON REPRODUCIBLE
 Status|IN_PROGRESS |RESOLVED

--- Comment #6 from Mike Holmes  ---
https://lng.validation.linaro.org/dashboard/streams/public/team/lng/odp/bundles/2f4edb6a401814a5d4f0ab999555a5de65cfb9b9/cdd22d6e-c81f-44c2-951f-49e7010bf652/?page=2#table

Only skipped tests remain

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.

[lng-odp] [Bug 2438] LAVA: FAIL: odp_pktio_perf arndale

2016-11-10 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2438

Mike Holmes  changed:

   What|Removed |Added

 Resolution|--- |NON REPRODUCIBLE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Mike Holmes  ---
https://lng.validation.linaro.org/dashboard/attachment/58132/view#L20

Perf tests now pass - skipping only tests that don't have enough CPUs on Arndale

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[lng-odp] [Bug 2558] Upgrade socket mmap to TPACKET_V3

2016-11-10 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2558

Mike Holmes  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX
 CC||mike.hol...@linaro.org

--- Comment #2 from Mike Holmes  ---
Patches would be considered but no internal use case currently.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[lng-odp] [Bug 2512] validation: TCP checksum update

2016-11-10 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2512

--- Comment #5 from Bala Manoharan  ---
V2 Patch in mailing list. Needs review.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[lng-odp] [Bug 2497] validation: odp_schedule_multi waiting for n number of packets

2016-11-10 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2497

Bala Manoharan  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #4 from Bala Manoharan  ---
Agreed. This bug is not required as it does not cause a failure and can be
closed.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[lng-odp] [PATCHv2 1/1] validation: pktio: fix invalid mac addr

2016-11-10 Thread Balasubramanian Manoharan
Fixes https://bugs.linaro.org/show_bug.cgi?id=2496

Signed-off-by: Balasubramanian Manoharan 
---
v2: Incorporate review comments
 test/common_plat/validation/api/pktio/pktio.c | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/test/common_plat/validation/api/pktio/pktio.c 
b/test/common_plat/validation/api/pktio/pktio.c
index a6a18c3..7115def 100644
--- a/test/common_plat/validation/api/pktio/pktio.c
+++ b/test/common_plat/validation/api/pktio/pktio.c
@@ -31,6 +31,8 @@
 #define PKTIN_TS_MAX_RES   100
 #define PKTIN_TS_CMP_RES   1
 
+#define PKTIO_SRC_MAC  {1, 2, 3, 4, 5, 6}
+#define PKTIO_DST_MAC  {1, 2, 3, 4, 5, 6}
 #undef DEBUG_STATS
 
 /** interface names used for testing */
@@ -241,16 +243,32 @@ static uint32_t pktio_init_packet(odp_packet_t pkt)
odph_udphdr_t *udp;
char *buf;
uint16_t seq;
-   uint8_t mac[ODP_PKTIO_MACADDR_MAXSIZE] = {0};
+   uint8_t src_mac[ODP_PKTIO_MACADDR_MAXSIZE] = PKTIO_SRC_MAC;
+   uint8_t dst_mac[ODP_PKTIO_MACADDR_MAXSIZE] = PKTIO_DST_MAC;
+   uint8_t src_mac_be[ODP_PKTIO_MACADDR_MAXSIZE];
+   uint8_t dst_mac_be[ODP_PKTIO_MACADDR_MAXSIZE];
int pkt_len = odp_packet_len(pkt);
+   int i;
+
+   #if ODP_BYTE_ORDER == ODP_LITTLE_ENDIAN
+   for (i = 0; i < ODP_PKTIO_MACADDR_MAXSIZE; i++) {
+   src_mac_be[i] = src_mac[i];
+   dst_mac_be[i] = dst_mac[i];
+   }
+   #else
+   for (i = 0; i < ODP_PKTIO_MACADDR_MAXSIZE; i++) {
+   src_mac_be[i] = src_mac[ODP_PKTIO_MACADDR_MAXSIZE - 1 - i];
+   dst_mac_be[i] = dst_mac[ODP_PKTIO_MACADDR_MAXSIZE - 1 - i];
+   }
+   #endif
 
buf = odp_packet_data(pkt);
 
/* Ethernet */
odp_packet_l2_offset_set(pkt, 0);
eth = (odph_ethhdr_t *)buf;
-   memcpy(eth->src.addr, mac, ODPH_ETHADDR_LEN);
-   memcpy(eth->dst.addr, mac, ODPH_ETHADDR_LEN);
+   memcpy(eth->src.addr, src_mac_be, ODPH_ETHADDR_LEN);
+   memcpy(eth->dst.addr, dst_mac_be, ODPH_ETHADDR_LEN);
eth->type = odp_cpu_to_be_16(ODPH_ETHTYPE_IPV4);
 
/* IP */
-- 
1.9.1



Re: [lng-odp] [API-NEXT PATCH v3 05/19] linux-gen: pool: reimplement pool with ring

2016-11-10 Thread Bill Fischofer
This series fails bisectability with this patch:

Using patch: 0005-linux-gen-pool-reimplement-pool-with-ring.patch
  Trying to apply patch
  Patch applied
  Building with patch
PASS: odp_crypto
PASS: odp_pktio_perf
SKIP: odp_l2fwd_run.sh
PASS: odp_sched_latency_run.sh
PASS: odp_scheduling_run.sh

Testsuite summary for OpenDataPlane 1.11.0.0

# TOTAL: 5
# PASS:  4
# SKIP:  1
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0


Testsuite summary for OpenDataPlane 1.11.0.0

# TOTAL: 0
# PASS:  0
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

PASS: odp_scheduling_run_proc.sh

Testsuite summary for OpenDataPlane 1.11.0.0

# TOTAL: 1
# PASS:  1
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

PASS: validation/api/pktio/pktio_run.sh
SKIP: validation/api/pktio/pktio_run_tap.sh
PASS: validation/api/shmem/shmem_linux
PASS: ../../test/common_plat/validation/api/atomic/atomic_main
PASS: ../../test/common_plat/validation/api/barrier/barrier_main
PASS: ../../test/common_plat/validation/api/buffer/buffer_main
PASS:
../../test/common_plat/validation/api/classification/classification_main
PASS: ../../test/common_plat/validation/api/cpumask/cpumask_main
PASS: ../../test/common_plat/validation/api/crypto/crypto_main
PASS: ../../test/common_plat/validation/api/errno/errno_main
PASS: ../../test/common_plat/validation/api/hash/hash_main
PASS: ../../test/common_plat/validation/api/init/init_main_ok
PASS: ../../test/common_plat/validation/api/init/init_main_abort
PASS: ../../test/common_plat/validation/api/init/init_main_log
PASS: ../../test/common_plat/validation/api/lock/lock_main
PASS: ../../test/common_plat/validation/api/packet/packet_main
PASS: ../../test/common_plat/validation/api/pool/pool_main
PASS: ../../test/common_plat/validation/api/queue/queue_main
PASS: ../../test/common_plat/validation/api/random/random_main
FAIL: ../../test/common_plat/validation/api/scheduler/scheduler_main
PASS: ../../test/common_plat/validation/api/std_clib/std_clib_main
PASS: ../../test/common_plat/validation/api/thread/thread_main
PASS: ../../test/common_plat/validation/api/time/time_main
PASS: ../../test/common_plat/validation/api/timer/timer_main
PASS: ../../test/common_plat/validation/api/traffic_mngr/traffic_mngr_main
PASS: ../../test/common_plat/validation/api/shmem/shmem_main
PASS: ../../test/common_plat/validation/api/system/system_main
PASS: ../../test/common_plat/validation/drv/drvatomic/drvatomic_main
PASS: ../../test/common_plat/validation/drv/drvshmem/drvshmem_main
PASS: ring/ring_main
PASS: validation/api/pktio/pktio_run_pcap.sh
SKIP: mmap_vlan_ins/mmap_vlan_ins.sh

Testsuite summary for OpenDataPlane 1.11.0.0

# TOTAL: 32
# PASS:  29
# SKIP:  2
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

See test/linux-generic/test-suite.log
Please report to lng-odp@lists.linaro.org


On Thu, Nov 10, 2016 at 5:07 AM, Petri Savolainen <
petri.savolai...@nokia.com> wrote:

> Used the ring data structure to implement pool. Also
> buffer structure was simplified to enable future driver
> interface. Every buffer includes a packet header, so each
> buffer can be used as a packet head or segment. Segmentation
> was disabled and segment size was fixed to a large number
> (64kB) to limit the number of modifications in the commit.
>
> Signed-off-by: Petri Savolainen 
> ---
>  .../include/odp/api/plat/pool_types.h  |6 -
>  .../linux-generic/include/odp_buffer_inlines.h |  160 +--
>  .../linux-generic/include/odp_buffer_internal.h|  104 +-
>  .../include/odp_classification_datamodel.h |2 +-
>  .../linux-generic/include/odp_config_internal.h|   34 +-
>  .../linux-generic/include/odp_packet_internal.h|   13 +-
>  platform/linux-generic/include/odp_pool_internal.h |  270 +---
>  .../linux-generic/include/odp_timer_internal.h |4 -
>  platform/linux-generic/odp_buffer.c|8 -
>  platform/linux-generic/odp_classification.c|   25 +-
>  platform/linux-generic/odp_crypto.c  

[lng-odp] [Bug 2494] Extend odp_pktio_capability_t to include minimal pool size required by pktio implementation.

2016-11-10 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=2494

--- Comment #7 from Bala Manoharan  ---
Regards,
Bala


On 3 November 2016 at 19:36,   wrote:
> https://bugs.linaro.org/show_bug.cgi?id=2494
>
> --- Comment #6 from Maciej Czekaj  ---
> (In reply to Bala Manoharan from comment #5)
>> Application portability across multiple platforms cannot be done without
>> any modifications. However, I do not have an issue with adding the minimal
>> pool size as a capability in the capability struct, but point 2 of your
>> initial suggestion states that the pool size be given by the implementation
>> based on the buffer requirements; since the pools are actually created by
>> the implementation and the application gives the buffer count and pool type,
>> can't this be handled internally by the implementation without the
>> additional API?
>>
>> If we are in agreement here I can send a proposal to modify the capability
>> struct.
>
> Yes, application specifies and creates a pool, so user needs to manually
> account for increased pool consumption.
>
> Also, it is
> nb_rx_queues*(rx_queue_depth+alloc_queue_depth) + nb_tx_queues*tx_queue_depth.
>
> In case of ThunderX the formula is
>
> (nb_rx_queues+nb_tx_queues)*1024 + ceil(nb_rx_queues/8)*8192
>
>
> Now the challenge is that the buffer consumption has to be known before
> odp_pktio_open() since that call consumes the buffer pool... but there is no
> call to be made on pktio before odp_pktio_open() because this is a factory
> function for pktio objects. The best way would be to:
>
> 1.create pktio object without a pool
> 2.config the number of queues
> 3.ask about capabilities , including buffer needs
> 4.create the pool with application specified size + internal consumption size
> taken from capability
> 5.start the interface
>
> Adding the pool in the first call to pktio complicates the whole thing, as it
> is not possible to alter the pool size later and it is not possible to query
> the pktio pool consumption before odp_pktio_open().
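
For readers following the thread, the flow proposed above would look roughly
like the sketch below. Opening the pktio without a pool and a "min pool size"
field in the capability are the parts that do not exist today (parameter
declarations omitted); the other calls are current API:

    odp_pktio_t pktio;
    odp_pktio_capability_t capa;
    odp_pool_t pool;

    /* 1. create the pktio object without a pool (hypothetical today) */
    pktio = odp_pktio_open("eth0", ODP_POOL_INVALID, &pktio_param);
    /* 2. configure the number of RX/TX queues */
    odp_pktin_queue_config(pktio, &pktin_param);
    odp_pktout_queue_config(pktio, &pktout_param);
    /* 3. query capabilities, including internal buffer needs (proposed) */
    odp_pktio_capability(pktio, &capa);
    /* 4. size the pool: application packets + internal consumption */
    pool_param.pkt.num = app_num_pkts + capa.min_pool_size; /* proposed */
    pool = odp_pool_create("pkt_pool", &pool_param);
    /* 5. start the interface (the pool would have to be attached by now) */
    odp_pktio_start(pktio);
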

The pools are configured with "num", the number of packets of size "len" or
smaller which they have to support; so if a pool is configured to support
1024 packets of size 512K, then it can only support 512 packets of size
1024K. The pools are linked at the pktio level and not at the queue level;
the maximum number of packets which a pool has to support across all queues
in a pktio is "num".

Since the pktio capability query comes after odp_pktio_open(), couldn't
the maximum number of queues that can be supported be exposed based on
the pool configured with the pktio interface?
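
In ODP terms, the sizing above is (sketch only, other pool parameters
omitted):

    /* A pool guaranteeing 1024 packets of up to 512K each; the same
     * memory may only hold ~512 packets if they are 1024K each. */
    odp_pool_param_t params;
    odp_pool_t pool;

    odp_pool_param_init(&params);
    params.type    = ODP_POOL_PACKET;
    params.pkt.num = 1024;        /* "num": packets the pool must provide */
    params.pkt.len = 512 * 1024;  /* "len": size at which 'num' is guaranteed */
    pool = odp_pool_create("pktio_pool", &params);
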

>
>>
>> The real question I would like to understand is the expectation of this
>> minimal buffer size lets say if the application requires 1024
>> packets/buffers in a pktio interface and the minimal buffer size is 512 then
>> should be pool be configured with (1024 + 512) buffers and can the 512
>> buffer required by pktio interface be used for storing application packets?
>
> Yes, the buffers are shared between internal queues and the application,
> because they are really used for internal queuing of packets in HW, e.g. they
> contain:
>  -  packets waiting for reception in RX queue
>  -  packets waiting for transmission in TX queue
>  -  empty packets allocated for future reception
>
> If the driver is under heavy load, all those queues may be filled up and lead
> to buffer starvation that is difficult to recover from and it certainly limits
> the performance since lack of buffers leads to dropping traffic.

Packets being dropped because of buffer starvation is expected and
acceptable under heavy load, and usually this is handled using Random
Early Discard when the buffers in the pool reach a threshold limit.

>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

Re: [lng-odp] [Bug 2494] Extend odp_pktio_capability_t to include minimal pool size required by pktio implementation.

2016-11-10 Thread Bala Manoharan
Regards,
Bala


On 3 November 2016 at 19:36,   wrote:
> https://bugs.linaro.org/show_bug.cgi?id=2494
>
> --- Comment #6 from Maciej Czekaj  ---
> (In reply to Bala Manoharan from comment #5)
>> Application portability across multiple platforms cannot be done without
>> any modifications. However, I do not have an issue with adding the minimal
>> pool size as a capability in the capability struct, but point 2 of your
>> initial suggestion states that the pool size be given by the implementation
>> based on the buffer requirements; since the pools are actually created by
>> the implementation and the application gives the buffer count and pool type,
>> can't this be handled internally by the implementation without the
>> additional API?
>>
>> If we are in agreement here I can send a proposal to modify the capability
>> struct.
>
> Yes, application specifies and creates a pool, so user needs to manually
> account for increased pool consumption.
>
> Also, it is
> nb_rx_queues*(rx_queue_depth+alloc_queue_depth) + nb_tx_queues*tx_queue_depth.
>
> In case of ThunderX the formula is
>
> (nb_rx_queues+nb_tx_queues)*1024 + ceil(nb_rx_queues/8)*8192
>
>
> Now the challenge is that the buffer consumption has to be known before
> odp_pktio_open() since that call consumes the buffer pool... but there is no
> call to be made on pktio before odp_pktio_open() because this is a factory
> function for pktio objects. The best way would be to:
>
> 1.create pktio object without a pool
> 2.config the number of queues
> 3.ask about capabilities , including buffer needs
> 4.create the pool with application specified size + internal consumption size
> taken from capability
> 5.start the interface
>
> Adding the pool in the first call to pktio complicates the whole thing, as it
> is not possible to alter the pool size later and it is not possible to query
> the pktio pool consumption before odp_pktio_open().

The pools are configured with "num", the number of packets of size "len" or
smaller which they have to support; so if a pool is configured to support
1024 packets of size 512K, then it can only support 512 packets of size
1024K. The pools are linked at the pktio level and not at the queue level;
the maximum number of packets which a pool has to support across all queues
in a pktio is "num".

Since the pktio capability query comes after odp_pktio_open(), couldn't
the maximum number of queues that can be supported be exposed based on
the pool configured with the pktio interface?

>
>>
>> The real question I would like to understand is the expectation of this
>> minimal buffer size lets say if the application requires 1024
>> packets/buffers in a pktio interface and the minimal buffer size is 512 then
>> should be pool be configured with (1024 + 512) buffers and can the 512
>> buffer required by pktio interface be used for storing application packets?
>
> Yes, the buffers are shared between internal queues and the application,
> because they are really used for internal queuing of packets in HW, e.g. they
> contain:
>  -  packets waiting for reception in RX queue
>  -  packets waiting for transmission in TX queue
>  -  empty packets allocated for future reception
>
> If the driver is under heavy load, all those queues may be filled up and lead
> to buffer starvation that is difficult to recover from and it certainly limits
> the performance since lack of buffers leads to dropping traffic.

Packets being dropped because of buffer starvation is expected and
acceptable under heavy load, and usually this is handled using Random
Early Discard when the buffers in the pool reach a threshold limit.

>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.


Re: [lng-odp] [API-NEXT PATCH v3 17/19] api: packet: added limits for packet len on alloc

2016-11-10 Thread Maxim Uvarov

On 11/10/16 14:07, Petri Savolainen wrote:

There's no use case for application to allocate zero length
packets. Application should always have some knowledge about
the new packet data length before allocation. Also
implementations are more efficient when a check for zero length
is avoided.

Also added a pool parameter to specify the maximum packet length
to be allocated from the pool. Implementations may use this
information to optimize e.g. memory usage, etc. Application must
not exceed the max_len parameter value on alloc calls. Pool
capabilities already define max_len.

Signed-off-by: Petri Savolainen 
---
  include/odp/api/spec/packet.h | 9 +
  include/odp/api/spec/pool.h   | 6 ++
  2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h
index 4a14f2d..faf62e2 100644
--- a/include/odp/api/spec/packet.h
+++ b/include/odp/api/spec/packet.h
@@ -82,13 +82,14 @@ extern "C" {
   * Allocate a packet from a packet pool
   *
   * Allocates a packet of the requested length from the specified packet pool.
- * Pool must have been created with ODP_POOL_PACKET type. The
+ * The pool must have been created with ODP_POOL_PACKET type. The
   * packet is initialized with data pointers and lengths set according to the
   * specified len, and the default headroom and tailroom length settings. All
- * other packet metadata are set to their default values.
+ * other packet metadata are set to their default values. Packet length must
+ * be greater than zero and not exceed packet pool parameter 'max_len' value.
   *
   * @param pool  Pool handle
- * @param len   Packet data length
+ * @param len   Packet data length (1 ... pool max_len)
   *
   * @return Handle of allocated packet
   * @retval ODP_PACKET_INVALID  Packet could not be allocated
@@ -105,7 +106,7 @@ odp_packet_t odp_packet_alloc(odp_pool_t pool, uint32_t 
len);
   * packets from a pool.
   *
   * @param pool  Pool handle
- * @param len   Packet data length
+ * @param len   Packet data length (1 ... pool max_len)
   * @param[out] pkt  Array of packet handles for output
   * @param num   Maximum number of packets to allocate
   *
diff --git a/include/odp/api/spec/pool.h b/include/odp/api/spec/pool.h
index c80c98a..6c99e05 100644
--- a/include/odp/api/spec/pool.h
+++ b/include/odp/api/spec/pool.h
@@ -192,6 +192,12 @@ typedef struct odp_pool_param_t {
pkt.max_len. Use 0 for default. */
uint32_t len;
  
+			/** Maximum packet length that will be allocated from

+   the pool. The maximum value is defined by pool
+   capability pkt.max_len. Use 0 for default (the
+   pool maximum). */
+   uint32_t max_len;
+


This max_len is not used in the 2 patches below.

I do not see the difference between this new max_len and the old len:

/** Minimum packet length that the pool must provide
'num' packets. The number of packets may be less
than 'num' when packets are larger than 'len'.
The maximum value is defined by pool capability
pkt.max_len. Use 0 for default. */
uint32_t len;


Is that really needed?

Maxim.


/** Minimum number of packet data bytes that are stored
in the first segment of a packet. The maximum value
is defined by pool capability pkt.max_seg_len.




[lng-odp] [API-NEXT PATCH v3 05/19] linux-gen: pool: reimplement pool with ring

2016-11-10 Thread Petri Savolainen
Used the ring data structure to implement pool. Also
buffer structure was simplified to enable future driver
interface. Every buffer includes a packet header, so each
buffer can be used as a packet head or segment. Segmentation
was disabled and segment size was fixed to a large number
(64kB) to limit the number of modifications in the commit.

Signed-off-by: Petri Savolainen 
---
 .../include/odp/api/plat/pool_types.h  |6 -
 .../linux-generic/include/odp_buffer_inlines.h |  160 +--
 .../linux-generic/include/odp_buffer_internal.h|  104 +-
 .../include/odp_classification_datamodel.h |2 +-
 .../linux-generic/include/odp_config_internal.h|   34 +-
 .../linux-generic/include/odp_packet_internal.h|   13 +-
 platform/linux-generic/include/odp_pool_internal.h |  270 +---
 .../linux-generic/include/odp_timer_internal.h |4 -
 platform/linux-generic/odp_buffer.c|8 -
 platform/linux-generic/odp_classification.c|   25 +-
 platform/linux-generic/odp_crypto.c|4 +-
 platform/linux-generic/odp_packet.c|   99 +-
 platform/linux-generic/odp_pool.c  | 1441 
 platform/linux-generic/odp_timer.c |1 +
 platform/linux-generic/pktio/socket.c  |   16 +-
 platform/linux-generic/pktio/socket_mmap.c |   10 +-
 test/common_plat/performance/odp_pktio_perf.c  |2 +-
 test/common_plat/performance/odp_scheduling.c  |8 +-
 test/common_plat/validation/api/packet/packet.c|8 +-
 19 files changed, 746 insertions(+), 1469 deletions(-)

diff --git a/platform/linux-generic/include/odp/api/plat/pool_types.h 
b/platform/linux-generic/include/odp/api/plat/pool_types.h
index 1ca8f02..4e39de5 100644
--- a/platform/linux-generic/include/odp/api/plat/pool_types.h
+++ b/platform/linux-generic/include/odp/api/plat/pool_types.h
@@ -39,12 +39,6 @@ typedef enum odp_pool_type_t {
ODP_POOL_TIMEOUT = ODP_EVENT_TIMEOUT,
 } odp_pool_type_t;
 
-/** Get printable format of odp_pool_t */
-static inline uint64_t odp_pool_to_u64(odp_pool_t hdl)
-{
-   return _odp_pri(hdl);
-}
-
 /**
  * @}
  */
diff --git a/platform/linux-generic/include/odp_buffer_inlines.h 
b/platform/linux-generic/include/odp_buffer_inlines.h
index 2b1ab42..2f5eb88 100644
--- a/platform/linux-generic/include/odp_buffer_inlines.h
+++ b/platform/linux-generic/include/odp_buffer_inlines.h
@@ -18,43 +18,20 @@ extern "C" {
 #endif
 
 #include 
-#include 
 
-static inline odp_buffer_t odp_buffer_encode_handle(odp_buffer_hdr_t *hdr)
-{
-   odp_buffer_bits_t handle;
-   uint32_t pool_id = pool_handle_to_index(hdr->pool_hdl);
-   struct pool_entry_s *pool = get_pool_entry(pool_id);
+odp_event_type_t _odp_buffer_event_type(odp_buffer_t buf);
+void _odp_buffer_event_type_set(odp_buffer_t buf, int ev);
+int odp_buffer_snprint(char *str, uint32_t n, odp_buffer_t buf);
 
-   handle.handle = 0;
-   handle.pool_id = pool_id;
-   handle.index = ((uint8_t *)hdr - pool->pool_mdata_addr) /
-   ODP_CACHE_LINE_SIZE;
-   handle.seg = 0;
-
-   return handle.handle;
-}
+void *buffer_map(odp_buffer_hdr_t *buf, uint32_t offset, uint32_t *seglen,
+uint32_t limit);
 
 static inline odp_buffer_t odp_hdr_to_buf(odp_buffer_hdr_t *hdr)
 {
return hdr->handle.handle;
 }
 
-static inline odp_buffer_hdr_t *odp_buf_to_hdr(odp_buffer_t buf)
-{
-   odp_buffer_bits_t handle;
-   uint32_t pool_id;
-   uint32_t index;
-   struct pool_entry_s *pool;
-
-   handle.handle = buf;
-   pool_id   = handle.pool_id;
-   index = handle.index;
-   pool  = get_pool_entry(pool_id);
-
-   return (odp_buffer_hdr_t *)(void *)
-   (pool->pool_mdata_addr + (index * ODP_CACHE_LINE_SIZE));
-}
+odp_buffer_hdr_t *odp_buf_to_hdr(odp_buffer_t buf);
 
 static inline uint32_t pool_id_from_buf(odp_buffer_t buf)
 {
@@ -64,131 +41,6 @@ static inline uint32_t pool_id_from_buf(odp_buffer_t buf)
return handle.pool_id;
 }
 
-static inline odp_buffer_hdr_t *validate_buf(odp_buffer_t buf)
-{
-   odp_buffer_bits_t handle;
-   odp_buffer_hdr_t *buf_hdr;
-   handle.handle = buf;
-
-   /* For buffer handles, segment index must be 0 and pool id in range */
-   if (handle.seg != 0 || handle.pool_id >= ODP_CONFIG_POOLS)
-   return NULL;
-
-   pool_entry_t *pool =
-   odp_pool_to_entry(_odp_cast_scalar(odp_pool_t,
-  handle.pool_id));
-
-   /* If pool not created, handle is invalid */
-   if (pool->s.pool_shm == ODP_SHM_INVALID)
-   return NULL;
-
-   uint32_t buf_stride = pool->s.buf_stride / ODP_CACHE_LINE_SIZE;
-
-   /* A valid buffer index must be on stride, and must be in range */
-   if ((handle.index % buf_stride != 0) ||
-   ((uint32_t)(handle.index / 

[lng-odp] [API-NEXT PATCH v3 15/19] linux-gen: packet: added support for segmented packets

2016-11-10 Thread Petri Savolainen
Added support for multi-segmented packets. The first segment
is the packet descriptor, which contains all metadata and
pointers to the other segments.

Signed-off-by: Petri Savolainen 
---
 .../include/odp/api/plat/packet_types.h|   6 +-
 .../linux-generic/include/odp_buffer_inlines.h |  11 -
 .../linux-generic/include/odp_buffer_internal.h|  23 +-
 .../linux-generic/include/odp_config_internal.h|  39 +-
 .../linux-generic/include/odp_packet_internal.h|  80 +--
 platform/linux-generic/include/odp_pool_internal.h |   3 -
 platform/linux-generic/odp_buffer.c|   8 +-
 platform/linux-generic/odp_crypto.c|   8 +-
 platform/linux-generic/odp_packet.c| 712 +
 platform/linux-generic/odp_pool.c  | 123 ++--
 platform/linux-generic/pktio/netmap.c  |   4 +-
 platform/linux-generic/pktio/socket.c  |   3 +-
 12 files changed, 692 insertions(+), 328 deletions(-)
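
For illustration, a minimal stand-alone sketch of how the descriptor's
segment table can be walked to map a packet offset to a memory address.
This mirrors the seg[] array added below; the "example_" names and sizes
are assumptions, not part of the patch.

    #include <stdint.h>
    #include <stddef.h>

    #define EXAMPLE_MAX_SEGS 2

    typedef struct {
            struct {
                    uint8_t  *data;
                    uint32_t  len;
            } seg[EXAMPLE_MAX_SEGS];
            uint32_t segcount;
    } example_pkt_desc_t;

    static void *example_offset_to_ptr(example_pkt_desc_t *desc,
                                       uint32_t offset, uint32_t *seglen)
    {
            uint32_t i;

            for (i = 0; i < desc->segcount; i++) {
                    if (offset < desc->seg[i].len) {
                            *seglen = desc->seg[i].len - offset;
                            return desc->seg[i].data + offset;
                    }
                    offset -= desc->seg[i].len;
            }
            return NULL; /* offset beyond packet data */
    }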

diff --git a/platform/linux-generic/include/odp/api/plat/packet_types.h 
b/platform/linux-generic/include/odp/api/plat/packet_types.h
index b5345ed..864494d 100644
--- a/platform/linux-generic/include/odp/api/plat/packet_types.h
+++ b/platform/linux-generic/include/odp/api/plat/packet_types.h
@@ -32,9 +32,11 @@ typedef ODP_HANDLE_T(odp_packet_t);
 
 #define ODP_PACKET_OFFSET_INVALID (0x0fff)
 
-typedef ODP_HANDLE_T(odp_packet_seg_t);
+/* A packet segment handle stores a small index. Strong type handles are
+ * pointers, which would be wasteful in this case. */
+typedef uint8_t odp_packet_seg_t;
 
-#define ODP_PACKET_SEG_INVALID _odp_cast_scalar(odp_packet_seg_t, 0x)
+#define ODP_PACKET_SEG_INVALID ((odp_packet_seg_t)-1)
 
 /** odp_packet_color_t assigns names to the various pkt "colors" */
 typedef enum {
diff --git a/platform/linux-generic/include/odp_buffer_inlines.h 
b/platform/linux-generic/include/odp_buffer_inlines.h
index f8688f6..cf817d9 100644
--- a/platform/linux-generic/include/odp_buffer_inlines.h
+++ b/platform/linux-generic/include/odp_buffer_inlines.h
@@ -23,22 +23,11 @@ odp_event_type_t _odp_buffer_event_type(odp_buffer_t buf);
 void _odp_buffer_event_type_set(odp_buffer_t buf, int ev);
 int odp_buffer_snprint(char *str, uint32_t n, odp_buffer_t buf);
 
-void *buffer_map(odp_buffer_hdr_t *buf, uint32_t offset, uint32_t *seglen,
-uint32_t limit);
-
 static inline odp_buffer_t odp_hdr_to_buf(odp_buffer_hdr_t *hdr)
 {
return hdr->handle.handle;
 }
 
-static inline uint32_t pool_id_from_buf(odp_buffer_t buf)
-{
-   odp_buffer_bits_t handle;
-
-   handle.handle = buf;
-   return handle.pool_id;
-}
-
 #ifdef __cplusplus
 }
 #endif
diff --git a/platform/linux-generic/include/odp_buffer_internal.h 
b/platform/linux-generic/include/odp_buffer_internal.h
index 0ca13f8..4e75908 100644
--- a/platform/linux-generic/include/odp_buffer_internal.h
+++ b/platform/linux-generic/include/odp_buffer_internal.h
@@ -33,10 +33,6 @@ extern "C" {
 #include 
 #include 
 
-ODP_STATIC_ASSERT(ODP_CONFIG_PACKET_SEG_LEN_MIN >= 256,
- "ODP Segment size must be a minimum of 256 bytes");
-
-
 typedef union odp_buffer_bits_t {
odp_buffer_t handle;
 
@@ -65,6 +61,20 @@ struct odp_buffer_hdr_t {
int burst_first;
struct odp_buffer_hdr_t *burst[BUFFER_BURST_SIZE];
 
+   struct {
+   void *hdr;
+   uint8_t  *data;
+   uint32_t  len;
+   } seg[CONFIG_PACKET_MAX_SEGS];
+
+   /* max data size */
+   uint32_t size;
+
+   /* Initial buffer data pointer and length */
+   void *base_data;
+   uint32_t  base_len;
+   uint8_t  *buf_end;
+
union {
uint32_t all;
struct {
@@ -75,7 +85,6 @@ struct odp_buffer_hdr_t {
 
int8_t   type;   /* buffer type */
odp_event_type_t event_type; /* for reuse as event */
-   uint32_t size;   /* max data size */
odp_pool_t   pool_hdl;   /* buffer pool handle */
union {
uint64_t buf_u64;/* user u64 */
@@ -86,8 +95,6 @@ struct odp_buffer_hdr_t {
uint32_t uarea_size; /* size of user area */
uint32_t segcount;   /* segment count */
uint32_t segsize;/* segment size */
-   /* block addrs */
-   void*addr[ODP_CONFIG_PACKET_MAX_SEGS];
uint64_t order;  /* sequence for ordered queues */
queue_entry_t   *origin_qe;  /* ordered queue origin */
union {
@@ -105,8 +112,6 @@ struct odp_buffer_hdr_t {
 };
 
 /* Forward declarations */
-int seg_alloc_head(odp_buffer_hdr_t *buf_hdr, int segcount);
-void seg_free_head(odp_buffer_hdr_t *buf_hdr, int segcount);
 int seg_alloc_tail(odp_buffer_hdr_t *buf_hdr, int segcount);
 void 

[lng-odp] [API-NEXT PATCH v3 09/19] linux-gen: pool: clean up pool inlines functions

2016-11-10 Thread Petri Savolainen
Removed odp_pool_to_entry(), which was a duplicate of
pool_entry_from_hdl(). Renamed odp_buf_to_hdr() to
buf_hdl_to_hdr(), which describes the internal function more
accurately. Inlined pool_entry(), pool_entry_from_hdl() and
buf_hdl_to_hdr(), which are used often, also outside of
pool.c. Renamed odp_buffer_pool_headroom() and _tailroom() to
simply pool_headroom() and _tailroom(), since those are internal
functions (not API, as the previous names hinted). Also moved them
into pool.c, since inlining is not needed for functions that are
called only in the (netmap) init phase.

Signed-off-by: Petri Savolainen 
---
 .../linux-generic/include/odp_buffer_inlines.h |  2 -
 .../linux-generic/include/odp_packet_internal.h|  2 +-
 platform/linux-generic/include/odp_pool_internal.h | 36 ---
 platform/linux-generic/odp_buffer.c|  6 +--
 platform/linux-generic/odp_packet.c|  8 ++--
 platform/linux-generic/odp_packet_io.c |  2 +-
 platform/linux-generic/odp_pool.c  | 54 ++
 platform/linux-generic/odp_queue.c |  4 +-
 platform/linux-generic/odp_schedule_ordered.c  |  4 +-
 platform/linux-generic/odp_timer.c |  2 +-
 platform/linux-generic/pktio/loop.c|  2 +-
 platform/linux-generic/pktio/netmap.c  |  4 +-
 platform/linux-generic/pktio/socket_mmap.c |  2 +-
 13 files changed, 62 insertions(+), 66 deletions(-)
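
A short usage sketch of the renamed/inlined helpers (assuming the internal
headers changed below are in scope; this is just the calling pattern the
rename aims at, not new functionality):

    odp_buffer_hdr_t *hdr  = buf_hdl_to_hdr(buf);                /* handle -> header */
    pool_t           *pool = pool_entry_from_hdl(hdr->pool_hdl); /* handle -> pool   */
    uint32_t          room = pool_headroom(hdr->pool_hdl);       /* now a pool.c call */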

diff --git a/platform/linux-generic/include/odp_buffer_inlines.h 
b/platform/linux-generic/include/odp_buffer_inlines.h
index 2f5eb88..f8688f6 100644
--- a/platform/linux-generic/include/odp_buffer_inlines.h
+++ b/platform/linux-generic/include/odp_buffer_inlines.h
@@ -31,8 +31,6 @@ static inline odp_buffer_t odp_hdr_to_buf(odp_buffer_hdr_t 
*hdr)
return hdr->handle.handle;
 }
 
-odp_buffer_hdr_t *odp_buf_to_hdr(odp_buffer_t buf);
-
 static inline uint32_t pool_id_from_buf(odp_buffer_t buf)
 {
odp_buffer_bits_t handle;
diff --git a/platform/linux-generic/include/odp_packet_internal.h 
b/platform/linux-generic/include/odp_packet_internal.h
index 2cad71f..0cdd5ca 100644
--- a/platform/linux-generic/include/odp_packet_internal.h
+++ b/platform/linux-generic/include/odp_packet_internal.h
@@ -199,7 +199,7 @@ typedef struct {
  */
 static inline odp_packet_hdr_t *odp_packet_hdr(odp_packet_t pkt)
 {
-   return (odp_packet_hdr_t *)odp_buf_to_hdr((odp_buffer_t)pkt);
+   return (odp_packet_hdr_t *)buf_hdl_to_hdr((odp_buffer_t)pkt);
 }
 
 static inline void copy_packet_parser_metadata(odp_packet_hdr_t *src_hdr,
diff --git a/platform/linux-generic/include/odp_pool_internal.h 
b/platform/linux-generic/include/odp_pool_internal.h
index 278c553..f7c315c 100644
--- a/platform/linux-generic/include/odp_pool_internal.h
+++ b/platform/linux-generic/include/odp_pool_internal.h
@@ -73,23 +73,45 @@ typedef struct pool_t {
 
 } pool_t;
 
-pool_t *pool_entry(uint32_t pool_idx);
+typedef struct pool_table_t {
+   pool_tpool[ODP_CONFIG_POOLS];
+   odp_shm_t shm;
+} pool_table_t;
 
-static inline pool_t *odp_pool_to_entry(odp_pool_t pool_hdl)
+extern pool_table_t *pool_tbl;
+
+static inline pool_t *pool_entry(uint32_t pool_idx)
 {
-   return pool_entry(_odp_typeval(pool_hdl));
+   return &pool_tbl->pool[pool_idx];
 }
 
-static inline uint32_t odp_buffer_pool_headroom(odp_pool_t pool)
+static inline pool_t *pool_entry_from_hdl(odp_pool_t pool_hdl)
 {
-   return odp_pool_to_entry(pool)->headroom;
+   return &pool_tbl->pool[_odp_typeval(pool_hdl)];
 }
 
-static inline uint32_t odp_buffer_pool_tailroom(odp_pool_t pool)
+static inline odp_buffer_hdr_t *buf_hdl_to_hdr(odp_buffer_t buf)
 {
-   return odp_pool_to_entry(pool)->tailroom;
+   odp_buffer_bits_t handle;
+   uint32_t pool_id, index, block_offset;
+   pool_t *pool;
+   odp_buffer_hdr_t *buf_hdr;
+
+   handle.handle = buf;
+   pool_id   = handle.pool_id;
+   index = handle.index;
+   pool  = pool_entry(pool_id);
+   block_offset  = index * pool->block_size;
+
+   /* clang requires cast to uintptr_t */
+   buf_hdr = (odp_buffer_hdr_t *)(uintptr_t)&pool->base_addr[block_offset];
+
+   return buf_hdr;
 }
 
+uint32_t pool_headroom(odp_pool_t pool);
+uint32_t pool_tailroom(odp_pool_t pool);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/platform/linux-generic/odp_buffer.c 
b/platform/linux-generic/odp_buffer.c
index 0ddaf95..eed15c0 100644
--- a/platform/linux-generic/odp_buffer.c
+++ b/platform/linux-generic/odp_buffer.c
@@ -26,14 +26,14 @@ odp_event_t odp_buffer_to_event(odp_buffer_t buf)
 
 void *odp_buffer_addr(odp_buffer_t buf)
 {
-   odp_buffer_hdr_t *hdr = odp_buf_to_hdr(buf);
+   odp_buffer_hdr_t *hdr = buf_hdl_to_hdr(buf);
 
return hdr->addr[0];
 }
 
 uint32_t odp_buffer_size(odp_buffer_t buf)
 {
-   odp_buffer_hdr_t *hdr = odp_buf_to_hdr(buf);
+   odp_buffer_hdr_t 

[lng-odp] [API-NEXT PATCH v3 03/19] linux-gen: ring: created common ring implementation

2016-11-10 Thread Petri Savolainen
Moved scheduler ring code into a new header file, so that
it can be used also in other parts of the implementation.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/Makefile.am |   1 +
 platform/linux-generic/include/odp_ring_internal.h | 111 +
 platform/linux-generic/odp_schedule.c  | 102 ++-
 3 files changed, 120 insertions(+), 94 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_ring_internal.h
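
A minimal usage sketch of the new ring (MY_RING_SIZE, MY_RING_MASK and
my_ring are illustrative assumptions; real users embed ring_t inside their
own structures, as the scheduler does):

    #define MY_RING_SIZE 1024                 /* must be a power of two */
    #define MY_RING_MASK (MY_RING_SIZE - 1)

    static struct {
            ring_t   hdr;
            uint32_t data[MY_RING_SIZE];
    } my_ring ODP_ALIGNED_CACHE;

    static void my_ring_example(void)
    {
            uint32_t v;

            ring_init(&my_ring.hdr);
            ring_enq(&my_ring.hdr, MY_RING_MASK, 7);           /* store one value */
            v = ring_deq(&my_ring.hdr, MY_RING_MASK);          /* RING_EMPTY if none */
            (void)v;
    }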

diff --git a/platform/linux-generic/Makefile.am 
b/platform/linux-generic/Makefile.am
index 19dc0ba..b60eacb 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -151,6 +151,7 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_pool_internal.h \
  ${srcdir}/include/odp_posix_extensions.h \
  ${srcdir}/include/odp_queue_internal.h \
+ ${srcdir}/include/odp_ring_internal.h \
  ${srcdir}/include/odp_schedule_if.h \
  ${srcdir}/include/odp_schedule_internal.h \
  ${srcdir}/include/odp_schedule_ordered_internal.h \
diff --git a/platform/linux-generic/include/odp_ring_internal.h 
b/platform/linux-generic/include/odp_ring_internal.h
new file mode 100644
index 000..6a6291a
--- /dev/null
+++ b/platform/linux-generic/include/odp_ring_internal.h
@@ -0,0 +1,111 @@
+/* Copyright (c) 2016, Linaro Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef ODP_RING_INTERNAL_H_
+#define ODP_RING_INTERNAL_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include 
+#include 
+#include 
+
+/* Ring empty, not a valid data value. */
+#define RING_EMPTY ((uint32_t)-1)
+
+/* Ring of uint32_t data
+ *
+ * Ring stores head and tail counters. Ring indexes are formed from these
+ * counters with a mask (mask = ring_size - 1), which requires that ring size
+ * must be a power of two. Also ring size must be larger than the maximum
+ * number of data items that will be stored on it (there's no check against
+ * overwriting). */
+typedef struct {
+   /* Writer head and tail */
+   odp_atomic_u32_t w_head;
+   odp_atomic_u32_t w_tail;
+   uint8_t pad[ODP_CACHE_LINE_SIZE - (2 * sizeof(odp_atomic_u32_t))];
+
+   /* Reader head and tail */
+   odp_atomic_u32_t r_head;
+   odp_atomic_u32_t r_tail;
+
+   uint32_t data[0];
+} ring_t ODP_ALIGNED_CACHE;
+
+/* Initialize ring */
+static inline void ring_init(ring_t *ring)
+{
+   odp_atomic_init_u32(&ring->w_head, 0);
+   odp_atomic_init_u32(&ring->w_tail, 0);
+   odp_atomic_init_u32(&ring->r_head, 0);
+   odp_atomic_init_u32(&ring->r_tail, 0);
+}
+
+/* Dequeue data from the ring head */
+static inline uint32_t ring_deq(ring_t *ring, uint32_t mask)
+{
+   uint32_t head, tail, new_head;
+   uint32_t data;
+
+   head = odp_atomic_load_u32(&ring->r_head);
+
+   /* Move reader head. This thread owns data at the new head. */
+   do {
+   tail = odp_atomic_load_u32(&ring->w_tail);
+
+   if (head == tail)
+   return RING_EMPTY;
+
+   new_head = head + 1;
+
+   } while (odp_unlikely(odp_atomic_cas_acq_u32(&ring->r_head, &head,
+ new_head) == 0));
+
+   /* Read queue index */
+   data = ring->data[new_head & mask];
+
+   /* Wait until other readers have updated the tail */
+   while (odp_unlikely(odp_atomic_load_acq_u32(&ring->r_tail) != head))
+   odp_cpu_pause();
+
+   /* Now update the reader tail */
+   odp_atomic_store_rel_u32(&ring->r_tail, new_head);
+
+   return data;
+}
+
+/* Enqueue data into the ring tail */
+static inline void ring_enq(ring_t *ring, uint32_t mask, uint32_t data)
+{
+   uint32_t old_head, new_head;
+
+   /* Reserve a slot in the ring for writing */
+   old_head = odp_atomic_fetch_inc_u32(&ring->w_head);
+   new_head = old_head + 1;
+
+   /* Ring is full. Wait for the last reader to finish. */
+   while (odp_unlikely(odp_atomic_load_acq_u32(&ring->r_tail) == new_head))
+   odp_cpu_pause();
+
+   /* Write data */
+   ring->data[new_head & mask] = data;
+
+   /* Wait until other writers have updated the tail */
+   while (odp_unlikely(odp_atomic_load_acq_u32(&ring->w_tail) != old_head))
+   odp_cpu_pause();
+
+   /* Now update the writer tail */
+   odp_atomic_store_rel_u32(&ring->w_tail, new_head);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/platform/linux-generic/odp_schedule.c 
b/platform/linux-generic/odp_schedule.c
index 81e79c9..86c98fe 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -17,12 +17,12 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 /* Number of priority levels  */
 #define NUM_PRIO 8
@@ -82,9 +82,6 @@ ODP_STATIC_ASSERT((ODP_SCHED_PRIO_NORMAL > 0) &&
 

[lng-odp] [API-NEXT PATCH v3 08/19] linux-gen: pool: optimize buffer alloc

2016-11-10 Thread Petri Savolainen
Round up global pool allocations to a burst size. Cache any
extra buffers for future use. Prefetch the headers of buffers
that were just allocated from the global pool and will be
returned to the caller.

Signed-off-by: Petri Savolainen 
---
 .../linux-generic/include/odp_buffer_internal.h|  3 +-
 platform/linux-generic/odp_packet.c| 16 +++--
 platform/linux-generic/odp_pool.c  | 74 --
 3 files changed, 68 insertions(+), 25 deletions(-)
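
The policy above as a simplified stand-alone sketch (the real code below
works on buffer handles and header pointers and also drains an existing
partial cache; all names and sizes here are illustrative assumptions):

    #include <stdint.h>

    #define EX_CACHE_SIZE  256
    #define EX_CACHE_BURST 16

    typedef struct {
            uint32_t num;
            uint32_t buf[EX_CACHE_SIZE];
    } ex_cache_t;

    /* global_deq() stands in for dequeuing up to 'n' items from the pool ring */
    static int ex_alloc(ex_cache_t *cache, uint32_t out[], int num,
                        int (*global_deq)(uint32_t data[], int n))
    {
            int i, got, extra;
            uint32_t tmp[num + EX_CACHE_BURST];

            if ((int)cache->num >= num) {           /* fast path: all from cache */
                    for (i = 0; i < num; i++)
                            out[i] = cache->buf[--cache->num];
                    return num;
            }

            /* Round the global request up to a burst, keep the surplus cached */
            got   = global_deq(tmp, num + EX_CACHE_BURST);
            extra = got > num ? got - num : 0;

            for (i = 0; i < got - extra; i++)
                    out[i] = tmp[i];
            for (i = 0; i < extra; i++)
                    cache->buf[cache->num++] = tmp[got - extra + i];

            return got - extra;
    }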

diff --git a/platform/linux-generic/include/odp_buffer_internal.h 
b/platform/linux-generic/include/odp_buffer_internal.h
index abe8591..64ba221 100644
--- a/platform/linux-generic/include/odp_buffer_internal.h
+++ b/platform/linux-generic/include/odp_buffer_internal.h
@@ -105,7 +105,8 @@ struct odp_buffer_hdr_t {
 };
 
 /* Forward declarations */
-int buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t buf[], int num);
+int buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t buf[],
+  odp_buffer_hdr_t *buf_hdr[], int num);
 void buffer_free_multi(const odp_buffer_t buf[], int num_free);
 
 int seg_alloc_head(odp_buffer_hdr_t *buf_hdr, int segcount);
diff --git a/platform/linux-generic/odp_packet.c 
b/platform/linux-generic/odp_packet.c
index 6df1c5b..6565a5d 100644
--- a/platform/linux-generic/odp_packet.c
+++ b/platform/linux-generic/odp_packet.c
@@ -80,14 +80,16 @@ static void packet_init(pool_t *pool, odp_packet_hdr_t 
*pkt_hdr,
 int packet_alloc_multi(odp_pool_t pool_hdl, uint32_t len,
   odp_packet_t pkt[], int max_num)
 {
-   odp_packet_hdr_t *pkt_hdr;
pool_t *pool = odp_pool_to_entry(pool_hdl);
int num, i;
+   odp_packet_hdr_t *pkt_hdrs[max_num];
 
-   num = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)pkt, max_num);
+   num = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)pkt,
+(odp_buffer_hdr_t **)pkt_hdrs, max_num);
 
for (i = 0; i < num; i++) {
-   pkt_hdr = odp_packet_hdr(pkt[i]);
+   odp_packet_hdr_t *pkt_hdr = pkt_hdrs[i];
+
packet_init(pool, pkt_hdr, len, 1 /* do parse */);
 
if (pkt_hdr->tailroom >= pkt_hdr->buf_hdr.segsize)
@@ -113,7 +115,7 @@ odp_packet_t odp_packet_alloc(odp_pool_t pool_hdl, uint32_t 
len)
if (odp_unlikely(len > pool->max_len))
return ODP_PACKET_INVALID;
 
-   ret = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)&pkt, 1);
+   ret = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)&pkt, NULL, 1);
if (ret != 1)
return ODP_PACKET_INVALID;
 
@@ -134,6 +136,7 @@ int odp_packet_alloc_multi(odp_pool_t pool_hdl, uint32_t 
len,
pool_t *pool = odp_pool_to_entry(pool_hdl);
size_t pkt_size = len ? len : pool->data_size;
int count, i;
+   odp_packet_hdr_t *pkt_hdrs[num];
 
if (odp_unlikely(pool->params.type != ODP_POOL_PACKET)) {
__odp_errno = EINVAL;
@@ -143,10 +146,11 @@ int odp_packet_alloc_multi(odp_pool_t pool_hdl, uint32_t 
len,
if (odp_unlikely(len > pool->max_len))
return -1;
 
-   count = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)pkt, num);
+   count = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)pkt,
+  (odp_buffer_hdr_t **)pkt_hdrs, num);
 
for (i = 0; i < count; ++i) {
-   odp_packet_hdr_t *pkt_hdr = odp_packet_hdr(pkt[i]);
+   odp_packet_hdr_t *pkt_hdr = pkt_hdrs[i];
 
packet_init(pool, pkt_hdr, pkt_size, 0 /* do not parse */);
if (len == 0)
diff --git a/platform/linux-generic/odp_pool.c 
b/platform/linux-generic/odp_pool.c
index a2e5d54..7dc0938 100644
--- a/platform/linux-generic/odp_pool.c
+++ b/platform/linux-generic/odp_pool.c
@@ -562,14 +562,14 @@ int odp_pool_info(odp_pool_t pool_hdl, odp_pool_info_t 
*info)
return 0;
 }
 
-int buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t buf[], int max_num)
+int buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t buf[],
+  odp_buffer_hdr_t *buf_hdr[], int max_num)
 {
pool_t *pool;
ring_t *ring;
-   uint32_t mask;
-   int i;
+   uint32_t mask, i;
pool_cache_t *cache;
-   uint32_t cache_num;
+   uint32_t cache_num, num_ch, num_deq, burst;
 
pool  = pool_entry_from_hdl(pool_hdl);
ring  = &pool->ring.hdr;
@@ -577,28 +577,66 @@ int buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t 
buf[], int max_num)
cache = local.cache[_odp_typeval(pool_hdl)];
 
cache_num = cache->num;
+   num_ch= max_num;
+   num_deq   = 0;
+   burst = CACHE_BURST;
 
-   if (odp_likely((int)cache_num >= max_num)) {
-   for (i = 0; i < max_num; i++)
-   buf[i] = cache->buf[cache_num - max_num + i];
+   if (odp_unlikely(cache_num < (uint32_t)max_num)) {
+   /* Cache does not have enough buffers 

[lng-odp] [API-NEXT PATCH v3 11/19] test: validation: buf: test alignment

2016-11-10 Thread Petri Savolainen
Added checks for correct alignment. Also updated tests to call
odp_pool_param_init() for parameter initialization.

Signed-off-by: Petri Savolainen 
---
 test/common_plat/validation/api/buffer/buffer.c | 113 +---
 1 file changed, 63 insertions(+), 50 deletions(-)

diff --git a/test/common_plat/validation/api/buffer/buffer.c 
b/test/common_plat/validation/api/buffer/buffer.c
index d26d5e8..7c723d4 100644
--- a/test/common_plat/validation/api/buffer/buffer.c
+++ b/test/common_plat/validation/api/buffer/buffer.c
@@ -8,20 +8,21 @@
 #include "odp_cunit_common.h"
 #include "buffer.h"
 
+#define BUF_ALIGN  ODP_CACHE_LINE_SIZE
+#define BUF_SIZE   1500
+
 static odp_pool_t raw_pool;
 static odp_buffer_t raw_buffer = ODP_BUFFER_INVALID;
-static const size_t raw_buffer_size = 1500;
 
 int buffer_suite_init(void)
 {
-   odp_pool_param_t params = {
-   .buf = {
-   .size  = raw_buffer_size,
-   .align = ODP_CACHE_LINE_SIZE,
-   .num   = 100,
-   },
-   .type  = ODP_POOL_BUFFER,
-   };
+   odp_pool_param_t params;
+
+   odp_pool_param_init(&params);
+   params.type  = ODP_POOL_BUFFER;
+   params.buf.size  = BUF_SIZE;
+   params.buf.align = BUF_ALIGN;
+   params.buf.num   = 100;
 
raw_pool = odp_pool_create("raw_pool", &params);
if (raw_pool == ODP_POOL_INVALID)
@@ -44,25 +45,25 @@ void buffer_test_pool_alloc(void)
 {
odp_pool_t pool;
const int num = 3;
-   const size_t size = 1500;
odp_buffer_t buffer[num];
odp_event_t ev;
int index;
-   char wrong_type = 0, wrong_size = 0;
-   odp_pool_param_t params = {
-   .buf = {
-   .size  = size,
-   .align = ODP_CACHE_LINE_SIZE,
-   .num   = num,
-   },
-   .type  = ODP_POOL_BUFFER,
-   };
+   char wrong_type = 0, wrong_size = 0, wrong_align = 0;
+   odp_pool_param_t params;
+
+   odp_pool_param_init(&params);
+   params.type  = ODP_POOL_BUFFER;
+   params.buf.size  = BUF_SIZE;
+   params.buf.align = BUF_ALIGN;
+   params.buf.num   = num;
 
pool = odp_pool_create("buffer_pool_alloc", );
odp_pool_print(pool);
 
/* Try to allocate num items from the pool */
for (index = 0; index < num; index++) {
+   uintptr_t addr;
+
buffer[index] = odp_buffer_alloc(pool);
 
if (buffer[index] == ODP_BUFFER_INVALID)
@@ -71,9 +72,15 @@ void buffer_test_pool_alloc(void)
ev = odp_buffer_to_event(buffer[index]);
if (odp_event_type(ev) != ODP_EVENT_BUFFER)
wrong_type = 1;
-   if (odp_buffer_size(buffer[index]) < size)
+   if (odp_buffer_size(buffer[index]) < BUF_SIZE)
wrong_size = 1;
-   if (wrong_type || wrong_size)
+
+   addr = (uintptr_t)odp_buffer_addr(buffer[index]);
+
+   if ((addr % BUF_ALIGN) != 0)
+   wrong_align = 1;
+
+   if (wrong_type || wrong_size || wrong_align)
odp_buffer_print(buffer[index]);
}
 
@@ -85,6 +92,7 @@ void buffer_test_pool_alloc(void)
/* Check that the pool had correct buffers */
CU_ASSERT(wrong_type == 0);
CU_ASSERT(wrong_size == 0);
+   CU_ASSERT(wrong_align == 0);
 
for (; index >= 0; index--)
odp_buffer_free(buffer[index]);
@@ -112,19 +120,17 @@ void buffer_test_pool_alloc_multi(void)
 {
odp_pool_t pool;
const int num = 3;
-   const size_t size = 1500;
odp_buffer_t buffer[num + 1];
odp_event_t ev;
int index;
-   char wrong_type = 0, wrong_size = 0;
-   odp_pool_param_t params = {
-   .buf = {
-   .size  = size,
-   .align = ODP_CACHE_LINE_SIZE,
-   .num   = num,
-   },
-   .type  = ODP_POOL_BUFFER,
-   };
+   char wrong_type = 0, wrong_size = 0, wrong_align = 0;
+   odp_pool_param_t params;
+
+   odp_pool_param_init(&params);
+   params.type  = ODP_POOL_BUFFER;
+   params.buf.size  = BUF_SIZE;
+   params.buf.align = BUF_ALIGN;
+   params.buf.num   = num;
 
pool = odp_pool_create("buffer_pool_alloc_multi", );
odp_pool_print(pool);
@@ -133,15 +139,23 @@ void buffer_test_pool_alloc_multi(void)
CU_ASSERT_FATAL(buffer_alloc_multi(pool, buffer, num + 1) == num);
 
for (index = 0; index < num; index++) {
+   uintptr_t addr;
+
if (buffer[index] == ODP_BUFFER_INVALID)
break;
 
ev = 

[lng-odp] [API-NEXT PATCH v3 10/19] linux-gen: pool: ptr instead of hdl in buffer_alloc_multi

2016-11-10 Thread Petri Savolainen
Improve performance by changing the first parameter of
buffer_alloc_multi() to a pool pointer (from a handle), to avoid
a double lookup of the pool pointer. The pointer is already
available in the packet alloc calls.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/include/odp_buffer_internal.h |  4 
 platform/linux-generic/include/odp_pool_internal.h   |  4 
 platform/linux-generic/odp_packet.c  |  6 +++---
 platform/linux-generic/odp_pool.c| 16 ++--
 4 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/platform/linux-generic/include/odp_buffer_internal.h 
b/platform/linux-generic/include/odp_buffer_internal.h
index 64ba221..0ca13f8 100644
--- a/platform/linux-generic/include/odp_buffer_internal.h
+++ b/platform/linux-generic/include/odp_buffer_internal.h
@@ -105,10 +105,6 @@ struct odp_buffer_hdr_t {
 };
 
 /* Forward declarations */
-int buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t buf[],
-  odp_buffer_hdr_t *buf_hdr[], int num);
-void buffer_free_multi(const odp_buffer_t buf[], int num_free);
-
 int seg_alloc_head(odp_buffer_hdr_t *buf_hdr, int segcount);
 void seg_free_head(odp_buffer_hdr_t *buf_hdr, int segcount);
 int seg_alloc_tail(odp_buffer_hdr_t *buf_hdr, int segcount);
diff --git a/platform/linux-generic/include/odp_pool_internal.h 
b/platform/linux-generic/include/odp_pool_internal.h
index f7c315c..f7e951a 100644
--- a/platform/linux-generic/include/odp_pool_internal.h
+++ b/platform/linux-generic/include/odp_pool_internal.h
@@ -109,6 +109,10 @@ static inline odp_buffer_hdr_t 
*buf_hdl_to_hdr(odp_buffer_t buf)
return buf_hdr;
 }
 
+int buffer_alloc_multi(pool_t *pool, odp_buffer_t buf[],
+  odp_buffer_hdr_t *buf_hdr[], int num);
+void buffer_free_multi(const odp_buffer_t buf[], int num_free);
+
 uint32_t pool_headroom(odp_pool_t pool);
 uint32_t pool_tailroom(odp_pool_t pool);
 
diff --git a/platform/linux-generic/odp_packet.c 
b/platform/linux-generic/odp_packet.c
index c44f687..2eee775 100644
--- a/platform/linux-generic/odp_packet.c
+++ b/platform/linux-generic/odp_packet.c
@@ -84,7 +84,7 @@ int packet_alloc_multi(odp_pool_t pool_hdl, uint32_t len,
int num, i;
odp_packet_hdr_t *pkt_hdrs[max_num];
 
-   num = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)pkt,
+   num = buffer_alloc_multi(pool, (odp_buffer_t *)pkt,
 (odp_buffer_hdr_t **)pkt_hdrs, max_num);
 
for (i = 0; i < num; i++) {
@@ -115,7 +115,7 @@ odp_packet_t odp_packet_alloc(odp_pool_t pool_hdl, uint32_t 
len)
if (odp_unlikely(len > pool->max_len))
return ODP_PACKET_INVALID;
 
-   ret = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)&pkt, NULL, 1);
+   ret = buffer_alloc_multi(pool, (odp_buffer_t *)&pkt, NULL, 1);
if (ret != 1)
return ODP_PACKET_INVALID;
 
@@ -146,7 +146,7 @@ int odp_packet_alloc_multi(odp_pool_t pool_hdl, uint32_t 
len,
if (odp_unlikely(len > pool->max_len))
return -1;
 
-   count = buffer_alloc_multi(pool_hdl, (odp_buffer_t *)pkt,
+   count = buffer_alloc_multi(pool, (odp_buffer_t *)pkt,
   (odp_buffer_hdr_t **)pkt_hdrs, num);
 
for (i = 0; i < count; ++i) {
diff --git a/platform/linux-generic/odp_pool.c 
b/platform/linux-generic/odp_pool.c
index faea2fc..364df97 100644
--- a/platform/linux-generic/odp_pool.c
+++ b/platform/linux-generic/odp_pool.c
@@ -528,19 +528,17 @@ int odp_pool_info(odp_pool_t pool_hdl, odp_pool_info_t 
*info)
return 0;
 }
 
-int buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t buf[],
+int buffer_alloc_multi(pool_t *pool, odp_buffer_t buf[],
   odp_buffer_hdr_t *buf_hdr[], int max_num)
 {
-   pool_t *pool;
ring_t *ring;
uint32_t mask, i;
pool_cache_t *cache;
uint32_t cache_num, num_ch, num_deq, burst;
 
-   pool  = pool_entry_from_hdl(pool_hdl);
ring  = &pool->ring.hdr;
mask  = pool->ring_mask;
-   cache = local.cache[_odp_typeval(pool_hdl)];
+   cache = local.cache[pool->pool_idx];
 
cache_num = cache->num;
num_ch= max_num;
@@ -696,9 +694,11 @@ void buffer_free_multi(const odp_buffer_t buf[], int 
num_total)
 odp_buffer_t odp_buffer_alloc(odp_pool_t pool_hdl)
 {
odp_buffer_t buf;
+   pool_t *pool;
int ret;
 
-   ret = buffer_alloc_multi(pool_hdl, &buf, NULL, 1);
+   pool = pool_entry_from_hdl(pool_hdl);
+   ret = buffer_alloc_multi(pool, &buf, NULL, 1);
 
if (odp_likely(ret == 1))
return buf;
@@ -708,7 +708,11 @@ odp_buffer_t odp_buffer_alloc(odp_pool_t pool_hdl)
 
 int odp_buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t buf[], int num)
 {
-   return buffer_alloc_multi(pool_hdl, buf, NULL, num);
+   pool_t *pool;
+
+   pool = pool_entry_from_hdl(pool_hdl);
+
+   return 

[lng-odp] [API-NEXT PATCH v3 07/19] linux-gen: pool: use ring multi enq and deq operations

2016-11-10 Thread Petri Savolainen
Use multi enq and deq operations to optimize global pool
access performance. Temporary uint32_t arrays are needed
since handles are pointer size variables.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/odp_pool.c | 32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/platform/linux-generic/odp_pool.c 
b/platform/linux-generic/odp_pool.c
index 1286753..a2e5d54 100644
--- a/platform/linux-generic/odp_pool.c
+++ b/platform/linux-generic/odp_pool.c
@@ -586,15 +586,16 @@ int buffer_alloc_multi(odp_pool_t pool_hdl, odp_buffer_t 
buf[], int max_num)
return max_num;
}
 
-   for (i = 0; i < max_num; i++) {
-   uint32_t data;
+   {
+   /* Temporary copy needed since odp_buffer_t is uintptr_t
+* and not uint32_t. */
+   int num;
+   uint32_t data[max_num];
 
-   data = ring_deq(ring, mask);
+   num = ring_deq_multi(ring, mask, data, max_num);
 
-   if (data == RING_EMPTY)
-   break;
-
-   buf[i] = (odp_buffer_t)(uintptr_t)data;
+   for (i = 0; i < num; i++)
+   buf[i] = (odp_buffer_t)(uintptr_t)data[i];
}
 
return i;
@@ -629,17 +630,24 @@ static inline void buffer_free_to_pool(uint32_t pool_id,
cache_num = cache->num;
 
if (odp_unlikely((int)(CONFIG_POOL_CACHE_SIZE - cache_num) < num)) {
+   uint32_t index;
int burst = CACHE_BURST;
 
if (odp_unlikely(num > CACHE_BURST))
burst = num;
 
-   for (i = 0; i < burst; i++) {
-   uint32_t data, index;
+   {
+   /* Temporary copy needed since odp_buffer_t is
+* uintptr_t and not uint32_t. */
+   uint32_t data[burst];
+
+   index = cache_num - burst;
+
+   for (i = 0; i < burst; i++)
+   data[i] = (uint32_t)
+ (uintptr_t)cache->buf[index + i];
 
-   index = cache_num - burst + i;
-   data  = (uint32_t)(uintptr_t)cache->buf[index];
-   ring_enq(ring, mask, data);
+   ring_enq_multi(ring, mask, data, burst);
}
 
cache_num -= burst;
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 00/19] pool optimization

2016-11-10 Thread Petri Savolainen
Pool performance is optimized by using a ring as the global buffer storage. 
The IPC build is disabled, since it needs large modifications due to its 
dependency on pool internals. The old pool implementation was based on locks 
and a linked list of buffer headers. The new implementation maintains a ring 
of buffer handles, which enables fast, burst-based allocs and frees. Also, a 
ring scales better with the number of cpus than a list (enq and deq operations 
update opposite ends of the pool). 

L2fwd link rate (%), 2 x 40GE, 64 byte packets

                direct-                 parallel-               atomic-
cpus    orig    direct  diff    orig    parall  diff    orig    atomic  diff
1   7 % 8 % 1 % 6 % 6 % 2 % 5.4 %   5.6 %   4 %
2   14 %15 %7 % 9 % 9 % 5 % 8 % 9 % 8 %
4   28 %30 %6 % 13 %14 %13 %12 %15 %19 %
6   42 %44 %6 % 16 %19 %19 %8 % 20 %150 %
8   46 %59 %28 %19 %23 %26 %18 %24 %34 %
10  55 %57 %3 % 20 %27 %37 %8 % 28 %264 %
12  56 %56 %-1 %22 %31 %43 %7 % 32 %357 %

Max packet rate of the NICs is reached with 10-12 cpus in direct mode. Otherwise, 
all cases were improved. Scheduler driven cases especially suffered from the bad 
scalability of the old pool.

changed in v3:
* rebased
* ipc disabled with #ifdef
* added support for multi-segment packets
* API: added explicit limits for packet length in alloc calls
* Corrected validation test and example application bugs found during
  segmentation implementation

changed in v2:
* rebased to api-next branch
* added a comment that ring size must be larger than number of items in it
* fixed clang build issue
* added parens in align macro

v1 reviews:
Reviewed-by: Brian Brooks 



Petri Savolainen (19):
  linux-gen: ipc: disable build of ipc pktio
  linux-gen: pktio: do not free zero packets
  linux-gen: ring: created common ring implementation
  linux-gen: align: added round up power of two
  linux-gen: pool: reimplement pool with ring
  linux-gen: ring: added multi enq and deq
  linux-gen: pool: use ring multi enq and deq operations
  linux-gen: pool: optimize buffer alloc
  linux-gen: pool: clean up pool inlines functions
  linux-gen: pool: ptr instead of hdl in buffer_alloc_multi
  test: validation: buf: test alignment
  test: performance: crypto: use capability to select max packet
  test: correctly initialize pool parameters
  test: validation: packet: fix bugs in tailroom and concat tests
  linux-gen: packet: added support for segmented packets
  test: validation: packet: improved multi-segment alloc test
  api: packet: added limits for packet len on alloc
  linux-gen: packet: remove zero len support from alloc
  linux-gen: packet: enable multi-segment packets

 example/generator/odp_generator.c  |2 +-
 include/odp/api/spec/packet.h  |9 +-
 include/odp/api/spec/pool.h|6 +
 platform/linux-generic/Makefile.am |1 +
 .../include/odp/api/plat/packet_types.h|6 +-
 .../include/odp/api/plat/pool_types.h  |6 -
 .../linux-generic/include/odp_align_internal.h |   34 +-
 .../linux-generic/include/odp_buffer_inlines.h |  167 +--
 .../linux-generic/include/odp_buffer_internal.h|  120 +-
 .../include/odp_classification_datamodel.h |2 +-
 .../linux-generic/include/odp_config_internal.h|   55 +-
 .../linux-generic/include/odp_packet_internal.h|   87 +-
 platform/linux-generic/include/odp_pool_internal.h |  289 +---
 platform/linux-generic/include/odp_ring_internal.h |  176 +++
 .../linux-generic/include/odp_timer_internal.h |4 -
 platform/linux-generic/odp_buffer.c|   22 +-
 platform/linux-generic/odp_classification.c|   25 +-
 platform/linux-generic/odp_crypto.c|   12 +-
 platform/linux-generic/odp_packet.c|  717 --
 platform/linux-generic/odp_packet_io.c |2 +-
 platform/linux-generic/odp_pool.c  | 1440 
 platform/linux-generic/odp_queue.c |4 +-
 platform/linux-generic/odp_schedule.c  |  102 +-
 platform/linux-generic/odp_schedule_ordered.c  |4 +-
 platform/linux-generic/odp_timer.c |3 +-
 platform/linux-generic/pktio/dpdk.c|   10 +-
 platform/linux-generic/pktio/ipc.c |3 +-
 platform/linux-generic/pktio/loop.c|2 +-
 platform/linux-generic/pktio/netmap.c  |   14 +-
 platform/linux-generic/pktio/socket.c  |   17 +-
 platform/linux-generic/pktio/socket_mmap.c |   10 +-
 test/common_plat/performance/odp_crypto.c  |   47 +-
 test/common_plat/performance/odp_pktio_perf.c  |2 +-
 

[lng-odp] [API-NEXT PATCH v3 13/19] test: correctly initialize pool parameters

2016-11-10 Thread Petri Savolainen
Use odp_pool_param_init() to initialize pool parameters. Also the
pktio test must use the pool capability to determine the maximum
packet segment length.

Signed-off-by: Petri Savolainen 
---
 example/generator/odp_generator.c   |  2 +-
 test/common_plat/validation/api/crypto/crypto.c |  2 +-
 test/common_plat/validation/api/pktio/pktio.c   | 21 +++--
 3 files changed, 17 insertions(+), 8 deletions(-)
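
The initialization pattern the patch converts the tests to (standard ODP
API, e.g. inside an init function; the sizes are arbitrary example values):

    odp_pool_param_t params;

    odp_pool_param_init(&params);       /* defaults for all fields */
    params.type    = ODP_POOL_PACKET;   /* then override what matters */
    params.pkt.num = 512;
    params.pkt.len = 1536;

    odp_pool_t pool = odp_pool_create("example_pool", &params);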

diff --git a/example/generator/odp_generator.c 
b/example/generator/odp_generator.c
index 48d7f5f..23dbd55 100644
--- a/example/generator/odp_generator.c
+++ b/example/generator/odp_generator.c
@@ -732,7 +732,7 @@ int main(int argc, char *argv[])
odp_timer_pool_start();
 
/* Create timeout pool */
-   memset(&params, 0, sizeof(params));
+   odp_pool_param_init(&params);
params.tmo.num = tparams.num_timers; /* One timeout per timer */
params.type= ODP_POOL_TIMEOUT;
 
diff --git a/test/common_plat/validation/api/crypto/crypto.c 
b/test/common_plat/validation/api/crypto/crypto.c
index 8946cde..9c9a00d 100644
--- a/test/common_plat/validation/api/crypto/crypto.c
+++ b/test/common_plat/validation/api/crypto/crypto.c
@@ -43,7 +43,7 @@ int crypto_init(odp_instance_t *inst)
return -1;
}
 
-   memset(&params, 0, sizeof(params));
+   odp_pool_param_init(&params);
params.pkt.seg_len = SHM_PKT_POOL_BUF_SIZE;
params.pkt.len = SHM_PKT_POOL_BUF_SIZE;
params.pkt.num = SHM_PKT_POOL_SIZE / SHM_PKT_POOL_BUF_SIZE;
diff --git a/test/common_plat/validation/api/pktio/pktio.c 
b/test/common_plat/validation/api/pktio/pktio.c
index a6a18c3..befaa7e 100644
--- a/test/common_plat/validation/api/pktio/pktio.c
+++ b/test/common_plat/validation/api/pktio/pktio.c
@@ -310,7 +310,7 @@ static int default_pool_create(void)
if (default_pkt_pool != ODP_POOL_INVALID)
return -1;
 
-   memset(&params, 0, sizeof(params));
+   odp_pool_param_init(&params);
set_pool_len();
params.pkt.num = PKT_BUF_NUM;
params.type= ODP_POOL_PACKET;
@@ -1669,10 +1669,11 @@ int pktio_check_send_failure(void)
 
odp_pktio_close(pktio_tx);
 
-   if (mtu <= pool_capa.pkt.max_len - 32)
-   return ODP_TEST_ACTIVE;
+   /* Failure test supports only single segment */
+   if (pool_capa.pkt.max_seg_len < mtu + 32)
+   return ODP_TEST_INACTIVE;
 
-   return ODP_TEST_INACTIVE;
+   return ODP_TEST_ACTIVE;
 }
 
 void pktio_test_send_failure(void)
@@ -1687,6 +1688,7 @@ void pktio_test_send_failure(void)
int long_pkt_idx = TX_BATCH_LEN / 2;
pktio_info_t info_rx;
odp_pktout_queue_t pktout;
+   odp_pool_capability_t pool_capa;
 
pktio_tx = create_pktio(0, ODP_PKTIN_MODE_DIRECT,
ODP_PKTOUT_MODE_DIRECT);
@@ -1705,9 +1707,16 @@ void pktio_test_send_failure(void)
 
_pktio_wait_linkup(pktio_tx);
 
+   CU_ASSERT_FATAL(odp_pool_capability(&pool_capa) == 0);
+
+   if (pool_capa.pkt.max_seg_len < mtu + 32) {
+   CU_FAIL("Max packet seg length is too small.");
+   return;
+   }
+
/* configure the pool so that we can generate test packets larger
 * than the interface MTU */
-   memset(&pool_params, 0, sizeof(pool_params));
+   odp_pool_param_init(&pool_params);
pool_params.pkt.len = mtu + 32;
pool_params.pkt.seg_len = pool_params.pkt.len;
pool_params.pkt.num = TX_BATCH_LEN + 1;
@@ -1996,7 +2005,7 @@ static int create_pool(const char *iface, int num)
char pool_name[ODP_POOL_NAME_LEN];
odp_pool_param_t params;
 
-   memset(&params, 0, sizeof(params));
+   odp_pool_param_init(&params);
set_pool_len();
params.pkt.num = PKT_BUF_NUM;
params.type= ODP_POOL_PACKET;
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 12/19] test: performance: crypto: use capability to select max packet

2016-11-10 Thread Petri Savolainen
Applications must use the pool capability to check maximum values
for parameters. Used the maximum segment length since the application
seems to support only single-segment packets.

Signed-off-by: Petri Savolainen 
---
 test/common_plat/performance/odp_crypto.c | 47 +++
 1 file changed, 29 insertions(+), 18 deletions(-)
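
The general pattern applied here (standard ODP API; the requested size is
an example value):

    odp_pool_capability_t capa;
    uint32_t seg_len = 32 * 1024;            /* what the app would like */

    if (odp_pool_capability(&capa))
            exit(EXIT_FAILURE);

    if (seg_len > capa.pkt.max_seg_len)      /* clamp to what pools support */
            seg_len = capa.pkt.max_seg_len;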

diff --git a/test/common_plat/performance/odp_crypto.c 
b/test/common_plat/performance/odp_crypto.c
index 49a9f4b..39df78b 100644
--- a/test/common_plat/performance/odp_crypto.c
+++ b/test/common_plat/performance/odp_crypto.c
@@ -23,15 +23,10 @@
fprintf(stderr, "%s:%d:%s(): Error: " fmt, __FILE__, \
__LINE__, __func__, ##__VA_ARGS__)
 
-/** @def SHM_PKT_POOL_SIZE
- * @brief Size of the shared memory block
+/** @def POOL_NUM_PKT
+ * Number of packets in the pool
  */
-#define SHM_PKT_POOL_SIZE  (512 * 2048 * 2)
-
-/** @def SHM_PKT_POOL_BUF_SIZE
- * @brief Buffer size of the packet pool buffer
- */
-#define SHM_PKT_POOL_BUF_SIZE  (1024 * 32)
+#define POOL_NUM_PKT  64
 
 static uint8_t test_iv[8] = "01234567";
 
@@ -165,9 +160,7 @@ static void parse_args(int argc, char *argv[], 
crypto_args_t *cargs);
 static void usage(char *progname);
 
 /**
- * Set of predefined payloads. Make sure that maximum payload
- * size is not bigger than SHM_PKT_POOL_BUF_SIZE. May relax when
- * implementation start support segmented buffers/packets.
+ * Set of predefined payloads.
  */
 static unsigned int payloads[] = {
16,
@@ -178,6 +171,9 @@ static unsigned int payloads[] = {
16384
 };
 
+/** Number of payloads used in the test */
+static unsigned num_payloads;
+
 /**
  * Set of known algorithms to test
  */
@@ -680,12 +676,10 @@ run_measure_one_config(crypto_args_t *cargs,
 config, );
}
} else {
-   unsigned int i;
+   unsigned i;
 
print_result_header();
-   for (i = 0;
-i < (sizeof(payloads) / sizeof(unsigned int));
-i++) {
+   for (i = 0; i < num_payloads; i++) {
rc = run_measure_one(cargs, config, ,
 payloads[i], );
if (rc)
@@ -728,6 +722,9 @@ int main(int argc, char *argv[])
int num_workers = 1;
odph_odpthread_t thr[num_workers];
odp_instance_t instance;
+   odp_pool_capability_t capa;
+   uint32_t max_seg_len;
+   unsigned i;
 
memset(, 0, sizeof(cargs));
 
@@ -743,11 +740,25 @@ int main(int argc, char *argv[])
/* Init this thread */
odp_init_local(instance, ODP_THREAD_WORKER);
 
+   if (odp_pool_capability(&capa)) {
+   app_err("Pool capability request failed.\n");
+   exit(EXIT_FAILURE);
+   }
+
+   max_seg_len = capa.pkt.max_seg_len;
+
+   for (i = 0; i < sizeof(payloads) / sizeof(unsigned int); i++) {
+   if (payloads[i] > max_seg_len)
+   break;
+   }
+
+   num_payloads = i;
+
/* Create packet pool */
odp_pool_param_init(&params);
-   params.pkt.seg_len = SHM_PKT_POOL_BUF_SIZE;
-   params.pkt.len = SHM_PKT_POOL_BUF_SIZE;
-   params.pkt.num = SHM_PKT_POOL_SIZE / SHM_PKT_POOL_BUF_SIZE;
+   params.pkt.seg_len = max_seg_len;
+   params.pkt.len = max_seg_len;
+   params.pkt.num = POOL_NUM_PKT;
params.type= ODP_POOL_PACKET;
pool = odp_pool_create("packet_pool", );
 
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 18/19] linux-gen: packet: remove zero len support from alloc

2016-11-10 Thread Petri Savolainen
Remove support for zero-length allocations, which were never
required by the API specification or tested by the validation
suite.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/odp_packet.c | 28 
 1 file changed, 28 deletions(-)

diff --git a/platform/linux-generic/odp_packet.c 
b/platform/linux-generic/odp_packet.c
index a5c6ff4..0d3fd05 100644
--- a/platform/linux-generic/odp_packet.c
+++ b/platform/linux-generic/odp_packet.c
@@ -478,7 +478,6 @@ odp_packet_t odp_packet_alloc(odp_pool_t pool_hdl, uint32_t 
len)
pool_t *pool = pool_entry_from_hdl(pool_hdl);
odp_packet_t pkt;
int num, num_seg;
-   int zero_len = 0;
 
if (odp_unlikely(pool->params.type != ODP_POOL_PACKET)) {
__odp_errno = EINVAL;
@@ -488,23 +487,12 @@ odp_packet_t odp_packet_alloc(odp_pool_t pool_hdl, 
uint32_t len)
if (odp_unlikely(len > pool->max_len))
return ODP_PACKET_INVALID;
 
-   if (odp_unlikely(len == 0)) {
-   len = pool->data_size;
-   zero_len = 1;
-   }
-
num_seg = num_segments(len);
num = packet_alloc(pool, len, 1, num_seg, , 0);
 
if (odp_unlikely(num == 0))
return ODP_PACKET_INVALID;
 
-   if (odp_unlikely(zero_len)) {
-   odp_packet_hdr_t *pkt_hdr = odp_packet_hdr(pkt);
-
-   pull_tail(pkt_hdr, len);
-   }
-
return pkt;
 }
 
@@ -513,7 +501,6 @@ int odp_packet_alloc_multi(odp_pool_t pool_hdl, uint32_t 
len,
 {
pool_t *pool = pool_entry_from_hdl(pool_hdl);
int num, num_seg;
-   int zero_len = 0;
 
if (odp_unlikely(pool->params.type != ODP_POOL_PACKET)) {
__odp_errno = EINVAL;
@@ -523,24 +510,9 @@ int odp_packet_alloc_multi(odp_pool_t pool_hdl, uint32_t 
len,
if (odp_unlikely(len > pool->max_len))
return -1;
 
-   if (odp_unlikely(len == 0)) {
-   len = pool->data_size;
-   zero_len = 1;
-   }
-
num_seg = num_segments(len);
num = packet_alloc(pool, len, max_num, num_seg, pkt, 0);
 
-   if (odp_unlikely(zero_len)) {
-   int i;
-
-   for (i = 0; i < num; i++) {
-   odp_packet_hdr_t *pkt_hdr = odp_packet_hdr(pkt[i]);
-
-   pull_tail(pkt_hdr, len);
-   }
-   }
-
return num;
 }
 
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 17/19] api: packet: added limits for packet len on alloc

2016-11-10 Thread Petri Savolainen
There's no use case for an application to allocate zero-length
packets. An application should always have some knowledge about
the new packet data length before allocation. Also,
implementations are more efficient when a check for zero length
is avoided.

Also added a pool parameter to specify the maximum packet length
to be allocated from the pool. Implementations may use this
information to optimize e.g. memory usage. The application must
not exceed the max_len parameter value in alloc calls. Pool
capabilities already define max_len.

Signed-off-by: Petri Savolainen 
---
 include/odp/api/spec/packet.h | 9 +
 include/odp/api/spec/pool.h   | 6 ++
 2 files changed, 11 insertions(+), 4 deletions(-)
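
A small sketch of the new contract (uses the max_len parameter added by this
patch, e.g. inside an init function; the sizes are example values):

    odp_pool_param_t params;

    odp_pool_param_init(&params);
    params.type        = ODP_POOL_PACKET;
    params.pkt.num     = 1024;
    params.pkt.len     = 1518;   /* typical length, pool may optimize for it */
    params.pkt.max_len = 9000;   /* largest length alloc calls may request */

    odp_pool_t   pool = odp_pool_create("jumbo_pool", &params);
    odp_packet_t pkt  = odp_packet_alloc(pool, 9000);   /* 1 ... max_len */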

diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h
index 4a14f2d..faf62e2 100644
--- a/include/odp/api/spec/packet.h
+++ b/include/odp/api/spec/packet.h
@@ -82,13 +82,14 @@ extern "C" {
  * Allocate a packet from a packet pool
  *
  * Allocates a packet of the requested length from the specified packet pool.
- * Pool must have been created with ODP_POOL_PACKET type. The
+ * The pool must have been created with ODP_POOL_PACKET type. The
  * packet is initialized with data pointers and lengths set according to the
  * specified len, and the default headroom and tailroom length settings. All
- * other packet metadata are set to their default values.
+ * other packet metadata are set to their default values. Packet length must
+ * be greater than zero and not exceed packet pool parameter 'max_len' value.
  *
  * @param pool  Pool handle
- * @param len   Packet data length
+ * @param len   Packet data length (1 ... pool max_len)
  *
  * @return Handle of allocated packet
  * @retval ODP_PACKET_INVALID  Packet could not be allocated
@@ -105,7 +106,7 @@ odp_packet_t odp_packet_alloc(odp_pool_t pool, uint32_t 
len);
  * packets from a pool.
  *
  * @param pool  Pool handle
- * @param len   Packet data length
+ * @param len   Packet data length (1 ... pool max_len)
  * @param[out] pkt  Array of packet handles for output
  * @param num   Maximum number of packets to allocate
  *
diff --git a/include/odp/api/spec/pool.h b/include/odp/api/spec/pool.h
index c80c98a..6c99e05 100644
--- a/include/odp/api/spec/pool.h
+++ b/include/odp/api/spec/pool.h
@@ -192,6 +192,12 @@ typedef struct odp_pool_param_t {
pkt.max_len. Use 0 for default. */
uint32_t len;
 
+   /** Maximum packet length that will be allocated from
+   the pool. The maximum value is defined by pool
+   capability pkt.max_len. Use 0 for default (the
+   pool maximum). */
+   uint32_t max_len;
+
/** Minimum number of packet data bytes that are stored
in the first segment of a packet. The maximum value
is defined by pool capability pkt.max_seg_len.
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 16/19] test: validation: packet: improved multi-segment alloc test

2016-11-10 Thread Petri Savolainen
Added test cases to allocate and free multiple multi-segment
packets.

Signed-off-by: Petri Savolainen 
---
 test/common_plat/validation/api/packet/packet.c | 73 +++--
 1 file changed, 68 insertions(+), 5 deletions(-)

diff --git a/test/common_plat/validation/api/packet/packet.c 
b/test/common_plat/validation/api/packet/packet.c
index 8b31872..bf7eab2 100644
--- a/test/common_plat/validation/api/packet/packet.c
+++ b/test/common_plat/validation/api/packet/packet.c
@@ -273,23 +273,86 @@ void packet_test_alloc_free_multi(void)
 
 void packet_test_alloc_segmented(void)
 {
+   const int num = 5;
+   odp_packet_t pkts[num];
odp_packet_t pkt;
-   uint32_t len;
+   uint32_t max_len;
+   odp_pool_t pool;
+   odp_pool_param_t params;
odp_pool_capability_t capa;
+   int ret, i, num_alloc;
 
CU_ASSERT_FATAL(odp_pool_capability(&capa) == 0);
 
if (capa.pkt.max_len)
-   len = capa.pkt.max_len;
+   max_len = capa.pkt.max_len;
else
-   len = capa.pkt.min_seg_len * capa.pkt.max_segs_per_pkt;
+   max_len = capa.pkt.min_seg_len * capa.pkt.max_segs_per_pkt;
+
+   odp_pool_param_init(&params);
+
+   params.type   = ODP_POOL_PACKET;
+   params.pkt.seg_len= capa.pkt.min_seg_len;
+   params.pkt.len= max_len;
+
+   /* Ensure that 'num' segmented packets can be allocated */
+   params.pkt.num= num * capa.pkt.max_segs_per_pkt;
+
+   pool = odp_pool_create("pool_alloc_segmented", );
+   CU_ASSERT_FATAL(pool != ODP_POOL_INVALID);
+
+   /* Less than max len allocs */
+   pkt = odp_packet_alloc(pool, max_len / 2);
+   CU_ASSERT_FATAL(pkt != ODP_PACKET_INVALID);
+   CU_ASSERT(odp_packet_len(pkt) == max_len / 2);
+
+   odp_packet_free(pkt);
+
+   num_alloc = 0;
+   for (i = 0; i < num; i++) {
+   ret = odp_packet_alloc_multi(pool, max_len / 2,
+    &pkts[num_alloc], num - num_alloc);
+   CU_ASSERT_FATAL(ret >= 0);
+   num_alloc += ret;
+   if (num_alloc >= num)
+   break;
+   }
+
+   CU_ASSERT(num_alloc == num);
+
+   for (i = 0; i < num_alloc; i++)
+   CU_ASSERT(odp_packet_len(pkts[i]) == max_len / 2);
 
-   pkt = odp_packet_alloc(packet_pool, len);
+   odp_packet_free_multi(pkts, num_alloc);
+
+   /* Max len allocs */
+   pkt = odp_packet_alloc(pool, max_len);
CU_ASSERT_FATAL(pkt != ODP_PACKET_INVALID);
-   CU_ASSERT(odp_packet_len(pkt) == len);
+   CU_ASSERT(odp_packet_len(pkt) == max_len);
+
if (segmentation_supported)
CU_ASSERT(odp_packet_is_segmented(pkt) == 1);
+
odp_packet_free(pkt);
+
+   num_alloc = 0;
+   for (i = 0; i < num; i++) {
+   ret = odp_packet_alloc_multi(pool, max_len,
+    &pkts[num_alloc], num - num_alloc);
+   CU_ASSERT_FATAL(ret >= 0);
+   num_alloc += ret;
+   if (num_alloc >= num)
+   break;
+   }
+
+   CU_ASSERT(num_alloc == num);
+
+   for (i = 0; i < num_alloc; i++)
+   CU_ASSERT(odp_packet_len(pkts[i]) == max_len);
+
+   odp_packet_free_multi(pkts, num_alloc);
+
+   CU_ASSERT(odp_pool_destroy(pool) == 0);
 }
 
 void packet_test_event_conversion(void)
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 14/19] test: validation: packet: fix bugs in tailroom and concat tests

2016-11-10 Thread Petri Savolainen
The tailroom test did not call odp_packet_extend_tail() since it pushed
the tail too few bytes. Corrected the test to extend the tail by 100
bytes.

The concat test passed the same packet as both the src and dst packet.
There's no valid use case for concatenating a packet onto itself (it
forms a loop). Corrected the test to concatenate two copies of the same
packet.

Signed-off-by: Petri Savolainen 
---
 test/common_plat/validation/api/packet/packet.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/test/common_plat/validation/api/packet/packet.c 
b/test/common_plat/validation/api/packet/packet.c
index 454c73f..8b31872 100644
--- a/test/common_plat/validation/api/packet/packet.c
+++ b/test/common_plat/validation/api/packet/packet.c
@@ -642,9 +642,10 @@ void packet_test_tailroom(void)
_verify_tailroom_shift(&pkt, 0);
 
if (segmentation_supported) {
-   _verify_tailroom_shift(&pkt, pull_val);
+   push_val = room + 100;
+   _verify_tailroom_shift(&pkt, push_val);
_verify_tailroom_shift(&pkt, 0);
-   _verify_tailroom_shift(&pkt, -pull_val);
+   _verify_tailroom_shift(&pkt, -push_val);
}
 
odp_packet_free(pkt);
@@ -1157,12 +1158,18 @@ void packet_test_concatsplit(void)
odp_packet_t pkt, pkt2;
uint32_t pkt_len;
odp_packet_t splits[4];
+   odp_pool_t pool;
 
-   pkt = odp_packet_copy(test_packet, odp_packet_pool(test_packet));
+   pool = odp_packet_pool(test_packet);
+   pkt  = odp_packet_copy(test_packet, pool);
+   pkt2 = odp_packet_copy(test_packet, pool);
pkt_len = odp_packet_len(test_packet);
CU_ASSERT_FATAL(pkt != ODP_PACKET_INVALID);
+   CU_ASSERT_FATAL(pkt2 != ODP_PACKET_INVALID);
+   CU_ASSERT(pkt_len == odp_packet_len(pkt));
+   CU_ASSERT(pkt_len == odp_packet_len(pkt2));
 
-   CU_ASSERT(odp_packet_concat(&pkt, pkt) == 0);
+   CU_ASSERT(odp_packet_concat(&pkt, pkt2) == 0);
CU_ASSERT(odp_packet_len(pkt) == pkt_len * 2);
_packet_compare_offset(pkt, 0, pkt, pkt_len, pkt_len);
 
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 06/19] linux-gen: ring: added multi enq and deq

2016-11-10 Thread Petri Savolainen
Added multi-data versions of ring enqueue and dequeue operations.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/include/odp_ring_internal.h | 65 ++
 1 file changed, 65 insertions(+)
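
Usage sketch for the multi variants (assumes 'ring' and 'mask' are set up as
for the single-item calls; BURST is an arbitrary size well below the ring
size):

    #define BURST 32

    uint32_t data[BURST];
    uint32_t num;

    num = ring_deq_multi(ring, mask, data, BURST);   /* returns 0 when empty */
    if (num)
            ring_enq_multi(ring, mask, data, num);   /* write them back */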

diff --git a/platform/linux-generic/include/odp_ring_internal.h 
b/platform/linux-generic/include/odp_ring_internal.h
index 6a6291a..55fedeb 100644
--- a/platform/linux-generic/include/odp_ring_internal.h
+++ b/platform/linux-generic/include/odp_ring_internal.h
@@ -80,6 +80,45 @@ static inline uint32_t ring_deq(ring_t *ring, uint32_t mask)
return data;
 }
 
+/* Dequeue multiple data from the ring head. Num is smaller than ring size. */
+static inline uint32_t ring_deq_multi(ring_t *ring, uint32_t mask,
+ uint32_t data[], uint32_t num)
+{
+   uint32_t head, tail, new_head, i;
+
+   head = odp_atomic_load_u32(&ring->r_head);
+
+   /* Move reader head. This thread owns data at the new head. */
+   do {
+   tail = odp_atomic_load_u32(&ring->w_tail);
+
+   /* Ring is empty */
+   if (head == tail)
+   return 0;
+
+   /* Try to take all available */
+   if ((tail - head) < num)
+   num = tail - head;
+
+   new_head = head + num;
+
+   } while (odp_unlikely(odp_atomic_cas_acq_u32(&ring->r_head, &head,
+ new_head) == 0));
+
+   /* Read queue index */
+   for (i = 0; i < num; i++)
+   data[i] = ring->data[(head + 1 + i) & mask];
+
+   /* Wait until other readers have updated the tail */
+   while (odp_unlikely(odp_atomic_load_acq_u32(&ring->r_tail) != head))
+   odp_cpu_pause();
+
+   /* Now update the reader tail */
+   odp_atomic_store_rel_u32(&ring->r_tail, new_head);
+
+   return num;
+}
+
 /* Enqueue data into the ring tail */
 static inline void ring_enq(ring_t *ring, uint32_t mask, uint32_t data)
 {
@@ -104,6 +143,32 @@ static inline void ring_enq(ring_t *ring, uint32_t mask, 
uint32_t data)
odp_atomic_store_rel_u32(>w_tail, new_head);
 }
 
+/* Enqueue multiple data into the ring tail. Num is smaller than ring size. */
+static inline void ring_enq_multi(ring_t *ring, uint32_t mask, uint32_t data[],
+ uint32_t num)
+{
+   uint32_t old_head, new_head, i;
+
+   /* Reserve a slot in the ring for writing */
+   old_head = odp_atomic_fetch_add_u32(&ring->w_head, num);
+   new_head = old_head + 1;
+
+   /* Ring is full. Wait for the last reader to finish. */
+   while (odp_unlikely(odp_atomic_load_acq_u32(&ring->r_tail) == new_head))
+   odp_cpu_pause();
+
+   /* Write data */
+   for (i = 0; i < num; i++)
+   ring->data[(new_head + i) & mask] = data[i];
+
+   /* Wait until other writers have updated the tail */
+   while (odp_unlikely(odp_atomic_load_acq_u32(&ring->w_tail) != old_head))
+   odp_cpu_pause();
+
+   /* Now update the writer tail */
+   odp_atomic_store_rel_u32(&ring->w_tail, old_head + num);
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 04/19] linux-gen: align: added round up power of two

2016-11-10 Thread Petri Savolainen
Added a macro to round up a value to the next power of two,
if it's not already a power of two. Also removed duplicated
code from the same file.

Signed-off-by: Petri Savolainen 
---
 .../linux-generic/include/odp_align_internal.h | 34 +-
 1 file changed, 7 insertions(+), 27 deletions(-)
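
A worked example of the new macro (x must be non-zero, as noted below): for
x = 1000U, x - 1 = 999 uses 10 bits, so __builtin_clz(999) == 22 and the
result is 1 << (32 - 22) = 1024; a value that is already a power of two maps
to itself:

    ODP_STATIC_ASSERT(ODP_ROUNDUP_POWER_2(1000U) == 1024, "roundup_1000");
    ODP_STATIC_ASSERT(ODP_ROUNDUP_POWER_2(1024U) == 1024, "roundup_1024");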

diff --git a/platform/linux-generic/include/odp_align_internal.h 
b/platform/linux-generic/include/odp_align_internal.h
index 9ccde53..d9cd30b 100644
--- a/platform/linux-generic/include/odp_align_internal.h
+++ b/platform/linux-generic/include/odp_align_internal.h
@@ -29,24 +29,18 @@ extern "C" {
 
 /**
  * @internal
- * Round up pointer 'x' to alignment 'align'
- */
-#define ODP_ALIGN_ROUNDUP_PTR(x, align)\
-   ((void *)ODP_ALIGN_ROUNDUP((uintptr_t)(x), (uintptr_t)(align)))
-
-/**
- * @internal
- * Round up pointer 'x' to cache line size alignment
+ * Round up 'x' to alignment 'align'
  */
-#define ODP_CACHE_LINE_SIZE_ROUNDUP_PTR(x)\
-   ((void *)ODP_CACHE_LINE_SIZE_ROUNDUP((uintptr_t)(x)))
+#define ODP_ALIGN_ROUNDUP(x, align)\
+   ((align) * (((x) + (align) - 1) / (align)))
 
 /**
  * @internal
- * Round up 'x' to alignment 'align'
+ * When 'x' is not already a power of two, round it up to the next
+ * power of two value. Zero is not supported as an input value.
  */
-#define ODP_ALIGN_ROUNDUP(x, align)\
-   ((align) * (((x) + align - 1) / (align)))
+#define ODP_ROUNDUP_POWER_2(x)\
+   (1 << (((int)(8 * sizeof(x))) - __builtin_clz((x) - 1)))
 
 /**
  * @internal
@@ -82,20 +76,6 @@ extern "C" {
 
 /**
  * @internal
- * Round down pointer 'x' to 'align' alignment, which is a power of two
- */
-#define ODP_ALIGN_ROUNDDOWN_PTR_POWER_2(x, align)\
-((void *)ODP_ALIGN_ROUNDDOWN_POWER_2((uintptr_t)(x), (uintptr_t)(align)))
-
-/**
- * @internal
- * Round down pointer 'x' to cache line size alignment
- */
-#define ODP_CACHE_LINE_SIZE_ROUNDDOWN_PTR(x)\
-   ((void *)ODP_CACHE_LINE_SIZE_ROUNDDOWN((uintptr_t)(x)))
-
-/**
- * @internal
  * Round down 'x' to 'align' alignment, which is a power of two
  */
 #define ODP_ALIGN_ROUNDDOWN_POWER_2(x, align)\
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 02/19] linux-gen: pktio: do not free zero packets

2016-11-10 Thread Petri Savolainen
In some error cases, netmap and dpdk pktios were calling
odp_packet_free_multi with zero packets. Moved existing error
check to avoid a free call with zero packets.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/pktio/dpdk.c   | 10 ++
 platform/linux-generic/pktio/netmap.c | 10 ++
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/platform/linux-generic/pktio/dpdk.c 
b/platform/linux-generic/pktio/dpdk.c
index 11f3509..0eb025a 100644
--- a/platform/linux-generic/pktio/dpdk.c
+++ b/platform/linux-generic/pktio/dpdk.c
@@ -956,10 +956,12 @@ static int dpdk_send(pktio_entry_t *pktio_entry, int 
index,
rte_pktmbuf_free(tx_mbufs[i]);
}
 
-   odp_packet_free_multi(pkt_table, tx_pkts);
-
-   if (odp_unlikely(tx_pkts == 0 && __odp_errno != 0))
-   return -1;
+   if (odp_unlikely(tx_pkts == 0)) {
+   if (__odp_errno != 0)
+   return -1;
+   } else {
+   odp_packet_free_multi(pkt_table, tx_pkts);
+   }
 
return tx_pkts;
 }
diff --git a/platform/linux-generic/pktio/netmap.c 
b/platform/linux-generic/pktio/netmap.c
index 412beec..c1cdf72 100644
--- a/platform/linux-generic/pktio/netmap.c
+++ b/platform/linux-generic/pktio/netmap.c
@@ -830,10 +830,12 @@ static int netmap_send(pktio_entry_t *pktio_entry, int 
index,
if (!pkt_nm->lockless_tx)
		odp_ticketlock_unlock(&pkt_nm->tx_desc_ring[index].s.lock);
 
-   odp_packet_free_multi(pkt_table, nb_tx);
-
-   if (odp_unlikely(nb_tx == 0 && __odp_errno != 0))
-   return -1;
+   if (odp_unlikely(nb_tx == 0)) {
+   if (__odp_errno != 0)
+   return -1;
+   } else {
+   odp_packet_free_multi(pkt_table, nb_tx);
+   }
 
return nb_tx;
 }
-- 
2.8.1



[lng-odp] [API-NEXT PATCH v3 19/19] linux-gen: packet: enable multi-segment packets

2016-11-10 Thread Petri Savolainen
Enable segmentation support with CONFIG_PACKET_MAX_SEGS
configuration option.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/include/odp_config_internal.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/platform/linux-generic/include/odp_config_internal.h 
b/platform/linux-generic/include/odp_config_internal.h
index 9a4e6eb..8818cda 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -70,12 +70,12 @@ extern "C" {
 /*
  * Maximum number of segments per packet
  */
-#define CONFIG_PACKET_MAX_SEGS 1
+#define CONFIG_PACKET_MAX_SEGS 2
 
 /*
  * Maximum packet segment size including head- and tailrooms
  */
-#define CONFIG_PACKET_SEG_SIZE (64 * 1024)
+#define CONFIG_PACKET_SEG_SIZE (8 * 1024)
 
 /* Maximum data length in a segment
  *
-- 
2.8.1
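
For context, a rough worked number under these settings: a maximally
segmented packet now spans at most 2 segments of 8 kB each (head- and
tailroom included), i.e. about 16 kB of buffer space, instead of the
previous single 64 kB segment.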



[lng-odp] [API-NEXT PATCH v3 01/19] linux-gen: ipc: disable build of ipc pktio

2016-11-10 Thread Petri Savolainen
IPC pktio implementation depends heavily on pool internals. Its
build is disabled due to the pool re-implementation. IPC should be
re-implemented with a cleaner internal interface towards pool and
shm.

Signed-off-by: Petri Savolainen 
---
 platform/linux-generic/pktio/ipc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/platform/linux-generic/pktio/ipc.c 
b/platform/linux-generic/pktio/ipc.c
index c1f28db..0e99c6e 100644
--- a/platform/linux-generic/pktio/ipc.c
+++ b/platform/linux-generic/pktio/ipc.c
@@ -3,7 +3,7 @@
  *
  * SPDX-License-Identifier: BSD-3-Clause
  */
-
+#ifdef _ODP_PKTIO_IPC
 #include 
 #include 
 #include 
@@ -795,3 +795,4 @@ const pktio_if_ops_t ipc_pktio_ops = {
.pktin_ts_from_ns = NULL,
.config = NULL
 };
+#endif
-- 
2.8.1



Re: [lng-odp] Classification Queue Group

2016-11-10 Thread Bala Manoharan
On 10 November 2016 at 13:26, Brian Brooks  wrote:
> On 11/07 16:46:12, Bala Manoharan wrote:
>> Hi,
>
> Hiya
>
>> This mail thread discusses the design of classification queue group
>> RFC. The same can be found in the google doc whose link is given
>> below.
>> Users can provide their comments either in this mail thread or in the
>> google doc as per their convenience.
>>
>> https://docs.google.com/document/d/1fOoG9WDR0lMpVjgMAsx8QsMr0YFK9slR93LZ8VXqM2o/edit?usp=sharing
>>
>> The basic issues with queues as being a single target for a CoS are two fold:
>>
>> Queues must be created and deleted individually. This imposes a
>> significant burden when queues are used to represent individual flows
>> since the application may need to process thousands (or millions) of
>> flows.
>
> Wondering why there is an issue with creating and deleting queues individually
> if queue objects represent millions of flows..

The queue groups are mainly required for hashing the incoming packets
into multiple flows based on the hash configuration.
So from the application's point of view it just needs a queue that
holds packets belonging to the same flow, with packets belonging to
different flows placed in different queues. It does not matter who
creates the flow/queue.

It is actually simpler if the implementation creates a flow, since in
that case the implementation need not accumulate meta-data for all
possible hash values in a queue group; a flow/queue can be created
when traffic arrives in that particular flow.
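
To make the application's view concrete, here is a minimal fast-path
sketch, assuming classification has already been set up to spread flows
over a queue group as proposed (that setup is omitted). odp_schedule()
and the event/packet calls are existing ODP API; process_flow() is a
hypothetical application function:

    /* The application does not care whether 'from' was created by it
     * directly or indirectly via a queue group; one queue == one flow. */
    while (1) {
            odp_queue_t from;
            odp_event_t ev = odp_schedule(&from, ODP_SCHED_WAIT);

            if (odp_event_type(ev) == ODP_EVENT_PACKET)
                    process_flow(from, odp_packet_from_event(ev));
    }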

>
> Could an application ever call odp_schedule() and receive an event (e.g. 
> packet)
> from a queue (of opaque type odp_queue_t) and that queue has never been 
> created
> by the application (via odp_queue_create())? Could that ever happen from the
> hardware, and could the application ever handle that?

No. All the queues in the system are created by the application, either
directly or indirectly.
In the case of queue groups, the queues are indirectly created by the
application by configuring a queue group.

>
> Or, is it related to memory usage? The reference implementation
> struct queue_entry_s is 320 bytes on a 64-bit machine.
>
>   2^28 ~= 268,435,456 queues -> 81.920 GB
>   2^26 ~=  67,108,864 queues -> 20.480 GB
>   2^22 ~=   4,194,304 queues ->  1.280 GB
>
> Forget about 320 bytes per queue, if each queue was represented by a 32-bit
> integer (4 bytes!) the usage would be:
>
>   2^28 ~= 268,435,456 queues ->  1.024 GB
>   2^26 ~=  67,108,864 queues ->256 MB
>   2^22 ~=   4,194,304 queues -> 16 MB
>
> That still might be a lot of usage if the application must explicitly create
> every queue (before it is used) and require an ODP implementation to map
> between every ODP queue object (opaque type) and the internal queue.
>
> Lets say ODP API has two classes of handles: 1) pointers, 2) integers. An 
> opaque
> pointer is used to point to some other software object. This object should be
> larger than 64 bits (or 32 bits on a chip in 32-bit pointer mode) otherwise it
> could just be represented in a 64-bit (or 32-bit) integer type value!
>
> To support millions of queues (flows) should odp_queue_t be an integer type in
> the API? A software-only implementation may still use 320 bytes per queue and
> use that integer as an index into an array or as a key for lookup operation 
> on a
> data structure containing queues. An implementation with hardware assist may
> use this integer value directly when interfacing with hardware!

I believe I have answered this question in the explanation above.
Please feel free to point out if something is not clear.

>
> Would it still be necessary to assign a "name" to each queue (flow)?

"name" per queue might not be required since it would mean a character
based lookup across millions of items.

>
> Would a queue (flow) also require an "op type" to explicitly specify whether
> access to the queue (flow) is threadsafe? Atomic queues are threadsafe since
> only 1 core at any given time can recieve from it. Parallel queues are also
> threadsafe. Are all ODP APIs threadsafe?

There are two types of queue enqueue operation: ODP_QUEUE_OP_MT and
ODP_QUEUE_OP_MT_UNSAFE.
The rest of the ODP APIs are multi-thread safe, since in ODP there is
no defined way in which a single packet can be given to more than one
core at the same time; packets move across different modules through
queues.
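
For reference, a minimal sketch of how an application selects that mode
today, assuming the odp_queue_param_t fields enq_mode/deq_mode (field
names from my recollection of the current API, please check your ODP
version). ODP_QUEUE_OP_MT_UNSAFE promises single-threaded access, so the
implementation may drop internal locking:

    static odp_queue_t create_single_thread_queue(void)
    {
            odp_queue_param_t qparam;

            odp_queue_param_init(&qparam);
            qparam.type     = ODP_QUEUE_TYPE_PLAIN;
            /* Single-threaded access promised by the application */
            qparam.enq_mode = ODP_QUEUE_OP_MT_UNSAFE;
            qparam.deq_mode = ODP_QUEUE_OP_MT_UNSAFE;

            return odp_queue_create("flow-queue", &qparam);
    }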

>
>> A single PMR can only match a packet to a single queue associated with
>> a target CoS. This prohibits efficient capture of subfield
>> classification.
>
> odp_cls_pmr_create() can take more than 1 odp_pmr_param_t, so it is possible
> to create a single PMR which matches multiple fields of a packet. I can 
> imagine
> a case where a packet matches pmr1 (match Vlan) and also matches pmr2
> (match Vlan AND match L3DestIP). Is that an example of subfield 
> classification?
> How does the queue relate?

This question is related to classification. If a PMR is configured
with more than one 

Re: [lng-odp] driver interface thinkings....

2016-11-10 Thread Brian Brooks
On 11/09 09:42:49, Francois Ozog wrote:
> To deal with drivers we need to have additional "standard" (helper or API?)
> objects to deal with MAC addresses.
> 
> ODP could be made to handle SDH, SDLC, FrameRelay or ATM packets. A
> packet_io is pretty generic...
> 
> But as soon as we talk with drivers and initialization, we need to deal
> with L1 initialization (link up/ synchronized...) and L2 initialization
> (DLCI, MAC (Ethernet, FDDI, TokenRing...)...).
> 
> Shall we add those L1 and L3 concepts ?

Thinking of L1 and L2 concepts. External PHYs and transceivers should
be brought out of reset and MDIO busses probed. Optional firmware loaded.
Line drivers configured with board-specific drive strength. Startup and
shutdown sequence across MAC, PHY, and optional transceiver. Loopback at
line-side or system-side. Autonegotiation. Link status according to MAC
and PHY. Link status info such as Link Fault Signaling. Timestamping.
IEEE DCB. Majority kept in the driver, but what does the interface look like?

> FF
> 
> On 9 November 2016 at 09:38, Francois Ozog  wrote:
> 
> >
> >
> > On 9 November 2016 at 09:09, Christophe Milard <
> > christophe.mil...@linaro.org> wrote:
> >
> >>
> >>
> >> On 8 November 2016 at 18:19, Francois Ozog 
> >> wrote:
> >>
> >>> Hi Christophe,
> >>>
> >>> We are focusing on network devices which have at least network ports.
> >>> Inventory of devices and ports are two distinct topics as there is no
> >>> guarantee that there is one device per port. (Intel and Mellanox have this
> >>> policy: one device per PCI port, but many others have one PCI device for
> >>> multiple ports (Chelsio...)
> >>>
> >>> 1) device discovery
> >>>
> >>> So let's focus first on device discovery.
> >>>
> >>> Devices can be found from:
> >>> - static "configuration" based on ODP implementation
> >>> - ACPI
> >>> - PCI
> >>> - DPAA2 NXP bus
> >>> - VMBus (VMs running on Microsoft Hyper-V hypervisor)
> >>> - others ?
> >>>
> >>> So bus are somewhat limited (ACPI is NOT a bus). I prefer to talk about
> >>> device enumerators.
> >>>
> >>
> >> OK.
> >>
> >>
> >>>
> >>> so we can loop through all devices using your proposed:
> >>> for each 
> >>> for each probed device
> >>>
> >>
> >> Not sure what a "probed device" is here, I was thinking:
> >>  for each 
> >> for each driver
> >> driver.probe(device)
> >>
> >> Is that what you meant?
> >>
> >>
> >>> [FF] no:
> >  for each 
> >
> >> for each enumerated device
> >>  for each registered driver
> >>
> >driver.probe(device)
> >
> >>
> >>> But that is only for static discovery at the program startup.
> >>>
> >>
> >> yes
> >>
> >>
> >>>
> >>> It would be way better to allow hotplugs (card insertion in a chassis,
> >>> addition of one virtual interface in a VM because of a scale-up
> >>> operation).
> >>>
> >>
> >> yes: we can probe new hotplugged devices (against registered drivers) when
> >> a new device is hot plugged, and probe registered (unassigned) devices
> >> against any hot-plugged driver.
> >>
> >>
> >>>
> >>> So in addition to initial synchronous probing we need asynchronous
> >>> events handling.
> >>>
> >>
> >> yes
> >>
> >>
> >>>
> >>> Once we have found the device, we need to check if we have a driver for
> >>> it and if the "user" has decided if he wants to use that device
> >>> (configuration aspects)
> >>>
> >>
> >> I guess the fact that a user loads a driver is an act of configuration...?
> >>  [FF] I guess not, load all drivers
> >>
> >>>
> >>> 2) driver assignment
> >>> From this device we may want to enumerate ports and other "things" (like
> >>> embeded switch) that are present on the device.
> >>>
> >>
> >> hmmm... so a device driver could act as a new enumerator? I am not fully
> >> following your proposal here... Let's say 2 PCI boards each contain a
> >> switch (one switch per board). The PCI enumerator will detect 2 devices. So
> >> according to your proposal, we would also have switch enumerators? How many
> >> such switch enumerators do we have? Where are they located?
> >> [FF]: let's make a simpler example: 2 PCI devices, each with four 10Gbps
> >> ethernet ports. There are two enumerated PCI devices but there are 8
> >> packet_ios. Packet_ios should keep a link to their device, and the device
> >> should have a link back to the enumerator. No upper-layer metadata saved in
> >> each "object", just a link.

Just 1 enumeration and driver layer? On a top-level (PCI, SoC bus) enumerator,
say 2 devices are probed and assigned a driver. In the driver, on device init,
a representation of the hw (pktio, classifier, scheduler, ..) is returned to
the rest of the system. Ideally hotpluggable. Could a device driver act as
another enumerator if there are further buses and devices which may be
interchangeable across the 2 devices?
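
A minimal sketch of the linkage discussed above (every name here is
illustrative, none of this is existing ODP or driver API):

    struct enumerator;                      /* PCI, ACPI, DPAA2 bus, VMBus, ... */

    struct device {
            struct enumerator *enumerator;  /* who discovered this device */
            void *bus_data;                 /* bus-specific access info   */
    };

    struct pktio_port {
            struct device *device;          /* e.g. 8 ports -> 2 devices  */
    };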

> >>>
> >>> Accessing devices is bus dependent, os dependent and processor
> >>> dependent.
> >>> A few problems to tackle:
> >>> - On Linux, a device can be accessed 

Re: [lng-odp] Classification Queue Group

2016-11-10 Thread Bala Manoharan
Regards,
Bala


On 10 November 2016 at 07:34, Bill Fischofer  wrote:
>
>
> On Mon, Nov 7, 2016 at 5:16 AM, Bala Manoharan 
> wrote:
>>
>> Hi,
>>
>> This mail thread discusses the design of classification queue group
>> RFC. The same can be found in the google doc whose link is given
>> below.
>> Users can provide their comments either in this mail thread or in the
>> google doc as per their convenience.
>>
>>
>> https://docs.google.com/document/d/1fOoG9WDR0lMpVjgMAsx8QsMr0YFK9slR93LZ8VXqM2o/edit?usp=sharing
>>
>> The basic issues with queues as being a single target for a CoS are two
>> fold:
>>
>> Queues must be created and deleted individually. This imposes a
>> significant burden when queues are used to represent individual flows
>> since the application may need to process thousands (or millions) of
>> flows.
>> A single PMR can only match a packet to a single queue associated with
>> a target CoS. This prohibits efficient capture of subfield
>> classification.
>> To solve these issues, Tiger Moth introduces the concept of a queue
>> group. A queue group is an extension to the existing queue
>> specification in a Class of Service.
>>
>> Queue groups solve the classification issues associated with
>> individual queues in three ways:
>>
>> * The odp_queue_group_create() API can create a large number of
>> related queues with a single call.
>> * A single PMR can spread traffic to many queues associated with the
>> same CoS by assigning packets matching the PMR to a queue group rather
>> than a queue.
>> * A hashed PMR subfield is used to distribute individual queues within
>> a queue group for scheduling purposes.
>>
>>
>> diff --git a/include/odp/api/spec/classification.h
>> b/include/odp/api/spec/classification.h
>> index 6eca9ab..cf56852 100644
>> --- a/include/odp/api/spec/classification.h
>> +++ b/include/odp/api/spec/classification.h
>> @@ -126,6 +126,12 @@ typedef struct odp_cls_capability_t {
>>
>> /** A Boolean to denote support of PMR range */
>> odp_bool_t pmr_range_supported;
>> +
>> + /** A Boolean to denote support of queue group */
>> + odp_bool_t queue_group_supported;
>
>
> To date we've not introduced optional APIs into ODP so I'm not sure if we'd
> want to start here. If we are adding queue groups, all ODP implementations
> should be expected to support queue groups, so this flag shouldn't be
> needed. Limits on the support (e.g., max number of queue groups supported,
> etc.) are appropriate, but there shouldn't be an option to not support them
> at all.
>
>>
>> +
>> + /** A Boolean to denote support of queue */
>> + odp_bool_t queue_supported;
>
>
> Not sure what the intent is here. Is this anticipating that some
> implementations might only support sending flows to queue groups and not to
> individual queues? That would be a serious functional regression and not
> something we'd want to encourage.

The idea here is that some implementations might have limitations
where they can support either a queue or a queue group attached to a
CoS object, but not both. This limitation applies only to queues
created within a queue group and not to a general odp_queue_t.

This could be useful when some low-end hardware does not support
hashing after classification in the HW; in that case supporting a
queue group, which requires hashing, might be costly.

The reason is that the internal handles of a packet flow might have to
be re-aligned so that they can be exposed as either a queue or a queue
group. So this capability is useful if an implementation supports only
queue groups and not queues, or vice-versa. I did not see much harm in
the above Boolean since it is added only in the capability struct.
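
A minimal sketch of how an application could consume these flags,
assuming the capability names from the RFC above (odp_cls_capability()
is existing API; the two Booleans and the helper functions are not):

    odp_cls_capability_t capa;

    if (odp_cls_capability(&capa) == 0) {
            if (capa.queue_group_supported)
                    setup_cos_with_queue_group();  /* hypothetical helper */
            else if (capa.queue_supported)
                    setup_cos_with_single_queue(); /* hypothetical helper */
    }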

>
>>
>> } odp_cls_capability_t;
>>
>>
>> /**
>> @@ -162,7 +168,18 @@ typedef enum {
>>  * Used to communicate class of service creation options
>>  */
>> typedef struct odp_cls_cos_param {
>> - odp_queue_t queue; /**< Queue associated with CoS */
>> + /** If type is ODP_QUEUE_T, odp_queue_t is linked with CoS,
>> + * if type is ODP_QUEUE_GROUP_T, odp_queue_group_t is linked with CoS.
>
>
> Perhaps not the best choice of discriminator names. Perhaps
> ODP_COS_TYPE_QUEUE and ODP_COS_TYPE_GROUP might be simpler?
>
>>
>> + */
>> + odp_queue_type_e type;
>
>
> We already have an odp_queue_type_t defined for the queue APIs so this name
> would be confusingly similar. We're really identifying what the type of the
> CoS is so perhaps odp_cos_type_t might be better here? That would be
> consistent with the ODP_QUEUE_TYPE_PLAIN and ODP_QUEUE_TYPE_SCHED used in
> the odp_queue_type_t enum.

Agreed.
>
>>
>> +
>> + typedef union {
>> + /** Queue associated with CoS */
>> + odp_queue_t queue;
>> +
>> + /** Queue Group associated with CoS */
>> + odp_queue_group_t queue_group;
>> + };
>> odp_pool_t pool; /**< Pool associated with CoS */
>> odp_cls_drop_t drop_policy; /**< Drop policy associated with CoS */
>> } odp_cls_cos_param_t;
>>
>>
>> diff --git 
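
For illustration, a minimal sketch of attaching a queue group to a CoS
with the proposed types, using the ODP_COS_TYPE_* naming suggested
above. odp_cls_cos_param_init() and odp_cls_cos_create() are existing
API, while odp_queue_group_t, odp_queue_group_param_t,
odp_queue_group_create() and the type/queue_group fields are all RFC
proposals, not current ODP:

    static odp_cos_t create_flow_cos(odp_pool_t pkt_pool,
                                     const odp_queue_group_param_t *grp_param)
    {
            odp_cls_cos_param_t cos_param;
            odp_queue_group_t grp;

            /* Proposed API: create many related queues with one call */
            grp = odp_queue_group_create("flows", grp_param);

            odp_cls_cos_param_init(&cos_param);
            cos_param.type        = ODP_COS_TYPE_GROUP; /* proposed */
            cos_param.queue_group = grp;                /* instead of .queue */
            cos_param.pool        = pkt_pool;
            cos_param.drop_policy = ODP_COS_DROP_NEVER;

            return odp_cls_cos_create("flow-cos", &cos_param);
    }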

Re: [lng-odp] Classification Queue Group

2016-11-10 Thread Francois Ozog
Good points Brian.

I can add a data point from real-life observations of a DPI box
analyzing two 10Gbps links (four 10Gbps ports): there are approximately
50M known flows (TCP, UDP...), though not all may be active (TCP CLOSE_WAIT).

FF

On 10 November 2016 at 08:56, Brian Brooks  wrote:

> On 11/07 16:46:12, Bala Manoharan wrote:
> > Hi,
>
> Hiya
>
> > This mail thread discusses the design of classification queue group
> > RFC. The same can be found in the google doc whose link is given
> > below.
> > Users can provide their comments either in this mail thread or in the
> > google doc as per their convenience.
> >
> > https://docs.google.com/document/d/1fOoG9WDR0lMpVjgMAsx8QsMr0YFK9
> slR93LZ8VXqM2o/edit?usp=sharing
> >
> > The basic issues with queues as being a single target for a CoS are two
> fold:
> >
> > Queues must be created and deleted individually. This imposes a
> > significant burden when queues are used to represent individual flows
> > since the application may need to process thousands (or millions) of
> > flows.
>
> Wondering why there is an issue with creating and deleting queues
> individually
> if queue objects represent millions of flows..
>
> Could an application ever call odp_schedule() and receive an event (e.g.
> packet)
> from a queue (of opaque type odp_queue_t) and that queue has never been
> created
> by the application (via odp_queue_create())? Could that ever happen from
> the
> hardware, and could the application ever handle that?
>
> Or, is it related to memory usage? The reference implementation
> struct queue_entry_s is 320 bytes on a 64-bit machine.
>
>   2^28 ~= 268,435,456 queues -> 81.920 GB
>   2^26 ~=  67,108,864 queues -> 20.480 GB
>   2^22 ~=   4,194,304 queues ->  1.280 GB
>
> Forget about 320 bytes per queue, if each queue was represented by a 32-bit
> integer (4 bytes!) the usage would be:
>
>   2^28 ~= 268,435,456 queues ->  1.024 GB
>   2^26 ~=  67,108,864 queues ->256 MB
>   2^22 ~=   4,194,304 queues -> 16 MB
>
> That still might be a lot of usage if the application must explicitly
> create
> every queue (before it is used) and require an ODP implementation to map
> between every ODP queue object (opaque type) and the internal queue.
>
> Lets say ODP API has two classes of handles: 1) pointers, 2) integers. An
> opaque
> pointer is used to point to some other software object. This object should
> be
> larger than 64 bits (or 32 bits on a chip in 32-bit pointer mode)
> otherwise it
> could just be represented in a 64-bit (or 32-bit) integer type value!
>
> To support millions of queues (flows) should odp_queue_t be an integer
> type in
> the API? A software-only implementation may still use 320 bytes per queue
> and
> use that integer as an index into an array or as a key for lookup
> operation on a
> data structure containing queues. An implementation with hardware assist
> may
> use this integer value directly when interfacing with hardware!
>
> Would it still be necessary to assign a "name" to each queue (flow)?
>
> Would a queue (flow) also require an "op type" to explicitly specify
> whether
> access to the queue (flow) is threadsafe? Atomic queues are threadsafe
> since
> only 1 core at any given time can recieve from it. Parallel queues are also
> threadsafe. Are all ODP APIs threadsafe?
>
> > A single PMR can only match a packet to a single queue associated with
> > a target CoS. This prohibits efficient capture of subfield
> > classification.
>
> odp_cls_pmr_create() can take more than 1 odp_pmr_param_t, so it is
> possible
> to create a single PMR which matches multiple fields of a packet. I can
> imagine
> a case where a packet matches pmr1 (match Vlan) and also matches pmr2
> (match Vlan AND match L3DestIP). Is that an example of subfield
> classification?
> How does the queue relate?
>
> > To solve these issues, Tiger Moth introduces the concept of a queue
> > group. A queue group is an extension to the existing queue
> > specification in a Class of Service.
> >
> > Queue groups solve the classification issues associated with
> > individual queues in three ways:
> >
> > * The odp_queue_group_create() API can create a large number of
> > related queues with a single call.
>
> If the application calls this API, does that mean the ODP implementation
> can create a large number of queues? What happens if the application
> receives an event on a queue that was created by the implmentation--how
> does the application know whether this queue was created by the hardware
> according to the ODP Classification or whether the queue was created by
> the application?
>
> > * A single PMR can spread traffic to many queues associated with the
> > same CoS by assigning packets matching the PMR to a queue group rather
> > than a queue.
> > * A hashed PMR subfield is used to distribute individual queues within
> > a queue group for scheduling purposes.
>
> Is there a way to write a test case for this? Trying to think of what kind
> of
>