[dpdk-dev] [RFC] libeventdev: event driven programming model framework for DPDK

2016-08-09 Thread Jerin Jacob
Hi All,

Find below an RFC API specification which attempts to
define the standard application programming interface
for event driven programming in DPDK and to abstract HW based event devices.

These devices can support event scheduling and flow ordering
in HW and typically found in NW SoCs as an integrated device or
as PCI EP device.

The RFC APIs are inspired from existing ethernet and crypto devices.
Following are the requirements considered to define the RFC API.

1) APIs similar to the existing Ethernet and crypto API framework for
- Device creation, device identification and device configuration
2) Enumerate libeventdev resources as numbers (0..N) to
- Avoid ABI issues with handles
- Event devices may have millions of flow queues, so it is not practical to
have handles for each flow queue and its associated name-based
lookup in the multi-process case
3) Avoid struct mbuf changes
4) APIs to
- Enumerate eventdev driver capabilities and resources
- Enqueue events from an lcore
- Schedule events
- Synchronize events
- Maintain ingress order of the events
- Support run-to-completion

Find below the URL for the complete API specification.

https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h

I have created a supporting document to share the concepts of the
event driven programming model and the details of the proposed APIs,
to give the specification better reach.
This presentation covers an introduction to event driven programming model
concepts, characteristics of hardware-based event manager devices,
the RFC API proposal, an example use case, and the benefits of using the
event driven programming model.

Find below the URL for the supportive document.

https://rawgit.com/jerinjacobk/libeventdev/master/DPDK-event_driven_programming_framework.pdf

git repo for the above documents:

https://github.com/jerinjacobk/libeventdev/

Looking forward to getting comments from both application and driver
implementation perspective.

What follows is the text version of the above documents, for inline comments 
and discussion.
I intend to update that specification accordingly.

/**
 * Get the total number of event devices that have been successfully
 * initialised.
 *
 * @return
 *   The total number of usable event devices.
 */
extern uint8_t
rte_eventdev_count(void);

/**
 * Get the device identifier for the named event device.
 *
 * @param name
 *   Event device name to select the event device identifier.
 *
 * @return
 *   Returns event device identifier on success.
 *   - <0: Failure to find named event device.
 */
extern uint8_t
rte_eventdev_get_dev_id(const char *name);

/**
 * Return the NUMA socket to which a device is connected.
 *
 * @param dev_id
 *   The identifier of the device.
 * @return
 *   The NUMA socket id to which the device is connected or
 *   a default of zero if the socket could not be determined.
 *   - -1: dev_id value is out of range.
 */
extern int
rte_eventdev_socket_id(uint8_t dev_id);
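
For illustration, a minimal enumeration sketch written against the prototypes
above (the rte_eventdev.h header name follows the RFC URL; the device name
"eventdev_0" is a placeholder):

#include <stdio.h>
#include <rte_eventdev.h>

static void
list_eventdevs(void)
{
	uint8_t i, nb_devs = rte_eventdev_count();

	for (i = 0; i < nb_devs; i++)
		printf("eventdev %u is on socket %d\n",
			i, rte_eventdev_socket_id(i));

	/* Devices can also be looked up by name; the RFC documents a
	 * negative return value when the name is not found. */
	printf("\"eventdev_0\" has id %u\n",
		(unsigned)rte_eventdev_get_dev_id("eventdev_0"));
}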

/** Event device information */
struct rte_eventdev_info {
	const char *driver_name;	/**< Event driver name */
	struct rte_pci_device *pci_dev;	/**< PCI information */
	uint32_t min_sched_wait_ns;
	/**< Minimum supported scheduler wait delay in ns by this device */
	uint32_t max_sched_wait_ns;
	/**< Maximum supported scheduler wait delay in ns by this device */
	uint32_t sched_wait_ns;
	/**< Configured scheduler wait delay in ns of this device */
	uint32_t max_flow_queues_log2;
	/**< LOG2 of maximum flow queues supported by this device */
	uint8_t  max_sched_groups;
	/**< Maximum schedule groups supported by this device */
	uint8_t  max_sched_group_priority_levels;
	/**< Maximum schedule group priority levels supported by this device */
};

/**
 * Retrieve the contextual information of an event device.
 *
 * @param dev_id
 *   The identifier of the device.
 * @param[out] dev_info
 *   A pointer to a structure of type *rte_eventdev_info* to be filled with the
 *   contextual information of the device.
 */
extern void
rte_eventdev_info_get(uint8_t dev_id, struct rte_eventdev_info *dev_info);

/** Event device configuration structure */
struct rte_eventdev_config {
	uint32_t sched_wait_ns;
	/**< rte_event_schedule() wait for *sched_wait_ns* ns on this device */
	uint32_t nb_flow_queues_log2;
	/**< LOG2 of the number of flow queues to configure on this device */
	uint8_t  nb_sched_groups;
	/**< The number of schedule groups to configure on this device */
};
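
For illustration, a configuration sketch based on the structures above. The
full rte_eventdev_configure() prototype is truncated in this excerpt, so the
(dev_id, &config) call shape below is an assumption:

static int
configure_eventdev(uint8_t dev_id)
{
	struct rte_eventdev_info info;
	struct rte_eventdev_config config;

	rte_eventdev_info_get(dev_id, &info);

	/* Stay within the limits advertised by the device. */
	config.sched_wait_ns = info.min_sched_wait_ns;
	config.nb_flow_queues_log2 = info.max_flow_queues_log2;
	config.nb_sched_groups = info.max_sched_groups;

	return rte_eventdev_configure(dev_id, &config);
}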

/**
 * Configure an event device.
 *
 * This function must be invoked first before any other function in the
 * API. This function can also be re-invoked when a device is in the
 * stopped state.
 *
 * The caller may use rte_eventdev_info_get() to get the capabilities of
 * each resource available in this event device.
 *
 * @param dev_id
 *   The identifier of the device to configure.
 * @param

[dpdk-dev] [PATCH v3] net/i40e: fix Rx statistic inconsistent

2016-08-09 Thread Wei Zhao1
The rx_good_bytes and rx_good_packets statistics are inconsistent when the
port is stopped: the ipackets statistic subtracts the discarded packets, but
the rx_bytes statistic does not, and i40e has no statistic for discarded
bytes. So we check the "dev_started" flag: if the port is started we do the
statistics work, and if it is stopped we don't. This way we avoid the
statistics inconsistency.

Fixes: 9aace75fc82e ("i40e: fix statistics")

Signed-off-by: Wei Zhao1 
---
 drivers/net/i40e/i40e_ethdev.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index d0aeb70..47ed6e9 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -2315,7 +2315,8 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct 
rte_eth_stats *stats)
unsigned i;

/* call read registers - updates values, now write them to struct */
-   i40e_read_stats_registers(pf, hw);
+   if (dev->data->dev_started == 1)
+   i40e_read_stats_registers(pf, hw);

stats->ipackets = pf->main_vsi->eth_stats.rx_unicast +
pf->main_vsi->eth_stats.rx_multicast +
@@ -2494,7 +2495,8 @@ i40e_dev_xstats_get(struct rte_eth_dev *dev, struct 
rte_eth_xstat *xstats,
if (n < count)
return count;

-   i40e_read_stats_registers(pf, hw);
+   if (dev->data->dev_started == 1)
+   i40e_read_stats_registers(pf, hw);

if (xstats == NULL)
return 0;
-- 
2.5.5



[dpdk-dev] [PATCH v2] net/i40e: fix Rx statistic inconsistent

2016-08-09 Thread Zhao1, Wei
Hi Kyle Larose and Jingjing,

> -Original Message-
> From: Kyle Larose [mailto:eomereadig at gmail.com]
> Sent: Wednesday, August 3, 2016 12:22 AM
> To: Zhao1, Wei 
> Cc: Wu, Jingjing ; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: fix Rx statistic inconsistent
> 
> Hello Wei,
> 
> 
> On Tue, Aug 2, 2016 at 2:59 AM, Zhao1, Wei  wrote:
> > > Hi, Wu Jingjing and Kyle Larose
> >
> >
> >
> >> -Original Message-
> >> From: Zhao1, Wei
> >> Sent: Tuesday, August 2, 2016 11:27 AM
> >> To: Wu, Jingjing ; Lu, Wenzhuo
> >> 
> >> Cc: dev at dpdk.org
> >> Subject: RE: [dpdk-dev] [PATCH v2] net/i40e: fix Rx statistic
> >> inconsistent
> >>
> >> Hi, Wu Jingjing and Wenzhuo
> >>
> >> > -Original Message-
> >> > From: Zhao1, Wei
> >> > Sent: Monday, August 1, 2016 4:58 PM
> >> > To: 'Kyle Larose' 
> >> > Cc: dev at dpdk.org
> >> > Subject: RE: [dpdk-dev] [PATCH v2] net/i40e: fix Rx statistic
> >> > inconsistent
> >> >
> >> > Hi, Kyle Larose
> >> > The core problem is that i40e has no statistic for discarded bytes;
> >> > that means that even when ports are not stopped, the rx_good_bytes
> >> > statistic includes discarded bytes. Is that reasonable? In other
> >> > words, I could simply subtract the discarded bytes from
> >> > rx_good_bytes if I could get the discarded byte count, which would
> >> > be much better.
> >> >
> >> > -Original Message-
> >> > From: Kyle Larose [mailto:eomereadig at gmail.com]
> >> > Sent: Saturday, July 30, 2016 1:17 AM
> >> > To: Zhao1, Wei 
> >> > Cc: dev at dpdk.org
> >> > Subject: Re: [dpdk-dev] [PATCH v2] net/i40e: fix Rx statistic
> >> > inconsistent
> >> >
> >> > On Fri, Jul 29, 2016 at 4:50 AM, Wei Zhao1 
> wrote:
> >> > > The rx_good_bytes and rx_good_packets statistics are inconsistent
> >> > > when the port is stopped: the ipackets statistic subtracts the
> >> > > discarded packets but the rx_bytes statistic does not. Also, i40e
> >> > > has no statistic for discarded bytes, so we have to remove the
> >> > > discarded-packets term from the rx_good_packets statistic.
> >> > >
> >> > > Fixes: 9aace75fc82e ("i40e: fix statistics")
> >> > >
> >> > > Signed-off-by: Wei Zhao1 
> >> > > ---
> >> > >  drivers/net/i40e/i40e_ethdev.c | 3 +--
> >> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> >> > >
> >> > > diff --git a/drivers/net/i40e/i40e_ethdev.c
> >> > > b/drivers/net/i40e/i40e_ethdev.c index 11a5804..553dfd9 100644
> >> > > --- a/drivers/net/i40e/i40e_ethdev.c
> >> > > +++ b/drivers/net/i40e/i40e_ethdev.c
> >> > > @@ -2319,8 +2319,7 @@ i40e_dev_stats_get(struct rte_eth_dev
> *dev,
> >> > > struct rte_eth_stats *stats)
> >> > >
> >> > > stats->ipackets = pf->main_vsi->eth_stats.rx_unicast +
> >> > > pf->main_vsi->eth_stats.rx_multicast +
> >> > > -   pf->main_vsi->eth_stats.rx_broadcast -
> >> > > -   pf->main_vsi->eth_stats.rx_discards;
> >> > > +   pf->main_vsi->eth_stats.rx_broadcast;
> >> > > stats->opackets = pf->main_vsi->eth_stats.tx_unicast +
> >> > > pf->main_vsi->eth_stats.tx_multicast +
> >> > > pf->main_vsi->eth_stats.tx_broadcast;
> >> > > --
> >> > > 2.5.5
> >> > >
> >> >
> >> > Is it not worse to report a received packet when no packet was
> >> > actually received by the upper layers under normal operations than
> >> > to ensure that packets and  bytes are consistent when an interface
> >> > is stopped? It seems like the first case is much more likely to
> >> > occur than the
> >> second.
> >> >
> >> > Are we just introducing a new issue to fix another?
> >> >
> >> > How does this behaviour compare to other NICs? Does the ixgbe
> >> > report discarded packets in its ipackets? My reading of the driver
> >> > is that it does not.
> >> > In fact, it does something interesting to deal with the
> >> > problem:
> >> >
> >> > from:
> >> > http://dpdk.org/browse/dpdk/tree/drivers/net/ixgbe/ixgbe_ethdev.c
> >> >
> >> > /*
> >> > * An errata states that gprc actually counts good + missed packets:
> >> > * Workaround to set gprc to summated queue packet receives */
> >> > hw_stats->gprc = *total_qprc;
> >> >
> >> > total_qprc is equal to the sum of the qprc per queue. Can we do
> >> > something similar on the i40e instead of adding unicast, multicast
> >> > and broadcast?
> >>
> >>
> >> I have checked the ixgbe code for the Rx statistics: in
> >> ixgbe_read_stats_registers() we can find the rx_good_bytes and
> >> rx_good_packets statistics.
> >> As listed below, we can see that rx_good_packets is also just the sum
> >> of the Queue Packets Received Counts, with the discarded packet count
> >> not subtracted.
> >> Is there something wrong with my understanding?
> 
> 
> 
> My understanding of the problem can be broken into three parts:
>  1) The Unicast/Multicast/Broadcast packet counters are counting packets
> which were discarded.
>  2) The corresponding byte counters count the bytes of packets which were
> discarded.
>  3) There are no discarded-byte counters.
> 
> Our in bytes counter consis

[dpdk-dev] [PATCH] mk:fix second compile error

2016-08-09 Thread xu,huilong
When compiling different targets in the same environment,
the second compile will fail because the test_resource object files
are not automatically cleaned by the makefile.

Signed-off-by: xu,huilong 
---
 mk/rte.app.mk | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index eb28e11..d23e8b9 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -263,7 +263,8 @@ $(RTE_OUTPUT)/app/$(APP).map: $(APP)
 #
 .PHONY: clean
 clean: _postclean
-   $(Q)rm -f $(_BUILD_TARGETS) $(_INSTALL_TARGETS) $(_CLEAN_TARGETS)
+   $(Q)rm -f $(_BUILD_TARGETS) $(_INSTALL_TARGETS) $(_CLEAN_TARGETS) \
+   test_resource.res test_resource_c.o test_resource_c.res.o

 .PHONY: doclean
 doclean:
-- 
1.9.3



[dpdk-dev] [PATCH] mk:fix second compile error

2016-08-09 Thread Bruce Richardson
On Tue, Aug 09, 2016 at 02:01:58PM +0800, xu,huilong wrote:
> When compiling different targets in the same environment,
> the second compile will fail because the test_resource object files
> are not automatically cleaned by the makefile.
> 
> Signed-off-by: xu,huilong 
> ---
>  mk/rte.app.mk | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index eb28e11..d23e8b9 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -263,7 +263,8 @@ $(RTE_OUTPUT)/app/$(APP).map: $(APP)
>  #
>  .PHONY: clean
>  clean: _postclean
> - $(Q)rm -f $(_BUILD_TARGETS) $(_INSTALL_TARGETS) $(_CLEAN_TARGETS)
> + $(Q)rm -f $(_BUILD_TARGETS) $(_INSTALL_TARGETS) $(_CLEAN_TARGETS) \
> + test_resource.res test_resource_c.o test_resource_c.res.o
>  
This fix looks very specific to the test application, and to specific files in
that application. Can it be made more generic, so that it will work if we add
other resource files to the test app without having to modify the top-level
application makefile?

Regards,
/Bruce


[dpdk-dev] [RFC] libeventdev: event driven programming model framework for DPDK

2016-08-09 Thread Bruce Richardson
On Tue, Aug 09, 2016 at 06:31:41AM +0530, Jerin Jacob wrote:
> Hi All,
> 
> Find below an RFC API specification which attempts to
> define the standard application programming interface
> for event driven programming in DPDK and to abstract HW based event devices.
> 
> These devices can support event scheduling and flow ordering
> in HW and typically found in NW SoCs as an integrated device or
> as PCI EP device.
> 
> The RFC APIs are inspired from existing ethernet and crypto devices.
> Following are the requirements considered to define the RFC API.
> 
> 1) APIs similar to the existing Ethernet and crypto API framework for
> - Device creation, device identification and device configuration
> 2) Enumerate libeventdev resources as numbers (0..N) to
> - Avoid ABI issues with handles
> - Event devices may have millions of flow queues, so it is not practical to
> have handles for each flow queue and its associated name-based
> lookup in the multi-process case
> 3) Avoid struct mbuf changes
> 4) APIs to
> - Enumerate eventdev driver capabilities and resources
> - Enqueue events from an lcore
> - Schedule events
> - Synchronize events
> - Maintain ingress order of the events
> - Support run-to-completion
> 
> Find below the URL for the complete API specification.
> 
> https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
> 
> I have created a supportive document to share the concepts of
> event driven programming model and proposed APIs details to get
> better reach for the specification.
> This presentation will cover introduction to event driven programming model 
> concepts,
> characteristics of hardware-based event manager devices,
> RFC API proposal, example use case, and benefits of using the event driven 
> programming model.
> 
> Find below the URL for the supportive document.
> 
> https://rawgit.com/jerinjacobk/libeventdev/master/DPDK-event_driven_programming_framework.pdf
> 
> git repo for the above documents:
> 
> https://github.com/jerinjacobk/libeventdev/
> 
> Looking forward to getting comments from both application and driver
> implementation perspective.
> 

Hi Jerin,

thanks for the RFC. Packet distribution and scheduling is something we've been
thinking about here too. This RFC gives us plenty of new ideas to take on
board. :-)
While you refer to HW implementations on SoCs, have you given any thought to
how a pure-software implementation of an event API might work? I know that
a software implementation can obviously be done for just about any API, but
I'd be concerned about the API getting in the way of a very highly
tuned implementation.

We'll look at it in some detail and get back to you with our feedback, as soon
as we can, to start getting the discussion going.

Regards,
/Bruce



[dpdk-dev] [PATCH v4 3/6] ip_pipeline: fix lcore mapping for varying SMT threads as in ppc64

2016-08-09 Thread Chao Zhu
Gowrishankar,

Can you give more description about this patch? 
Thank you!

-Original Message-
From: Gowrishankar Muthukrishnan [mailto:gowrishanka...@linux.vnet.ibm.com] 
Sent: August 6, 2016 20:33
To: dev at dpdk.org
Cc: Chao Zhu ; Bruce Richardson
; Konstantin Ananyev
; Thomas Monjalon ;
Cristian Dumitrescu ; Pradeep
; gowrishankar 
Subject: [PATCH v4 3/6] ip_pipeline: fix lcore mapping for varying SMT
threads as in ppc64

From: gowrishankar 

An offline lcore would still refer to its original core id, and this has to be
considered while creating the cpu core mask.

Signed-off-by: Gowrishankar 
---
 config/defconfig_ppc_64-power8-linuxapp-gcc |  3 ---
 examples/ip_pipeline/cpu_core_map.c | 12 +---
 examples/ip_pipeline/init.c |  4 
 3 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc
b/config/defconfig_ppc_64-power8-linuxapp-gcc
index dede34f..a084672 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -58,6 +58,3 @@ CONFIG_RTE_LIBRTE_FM10K_PMD=n

 # This following libraries are not available on Power. So they're turned
off.
 CONFIG_RTE_LIBRTE_SCHED=n
-CONFIG_RTE_LIBRTE_PORT=n
-CONFIG_RTE_LIBRTE_TABLE=n
-CONFIG_RTE_LIBRTE_PIPELINE=n
diff --git a/examples/ip_pipeline/cpu_core_map.c
b/examples/ip_pipeline/cpu_core_map.c
index cb088b1..482e68e 100644
--- a/examples/ip_pipeline/cpu_core_map.c
+++ b/examples/ip_pipeline/cpu_core_map.c
@@ -351,9 +351,6 @@ cpu_core_map_compute_linux(struct cpu_core_map *map)
int lcore_socket_id =
cpu_core_map_get_socket_id_linux(lcore_id);

-   if (lcore_socket_id < 0)
-   return -1;
-
if (((uint32_t) lcore_socket_id) == socket_id)
n_detected++;
}
@@ -368,18 +365,11 @@ cpu_core_map_compute_linux(struct cpu_core_map *map)
cpu_core_map_get_socket_id_linux(
lcore_id);

-   if (lcore_socket_id < 0)
-   return -1;
-
Why remove the lcore_socket_id check?

int lcore_core_id =
cpu_core_map_get_core_id_linux(
lcore_id);

-   if (lcore_core_id < 0)
-   return -1;
-
-   if (((uint32_t) lcore_socket_id ==
socket_id) &&
-   ((uint32_t) lcore_core_id ==
core_id)) {
+   if ((uint32_t) lcore_socket_id == socket_id)
{
uint32_t pos = cpu_core_map_pos(map,
socket_id,
core_id_contig,
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c index
cd167f6..60c931f 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -61,7 +61,11 @@ static void
 app_init_core_map(struct app_params *app)  {
APP_LOG(app, HIGH, "Initializing CPU core map ...");
+#if defined(RTE_ARCH_PPC_64)
+   app->core_map = cpu_core_map_init(2, 5, 1, 0);
+#else

This value seems quite strange. Can you give more detail?

app->core_map = cpu_core_map_init(4, 32, 4, 0);
+#endif

if (app->core_map == NULL)
rte_panic("Cannot create CPU core map\n");
--
1.9.1




[dpdk-dev] vhost [query] : support for multiple ports and non VMDQ devices in vhost switch

2016-08-09 Thread Pankaj Chauhan

Hi,

I am working on an NXP platform where we intend to use the user space vhost
switch (examples/vhost) as the backend for VIRTIO devices. But there are two
limitations in the current vhost-switch (examples/vhost) that are restricting
my use case:

1. The vhost-switch application is tightly integrated with Intel VMDQ.
Since my device doesn't have VMDQ, I cannot use this application directly.

2. The vhost-switch application supports only one external or physical
port (non-virtio devices), but my requirement is to have multiple
physical ports and multiple virtio devices.

In summary, my requirement is to do whatever vhost-switch is doing and, in
addition, add support for the following:

1. support devices that don't have VMDQ.
2. Support multiple physical ports.

I need suggestions on the approach I should take: whether to add support
for the above to the existing vhost-switch (examples/vhost) or to write
another application (based on librte_vhost only) to support my requirements.

I'll work on it based on the suggestions I get from the list, and send the
RFC patch.

Thanks,
Pankaj




[dpdk-dev] [PATCH v4 3/6] ip_pipeline: fix lcore mapping for varying SMT threads as in ppc64

2016-08-09 Thread gowrishankar muthukrishnan
Hi Chao,
Sure. Please find one below.

This patch fixes an ip_pipeline panic in app_init_core_map while preparing the
cpu core map on powerpc with SMT off. cpu_core_map_compute_linux currently
prepares the core mapping based on the existence of the following sysfs files:

/sys/devices/system/cpu/cpu<N>/topology/physical_package_id
/sys/devices/system/cpu/cpu<N>/topology/core_id

These files do not exist for lcores which are offline for any reason (as on
powerpc, while SMT is off). In this situation, the function should continue
preparing the map for the other online lcores instead of returning -1 at the
first unavailable lcore.

Also, in the SMT=off scenario on powerpc, lcore ids cannot always be indexed
from 0 up to 'number of cores present' (/sys/devices/system/cpu/present).
For example, for an online lcore 32, the core_id returned in sysfs is 112
while the number of online lcores is 10 (in one configuration); hence the
sysfs lcore id cannot simply be checked against the lcore index before
positioning it in the lcore map array.
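
To illustrate the intent (not part of the patch itself), a minimal sketch of a
sysfs topology read that skips offline lcores instead of aborting; the helper
name is hypothetical:

#include <stdio.h>

static int
sysfs_core_id(unsigned int lcore_id)
{
	char path[128];
	FILE *f;
	int core_id;

	snprintf(path, sizeof(path),
		"/sys/devices/system/cpu/cpu%u/topology/core_id", lcore_id);
	f = fopen(path, "r");
	if (f == NULL)
		return -1; /* offline lcore: skip it, do not fail the whole map */
	if (fscanf(f, "%d", &core_id) != 1)
		core_id = -1;
	fclose(f);
	return core_id;
}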

Thanks,
Gowrishankar

On Tuesday 09 August 2016 02:37 PM, Chao Zhu wrote:
> Gowrishankar,
>
> Can you give more description about this patch?
> Thank you!
>
> -Original Message-
> From: Gowrishankar Muthukrishnan [mailto:gowrishankar.m at linux.vnet.ibm.com]
> Sent: August 6, 2016 20:33
> To: dev at dpdk.org
> Cc: Chao Zhu ; Bruce Richardson
> ; Konstantin Ananyev
> ; Thomas Monjalon  6wind.com>;
> Cristian Dumitrescu ; Pradeep
> ; gowrishankar 
> Subject: [PATCH v4 3/6] ip_pipeline: fix lcore mapping for varying SMT
> threads as in ppc64
>
> From: gowrishankar 
>
> An offline lcore would still refer to its original core id, and this has to be
> considered while creating the cpu core mask.
>
> Signed-off-by: Gowrishankar 
> ---
>   config/defconfig_ppc_64-power8-linuxapp-gcc |  3 ---
>   examples/ip_pipeline/cpu_core_map.c | 12 +---
>   examples/ip_pipeline/init.c |  4 
>   3 files changed, 5 insertions(+), 14 deletions(-)
>
> diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc
> b/config/defconfig_ppc_64-power8-linuxapp-gcc
> index dede34f..a084672 100644
> --- a/config/defconfig_ppc_64-power8-linuxapp-gcc
> +++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
> @@ -58,6 +58,3 @@ CONFIG_RTE_LIBRTE_FM10K_PMD=n
>
>   # This following libraries are not available on Power. So they're turned
> off.
>   CONFIG_RTE_LIBRTE_SCHED=n
> -CONFIG_RTE_LIBRTE_PORT=n
> -CONFIG_RTE_LIBRTE_TABLE=n
> -CONFIG_RTE_LIBRTE_PIPELINE=n
> diff --git a/examples/ip_pipeline/cpu_core_map.c
> b/examples/ip_pipeline/cpu_core_map.c
> index cb088b1..482e68e 100644
> --- a/examples/ip_pipeline/cpu_core_map.c
> +++ b/examples/ip_pipeline/cpu_core_map.c
> @@ -351,9 +351,6 @@ cpu_core_map_compute_linux(struct cpu_core_map *map)
>   int lcore_socket_id =
>   cpu_core_map_get_socket_id_linux(lcore_id);
>
> - if (lcore_socket_id < 0)
> - return -1;
> -
>   if (((uint32_t) lcore_socket_id) == socket_id)
>   n_detected++;
>   }
> @@ -368,18 +365,11 @@ cpu_core_map_compute_linux(struct cpu_core_map *map)
>   cpu_core_map_get_socket_id_linux(
>   lcore_id);
>
> - if (lcore_socket_id < 0)
> - return -1;
> -
> Why remove the lcore_socket_id check?
>
>   int lcore_core_id =
>   cpu_core_map_get_core_id_linux(
>   lcore_id);
>
> - if (lcore_core_id < 0)
> - return -1;
> -
> - if (((uint32_t) lcore_socket_id ==
> socket_id) &&
> - ((uint32_t) lcore_core_id ==
> core_id)) {
> + if ((uint32_t) lcore_socket_id == socket_id)
> {
>   uint32_t pos = cpu_core_map_pos(map,
>   socket_id,
>   core_id_contig,
> diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c index
> cd167f6..60c931f 100644
> --- a/examples/ip_pipeline/init.c
> +++ b/examples/ip_pipeline/init.c
> @@ -61,7 +61,11 @@ static void
>   app_init_core_map(struct app_params *app)  {
>   APP_LOG(app, HIGH, "Initializing CPU core map ...");
> +#if defined(RTE_ARCH_PPC_64)
> + app->core_map = cpu_core_map_init(2, 5, 1, 0);
> +#else
>
> This value seems quite strange. Can you give more detail?
>
>   app->core_map = cpu_core_map_init(4, 32, 4, 0);
> +#endif
>
>   if (app->core_map == NULL)
>   rte_panic("Cannot create CPU core map\n");
> --
> 1.9.1
>
>
>




[dpdk-dev] [PATCH v2] examples/exception_path: fix shift operation in lcore setup

2016-08-09 Thread Daniel Mrzyglod
The operation may have undefined behavior or yield an unexpected result:
a bit shift operation has a shift amount which is too large or has a negative
value.

As was mentioned on the mailing list, the core list was limited to 64, so I
changed the bitmask to a core array.

Coverity issue: 30688
Fixes: ea977ff1cb0b ("examples/exception_path: fix shift operation in lcore setup")

Signed-off-by: Daniel Mrzyglod 
---
 examples/exception_path/main.c | 74 ++
 1 file changed, 61 insertions(+), 13 deletions(-)

diff --git a/examples/exception_path/main.c b/examples/exception_path/main.c
index e5eedcc..1493338 100644
--- a/examples/exception_path/main.c
+++ b/examples/exception_path/main.c
@@ -128,11 +128,11 @@ static struct rte_mempool * pktmbuf_pool = NULL;
 /* Mask of enabled ports */
 static uint32_t ports_mask = 0;

-/* Mask of cores that read from NIC and write to tap */
-static uint64_t input_cores_mask = 0;
+/* Table of cores that read from NIC and write to tap */
+static uint8_t input_cores_table[RTE_MAX_LCORE];

-/* Mask of cores that read from tap and write to NIC */
-static uint64_t output_cores_mask = 0;
+/* Table of cores that read from tap and write to NIC */
+static uint8_t output_cores_table[RTE_MAX_LCORE];

 /* Array storing port_id that is associated with each lcore */
 static uint8_t port_ids[RTE_MAX_LCORE];
@@ -224,7 +224,7 @@ main_loop(__attribute__((unused)) void *arg)
char tap_name[IFNAMSIZ];
int tap_fd;

-   if ((1ULL << lcore_id) & input_cores_mask) {
+   if (input_cores_table[lcore_id]) {
/* Create new tap interface */
snprintf(tap_name, IFNAMSIZ, "tap_dpdk_%.2u", lcore_id);
tap_fd = tap_create(tap_name);
@@ -257,7 +257,7 @@ main_loop(__attribute__((unused)) void *arg)
}
}
}
-   else if ((1ULL << lcore_id) & output_cores_mask) {
+   else if (output_cores_table[lcore_id]) {
/* Create new tap interface */
snprintf(tap_name, IFNAMSIZ, "tap_dpdk_%.2u", lcore_id);
tap_fd = tap_create(tap_name);
@@ -341,7 +341,7 @@ setup_port_lcore_affinities(void)

/* Setup port_ids[] array, and check masks were ok */
RTE_LCORE_FOREACH(i) {
-   if (input_cores_mask & (1ULL << i)) {
+   if (input_cores_table[i]) {
/* Skip ports that are not enabled */
while ((ports_mask & (1 << rx_port)) == 0) {
rx_port++;
@@ -350,7 +350,7 @@ setup_port_lcore_affinities(void)
}

port_ids[i] = rx_port++;
-   } else if (output_cores_mask & (1ULL << (i & 0x3f))) {
+   } else if (output_cores_table[i]) {
/* Skip ports that are not enabled */
while ((ports_mask & (1 << tx_port)) == 0) {
tx_port++;
@@ -373,6 +373,54 @@ fail:
FATAL_ERROR("Invalid core/port masks specified on command line");
 }

+static int parse_hex2coretable(const char *coremask_string, uint8_t *core_table,
+   int core_table_size)
+{
+
+   char portmask[RTE_MAX_LCORE];
+
+   int coremask_string_len;
+   int i = 0, j = 0;
+   uint64_t num;
+   char tmp_char;
+   char *end = NULL;
+
+   coremask_string_len = strlen(coremask_string);
+   if ((coremask_string_len > (RTE_MAX_LCORE / 4) + 2)
+   || (core_table_size > RTE_MAX_LCORE) || (coremask_string == NULL)
+   || (core_table == NULL))
+   return -1;
+
+   memset(portmask, '\0', sizeof(portmask));
+   memset(core_table, '\0', sizeof(uint8_t) * core_table_size);
+   strncpy(portmask, coremask_string, coremask_string_len);
+
+   if (coremask_string[i] == '0') {
+   ++i;
+   if (coremask_string[i] == 'X' || coremask_string[i] == 'x')
+   ++i;
+   }
+
+   j = 0;
+   while (coremask_string_len - i > 0) {
+
+   tmp_char = portmask[coremask_string_len - 1];
+   end = NULL;
+   num = strtoull(&tmp_char, &end, 16);
+   if ((end == &tmp_char))
+   return -1;
+
+   for (int z = 0; z < 4; z++)
+   if (((num) & (1 << (z))) != 0)
+   core_table[z + j * 4] = 1;
+
+   coremask_string_len--;
+   j++;
+   }
+
+   return 0;
+}
+
 /* Parse the arguments given in the command line of the application */
 static void
 parse_args(int argc, char **argv)
@@ -382,15 +430,15 @@ parse_args(int argc, char **argv)

/* Disable printing messages within getopt() */
opterr = 0;
-
+   int reti = 0, reto = 0;
/* Parse command line */
while ((opt = getopt(argc, argv, "i:o:p:")) != EOF) {
switch (opt) {
   

[dpdk-dev] [PATCH 0/2] Add HMAC_MD5 to Intel QuickAssist Technology driver

2016-08-09 Thread Arek Kusztal
This patchset adds the capability to use the HMAC_MD5 hash algorithm to the
Intel QuickAssist Technology driver, and adds test cases to the cryptodev
test files.

This patchset depends on the following patches/patchsets:

"crypto/qat: make the session struct variable in size"
(http://dpdk.org/dev/patchwork/patch/15125/)

Arek Kusztal (2):
  crypto/qat: add MD5 HMAC capability to Intel QAT driver
  app/test: add test cases for MD5 HMAC for Intel QAT driver

 app/test/test_cryptodev.c| 185 +++
 app/test/test_cryptodev_hmac_test_vectors.h  | 121 +++
 doc/guides/cryptodevs/qat.rst|   1 +
 drivers/crypto/qat/qat_adf/qat_algs_build_desc.c |  34 +
 drivers/crypto/qat/qat_crypto.c  |  28 +++-
 5 files changed, 367 insertions(+), 2 deletions(-)
 create mode 100644 app/test/test_cryptodev_hmac_test_vectors.h

-- 
2.1.0



[dpdk-dev] [PATCH 1/2] crypto/qat: add MD5 HMAC capability to Intel QAT driver

2016-08-09 Thread Arek Kusztal
Added the possibility to compute an MD5 HMAC digest with the Intel QuickAssist
Technology driver.

Signed-off-by: Arek Kusztal 
---
 doc/guides/cryptodevs/qat.rst|  1 +
 drivers/crypto/qat/qat_adf/qat_algs_build_desc.c | 34 
 drivers/crypto/qat/qat_crypto.c  | 28 +--
 3 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index cae1958..485abb4 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -57,6 +57,7 @@ Hash algorithms:
 * ``RTE_CRYPTO_AUTH_SHA512_HMAC``
 * ``RTE_CRYPTO_AUTH_AES_XCBC_MAC``
 * ``RTE_CRYPTO_AUTH_SNOW3G_UIA2``
+* ``RTE_CRYPTO_AUTH_MD5_HMAC``


 Limitations
diff --git a/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c 
b/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
index c658f6e..521a9c4 100644
--- a/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
+++ b/drivers/crypto/qat/qat_adf/qat_algs_build_desc.c
@@ -58,6 +58,7 @@

 #include <openssl/sha.h>	/* Needed to calculate pre-compute values */
 #include <openssl/aes.h>	/* Needed to calculate pre-compute values */
+#include <openssl/md5.h>	/* Needed to calculate pre-compute values */


 /*
@@ -86,6 +87,9 @@ static int qat_hash_get_state1_size(enum icp_qat_hw_auth_algo 
qat_hash_alg)
case ICP_QAT_HW_AUTH_ALGO_SNOW_3G_UIA2:
return QAT_HW_ROUND_UP(ICP_QAT_HW_SNOW_3G_UIA2_STATE1_SZ,
QAT_HW_DEFAULT_ALIGNMENT);
+   case ICP_QAT_HW_AUTH_ALGO_MD5:
+   return QAT_HW_ROUND_UP(ICP_QAT_HW_MD5_STATE1_SZ,
+   QAT_HW_DEFAULT_ALIGNMENT);
case ICP_QAT_HW_AUTH_ALGO_DELIMITER:
/* return maximum state1 size in this case */
return QAT_HW_ROUND_UP(ICP_QAT_HW_SHA512_STATE1_SZ,
@@ -107,6 +111,8 @@ static int qat_hash_get_digest_size(enum 
icp_qat_hw_auth_algo qat_hash_alg)
return ICP_QAT_HW_SHA256_STATE1_SZ;
case ICP_QAT_HW_AUTH_ALGO_SHA512:
return ICP_QAT_HW_SHA512_STATE1_SZ;
+   case ICP_QAT_HW_AUTH_ALGO_MD5:
+   return ICP_QAT_HW_MD5_STATE1_SZ;
case ICP_QAT_HW_AUTH_ALGO_DELIMITER:
/* return maximum digest size in this case */
return ICP_QAT_HW_SHA512_STATE1_SZ;
@@ -129,6 +135,8 @@ static int qat_hash_get_block_size(enum 
icp_qat_hw_auth_algo qat_hash_alg)
return SHA512_CBLOCK;
case ICP_QAT_HW_AUTH_ALGO_GALOIS_128:
return 16;
+   case ICP_QAT_HW_AUTH_ALGO_MD5:
+   return MD5_CBLOCK;
case ICP_QAT_HW_AUTH_ALGO_DELIMITER:
/* return maximum block size in this case */
return SHA512_CBLOCK;
@@ -172,6 +180,19 @@ static int partial_hash_sha512(uint8_t *data_in, uint8_t 
*data_out)
return 0;
 }

+static int partial_hash_md5(uint8_t *data_in, uint8_t *data_out)
+{
+
+   MD5_CTX ctx;
+
+   if (!MD5_Init(&ctx))
+   return -EFAULT;
+   MD5_Transform(&ctx, data_in);
+   rte_memcpy(data_out, &ctx, MD5_DIGEST_LENGTH);
+
+   return 0;
+}
+
 static int partial_hash_compute(enum icp_qat_hw_auth_algo hash_alg,
uint8_t *data_in,
uint8_t *data_out)
@@ -213,6 +234,10 @@ static int partial_hash_compute(enum icp_qat_hw_auth_algo 
hash_alg,
*hash_state_out_be64 =
rte_bswap64(*(((uint64_t *)digest)+i));
break;
+   case ICP_QAT_HW_AUTH_ALGO_MD5:
+   if (partial_hash_md5(data_in, data_out))
+   return -EFAULT;
+   break;
default:
PMD_DRV_LOG(ERR, "invalid hash alg %u", hash_alg);
return -EFAULT;
@@ -620,6 +645,15 @@ int qat_alg_aead_session_create_content_desc_auth(struct 
qat_session *cdesc,
auth_param->hash_state_sz =
RTE_ALIGN_CEIL(add_auth_data_length, 16) >> 3;
break;
+   case ICP_QAT_HW_AUTH_ALGO_MD5:
+   if (qat_alg_do_precomputes(ICP_QAT_HW_AUTH_ALGO_MD5,
+   authkey, authkeylen, cdesc->cd_cur_ptr,
+   &state1_size)) {
+   PMD_DRV_LOG(ERR, "(MD5)precompute failed");
+   return -EFAULT;
+   }
+   state2_size = ICP_QAT_HW_MD5_STATE2_SZ;
+   break;
default:
PMD_DRV_LOG(ERR, "Invalid HASH alg %u", cdesc->qat_hash_alg);
return -EFAULT;
diff --git a/drivers/crypto/qat/qat_crypto.c b/drivers/crypto/qat/qat_crypto.c
index 9a5f8ad..e90f181 100644
--- a/drivers/crypto/qat/qat_crypto.c
+++ b/drivers/crypto/qat/qat_crypto.c
@@ -132,6 +132,27 @@ static const struct rte_cryptodev_capabilities 
qat_pmd_capabilities[] = {
}, }
}, }
},
+   {   /* MD5 HMAC */
+   

[dpdk-dev] [PATCH 2/2] app/test: add test cases for MD5 HMAC for Intel QAT driver

2016-08-09 Thread Arek Kusztal
Added the MD5 HMAC hash algorithm to the test files for the Intel QuickAssist
Technology driver.

Signed-off-by: Arek Kusztal 
---
 app/test/test_cryptodev.c   | 185 
 app/test/test_cryptodev_hmac_test_vectors.h | 121 ++
 2 files changed, 306 insertions(+)
 create mode 100644 app/test/test_cryptodev_hmac_test_vectors.h

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 647787d..8553759 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -49,6 +49,7 @@
 #include "test_cryptodev_snow3g_test_vectors.h"
 #include "test_cryptodev_snow3g_hash_test_vectors.h"
 #include "test_cryptodev_gcm_test_vectors.h"
+#include "test_cryptodev_hmac_test_vectors.h"

 static enum rte_cryptodev_type gbl_cryptodev_type;

@@ -3431,6 +3432,179 @@ test_stats(void)
return TEST_SUCCESS;
 }

+static int MD5_HMAC_create_session(struct crypto_testsuite_params *ts_params,
+  struct crypto_unittest_params *ut_params,
+  enum rte_crypto_auth_operation op,
+  const struct HMAC_MD5_vector *test_case)
+{
+   uint8_t key[64];
+
+   memcpy(key, test_case->key.data, test_case->key.len);
+
+   ut_params->auth_xform.type = RTE_CRYPTO_SYM_XFORM_AUTH;
+   ut_params->auth_xform.next = NULL;
+   ut_params->auth_xform.auth.op = op;
+
+   ut_params->auth_xform.auth.algo = RTE_CRYPTO_AUTH_MD5_HMAC;
+
+   ut_params->auth_xform.auth.digest_length = MD5_DIGEST_LEN;
+   ut_params->auth_xform.auth.add_auth_data_length = 0;
+   ut_params->auth_xform.auth.key.length = test_case->key.len;
+   ut_params->auth_xform.auth.key.data = key;
+
+   ut_params->sess = rte_cryptodev_sym_session_create(
+   ts_params->valid_devs[0], &ut_params->auth_xform);
+
+   if (ut_params->sess == NULL)
+   return TEST_FAILED;
+
+   ut_params->ibuf = rte_pktmbuf_alloc(ts_params->mbuf_pool);
+
+   memset(rte_pktmbuf_mtod(ut_params->ibuf, uint8_t *), 0,
+   rte_pktmbuf_tailroom(ut_params->ibuf));
+
+   return 0;
+}
+
+static int MD5_HMAC_create_op(struct crypto_unittest_params *ut_params,
+ const struct HMAC_MD5_vector *test_case,
+ uint8_t **plaintext)
+{
+   uint16_t plaintext_pad_len;
+
+   struct rte_crypto_sym_op *sym_op = ut_params->op->sym;
+
+   plaintext_pad_len = RTE_ALIGN_CEIL(test_case->plaintext.len,
+   16);
+
+   *plaintext = (uint8_t *)rte_pktmbuf_append(ut_params->ibuf,
+   plaintext_pad_len);
+   memcpy(*plaintext, test_case->plaintext.data,
+   test_case->plaintext.len);
+
+   sym_op->auth.digest.data = (uint8_t *)rte_pktmbuf_append(
+   ut_params->ibuf, MD5_DIGEST_LEN);
+   TEST_ASSERT_NOT_NULL(sym_op->auth.digest.data,
+   "no room to append digest");
+   sym_op->auth.digest.phys_addr = rte_pktmbuf_mtophys_offset(
+   ut_params->ibuf, plaintext_pad_len);
+   sym_op->auth.digest.length = MD5_DIGEST_LEN;
+
+   if (ut_params->auth_xform.auth.op == RTE_CRYPTO_AUTH_OP_VERIFY) {
+   rte_memcpy(sym_op->auth.digest.data, test_case->auth_tag.data,
+  test_case->auth_tag.len);
+   }
+
+   sym_op->auth.data.offset = 0;
+   sym_op->auth.data.length = test_case->plaintext.len;
+
+   rte_crypto_op_attach_sym_session(ut_params->op, ut_params->sess);
+   ut_params->op->sym->m_src = ut_params->ibuf;
+
+   return 0;
+}
+
+static int
+test_MD5_HMAC_generate(const struct HMAC_MD5_vector *test_case)
+{
+   uint16_t plaintext_pad_len;
+   uint8_t *plaintext, *auth_tag;
+
+   struct crypto_testsuite_params *ts_params = &testsuite_params;
+   struct crypto_unittest_params *ut_params = &unittest_params;
+
+   if (MD5_HMAC_create_session(ts_params, ut_params,
+   RTE_CRYPTO_AUTH_OP_GENERATE, test_case))
+   return TEST_FAILED;
+
+   /* Generate Crypto op data structure */
+   ut_params->op = rte_crypto_op_alloc(ts_params->op_mpool,
+   RTE_CRYPTO_OP_TYPE_SYMMETRIC);
+   TEST_ASSERT_NOT_NULL(ut_params->op,
+   "Failed to allocate symmetric crypto operation struct");
+
+   plaintext_pad_len = RTE_ALIGN_CEIL(test_case->plaintext.len,
+   16);
+
+   if (MD5_HMAC_create_op(ut_params, test_case, &plaintext))
+   return TEST_FAILED;
+
+   TEST_ASSERT_NOT_NULL(process_crypto_request(ts_params->valid_devs[0],
+   ut_params->op), "failed to process sym crypto op");
+
+   TEST_ASSERT_EQUAL(ut_params->op->status, RTE_CRYPTO_OP_STATUS_SUCCESS,
+   "crypto op processing failed");
+
+   if (ut_params->op->sym->m_dst) {
+   au

[dpdk-dev] [PATCH v1] crypto/qat: make the session struct variable in size

2016-08-09 Thread Trahe, Fiona


-Original Message-
From: Griffin, John 
Sent: Thursday, August 4, 2016 4:46 PM
To: dev at dpdk.org
Cc: Griffin, John ; Trahe, Fiona ; Jain, Deepak K ; De Lara Guarch, Pablo 

Subject: [PATCH v1] crypto/qat: make the session struct variable in size

This patch changes the QAT firmware session data structure from a fixed size to
a variable size which depends on the size of the chosen algorithm.
This reduces the number of bytes which are transferred across PCIe and thus
helps to increase QAT performance when the accelerator is PCIe-bound.
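
For illustration, not from the patch itself: a generic sketch of the
variable-size idea using a flexible array member. The struct and field names
are hypothetical, not the actual QAT firmware structures:

#include <stdlib.h>
#include <string.h>
#include <stdint.h>

/* A fixed header plus a content descriptor whose length depends on the
 * chosen algorithm, so only the bytes actually needed are allocated and
 * transferred. */
struct var_session {
	uint16_t cd_len;	/* bytes valid in cd[] for this algorithm */
	uint8_t  cd[];		/* flexible array member, sized per algorithm */
};

static struct var_session *
var_session_create(const uint8_t *cd, uint16_t cd_len)
{
	struct var_session *s = malloc(sizeof(*s) + cd_len);

	if (s == NULL)
		return NULL;
	s->cd_len = cd_len;
	memcpy(s->cd, cd, cd_len);
	return s;
}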

Signed-off-by: John Griffin 
---
v1:
* Fixed a compile issue with icc.

Acked-by: Fiona Trahe 


[dpdk-dev] [PATCH 1/2] lib/librte_port: modify source and sink port structure parameter

2016-08-09 Thread Jasvinder Singh
The ``file_name`` data type of ``struct rte_port_source_params`` and
``struct rte_port_sink_params`` is changed from ``char *`` to ``const char *``.

Signed-off-by: Jasvinder Singh 
---
 doc/guides/rel_notes/deprecation.rst   | 4 
 doc/guides/rel_notes/release_16_11.rst | 3 ++-
 lib/librte_port/rte_port_source_sink.h | 4 ++--
 3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 96db661..f302af0 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -61,7 +61,3 @@ Deprecation Notices
   renamed to something more consistent (net and crypto prefixes) in 16.11.
   Some of these driver names are used publicly, to create virtual devices,
   so a deprecation notice is necessary.
-
-* API will change for ``rte_port_source_params`` and ``rte_port_sink_params``
-  structures. The member ``file_name`` data type will be changed from
-  ``char *`` to ``const char *``. This change targets release 16.11.
diff --git a/doc/guides/rel_notes/release_16_11.rst 
b/doc/guides/rel_notes/release_16_11.rst
index 0b9022d..4f3d899 100644
--- a/doc/guides/rel_notes/release_16_11.rst
+++ b/doc/guides/rel_notes/release_16_11.rst
@@ -94,7 +94,8 @@ API Changes

This section is a comment. Make sure to start the actual text at the margin.

-* The log history is removed.
+* The ``file_name`` data type of ``struct rte_port_source_params`` and
+  ``struct rte_port_sink_params`` is changed from ``char *`` to ``const char *``.


 ABI Changes
diff --git a/lib/librte_port/rte_port_source_sink.h 
b/lib/librte_port/rte_port_source_sink.h
index 4db8a8a..be585a7 100644
--- a/lib/librte_port/rte_port_source_sink.h
+++ b/lib/librte_port/rte_port_source_sink.h
@@ -55,7 +55,7 @@ struct rte_port_source_params {
struct rte_mempool *mempool;

/** The full path of the pcap file to read packets from */
-   char *file_name;
+   const char *file_name;
/** The number of bytes to be read from each packet in the
 *  pcap file. If this value is 0, the whole packet is read;
 *  if it is bigger than packet size, the generated packets
@@ -69,7 +69,7 @@ extern struct rte_port_in_ops rte_port_source_ops;
 /** sink port parameters */
 struct rte_port_sink_params {
/** The full path of the pcap file to write the packets to */
-   char *file_name;
+   const char *file_name;
/** The maximum number of packets write to the pcap file.
 *  If this value is 0, the "infinite" write will be carried
 *  out.
-- 
2.5.5
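
For illustration, not from the patch above: a small usage sketch assuming the
header shown in the diff. With a ``const char *`` member, a string literal can
be assigned directly without a cast or strdup(); the mempool setup is omitted:

#include <stddef.h>
#include <rte_port_source_sink.h>

static struct rte_port_source_params source_params = {
	.mempool = NULL,	/* to be set to a real mbuf mempool */
	.file_name = "./config/packets.pcap",
	.n_bytes_per_pkt = 0,	/* 0: read the whole packet */
};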



[dpdk-dev] [PATCH 2/2] examples/ip_pipeline: modify source port default parameter

2016-08-09 Thread Jasvinder Singh
The default value of the ``file_name`` parameter of the source port structure is
changed from ``NULL`` to ``./config/packets.pcap``.

Signed-off-by: Jasvinder Singh 
---
 examples/ip_pipeline/app.h  | 4 ++--
 examples/ip_pipeline/config_parse.c | 6 +-
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
index 6a6fdd9..4fdf0d9 100644
--- a/examples/ip_pipeline/app.h
+++ b/examples/ip_pipeline/app.h
@@ -182,14 +182,14 @@ struct app_pktq_source_params {
uint32_t parsed;
uint32_t mempool_id; /* Position in the app->mempool_params array */
uint32_t burst;
-   char *file_name; /* Full path of PCAP file to be copied to mbufs */
+   const char *file_name; /* Full path of PCAP file to be copied to mbufs */
uint32_t n_bytes_per_pkt;
 };

 struct app_pktq_sink_params {
char *name;
uint8_t parsed;
-   char *file_name; /* Full path of PCAP file to be copied to mbufs */
+   const char *file_name; /* Full path of PCAP file to be copied to mbufs */
uint32_t n_pkts_to_dump;
 };

diff --git a/examples/ip_pipeline/config_parse.c 
b/examples/ip_pipeline/config_parse.c
index 8fe8157..48c9923 100644
--- a/examples/ip_pipeline/config_parse.c
+++ b/examples/ip_pipeline/config_parse.c
@@ -207,7 +207,7 @@ struct app_pktq_source_params default_source_params = {
.parsed = 0,
.mempool_id = 0,
.burst = 32,
-   .file_name = NULL,
+   .file_name = "./config/packets.pcap",
.n_bytes_per_pkt = 0,
 };

@@ -3083,10 +3083,6 @@ app_config_init(struct app_params *app)

memcpy(app, &app_params_default, sizeof(struct app_params));

-   /* configure default_source_params */
-   default_source_params.file_name = strdup("./config/packets.pcap");
-   PARSE_ERROR_MALLOC(default_source_params.file_name != NULL);
-
for (i = 0; i < RTE_DIM(app->mempool_params); i++)
memcpy(&app->mempool_params[i],
&mempool_params_default,
-- 
2.5.5



[dpdk-dev] [PATCH 1/2] lib/librte_port: modify source and sink port structure parameter

2016-08-09 Thread Dumitrescu, Cristian


> -Original Message-
> From: Singh, Jasvinder
> Sent: Tuesday, August 9, 2016 9:31 AM
> To: dev at dpdk.org
> Cc: Dumitrescu, Cristian 
> Subject: [PATCH 1/2] lib/librte_port: modify source and sink port structure
> parameter
> 
> The ``file_name`` data type of ``struct rte_port_source_params`` and
> ``struct rte_port_sink_params`` is changed from `char *`` to ``const char *``.
> 
> Signed-off-by: Jasvinder Singh 
> ---
>  doc/guides/rel_notes/deprecation.rst   | 4 
>  doc/guides/rel_notes/release_16_11.rst | 3 ++-
>  lib/librte_port/rte_port_source_sink.h | 4 ++--
>  3 files changed, 4 insertions(+), 7 deletions(-)
> 

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH 2/2] examples/ip_pipeline: modify source port default parameter

2016-08-09 Thread Dumitrescu, Cristian


> -Original Message-
> From: Singh, Jasvinder
> Sent: Tuesday, August 9, 2016 9:31 AM
> To: dev at dpdk.org
> Cc: Dumitrescu, Cristian 
> Subject: [PATCH 2/2] examples/ip_pipeline: modify source port default
> parameter
> 
> The default value of ``file_name`` parameter of the source port structure is
> changed from ``NULL`` to ``./config/packets.pcap``.
> 
> Signed-off-by: Jasvinder Singh 
> ---
>  examples/ip_pipeline/app.h  | 4 ++--
>  examples/ip_pipeline/config_parse.c | 6 +-
>  2 files changed, 3 insertions(+), 7 deletions(-)
> 

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-09 Thread John Fastabend
[...]

>> I'm not sure I understand 'bit granularity' here. I would say we have
>> devices now that have rather strange restrictions due to hardware
>> implementation. Going forward we should get better hardware and a lot
>> of this will go away in my view. Yes this is a long term view and
>> doesn't help the current state. The overall point you are making is
>> the sum of all these strange/odd bits in the hardware implementation
>> means capabilities queries are very difficult to guarantee. On existing
>> hardware and I think you've convinced me. Thanks ;)
> 
> Precisely. By "bit granularity" I meant that while it is fairly easy to
> report whether bit-masking is supported on protocol fields such as MAC
> addresses at all, devices may have restrictions on the possible bit-masks,
> like they may only have an effect at byte level (0xff), may not allow
> specific bits (broadcast) or there even may be a fixed set of bit-masks to
> choose from.

Yep lots of strange hardware implementation voodoo here.

> 
> [...]
>>> I understand, however I think this approach may be too low-level to express
>>> all the possible combinations. This graph would have to include possible
>>> actions for each possible pattern, all while considering that some actions
>>> are not possible with some patterns and that there are exclusive actions.
>>>
>>
>> Really? You have hardware that has dependencies between the parser and
>> the supported actions? Ugh...
> 
> Not that I know of actually, even though we cannot rule out this
> possibility.
> 
> Here are the possible cases I have in mind with existing HW:
> 
> - Too many actions specified for a single rule, even though each of them is
>   otherwise supported.

Yep most hardware will have this restriction.

> 
> - Performing several encap/decap actions. None are defined in the initial
>   specification but these are already planned.
> 

Great this is certainly needed.

> - Assuming there is a single table from the application point of view
>   (separate discussion for the other thread), some actions may only be
>   possible with the right pattern item or meta item. Asking HW to perform
>   tunnel decap may only be safe if the pattern specifically matches that
>   protocol.
> 

Yep continue in other thread.

>> If the hardware has separate tables then we shouldn't try to have the
>> PMD flatten those into a single table because we will have no way of
>> knowing how to do that. (I'll respond to the other thread on this in
>> an attempt to not get to scattered).
> 
> OK, will reply there as well.
> 
>>> Also while memory consumption is not really an issue, such a graph may be
>>> huge. It could take a while for the PMD to update it when adding a rule
>>> impacting capabilities.
>>
>> Ugh... I wouldn't suggest updating the capabilities at runtime like
>> this. But I see your point: if the graph has to _guarantee_ correctness,
>> how does it represent a limited number of masks and other strange HW?
>> It's unfortunate the hardware isn't more regular.
>>
>> You have convinced me that guaranteed correctness via capabilities
>> is going to be difficult for many types of devices, although not all.
> 
> I'll just add that these capabilities also depend on side effects of
> configuration performed outside the scope of this API. The way queues are
> (re)initialized or offloads configured may affect them. RSS configuration is
> the most obvious example.
> 

OK.

[...]

>>
>> My concern is this non-determinism will create performance issues in
>> the network because when a flow may or may not be offloaded this can
>> have a rather significant impact on its performance. This can make
>> debugging network wide performance miserable when at time X I get
>> performance X and then for whatever reason something degrades to
>> software and at time Y I get some performance Y << X. I suspect that
>> in general applications will bind tightly with hardware they know
>> works.
> 
> You are right, performance determinism is not taken into account at all, at
> least not yet. It should not be an issue at the beginning as long as the
> API has the ability evolve later for applications that need it.
> 
> Just an idea, could some kind of meta pattern items specifying time
> constraints for a rule address this issue? Say, how long (cycles/ms) the PMD
> may take to query/apply/delete the rule. If it cannot be guaranteed, the
> rule cannot be created. Applications could maintain statistics counters about
> failed rules to determine if performance issues are caused by the inability
> to create them.

It seems a bit heavy to me to have each PMD driver implementing
something like this. But it would be interesting to explore, probably
after the basic support is implemented.

> 
> [...]
>>> For individual points:
>>>
>>> (i) should be doable with the query API without recompiling DPDK as well,
>>> the fact API/ABI breakage must be avoided being part of the requirements. If
>>> you think there is a problem regarding this, can you

[dpdk-dev] [PATCH] net/enic: move link checking init to probe time

2016-08-09 Thread Nelson Escobar
The enic DMAs link status information to the host and this requires a
little setup.  This setup was being done as a result of calling
rte_eth_dev_start().  But applications expect to be able to check link
status before calling rte_eth_dev_start().

This patch moves the link status setup to enic_init(), which is called
at device probe time, so that the link status can be checked at any time.

Fixes: fefed3d1e62c ("enic: new driver")

Signed-off-by: Nelson Escobar 
Reviewed-by: John Daley 
---
 drivers/net/enic/enic_main.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index b4ca371..eb32ac1 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -430,7 +430,6 @@ int enic_enable(struct enic *enic)

eth_dev->data->dev_link.link_speed = vnic_dev_port_speed(enic->vdev);
eth_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
-   vnic_dev_notify_set(enic->vdev, -1); /* No Intr for notify */

if (enic_clsf_init(enic))
dev_warning(enic, "Init of hash table for clsf failed."\
@@ -820,7 +819,6 @@ int enic_disable(struct enic *enic)
}

vnic_dev_set_reset_flag(enic->vdev, 1);
-   vnic_dev_notify_unset(enic->vdev);

for (i = 0; i < enic->wq_count; i++)
vnic_wq_clean(&enic->wq[i], enic_free_wq_buf);
@@ -1022,6 +1020,9 @@ static void enic_dev_deinit(struct enic *enic)
 {
struct rte_eth_dev *eth_dev = enic->rte_dev;

+   /* stop link status checking */
+   vnic_dev_notify_unset(enic->vdev);
+
rte_free(eth_dev->data->mac_addrs);
 }

@@ -1137,6 +1138,9 @@ static int enic_dev_init(struct enic *enic)

vnic_dev_set_reset_flag(enic->vdev, 0);

+   /* set up link status checking */
+   vnic_dev_notify_set(enic->vdev, -1); /* No Intr for notify */
+
return 0;

 }
-- 
2.7.0



[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-08-09 Thread John Fastabend
On 16-08-04 06:24 AM, Adrien Mazarguil wrote:
> On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote:
>> [...]
>>
 The proposal looks very good.  It satisfies most of the features
 supported by Chelsio NICs.  We are looking for suggestions on exposing
 more additional features supported by Chelsio NICs via this API.

 Chelsio NICs have two regions in which filters can be placed -
 Maskfull and Maskless regions.  As their names imply, maskfull region
 can accept masks to match a range of values; whereas, maskless region
 don't accept any masks and hence perform a more strict exact-matches.
 Filters without masks can also be placed in maskfull region.  By
 default, maskless region have higher priority over the maskfull region.
 However, the priority between the two regions is configurable.
>>>
>>> I understand this configuration affects the entire device. Just to be 
>>> clear,
>>> assuming some filters are already configured, are they affected by a 
>>> change
>>> of region priority later?
>>>
>>
>> Both the regions exist at the same time in the device.  Each filter can
>> either belong to maskfull or the maskless region.
>>
>> The priority is configured at time of filter creation for every
>> individual filter and cannot be changed while the filter is still
>> active. If priority needs to be changed for a particular filter then,
>> it needs to be deleted first and re-created.
>
> Could you model this as two tables and add a table_id to the API? This
> way user space could populate the table it chooses. We would have to add
> some capabilities attributes to "learn" if tables support masks or not
> though.
>

 This approach sounds interesting.
>>>
>>> Now I understand the idea behind these tables, however from an application
>>> point of view I still think it's better if the PMD could take care of flow
>>> rules optimizations automatically. Think about it, PMDs have exactly a
>>> single kind of device they know perfectly well to manage, while applications
>>> want the best possible performance out of any device in the most generic
>>> fashion.
>>
>> The problem is keeping priorities in order and/or possibly breaking
>> rules apart (e.g. you have an L2 table and an L3 table) becomes very
>> complex to manage at driver level. I think its easier for the
>> application which has some context to do this. The application "knows"
>> if its a router for example will likely be able to pack rules better
>> than a PMD will.
> 
> I don't think most applications know they are L2 or L3 routers. They may not
> know more than the pattern provided to the PMD, which may indeed end at a L2
> or L3 protocol. If the application simply chooses a table based on this
> information, then the PMD could have easily done the same.
> 

But when we start thinking about encap/decap then it's natural to start
using this interface to implement various forwarding dataplanes. And one
common way to organize a switch is into a TEP, router, switch
(mac/vlan), ACL tables, etc. In fact we see this topology starting to
show up in the NICs now.

Further each table may be "managed" by a different entity. In which
case the software will want to manage the physical and virtual networks
separately.

It doesn't make sense to me to require a software aggregator object to
marshal the rules into a flat table and then have the PMD split them apart
again.

> I understand the issue is what happens when applications really want to
> define e.g. L2/L3/L2 rules in this specific order (or any ordering that
> cannot be satisfied by HW due to table constraints).
> 
> By exposing tables, in such a case applications should move all rules from
> L2 to a L3 table themselves (assuming this is even supported) to guarantee
> ordering between rules, or fail to add them. This is basically what the PMD
> could have done, possibly in a more efficient manner in my opinion.

I disagree with the more efficient comment :)

If the software layer is working on L2/TEP/ACL/router layers merging
them just to pull them back apart is not going to be more efficient.

> 
> Let's assume two opposite scenarios for this discussion:
> 
> - App #1 is a command-line interface directly mapped to flow rules, which
>   basically gets slow random input from users depending on how they want to
>   configure their traffic. All rules differ considerably (L2, L3, L4, some
>   with incomplete bit-masks, etc). All in all, few but complex rules with
>   specific priorities.
> 

Agree with this and in this case the application should be behind any
network physical/virtual and not giving rules like encap/decap/etc. This
application either sits on the physical function and "owns" the hardware
resource or sits behind a virtual switch.


> - App #2 is something like OVS, creating and deleting a large number of very
>   specific (