Re: [dpdk-dev] [PATCH v1] ethdev: fix multi-process NULL dereference crashes

2017-01-24 Thread Remy Horton


On 20/01/2017 18:37, Thomas Monjalon wrote:
[..]

3 comments here:
- it is in the wrong section (EAL instead of Drivers)
- secondary processes can setup a vdev PMD
- before Yuanhan's patch, even PCI PMD were blanking primary process data


Since the code being changed is in rte_ether rather than drivers/net it 
seemed the logical place to me.. :)



I propose this rebase:

-   memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
eth_dev = eth_dev_get(port_id);
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   memset(eth_dev->data, 0, sizeof(*eth_dev->data));
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
eth_dev->data->port_id = port_id;
eth_dev->data->mtu = ETHER_MTU;


Seems OK to me, assuming Yuanhan's patch is going in as-is.

..Remy


[dpdk-dev] [PATCH] net/ixgbe: correct error when SFP not present

2017-01-24 Thread Wei Dai
Ignore the error=IXGBE_ERR_SFP_NOT_PRESENT when SFP is not present.
If it is not ignored, testpmd will in the NIC initialization process.
Ixgbe kernel driver ignores this error and works well. So DPDK
does same thing.

Signed-off-by: Wei Dai 
Signed-off-by: Helin Zhang 
Tested-by: Yuan Peng 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index bdf4e2b..ae7fb86 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -898,6 +898,8 @@ ixgbe_pf_reset_hw(struct ixgbe_hw *hw)
IXGBE_WRITE_REG(hw, IXGBE_CTRL_EXT, ctrl_ext);
IXGBE_WRITE_FLUSH(hw);
 
+   if (status == IXGBE_ERR_SFP_NOT_PRESENT)
+   status = IXGBE_SUCCESS;
return status;
 }
 
@@ -1237,6 +1239,9 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
diag = ixgbe_init_hw(hw);
}
 
+   if (diag == IXGBE_ERR_SFP_NOT_PRESENT)
+   diag = IXGBE_SUCCESS;
+
if (diag == IXGBE_ERR_EEPROM_VERSION) {
PMD_INIT_LOG(ERR, "This device is a pre-production adapter/"
 "LOM.  Please be aware there may be issues 
associated "
-- 
2.7.4



Re: [dpdk-dev] [PATCH] ethdev: fix wrong memset

2017-01-24 Thread Remy Horton



On 23/01/2017 11:56, Yuanhan Liu wrote:
[..]

http://dpdk.org/ml/archives/dev/2017-January/054422.html


Yes, it should fix that issue.


Well, few more thoughts: it may fix the crash issue Remy saw, but it
looks like more a workaround to me. Basically, if primary and secondary
shares a same port id, they should point to same device. Otherwise,
primary process may use eth_dev->data for a device A, while the
secondary process may use it for another device, as you said, it
could be a vdev.

In such case, there is no way we could continue safely. That said,
the given patch avoids the total reset of eth_dev->data, while it
continues reset the eth_dev->data->name, which is wrong.


I did wonder whether 7f95f78a8aea ought to be rolled back rather than 
the memset being made process-conditional. You going to be fixing the 
issue in your own patch?




One question: do Remy or you regularly
run some multiple process test cases (and with vdev both in primary
and secondary process)?


Not aware of there being any multiproc-related unit tests.

..Remy


Re: [dpdk-dev] [PATCHv6 12/33] net/dpaa2: introducing NXP dpaa2 pmd driver

2017-01-24 Thread Shreyansh Jain

On Monday 23 January 2017 11:02 PM, Ferruh Yigit wrote:

On 1/23/2017 11:59 AM, Hemant Agrawal wrote:

add support for fsl-mc bus based dpaa2 pmd driver.

Signed-off-by: Hemant Agrawal 
---

<...>


diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile
new file mode 100644
index 000..f85aa9f
--- /dev/null
+++ b/drivers/net/dpaa2/Makefile

<...>

+
+# library dependencies
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_eal lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_bus_fslmc


Is this dependency correct?


I think yes.
- Without Bus, dpaa2 PMD wouldn't work and being a configurable option,
  user can set CONFIG_RTE_LIBRTE_DPAA2_PMD=n and
  CONFIG_RTE_LIBRTE_DPAA2_PMD=y.

- If you referring to whether lib/librte_bus_fslmc is correct or not, I
  have replied to your response on 16/33 patch. In short, I think this
  is correct assuming that librte_bus_fslmc is valid LIB name and not
  expected to be a folder in lib/




+
+include $(RTE_SDK)/mk/rte.lib.mk


<...>





[dpdk-dev] [PATCH] net/vhost: fix unix socket not removed as closing

2017-01-24 Thread Jianfeng Tan
The commit aed0b12930b ("net/vhost: fix socket file deleted on stop")
moves rte_vhost_driver_register and rte_vhost_driver_unregister from
dev_start() and dev_stop() into driver's probe() and remove().

Apps, like testpmd, using vhost pmd in server mode, usually calls
dev_stop() and dev_close() as quitting, instead of driver-specific
remove(). Then those unix socket files have no chance to get removed.

Semantically, device-specific things should be put into device-specific
APIs. Fix this issue by moving rte_vhost_driver_unregister, plus other
structure free into dev_close().

Fixes: aed0b12930b3 ("net/vhost: fix socket file deleted on stop")

Reported-by: Lei Yao 
Signed-off-by: Jianfeng Tan 
---
 drivers/net/vhost/rte_eth_vhost.c | 48 +++
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/drivers/net/vhost/rte_eth_vhost.c 
b/drivers/net/vhost/rte_eth_vhost.c
index 848a3da..93b8a52 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -801,6 +801,32 @@ eth_dev_stop(struct rte_eth_dev *dev)
update_queuing_status(dev);
 }
 
+static void
+eth_dev_close(struct rte_eth_dev *dev)
+{
+   struct pmd_internal *internal;
+   struct internal_list *list;
+
+   internal = dev->data->dev_private;
+   if (!internal)
+   return;
+
+   list = find_internal_resource(internal->iface_name);
+   if (!list)
+   return;
+
+   pthread_mutex_lock(&internal_list_lock);
+   TAILQ_REMOVE(&internal_list, list, next);
+   pthread_mutex_unlock(&internal_list_lock);
+   rte_free(list);
+
+   rte_vhost_driver_unregister(internal->iface_name);
+
+   free(internal->dev_name);
+   free(internal->iface_name);
+   rte_free(internal);
+}
+
 static int
 eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
   uint16_t nb_rx_desc __rte_unused,
@@ -968,6 +994,7 @@ rte_eth_vhost_feature_get(void)
 static const struct eth_dev_ops ops = {
.dev_start = eth_dev_start,
.dev_stop = eth_dev_stop,
+   .dev_close = eth_dev_close,
.dev_configure = eth_dev_configure,
.dev_infos_get = eth_dev_info,
.rx_queue_setup = eth_rx_queue_setup,
@@ -1195,8 +1222,6 @@ static int
 rte_pmd_vhost_remove(const char *name)
 {
struct rte_eth_dev *eth_dev = NULL;
-   struct pmd_internal *internal;
-   struct internal_list *list;
unsigned int i;
 
RTE_LOG(INFO, PMD, "Un-Initializing pmd_vhost for %s\n", name);
@@ -1206,22 +1231,9 @@ rte_pmd_vhost_remove(const char *name)
if (eth_dev == NULL)
return -ENODEV;
 
-   internal = eth_dev->data->dev_private;
-   if (internal == NULL)
-   return -ENODEV;
-
-   list = find_internal_resource(internal->iface_name);
-   if (list == NULL)
-   return -ENODEV;
-
-   pthread_mutex_lock(&internal_list_lock);
-   TAILQ_REMOVE(&internal_list, list, next);
-   pthread_mutex_unlock(&internal_list_lock);
-   rte_free(list);
-
eth_dev_stop(eth_dev);
 
-   rte_vhost_driver_unregister(internal->iface_name);
+   eth_dev_close(eth_dev);
 
if (rte_atomic16_sub_return(&nb_started_ports, 1) == 0)
vhost_driver_session_stop();
@@ -1229,9 +1241,6 @@ rte_pmd_vhost_remove(const char *name)
rte_free(vring_states[eth_dev->data->port_id]);
vring_states[eth_dev->data->port_id] = NULL;
 
-   free(internal->dev_name);
-   free(internal->iface_name);
-
for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
rte_free(eth_dev->data->rx_queues[i]);
for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
@@ -1239,7 +1248,6 @@ rte_pmd_vhost_remove(const char *name)
 
rte_free(eth_dev->data->mac_addrs);
rte_free(eth_dev->data);
-   rte_free(internal);
 
rte_eth_dev_release_port(eth_dev);
 
-- 
2.7.4



Re: [dpdk-dev] [PATCH] mk: parallelize make config

2017-01-24 Thread Olivier MATZ
Hi Keith,

On Mon, 23 Jan 2017 17:50:27 +, "Wiles, Keith"
 wrote:
> > On Jan 23, 2017, at 10:18 AM, Olivier Matz 
> > wrote:
> > 
> > Hi Ferruh,
> > 
> > On Sun, 22 Jan 2017 01:50:34 +, Ferruh Yigit
> >  wrote:  
> >> make config dependency resolving was always running serial,
> >> parallelize it for better performance.
> >> 
> >> $ time make T=x86_64-native-linuxapp-gcc config
> >> real0m12.633s
> >> 
> >> $ time make -j8 T=x86_64-native-linuxapp-gcc config
> >> real0m1.826s
> >> 
> >> Signed-off-by: Ferruh Yigit   
> > 
> > I have a patch that fix the same issue (configuration takes to
> > long), but done differently. It is more intrusive, since it rework
> > the way DEPDIRS are used, but it does not require to use -j.  
> 
> I tested the patch and the performance is very quick and seems to
> work very nicely. I have not looked at the other patch yet are they
> both compatible and could be combined or not?

No, I don't think they could be combined.

Ferruh's approach looks less risky, since it's a parallelization
of the config. My approach changes the way dependencies are processed,
and also how they are declared. I think it's faster, but more intrusive.

Regards,
Olivier


Re: [dpdk-dev] [PATCH] net/ixgbe: correct error when SFP not present

2017-01-24 Thread Lu, Wenzhuo
Hi,


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Wei Dai
> Sent: Tuesday, January 24, 2017 4:14 PM
> To: dev@dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; Dai, Wei
> Subject: [dpdk-dev] [PATCH] net/ixgbe: correct error when SFP not present
> 
> Ignore the error=IXGBE_ERR_SFP_NOT_PRESENT when SFP is not present.
> If it is not ignored, testpmd will in the NIC initialization process.
> Ixgbe kernel driver ignores this error and works well. So DPDK does same 
> thing.
> 
> Signed-off-by: Wei Dai 
> Signed-off-by: Helin Zhang 
> Tested-by: Yuan Peng 
Acked-by: Wenzhuo Lu 



Re: [dpdk-dev] [PATCH v6 0/6] distributor library performance enhancements

2017-01-24 Thread Liu, Yong
Tested-by: Yong Liu 

- Tested Branch: master
- Tested Commit: 61207d014fc906302a184ae2f779b54ccfd0cd4c
- OS: Fedora20 4.9.0
- GCC: gcc version 4.8.3 20140911
- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
- NIC: Intel Corporation Device Fortville [8086:1584]
- Default x86_64-native-linuxapp-gcc configuration
- Prerequisites:
- Total 6 cases, 6 passed, 0 failed

- Prerequisites command / instruction:
  Intel(r) X710 (Fortville) NIC plugged in

- Case: Distributor unit test
  Description: check burst packet distributor API work fine
  Command / instruction:
Start test application and run distributor unit test
   test -c f -n 4 -- -i
   RTE>>distributor_autotest
Verify burst distributor API unit test passed

- Case: Distributor performance unit test
  Description: check burst packet distributor API performance
  Command / instruction:
Start test application and run distributor unit test
   test -c f -n 4 -- -i
   RTE>>distributor_perf_autotest
Compared CPU cycles for normal distributor and burst API
Verify burst distributor API cost much less cycles then legacy library

- Case: Distributor library function check
  Description: check burst packet distributor API performance
  Command / instruction:
Start distributor sample with one worker::
  distributor_app -c 0x7c  -n 4 -- -p 0x1
Send few packets (less then burst 8) with sequence index
Check forwarded packets are all in sequence and content not changed
Send packets equal to burst size with sequence index
Check forwarded packets are all in sequence and content not changed
Send packets over burst size with sequence index
Check forwarded packets are all in sequence and content not changed

- Case: Distributor between multiple workers
  Description: check burst packet distributor sample performance
  Command / instruction:
Start distributor sample with multiple workers::
  distributor_app -c 0xfc  -n 4 -- -p 0x1
Send several packets with IP address increasing
Check packets distributed to all workers
Repeat these steps for 4/8/16/32 workers

- Case: Distributor between maximum workers
  Description: check burst packet distributor can work with 63 workers
  Command / instruction:
Start distributor sample with multiple workers::
  distributor_app -c 0xe0  -n 4 -- -p 0x1
Send several packets with IP address increasing
Check packets distributed to all workers

- Case: Distributor packets from multiple input ports
  Description: check burst packet distributor work with multiple inputs
  Command / instruction:
Start distributor sample with multiple workers::
  distributor_app -c 0x7c -n 4 -- -p 0x3
Send several packets from two tester ports with different IP
Check packets forwarded back

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of David Hunt
> Sent: Monday, January 23, 2017 5:25 PM
> To: dev@dpdk.org
> Cc: Richardson, Bruce 
> Subject: [dpdk-dev] [PATCH v6 0/6] distributor library performance
> enhancements


[dpdk-dev] [PATCH v1] fix API parameter checking for ixgbe

2017-01-24 Thread Tiwei Bie
When MACsec patch was merged. The helper function is_ixgbe_pmd() was
just raised and merged in another git tree. Now, both of them have
been merged in the same git tree. So do the same fix for MACsec.

Tiwei Bie (1):
  net/ixgbe: fix API parameter checking

 drivers/net/ixgbe/ixgbe_ethdev.c  | 30 ++
 drivers/net/ixgbe/rte_pmd_ixgbe.h |  4 
 2 files changed, 34 insertions(+)

-- 
2.7.4



[dpdk-dev] [PATCH v1] net/ixgbe: fix API parameter checking

2017-01-24 Thread Tiwei Bie
Add checks to rte_pmd_ixgbe_macsec_* APIs to ensure that the
port is an ixgbe port.

Fixes: b35d309710fe ("net/ixgbe: add MACsec offload")

Signed-off-by: Tiwei Bie 
---
 drivers/net/ixgbe/ixgbe_ethdev.c  | 30 ++
 drivers/net/ixgbe/rte_pmd_ixgbe.h |  4 
 2 files changed, 34 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index ad63e5a..58a4a2e 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -8222,10 +8222,15 @@ rte_pmd_ixgbe_macsec_enable(uint8_t port, uint8_t en, 
uint8_t rp)
 {
struct ixgbe_hw *hw;
struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
uint32_t ctrl;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
+   rte_eth_dev_info_get(port, &dev_info);
+   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   return -ENOTSUP;
+
dev = &rte_eth_devices[port];
hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
@@ -8301,10 +8306,15 @@ rte_pmd_ixgbe_macsec_disable(uint8_t port)
 {
struct ixgbe_hw *hw;
struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
uint32_t ctrl;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
+   rte_eth_dev_info_get(port, &dev_info);
+   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   return -ENOTSUP;
+
dev = &rte_eth_devices[port];
hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
@@ -8361,10 +8371,15 @@ rte_pmd_ixgbe_macsec_config_txsc(uint8_t port, uint8_t 
*mac)
 {
struct ixgbe_hw *hw;
struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
uint32_t ctrl;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
+   rte_eth_dev_info_get(port, &dev_info);
+   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   return -ENOTSUP;
+
dev = &rte_eth_devices[port];
hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
@@ -8382,10 +8397,15 @@ rte_pmd_ixgbe_macsec_config_rxsc(uint8_t port, uint8_t 
*mac, uint16_t pi)
 {
struct ixgbe_hw *hw;
struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
uint32_t ctrl;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
+   rte_eth_dev_info_get(port, &dev_info);
+   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   return -ENOTSUP;
+
dev = &rte_eth_devices[port];
hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
@@ -8405,10 +8425,15 @@ rte_pmd_ixgbe_macsec_select_txsa(uint8_t port, uint8_t 
idx, uint8_t an,
 {
struct ixgbe_hw *hw;
struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
uint32_t ctrl, i;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
+   rte_eth_dev_info_get(port, &dev_info);
+   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   return -ENOTSUP;
+
dev = &rte_eth_devices[port];
hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
@@ -8457,10 +8482,15 @@ rte_pmd_ixgbe_macsec_select_rxsa(uint8_t port, uint8_t 
idx, uint8_t an,
 {
struct ixgbe_hw *hw;
struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
uint32_t ctrl, i;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
+   rte_eth_dev_info_get(port, &dev_info);
+   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   return -ENOTSUP;
+
dev = &rte_eth_devices[port];
hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 
diff --git a/drivers/net/ixgbe/rte_pmd_ixgbe.h 
b/drivers/net/ixgbe/rte_pmd_ixgbe.h
index dade57a..d4efe07 100644
--- a/drivers/net/ixgbe/rte_pmd_ixgbe.h
+++ b/drivers/net/ixgbe/rte_pmd_ixgbe.h
@@ -222,6 +222,7 @@ int rte_pmd_ixgbe_macsec_disable(uint8_t port);
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port* invalid.
+ *   - (-ENOTSUP) if hardware doesn't support this feature.
  */
 int rte_pmd_ixgbe_macsec_config_txsc(uint8_t port, uint8_t *mac);
 
@@ -237,6 +238,7 @@ int rte_pmd_ixgbe_macsec_config_txsc(uint8_t port, uint8_t 
*mac);
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port* invalid.
+ *   - (-ENOTSUP) if hardware doesn't support this feature.
  */
 int rte_pmd_ixgbe_macsec_config_rxsc(uint8_t port, uint8_t *mac, uint16_t pi);
 
@@ -256,6 +258,7 @@ int rte_pmd_ixgbe_macsec_config_rxsc(uint8_t port, uint8_t 
*mac, uint16_t pi);
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port* invalid.
+ *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-EINVAL) if bad parameter.
  */
 int rte_pmd_ixgbe_macsec_select_txsa(uint8_t port, uint8_t idx, uint8_t an,
@@ -277,6 +280,7 @@ int rte_pmd_ixgbe_macsec_select_txsa(uint8_t port, uint8_t 
idx, uint8_t an,
  * @return
  *   - (0) if successful.
  *   - (-ENODEV) if *port* invalid.
+ *   - (-ENOTSUP) if hardware doesn't support this feature.
  *   - (-

Re: [dpdk-dev] [PATCHv6 16/33] drivers/pool/dpaa2: adding hw offloaded mempool

2017-01-24 Thread Shreyansh Jain

On Monday 23 January 2017 11:04 PM, Ferruh Yigit wrote:

On 1/23/2017 11:59 AM, Hemant Agrawal wrote:

Adding NXP DPAA2 architecture specific mempool support
Each mempool instance is represented by a DPBP object
from the FSL-MC bus.

This patch also registers a dpaa2 type MEMPOOL OPS

Signed-off-by: Hemant Agrawal 
---

<...>


diff --git a/drivers/common/Makefile b/drivers/common/Makefile
index b52931c..0bb75b5 100644
--- a/drivers/common/Makefile
+++ b/drivers/common/Makefile
@@ -35,7 +35,11 @@ ifeq ($(CONFIG_RTE_LIBRTE_DPAA2_PMD),y)
 CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_DPAA2_PMD)
 endif

-ifeq ($(CONFIG_RTE_LIBRTE_FSLMC_BUS),y)
+ifeq ($(CONFIG_RTE_LIBRTE_DPAA2_POOL),y)
+CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_DPAA2_POOL)
+endif
+
+ifneq ($(CONFIG_RTE_LIBRTE_FSLMC_BUS),y)


I guess this is a typo, but this prevents DPAA2_COMMON to be compiled !!


It should be 'ifeq' rather than 'ifneq'. And it will prevent COMMON
compilation only if CONFIG_RTE_LIBRTE_FSLMC_BUS=n which is not the case
right now.

We will fix it.




 CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_FSLMC_BUS)
 endif



<...>

+# library dependencies
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_common_dpaa2_qbman


This dependeny doesn not looks correct, there is no folder like that.


This is something even I need to understand. From the DEPDIRS what I
understood was that though it refers to a directory, it essentially
links libraries in build/lib/*.

Further, somehow the development is deploying drivers/bus,
drivers/common and drivers/pool in lib/* under the name specified as
LIB in Makefile. My understanding was that it is expected behavior and
not special because of drivers folder.

Thus, above line only links lib/librte_common_dpaa2_qbman generated by
drivers/common/dpaa2/qbman code.

In fact, I think, this might also one of the issues why a parallel 
shared build fails for DPAA2 PMD (added in Cover letter).

The dependency graph cannot create a graph for drivers/common
as dependency for drivers/net or drivers/bus and hence parallel build
fails because of missing libraries which are being parallely compiled.




+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_bus_fslmc
+
+include $(RTE_SDK)/mk/rte.lib.mk


<...>





Re: [dpdk-dev] [PATCH] mk: parallelize make config

2017-01-24 Thread Bruce Richardson
On Tue, Jan 24, 2017 at 09:42:35AM +0100, Olivier MATZ wrote:
> Hi Keith,
> 
> On Mon, 23 Jan 2017 17:50:27 +, "Wiles, Keith"
>  wrote:
> > > On Jan 23, 2017, at 10:18 AM, Olivier Matz 
> > > wrote:
> > > 
> > > Hi Ferruh,
> > > 
> > > On Sun, 22 Jan 2017 01:50:34 +, Ferruh Yigit
> > >  wrote:  
> > >> make config dependency resolving was always running serial,
> > >> parallelize it for better performance.
> > >> 
> > >> $ time make T=x86_64-native-linuxapp-gcc config
> > >> real0m12.633s
> > >> 
> > >> $ time make -j8 T=x86_64-native-linuxapp-gcc config
> > >> real0m1.826s
> > >> 
> > >> Signed-off-by: Ferruh Yigit   
> > > 
> > > I have a patch that fix the same issue (configuration takes to
> > > long), but done differently. It is more intrusive, since it rework
> > > the way DEPDIRS are used, but it does not require to use -j.  
> > 
> > I tested the patch and the performance is very quick and seems to
> > work very nicely. I have not looked at the other patch yet are they
> > both compatible and could be combined or not?
> 
> No, I don't think they could be combined.
> 
> Ferruh's approach looks less risky, since it's a parallelization
> of the config. My approach changes the way dependencies are processed,
> and also how they are declared. I think it's faster, but more intrusive.
>
I think we may hit a case of diminishing returns. On my system, the
config time goes from 12 seconds down to less than 1 second (large
amounts of parallelism). Even with limiting parallelism to just 4
make processes, the time for config drops to 3 seconds.
I don't see huge value in trying to shave off any more time than
that, given the amount of extra complexity involved.

/Bruce


Re: [dpdk-dev] [PATCH v4] ethdev: fix MAC address replay

2017-01-24 Thread Igor Ryzhov
Thank you Steve.

I never did it before and I don't know if I have rights for that, but:

Acked-by: Igor Ryzhov 

On Tue, Jan 24, 2017 at 5:21 AM, Steve Shin  wrote:

> This patch fixes a bug in replaying MAC address to the hardware
> in rte_eth_dev_config_restore() routine. Added default MAC replay as well.
>
> Fixes: 4bdefaade6d1 ("ethdev: VMDQ enhancements")
>
> ---
> v2: Added default MAC replay & Code optimization
> v3: Covered a case (ex, SR-IOV) where multiple pools
>exist in the mac_pool_sel array.
> v4: removed a coding style warning
>
> Signed-off-by: Steve Shin 
>


Re: [dpdk-dev] [RFC] Add GRO support in DPDK

2017-01-24 Thread Ananyev, Konstantin


> -Original Message-
> From: Wiles, Keith
> Sent: Tuesday, January 24, 2017 5:26 AM
> To: Ananyev, Konstantin 
> Cc: Stephen Hemminger ; Hu, Jiayu 
> ; dev@dpdk.org; Kinsella, Ray
> ; Gilmore, Walter E ; 
> Venkatesan, Venky ;
> yuanhan@linux.intel.com
> Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
> 
> 
> > On Jan 23, 2017, at 6:43 PM, Ananyev, Konstantin 
> >  wrote:
> >
> >
> >
> >> -Original Message-
> >> From: Wiles, Keith
> >> Sent: Monday, January 23, 2017 9:53 PM
> >> To: Stephen Hemminger 
> >> Cc: Hu, Jiayu ; dev@dpdk.org; Kinsella, Ray 
> >> ; Ananyev, Konstantin
> >> ; Gilmore, Walter E 
> >> ; Venkatesan, Venky
> ;
> >> yuanhan@linux.intel.com
> >> Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
> >>
> >>
> >>> On Jan 23, 2017, at 10:15 AM, Stephen Hemminger 
> >>>  wrote:
> >>>
> >>> On Mon, 23 Jan 2017 21:03:12 +0800
> >>> Jiayu Hu  wrote:
> >>>
>  With the support of hardware segmentation techniques in DPDK, the
>  networking stack overheads of send-side of applications, which directly
>  leverage DPDK, have been greatly reduced. But for receive-side, numbers 
>  of
>  segmented packets seriously burden the networking stack of applications.
>  Generic Receive Offload (GRO) is a widely used method to solve the
>  receive-side issue, which gains performance by reducing the amount of
>  packets processed by the networking stack. But currently, DPDK doesn't
>  support GRO. Therefore, we propose to add GRO support in DPDK, and this
>  RFC is used to explain the basic DPDK GRO design.
> 
>  DPDK GRO is a SW-based packets assembly library, which provides GRO
>  abilities for numbers of protocols. In DPDK GRO, packets are merged
>  before returning to applications and after receiving from drivers.
> 
>  In DPDK, GRO is a capability of NIC drivers. That support GRO or not and
>  what GRO types are supported are up to NIC drivers. Different drivers may
>  support different GRO types. By default, drivers enable all supported GRO
>  types. For applications, they can inquire the supported GRO types by
>  each driver, and can control what GRO types are applied. For example,
>  ixgbe supports TCP and UDP GRO, but the application just needs TCP GRO.
>  The application can disable ixgbe UDP GRO.
> 
>  To support GRO, a driver should provide a way to tell applications what
>  GRO types are supported, and provides a GRO function, which is in charge
>  of assembling packets. Since different drivers may support different GRO
>  types, their GRO functions may be different. For applications, they don't
>  need extra operations to enable GRO. But if there are some GRO types that
>  are not needed, applications can use an API, like
>  rte_eth_gro_disable_protocols, to disable them. Besides, they can
>  re-enable the disabled ones.
> 
>  The GRO function processes numbers of packets at a time. In each
>  invocation, what GRO types are applied depends on applications, and the
>  amount of packets to merge depends on the networking status and
>  applications. Specifically, applications determine the maximum number of
>  packets to be processed by the GRO function, but how many packets are
>  actually processed depends on if there are available packets to receive.
>  For example, the receive-side application asks the GRO function to
>  process 64 packets, but the sender only sends 40 packets. At this time,
>  the GRO function returns after processing 40 packets. To reassemble the
>  given packets, the GRO function performs an "assembly procedure" on each
>  packet. We use an example to demonstrate this procedure. Supposing the
>  GRO function is going to process packetX, it will do the following two
>  things:
>   a. Find a L4 assembly function according to the packet type of
>   packetX. A L4 assembly function is in charge of merging packets of a
>   specific type. For example, TCPv4 assembly function merges packets
>   whose L3 IPv4 and L4 is TCP. Each L4 assembly function has a packet
>   array, which keeps the packets that are unable to assemble.
>   Initially, the packet array is empty;
>   b. The L4 assembly function traverses own packet array to find a
>   mergeable packet (comparing Ethernet, IP and L4 header fields). If
>   finds, merges it and packetX via chaining them together; if doesn't,
>   allocates a new array element to store packetX and updates element
>   number of the array.
>  After performing the assembly procedure to all packets, the GRO function
>  combines the results of all packet arrays, and returns these packets to
>  applications.
> 
>  There are lots of ways to implement the above design in DPDK. One of the
>  ways is:
>   a. Drivers tell applications what GRO types are supported via
>   dev->dev_o

Re: [dpdk-dev] [PATCH] cryptodev: crypto PMD functions incorrectly inlined

2017-01-24 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Zhang, Roy Fan
> Sent: Monday, January 23, 2017 12:25 PM
> To: Doherty, Declan; dev@dpdk.org
> Cc: De Lara Guarch, Pablo; sta...@dpdk.org; Doherty, Declan
> Subject: RE: [dpdk-dev] [PATCH] cryptodev: crypto PMD functions
> incorrectly inlined
> 
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Declan Doherty
> > Sent: Monday, January 23, 2017 12:19 PM
> > To: dev@dpdk.org
> > Cc: De Lara Guarch, Pablo ;
> > sta...@dpdk.org; Doherty, Declan 
> > Subject: [dpdk-dev] [PATCH] cryptodev: crypto PMD functions incorrectly
> > inlined
> >
> > rte_cryptodev_pmd_get_dev, rte_cryptodev_pmd_get_named_dev,
> > rte_cryptodev_pmd_is_valid_dev were incorrectly marked as inline and
> > therefore not useable from crypto PMDs when built as shared libraries as
> > they accessed the global rte_cryptodev_globals device structure.
> >
> > Fixes: d11b0f30 ("cryptodev: introduce API and framework for crypto
> > devices")
> >
> > Signed-off-by: Declan Doherty 

...
> Acked-by: Fan Zhang 

Applied to dpdk-next-crypto.
Thanks,

Pablo



[dpdk-dev] [PATCH RFCv2 0/4] generalise rte_ring to allow different datatypes

2017-01-24 Thread Bruce Richardson
Following on from the previous RFC, this generalises the rte_ring 
structure using C constructs rather than using macros. The idea
here is to have the size of the data objects passed in to all common
functions and then switching the code paths, where necessary, based on
those size parameters. For the standard rte_ring APIs, these work on
void pointers, so the extra parameter is always specified as 
sizeof(void *). A new event_ring type is also added as a patch to test
out creating and using rings with different size objects.

Comments welcome on this RFC, though I still hope to try out some
other options, as I think there may be still better ways to achieve
the same end.

Bruce Richardson (4):
  ring: create common ring files
  ring: separate common and rte_ring specific functions
  ring: allow common ring to use 8 or 16 byte values
  ring: add new event ring class

 app/test/Makefile |   1 +
 app/test/test_event_ring.c|  83 
 lib/librte_ring/Makefile  |   8 +-
 lib/librte_ring/rte_common_ring.c | 386 
 lib/librte_ring/rte_common_ring.h | 925 ++
 lib/librte_ring/rte_event_ring.c  |  87 
 lib/librte_ring/rte_event_ring.h  | 696 
 lib/librte_ring/rte_ring.c| 320 +
 lib/librte_ring/rte_ring.h| 606 +
 9 files changed, 2210 insertions(+), 902 deletions(-)
 create mode 100644 app/test/test_event_ring.c
 create mode 100644 lib/librte_ring/rte_common_ring.c
 create mode 100644 lib/librte_ring/rte_common_ring.h
 create mode 100644 lib/librte_ring/rte_event_ring.c
 create mode 100644 lib/librte_ring/rte_event_ring.h

-- 
2.9.3



[dpdk-dev] [PATCH RFCv2 2/4] ring: separate common and rte_ring specific functions

2017-01-24 Thread Bruce Richardson
Provide a separate rte_ring implementation which just calls into the
common ring code. This allows us to generalise the common ring code
without affecting the API/ABI of the rte_ring. The common functions
are now all renamed to have an rte_common_ring prefix.

Signed-off-by: Bruce Richardson 
---
 lib/librte_ring/Makefile  |   1 +
 lib/librte_ring/rte_common_ring.c |  57 ++--
 lib/librte_ring/rte_common_ring.h | 463 ++---
 lib/librte_ring/rte_ring.c|  86 +
 lib/librte_ring/rte_ring.h| 692 +-
 5 files changed, 832 insertions(+), 467 deletions(-)
 create mode 100644 lib/librte_ring/rte_ring.c
 mode change 12 => 100644 lib/librte_ring/rte_ring.h

diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile
index 1e2396e..5cebb29 100644
--- a/lib/librte_ring/Makefile
+++ b/lib/librte_ring/Makefile
@@ -42,6 +42,7 @@ LIBABIVER := 1
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_RING) += rte_common_ring.c
+SRCS-$(CONFIG_RTE_LIBRTE_RING) += rte_ring.c
 
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include += rte_ring.h
diff --git a/lib/librte_ring/rte_common_ring.c 
b/lib/librte_ring/rte_common_ring.c
index ca0a108..a0c4b5a 100644
--- a/lib/librte_ring/rte_common_ring.c
+++ b/lib/librte_ring/rte_common_ring.c
@@ -89,19 +89,19 @@
 
 #include "rte_ring.h"
 
-TAILQ_HEAD(rte_ring_list, rte_tailq_entry);
+TAILQ_HEAD(rte_common_ring_list, rte_tailq_entry);
 
-static struct rte_tailq_elem rte_ring_tailq = {
+static struct rte_tailq_elem rte_common_ring_tailq = {
.name = RTE_TAILQ_RING_NAME,
 };
-EAL_REGISTER_TAILQ(rte_ring_tailq)
+EAL_REGISTER_TAILQ(rte_common_ring_tailq)
 
 /* true if x is a power of 2 */
 #define POWEROF2(x) x)-1) & (x)) == 0)
 
 /* return the size of memory occupied by a ring */
 ssize_t
-rte_ring_get_memsize(unsigned count)
+rte_common_ring_get_memsize(unsigned count)
 {
ssize_t sz;
 
@@ -119,7 +119,7 @@ rte_ring_get_memsize(unsigned count)
 }
 
 int
-rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
+rte_common_ring_init(struct rte_ring *r, const char *name, unsigned count,
unsigned flags)
 {
int ret;
@@ -134,7 +134,7 @@ rte_ring_init(struct rte_ring *r, const char *name, 
unsigned count,
RTE_BUILD_BUG_ON((offsetof(struct rte_ring, prod) &
  RTE_CACHE_LINE_MASK) != 0);
 #ifdef RTE_LIBRTE_RING_DEBUG
-   RTE_BUILD_BUG_ON((sizeof(struct rte_ring_debug_stats) &
+   RTE_BUILD_BUG_ON((sizeof(struct rte_common_ring_debug_stats) &
  RTE_CACHE_LINE_MASK) != 0);
RTE_BUILD_BUG_ON((offsetof(struct rte_ring, stats) &
  RTE_CACHE_LINE_MASK) != 0);
@@ -159,7 +159,7 @@ rte_ring_init(struct rte_ring *r, const char *name, 
unsigned count,
 
 /* create the ring */
 struct rte_ring *
-rte_ring_create(const char *name, unsigned count, int socket_id,
+rte_common_ring_create(const char *name, unsigned count, int socket_id,
unsigned flags)
 {
char mz_name[RTE_MEMZONE_NAMESIZE];
@@ -168,12 +168,12 @@ rte_ring_create(const char *name, unsigned count, int 
socket_id,
const struct rte_memzone *mz;
ssize_t ring_size;
int mz_flags = 0;
-   struct rte_ring_list* ring_list = NULL;
+   struct rte_common_ring_list* ring_list = NULL;
int ret;
 
-   ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list);
+   ring_list = RTE_TAILQ_CAST(rte_common_ring_tailq.head, 
rte_common_ring_list);
 
-   ring_size = rte_ring_get_memsize(count);
+   ring_size = rte_common_ring_get_memsize(count);
if (ring_size < 0) {
rte_errno = ring_size;
return NULL;
@@ -203,7 +203,7 @@ rte_ring_create(const char *name, unsigned count, int 
socket_id,
r = mz->addr;
/* no need to check return value here, we already checked the
 * arguments above */
-   rte_ring_init(r, name, count, flags);
+   rte_common_ring_init(r, name, count, flags);
 
te->data = (void *) r;
r->memzone = mz;
@@ -221,20 +221,20 @@ rte_ring_create(const char *name, unsigned count, int 
socket_id,
 
 /* free the ring */
 void
-rte_ring_free(struct rte_ring *r)
+rte_common_ring_free(struct rte_ring *r)
 {
-   struct rte_ring_list *ring_list = NULL;
+   struct rte_common_ring_list *ring_list = NULL;
struct rte_tailq_entry *te;
 
if (r == NULL)
return;
 
/*
-* Ring was not created with rte_ring_create,
+* Ring was not created with rte_common_ring_create,
 * therefore, there is no memzone to free.
 */
if (r->memzone == NULL) {
-   RTE_LOG(ERR, RING, "Cannot free ring (not created with 
rte_ring_create()");
+   RTE_LOG(ERR, RING, "Cannot free ring (not created with 
rte_common_ring_create()");
   

[dpdk-dev] [PATCH RFCv2 1/4] ring: create common ring files

2017-01-24 Thread Bruce Richardson
Create rte_common_ring.[ch] files which will be modified to contain
generic ring implementation code to be shared across multiple ring
implementations for different sizes and types of data.
For now, these are exact copies of the original rte_ring files.

Signed-off-by: Bruce Richardson 
---
 lib/librte_ring/Makefile  |5 +-
 lib/librte_ring/{rte_ring.c => rte_common_ring.c} |0
 lib/librte_ring/rte_common_ring.h | 1269 
 lib/librte_ring/rte_ring.h| 1270 +
 4 files changed, 1273 insertions(+), 1271 deletions(-)
 rename lib/librte_ring/{rte_ring.c => rte_common_ring.c} (100%)
 create mode 100644 lib/librte_ring/rte_common_ring.h
 mode change 100644 => 12 lib/librte_ring/rte_ring.h

diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile
index 4b1112e..1e2396e 100644
--- a/lib/librte_ring/Makefile
+++ b/lib/librte_ring/Makefile
@@ -41,10 +41,11 @@ EXPORT_MAP := rte_ring_version.map
 LIBABIVER := 1
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c
+SRCS-$(CONFIG_RTE_LIBRTE_RING) += rte_common_ring.c
 
 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include += rte_ring.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include += rte_common_ring.h
 
 DEPDIRS-$(CONFIG_RTE_LIBRTE_RING) += lib/librte_eal
 
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_common_ring.c
similarity index 100%
rename from lib/librte_ring/rte_ring.c
rename to lib/librte_ring/rte_common_ring.c
diff --git a/lib/librte_ring/rte_common_ring.h 
b/lib/librte_ring/rte_common_ring.h
new file mode 100644
index 000..e359aff
--- /dev/null
+++ b/lib/librte_ring/rte_common_ring.h
@@ -0,0 +1,1269 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Derived from FreeBSD's bufring.h
+ *
+ **
+ *
+ * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *this list of conditions and the following disclaimer.
+ *
+ * 2. The name of Kip Macy nor the names of other
+ *contributors may be used to endorse or promote products derived from
+ *this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF 

[dpdk-dev] [PATCH RFCv2 3/4] ring: allow common ring to use 8 or 16 byte values

2017-01-24 Thread Bruce Richardson
Change the common ring enqueue/dequeue functions to support enqueuing and
dequeuing 16B values. Add the element size as parameter to all common ring
functions that need it, and pass that as parameter from the rte_ring
functions.

Signed-off-by: Bruce Richardson 
---
 lib/librte_ring/rte_common_ring.c |  14 ++--
 lib/librte_ring/rte_common_ring.h | 135 --
 lib/librte_ring/rte_ring.c|   7 +-
 lib/librte_ring/rte_ring.h|  16 ++---
 4 files changed, 122 insertions(+), 50 deletions(-)

diff --git a/lib/librte_ring/rte_common_ring.c 
b/lib/librte_ring/rte_common_ring.c
index a0c4b5a..eb04de4 100644
--- a/lib/librte_ring/rte_common_ring.c
+++ b/lib/librte_ring/rte_common_ring.c
@@ -101,7 +101,7 @@ EAL_REGISTER_TAILQ(rte_common_ring_tailq)
 
 /* return the size of memory occupied by a ring */
 ssize_t
-rte_common_ring_get_memsize(unsigned count)
+rte_common_ring_get_memsize(unsigned count, unsigned int elem_sz)
 {
ssize_t sz;
 
@@ -113,14 +113,14 @@ rte_common_ring_get_memsize(unsigned count)
return -EINVAL;
}
 
-   sz = sizeof(struct rte_ring) + count * sizeof(void *);
+   sz = sizeof(struct rte_ring) + count * elem_sz;
sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE);
return sz;
 }
 
 int
 rte_common_ring_init(struct rte_ring *r, const char *name, unsigned count,
-   unsigned flags)
+   unsigned flags, unsigned int elem_sz)
 {
int ret;
 
@@ -146,6 +146,7 @@ rte_common_ring_init(struct rte_ring *r, const char *name, 
unsigned count,
if (ret < 0 || ret >= (int)sizeof(r->name))
return -ENAMETOOLONG;
r->flags = flags;
+   r->elem_sz = elem_sz;
r->prod.watermark = count;
r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
r->cons.sc_dequeue = !!(flags & RING_F_SC_DEQ);
@@ -160,7 +161,7 @@ rte_common_ring_init(struct rte_ring *r, const char *name, 
unsigned count,
 /* create the ring */
 struct rte_ring *
 rte_common_ring_create(const char *name, unsigned count, int socket_id,
-   unsigned flags)
+   unsigned flags, unsigned int elem_sz)
 {
char mz_name[RTE_MEMZONE_NAMESIZE];
struct rte_ring *r;
@@ -173,7 +174,7 @@ rte_common_ring_create(const char *name, unsigned count, 
int socket_id,
 
ring_list = RTE_TAILQ_CAST(rte_common_ring_tailq.head, 
rte_common_ring_list);
 
-   ring_size = rte_common_ring_get_memsize(count);
+   ring_size = rte_common_ring_get_memsize(count, elem_sz);
if (ring_size < 0) {
rte_errno = ring_size;
return NULL;
@@ -203,7 +204,7 @@ rte_common_ring_create(const char *name, unsigned count, 
int socket_id,
r = mz->addr;
/* no need to check return value here, we already checked the
 * arguments above */
-   rte_common_ring_init(r, name, count, flags);
+   rte_common_ring_init(r, name, count, flags, elem_sz);
 
te->data = (void *) r;
r->memzone = mz;
@@ -293,6 +294,7 @@ rte_common_ring_dump(FILE *f, const struct rte_ring *r)
 
fprintf(f, "ring <%s>@%p\n", r->name, r);
fprintf(f, "  flags=%x\n", r->flags);
+   fprintf(f, "  elem_sz=%u\n", r->elem_sz);
fprintf(f, "  size=%"PRIu32"\n", r->prod.size);
fprintf(f, "  ct=%"PRIu32"\n", r->cons.tail);
fprintf(f, "  ch=%"PRIu32"\n", r->cons.head);
diff --git a/lib/librte_ring/rte_common_ring.h 
b/lib/librte_ring/rte_common_ring.h
index f2c1c46..314d53b 100644
--- a/lib/librte_ring/rte_common_ring.h
+++ b/lib/librte_ring/rte_common_ring.h
@@ -101,6 +101,7 @@ extern "C" {
 #include 
 #include 
 #include 
+#include 
 
 #define RTE_TAILQ_RING_NAME "RTE_RING"
 
@@ -157,6 +158,7 @@ struct rte_ring {
 */
char name[RTE_MEMZONE_NAMESIZE];/**< Name of the ring. */
int flags;   /**< Flags supplied at creation. */
+   unsigned int elem_sz;/**< Size of a ring entry */
const struct rte_memzone *memzone;
/**< Memzone, if any, containing the rte_ring */
 
@@ -232,7 +234,7 @@ struct rte_ring {
  *   - The memory size needed for the ring on success.
  *   - -EINVAL if count is not a power of 2.
  */
-ssize_t rte_common_ring_get_memsize(unsigned count);
+ssize_t rte_common_ring_get_memsize(unsigned count, unsigned int elem_sz);
 
 /**
  * Initialize a ring structure.
@@ -269,7 +271,7 @@ ssize_t rte_common_ring_get_memsize(unsigned count);
  *   0 on success, or a negative value on error.
  */
 int rte_common_ring_init(struct rte_ring *r, const char *name, unsigned count,
-   unsigned flags);
+   unsigned flags, unsigned int elem_sz);
 
 /**
  * Create a new ring named *name* in memory.
@@ -311,7 +313,8 @@ int rte_common_ring_init(struct rte_ring *r, const char 
*name, unsigned count,
  *- ENOMEM - no appropriate memory area found in which to create memzone
  */
 str

[dpdk-dev] [PATCH RFCv2 4/4] ring: add new event ring class

2017-01-24 Thread Bruce Richardson
using a placeholder struct type of the correct (16B) size, create an
event ring type and unit test it to verify the generalized ring
implementation works.

Signed-off-by: Bruce Richardson 
---
 app/test/Makefile|   1 +
 app/test/test_event_ring.c   |  83 +
 lib/librte_ring/Makefile |   2 +
 lib/librte_ring/rte_event_ring.c |  87 +
 lib/librte_ring/rte_event_ring.h | 696 +++
 5 files changed, 869 insertions(+)
 create mode 100644 app/test/test_event_ring.c
 create mode 100644 lib/librte_ring/rte_event_ring.c
 create mode 100644 lib/librte_ring/rte_event_ring.h

diff --git a/app/test/Makefile b/app/test/Makefile
index 9de301f..c0ab2ba 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -96,6 +96,7 @@ SRCS-y += test_memory.c
 SRCS-y += test_memzone.c
 
 SRCS-y += test_ring.c
+SRCS-y += test_event_ring.c
 SRCS-y += test_ring_perf.c
 SRCS-y += test_pmd_perf.c
 
diff --git a/app/test/test_event_ring.c b/app/test/test_event_ring.c
new file mode 100644
index 000..94f85cd
--- /dev/null
+++ b/app/test/test_event_ring.c
@@ -0,0 +1,83 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include "test.h"
+
+#define BURST_SIZE 24 /* not a power of two so we can test wrap around better 
*/
+#define RING_SIZE 128
+#define ENQ_ITERATIONS 48
+
+#define ERR_OUT() do { \
+   printf("Error %s:%d\n", __FILE__, __LINE__); \
+   return -1; \
+} while (0)
+
+static int
+test_event_ring(void)
+{
+   struct rte_event in_events[BURST_SIZE];
+   struct rte_event out_events[BURST_SIZE];
+   unsigned int i;
+
+   struct rte_ring *ering = rte_event_ring_create("TEST_RING", RING_SIZE,
+   rte_socket_id(), RING_F_SP_ENQ|RING_F_SC_DEQ);
+   if (ering == NULL)
+   ERR_OUT();
+
+   for (i = 0; i < BURST_SIZE; i++)
+   in_events[i].event_metadata = rte_rand();
+
+   for (i = 0; i < ENQ_ITERATIONS; i++)
+   if (rte_event_ring_enqueue_bulk(ering, in_events,
+   RTE_DIM(in_events)) != 0) {
+   unsigned j;
+
+   if (rte_event_ring_dequeue_burst(ering, out_events,
+   RTE_DIM(out_events)) != 
RTE_DIM(out_events))
+   ERR_OUT();
+   for (j = 0; j < RTE_DIM(out_events); j++)
+   if (out_events[j].event_metadata !=
+   in_events[j].event_metadata)
+   ERR_OUT();
+   /* retry, now that we've made space */
+   if (rte_event_ring_enqueue_bulk(ering, in_events,
+   RTE_DIM(in_events)) != 0)
+   ERR_OUT();
+   }
+
+   return 0;
+}
+
+REGISTER_TEST_COMMAND(event_ring_autotest, test_event_ring);
diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile
index 5cebb29..fb1c9e5 100644
--- a/lib/librte_ring/Makefile
+++ b/lib/librte_ring/Makefile
@@ -43,10 +43,12 @@ LIBABIVER := 1
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_RING) += rte_common_ring.c
 SRCS-$(CONFIG_RTE_LIBRTE_RING) += rte_ring.c
+SRCS-$(CONFIG_RTE_LIBRTE_RING) += rt

Re: [dpdk-dev] [PATCHv6 22/33] net/dpaa2: add support for l3 and l4 checksum offload

2017-01-24 Thread Hemant Agrawal

On 1/23/2017 11:05 PM, Ferruh Yigit wrote:

On 1/23/2017 11:59 AM, Hemant Agrawal wrote:

Signed-off-by: Hemant Agrawal 
---

<...>

--- a/drivers/net/dpaa2/Makefile
+++ b/drivers/net/dpaa2/Makefile
@@ -66,6 +66,6 @@ DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_eal 
lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_mempool lib/librte_mbuf
 DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_common_dpaa2_qbman
 DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_bus_fslmc
-DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_pmd_dpaa2_pool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_pool_dpaa2


I understand these are added since DEPDIRS converted to LDLIBS, but this
does not looks right, since these variables mainly dependency solving
and the value provided is not correct for DEPDIRS.

Did you tried adding as "LDLIBS += xx", not tested, but may work.



Our intention was to create dependency to help in parallel build in case 
of shared library.  "LDLIBS += xx" also work, but we are still not able 
to create inter driver directory dependency for parallel build.



We may need your help in understanding the change required in depdir 
script to support directories within "driver".




 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 6de571a..2298246 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -68,7 +68,17 @@
dev_info->min_rx_bufsize = DPAA2_MIN_RX_BUF_SIZE;
dev_info->max_rx_queues = (uint16_t)priv->nb_rx_queues;
dev_info->max_tx_queues = (uint16_t)priv->nb_tx_queues;
-
+   dev_info->rx_offload_capa =
+   DEV_RX_OFFLOAD_IPV4_CKSUM |
+   DEV_RX_OFFLOAD_UDP_CKSUM |
+   DEV_RX_OFFLOAD_TCP_CKSUM |
+   DEV_RX_OFFLOAD_OUTER_IPV4_CKSUM;


Is there capabilities always enabled for all devices this driver
support? Or should these be read from some device registers?

Capabilities are always enabled for all devices (MC DPNI object in this 
case)



+   dev_info->tx_offload_capa =
+   DEV_TX_OFFLOAD_IPV4_CKSUM |
+   DEV_TX_OFFLOAD_UDP_CKSUM |
+   DEV_TX_OFFLOAD_TCP_CKSUM |
+   DEV_TX_OFFLOAD_SCTP_CKSUM |
+   DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
dev_info->speed_capa = ETH_LINK_SPEED_1G |
ETH_LINK_SPEED_2_5G |
ETH_LINK_SPEED_10G;

<...>






Re: [dpdk-dev] [PATCHv6 16/33] drivers/pool/dpaa2: adding hw offloaded mempool

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 9:12 AM, Shreyansh Jain wrote:
> On Monday 23 January 2017 11:04 PM, Ferruh Yigit wrote:
>> On 1/23/2017 11:59 AM, Hemant Agrawal wrote:
>>> Adding NXP DPAA2 architecture specific mempool support
>>> Each mempool instance is represented by a DPBP object
>>> from the FSL-MC bus.
>>>
>>> This patch also registers a dpaa2 type MEMPOOL OPS
>>>
>>> Signed-off-by: Hemant Agrawal 
>>> ---
>> <...>
>>
>>> diff --git a/drivers/common/Makefile b/drivers/common/Makefile
>>> index b52931c..0bb75b5 100644
>>> --- a/drivers/common/Makefile
>>> +++ b/drivers/common/Makefile
>>> @@ -35,7 +35,11 @@ ifeq ($(CONFIG_RTE_LIBRTE_DPAA2_PMD),y)
>>>  CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_DPAA2_PMD)
>>>  endif
>>>
>>> -ifeq ($(CONFIG_RTE_LIBRTE_FSLMC_BUS),y)
>>> +ifeq ($(CONFIG_RTE_LIBRTE_DPAA2_POOL),y)
>>> +CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_DPAA2_POOL)
>>> +endif
>>> +
>>> +ifneq ($(CONFIG_RTE_LIBRTE_FSLMC_BUS),y)
>>
>> I guess this is a typo, but this prevents DPAA2_COMMON to be compiled !!
> 
> It should be 'ifeq' rather than 'ifneq'. 

> And it will prevent COMMON
> compilation only if CONFIG_RTE_LIBRTE_FSLMC_BUS=n which is not the case
> right now.

It was the case for me for x86 config, but you are right it is not the
default case for arm.

> 
> We will fix it.
> 
>>
>>>  CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_FSLMC_BUS)
>>>  endif
>>>
>>
>> <...>
>>> +# library dependencies
>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_eal
>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_mempool
>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_common_dpaa2_qbman
>>
>> This dependeny doesn not looks correct, there is no folder like that.
> 
> This is something even I need to understand. From the DEPDIRS what I
> understood was that though it refers to a directory, it essentially
> links libraries in build/lib/*.
> 
> Further, somehow the development is deploying drivers/bus,
> drivers/common and drivers/pool in lib/* under the name specified as
> LIB in Makefile. My understanding was that it is expected behavior and
> not special because of drivers folder.
> 
> Thus, above line only links lib/librte_common_dpaa2_qbman generated by
> drivers/common/dpaa2/qbman code.
> 
> In fact, I think, this might also one of the issues why a parallel 
> shared build fails for DPAA2 PMD (added in Cover letter).
> The dependency graph cannot create a graph for drivers/common
> as dependency for drivers/net or drivers/bus and hence parallel build
> fails because of missing libraries which are being parallely compiled.

DEPDIRS-y is mainly to resolve dependencies for compilation order, and
should point to the folder,

Following line will cause "librte_eal" to be compiled before driver:
DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_eal

So "lib/librte_common_dpaa2_qbman" does not makes more sense, since
there is no folder like that.


Somewhere in the history, with following commit, DEPDIRS-y gained a side
effect, it has been used to set dynamic linking dependencies, to fix
underlinking issue:
 bf5a46fa5972 ("mk: generate internal library dependencies")

I guess you are having that line to benefit from this side effect, but
this can be done with following more properly:
LDLIBS += lib/librte_common_dpaa2_qbman


To resolve the drivers/net to drivers/common dependency, following line
in this Makefile should work:
DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += drivers/common/dpaa2

This adds following, which will cause "drivers/common" compiled before
any "drivers/net":
LOCAL_DEPDIRS-drivers/net += drivers/common

> 
>>
>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_bus_fslmc
>>> +
>>> +include $(RTE_SDK)/mk/rte.lib.mk
>>
>> <...>
>>
> 



Re: [dpdk-dev] [PATCH v1] ethdev: fix multi-process NULL dereference crashes

2017-01-24 Thread Thomas Monjalon
2017-01-24 08:16, Remy Horton:
> 
> On 20/01/2017 18:37, Thomas Monjalon wrote:
> [..]
> > 3 comments here:
> > - it is in the wrong section (EAL instead of Drivers)
> > - secondary processes can setup a vdev PMD
> > - before Yuanhan's patch, even PCI PMD were blanking primary process data
> 
> Since the code being changed is in rte_ether rather than drivers/net it 
> seemed the logical place to me.. :)

The change is in ethdev, and you put the release note in EAL.
So no, it is not logical, because ethdev is not EAL.

> > I propose this rebase:
> >
> > -   memset(&rte_eth_dev_data[port_id], 0, sizeof(struct 
> > rte_eth_dev_data));
> > eth_dev = eth_dev_get(port_id);
> > +   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > +   memset(eth_dev->data, 0, sizeof(*eth_dev->data));
> > snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", 
> > name);
> > eth_dev->data->port_id = port_id;
> > eth_dev->data->mtu = ETHER_MTU;
> 
> Seems OK to me, assuming Yuanhan's patch is going in as-is.

Yuanhan's patch is already part of RC1.




[dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Matej Vido
Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")

Signed-off-by: Matej Vido 
---
 drivers/net/szedata2/rte_eth_szedata2.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/szedata2/rte_eth_szedata2.h 
b/drivers/net/szedata2/rte_eth_szedata2.h
index b58adb6..afe8a38 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.h
+++ b/drivers/net/szedata2/rte_eth_szedata2.h
@@ -192,7 +192,7 @@ struct szedata {
 }
 
 #define SZEDATA2_PCI_RESOURCE_PTR(rsc, offset, type) \
-   ((type)((uint8_t *)(rsc)->addr) + (offset))
+   ((type)(((uint8_t *)(rsc)->addr) + (offset)))
 
 enum szedata2_link_speed {
SZEDATA2_LINK_SPEED_DEFAULT = 0,
-- 
1.8.4



Re: [dpdk-dev] [PATCHv6 22/33] net/dpaa2: add support for l3 and l4 checksum offload

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 10:45 AM, Hemant Agrawal wrote:
> On 1/23/2017 11:05 PM, Ferruh Yigit wrote:
>> On 1/23/2017 11:59 AM, Hemant Agrawal wrote:
>>> Signed-off-by: Hemant Agrawal 
>>> ---
>> <...>
>>> --- a/drivers/net/dpaa2/Makefile
>>> +++ b/drivers/net/dpaa2/Makefile
>>> @@ -66,6 +66,6 @@ DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_eal 
>>> lib/librte_ether
>>>  DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_mempool 
>>> lib/librte_mbuf
>>>  DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_common_dpaa2_qbman
>>>  DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_bus_fslmc
>>> -DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_pmd_dpaa2_pool
>>> +DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += lib/librte_pool_dpaa2
>>
>> I understand these are added since DEPDIRS converted to LDLIBS, but this
>> does not looks right, since these variables mainly dependency solving
>> and the value provided is not correct for DEPDIRS.
>>
>> Did you tried adding as "LDLIBS += xx", not tested, but may work.
>>
> 
> Our intention was to create dependency to help in parallel build in case 
> of shared library.  "LDLIBS += xx" also work, but we are still not able 
> to create inter driver directory dependency for parallel build.
> 
> 
> We may need your help in understanding the change required in depdir 
> script to support directories within "driver".
> 
>>>
>>>  include $(RTE_SDK)/mk/rte.lib.mk
>>> diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c 
>>> b/drivers/net/dpaa2/dpaa2_ethdev.c
>>> index 6de571a..2298246 100644
>>> --- a/drivers/net/dpaa2/dpaa2_ethdev.c
>>> +++ b/drivers/net/dpaa2/dpaa2_ethdev.c
>>> @@ -68,7 +68,17 @@
>>> dev_info->min_rx_bufsize = DPAA2_MIN_RX_BUF_SIZE;
>>> dev_info->max_rx_queues = (uint16_t)priv->nb_rx_queues;
>>> dev_info->max_tx_queues = (uint16_t)priv->nb_tx_queues;
>>> -
>>> +   dev_info->rx_offload_capa =
>>> +   DEV_RX_OFFLOAD_IPV4_CKSUM |
>>> +   DEV_RX_OFFLOAD_UDP_CKSUM |
>>> +   DEV_RX_OFFLOAD_TCP_CKSUM |
>>> +   DEV_RX_OFFLOAD_OUTER_IPV4_CKSUM;
>>
>> Is there capabilities always enabled for all devices this driver
>> support? Or should these be read from some device registers?
>>
> Capabilities are always enabled for all devices (MC DPNI object in this 
> case)

Thanks for the clarification, is it same for the speed capabilities?

> 
>>> +   dev_info->tx_offload_capa =
>>> +   DEV_TX_OFFLOAD_IPV4_CKSUM |
>>> +   DEV_TX_OFFLOAD_UDP_CKSUM |
>>> +   DEV_TX_OFFLOAD_TCP_CKSUM |
>>> +   DEV_TX_OFFLOAD_SCTP_CKSUM |
>>> +   DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
>>> dev_info->speed_capa = ETH_LINK_SPEED_1G |
>>> ETH_LINK_SPEED_2_5G |
>>> ETH_LINK_SPEED_10G;
>> <...>
>>
> 
> 



Re: [dpdk-dev] [PATCH 5/5] net/virtio: fix Tso when mbuf is shared

2017-01-24 Thread Olivier MATZ
On Wed, 18 Jan 2017 13:03:48 +0800, Yuanhan Liu
 wrote:
> On Tue, Jan 17, 2017 at 12:18:25PM +0100, Olivier Matz wrote:
> > > I hope I could have time to dig this further, since, honestly, I
> > > don't quite like this patch: it makes things un-maintainable.  
> > 
> > Well, I'm not that proud of the patch, but that's the best solution
> > I've found. Nevertheless saying it makes things un-maintainable
> > looks a bit excessive to me :)  
> 
> Aha... really sorry about that!
> 
> But honestly, I'd say again, it makes thing more complex, just for
> fixing a corner and rare issue. I'd try to avoid that.
> 
> > 
> > The option of reallocating a mbuf, copy and fix network headers in
> > it looks even more complex to me (that was my first approach).
> >   
> > > Besides that, I think we have similar issue with nic drivers. See
> > > the rte_net_intel_cksum_flags_prepare() function introduced at
> > > commit 4fb7e803eb1a ("ethdev: add Tx preparation").  
> > 
> > Yes, that was discussed a bit. See [1] and the subsequent mails.
> > http://dpdk.org/ml/archives/dev/2016-December/051014.html  
> 
> Thanks for the info, and I'm pretty Okay with that.
> 
> > My opinion is that tx_burst() should not change the mbuf data, it's
> > always been like this. For Intel NICs, there is no issue since the
> > DPDK API is derived from Intel NICs API, so there is no fix to do
> > in the mbuf data.
> > 
> > For tx_prepare(), it's explicitly said that it can update the data.
> > If tx_prepare() becomes mandatory, it will naturally fix this issue
> > without modifying the driver, because the phdr csum calculation
> > will be done in tx_prepare().
> > 
> > An alternative is to mark this as a known issue for now, and wait
> > until tx_prepare() is mandatory.  
> 
> I see no reason to wait. Though my understanding is it may not be a
> mandatory so far, but user is supposed to calculate the pseudo-header
> checksum by themself before. Now they have one more option:
> tx_prepare.
>
> That means, in either way, user has to do some extra works to make
> TSO work (either by themself or call tx_prepare). So I don't think
> it'd be an issue?

Yes sounds good. I'll check how a tx_prepare() could be implemented for
virtio, in order to fix this issue.

In the meanwhile, for the 17.02, I think it could be good to highlight
the problem in the known issues, what do you think?


Thanks
Olivier


Re: [dpdk-dev] [PATCH] mk: parallelize make config

2017-01-24 Thread Bruce Richardson
On Sun, Jan 22, 2017 at 01:50:34AM +, Ferruh Yigit wrote:
> make config dependency resolving was always running serial,
> parallelize it for better performance.
> 
> $ time make T=x86_64-native-linuxapp-gcc config
> real0m12.633s
> 
> $ time make -j8 T=x86_64-native-linuxapp-gcc config
> real0m1.826s
> 
> Signed-off-by: Ferruh Yigit 

Works great for me! Config time has always been a pain.

Tested-by: Bruce Richardson 


Re: [dpdk-dev] [PATCH v1] ethdev: fix multi-process NULL dereference crashes

2017-01-24 Thread Remy Horton


On 24/01/2017 10:49, Thomas Monjalon wrote:
[..]

Seems OK to me, assuming Yuanhan's patch is going in as-is.


Yuanhan's patch is already part of RC1.


Ah ok. I'll rebase a v2 then..


Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Robin Jarry

Hi Olivier,

2017-01-23, Olivier Matz:

Before this patch, the management of dependencies between directories
had several issues:

- the generation of .depdirs, done at configuration is slow: it can take
 more than one minute on some slow targets (usually ~10s on a standard
 PC).


Indeed, on a Qualcomm development board where disk I/O is quite slow:

 $ git describe
 v17.02-rc1-3-g61207d014fc9

 $ time make config T=arm64-armv8a-linuxapp-gcc
 real1m4.308s


- for instance, it is possible to expressed a dependency like:


s/expressed/express/

- we cannot use "make -d" for debug, because the output of make is used 
for the generation of .depdirs.


That is really annoying when debugging makefiles.

After applying this patch:

 $ git am mk-optimize-directory-dependencies.patch
 Applying: mk: optimize directory dependencies
 $ rm -rf build/

 $ time make config T=arm64-armv8a-linuxapp-gcc
 real0m0.111s

Almost 600 times faster than before!

I prefer this solution to the one proposed by Ferruh (which is 
interesting but requires to run parallel make). Here is a test with the 
other patch:


 $ git am mk-parallelize-make-config.patch
 Applying: mk: parallelize make config
 $ rm -rf build/
 $ grep -c processor /proc/cpuinfo
 24

 $ time make config T=arm64-armv8a-linuxapp-gcc -j24
 real0m11.253s

Here only 6 times faster than before, even when using 24 parallel 
processes.


Tested-by: Robin Jarry 


Re: [dpdk-dev] [PATCHv6 00/33] NXP DPAA2 PMD

2017-01-24 Thread Ferruh Yigit
On 1/23/2017 5:58 PM, Ferruh Yigit wrote:
> On 1/23/2017 11:59 AM, Hemant Agrawal wrote:
> <...>
> 
>>
>> Hemant Agrawal (33):
>>   mk/dpaa2: add the crc support to the machine type
>>   drivers/common/dpaa2: adding qbman driver
>>   bus/fslmc: introducing fsl-mc bus driver
>>   bus/fslmc: introduce mc object functions
>>   bus/fslmc: add mc dpni object support
>>   bus/fslmc: add mc dpio object support
>>   bus/fslmc: add mc dpbp object support
>>   bus/fslmc: add mc dpseci object support
>>   eal/vfio: adding vfio utility functions in map file
>>   bus/fslmc: add vfio support
>>   bus/fslmc: scan for net and sec devices
>>   net/dpaa2: introducing NXP dpaa2 pmd driver
>>   doc: add dpaa2 nic details
>>   bus/fslmc: add debug log message support
>>   drivers/common/dpaa2: dpio portal driver
>>   drivers/pool/dpaa2: adding hw offloaded mempool
>>   drivers/common/dpaa2: dpio routine to affine to crypto threads
>>   net/dpaa2: adding eth ops to dpaa2
>>   net/dpaa2: add rss flow distribution
>>   net/dpaa2: configure mac address at init
>>   net/dpaa2: attach the buffer pool to dpni
>>   net/dpaa2: add support for l3 and l4 checksum offload
>>   net/dpaa2: add support for promiscuous mode
>>   net/dpaa2: add mtu config support
>>   net/dpaa2: add packet rx and tx support
>>   net/dpaa2: rx packet parsing and packet type support
>>   net/dpaa2: link status update
>>   net/dpaa2: basic stats support
>>   net/dpaa2: enable stashing for LS2088A devices
>>   net/dpaa2: add support for non hw buffer pool packet transmit
>>   net/dpaa2: enabling the use of physical addresses
>>   bus/fslmc: add support for dmamap to ARM SMMU
>>   drivers/common/dpaa2: frame queue based dq storage alloc
>>
> 
> devtools/check-git-log.sh gives following errors:
> 
> Wrong headline prefix:
> bus/fslmc: add debug log message support

Some of these warnings are because of the assumption that the scope of
changed files are limited to the patch title tag.

For example, for this patch, because of the "bus/fslmc:", it is expected
that all modified files are under "drivers/bus/fslmc" folder, but this
patch modifies:
bus/fslmc/*
common/dpaa2/*
net/dpaa2/*

I can guess different dpaa2 modules (bus/common/net) has dependencies to
each other, and may not always be possible to separate them. This needs
to be investigated per patch.

But the more they are separated, easier to review / understand them. And
I am aware this is easier to say than to do it.

thanks,
ferruh


> drivers/common/dpaa2: dpio portal driver

Or this one, scope is "drivers/common/dpaa2", but all the modifies files
are under "/drivers/bus/*".


> drivers/common/dpaa2: dpio routine to affine to crypto threads
> net/dpaa2: adding eth ops to dpaa2
> net/dpaa2: attach the buffer pool to dpni
> net/dpaa2: add support for l3 and l4 checksum offload
> net/dpaa2: add mtu config support
> net/dpaa2: add packet rx and tx support
> net/dpaa2: enabling the use of physical addresses
> bus/fslmc: add support for dmamap to ARM SMMU
> drivers/common/dpaa2: frame queue based dq storage alloc
> Wrong headline lowercase:
> net/dpaa2: introducing NXP dpaa2 pmd driver
> doc: add dpaa2 nic details
> drivers/pool/dpaa2: adding hw offloaded mempool
> net/dpaa2: add rss flow distribution
> net/dpaa2: configure mac address at init
> net/dpaa2: add support for l3 and l4 checksum offload
> net/dpaa2: add mtu config support
> net/dpaa2: add packet rx and tx support
> net/dpaa2: rx packet parsing and packet type support
> net/dpaa2: add support for non hw buffer pool packet transmit
> Headline too long:
> drivers/common/dpaa2: dpio routine to affine to crypto threads
> net/dpaa2: add support for non hw buffer pool packet transmit
> 
> <...>
> 
> 
> 



Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Bruce Richardson
On Tue, Jan 24, 2017 at 12:19:49PM +0100, Robin Jarry wrote:
> Hi Olivier,
> 
> 2017-01-23, Olivier Matz:
> > Before this patch, the management of dependencies between directories
> > had several issues:
> > 
> > - the generation of .depdirs, done at configuration is slow: it can take
> >  more than one minute on some slow targets (usually ~10s on a standard
> >  PC).
> 
> Indeed, on a Qualcomm development board where disk I/O is quite slow:
> 
>  $ git describe
>  v17.02-rc1-3-g61207d014fc9
> 
>  $ time make config T=arm64-armv8a-linuxapp-gcc
>  real1m4.308s
> 

Wow, what is the build time in that case?

> > - for instance, it is possible to expressed a dependency like:
> 
> s/expressed/express/
> 
> > - we cannot use "make -d" for debug, because the output of make is used
> > for the generation of .depdirs.
> 
> That is really annoying when debugging makefiles.
> 
> After applying this patch:
> 
>  $ git am mk-optimize-directory-dependencies.patch
>  Applying: mk: optimize directory dependencies
>  $ rm -rf build/
> 
>  $ time make config T=arm64-armv8a-linuxapp-gcc
>  real0m0.111s
> 
> Almost 600 times faster than before!
> 
> I prefer this solution to the one proposed by Ferruh (which is interesting
> but requires to run parallel make). Here is a test with the other patch:
> 
>  $ git am mk-parallelize-make-config.patch
>  Applying: mk: parallelize make config
>  $ rm -rf build/
>  $ grep -c processor /proc/cpuinfo
>  24
> 
>  $ time make config T=arm64-armv8a-linuxapp-gcc -j24
>  real0m11.253s
> 
> Here only 6 times faster than before, even when using 24 parallel processes.
> 
> Tested-by: Robin Jarry 

Hi Robin,

what are the differences in the patches like when doing a build rather
than just a config? If the build is minutes long because of slow IO,
is the extra 10 seconds really going to make that much of a difference?

Regards,
/Bruce


Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Jerin Jacob
On Mon, Jan 23, 2017 at 06:19:13PM +0100, Olivier Matz wrote:
> Before this patch, the management of dependencies between directories
> had several issues:
> 
> - the generation of .depdirs, done at configuration is slow: it can take
>   more than one minute on some slow targets (usually ~10s on a standard
>   PC).
> 
> - for instance, it is possible to expressed a dependency like:
>   - app/foo depends on lib/librte_foo
>   - and lib/librte_foo depends on app/bar
>   But this won't work because the directories are traversed with a
>   depth-first algorithm, so we have to choose between doing 'app' before
>   or after 'lib'.
> 
> - the script depdirs-rule.sh is too complex.
> 
> - we cannot use "make -d" for debug, because the output of make is used for
>   the generation of .depdirs.
> 
> This patch moves the DEPDIRS-* variables in the upper Makefile, making
> the dependencies much easier to calculate. A DEPDIRS variable is still
> used to process library dependencies in LDLIBS.
> 
> After this commit, "make config" is almost immediate.
> 
> Signed-off-by: Olivier Matz 

Tested both approach on ThunderX. This patch looks better

Tested-by: Jerin Jacob 

➜ [master]GB-2S [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc 
Configuration done

real0m18.112s
user0m2.810s
sys 0m0.660s

➜ [master]GB-2S [dpdk-master] $ pwclient git-am 19859
Applying patch #19859 using 'git am'
Description: [dpdk-dev] mk: parallelize make config
Applying: mk: parallelize make config

➜ [master]GB-2S [dpdk-master] $ rm -rf build
➜ [master]GB-2S [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc 
-j 8
Configuration done

real0m2.812s
user0m3.020s
sys 0m0.870s

➜ [master]GB-2S [dpdk-master] $ rm -rf build
➜ [master]GB-2S [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc 
-j 16
Configuration done

real0m1.748s
user0m3.040s
sys 0m1.020s

➜ [master]GB-2S [dpdk-master] $ rm -rf build
➜ [master]GB-2S [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc 
-j 32
Configuration done

real0m1.422s
user0m3.380s
sys 0m1.080s

➜ [master]GB-2S [dpdk-master] $ pwclient git-am 19918
Applying patch #19918 using 'git am'
Description: [dpdk-dev] mk: optimize directory dependencies
Applying: mk: optimize directory dependencies

➜ [master]GB-2S [dpdk-master] $ rm -rf build
➜ [master]GB-2S [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc 
Configuration done

real0m0.064s
user0m0.000s
sys 0m0.000s
➜ [master]GB-2S [dpdk-master] $ rm -rf build
➜ [master]GB-2S [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc 
-j 8
Configuration done

real0m0.055s
user0m0.000s
sys 0m0.000s



[dpdk-dev] [PATCH 2/2] net/i40e: fix bitmask of supported Tx flags

2017-01-24 Thread Jingjing Wu
Some Tx offload flags are missed in Bitmask of all supported packet
Tx flags by i40e.
This patch fixes it.

Fixes: 3f33e643e5c6 ("net/i40e: add Tx preparation")
Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_rxtx.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 89b9bf1..509d379 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -83,11 +83,16 @@
 
 #define I40E_TX_OFFLOAD_MASK (  \
PKT_TX_IP_CKSUM |   \
+   PKT_TX_IPV4 |   \
+   PKT_TX_IPV6 |   \
PKT_TX_L4_MASK |\
PKT_TX_OUTER_IP_CKSUM | \
+   PKT_TX_OUTER_IPV4 | \
+   PKT_TX_OUTER_IPV6 | \
PKT_TX_TCP_SEG |\
PKT_TX_QINQ_PKT |   \
-   PKT_TX_VLAN_PKT)
+   PKT_TX_VLAN_PKT |   \
+   PKT_TX_TUNNEL_MASK)
 
 #define I40E_TX_OFFLOAD_NOTSUP_MASK \
(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_MASK)
-- 
2.4.11



[dpdk-dev] [PATCH 1/2] mbuf: fix bitmask of Tx offload flags

2017-01-24 Thread Jingjing Wu
Some Tx offload flags are missed in Bitmask of all supported packet
Tx offload features flags.
This patch fixes it.

Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation")
Signed-off-by: Jingjing Wu 
---
 lib/librte_mbuf/rte_mbuf.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index bfce9f4..e57a4d2 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -295,8 +295,12 @@ extern "C" {
  */
 #define PKT_TX_OFFLOAD_MASK (\
PKT_TX_IP_CKSUM |\
+   PKT_TX_IPV4 |\
+   PKT_TX_IPV6 |\
PKT_TX_L4_MASK | \
PKT_TX_OUTER_IP_CKSUM |  \
+   PKT_TX_OUTER_IPV4 |  \
+   PKT_TX_OUTER_IPV6 |  \
PKT_TX_TCP_SEG | \
PKT_TX_QINQ_PKT |\
PKT_TX_VLAN_PKT |\
-- 
2.4.11



[dpdk-dev] [PATCH 1/2] mbuf: fix bitmask of Tx offload flags

2017-01-24 Thread Jingjing Wu
Some Tx offload flags are missed in bitmask of all supported packet
Tx offload features flags.
This patch fixes it.

Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation")
Signed-off-by: Jingjing Wu 
---
 lib/librte_mbuf/rte_mbuf.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index bfce9f4..e57a4d2 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -295,8 +295,12 @@ extern "C" {
  */
 #define PKT_TX_OFFLOAD_MASK (\
PKT_TX_IP_CKSUM |\
+   PKT_TX_IPV4 |\
+   PKT_TX_IPV6 |\
PKT_TX_L4_MASK | \
PKT_TX_OUTER_IP_CKSUM |  \
+   PKT_TX_OUTER_IPV4 |  \
+   PKT_TX_OUTER_IPV6 |  \
PKT_TX_TCP_SEG | \
PKT_TX_QINQ_PKT |\
PKT_TX_VLAN_PKT |\
-- 
2.4.11



[dpdk-dev] [PATCH 2/2] net/i40e: fix bitmask of supported Tx flags

2017-01-24 Thread Jingjing Wu
Some Tx offload flags are missed in bitmask of all supported packet
Tx flags by i40e.
This patch fixes it.

Fixes: 3f33e643e5c6 ("net/i40e: add Tx preparation")
Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_rxtx.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 89b9bf1..509d379 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -83,11 +83,16 @@
 
 #define I40E_TX_OFFLOAD_MASK (  \
PKT_TX_IP_CKSUM |   \
+   PKT_TX_IPV4 |   \
+   PKT_TX_IPV6 |   \
PKT_TX_L4_MASK |\
PKT_TX_OUTER_IP_CKSUM | \
+   PKT_TX_OUTER_IPV4 | \
+   PKT_TX_OUTER_IPV6 | \
PKT_TX_TCP_SEG |\
PKT_TX_QINQ_PKT |   \
-   PKT_TX_VLAN_PKT)
+   PKT_TX_VLAN_PKT |   \
+   PKT_TX_TUNNEL_MASK)
 
 #define I40E_TX_OFFLOAD_NOTSUP_MASK \
(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_MASK)
-- 
2.4.11



Re: [dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 10:49 AM, Matej Vido wrote:
> Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")
> 
> Signed-off-by: Matej Vido 

Unrelated from this patch, in maintainers file, you have your other mail
address: "Matej Vido ", do you want to update it?

> ---
>  drivers/net/szedata2/rte_eth_szedata2.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/szedata2/rte_eth_szedata2.h 
> b/drivers/net/szedata2/rte_eth_szedata2.h
> index b58adb6..afe8a38 100644
> --- a/drivers/net/szedata2/rte_eth_szedata2.h
> +++ b/drivers/net/szedata2/rte_eth_szedata2.h
> @@ -192,7 +192,7 @@ struct szedata {
>  }
>  
>  #define SZEDATA2_PCI_RESOURCE_PTR(rsc, offset, type) \
> - ((type)((uint8_t *)(rsc)->addr) + (offset))
> + ((type)(((uint8_t *)(rsc)->addr) + (offset)))

Although output will be same, (in all uses, type is a pointer), this
seems the intention, so:

Reviewed-by: Ferruh Yigit 

btw, following will do same, right, not sure if it is better:
((type)(rsc)->addr + (offset))

>  
>  enum szedata2_link_speed {
>   SZEDATA2_LINK_SPEED_DEFAULT = 0,
> 



Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Bruce Richardson
On Tue, Jan 24, 2017 at 05:10:15PM +0530, Jerin Jacob wrote:
> On Mon, Jan 23, 2017 at 06:19:13PM +0100, Olivier Matz wrote:
> > Before this patch, the management of dependencies between directories
> > had several issues:
> > 
> > - the generation of .depdirs, done at configuration is slow: it can take
> >   more than one minute on some slow targets (usually ~10s on a standard
> >   PC).
> > 
> > - for instance, it is possible to expressed a dependency like:
> >   - app/foo depends on lib/librte_foo
> >   - and lib/librte_foo depends on app/bar
> >   But this won't work because the directories are traversed with a
> >   depth-first algorithm, so we have to choose between doing 'app' before
> >   or after 'lib'.
> > 
> > - the script depdirs-rule.sh is too complex.
> > 
> > - we cannot use "make -d" for debug, because the output of make is used for
> >   the generation of .depdirs.
> > 
> > This patch moves the DEPDIRS-* variables in the upper Makefile, making
> > the dependencies much easier to calculate. A DEPDIRS variable is still
> > used to process library dependencies in LDLIBS.
> > 
> > After this commit, "make config" is almost immediate.
> > 
> > Signed-off-by: Olivier Matz 
> 
> Tested both approach on ThunderX. This patch looks better
> 
> Tested-by: Jerin Jacob 
> 
> ➜ [master]GB-2S [dpdk-master] $ time make config 
> T=arm64-thunderx-linuxapp-gcc 
> Configuration done
> 
> real0m18.112s
> user0m2.810s
> sys 0m0.660s
> 
> ➜ [master]GB-2S [dpdk-master] $ pwclient git-am 19859
> Applying patch #19859 using 'git am'
> Description: [dpdk-dev] mk: parallelize make config
> Applying: mk: parallelize make config
> 
> ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> ➜ [master]GB-2S [dpdk-master] $ time make config 
> T=arm64-thunderx-linuxapp-gcc -j 8
> Configuration done
> 
> real0m2.812s
> user0m3.020s
> sys 0m0.870s
> 
> ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> ➜ [master]GB-2S [dpdk-master] $ time make config 
> T=arm64-thunderx-linuxapp-gcc -j 16
> Configuration done
> 
> real0m1.748s
> user0m3.040s
> sys 0m1.020s
> 
> ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> ➜ [master]GB-2S [dpdk-master] $ time make config 
> T=arm64-thunderx-linuxapp-gcc -j 32
> Configuration done
> 
> real0m1.422s
> user0m3.380s
> sys 0m1.080s
> 
> ➜ [master]GB-2S [dpdk-master] $ pwclient git-am 19918
> Applying patch #19918 using 'git am'
> Description: [dpdk-dev] mk: optimize directory dependencies
> Applying: mk: optimize directory dependencies
> 
> ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> ➜ [master]GB-2S [dpdk-master] $ time make config 
> T=arm64-thunderx-linuxapp-gcc 
> Configuration done
> 
> real0m0.064s
> user0m0.000s
> sys 0m0.000s
> ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> ➜ [master]GB-2S [dpdk-master] $ time make config 
> T=arm64-thunderx-linuxapp-gcc -j 8
> Configuration done
> 
> real0m0.055s
> user0m0.000s
> sys 0m0.000s
>

I agree that Olivier's patch is faster. However, I think I prefer having
the library dependencies in the makefiles for the libs themselves rather
than up a level. Given that we are only looking at ~2second of a
difference here in your tests - assuming -j flag - what is the actual
build time differences? My suspicion is that after Ferruh's simpler
patch is applied, the config time becomes such a small part of the
build, that the extra benefits from Oliviers work is not worth the extra
complexity.

Regards,
/Bruce



Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Robin Jarry

Hi Bruce,

2017-01-24, Bruce Richardson:

what are the differences in the patches like when doing a build rather
than just a config? If the build is minutes long because of slow IO,
is the extra 10 seconds really going to make that much of a difference?


I agree there is no significant difference in total build time (config 
included) when using -j8 (about ~15s). And I understand your concern 
about the complexity of the patch in the second solution.


But the way dependencies are computed is overly complex and I feel that 
parallelizing is just hiding dust under the carpet. The result after the 
second solution is cleaner (to me): we get rid of an obscure shell 
script and we stop piping make output to files thus restoring the 
possibility to use make -d to debug problems.


And while the goal is to reduce the config time, why not go all the 
way?


--
Robin


Re: [dpdk-dev] [PATCH] mbuf: remove redundant line in rte_pktmbuf_attach

2017-01-24 Thread Olivier MATZ
Hi,

On Sat, 21 Jan 2017 16:28:29 +, "Ananyev, Konstantin"
 wrote:
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ilya
> > Matveychikov Sent: Saturday, January 21, 2017 3:08 PM
> > To: Yigit, Ferruh 
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] mbuf: remove redundant line in
> > rte_pktmbuf_attach
> > 
> >   
> > > On Jan 20, 2017, at 4:08 PM, Ferruh Yigit
> > >  wrote:
> > >
> > > On 1/20/2017 12:19 AM, Ilya Matveychikov wrote:  
> > >> mi->next will be assigned to NULL few lines later, trivial patch
> > >>
> > >> Signed-off-by: Ilya V. Matveychikov 
> > >> ---
> > >> lib/librte_mbuf/rte_mbuf.h | 1 -
> > >> 1 file changed, 1 deletion(-)
> > >>
> > >> diff --git a/lib/librte_mbuf/rte_mbuf.h
> > >> b/lib/librte_mbuf/rte_mbuf.h index ead7c6e..5589d54 100644
> > >> --- a/lib/librte_mbuf/rte_mbuf.h
> > >> +++ b/lib/librte_mbuf/rte_mbuf.h
> > >> @@ -1139,7 +1139,6 @@ static inline void
> > >> rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
> > >> mi->buf_addr = m->buf_addr; mi->buf_len = m->buf_len;
> > >>
> > >> -mi->next = m->next;  
> > >

Fixes: ea672a8b1655 ("mbuf: remove the rte_pktmbuf structure")

Acked-by: Olivier Matz 


> > > Do you know why attaching mbuf is not supporting multi-segment?  
> 
> This is supported, but you have to do it segment by segment.
> Actually  rte_pktmbuf_clone() does that.
> Konstantin
> 
> 
> > > Perhaps this can be documented in function comment, as one of the
> > > "not supported" items.  
> > 
> > No, I don’t know. For my application I’ve found that nb_segs with
> > it’s limit in 256 segments is very annoying and I’ve decided not to
> > use DPDK functions that dealt with nb_segs… But it is not about the
> > rte_pktmbuf_attach() function and the patch. 


Out of curiosity, can you explain why your application needs more
than 256 segments? When we were discussing the possibility of extending
this field to 16 bits, Konstantin convinced me that it was not so
useful.


Thanks,
Olivier


Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Jerin Jacob
On Tue, Jan 24, 2017 at 12:15:06PM +, Bruce Richardson wrote:
> On Tue, Jan 24, 2017 at 05:10:15PM +0530, Jerin Jacob wrote:
> > On Mon, Jan 23, 2017 at 06:19:13PM +0100, Olivier Matz wrote:
> > > Before this patch, the management of dependencies between directories
> > > had several issues:
> > > 
> > > - the generation of .depdirs, done at configuration is slow: it can take
> > >   more than one minute on some slow targets (usually ~10s on a standard
> > >   PC).
> > > 
> > > - for instance, it is possible to expressed a dependency like:
> > >   - app/foo depends on lib/librte_foo
> > >   - and lib/librte_foo depends on app/bar
> > >   But this won't work because the directories are traversed with a
> > >   depth-first algorithm, so we have to choose between doing 'app' before
> > >   or after 'lib'.
> > > 
> > > - the script depdirs-rule.sh is too complex.
> > > 
> > > - we cannot use "make -d" for debug, because the output of make is used 
> > > for
> > >   the generation of .depdirs.
> > > 
> > > This patch moves the DEPDIRS-* variables in the upper Makefile, making
> > > the dependencies much easier to calculate. A DEPDIRS variable is still
> > > used to process library dependencies in LDLIBS.
> > > 
> > > After this commit, "make config" is almost immediate.
> > > 
> > > Signed-off-by: Olivier Matz 
> > 
> > Tested both approach on ThunderX. This patch looks better
> > 
> > Tested-by: Jerin Jacob 
> > 
> > ➜ [master]GB-2S [dpdk-master] $ time make config 
> > T=arm64-thunderx-linuxapp-gcc 
> > Configuration done
> > 
> > real0m18.112s
> > user0m2.810s
> > sys 0m0.660s
> > 
> > ➜ [master]GB-2S [dpdk-master] $ pwclient git-am 19859
> > Applying patch #19859 using 'git am'
> > Description: [dpdk-dev] mk: parallelize make config
> > Applying: mk: parallelize make config
> > 
> > ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> > ➜ [master]GB-2S [dpdk-master] $ time make config 
> > T=arm64-thunderx-linuxapp-gcc -j 8
> > Configuration done
> > 
> > real0m2.812s
> > user0m3.020s
> > sys 0m0.870s
> > 
> > ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> > ➜ [master]GB-2S [dpdk-master] $ time make config 
> > T=arm64-thunderx-linuxapp-gcc -j 16
> > Configuration done
> > 
> > real0m1.748s
> > user0m3.040s
> > sys 0m1.020s
> > 
> > ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> > ➜ [master]GB-2S [dpdk-master] $ time make config 
> > T=arm64-thunderx-linuxapp-gcc -j 32
> > Configuration done
> > 
> > real0m1.422s
> > user0m3.380s
> > sys 0m1.080s
> > 
> > ➜ [master]GB-2S [dpdk-master] $ pwclient git-am 19918
> > Applying patch #19918 using 'git am'
> > Description: [dpdk-dev] mk: optimize directory dependencies
> > Applying: mk: optimize directory dependencies
> > 
> > ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> > ➜ [master]GB-2S [dpdk-master] $ time make config 
> > T=arm64-thunderx-linuxapp-gcc 
> > Configuration done
> > 
> > real0m0.064s
> > user0m0.000s
> > sys 0m0.000s
> > ➜ [master]GB-2S [dpdk-master] $ rm -rf build
> > ➜ [master]GB-2S [dpdk-master] $ time make config 
> > T=arm64-thunderx-linuxapp-gcc -j 8
> > Configuration done
> > 
> > real0m0.055s
> > user0m0.000s
> > sys 0m0.000s
> >
> 
> I agree that Olivier's patch is faster. However, I think I prefer having
> the library dependencies in the makefiles for the libs themselves rather
> than up a level. Given that we are only looking at ~2second of a
> difference here in your tests - assuming -j flag - what is the actual
> build time differences? My suspicion is that after Ferruh's simpler

Without patch - 18sec
With patch -j1 - 18 sec
With patch -j2 - 9.2 sec
With patch -j4 - 4.9 sec
With patch -j8 - 2.8 sec
With patch -j16 - 1.7 sec
With patch -j32 - 1.4 sec

> patch is applied, the config time becomes such a small part of the
> build, that the extra benefits from Oliviers work is not worth the extra
> complexity.

The low-end embedded SoCs (SoC with 2 to 4 cores) will be get benefited out
of bring this extra complexity.My take is, if it is manageable complexity then
take the most optimized one.



Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Ferruh Yigit
On 1/23/2017 5:19 PM, Olivier Matz wrote:
> Before this patch, the management of dependencies between directories
> had several issues:
> 
> - the generation of .depdirs, done at configuration is slow: it can take
>   more than one minute on some slow targets (usually ~10s on a standard
>   PC).
> 
> - for instance, it is possible to expressed a dependency like:
>   - app/foo depends on lib/librte_foo
>   - and lib/librte_foo depends on app/bar
>   But this won't work because the directories are traversed with a
>   depth-first algorithm, so we have to choose between doing 'app' before
>   or after 'lib'.
> 
> - the script depdirs-rule.sh is too complex.
> 
> - we cannot use "make -d" for debug, because the output of make is used for
>   the generation of .depdirs.
> 
> This patch moves the DEPDIRS-* variables in the upper Makefile, making
> the dependencies much easier to calculate. A DEPDIRS variable is still
> used to process library dependencies in LDLIBS.
> 
> After this commit, "make config" is almost immediate.
> 
> Signed-off-by: Olivier Matz 

Hi Olivier,

It seems both have pros and cons,

Your patch pros,
- faster
- and simpler implementation.

cons,
- Need to update another Makefile in another level to update
dependencies of a library/driver.
- Root level dependencies hardcoded to a mk level makefile.
- removes depgraph target too, not sure how commonly used



Original implementation pros:
- self contained, it manages dependencies of library in same Makefile
- no hardcoded dependencies, all resolved dynamically

cons,
- relatively slower, but not too bad with -j make option.
- complex implementation


I would prefer my version, surprisingly J, but it is good to have
alternatives, and I don't have strong opinion against your patch.

Thanks,
ferruh


Re: [dpdk-dev] [PATCH v4] ethdev: fix MAC address replay

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 10:09 AM, Igor Ryzhov wrote:
> Thank you Steve.
> 

> I never did it before and I don't know if I have rights for that, but:
> 
> Acked-by: Igor Ryzhov mailto:iryz...@nfware.com>>

Unrelated to the patch itself, but since it has been mentioned, let me
share what I know, I believe Thomas or others will correct me if I am wrong:

- Everyone can Ack.
  And this is useful information for maintainers, so it is something
good when more people review and ack. Please do.

- Multiple ack or review is better.

- But each Ack does not have same weight, maintainer decides on this
weight, based on contribution of the person who ack'ed.

- There is slight difference between Acked-by and Reviewed-by:

-- Acked-by: Kind of asking for patch to be applied, saying this patch
is good and please get it.

-- Reviewed-by: Saying I have done the review at my best and patch looks
good to me.

Acked-by has slightly more responsibility than Reviewed-by.

If you are not maintainer of that field, and not have strong opinion
about that patch to be merged, it is possible to prefer Reviewed-by
against Acked-by.

But overall both are good, and definitely better than not saying
anything at all.

Thanks,
ferruh

> 
> On Tue, Jan 24, 2017 at 5:21 AM, Steve Shin  > wrote:
> 
> This patch fixes a bug in replaying MAC address to the hardware
> in rte_eth_dev_config_restore() routine. Added default MAC replay as
> well.
> 
> Fixes: 4bdefaade6d1 ("ethdev: VMDQ enhancements")
> 
> ---
> v2: Added default MAC replay & Code optimization
> v3: Covered a case (ex, SR-IOV) where multiple pools
>exist in the mac_pool_sel array.
> v4: removed a coding style warning
> 
> Signed-off-by: Steve Shin mailto:jons...@cisco.com>>
> 



Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Richardson, Bruce


> -Original Message-
> From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> Sent: Tuesday, January 24, 2017 12:57 PM
> To: Richardson, Bruce 
> Cc: Olivier Matz ; dev@dpdk.org; Yigit, Ferruh
> ; thomas.monja...@6wind.com
> Subject: Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies
> 
> On Tue, Jan 24, 2017 at 12:15:06PM +, Bruce Richardson wrote:
> > On Tue, Jan 24, 2017 at 05:10:15PM +0530, Jerin Jacob wrote:
> > > On Mon, Jan 23, 2017 at 06:19:13PM +0100, Olivier Matz wrote:
> > > > Before this patch, the management of dependencies between
> > > > directories had several issues:
> > > >
> > > > - the generation of .depdirs, done at configuration is slow: it can
> take
> > > >   more than one minute on some slow targets (usually ~10s on a
> standard
> > > >   PC).
> > > >
> > > > - for instance, it is possible to expressed a dependency like:
> > > >   - app/foo depends on lib/librte_foo
> > > >   - and lib/librte_foo depends on app/bar
> > > >   But this won't work because the directories are traversed with a
> > > >   depth-first algorithm, so we have to choose between doing 'app'
> before
> > > >   or after 'lib'.
> > > >
> > > > - the script depdirs-rule.sh is too complex.
> > > >
> > > > - we cannot use "make -d" for debug, because the output of make is
> used for
> > > >   the generation of .depdirs.
> > > >
> > > > This patch moves the DEPDIRS-* variables in the upper Makefile,
> > > > making the dependencies much easier to calculate. A DEPDIRS
> > > > variable is still used to process library dependencies in LDLIBS.
> > > >
> > > > After this commit, "make config" is almost immediate.
> > > >
> > > > Signed-off-by: Olivier Matz 
> > >
> > > Tested both approach on ThunderX. This patch looks better
> > >
> > > Tested-by: Jerin Jacob 
> > >
> > > ➜ [master]GB-2S [dpdk-master] $ time make config
> > > T=arm64-thunderx-linuxapp-gcc Configuration done
> > >
> > > real0m18.112s
> > > user0m2.810s
> > > sys 0m0.660s
> > >
> > > ➜ [master]GB-2S [dpdk-master] $ pwclient git-am 19859 Applying patch
> > > #19859 using 'git am'
> > > Description: [dpdk-dev] mk: parallelize make config
> > > Applying: mk: parallelize make config
> > >
> > > ➜ [master]GB-2S [dpdk-master] $ rm -rf build ➜ [master]GB-2S
> > > [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc -j 8
> > > Configuration done
> > >
> > > real0m2.812s
> > > user0m3.020s
> > > sys 0m0.870s
> > >
> > > ➜ [master]GB-2S [dpdk-master] $ rm -rf build ➜ [master]GB-2S
> > > [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc -j 16
> > > Configuration done
> > >
> > > real0m1.748s
> > > user0m3.040s
> > > sys 0m1.020s
> > >
> > > ➜ [master]GB-2S [dpdk-master] $ rm -rf build ➜ [master]GB-2S
> > > [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc -j 32
> > > Configuration done
> > >
> > > real0m1.422s
> > > user0m3.380s
> > > sys 0m1.080s
> > >
> > > ➜ [master]GB-2S [dpdk-master] $ pwclient git-am 19918 Applying patch
> > > #19918 using 'git am'
> > > Description: [dpdk-dev] mk: optimize directory dependencies
> > > Applying: mk: optimize directory dependencies
> > >
> > > ➜ [master]GB-2S [dpdk-master] $ rm -rf build ➜ [master]GB-2S
> > > [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc
> > > Configuration done
> > >
> > > real0m0.064s
> > > user0m0.000s
> > > sys 0m0.000s
> > > ➜ [master]GB-2S [dpdk-master] $ rm -rf build ➜ [master]GB-2S
> > > [dpdk-master] $ time make config T=arm64-thunderx-linuxapp-gcc -j 8
> > > Configuration done
> > >
> > > real0m0.055s
> > > user0m0.000s
> > > sys 0m0.000s
> > >
> >
> > I agree that Olivier's patch is faster. However, I think I prefer
> > having the library dependencies in the makefiles for the libs
> > themselves rather than up a level. Given that we are only looking at
> > ~2second of a difference here in your tests - assuming -j flag - what
> > is the actual build time differences? My suspicion is that after
> > Ferruh's simpler
> 
> Without patch - 18sec
> With patch -j1 - 18 sec
> With patch -j2 - 9.2 sec
> With patch -j4 - 4.9 sec
> With patch -j8 - 2.8 sec
> With patch -j16 - 1.7 sec
> With patch -j32 - 1.4 sec
> 
> > patch is applied, the config time becomes such a small part of the
> > build, that the extra benefits from Oliviers work is not worth the
> > extra complexity.
> 
> The low-end embedded SoCs (SoC with 2 to 4 cores) will be get benefited
> out of bring this extra complexity.My take is, if it is manageable
> complexity then take the most optimized one.

Ok. Point taken for the lower-core count parts.

Thomas - can at least one of these patches be merged into 17.02, since it will 
definitely help us developers? [If Olivier's is too big a change at this point 
in the cycle, can Ferruh's be taken as an intermediate fix for this release 
before merging Olivier's in 17.05?]

Regards,
/Bruce


Re: [dpdk-dev] [PATCH 3/3] doc: remove ABI changes in igb_uio

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 7:34 AM, Jianfeng Tan wrote:
> We announced ABI changes to remove iomem and ioport mapping in
> igb_uio. But it has potential backward compatibility issue: cannot
> run old version DPDK on modified igb_uio.
> 
> The purpose of this changes was to fix a bug: when DPDK app crashes,
> those devices by igb_uio are not stopped either DPDK PMD driver or
> igb_uio driver. We need to figure out new way to fix this bug.

Hi Jianfeng,

I believe it would be good to fix this potential defect.

Is "remove iomem and ioport" a must for that fix? If so, I suggest
re-think about it.

If I see correctly, dpdk1.8 and older uses igb_uio iomem files. So
backward compatibility is the possible issue for dpdk1.8 and older.
Since v1.8 two years old, I would prefer fixing defect instead of
keeping that backward compatibility.

Jianfeng, Thomas,

What do you think postponing this deprecation notice to next release,
instead of removing it, and discuss more?


And overall, if "remove iomem and ioport" is not a must for this fix, no
problem to remove deprecation notice.

Thanks,
ferruh


> 
> Fixes: 3bac1dbc1ed ("doc: announce iomem and ioport removal from igb_uio")
> 
> Suggested-by: Stephen Hemminger 
> Signed-off-by: Jianfeng Tan 
> ---
>  doc/guides/rel_notes/deprecation.rst | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 755dc65..0f039dd 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -8,11 +8,6 @@ API and ABI deprecation notices are to be posted here.
>  Deprecation Notices
>  ---
>  
> -* igb_uio: iomem mapping and sysfs files created for iomem and ioport in
> -  igb_uio will be removed, because we are able to detect these from what 
> Linux
> -  has exposed, like the way we have done with uio-pci-generic. This change
> -  targets release 17.02.
> -
>  * ABI/API changes are planned for 17.02: ``rte_device``, ``rte_driver`` will 
> be
>impacted because of introduction of a new ``rte_bus`` hierarchy. This would
>also impact the way devices are identified by EAL. A bus-device-driver 
> model
> 



Re: [dpdk-dev] [PATCH v4] ethdev: fix MAC address replay

2017-01-24 Thread Igor Ryzhov
Hello Ferruh,

Thanks for the explanation.

I tried to find something like that in "Contribution Guidelines" and found
that both "Acked-by" and "Reviewed-by" are just mentioned but not explained.
Meaning of these sentences can be different in different projects so it can
be good to explain it in DPDK development guidelines.

Best regards,
Igor

On Tue, Jan 24, 2017 at 4:21 PM, Ferruh Yigit 
wrote:

> On 1/24/2017 10:09 AM, Igor Ryzhov wrote:
> > Thank you Steve.
> >
>
> > I never did it before and I don't know if I have rights for that, but:
> >
> > Acked-by: Igor Ryzhov mailto:iryz...@nfware.com>>
>
> Unrelated to the patch itself, but since it has been mentioned, let me
> share what I know, I believe Thomas or others will correct me if I am
> wrong:
>
> - Everyone can Ack.
>   And this is useful information for maintainers, so it is something
> good when more people review and ack. Please do.
>
> - Multiple ack or review is better.
>
> - But each Ack does not have same weight, maintainer decides on this
> weight, based on contribution of the person who ack'ed.
>
> - There is slight difference between Acked-by and Reviewed-by:
>
> -- Acked-by: Kind of asking for patch to be applied, saying this patch
> is good and please get it.
>
> -- Reviewed-by: Saying I have done the review at my best and patch looks
> good to me.
>
> Acked-by has slightly more responsibility than Reviewed-by.
>
> If you are not maintainer of that field, and not have strong opinion
> about that patch to be merged, it is possible to prefer Reviewed-by
> against Acked-by.
>
> But overall both are good, and definitely better than not saying
> anything at all.
>
> Thanks,
> ferruh
>
> >
> > On Tue, Jan 24, 2017 at 5:21 AM, Steve Shin  > > wrote:
> >
> > This patch fixes a bug in replaying MAC address to the hardware
> > in rte_eth_dev_config_restore() routine. Added default MAC replay as
> > well.
> >
> > Fixes: 4bdefaade6d1 ("ethdev: VMDQ enhancements")
> >
> > ---
> > v2: Added default MAC replay & Code optimization
> > v3: Covered a case (ex, SR-IOV) where multiple pools
> >exist in the mac_pool_sel array.
> > v4: removed a coding style warning
> >
> > Signed-off-by: Steve Shin  jons...@cisco.com>>
> >
>
>


Re: [dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Matej Vido

On 24.01.2017 12:58, Ferruh Yigit wrote:

On 1/24/2017 10:49 AM, Matej Vido wrote:

Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")

Signed-off-by: Matej Vido 

Unrelated from this patch, in maintainers file, you have your other mail
address: "Matej Vido ", do you want to update it?

Hi Ferruh,

yes, I will send the patch.



---
  drivers/net/szedata2/rte_eth_szedata2.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/szedata2/rte_eth_szedata2.h 
b/drivers/net/szedata2/rte_eth_szedata2.h
index b58adb6..afe8a38 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.h
+++ b/drivers/net/szedata2/rte_eth_szedata2.h
@@ -192,7 +192,7 @@ struct szedata {
  }
  
  #define SZEDATA2_PCI_RESOURCE_PTR(rsc, offset, type) \

-   ((type)((uint8_t *)(rsc)->addr) + (offset))
+   ((type)(((uint8_t *)(rsc)->addr) + (offset)))

Although output will be same, (in all uses, type is a pointer), this
seems the intention, so:

Reviewed-by: Ferruh Yigit 

btw, following will do same, right, not sure if it is better:
((type)(rsc)->addr + (offset))
This is also wrong. The intention of the macro is to add an offset to 
the base address and typecast the result.


Regards,
Matej


  
  enum szedata2_link_speed {

SZEDATA2_LINK_SPEED_DEFAULT = 0,





Re: [dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 2:02 PM, Matej Vido wrote:
> On 24.01.2017 12:58, Ferruh Yigit wrote:
>> On 1/24/2017 10:49 AM, Matej Vido wrote:
>>> Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")
>>>
>>> Signed-off-by: Matej Vido 
>> Unrelated from this patch, in maintainers file, you have your other mail
>> address: "Matej Vido ", do you want to update it?
> Hi Ferruh,
> 
> yes, I will send the patch.
>>
>>> ---
>>>   drivers/net/szedata2/rte_eth_szedata2.h | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/szedata2/rte_eth_szedata2.h 
>>> b/drivers/net/szedata2/rte_eth_szedata2.h
>>> index b58adb6..afe8a38 100644
>>> --- a/drivers/net/szedata2/rte_eth_szedata2.h
>>> +++ b/drivers/net/szedata2/rte_eth_szedata2.h
>>> @@ -192,7 +192,7 @@ struct szedata {
>>>   }
>>>   
>>>   #define SZEDATA2_PCI_RESOURCE_PTR(rsc, offset, type) \
>>> -   ((type)((uint8_t *)(rsc)->addr) + (offset))
>>> +   ((type)(((uint8_t *)(rsc)->addr) + (offset)))
>> Although output will be same, (in all uses, type is a pointer), this
>> seems the intention, so:
>>
>> Reviewed-by: Ferruh Yigit 
>>
>> btw, following will do same, right, not sure if it is better:
>> ((type)(rsc)->addr + (offset))
> This is also wrong. The intention of the macro is to add an offset to 
> the base address and typecast the result.

Right, again this will give same output when "type" is pointer, but
wrong for described intention.

> 
> Regards,
> Matej
>>
>>>   
>>>   enum szedata2_link_speed {
>>> SZEDATA2_LINK_SPEED_DEFAULT = 0,
>>>
> 



[dpdk-dev] [PATCH] net/ena: prepare TSO offload calculation

2017-01-24 Thread Jakub Palider
While ENA can handle checksum calculations in almost all cases,
it cannot do so when DF bit in IPv4 header is not set,
that is DF=0, and TSO is requested. For that situation pseudo
header must be prepared manually.

Signed-off-by: Jakub Palider 
---
 drivers/net/ena/ena_ethdev.c | 29 +++--
 drivers/net/ena/ena_ethdev.h |  1 +
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index e99bf29..060bed9 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -1360,6 +1360,10 @@ static int eth_ena_dev_init(struct rte_eth_dev *eth_dev)
/* Set max MTU for this device */
adapter->max_mtu = get_feat_ctx.dev_attr.max_mtu;
 
+   /* set device support for TSO */
+   adapter->tso4_supported = get_feat_ctx.offload.tx &
+ ENA_ADMIN_FEATURE_OFFLOAD_DESC_TSO_IPV4_MASK;
+
/* Copy MAC address and point DPDK to it */
eth_dev->data->mac_addrs = (struct ether_addr *)adapter->mac_addr;
ether_addr_copy((struct ether_addr *)get_feat_ctx.dev_attr.mac_addr,
@@ -1585,13 +1589,20 @@ static uint16_t eth_ena_recv_pkts(void *rx_queue, 
struct rte_mbuf **rx_pkts,
 }
 
 static uint16_t
-eth_ena_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts,
+eth_ena_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts)
 {
int32_t ret;
uint32_t i;
struct rte_mbuf *m;
+   struct ena_ring *tx_ring = (struct ena_ring *)(tx_queue);
+   struct ipv4_hdr *ip_hdr;
uint64_t ol_flags;
+   uint16_t frag_field;
+
+   /* ENA needs partial checksum for TSO packets only, skip early */
+   if (!tx_ring->adapter->tso4_supported)
+   return nb_pkts;
 
for (i = 0; i != nb_pkts; i++) {
m = tx_pkts[i];
@@ -1611,7 +1622,21 @@ static uint16_t eth_ena_recv_pkts(void *rx_queue, struct 
rte_mbuf **rx_pkts,
return i;
}
 #endif
-   /* ENA doesn't need different phdr cskum for TSO */
+
+   if (!(m->ol_flags & PKT_TX_IPV4))
+   continue;
+
+   ip_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
+m->l2_len);
+   frag_field = rte_be_to_cpu_16(ip_hdr->fragment_offset);
+   if (frag_field & IPV4_HDR_DF_FLAG)
+   continue;
+
+   /* In case we are supposed to TSO and have DF not set (DF=0)
+* hardware must be provided with partial checksum, otherwise
+* it will take care of necessary calculations.
+*/
+
ret = rte_net_intel_cksum_flags_prepare(m,
ol_flags & ~PKT_TX_TCP_SEG);
if (ret != 0) {
diff --git a/drivers/net/ena/ena_ethdev.h b/drivers/net/ena/ena_ethdev.h
index 4c7edbb..dc3080f 100644
--- a/drivers/net/ena/ena_ethdev.h
+++ b/drivers/net/ena/ena_ethdev.h
@@ -162,6 +162,7 @@ struct ena_adapter {
 
u16 num_queues;
u16 max_mtu;
+   u8 tso4_supported;
 
int id_number;
char name[ENA_NAME_MAX_LEN];
-- 
1.9.1



[dpdk-dev] [PATCH] maintainers: update email address

2017-01-24 Thread Matej Vido
Signed-off-by: Matej Vido 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f071138..21d0ef9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -311,7 +311,7 @@ F: drivers/net/enic/
 F: doc/guides/nics/enic.rst
 
 Combo szedata2
-M: Matej Vido 
+M: Matej Vido 
 F: drivers/net/szedata2/
 F: doc/guides/nics/szedata2.rst
 
-- 
2.7.4



Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Olivier MATZ
On Tue, 24 Jan 2017 13:26:41 +, "Richardson, Bruce"
 wrote:
> 
> Ok. Point taken for the lower-core count parts.
> 
> Thomas - can at least one of these patches be merged into 17.02,
> since it will definitely help us developers? [If Olivier's is too big
> a change at this point in the cycle, can Ferruh's be taken as an
> intermediate fix for this release before merging Olivier's in 17.05?]

It looks to be a good option. Ferruh's patch is less risky for 17.02.

Regards,
Olivier



Re: [dpdk-dev] [PATCHv6 16/33] drivers/pool/dpaa2: adding hw offloaded mempool

2017-01-24 Thread Hemant Agrawal

On 1/24/2017 4:19 PM, Ferruh Yigit wrote:

On 1/24/2017 9:12 AM, Shreyansh Jain wrote:

On Monday 23 January 2017 11:04 PM, Ferruh Yigit wrote:

On 1/23/2017 11:59 AM, Hemant Agrawal wrote:

Adding NXP DPAA2 architecture specific mempool support
Each mempool instance is represented by a DPBP object
from the FSL-MC bus.

This patch also registers a dpaa2 type MEMPOOL OPS

Signed-off-by: Hemant Agrawal 
---

<...>


diff --git a/drivers/common/Makefile b/drivers/common/Makefile
index b52931c..0bb75b5 100644
--- a/drivers/common/Makefile
+++ b/drivers/common/Makefile
@@ -35,7 +35,11 @@ ifeq ($(CONFIG_RTE_LIBRTE_DPAA2_PMD),y)
 CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_DPAA2_PMD)
 endif

-ifeq ($(CONFIG_RTE_LIBRTE_FSLMC_BUS),y)
+ifeq ($(CONFIG_RTE_LIBRTE_DPAA2_POOL),y)
+CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_DPAA2_POOL)
+endif
+
+ifneq ($(CONFIG_RTE_LIBRTE_FSLMC_BUS),y)


I guess this is a typo, but this prevents DPAA2_COMMON to be compiled !!


It should be 'ifeq' rather than 'ifneq'.



And it will prevent COMMON
compilation only if CONFIG_RTE_LIBRTE_FSLMC_BUS=n which is not the case
right now.


It was the case for me for x86 config, but you are right it is not the
default case for arm.



We will fix it.




 CONFIG_RTE_LIBRTE_DPAA2_COMMON = $(CONFIG_RTE_LIBRTE_FSLMC_BUS)
 endif



<...>

+# library dependencies
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_common_dpaa2_qbman


This dependeny doesn not looks correct, there is no folder like that.


This is something even I need to understand. From the DEPDIRS what I
understood was that though it refers to a directory, it essentially
links libraries in build/lib/*.

Further, somehow the development is deploying drivers/bus,
drivers/common and drivers/pool in lib/* under the name specified as
LIB in Makefile. My understanding was that it is expected behavior and
not special because of drivers folder.

Thus, above line only links lib/librte_common_dpaa2_qbman generated by
drivers/common/dpaa2/qbman code.

In fact, I think, this might also one of the issues why a parallel
shared build fails for DPAA2 PMD (added in Cover letter).
The dependency graph cannot create a graph for drivers/common
as dependency for drivers/net or drivers/bus and hence parallel build
fails because of missing libraries which are being parallely compiled.


DEPDIRS-y is mainly to resolve dependencies for compilation order, and
should point to the folder,

Following line will cause "librte_eal" to be compiled before driver:
DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_eal

So "lib/librte_common_dpaa2_qbman" does not makes more sense, since
there is no folder like that.


Somewhere in the history, with following commit, DEPDIRS-y gained a side
effect, it has been used to set dynamic linking dependencies, to fix
underlinking issue:
 bf5a46fa5972 ("mk: generate internal library dependencies")

I guess you are having that line to benefit from this side effect, but
this can be done with following more properly:
LDLIBS += lib/librte_common_dpaa2_qbman


To resolve the drivers/net to drivers/common dependency, following line
in this Makefile should work:
DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += drivers/common/dpaa2

This adds following, which will cause "drivers/common" compiled before
any "drivers/net":
LOCAL_DEPDIRS-drivers/net += drivers/common


Thanks for your suggestion. This is one thing, I am not yet able to fix.

Based on your suggestions:
e.g.
LDLIBS += -lrte_common_dpaa2_qbman
DEPDIRS-$(CONFIG_RTE_LIBRTE_FSLMC_BUS) += drivers/common/dpaa2

It does add entry in the ".depdirs"
./arm64-dpaa2-linuxapp-gcc/.depdirs:168:LOCAL_DEPDIRS-drivers/bus += 
drivers/common

./arm64-dpaa2-linuxapp-gcc/.depdirs:170:LOCAL_DEPDIRS-drivers += lib
./arm64-dpaa2-linuxapp-gcc/.depdirs:172:LOCAL_DEPDIRS-drivers += lib
./arm64-dpaa2-linuxapp-gcc/.depdirs:174:LOCAL_DEPDIRS-drivers/pool += 
drivers/common


However,  we keep on getting:
LD librte_bus_fslmc.so.1.1
aarch64-linux-gnu-gcc: error: drivers/common/dpaa2: No such file or 
directory

make[6]: *** [librte_bus_fslmc.so.1.1] Error 1






+DEPDIRS-$(CONFIG_RTE_LIBRTE_DPAA2_POOL) += lib/librte_bus_fslmc
+
+include $(RTE_SDK)/mk/rte.lib.mk


<...>











Re: [dpdk-dev] [RFC] Add GRO support in DPDK

2017-01-24 Thread Wiles, Keith

> On Jan 24, 2017, at 3:33 AM, Ananyev, Konstantin 
>  wrote:
> 
> 
> 
>> -Original Message-
>> From: Wiles, Keith
>> Sent: Tuesday, January 24, 2017 5:26 AM
>> To: Ananyev, Konstantin 
>> Cc: Stephen Hemminger ; Hu, Jiayu 
>> ; dev@dpdk.org; Kinsella, Ray
>> ; Gilmore, Walter E ; 
>> Venkatesan, Venky ;
>> yuanhan@linux.intel.com
>> Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
>> 
>> 
>>> On Jan 23, 2017, at 6:43 PM, Ananyev, Konstantin 
>>>  wrote:
>>> 
>>> 
>>> 
 -Original Message-
 From: Wiles, Keith
 Sent: Monday, January 23, 2017 9:53 PM
 To: Stephen Hemminger 
 Cc: Hu, Jiayu ; dev@dpdk.org; Kinsella, Ray 
 ; Ananyev, Konstantin
 ; Gilmore, Walter E 
 ; Venkatesan, Venky
>> ;
 yuanhan@linux.intel.com
 Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
 
 
> On Jan 23, 2017, at 10:15 AM, Stephen Hemminger 
>  wrote:
> 
> On Mon, 23 Jan 2017 21:03:12 +0800
> Jiayu Hu  wrote:
> 
>> With the support of hardware segmentation techniques in DPDK, the
>> networking stack overheads of send-side of applications, which directly
>> leverage DPDK, have been greatly reduced. But for receive-side, numbers 
>> of
>> segmented packets seriously burden the networking stack of applications.
>> Generic Receive Offload (GRO) is a widely used method to solve the
>> receive-side issue, which gains performance by reducing the amount of
>> packets processed by the networking stack. But currently, DPDK doesn't
>> support GRO. Therefore, we propose to add GRO support in DPDK, and this
>> RFC is used to explain the basic DPDK GRO design.
>> 
>> DPDK GRO is a SW-based packets assembly library, which provides GRO
>> abilities for numbers of protocols. In DPDK GRO, packets are merged
>> before returning to applications and after receiving from drivers.
>> 
>> In DPDK, GRO is a capability of NIC drivers. That support GRO or not and
>> what GRO types are supported are up to NIC drivers. Different drivers may
>> support different GRO types. By default, drivers enable all supported GRO
>> types. For applications, they can inquire the supported GRO types by
>> each driver, and can control what GRO types are applied. For example,
>> ixgbe supports TCP and UDP GRO, but the application just needs TCP GRO.
>> The application can disable ixgbe UDP GRO.
>> 
>> To support GRO, a driver should provide a way to tell applications what
>> GRO types are supported, and provides a GRO function, which is in charge
>> of assembling packets. Since different drivers may support different GRO
>> types, their GRO functions may be different. For applications, they don't
>> need extra operations to enable GRO. But if there are some GRO types that
>> are not needed, applications can use an API, like
>> rte_eth_gro_disable_protocols, to disable them. Besides, they can
>> re-enable the disabled ones.
>> 
>> The GRO function processes numbers of packets at a time. In each
>> invocation, what GRO types are applied depends on applications, and the
>> amount of packets to merge depends on the networking status and
>> applications. Specifically, applications determine the maximum number of
>> packets to be processed by the GRO function, but how many packets are
>> actually processed depends on if there are available packets to receive.
>> For example, the receive-side application asks the GRO function to
>> process 64 packets, but the sender only sends 40 packets. At this time,
>> the GRO function returns after processing 40 packets. To reassemble the
>> given packets, the GRO function performs an "assembly procedure" on each
>> packet. We use an example to demonstrate this procedure. Supposing the
>> GRO function is going to process packetX, it will do the following two
>> things:
>>  a. Find a L4 assembly function according to the packet type of
>>  packetX. A L4 assembly function is in charge of merging packets of a
>>  specific type. For example, TCPv4 assembly function merges packets
>>  whose L3 IPv4 and L4 is TCP. Each L4 assembly function has a packet
>>  array, which keeps the packets that are unable to assemble.
>>  Initially, the packet array is empty;
>>  b. The L4 assembly function traverses own packet array to find a
>>  mergeable packet (comparing Ethernet, IP and L4 header fields). If
>>  finds, merges it and packetX via chaining them together; if doesn't,
>>  allocates a new array element to store packetX and updates element
>>  number of the array.
>> After performing the assembly procedure to all packets, the GRO function
>> combines the results of all packet arrays, and returns these packets to
>> applications.
>> 
>> There are lots of ways to implement the above design in DPDK. One of the
>> wa

Re: [dpdk-dev] [PATCH] mk: optimize directory dependencies

2017-01-24 Thread Wiles, Keith

> On Jan 24, 2017, at 7:50 AM, Olivier MATZ  wrote:
> 
> On Tue, 24 Jan 2017 13:26:41 +, "Richardson, Bruce"
>  wrote:
>> 
>> Ok. Point taken for the lower-core count parts.
>> 
>> Thomas - can at least one of these patches be merged into 17.02,
>> since it will definitely help us developers? [If Olivier's is too big
>> a change at this point in the cycle, can Ferruh's be taken as an
>> intermediate fix for this release before merging Olivier's in 17.05?]
> 
> It looks to be a good option. Ferruh's patch is less risky for 17.02.

+1 for Ferruh’s version now and Olivier’s later as it does not require -j.

> 
> Regards,
> Olivier
> 

Regards,
Keith



[dpdk-dev] NXP DPAA2: Symbol renaming issue: Request for Suggestions

2017-01-24 Thread Shreyansh Jain
Hello,

We are facing a peculiar problem with respect to symbol namespace in DPDK. I
think Ferruh and Thomas would have fair idea about it as they have already
reviewed and commented on it. I was hoping to get some input to take it
forward from here.

Brief Intro to DPAA2 Architecture:

This is brief about NXP's DPAA2 PMD to start with:
(A lot more information is available at [1])


 
 +---+   
 | Application   | 
 +..-+ 
  ||   
 +'--+   +-'-+ 
drivers/>|   DPIO|   |   DPIO|<---drivers/bus/fslmc
 bus/fslmc   +.--+   +--.+
  | |  
 +/-||--||--/+ 
 |   Queue/Buffer Manager|<--- drivers/common/dpaa2 
 
 +\-||--||--\+  qbman
  | | 
 +'--+   +--'+ 
drivers/ --->|   DPNI|   |   DPSEC   |<---drivers/cyrpto
 net/dpaa2   +|--+   +-|-+ dpaa2_sec
  ||  
 +|--+  +--|-+   ++
 | PHY H/W   |  |   SEC H/W  |  .> FSL MC BUS |
 +---+  ++ / ++
  drivers/bus/fslmc
  
  
If we consider the above layout, drivers/crypto/dpaa_sec (NXP's DPAA2 Crypto
PMD, already available on ML [2]), and drivers/net/dpaa2 (NXP's DPAA2 PMD) are
using a common code (drivers/common/dpaa2/qbman).

QBMAN (drivers/common/dpaa2/qbman) is essentially a Queue and Buffer Manager
set of APIs which allow DPIO (Data Path IO interfaces) to communicate with the
Hardware through queues (and buffers).

At the scan time, FSLMC bus is scanned and all devices (Phy or Sec) are
identified and added to a list. For each such device, appropriate I/O portals
are opened which are essentially gateway between user-space and DP* devices
using the hardware queues/buffers (qbman)

Problem:

You might have noticed that we have exposed a lot of symbols from
drivers/common and drivers/bus for drivers/net and drivers/crypto. All these 
symbols are not rte_* as what has been suggested for exported symbols.

Review comments have been received for renaming these to make them rte_* or
_rte_* prefixed.

Just as a side note, these symbols are being exposed _internally_ within
drivers/* area. 

There are (3) possible solutions we have:

1/ Rename all the symbols:
  - This is a difficult option for us. Renaming means breaking our linkage
with existing code (Linux Kernel upstream candidate as well as internal
repository).
  - Changing it means maintaining this change set internally/independently
which is not a feasible long term solution.

2/ Merge all the libraries together:
  - In the initial RFC days, there were review comments which suggested that
we should break the PMD into common libraries and place it in drivers/*
parallel folders.
  - This is precisely the reason we are facing the situation.
  - Another possibility is to start duplicating the code for common. But, this
too has a technical limitation for us as some data structures are shared
across net and crypto and it is not possible to have multiple instances of
those.
  - One more offshoot option could have been to keep the library external
of the DPDK framework (external location and linked on demand basis,
manually). We don't prefer this as this will make it difficult for any user
to use DPAA2 easily.

3/ Finding a way to keep symbols internal to drivers/* independent of rte_*
   prefix:
   - For example, allowing symbols to be exposed limited to drivers/* area
 and not allowing them to be available across lib/* (not sure how, though!)
   
   
   
   My argument for this:
  - With new bus infra in place, there would be more drivers being contributed.
It also means that there would be PMDs having their own code and symbol
models. It would be difficult to ask all of them to mandatorily adhere
to a naming scheme.
This argument bodes well for lib/* because that is core (libraries) which
should be controlled for uniformity and performance.

[1] https://www.kernel.org/doc/readme/drivers-staging-fsl-mc-README.txt
[2] http://dpdk.org/ml/archives/dev/2017-January/054251.html



[dpdk-dev] [PATCH v2] ethdev: fix multi-process NULL dereference crashes

2017-01-24 Thread Remy Horton
Secondary processes were blanket zeroing ethernet device memory,
resulting in NULL dereference crashes in multi-process setups.

Fixes: 7f95f78a8aea ("ethdev: clear data when allocating device")

Signed-off-by: Remy Horton 
---
 doc/guides/rel_notes/release_17_02.rst | 5 +
 lib/librte_ether/rte_ethdev.c  | 4 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_17_02.rst 
b/doc/guides/rel_notes/release_17_02.rst
index 0ecd720..1472f84 100644
--- a/doc/guides/rel_notes/release_17_02.rst
+++ b/doc/guides/rel_notes/release_17_02.rst
@@ -222,6 +222,11 @@ Drivers
   Fixed few regressions introduced in recent releases that break the virtio
   multiple process support.
 
+* **ethdev: Fixed crash with multi-processing.**
+
+  Secondary processes were blanket zeroing ethernet device memory,
+  resulting in NULL dereference crashes in multi-process setups.
+
 
 Libraries
 ~
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 61f44e2..d911921 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -225,8 +225,10 @@ rte_eth_dev_allocate(const char *name)
return NULL;
}
 
-   memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
eth_dev = eth_dev_get(port_id);
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   memset(&rte_eth_dev_data[port_id], 0,
+   sizeof(struct rte_eth_dev_data));
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
eth_dev->data->port_id = port_id;
eth_dev->data->mtu = ETHER_MTU;
-- 
2.5.5



Re: [dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 2:02 PM, Matej Vido wrote:
> On 24.01.2017 12:58, Ferruh Yigit wrote:
>> On 1/24/2017 10:49 AM, Matej Vido wrote:
>>> Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")
>>>
>>> Signed-off-by: Matej Vido 
>> Unrelated from this patch, in maintainers file, you have your other mail
>> address: "Matej Vido ", do you want to update it?
> Hi Ferruh,
> 
> yes, I will send the patch.
>>
>>> ---
>>>   drivers/net/szedata2/rte_eth_szedata2.h | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/szedata2/rte_eth_szedata2.h 
>>> b/drivers/net/szedata2/rte_eth_szedata2.h
>>> index b58adb6..afe8a38 100644
>>> --- a/drivers/net/szedata2/rte_eth_szedata2.h
>>> +++ b/drivers/net/szedata2/rte_eth_szedata2.h
>>> @@ -192,7 +192,7 @@ struct szedata {
>>>   }
>>>   
>>>   #define SZEDATA2_PCI_RESOURCE_PTR(rsc, offset, type) \
>>> -   ((type)((uint8_t *)(rsc)->addr) + (offset))
>>> +   ((type)(((uint8_t *)(rsc)->addr) + (offset)))
>> Although output will be same, (in all uses, type is a pointer), 

Of course won't be same, please forget about it J, I am confused.

So these two differs a lot, taking into account that offset numbers used
are big numbers (0x8000..), it should be accessing very unrelated addresses.

So how this was working before?

> this
>> seems the intention, so:
>>
>> Reviewed-by: Ferruh Yigit 
>>
>> btw, following will do same, right, not sure if it is better:
>> ((type)(rsc)->addr + (offset))
> This is also wrong. The intention of the macro is to add an offset to 
> the base address and typecast the result.
> 
> Regards,
> Matej
>>
>>>   
>>>   enum szedata2_link_speed {
>>> SZEDATA2_LINK_SPEED_DEFAULT = 0,
>>>
> 



Re: [dpdk-dev] [PATCH v3 03/10] crypto/dpaa2_sec: add dpaa2_sec poll mode driver

2017-01-24 Thread Neil Horman
On Tue, Jan 24, 2017 at 12:04:09PM +0530, Akhil Goyal wrote:
> On 1/21/2017 1:01 AM, Neil Horman wrote:
> > On Fri, Jan 20, 2017 at 06:47:49PM +0530, Akhil Goyal wrote:
> > > On 1/20/2017 6:02 PM, Neil Horman wrote:
> > > > On Fri, Jan 20, 2017 at 07:35:02PM +0530, akhil.go...@nxp.com wrote:
> > > > > From: Akhil Goyal 
> > > > > 
> > > > > Signed-off-by: Hemant Agrawal 
> > > > > Signed-off-by: Akhil Goyal 
> > > > > ---
> > > > >  config/common_base |   8 +
> > > > >  config/defconfig_arm64-dpaa2-linuxapp-gcc  |  12 +
> > > > >  drivers/bus/Makefile   |   3 +
> > > > >  drivers/common/Makefile|   3 +
> > > > >  drivers/crypto/Makefile|   1 +
> > > > >  drivers/crypto/dpaa2_sec/Makefile  |  77 +
> > > > >  drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c| 374 
> > > > > +
> > > > >  drivers/crypto/dpaa2_sec/dpaa2_sec_logs.h  |  70 
> > > > >  drivers/crypto/dpaa2_sec/dpaa2_sec_priv.h  | 225 
> > > > > +
> > > > >  .../crypto/dpaa2_sec/rte_pmd_dpaa2_sec_version.map |   4 +
> > > > >  drivers/net/dpaa2/Makefile |   1 +
> > > > >  drivers/pool/Makefile  |   4 +
> > > > >  mk/rte.app.mk  |   6 +
> > > > >  13 files changed, 788 insertions(+)
> > > > >  create mode 100644 drivers/crypto/dpaa2_sec/Makefile
> > > > >  create mode 100644 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
> > > > >  create mode 100644 drivers/crypto/dpaa2_sec/dpaa2_sec_logs.h
> > > > >  create mode 100644 drivers/crypto/dpaa2_sec/dpaa2_sec_priv.h
> > > > >  create mode 100644 
> > > > > drivers/crypto/dpaa2_sec/rte_pmd_dpaa2_sec_version.map
> > > > > 
> > > > NAK, you're trying to patch driver/bus/Makefile, which doesn't exist in 
> > > > the
> > > > upstream tree, please fix your patch.
> > > > 
> > > > I'm also opposed to the inclusion of pmds that require non-open external
> > > > libraries as your documentation suggests that you require.  If you need 
> > > > an out
> > > > of tree library to support your hardware, you will recieve no benefit 
> > > > from the
> > > > upstream community in terms of testing and maintenence, nor will the 
> > > > community
> > > > be able to work with your hardware on arches that your library doesn't 
> > > > support.
> > > > 
> > > > Neil
> > > > 
> > > Thanks for your comments Neil.
> > > dpaa2_sec driver is dependent on dpaa2 driver which is in review in other
> > > thread. I have mentioned that in the cover letter.
> > > Its latest version is http://dpdk.org/dev/patchwork/patch/19782/
> > > 
> > Sorry, I missed that comment, I'll go find the other thread and take another
> > look
> > 
> > > Also there is no external library used. The libraries that are mentioned 
> > > in
> > > the documentation are all part of the above dpaa2 driver patchset.
> > > 
> > Your documentation patch doesn't seem to suggest that.  From the patch:
> > 
> > +This driver relies on external libraries and kernel drivers for resources
> > +allocations and initialization. The following dependencies are not part of
> > +DPDK and must be installed separately:
> > +
> > +- **NXP Linux SDK**
> > +
> > +  NXP Linux software development kit (SDK) includes support for family
> > +  of QorIQ® ARM-Architecture-based system on chip (SoC) processors
> > +  and corresponding boards.
> > 
> > 
> > 
> > If thats not the case, you should update the documentation.  If it is the 
> > case,
> > I think my initial comment is still valid...
> > 
> > Regards
> > Neil
> > 
> > > -Akhil
> > > 
> > > 
> > > 
> > 
> Thanks for pointing this out. I will update the documentation.
> 

I found the v6 patch series that you referenced, and this set still doesn't
apply cleanly to it.  Theres a conflict in one of the makefiles.


> Regards,
> Akhil
> 
> 


Re: [dpdk-dev] [PATCH] net/i40e: fix parsing tunnel filter issue

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 2:44 AM, Beilei Xing wrote:
> VNI of VXLAN is parsed wrongly. The root cause is that
> array vni in item VXLAN also uses network byte ordering.
> 
> Fixes: d416530e6358 ("net/i40e: parse tunnel filter")
> 
> Signed-off-by: Beilei Xing 
> ---
>  drivers/net/i40e/i40e_flow.c | 22 ++
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_flow.c b/drivers/net/i40e/i40e_flow.c
> index 76bb332..51b3223 100644
> --- a/drivers/net/i40e/i40e_flow.c
> +++ b/drivers/net/i40e/i40e_flow.c
> @@ -1196,6 +1196,20 @@ i40e_check_tenant_id_mask(const uint8_t *mask)
>   return is_masked;
>  }
>  
> +static uint32_t
> +i40e_flow_set_tenant_id(const uint8_t *vni)
> +{
> + uint32_t tenant_id;
> +
> +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> + tenant_id = (vni[0] << 16) | (vni[1] << 8) | vni[2];
> +#else
> + tenant_id = vni[0] | (vni[1] << 8) | (vni[2] << 16);
> +#endif

Instead of a new function, will following do the same?:

uint32_t tenant_id_be= 0;
rte_memcpy(((uint8_t *)&tenant_id_be + 1), vxlan_spec->vni, 3)
filter->tenant_id = rte_be_to_cpu(tenant_id_be);

I think it is easier to understand, what do you think?

Thanks,
ferruh



[dpdk-dev] [RFC 1/8] mbuf: make segment prefree function public

2017-01-24 Thread Olivier Matz
Document the function and make it public, since it is used at several
places in the drivers. The old one is marked as deprecated.

Signed-off-by: Olivier Matz 
---
 drivers/net/enic/enic_rxtx.c  |  2 +-
 drivers/net/fm10k/fm10k_rxtx.c|  6 +++---
 drivers/net/fm10k/fm10k_rxtx_vec.c|  6 +++---
 drivers/net/i40e/i40e_rxtx_vec_common.h   |  6 +++---
 drivers/net/ixgbe/ixgbe_rxtx.c|  2 +-
 drivers/net/ixgbe/ixgbe_rxtx_vec_common.h |  6 +++---
 drivers/net/virtio/virtio_rxtx_simple.h   |  6 +++---
 lib/librte_mbuf/rte_mbuf.h| 30 +++---
 8 files changed, 44 insertions(+), 20 deletions(-)

diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index 26b83ae..f8c8ad0 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -473,7 +473,7 @@ static inline void enic_free_wq_bufs(struct vnic_wq *wq, 
u16 completed_index)
pool = ((struct rte_mbuf *)buf->mb)->pool;
for (i = 0; i < nb_to_free; i++) {
buf = &wq->bufs[tail_idx];
-   m = __rte_pktmbuf_prefree_seg((struct rte_mbuf *)(buf->mb));
+   m = rte_pktmbuf_prefree_seg((struct rte_mbuf *)(buf->mb));
buf->mb = NULL;
 
if (unlikely(m == NULL)) {
diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index 144e5e6..c9bb04a 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -434,12 +434,12 @@ static inline void tx_free_bulk_mbuf(struct rte_mbuf 
**txep, int num)
if (unlikely(num == 0))
return;
 
-   m = __rte_pktmbuf_prefree_seg(txep[0]);
+   m = rte_pktmbuf_prefree_seg(txep[0]);
if (likely(m != NULL)) {
free[0] = m;
nb_free = 1;
for (i = 1; i < num; i++) {
-   m = __rte_pktmbuf_prefree_seg(txep[i]);
+   m = rte_pktmbuf_prefree_seg(txep[i]);
if (likely(m != NULL)) {
if (likely(m->pool == free[0]->pool))
free[nb_free++] = m;
@@ -455,7 +455,7 @@ static inline void tx_free_bulk_mbuf(struct rte_mbuf 
**txep, int num)
rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
} else {
for (i = 1; i < num; i++) {
-   m = __rte_pktmbuf_prefree_seg(txep[i]);
+   m = rte_pktmbuf_prefree_seg(txep[i]);
if (m != NULL)
rte_mempool_put(m->pool, m);
txep[i] = NULL;
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 27f3e43..825e3c1 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -754,12 +754,12 @@ fm10k_tx_free_bufs(struct fm10k_tx_queue *txq)
 * next_dd - (rs_thresh-1)
 */
txep = &txq->sw_ring[txq->next_dd - (n - 1)];
-   m = __rte_pktmbuf_prefree_seg(txep[0]);
+   m = rte_pktmbuf_prefree_seg(txep[0]);
if (likely(m != NULL)) {
free[0] = m;
nb_free = 1;
for (i = 1; i < n; i++) {
-   m = __rte_pktmbuf_prefree_seg(txep[i]);
+   m = rte_pktmbuf_prefree_seg(txep[i]);
if (likely(m != NULL)) {
if (likely(m->pool == free[0]->pool))
free[nb_free++] = m;
@@ -774,7 +774,7 @@ fm10k_tx_free_bufs(struct fm10k_tx_queue *txq)
rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
} else {
for (i = 1; i < n; i++) {
-   m = __rte_pktmbuf_prefree_seg(txep[i]);
+   m = rte_pktmbuf_prefree_seg(txep[i]);
if (m != NULL)
rte_mempool_put(m->pool, m);
}
diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h 
b/drivers/net/i40e/i40e_rxtx_vec_common.h
index 3745558..76031fe 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -123,12 +123,12 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
  * tx_next_dd - (tx_rs_thresh-1)
  */
txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
-   m = __rte_pktmbuf_prefree_seg(txep[0].mbuf);
+   m = rte_pktmbuf_prefree_seg(txep[0].mbuf);
if (likely(m != NULL)) {
free[0] = m;
nb_free = 1;
for (i = 1; i < n; i++) {
-   m = __rte_pktmbuf_prefree_seg(txep[i].mbuf);
+   m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
if (likely(m != NULL)) {
if (likely(m->pool == free[0]->pool)) {
free[nb_free++] = m;
@@ -144,7 +144,7 @@ i40e_t

[dpdk-dev] [RFC 2/8] mbuf: make raw free function public

2017-01-24 Thread Olivier Matz
Rename __rte_mbuf_raw_free() as rte_mbuf_raw_free() and make
it public. The old function is kept for compat but is marked as
deprecated.

The next commit changes the behavior of rte_mbuf_raw_free() to
make it more consistent with rte_mbuf_raw_alloc().

Signed-off-by: Olivier Matz 
---
 drivers/net/ena/ena_ethdev.c |  2 +-
 drivers/net/mlx5/mlx5_rxtx.c |  6 +++---
 drivers/net/mpipe/mpipe_tilegx.c |  2 +-
 lib/librte_mbuf/rte_mbuf.h   | 22 --
 4 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 8497cd7..4aac6a9 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -685,7 +685,7 @@ static void ena_rx_queue_release_bufs(struct ena_ring *ring)
ring->rx_buffer_info[ring->next_to_clean & ring_mask];
 
if (m)
-   __rte_mbuf_raw_free(m);
+   rte_mbuf_raw_free(m);
 
ring->next_to_clean++;
}
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 56c7f78..a518a42 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1328,7 +1328,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
assert(pkt != (*rxq->elts)[idx]);
rep = NEXT(pkt);
rte_mbuf_refcnt_set(pkt, 0);
-   __rte_mbuf_raw_free(pkt);
+   rte_mbuf_raw_free(pkt);
pkt = rep;
}
break;
@@ -1339,13 +1339,13 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
   &rss_hash_res);
if (!len) {
rte_mbuf_refcnt_set(rep, 0);
-   __rte_mbuf_raw_free(rep);
+   rte_mbuf_raw_free(rep);
break;
}
if (unlikely(len == -1)) {
/* RX error, packet is likely too large. */
rte_mbuf_refcnt_set(rep, 0);
-   __rte_mbuf_raw_free(rep);
+   rte_mbuf_raw_free(rep);
++rxq->stats.idropped;
goto skip;
}
diff --git a/drivers/net/mpipe/mpipe_tilegx.c b/drivers/net/mpipe/mpipe_tilegx.c
index 7bbd168..eedc0b3 100644
--- a/drivers/net/mpipe/mpipe_tilegx.c
+++ b/drivers/net/mpipe/mpipe_tilegx.c
@@ -549,7 +549,7 @@ mpipe_recv_flush_stack(struct mpipe_dev_priv *priv)
mbuf->data_len= 0;
mbuf->pkt_len = 0;
 
-   __rte_mbuf_raw_free(mbuf);
+   rte_mbuf_raw_free(mbuf);
}
 }
 
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 73b79c0..8ff2290 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -788,20 +788,30 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct 
rte_mempool *mp)
 }
 
 /**
- * @internal Put mbuf back into its original mempool.
- * The use of that function is reserved for RTE internal needs.
- * Please use rte_pktmbuf_free().
+ * Put mbuf back into its original mempool.
+ *
+ * The caller must ensure that the mbuf is direct and that the
+ * reference counter is 0.
  *
  * @param m
  *   The mbuf to be freed.
  */
 static inline void __attribute__((always_inline))
-__rte_mbuf_raw_free(struct rte_mbuf *m)
+rte_mbuf_raw_free(struct rte_mbuf *m)
 {
+   RTE_ASSERT(RTE_MBUF_DIRECT(m));
RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0);
rte_mempool_put(m->pool, m);
 }
 
+/* compat with older versions */
+__rte_deprecated
+static inline void __attribute__((always_inline))
+__rte_mbuf_raw_free(struct rte_mbuf *m)
+{
+   rte_mbuf_raw_free(m);
+}
+
 /* Operations on ctrl mbuf */
 
 /**
@@ -1209,7 +1219,7 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
m->ol_flags = 0;
 
if (rte_mbuf_refcnt_update(md, -1) == 0)
-   __rte_mbuf_raw_free(md);
+   rte_mbuf_raw_free(md);
 }
 
 /**
@@ -1264,7 +1274,7 @@ rte_pktmbuf_free_seg(struct rte_mbuf *m)
m = rte_pktmbuf_prefree_seg(m);
if (likely(m != NULL)) {
m->next = NULL;
-   __rte_mbuf_raw_free(m);
+   rte_mbuf_raw_free(m);
}
 }
 
-- 
2.8.1



[dpdk-dev] [RFC 0/8] mbuf: structure reorganization

2017-01-24 Thread Olivier Matz
Based on discussion done in [1], this patchset reorganizes the mbuf.

The main changes are:
- reorder structure to increase vector performance on some non-ia
  platforms.
- add a 64bits timestamp field in the 1st cache line
- m->next, m->nb_segs, and m->refcnt are always initialized for mbufs
  in the pool, avoiding the need of setting m->next (located in the
  2nd cache line) in the Rx path for mono-segment packets.
- change port and nb_segs to 16 bits
- move seqn in the 2nd cache line

Things discussed but not done in the patchset:
- move refcnt and nb_segs to the 2nd cache line: many drivers sets
  them in the Rx path, so it could introduce a performance regression, or
  it would require to change all the drivers, which is not an easy task.
- remove the m->port field: too much impact on many examples and libraries,
  and some people highlighted they are using it.
- moving m->next in the 1st cache line: there is not enough room, and having
  it set to NULL for unused mbuf should remove the need for it.
- merge seqn and timestamp together in a union: we could imagine use cases
  were both are activated. There is no flag indicating the presence of seqn,
  so it looks preferable to keep them separated for now.

I made some basic performance tests (ixgbe) and see no regression, but
the patchset requires more testing.

[1] http://dpdk.org/ml/archives/dev/2016-October/049338.html


Jerin Jacob (1):
  mbuf: make rearm data address naturally aligned

Olivier Matz (7):
  mbuf: make segment prefree function public
  mbuf: make raw free function public
  mbuf: set mbuf fields while in pool
  net: don't touch mbuf next or nb segs on Rx
  mbuf: use 2 bytes for port and nb segments
  mbuf: move sequence number in second cache line
  mbuf: add a timestamp field

 app/test-pmd/csumonly.c|   4 +-
 drivers/net/ena/ena_ethdev.c   |   2 +-
 drivers/net/enic/enic_rxtx.c   |   2 +-
 drivers/net/fm10k/fm10k_rxtx.c |   6 +-
 drivers/net/fm10k/fm10k_rxtx_vec.c |   9 +-
 drivers/net/i40e/i40e_rxtx_vec_common.h|   6 +-
 drivers/net/i40e/i40e_rxtx_vec_sse.c   |  11 +-
 drivers/net/ixgbe/ixgbe_rxtx.c |  10 +-
 drivers/net/ixgbe/ixgbe_rxtx_vec_common.h  |   6 +-
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c|   9 --
 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c |   9 --
 drivers/net/mlx5/mlx5_rxtx.c   |  11 +-
 drivers/net/mpipe/mpipe_tilegx.c   |   3 +-
 drivers/net/null/rte_eth_null.c|   2 -
 drivers/net/virtio/virtio_rxtx.c   |   3 -
 drivers/net/virtio/virtio_rxtx_simple.h|   6 +-
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |   5 +-
 lib/librte_mbuf/rte_mbuf.c |   4 +
 lib/librte_mbuf/rte_mbuf.h | 114 -
 19 files changed, 124 insertions(+), 98 deletions(-)

-- 
2.8.1



[dpdk-dev] [RFC 3/8] mbuf: set mbuf fields while in pool

2017-01-24 Thread Olivier Matz
Set the value of m->refcnt to 1, m->nb_segs to 1 and m->next
to NULL when the mbuf is stored inside the mempool (unused).
This is done in rte_pktmbuf_prefree_seg(), before freeing or
recycling a mbuf.

Before this patch, the value of m->refcnt was expected to be 0
while in pool.

The objectives are:

- to avoid drivers to set m->next to NULL in the early Rx path, since
  this field is in the second 64B of the mbuf and its access could
  trigger a cache miss

- rationalize the behavior of raw_alloc/raw_free: one is now the
  symmetric of the other, and refcnt is never changed in these functions.

Signed-off-by: Olivier Matz 
---
 drivers/net/mlx5/mlx5_rxtx.c |  5 ++---
 drivers/net/mpipe/mpipe_tilegx.c |  1 +
 lib/librte_mbuf/rte_mbuf.c   |  2 ++
 lib/librte_mbuf/rte_mbuf.h   | 45 +---
 4 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index a518a42..294dfde 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1327,7 +1327,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
while (pkt != seg) {
assert(pkt != (*rxq->elts)[idx]);
rep = NEXT(pkt);
-   rte_mbuf_refcnt_set(pkt, 0);
+   NEXT(pkt) = NULL;
+   NB_SEGS(pkt) = 1;
rte_mbuf_raw_free(pkt);
pkt = rep;
}
@@ -1338,13 +1339,11 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, 
uint16_t pkts_n)
len = mlx5_rx_poll_len(rxq, cqe, cqe_cnt,
   &rss_hash_res);
if (!len) {
-   rte_mbuf_refcnt_set(rep, 0);
rte_mbuf_raw_free(rep);
break;
}
if (unlikely(len == -1)) {
/* RX error, packet is likely too large. */
-   rte_mbuf_refcnt_set(rep, 0);
rte_mbuf_raw_free(rep);
++rxq->stats.idropped;
goto skip;
diff --git a/drivers/net/mpipe/mpipe_tilegx.c b/drivers/net/mpipe/mpipe_tilegx.c
index eedc0b3..560ffe9 100644
--- a/drivers/net/mpipe/mpipe_tilegx.c
+++ b/drivers/net/mpipe/mpipe_tilegx.c
@@ -548,6 +548,7 @@ mpipe_recv_flush_stack(struct mpipe_dev_priv *priv)
mbuf->packet_type = 0;
mbuf->data_len= 0;
mbuf->pkt_len = 0;
+   mbuf->next= NULL;
 
rte_mbuf_raw_free(mbuf);
}
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 72ad91e..0acc810 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -145,6 +145,8 @@ rte_pktmbuf_init(struct rte_mempool *mp,
m->pool = mp;
m->nb_segs = 1;
m->port = 0xff;
+   rte_mbuf_refcnt_set(m, 1);
+   m->next = NULL;
 }
 
 /* helper to create a mbuf pool */
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 8ff2290..bbd0700 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -766,6 +766,11 @@ rte_mbuf_sanity_check(const struct rte_mbuf *m, int 
is_header);
  * initializing all the required fields. See rte_pktmbuf_reset().
  * For standard needs, prefer rte_pktmbuf_alloc().
  *
+ * The caller can expect that the following fields of the mbuf structure
+ * are initialized: buf_addr, buf_physaddr, buf_len, refcnt=1, nb_segs=1,
+ * next=NULL, pool, priv_size. The other fields must be initialized
+ * by the caller.
+ *
  * @param mp
  *   The mempool from which mbuf is allocated.
  * @return
@@ -780,8 +785,9 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct 
rte_mempool *mp)
if (rte_mempool_get(mp, &mb) < 0)
return NULL;
m = (struct rte_mbuf *)mb;
-   RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0);
-   rte_mbuf_refcnt_set(m, 1);
+   RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
+   RTE_ASSERT(m->next == NULL);
+   RTE_ASSERT(m->nb_segs == 1);
__rte_mbuf_sanity_check(m, 0);
 
return m;
@@ -790,8 +796,13 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct 
rte_mempool *mp)
 /**
  * Put mbuf back into its original mempool.
  *
- * The caller must ensure that the mbuf is direct and that the
- * reference counter is 0.
+ * The caller must ensure that the mbuf is direct and properly
+ * reinitialized (refcnt=1, next=NULL, nb_segs=1), as done by
+ * rte_pktmbuf_prefree_seg().
+ *
+ * This function should be used with care, when optimization is
+ * required. For standard needs, prefer rte_pktmbuf_free() or
+ * rte_pktmbuf_free_seg().
  *
  * @param m

[dpdk-dev] [RFC 6/8] mbuf: use 2 bytes for port and nb segments

2017-01-24 Thread Olivier Matz
It is now possible to reference a port identifier larger than 256
and have a mbuf chain larger than 256 segments.

Signed-off-by: Olivier Matz 
---
 app/test-pmd/csumonly.c   | 4 ++--
 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h | 4 ++--
 lib/librte_mbuf/rte_mbuf.h| 5 ++---
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 88cc842..5eaff9b 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -583,7 +583,7 @@ pkt_copy_split(const struct rte_mbuf *pkt)
rc = mbuf_copy_split(pkt, md, seglen, nb_seg);
if (rc < 0)
RTE_LOG(ERR, USER1,
-   "mbuf_copy_split for %p(len=%u, nb_seg=%hhu) "
+   "mbuf_copy_split for %p(len=%u, nb_seg=%u) "
"into %u segments failed with error code: %d\n",
pkt, pkt->pkt_len, pkt->nb_segs, nb_seg, rc);
 
@@ -801,7 +801,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
char buf[256];
 
printf("-\n");
-   printf("port=%u, mbuf=%p, pkt_len=%u, nb_segs=%hhu:\n",
+   printf("port=%u, mbuf=%p, pkt_len=%u, nb_segs=%u:\n",
fs->rx_port, m, m->pkt_len, m->nb_segs);
/* dump rx parsed packet info */
rte_get_rx_ol_flag_list(rx_ol_flags, buf, sizeof(buf));
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h 
b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index f24f79f..2ac879f 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -118,8 +118,8 @@ struct rte_kni_mbuf {
uint64_t buf_physaddr;
uint16_t data_off;  /**< Start address of data in segment buffer. */
char pad1[2];
-   uint8_t nb_segs;/**< Number of segments. */
-   char pad4[3];
+   uint16_t nb_segs;   /**< Number of segments. */
+   char pad4[2];
uint64_t ol_flags;  /**< Offload features. */
char pad2[4];
uint32_t pkt_len;   /**< Total pkt len: sum of all segment 
data_len. */
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index cac31c9..de72314 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -408,9 +408,8 @@ struct rte_mbuf {
rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
uint16_t refcnt;  /**< Non-atomically accessed 
refcnt */
};
-   uint8_t nb_segs;  /**< Number of segments. */
-   uint8_t port; /**< Input port. */
-   uint16_t pad; /**< 2B pad for naturally aligned ol_flags */
+   uint16_t nb_segs; /**< Number of segments. */
+   uint16_t port;/**< Input port. */
 
uint64_t ol_flags;/**< Offload features. */
 
-- 
2.8.1



[dpdk-dev] [RFC 7/8] mbuf: move sequence number in second cache line

2017-01-24 Thread Olivier Matz
Move this field in the second cache line, since no driver use it
in Rx path. The freed space will be used by a timestamp in next
commit.

Signed-off-by: Olivier Matz 
---
 lib/librte_mbuf/rte_mbuf.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index de72314..39df3e1 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -465,8 +465,6 @@ struct rte_mbuf {
uint32_t usr; /**< User defined tags. See 
rte_distributor_process() */
} hash;   /**< hash information */
 
-   uint32_t seqn; /**< Sequence number. See also rte_reorder_insert() */
-
/** Outer VLAN TCI (CPU order), valid if PKT_RX_QINQ_STRIPPED is set. */
uint16_t vlan_tci_outer;
 
@@ -511,6 +509,10 @@ struct rte_mbuf {
 
/** Timesync flags for use with IEEE1588. */
uint16_t timesync;
+
+   /** Sequence number. See also rte_reorder_insert(). */
+   uint32_t seqn;
+
 } __rte_cache_aligned;
 
 /**
-- 
2.8.1



[dpdk-dev] [RFC 8/8] mbuf: add a timestamp field

2017-01-24 Thread Olivier Matz
The field itself is not fully described yet, but this commit reserves
the room in the mbuf.

Signed-off-by: Olivier Matz 
---
 lib/librte_mbuf/rte_mbuf.c |  2 ++
 lib/librte_mbuf/rte_mbuf.h | 10 ++
 2 files changed, 12 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 0acc810..f679bce 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -322,6 +322,7 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
case PKT_RX_IEEE1588_TMST: return "PKT_RX_IEEE1588_TMST";
case PKT_RX_QINQ_STRIPPED: return "PKT_RX_QINQ_STRIPPED";
case PKT_RX_LRO: return "PKT_RX_LRO";
+   case PKT_RX_TIMESTAMP: return "PKT_RX_TIMESTAMP";
default: return NULL;
}
 }
@@ -356,6 +357,7 @@ rte_get_rx_ol_flag_list(uint64_t mask, char *buf, size_t 
buflen)
{ PKT_RX_IEEE1588_TMST, PKT_RX_IEEE1588_TMST, NULL },
{ PKT_RX_QINQ_STRIPPED, PKT_RX_QINQ_STRIPPED, NULL },
{ PKT_RX_LRO, PKT_RX_LRO, NULL },
+   { PKT_RX_TIMESTAMP, PKT_RX_TIMESTAMP, NULL },
};
const char *name;
unsigned int i;
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 39df3e1..4818e2f 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -177,6 +177,11 @@ extern "C" {
  */
 #define PKT_RX_LRO   (1ULL << 16)
 
+/**
+ * Indicate that the timestamp field in the mbuf is valid.
+ */
+#define PKT_RX_TIMESTAMP (1ULL << 17)
+
 /* add new RX flags here */
 
 /* add new TX flags here */
@@ -469,6 +474,10 @@ struct rte_mbuf {
uint16_t vlan_tci_outer;
 
uint16_t buf_len; /**< Length of segment buffer. */
+
+   /** Valid if PKT_RX_TIMESTAMP is set. The unit is nanoseconds */
+   uint64_t timestamp;
+
/* second cache line - fields only used in slow path or on TX */
MARKER cacheline1 __rte_cache_min_aligned;
 
@@ -1197,6 +1206,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf 
*mi, struct rte_mbuf *m)
mi->nb_segs = 1;
mi->ol_flags = m->ol_flags | IND_ATTACHED_MBUF;
mi->packet_type = m->packet_type;
+   mi->timestamp = m->timestamp;
 
__rte_mbuf_sanity_check(mi, 1);
__rte_mbuf_sanity_check(m, 0);
-- 
2.8.1



[dpdk-dev] [RFC 4/8] net: don't touch mbuf next or nb segs on Rx

2017-01-24 Thread Olivier Matz
Now that the m->next pointer and m->nb_segs is expected to be set (to
NULL and 1 respectively) after a mempool_get(), we can avoid to write them
in the Rx functions of drivers.

Only some drivers are patched, it's not an exhaustive patch. It gives
the idea to do the same in other drivers.

Signed-off-by: Olivier Matz 
---
 drivers/net/i40e/i40e_rxtx_vec_sse.c| 6 --
 drivers/net/ixgbe/ixgbe_rxtx.c  | 8 
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 6 --
 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c  | 6 --
 drivers/net/null/rte_eth_null.c | 2 --
 drivers/net/virtio/virtio_rxtx.c| 3 ---
 6 files changed, 31 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c 
b/drivers/net/i40e/i40e_rxtx_vec_sse.c
index 7c84a41..33bc121 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_sse.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_sse.c
@@ -403,12 +403,6 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct 
rte_mbuf **rx_pkts,
/* store the resulting 32-bit value */
*(int *)split_packet = _mm_cvtsi128_si32(eop_bits);
split_packet += RTE_I40E_DESCS_PER_LOOP;
-
-   /* zero-out next pointers */
-   rx_pkts[pos]->next = NULL;
-   rx_pkts[pos + 1]->next = NULL;
-   rx_pkts[pos + 2]->next = NULL;
-   rx_pkts[pos + 3]->next = NULL;
}
 
/* C.3 calc available number of desc */
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index dd53cc6..2c9e342 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1548,8 +1548,6 @@ ixgbe_rx_alloc_bufs(struct ixgbe_rx_queue *rxq, bool 
reset_mbuf)
/* populate the static rte mbuf fields */
mb = rxep[i].mbuf;
if (reset_mbuf) {
-   mb->next = NULL;
-   mb->nb_segs = 1;
mb->port = rxq->port_id;
}
 
@@ -2157,12 +2155,6 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf 
**rx_pkts, uint16_t nb_pkts,
goto next_desc;
}
 
-   /*
-* This is the last buffer of the received packet - return
-* the current cluster to the user.
-*/
-   rxm->next = NULL;
-
/* Initialize the first mbuf of the returned packet */
ixgbe_fill_cluster_head_buf(first_seg, &rxd, rxq, staterr);
 
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index f96cc85..63f2556 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
@@ -333,12 +333,6 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct 
rte_mbuf **rx_pkts,
*(int *)split_packet = ~stat & IXGBE_VPMD_DESC_EOP_MASK;
 
split_packet += RTE_IXGBE_DESCS_PER_LOOP;
-
-   /* zero-out next pointers */
-   rx_pkts[pos]->next = NULL;
-   rx_pkts[pos + 1]->next = NULL;
-   rx_pkts[pos + 2]->next = NULL;
-   rx_pkts[pos + 3]->next = NULL;
}
 
rte_prefetch_non_temporal(rxdp + RTE_IXGBE_DESCS_PER_LOOP);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
index abbf284..65c5da3 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
@@ -425,12 +425,6 @@ _recv_raw_pkts_vec(struct ixgbe_rx_queue *rxq, struct 
rte_mbuf **rx_pkts,
/* store the resulting 32-bit value */
*(int *)split_packet = _mm_cvtsi128_si32(eop_bits);
split_packet += RTE_IXGBE_DESCS_PER_LOOP;
-
-   /* zero-out next pointers */
-   rx_pkts[pos]->next = NULL;
-   rx_pkts[pos + 1]->next = NULL;
-   rx_pkts[pos + 2]->next = NULL;
-   rx_pkts[pos + 3]->next = NULL;
}
 
/* C.3 calc available number of desc */
diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 57203e2..7e14da0 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -112,8 +112,6 @@ eth_null_rx(void *q, struct rte_mbuf **bufs, uint16_t 
nb_bufs)
break;
bufs[i]->data_len = (uint16_t)packet_size;
bufs[i]->pkt_len = packet_size;
-   bufs[i]->nb_segs = 1;
-   bufs[i]->next = NULL;
bufs[i]->port = h->internals->port_id;
}
 
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index b29565e..111a983 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/v

Re: [dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 10:49 AM, Matej Vido wrote:
> Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")
> 
> Signed-off-by: Matej Vido 

Reviewed-by: Ferruh Yigit 

Cc: sta...@dpdk.org

Applied to dpdk-next-net/master, thanks.




Re: [dpdk-dev] [PATCH v7 5/6] lib: added new library for latency stats

2017-01-24 Thread Olivier MATZ
On Wed, 18 Jan 2017 21:11:28 +0100, Olivier Matz
 wrote:
> Hi guys,
> 
> On Tue, 17 Jan 2017 21:55:16 +0530, Jerin Jacob
> > Oliver,
> > 
> > Could you please suggest how to proceed further?
> >   
> 
> Sorry for the lack of response. I know people are waiting for
> me, but these days I have too many things to do at the same time, and
> it's difficult to find time.
> 
> In few words (I'll provide more detailed answers to the thread by
> friday): I expected to post the mbuf rework patchset for this release,
> which includes the structure changes (Jerin's patch for arm access,
> timestamp, port, nb_segs, refcnt changes). But the patchset is clearly
> not ready yet, it needs a rebase, and it lacks test.
> 
> Jerin, I know that you submitted your patch a long time ago, and I'm
> the only one to blame, please do not see any vendor preference in it.
> 
> I'll check friday what's the effective state of the patchset in my
> workspace. If I can extract a minimal patch that only change the
> structure, I'll send it for discussion. But from what I remember, the
> mbuf structure rework depends on changing the way we access the
> refcnt, so it can be moved to the 2nd cache line.
> 
> If that's not possible, I'll try propose some alternatives.

I just posted a mbuf RFC patchset [1]. I think it contains most
things that were mentioned on the ML. As checked with Thomas, it's too
late to have it included in 17.02.

I'll tend to agree with John that having the timestamp in the mbuf for
latency is not an ABI break, since it is added at the end of the
structure. So I won't oppose to add this field in the mbuf structure
for the release.

The mbuf rearm patch was not forgotten, but it took clearly too long to
be integrated. With the benefit of hindsight, it should have been
pushed without waiting the mbuf rework. Again, apologies for that, I
understand it's quite frustrating.

Anyway, tests or comments on my RFC patchset are welcome, so we can
integrate it at the beginning of the 17.05 cycle.

Regards,
Olivier

[1] http://dpdk.org/ml/archives/dev/2017-January/056187.html



[dpdk-dev] [RFC 5/8] mbuf: make rearm data address naturally aligned

2017-01-24 Thread Olivier Matz
From: Jerin Jacob 

To avoid multiple stores on fast path, Ethernet drivers
aggregate the writes to data_off, refcnt, nb_segs and port
to an uint64_t data and write the data in one shot
with uint64_t* at &mbuf->rearm_data address.

Some of the non-IA platforms have store operation overhead
if the store address is not naturally aligned.This patch
fixes the performance issue on those targets.

Signed-off-by: Jerin Jacob 
Signed-off-by: Olivier Matz 
---
 drivers/net/fm10k/fm10k_rxtx_vec.c| 3 ---
 drivers/net/i40e/i40e_rxtx_vec_sse.c  | 5 +
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c   | 3 ---
 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c| 3 ---
 lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h | 3 +--
 lib/librte_mbuf/rte_mbuf.h| 6 +++---
 6 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 825e3c1..61a65e9 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -324,9 +324,6 @@ fm10k_rxq_rearm(struct fm10k_rx_queue *rxq)
 
/* Flush mbuf with pkt template.
 * Data to be rearmed is 6 bytes long.
-* Though, RX will overwrite ol_flags that are coming next
-* anyway. So overwrite whole 8 bytes with one load:
-* 6 bytes of rearm_data plus first 2 bytes of ol_flags.
 */
p0 = (uintptr_t)&mb0->rearm_data;
*(uint64_t *)p0 = rxq->mbuf_initializer;
diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c 
b/drivers/net/i40e/i40e_rxtx_vec_sse.c
index 33bc121..1a8bcdf 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_sse.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_sse.c
@@ -87,11 +87,8 @@ i40e_rxq_rearm(struct i40e_rx_queue *rxq)
mb0 = rxep[0].mbuf;
mb1 = rxep[1].mbuf;
 
-/* Flush mbuf with pkt template.
+   /* Flush mbuf with pkt template.
 * Data to be rearmed is 6 bytes long.
-* Though, RX will overwrite ol_flags that are coming next
-* anyway. So overwrite whole 8 bytes with one load:
-* 6 bytes of rearm_data plus first 2 bytes of ol_flags.
 */
p0 = (uintptr_t)&mb0->rearm_data;
*(uint64_t *)p0 = rxq->mbuf_initializer;
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index 63f2556..c538796 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
@@ -85,9 +85,6 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
/*
 * Flush mbuf with pkt template.
 * Data to be rearmed is 6 bytes long.
-* Though, RX will overwrite ol_flags that are coming next
-* anyway. So overwrite whole 8 bytes with one load:
-* 6 bytes of rearm_data plus first 2 bytes of ol_flags.
 */
vst1_u8((uint8_t *)&mb0->rearm_data, p);
paddr = mb0->buf_physaddr + RTE_PKTMBUF_HEADROOM;
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
index 65c5da3..62afe31 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
@@ -90,9 +90,6 @@ ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
/*
 * Flush mbuf with pkt template.
 * Data to be rearmed is 6 bytes long.
-* Though, RX will overwrite ol_flags that are coming next
-* anyway. So overwrite whole 8 bytes with one load:
-* 6 bytes of rearm_data plus first 2 bytes of ol_flags.
 */
p0 = (uintptr_t)&mb0->rearm_data;
*(uint64_t *)p0 = rxq->mbuf_initializer;
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h 
b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 09713b0..f24f79f 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -116,11 +116,10 @@ struct rte_kni_fifo {
 struct rte_kni_mbuf {
void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
uint64_t buf_physaddr;
-   char pad0[2];
uint16_t data_off;  /**< Start address of data in segment buffer. */
char pad1[2];
uint8_t nb_segs;/**< Number of segments. */
-   char pad4[1];
+   char pad4[3];
uint64_t ol_flags;  /**< Offload features. */
char pad2[4];
uint32_t pkt_len;   /**< Total pkt len: sum of all segment 
data_len. */
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index bbd0700..cac31c9 100644
--- a/lib/librte_mbuf/rte_mbuf.h

Re: [dpdk-dev] [PATCH 1/3] mbuf: embedding timestamp into the packet

2017-01-24 Thread Olivier MATZ
On Thu, 13 Oct 2016 14:35:06 +, Oleg Kuporosov 
wrote:
> The hard requirement of financial services industry is accurate
> timestamping aligned with the packet itself. This patch is to satisfy
> this requirement:
> 
> - include uint64_t timestamp field into rte_mbuf with minimal impact
> to throughput/latency. Keep it just simple uint64_t in ns (more than
> 580 years) would be enough for immediate needs while using full
>   struct timespec with twice bigger size would have much stronger
>   performance impact as missed cacheline0.
> 
> - it is possible as there is 6-bytes gap in 1st cacheline (fast path)
>   and moving uint16_t vlan_tci_outer field to 2nd cacheline.
> 
> - such move will only impact for pretty rare usable VLAN RX stripping
>   mode for outer TCI (it used only for one NIC i40e from the whole
> set and allows to keep minimal performance impact for RX/TX
> timestamps.
> 
> Signed-off-by: Oleg Kuporosov 
> ---
>  lib/librte_mbuf/rte_mbuf.h | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 23b7bf8..1e1f2ed 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -851,8 +851,7 @@ struct rte_mbuf {
>  
>   uint32_t seqn; /**< Sequence number. See also
> rte_reorder_insert() */ 
> - /** Outer VLAN TCI (CPU order), valid if
> PKT_RX_QINQ_STRIPPED is set. */
> - uint16_t vlan_tci_outer;
> + uint64_t timestamp;   /**< Packet's timestamp, usually
> in ns */ 
>   /* second cache line - fields only used in slow path or on
> TX */ MARKER cacheline1 __rte_cache_min_aligned;
> @@ -885,6 +884,9 @@ struct rte_mbuf {
>   };
>   };
>  
> + /** Outer VLAN TCI (CPU order), valid if
> PKT_RX_QINQ_STRIPPED is set. */
> + uint16_t vlan_tci_outer;
> +
>   /** Size of the application private data. In case of an
> indirect
>* mbuf, it stores the direct mbuf private data size. */
>   uint16_t priv_size;

FYI, I posted a RFC patchset that introduces the timestamp field in the
mbuf for v17.05:
http://dpdk.org/ml/archives/dev/2017-January/056187.html


Regards,
Olivier



Re: [dpdk-dev] [PATCH v3 01/10] doc: add NXP dpaa2_sec in cryptodev

2017-01-24 Thread De Lara Guarch, Pablo
Hi,

> -Original Message-
> From: akhil.go...@nxp.com [mailto:akhil.go...@nxp.com]
> Sent: Friday, January 20, 2017 2:05 PM
> To: dev@dpdk.org
> Cc: thomas.monja...@6wind.com; Doherty, Declan; De Lara Guarch, Pablo;
> Mcnamara, John; nhor...@tuxdriver.com; Akhil Goyal
> Subject: [PATCH v3 01/10] doc: add NXP dpaa2_sec in cryptodev
> 
> From: Akhil Goyal 
> 
> Signed-off-by: Akhil Goyal 
> Reviewed-by: Hemant Agrawal 

> diff --git a/doc/guides/cryptodevs/index.rst
> b/doc/guides/cryptodevs/index.rst
> index 06c3f6e..941b865 100644
> --- a/doc/guides/cryptodevs/index.rst
> +++ b/doc/guides/cryptodevs/index.rst
> @@ -38,6 +38,7 @@ Crypto Device Drivers
>  overview
>  aesni_mb
>  aesni_gcm
> +dpaa2_sec
>  armv8
>  kasumi
>  openssl
> --
> 2.9.3

This list is in alphabetical order.

Also, I would add this patch at the end of the patchset, and not the first.



Re: [dpdk-dev] [RFC 3/8] mbuf: set mbuf fields while in pool

2017-01-24 Thread Bruce Richardson
On Tue, Jan 24, 2017 at 04:19:28PM +0100, Olivier Matz wrote:
> Set the value of m->refcnt to 1, m->nb_segs to 1 and m->next
> to NULL when the mbuf is stored inside the mempool (unused).
> This is done in rte_pktmbuf_prefree_seg(), before freeing or
> recycling a mbuf.
> 
> Before this patch, the value of m->refcnt was expected to be 0
> while in pool.
> 
> The objectives are:
> 
> - to avoid drivers to set m->next to NULL in the early Rx path, since
>   this field is in the second 64B of the mbuf and its access could
>   trigger a cache miss
> 
> - rationalize the behavior of raw_alloc/raw_free: one is now the
>   symmetric of the other, and refcnt is never changed in these functions.
> 
> Signed-off-by: Olivier Matz 
> ---
>  drivers/net/mlx5/mlx5_rxtx.c |  5 ++---
>  drivers/net/mpipe/mpipe_tilegx.c |  1 +
>  lib/librte_mbuf/rte_mbuf.c   |  2 ++
>  lib/librte_mbuf/rte_mbuf.h   | 45 
> +---
>  4 files changed, 38 insertions(+), 15 deletions(-)
> 

> /* compat with older versions */
>  __rte_deprecated
> -static inline void __attribute__((always_inline))
> +static inline void
>  __rte_mbuf_raw_free(struct rte_mbuf *m)
>  {
>   rte_mbuf_raw_free(m);
> @@ -1218,8 +1232,12 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf 
> *m)
>   m->data_len = 0;
>   m->ol_flags = 0;
>  
> - if (rte_mbuf_refcnt_update(md, -1) == 0)
> + if (rte_mbuf_refcnt_update(md, -1) == 0) {

Minor nit, but in the case that we only have a single reference to the
mbufs, we are always setting that to zero just to re-increment it to 1
again.

> + md->next = NULL;
> + md->nb_segs = 1;
> + rte_mbuf_refcnt_set(md, 1);
>   rte_mbuf_raw_free(md);
> + }
>  }
>  
>  /**


Re: [dpdk-dev] [PATCH] net/bnx2x: fix Rx mode configuration

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 5:38 AM, Rasesh Mody wrote:
> Check if promisc mode was set when setting allmulti mode and visa-versa.
> Introduced BNX2X_RX_MODE_ALLMULTI_PROMISC for the same. If check is
> absent the filter configuration gets over written.
> 
> Fixes: 540a211084a7 ("bnx2x: driver core")
> Fixes: 5dbc53d7e5a2 ("net/bnx2x: restrict Rx mask flags sent to the PF")
> 
> Signed-off-by: Rasesh Mody 

Applied to dpdk-next-net/master, thanks.



Re: [dpdk-dev] [PATCH] net/ixgbe: correct error when SFP not present

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 8:51 AM, Lu, Wenzhuo wrote:
<...>
>>
>> Ignore the error=IXGBE_ERR_SFP_NOT_PRESENT when SFP is not present.
>> If it is not ignored, testpmd will in the NIC initialization process.
>> Ixgbe kernel driver ignores this error and works well. So DPDK does same 
>> thing.
>>
>> Signed-off-by: Wei Dai 
>> Signed-off-by: Helin Zhang 
>> Tested-by: Yuan Peng 
> Acked-by: Wenzhuo Lu 

Applied to dpdk-next-net/master, thanks.



Re: [dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Matej Vido

On 24.01.2017 16:11, Ferruh Yigit wrote:

On 1/24/2017 2:02 PM, Matej Vido wrote:

On 24.01.2017 12:58, Ferruh Yigit wrote:

On 1/24/2017 10:49 AM, Matej Vido wrote:

Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")

Signed-off-by: Matej Vido 

Unrelated from this patch, in maintainers file, you have your other mail
address: "Matej Vido ", do you want to update it?

Hi Ferruh,

yes, I will send the patch.

---
   drivers/net/szedata2/rte_eth_szedata2.h | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/szedata2/rte_eth_szedata2.h 
b/drivers/net/szedata2/rte_eth_szedata2.h
index b58adb6..afe8a38 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.h
+++ b/drivers/net/szedata2/rte_eth_szedata2.h
@@ -192,7 +192,7 @@ struct szedata {
   }
   
   #define SZEDATA2_PCI_RESOURCE_PTR(rsc, offset, type) \

-   ((type)((uint8_t *)(rsc)->addr) + (offset))
+   ((type)(((uint8_t *)(rsc)->addr) + (offset)))

Although output will be same, (in all uses, type is a pointer),

Of course won't be same, please forget about it J, I am confused.

So these two differs a lot, taking into account that offset numbers used
are big numbers (0x8000..), it should be accessing very unrelated addresses.

So how this was working before?


The macro was fine before the patch [1]. It hasn't been working since 
the acceptance of that patch, but I didn't manage to test the 
functionality until now.


[1] http://dpdk.org/ml/archives/dev/2016-December/053241.html

Regards,
Matej




this

seems the intention, so:

Reviewed-by: Ferruh Yigit 

btw, following will do same, right, not sure if it is better:
((type)(rsc)->addr + (offset))

This is also wrong. The intention of the macro is to add an offset to
the base address and typecast the result.

Regards,
Matej
   
   enum szedata2_link_speed {

SZEDATA2_LINK_SPEED_DEFAULT = 0,





Re: [dpdk-dev] [PATCH v1] net/ixgbe: fix API parameter checking

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 9:00 AM, Tiwei Bie wrote:
> Add checks to rte_pmd_ixgbe_macsec_* APIs to ensure that the
> port is an ixgbe port.
> 
> Fixes: b35d309710fe ("net/ixgbe: add MACsec offload")
> 
> Signed-off-by: Tiwei Bie 

Applied to dpdk-next-net/master, thanks.



Re: [dpdk-dev] [PATCH] mbuf: remove redundant line in rte_pktmbuf_attach

2017-01-24 Thread Ilya Matveychikov

> On Jan 24, 2017, at 4:56 PM, Olivier MATZ  wrote:
> 
> Hi,
> 
> On Sat, 21 Jan 2017 16:28:29 +, "Ananyev, Konstantin"
>  wrote:
>>> -Original Message-
>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ilya
>>> Matveychikov Sent: Saturday, January 21, 2017 3:08 PM
>>> To: Yigit, Ferruh 
>>> Cc: dev@dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH] mbuf: remove redundant line in
>>> rte_pktmbuf_attach
>>> 
>>> 
 On Jan 20, 2017, at 4:08 PM, Ferruh Yigit
  wrote:
 
 On 1/20/2017 12:19 AM, Ilya Matveychikov wrote:  
> mi->next will be assigned to NULL few lines later, trivial patch
> 
> Signed-off-by: Ilya V. Matveychikov 
> ---
> lib/librte_mbuf/rte_mbuf.h | 1 -
> 1 file changed, 1 deletion(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h
> b/lib/librte_mbuf/rte_mbuf.h index ead7c6e..5589d54 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -1139,7 +1139,6 @@ static inline void
> rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
> mi->buf_addr = m->buf_addr; mi->buf_len = m->buf_len;
> 
> - mi->next = m->next;  
 
> 
> Fixes: ea672a8b1655 ("mbuf: remove the rte_pktmbuf structure")
> 
> Acked-by: Olivier Matz 
> 
> 
 Do you know why attaching mbuf is not supporting multi-segment?  
>> 
>> This is supported, but you have to do it segment by segment.
>> Actually  rte_pktmbuf_clone() does that.
>> Konstantin
>> 
>> 
 Perhaps this can be documented in function comment, as one of the
 "not supported" items.  
>>> 
>>> No, I don’t know. For my application I’ve found that nb_segs with
>>> it’s limit in 256 segments is very annoying and I’ve decided not to
>>> use DPDK functions that dealt with nb_segs… But it is not about the
>>> rte_pktmbuf_attach() function and the patch. 
> 
> 
> Out of curiosity, can you explain why your application needs more
> than 256 segments? When we were discussing the possibility of extending
> this field to 16 bits, Konstantin convinced me that it was not so
> useful.

In my application I need to do IPv4 fragments reassembly. There is no explicit 
limit of number of fragments in datagram, so I’m trying to avoid any 
limitations and `nb_segs` here is a constraint for me. Expanding it from 8-bit 
to 16-bit can solve that issue for me. But I don’t remember are there any  
other places in DPDK where we need to know how many segments are in the packet? 
I mean that is `nb_segs` required at all?



Re: [dpdk-dev] [RFC 0/8] mbuf: structure reorganization

2017-01-24 Thread Bruce Richardson
On Tue, Jan 24, 2017 at 04:19:25PM +0100, Olivier Matz wrote:
> Based on discussion done in [1], this patchset reorganizes the mbuf.
> 

Hi Olivier,

thanks for all the work on this. From a quick scan of the patches, and
the description below, it looks like a good set of changes. Comments
below to see about kick-starting some further discussion about some of
the other changes you propose.

> The main changes are:
> - reorder structure to increase vector performance on some non-ia
>   platforms.
> - add a 64bits timestamp field in the 1st cache line
> - m->next, m->nb_segs, and m->refcnt are always initialized for mbufs
>   in the pool, avoiding the need of setting m->next (located in the
>   2nd cache line) in the Rx path for mono-segment packets.
> - change port and nb_segs to 16 bits
> - move seqn in the 2nd cache line
> 
> Things discussed but not done in the patchset:
> - move refcnt and nb_segs to the 2nd cache line: many drivers sets
>   them in the Rx path, so it could introduce a performance regression, or
>   it would require to change all the drivers, which is not an easy task.

But if we do make this change and update the drivers, some of them
should perform a little better, since they do fewer writes. I don't
think the fastest vector drivers will be affected, since they already
coalesce the writes to these fields with other writes, but others drivers
may well be improved by the change.

> - remove the m->port field: too much impact on many examples and libraries,
>   and some people highlighted they are using it.
> - moving m->next in the 1st cache line: there is not enough room, and having
>   it set to NULL for unused mbuf should remove the need for it.

I agree.

> - merge seqn and timestamp together in a union: we could imagine use cases
>   were both are activated. There is no flag indicating the presence of seqn,
>   so it looks preferable to keep them separated for now.

What were the use-cases? If we have a timestamp, surely sequence can be
determined from that? Even if you use the TSC as a timestamp per burst,
you can still sequence the packets cheaply by just adding 1 to each 
subsequent value.

/Bruce



Re: [dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Matej Vido

On 24.01.2017 16:24, Ferruh Yigit wrote:

On 1/24/2017 10:49 AM, Matej Vido wrote:

Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")

Signed-off-by: Matej Vido 

Reviewed-by: Ferruh Yigit 

Cc: sta...@dpdk.org

Applied to dpdk-next-net/master, thanks.
I'm not sure about the policy regarding the stable releases, but I think 
that this patch doesn't belong to stable as the bug was introduced and 
also fixed during the current 17.02 window. Or am I wrong?


Thanks,
Matej


[dpdk-dev] [PATCH v6 00/11] crypto/scheduler: add driver for scheduler crypto pmd

2017-01-24 Thread Fan Zhang
This patch provides the initial implementation of the scheduler poll mode
driver using DPDK cryptodev framework.

Scheduler PMD is used to schedule and enqueue the crypto ops to the
hardware and/or software crypto devices attached to it (slaves). The
dequeue operation from the slave(s), and the possible dequeued crypto op
reordering, are then carried out by the scheduler.

As the initial version, the scheduler PMD currently supports only the
Round-robin mode, which distributes the enqueued burst of crypto ops
among its slaves in a round-robin manner. This mode may help to fill
the throughput gap between the physical core and the existing cryptodevs
to increase the overall performance. Moreover, the scheduler PMD is
provided the APIs for user to create his/her own scheduler.

Build instructions:
To build DPDK with CRYTPO_SCHEDULER_PMD the user is required to set
CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER=y in config/common_base

Notice:
- Scheduler PMD shares same EAL commandline options as other cryptodevs.
  However, apart from socket_id, the rest of cryptodev options are
  ignored. The scheduler PMD's max_nb_queue_pairs and max_nb_sessions
  options are set as the minimum values of the attached slaves'. For
  example, a scheduler cryptodev is attached 2 cryptodevs with
  max_nb_queue_pairs of 2 and 8, respectively. The scheduler cryptodev's
  max_nb_queue_pairs will be automatically updated as 2.

- In addition, an extra option "slave" is added. The user can attach one
  or more slave cryptodevs initially by passing their names with this
  option. Here is an example:

  ... --vdev "crypto_aesni_mb_pmd,name=aesni_mb_1" --vdev "crypto_aesni_
  mb_pmd,name=aesni_mb_2" --vdev "crypto_scheduler_pmd,slave=aesni_mb_1,
  slave=aesni_mb_2" ...

  Remember the software cryptodevs to be attached shall be declared before
  the scheduler PMD, otherwise the scheduler will fail to locate the
  slave(s) and report error.

- The scheduler cryptodev cannot be started unless the scheduling mode
  is set and at least one slave is attached. Also, to configure the
  scheduler in the run-time, like attach/detach slave(s), change
  scheduling mode, or enable/disable crypto op ordering, one should stop
  the scheduler first, otherwise an error will be returned.

- Enabling crypto ops reordering will cause overwriting the userdata field
  of each mbuf.

Fan Zhang (11):

Changes in v6:
Split into multiple patches.
Added documentation.
Added unit test.

Changes in v5:
Fixed EOF whitespace warning.
Updated Copyright.

Changes in v4:
Fixed a few bugs.
Added slave EAL commandline option support.

Changes in v3:
Fixed config/common_base.

Changes in v2:
New approaches in API to suit future scheduling modes.

  cryptodev: add scheduler PMD name and type
  crypto/scheduler: add APIs for scheduler
  crypto/scheduler: add internal structure declarations
  crypto/scheduler: add scheduler API implementations
  crypto/scheduler: add round-robin scheduling mode
  crypto/scheduler: register scheduler vdev driver
  crypto/scheduler: register operation function pointer table
  crypto/scheduler: add scheduler PMD to DPDK compile system
  crypto/scheduler: add scheduler PMD config options
  app/test: add unit test for cryptodev scheduler PMD
  crypto/scheduler: add documentation

 app/test/test_cryptodev.c  | 241 +-
 app/test/test_cryptodev_aes_test_vectors.h | 101 +++--
 app/test/test_cryptodev_blockcipher.c  |   6 +-
 app/test/test_cryptodev_blockcipher.h  |   3 +-
 app/test/test_cryptodev_hash_test_vectors.h|  38 +-
 config/common_base |   8 +-
 doc/guides/cryptodevs/img/scheduler-overview.svg   | 277 
 doc/guides/cryptodevs/index.rst|   3 +-
 doc/guides/cryptodevs/scheduler.rst| 128 ++
 drivers/crypto/Makefile|   3 +-
 drivers/crypto/scheduler/Makefile  |  66 +++
 drivers/crypto/scheduler/rte_cryptodev_scheduler.c | 471 
 drivers/crypto/scheduler/rte_cryptodev_scheduler.h | 165 +++
 .../scheduler/rte_cryptodev_scheduler_operations.h |  71 +++
 .../scheduler/rte_pmd_crypto_scheduler_version.map |  12 +
 drivers/crypto/scheduler/scheduler_pmd.c   | 361 +++
 drivers/crypto/scheduler/scheduler_pmd_ops.c   | 490 +
 drivers/crypto/scheduler/scheduler_pmd_private.h   | 115 +
 drivers/crypto/scheduler/scheduler_roundrobin.c| 435 ++
 lib/librte_cryptodev/rte_cryptodev.h   |   3 +
 mk/rte.app.mk  |   6 +-
 21 files changed, 2948 insertions(+), 55 deletions(-)
 create mode 100644 doc/guides/cryptodevs/img/scheduler-overview.svg
 create mode 100644 doc/guides/cryptodevs/scheduler.rst
 create mode 100644 drivers/crypto/scheduler/Makefile
 create mode 100644 drivers/crypto/scheduler/rte_cryptodev_scheduler.c
 create mode 100644 

[dpdk-dev] [PATCH v6 01/11] cryptodev: add scheduler PMD name and type

2017-01-24 Thread Fan Zhang
This patch adds the cryptodev scheduler PMD name and type identifier to
librte_cryptodev.

Signed-off-by: Fan Zhang 
---
 lib/librte_cryptodev/rte_cryptodev.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index f284668..618f302 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -68,6 +68,8 @@ extern "C" {
 /**< KASUMI PMD device name */
 #define CRYPTODEV_NAME_ARMV8_PMD   crypto_armv8
 /**< ARMv8 Crypto PMD device name */
+#define CRYPTODEV_NAME_SCHEDULER_PMD   crypto_scheduler
+/**< Scheduler Crypto PMD device name */
 
 /** Crypto device type */
 enum rte_cryptodev_type {
@@ -80,6 +82,7 @@ enum rte_cryptodev_type {
RTE_CRYPTODEV_ZUC_PMD,  /**< ZUC PMD */
RTE_CRYPTODEV_OPENSSL_PMD,/**<  OpenSSL PMD */
RTE_CRYPTODEV_ARMV8_PMD,/**< ARMv8 crypto PMD */
+   RTE_CRYPTODEV_SCHEDULER_PMD,/**< Crypto Scheduler PMD */
 };
 
 extern const char **rte_cyptodev_names;
-- 
2.7.4



[dpdk-dev] [PATCH v6 02/11] crypto/scheduler: add APIs for scheduler

2017-01-24 Thread Fan Zhang
Adds APIs and function prototypes for the scheduler PMD to perform extra
operations other than standard cryptodev APIs.

Signed-off-by: Fan Zhang 
---
 drivers/crypto/scheduler/rte_cryptodev_scheduler.h | 162 +
 .../scheduler/rte_cryptodev_scheduler_operations.h |  71 +
 .../scheduler/rte_pmd_crypto_scheduler_version.map |  12 ++
 3 files changed, 245 insertions(+)
 create mode 100644 drivers/crypto/scheduler/rte_cryptodev_scheduler.h
 create mode 100644 
drivers/crypto/scheduler/rte_cryptodev_scheduler_operations.h
 create mode 100644 
drivers/crypto/scheduler/rte_pmd_crypto_scheduler_version.map

diff --git a/drivers/crypto/scheduler/rte_cryptodev_scheduler.h 
b/drivers/crypto/scheduler/rte_cryptodev_scheduler.h
new file mode 100644
index 000..b18fc48
--- /dev/null
+++ b/drivers/crypto/scheduler/rte_cryptodev_scheduler.h
@@ -0,0 +1,162 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CRYPTO_SCHEDULER_H
+#define _RTE_CRYPTO_SCHEDULER_H
+
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Crypto scheduler PMD operation modes
+ */
+enum rte_cryptodev_scheduler_mode {
+   CDEV_SCHED_MODE_NOT_SET = 0,
+   CDEV_SCHED_MODE_USERDEFINED,
+
+   CDEV_SCHED_MODE_COUNT /* number of modes */
+};
+
+#define RTE_CRYPTODEV_SCHEDULER_NAME_MAX_LEN   (64)
+#define RTE_CRYPTODEV_SCHEDULER_DESC_MAX_LEN   (256)
+
+struct rte_cryptodev_scheduler;
+
+/**
+ * Load a user defined scheduler
+ *
+ * @param  scheduler_idThe target scheduler device ID
+ * scheduler   Pointer to the user defined scheduler
+ *
+ * @return
+ * 0 if loading successful, negative integer if otherwise.
+ */
+int
+rte_cryptodev_scheduler_load_user_scheduler(uint8_t scheduler_id,
+   struct rte_cryptodev_scheduler *scheduler);
+
+/**
+ * Attach a pre-configured crypto device to the scheduler
+ *
+ * @param  scheduler_idThe target scheduler device ID
+ * slave_idcrypto device ID to be attached
+ *
+ * @return
+ * 0 if attaching successful, negative int if otherwise.
+ */
+int
+rte_cryptodev_scheduler_slave_attach(uint8_t scheduler_id, uint8_t slave_id);
+
+/**
+ * Detach a attached crypto device to the scheduler
+ *
+ * @param  scheduler_idThe target scheduler device ID
+ * slave_idcrypto device ID to be detached
+ *
+ * @return
+ * 0 if detaching successful, negative int if otherwise.
+ */
+int
+rte_cryptodev_scheduler_slave_detach(uint8_t scheduler_id, uint8_t slave_id);
+
+/**
+ * Set the scheduling mode
+ *
+ * @param  scheduler_idThe target scheduler device ID
+ * modeThe scheduling mode
+ *
+ * @return
+ * 0 if attaching successful, negative integer if otherwise.
+ */
+int
+rte_crpytodev_scheduler_mode_set(uint8_t scheduler_id,
+   enum rte_cryptodev_scheduler_mode mode);
+
+/**
+ * Get the current scheduling mode
+ *
+ * @param  scheduler_idThe target scheduler device ID
+ * modePointer to write the scheduling mode
+ */
+enum rte_cryptodev_scheduler_mode
+rte_crpytodev_scheduler_mode_get(uint8_t scheduler_id);
+
+/**
+ * Set the crypto ops reordering feature on/off
+ *
+ * @param  dev_id  The target scheduler device ID
+ * enable_reord

[dpdk-dev] [PATCH v6 03/11] crypto/scheduler: add internal structure declarations

2017-01-24 Thread Fan Zhang
Adds a number of internal structures for the cryptodev scheduler PMD. The
structures include the scheduler context, slave, queue pair context,
and session.

Signed-off-by: Fan Zhang 
---
 drivers/crypto/scheduler/scheduler_pmd_private.h | 115 +++
 1 file changed, 115 insertions(+)
 create mode 100644 drivers/crypto/scheduler/scheduler_pmd_private.h

diff --git a/drivers/crypto/scheduler/scheduler_pmd_private.h 
b/drivers/crypto/scheduler/scheduler_pmd_private.h
new file mode 100644
index 000..ac4690e
--- /dev/null
+++ b/drivers/crypto/scheduler/scheduler_pmd_private.h
@@ -0,0 +1,115 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SCHEDULER_PMD_PRIVATE_H
+#define _SCHEDULER_PMD_PRIVATE_H
+
+#include 
+#include 
+#include 
+
+/**< Maximum number of bonded devices per devices */
+#ifndef MAX_SLAVES_NUM
+#define MAX_SLAVES_NUM (8)
+#endif
+
+#define PER_SLAVE_BUFF_SIZE(256)
+
+#define CS_LOG_ERR(fmt, args...)   \
+   RTE_LOG(ERR, CRYPTODEV, "[%s] %s() line %u: " fmt "\n", \
+   RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),  \
+   __func__, __LINE__, ## args)
+
+#ifdef RTE_LIBRTE_CRYPTO_SCHEDULER_DEBUG
+#define CS_LOG_INFO(fmt, args...)  \
+   RTE_LOG(INFO, CRYPTODEV, "[%s] %s() line %u: " fmt "\n",\
+   RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),  \
+   __func__, __LINE__, ## args)
+
+#define CS_LOG_DBG(fmt, args...)   \
+   RTE_LOG(DEBUG, CRYPTODEV, "[%s] %s() line %u: " fmt "\n",   \
+   RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),  \
+   __func__, __LINE__, ## args)
+#else
+#define CS_LOG_INFO(fmt, args...)
+#define CS_LOG_DBG(fmt, args...)
+#endif
+
+struct scheduler_slave {
+   uint8_t dev_id;
+   uint16_t qp_id;
+   uint32_t nb_inflight_cops;
+
+   enum rte_cryptodev_type dev_type;
+};
+
+struct scheduler_ctx {
+   void *private_ctx;
+   /**< private scheduler context pointer */
+
+   struct rte_cryptodev_capabilities *capabilities;
+   uint32_t nb_capabilities;
+
+   uint32_t max_nb_queue_pairs;
+
+   struct scheduler_slave slaves[MAX_SLAVES_NUM];
+   uint32_t nb_slaves;
+
+   enum rte_cryptodev_scheduler_mode mode;
+
+   struct rte_cryptodev_scheduler_ops ops;
+
+   uint8_t reordering_enabled;
+
+   char name[RTE_CRYPTODEV_SCHEDULER_NAME_MAX_LEN];
+   char description[RTE_CRYPTODEV_SCHEDULER_DESC_MAX_LEN];
+} __rte_cache_aligned;
+
+struct scheduler_qp_ctx {
+   void *private_qp_ctx;
+
+   rte_cryptodev_scheduler_burst_enqueue_t schedule_enqueue;
+   rte_cryptodev_scheduler_burst_dequeue_t schedule_dequeue;
+
+   struct rte_reorder_buffer *reorder_buf;
+   uint32_t seqn;
+} __rte_cache_aligned;
+
+struct scheduler_session {
+   struct rte_cryptodev_sym_session *sessions[MAX_SLAVES_NUM];
+};
+
+/** device specific operations function pointer structure */
+extern struct rte_cryptodev_ops *rte_crypto_scheduler_pmd_ops;
+
+#endif /* _SCHEDULER_PMD_PRIVATE_H */
-- 
2.7.4



[dpdk-dev] [PATCH v6 05/11] crypto/scheduler: add round-robin scheduling mode

2017-01-24 Thread Fan Zhang
Implements round-robin scheduling mode and register into cryptodev
scheduler ops structure. This mode enqueues a burst of operation
to one of its slaves, and iterates the next burst to the other
slave. Same procedure is done on dequeueing operations.

Signed-off-by: Fan Zhang 
---
 drivers/crypto/scheduler/rte_cryptodev_scheduler.c |   7 +
 drivers/crypto/scheduler/rte_cryptodev_scheduler.h |   3 +
 drivers/crypto/scheduler/scheduler_roundrobin.c| 435 +
 3 files changed, 445 insertions(+)
 create mode 100644 drivers/crypto/scheduler/scheduler_roundrobin.c

diff --git a/drivers/crypto/scheduler/rte_cryptodev_scheduler.c 
b/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
index ae6f032..e0ca029 100644
--- a/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
+++ b/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
@@ -329,6 +329,13 @@ rte_crpytodev_scheduler_mode_set(uint8_t scheduler_id,
return 0;
 
switch (mode) {
+   case CDEV_SCHED_MODE_ROUNDROBIN:
+   if (rte_cryptodev_scheduler_load_user_scheduler(scheduler_id,
+   roundrobin_scheduler) < 0) {
+   CS_LOG_ERR("Failed to load scheduler");
+   return -1;
+   }
+   break;
default:
CS_LOG_ERR("Not yet supported");
return -ENOTSUP;
diff --git a/drivers/crypto/scheduler/rte_cryptodev_scheduler.h 
b/drivers/crypto/scheduler/rte_cryptodev_scheduler.h
index b18fc48..7ef44e7 100644
--- a/drivers/crypto/scheduler/rte_cryptodev_scheduler.h
+++ b/drivers/crypto/scheduler/rte_cryptodev_scheduler.h
@@ -46,6 +46,7 @@ extern "C" {
 enum rte_cryptodev_scheduler_mode {
CDEV_SCHED_MODE_NOT_SET = 0,
CDEV_SCHED_MODE_USERDEFINED,
+   CDEV_SCHED_MODE_ROUNDROBIN,
 
CDEV_SCHED_MODE_COUNT /* number of modes */
 };
@@ -156,6 +157,8 @@ struct rte_cryptodev_scheduler {
struct rte_cryptodev_scheduler_ops *ops;
 };
 
+extern struct rte_cryptodev_scheduler *roundrobin_scheduler;
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/crypto/scheduler/scheduler_roundrobin.c 
b/drivers/crypto/scheduler/scheduler_roundrobin.c
new file mode 100644
index 000..1f2e709
--- /dev/null
+++ b/drivers/crypto/scheduler/scheduler_roundrobin.c
@@ -0,0 +1,435 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+
+#include "rte_cryptodev_scheduler_operations.h"
+#include "scheduler_pmd_private.h"
+
+struct rr_scheduler_qp_ctx {
+   struct scheduler_slave slaves[MAX_SLAVES_NUM];
+   uint32_t nb_slaves;
+
+   uint32_t last_enq_slave_idx;
+   uint32_t last_deq_slave_idx;
+};
+
+static uint16_t
+schedule_enqueue(void *qp_ctx, struct rte_crypto_op **ops, uint16_t nb_ops)
+{
+   struct rr_scheduler_qp_ctx *rr_qp_ctx =
+   ((struct scheduler_qp_ctx *)qp_ctx)->private_qp_ctx;
+   uint32_t slave_idx = rr_qp_ctx->last_enq_slave_idx;
+   struct scheduler_slave *slave = &rr_qp_ctx->slaves[slave_idx];
+   uint16_t i, processed_ops;
+   struct rte_cryptodev_sym_session *sessions[nb_ops];
+   struct scheduler_session *sess0, *sess1, *sess2, *sess3;
+
+   if (unlikely(nb_ops == 0))
+   return 0;
+
+   for (i = 0; i < nb_ops && i < 4; i++)
+   rte_prefetch0(ops[i

[dpdk-dev] [PATCH v6 04/11] crypto/scheduler: add scheduler API implementations

2017-01-24 Thread Fan Zhang
Adds the implementations of the APIs for scheduler cryptodev PMD.

Signed-off-by: Fan Zhang 
---
 drivers/crypto/scheduler/rte_cryptodev_scheduler.c | 464 +
 1 file changed, 464 insertions(+)
 create mode 100644 drivers/crypto/scheduler/rte_cryptodev_scheduler.c

diff --git a/drivers/crypto/scheduler/rte_cryptodev_scheduler.c 
b/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
new file mode 100644
index 000..ae6f032
--- /dev/null
+++ b/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
@@ -0,0 +1,464 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "scheduler_pmd_private.h"
+
+/** update the scheduler pmd's capability with attaching device's
+ *  capability.
+ *  For each device to be attached, the scheduler's capability should be
+ *  the common capability set of all slaves
+ **/
+static uint32_t
+sync_caps(struct rte_cryptodev_capabilities *caps,
+   uint32_t nb_caps,
+   const struct rte_cryptodev_capabilities *slave_caps)
+{
+   uint32_t sync_nb_caps = nb_caps, nb_slave_caps = 0;
+   uint32_t i;
+
+   while (slave_caps[nb_slave_caps].op != RTE_CRYPTO_OP_TYPE_UNDEFINED)
+   nb_slave_caps++;
+
+   if (nb_caps == 0) {
+   rte_memcpy(caps, slave_caps, sizeof(*caps) * nb_slave_caps);
+   return nb_slave_caps;
+   }
+
+   for (i = 0; i < sync_nb_caps; i++) {
+   struct rte_cryptodev_capabilities *cap = &caps[i];
+   uint32_t j;
+
+   for (j = 0; j < nb_slave_caps; j++) {
+   const struct rte_cryptodev_capabilities *s_cap =
+   &slave_caps[i];
+
+   if (s_cap->op != cap->op || s_cap->sym.xform_type !=
+   cap->sym.xform_type)
+   continue;
+
+   if (s_cap->sym.xform_type ==
+   RTE_CRYPTO_SYM_XFORM_AUTH) {
+   if (s_cap->sym.auth.algo !=
+   cap->sym.auth.algo)
+   continue;
+
+   cap->sym.auth.digest_size.min =
+   s_cap->sym.auth.digest_size.min <
+   cap->sym.auth.digest_size.min ?
+   s_cap->sym.auth.digest_size.min :
+   cap->sym.auth.digest_size.min;
+   cap->sym.auth.digest_size.max =
+   s_cap->sym.auth.digest_size.max <
+   cap->sym.auth.digest_size.max ?
+   s_cap->sym.auth.digest_size.max :
+   cap->sym.auth.digest_size.max;
+
+   }
+
+   if (s_cap->sym.xform_type ==
+   RTE_CRYPTO_SYM_XFORM_CIPHER)
+   if (s_cap->sym.cipher.algo !=
+   cap->sym.cipher.algo)
+   continue;
+
+   /* no common cap found */
+   break;
+

[dpdk-dev] [PATCH v6 06/11] crypto/scheduler: register scheduler vdev driver

2017-01-24 Thread Fan Zhang
Adds crypto scheduler's PMD's probe and remove function and the device's
enqueue and dequeue burst functions. A cryptodev scheduler PMD is
then registered in the end.

Signed-off-by: Fan Zhang 
---
 drivers/crypto/scheduler/scheduler_pmd.c | 361 +++
 1 file changed, 361 insertions(+)
 create mode 100644 drivers/crypto/scheduler/scheduler_pmd.c

diff --git a/drivers/crypto/scheduler/scheduler_pmd.c 
b/drivers/crypto/scheduler/scheduler_pmd.c
new file mode 100644
index 000..62418d0
--- /dev/null
+++ b/drivers/crypto/scheduler/scheduler_pmd.c
@@ -0,0 +1,361 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "scheduler_pmd_private.h"
+
+struct scheduler_init_params {
+   struct rte_crypto_vdev_init_params def_p;
+   uint32_t nb_slaves;
+   uint8_t slaves[MAX_SLAVES_NUM];
+};
+
+#define RTE_CRYPTODEV_VDEV_NAME("name")
+#define RTE_CRYPTODEV_VDEV_SLAVE   ("slave")
+#define RTE_CRYPTODEV_VDEV_MAX_NB_QP_ARG   ("max_nb_queue_pairs")
+#define RTE_CRYPTODEV_VDEV_MAX_NB_SESS_ARG ("max_nb_sessions")
+#define RTE_CRYPTODEV_VDEV_SOCKET_ID   ("socket_id")
+
+const char *scheduler_valid_params[] = {
+   RTE_CRYPTODEV_VDEV_NAME,
+   RTE_CRYPTODEV_VDEV_SLAVE,
+   RTE_CRYPTODEV_VDEV_MAX_NB_QP_ARG,
+   RTE_CRYPTODEV_VDEV_MAX_NB_SESS_ARG,
+   RTE_CRYPTODEV_VDEV_SOCKET_ID
+};
+
+static uint16_t
+scheduler_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops,
+   uint16_t nb_ops)
+{
+   struct scheduler_qp_ctx *qp_ctx = queue_pair;
+   uint16_t processed_ops;
+
+   processed_ops = (*qp_ctx->schedule_enqueue)(qp_ctx, ops,
+   nb_ops);
+
+   return processed_ops;
+}
+
+static uint16_t
+scheduler_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
+   uint16_t nb_ops)
+{
+   struct scheduler_qp_ctx *qp_ctx = queue_pair;
+   uint16_t processed_ops;
+
+   processed_ops = (*qp_ctx->schedule_dequeue)(qp_ctx, ops,
+   nb_ops);
+
+   return processed_ops;
+}
+
+static int
+attach_init_slaves(uint8_t scheduler_id,
+   const uint8_t *slaves, const uint8_t nb_slaves)
+{
+   uint8_t i;
+
+   for (i = 0; i < nb_slaves; i++) {
+   struct rte_cryptodev *dev =
+   rte_cryptodev_pmd_get_dev(slaves[i]);
+   int status = rte_cryptodev_scheduler_slave_attach(
+   scheduler_id, slaves[i]);
+
+   if (status < 0 || !dev) {
+   CS_LOG_ERR("Failed to attach slave cryptodev "
+   "%u.\n", slaves[i]);
+   return status;
+   }
+
+   RTE_LOG(INFO, PMD, "  Attached slave cryptodev %s\n",
+   dev->data->name);
+   }
+
+   return 0;
+}
+
+static int
+cryptodev_scheduler_create(const char *name,
+   struct scheduler_init_params *init_params)
+{
+   char crypto_dev_name[RTE_CRYPTODEV_NAME_MAX_LEN];
+   struct rte_cryptodev *dev;
+   struct scheduler_ctx *sched_ctx;
+
+   if (init_params->def_p.name[0] == '\0') {
+   int ret = rte_

[dpdk-dev] [PATCH v6 07/11] crypto/scheduler: register operation function pointer table

2017-01-24 Thread Fan Zhang
Implements all standard operations required for cryptodev,
and register them to cryptodev operation function pointer table.

Signed-off-by: Fan Zhang 
---
 drivers/crypto/scheduler/scheduler_pmd_ops.c | 490 +++
 1 file changed, 490 insertions(+)
 create mode 100644 drivers/crypto/scheduler/scheduler_pmd_ops.c

diff --git a/drivers/crypto/scheduler/scheduler_pmd_ops.c 
b/drivers/crypto/scheduler/scheduler_pmd_ops.c
new file mode 100644
index 000..56624c7
--- /dev/null
+++ b/drivers/crypto/scheduler/scheduler_pmd_ops.c
@@ -0,0 +1,490 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "scheduler_pmd_private.h"
+
+/** Configure device */
+static int
+scheduler_pmd_config(struct rte_cryptodev *dev)
+{
+   struct scheduler_ctx *sched_ctx = dev->data->dev_private;
+   uint32_t i;
+   int ret = 0;
+
+   for (i = 0; i < sched_ctx->nb_slaves; i++) {
+   uint8_t slave_dev_id = sched_ctx->slaves[i].dev_id;
+   struct rte_cryptodev *slave_dev =
+   rte_cryptodev_pmd_get_dev(slave_dev_id);
+
+   ret = (*slave_dev->dev_ops->dev_configure)(slave_dev);
+   if (ret < 0)
+   break;
+   }
+
+   return ret;
+}
+
+static int
+update_reorder_buff(struct rte_cryptodev *dev, uint16_t qp_id)
+{
+   struct scheduler_ctx *sched_ctx = dev->data->dev_private;
+   struct scheduler_qp_ctx *qp_ctx = dev->data->queue_pairs[qp_id];
+
+   if (sched_ctx->reordering_enabled) {
+   char reorder_buff_name[RTE_CRYPTODEV_NAME_MAX_LEN];
+   uint32_t buff_size = sched_ctx->nb_slaves * PER_SLAVE_BUFF_SIZE;
+
+   if (qp_ctx->reorder_buf) {
+   rte_reorder_free(qp_ctx->reorder_buf);
+   qp_ctx->reorder_buf = NULL;
+   }
+
+   if (!buff_size)
+   return 0;
+
+   if (snprintf(reorder_buff_name, RTE_CRYPTODEV_NAME_MAX_LEN,
+   "%s_rb_%u_%u", RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),
+   dev->data->dev_id, qp_id) < 0) {
+   CS_LOG_ERR("failed to create unique reorder buffer "
+   "name");
+   return -ENOMEM;
+   }
+
+   qp_ctx->reorder_buf = rte_reorder_create(reorder_buff_name,
+   rte_socket_id(), buff_size);
+   if (!qp_ctx->reorder_buf) {
+   CS_LOG_ERR("failed to create reorder buffer");
+   return -ENOMEM;
+   }
+   } else {
+   if (qp_ctx->reorder_buf) {
+   rte_reorder_free(qp_ctx->reorder_buf);
+   qp_ctx->reorder_buf = NULL;
+   }
+   }
+
+   return 0;
+}
+
+/** Start device */
+static int
+scheduler_pmd_start(struct rte_cryptodev *dev)
+{
+   struct scheduler_ctx *sched_ctx = dev->data->dev_private;
+   uint32_t i;
+   int ret;
+
+   if (dev->data->dev_started)
+   return 0;
+
+   for (i = 0; i < dev->data->nb_queue_pairs; i++) {
+   ret = update_reorder_buff(dev, i);
+   if (ret < 0) {
+

[dpdk-dev] [PATCH v6 08/11] crypto/scheduler: add scheduler PMD to DPDK compile system

2017-01-24 Thread Fan Zhang
Adds Makefile for scheduler cryptodev PMD, and updates existing
Makefiles. Different than other cryptodev PMDs, scheduler PMD
is required to be built as shared libraries.

Signed-off-by: Fan Zhang 
---
 drivers/crypto/Makefile   |  3 +-
 drivers/crypto/scheduler/Makefile | 66 +++
 mk/rte.app.mk |  6 +++-
 3 files changed, 73 insertions(+), 2 deletions(-)
 create mode 100644 drivers/crypto/scheduler/Makefile

diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index 77b02cf..a5a246b 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -36,6 +36,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB) += aesni_mb
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_ARMV8_CRYPTO) += armv8
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_OPENSSL) += openssl
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += scheduler
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_SNOW3G) += snow3g
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_KASUMI) += kasumi
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_ZUC) += zuc
diff --git a/drivers/crypto/scheduler/Makefile 
b/drivers/crypto/scheduler/Makefile
new file mode 100644
index 000..0cce6f2
--- /dev/null
+++ b/drivers/crypto/scheduler/Makefile
@@ -0,0 +1,66 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pmd_crypto_scheduler.a
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library version
+LIBABIVER := 1
+
+# versioning export map
+EXPORT_MAP := rte_pmd_crypto_scheduler_version.map
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_cryptodev_scheduler_operations.h
+SYMLINK-y-include += rte_cryptodev_scheduler.h
+
+# library source files
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += scheduler_pmd.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += scheduler_pmd_ops.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += rte_cryptodev_scheduler.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += scheduler_roundrobin.c
+
+# library dependencies
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_cryptodev
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_kvargs
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_mempool
+DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER) += lib/librte_reorder
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index a5daa84..0d0a970 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -70,7 +70,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PORT)   += -lrte_port
 
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP)  += -lrte_pdump
 _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)+= -lrte_distributor
-_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)+= -lrte_reorder
 _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG)+= -lrte_ip_frag
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)  += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED)  += -lrte_sched
@@ -99,10 +98,15 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)  

[dpdk-dev] [PATCH v6 10/11] app/test: add unit test for cryptodev scheduler PMD

2017-01-24 Thread Fan Zhang
Same as other cryptodev PMDs, it is necessary to carry out the unit
test for scheduler PMD. Currently the test is designed to attach 2
AESNI-MB cryptodev PMDs as slaves, sets the scheduling mode as round-
robin, and runs almost all AESNI-MB test items (except for sessionless
tests). In the end, the slaves are detached.

Signed-off-by: Fan Zhang 
---
 app/test/test_cryptodev.c   | 241 +++-
 app/test/test_cryptodev_aes_test_vectors.h  | 101 
 app/test/test_cryptodev_blockcipher.c   |   6 +-
 app/test/test_cryptodev_blockcipher.h   |   3 +-
 app/test/test_cryptodev_hash_test_vectors.h |  38 +++--
 5 files changed, 338 insertions(+), 51 deletions(-)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 0f0cf4d..bf44928 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2015-2017 Intel Corporation. All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
  *   modification, are permitted provided that the following conditions
@@ -40,6 +40,11 @@
 #include 
 #include 
 
+#ifdef RTE_LIBRTE_PMD_CRYPTO_SCHEDULER
+#include 
+#include 
+#endif
+
 #include "test.h"
 #include "test_cryptodev.h"
 
@@ -159,7 +164,7 @@ testsuite_setup(void)
 {
struct crypto_testsuite_params *ts_params = &testsuite_params;
struct rte_cryptodev_info info;
-   unsigned i, nb_devs, dev_id;
+   uint32_t i = 0, nb_devs, dev_id;
int ret;
uint16_t qp_id;
 
@@ -370,6 +375,29 @@ testsuite_setup(void)
}
}
 
+#ifdef RTE_LIBRTE_PMD_CRYPTO_SCHEDULER
+   if (gbl_cryptodev_type == RTE_CRYPTODEV_SCHEDULER_PMD) {
+
+#ifndef RTE_LIBRTE_PMD_AESNI_MB
+   RTE_LOG(ERR, USER1, "CONFIG_RTE_LIBRTE_PMD_AESNI_MB must be"
+   " enabled in config file to run this testsuite.\n");
+   return TEST_FAILED;
+#endif
+   nb_devs = rte_cryptodev_count_devtype(
+   RTE_CRYPTODEV_SCHEDULER_PMD);
+   if (nb_devs < 1) {
+   ret = rte_eal_vdev_init(
+   RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),
+   NULL);
+
+   TEST_ASSERT(ret == 0,
+   "Failed to create instance %u of"
+   " pmd : %s",
+   i, RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD));
+   }
+   }
+#endif /* RTE_LIBRTE_PMD_CRYPTO_SCHEDULER */
+
 #ifndef RTE_LIBRTE_PMD_QAT
if (gbl_cryptodev_type == RTE_CRYPTODEV_QAT_SYM_PMD) {
RTE_LOG(ERR, USER1, "CONFIG_RTE_LIBRTE_PMD_QAT must be enabled "
@@ -1535,6 +1563,58 @@ test_AES_chain_mb_all(void)
return TEST_SUCCESS;
 }
 
+#ifdef RTE_LIBRTE_PMD_CRYPTO_SCHEDULER
+
+static int
+test_AES_cipheronly_scheduler_all(void)
+{
+   struct crypto_testsuite_params *ts_params = &testsuite_params;
+   int status;
+
+   status = test_blockcipher_all_tests(ts_params->mbuf_pool,
+   ts_params->op_mpool, ts_params->valid_devs[0],
+   RTE_CRYPTODEV_SCHEDULER_PMD,
+   BLKCIPHER_AES_CIPHERONLY_TYPE);
+
+   TEST_ASSERT_EQUAL(status, 0, "Test failed");
+
+   return TEST_SUCCESS;
+}
+
+static int
+test_AES_chain_scheduler_all(void)
+{
+   struct crypto_testsuite_params *ts_params = &testsuite_params;
+   int status;
+
+   status = test_blockcipher_all_tests(ts_params->mbuf_pool,
+   ts_params->op_mpool, ts_params->valid_devs[0],
+   RTE_CRYPTODEV_SCHEDULER_PMD,
+   BLKCIPHER_AES_CHAIN_TYPE);
+
+   TEST_ASSERT_EQUAL(status, 0, "Test failed");
+
+   return TEST_SUCCESS;
+}
+
+static int
+test_authonly_scheduler_all(void)
+{
+   struct crypto_testsuite_params *ts_params = &testsuite_params;
+   int status;
+
+   status = test_blockcipher_all_tests(ts_params->mbuf_pool,
+   ts_params->op_mpool, ts_params->valid_devs[0],
+   RTE_CRYPTODEV_SCHEDULER_PMD,
+   BLKCIPHER_AUTHONLY_TYPE);
+
+   TEST_ASSERT_EQUAL(status, 0, "Test failed");
+
+   return TEST_SUCCESS;
+}
+
+#endif /* RTE_LIBRTE_PMD_CRYPTO_SCHEDULER */
+
 static int
 test_AES_chain_openssl_all(void)
 {
@@ -7292,6 +7372,150 @@ 
auth_decryption_AES128CBC_HMAC_SHA1_fail_tag_corrupt(void)
&aes128cbc_hmac_sha1_test_vector);
 }
 
+#ifdef RTE_LIBRTE_PMD_CRYPTO_SCHEDULER
+
+/* global AESNI slave IDs for the scheduler test */
+uint8_t aesni_ids[2];
+
+static int
+test_scheduler_attach_slave_op(void)
+{
+   struct crypto_testsuite_params *ts_params = &testsuite_params;
+   uint8_t sched_id = ts_params->valid_devs[0];
+   uint32_t nb_devs, qp_id, i, nb_devs_attached = 0;
+   int ret;
+   struct rte_cryptodev_config co

[dpdk-dev] [PATCH v6 09/11] crypto/scheduler: add scheduler PMD config options

2017-01-24 Thread Fan Zhang
Adds scheduler PMD enable and debug flags to config/common_base.

Signed-off-by: Fan Zhang 
---
 config/common_base | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/config/common_base b/config/common_base
index b9fb8e2..cd4a0f3 100644
--- a/config/common_base
+++ b/config/common_base
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -434,6 +434,12 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 CONFIG_RTE_LIBRTE_PMD_ZUC_DEBUG=n
 
 #
+# Compile PMD for Crypto Scheduler device
+#
+CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER=n
+CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER_DEBUG=n
+
+#
 # Compile PMD for NULL Crypto device
 #
 CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
-- 
2.7.4



[dpdk-dev] [PATCH v6 11/11] crypto/scheduler: add documentation

2017-01-24 Thread Fan Zhang
Adds the description of the cryptodev scheduler PMD overview,
limitations, build, instructions, modes, etc.

Signed-off-by: Fan Zhang 
---
 doc/guides/cryptodevs/img/scheduler-overview.svg | 277 +++
 doc/guides/cryptodevs/index.rst  |   3 +-
 doc/guides/cryptodevs/scheduler.rst  | 128 +++
 3 files changed, 407 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/cryptodevs/img/scheduler-overview.svg
 create mode 100644 doc/guides/cryptodevs/scheduler.rst

diff --git a/doc/guides/cryptodevs/img/scheduler-overview.svg 
b/doc/guides/cryptodevs/img/scheduler-overview.svg
new file mode 100644
index 000..82bb775
--- /dev/null
+++ b/doc/guides/cryptodevs/img/scheduler-overview.svg
@@ -0,0 +1,277 @@
+
+http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd";>
+
+http://www.w3.org/2000/svg"; 
xmlns:xlink="http://www.w3.org/1999/xlink"; 
xmlns:ev="http://www.w3.org/2001/xml-events";
+   
xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/"; 
width="6.81229in" height="3.40992in"
+   viewBox="0 0 490.485 245.514" xml:space="preserve" 
color-interpolation-filters="sRGB" class="st10">
+   
+
+   
+   
+   
+
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   Page-1
+   
+   
+   
+   Rounded Rectangle.55
+   User Application
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   User 
Application  
+   
+   Rounded Rectangle.135
+   Cryptodev
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   Cryptodev 
 
+   
+   Rounded Rectangle.136
+   Cryptodev
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+   
+ 

Re: [dpdk-dev] [PATCH] net/szedata2: fix incorrect device memory access

2017-01-24 Thread Ferruh Yigit
On 1/24/2017 3:55 PM, Matej Vido wrote:
> On 24.01.2017 16:11, Ferruh Yigit wrote:
>> On 1/24/2017 2:02 PM, Matej Vido wrote:
>>> On 24.01.2017 12:58, Ferruh Yigit wrote:
 On 1/24/2017 10:49 AM, Matej Vido wrote:
> Fixes: 8acba705b119 ("net/szedata2: localize handling of PCI resources")
>
> Signed-off-by: Matej Vido 
 Unrelated from this patch, in maintainers file, you have your other mail
 address: "Matej Vido ", do you want to update it?
>>> Hi Ferruh,
>>>
>>> yes, I will send the patch.
> ---
>drivers/net/szedata2/rte_eth_szedata2.h | 2 +-
>1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/szedata2/rte_eth_szedata2.h 
> b/drivers/net/szedata2/rte_eth_szedata2.h
> index b58adb6..afe8a38 100644
> --- a/drivers/net/szedata2/rte_eth_szedata2.h
> +++ b/drivers/net/szedata2/rte_eth_szedata2.h
> @@ -192,7 +192,7 @@ struct szedata {
>}
>
>#define SZEDATA2_PCI_RESOURCE_PTR(rsc, offset, type) \
> - ((type)((uint8_t *)(rsc)->addr) + (offset))
> + ((type)(((uint8_t *)(rsc)->addr) + (offset)))
 Although output will be same, (in all uses, type is a pointer),
>> Of course won't be same, please forget about it J, I am confused.
>>
>> So these two differs a lot, taking into account that offset numbers used
>> are big numbers (0x8000..), it should be accessing very unrelated addresses.
>>
>> So how this was working before?
> 
> The macro was fine before the patch [1]. It hasn't been working since 
> the acceptance of that patch, but I didn't manage to test the 
> functionality until now.

I see, thanks for clarification, seems broken relatively new, and thanks
for fixing.

> 
> [1] http://dpdk.org/ml/archives/dev/2016-December/053241.html
> 
> Regards,
> Matej
> 
>>
>>> this
 seems the intention, so:

 Reviewed-by: Ferruh Yigit 

 btw, following will do same, right, not sure if it is better:
 ((type)(rsc)->addr + (offset))
>>> This is also wrong. The intention of the macro is to add an offset to
>>> the base address and typecast the result.
>>>
>>> Regards,
>>> Matej
>
>enum szedata2_link_speed {
>   SZEDATA2_LINK_SPEED_DEFAULT = 0,
>
> 



Re: [dpdk-dev] [RFC 0/8] mbuf: structure reorganization

2017-01-24 Thread Olivier MATZ
On Tue, 24 Jan 2017 15:59:08 +, Bruce Richardson
 wrote:
> On Tue, Jan 24, 2017 at 04:19:25PM +0100, Olivier Matz wrote:
> > Based on discussion done in [1], this patchset reorganizes the mbuf.
> >   
> 
> Hi Olivier,
> 
> thanks for all the work on this. From a quick scan of the patches, and
> the description below, it looks like a good set of changes. Comments
> below to see about kick-starting some further discussion about some of
> the other changes you propose.
> 
> > The main changes are:
> > - reorder structure to increase vector performance on some non-ia
> >   platforms.
> > - add a 64bits timestamp field in the 1st cache line
> > - m->next, m->nb_segs, and m->refcnt are always initialized for
> > mbufs in the pool, avoiding the need of setting m->next (located in
> > the 2nd cache line) in the Rx path for mono-segment packets.
> > - change port and nb_segs to 16 bits
> > - move seqn in the 2nd cache line
> > 
> > Things discussed but not done in the patchset:
> > - move refcnt and nb_segs to the 2nd cache line: many drivers sets
> >   them in the Rx path, so it could introduce a performance
> > regression, or it would require to change all the drivers, which is
> > not an easy task.  
> 
> But if we do make this change and update the drivers, some of them
> should perform a little better, since they do fewer writes. I don't
> think the fastest vector drivers will be affected, since they already
> coalesce the writes to these fields with other writes, but others
> drivers may well be improved by the change.

Yes, that's something I forgot to say in the cover letter: after this
patchset, the Rx path of drivers could be optimized a bit by removing
writes to m->next, m->nb_segs and m->refcnt. The patch 4/8 gives an
idea of what could be done.

Once most drivers are updated, we could reconsider moving nb_segs and
refcnt in the second cache line.

> 
> > - remove the m->port field: too much impact on many examples and
> > libraries, and some people highlighted they are using it.
> > - moving m->next in the 1st cache line: there is not enough room,
> > and having it set to NULL for unused mbuf should remove the need
> > for it.  
> 
> I agree.
> 
> > - merge seqn and timestamp together in a union: we could imagine
> > use cases were both are activated. There is no flag indicating the
> > presence of seqn, so it looks preferable to keep them separated for
> > now.  
> 
> What were the use-cases? If we have a timestamp, surely sequence can
> be determined from that? Even if you use the TSC as a timestamp per
> burst, you can still sequence the packets cheaply by just adding 1 to
> each subsequent value.

Assuming the timestamp is in nanosecond, it is not a sequence number,
so I'm not sure it should be hijacked for this purpose. A timestamp can
be used to reorder packets, but having a sequence number is better
because you can be sure that when you get packets 1, 3, 2, 0 that no
packet is missing between 0 and 3.

For that reason, I guess both features could be used at the same time.

Regards,
Olivier


Re: [dpdk-dev] [PATCH] mbuf: remove redundant line in rte_pktmbuf_attach

2017-01-24 Thread Olivier MATZ
On Tue, 24 Jan 2017 19:57:13 +0400, Ilya Matveychikov
 wrote:
> > On Jan 24, 2017, at 4:56 PM, Olivier MATZ 
> > wrote:
> > 
> > Hi,
> > 
> > On Sat, 21 Jan 2017 16:28:29 +, "Ananyev, Konstantin"
> >  wrote:  
> >>> -Original Message-
> >>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ilya
> >>> Matveychikov Sent: Saturday, January 21, 2017 3:08 PM
> >>> To: Yigit, Ferruh 
> >>> Cc: dev@dpdk.org
> >>> Subject: Re: [dpdk-dev] [PATCH] mbuf: remove redundant line in
> >>> rte_pktmbuf_attach
> >>> 
> >>>   
>  On Jan 20, 2017, at 4:08 PM, Ferruh Yigit
>   wrote:
>  
>  On 1/20/2017 12:19 AM, Ilya Matveychikov wrote:
> > mi->next will be assigned to NULL few lines later, trivial patch
> > 
> > Signed-off-by: Ilya V. Matveychikov 
> > ---
> > lib/librte_mbuf/rte_mbuf.h | 1 -
> > 1 file changed, 1 deletion(-)
> > 
> > diff --git a/lib/librte_mbuf/rte_mbuf.h
> > b/lib/librte_mbuf/rte_mbuf.h index ead7c6e..5589d54 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -1139,7 +1139,6 @@ static inline void
> > rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
> > mi->buf_addr = m->buf_addr; mi->buf_len = m->buf_len;
> > 
> > -   mi->next = m->next;
>    
> > 
> > Fixes: ea672a8b1655 ("mbuf: remove the rte_pktmbuf structure")
> > 
> > Acked-by: Olivier Matz 
> > 
> >   
>  Do you know why attaching mbuf is not supporting
>  multi-segment?
> >> 
> >> This is supported, but you have to do it segment by segment.
> >> Actually  rte_pktmbuf_clone() does that.
> >> Konstantin
> >> 
> >>   
>  Perhaps this can be documented in function comment, as one of the
>  "not supported" items.
> >>> 
> >>> No, I don’t know. For my application I’ve found that nb_segs with
> >>> it’s limit in 256 segments is very annoying and I’ve decided not
> >>> to use DPDK functions that dealt with nb_segs… But it is not
> >>> about the rte_pktmbuf_attach() function and the patch.   
> > 
> > 
> > Out of curiosity, can you explain why your application needs more
> > than 256 segments? When we were discussing the possibility of
> > extending this field to 16 bits, Konstantin convinced me that it
> > was not so useful.  
> 
> In my application I need to do IPv4 fragments reassembly. There is no
> explicit limit of number of fragments in datagram, so I’m trying to
> avoid any limitations and `nb_segs` here is a constraint for me.
> Expanding it from 8-bit to 16-bit can solve that issue for me. But I
> don’t remember are there any  other places in DPDK where we need to
> know how many segments are in the packet? I mean that is `nb_segs`
> required at all?
> 

Yes, it is used for instance in some PMDs to know how many tx ring
descriptors are needed to send a packet.

Thank you for the explanation. As you probably seen, I'm proposing to
extend the nb_segs to 16 bits in my latest RFC patchset.

Regards,
Olivier


[dpdk-dev] [PATCH] eal: fix wrong log at startup

2017-01-24 Thread Olivier Matz
The log "Debug logs available - lower performance" should
now only be displayed when dataplane debug logs are enabled.

The issue occurs only if the default log level (CONFIG_RTE_LOG_LEVEL) is
set to DEBUG in the configuration, which is not the case by default.

Fixes: 5d8f0baf69ea ("log: do not drop debug logs at compile time")

Signed-off-by: Olivier Matz 
---
 lib/librte_eal/common/eal_common_log.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c 
b/lib/librte_eal/common/eal_common_log.c
index e45d326..2197558 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -176,7 +176,8 @@ eal_log_set_default(FILE *default_log)
 {
default_log_stream = default_log;
 
-#if RTE_LOG_LEVEL >= RTE_LOG_DEBUG
-   RTE_LOG(NOTICE, EAL, "Debug logs available - lower performance\n");
+#if RTE_LOG_DP_LEVEL >= RTE_LOG_DEBUG
+   RTE_LOG(NOTICE, EAL,
+   "Debug dataplane logs available - lower performance\n");
 #endif
 }
-- 
2.8.1



[dpdk-dev] [PATCH v7 00/11] crypto/scheduler: add driver for scheduler crypto pmd

2017-01-24 Thread Fan Zhang
This patch provides the initial implementation of the scheduler poll mode
driver using DPDK cryptodev framework.

Scheduler PMD is used to schedule and enqueue the crypto ops to the
hardware and/or software crypto devices attached to it (slaves). The
dequeue operation from the slave(s), and the possible dequeued crypto op
reordering, are then carried out by the scheduler.

As the initial version, the scheduler PMD currently supports only the
Round-robin mode, which distributes the enqueued burst of crypto ops
among its slaves in a round-robin manner. This mode may help to fill
the throughput gap between the physical core and the existing cryptodevs
to increase the overall performance. Moreover, the scheduler PMD is
provided the APIs for user to create his/her own scheduler.

Build instructions:
To build DPDK with CRYTPO_SCHEDULER_PMD the user is required to set
CONFIG_RTE_LIBRTE_PMD_CRYPTO_SCHEDULER=y in config/common_base

Notice:
- Scheduler PMD shares same EAL commandline options as other cryptodevs.
  However, apart from socket_id, the rest of cryptodev options are
  ignored. The scheduler PMD's max_nb_queue_pairs and max_nb_sessions
  options are set as the minimum values of the attached slaves'. For
  example, a scheduler cryptodev is attached 2 cryptodevs with
  max_nb_queue_pairs of 2 and 8, respectively. The scheduler cryptodev's
  max_nb_queue_pairs will be automatically updated as 2.

- In addition, an extra option "slave" is added. The user can attach one
  or more slave cryptodevs initially by passing their names with this
  option. Here is an example:

  ... --vdev "crypto_aesni_mb_pmd,name=aesni_mb_1" --vdev "crypto_aesni_
  mb_pmd,name=aesni_mb_2" --vdev "crypto_scheduler_pmd,slave=aesni_mb_1,
  slave=aesni_mb_2" ...

  Remember the software cryptodevs to be attached shall be declared before
  the scheduler PMD, otherwise the scheduler will fail to locate the
  slave(s) and report error.

- The scheduler cryptodev cannot be started unless the scheduling mode
  is set and at least one slave is attached. Also, to configure the
  scheduler in the run-time, like attach/detach slave(s), change
  scheduling mode, or enable/disable crypto op ordering, one should stop
  the scheduler first, otherwise an error will be returned.

- Enabling crypto ops reordering will cause overwriting the userdata field
  of each mbuf.

Fan Zhang (11):

Changes in v7:
Added missed sign-off

Changes in v6:
Split into multiple patches.
Added documentation.
Added unit test.

Changes in v5:
Fixed EOF whitespace warning.
Updated Copyright.

Changes in v4:
Fixed a few bugs.
Added slave EAL commandline option support.

Changes in v3:
Fixed config/common_base.

Changes in v2:
New approaches in API to suit future scheduling modes.

Fan Zhang (11):
  cryptodev: add scheduler PMD name and type
  crypto/scheduler: add APIs for scheduler
  crypto/scheduler: add internal structure declarations
  crypto/scheduler: add scheduler API implementations
  crypto/scheduler: add round-robin scheduling mode
  crypto/scheduler: register scheduler vdev driver
  crypto/scheduler: register operation function pointer table
  crypto/scheduler: add scheduler PMD to DPDK compile system
  crypto/scheduler: add scheduler PMD config options
  app/test: add unit test for cryptodev scheduler PMD
  crypto/scheduler: add documentation

 app/test/test_cryptodev.c  | 241 +-
 app/test/test_cryptodev_aes_test_vectors.h | 101 +++--
 app/test/test_cryptodev_blockcipher.c  |   6 +-
 app/test/test_cryptodev_blockcipher.h  |   3 +-
 app/test/test_cryptodev_hash_test_vectors.h|  38 +-
 config/common_base |   8 +-
 doc/guides/cryptodevs/img/scheduler-overview.svg   | 277 
 doc/guides/cryptodevs/index.rst|   3 +-
 doc/guides/cryptodevs/scheduler.rst| 128 ++
 drivers/crypto/Makefile|   3 +-
 drivers/crypto/scheduler/Makefile  |  66 +++
 drivers/crypto/scheduler/rte_cryptodev_scheduler.c | 471 
 drivers/crypto/scheduler/rte_cryptodev_scheduler.h | 165 +++
 .../scheduler/rte_cryptodev_scheduler_operations.h |  71 +++
 .../scheduler/rte_pmd_crypto_scheduler_version.map |  12 +
 drivers/crypto/scheduler/scheduler_pmd.c   | 361 +++
 drivers/crypto/scheduler/scheduler_pmd_ops.c   | 490 +
 drivers/crypto/scheduler/scheduler_pmd_private.h   | 115 +
 drivers/crypto/scheduler/scheduler_roundrobin.c| 435 ++
 lib/librte_cryptodev/rte_cryptodev.h   |   3 +
 mk/rte.app.mk  |   6 +-
 21 files changed, 2948 insertions(+), 55 deletions(-)
 create mode 100644 doc/guides/cryptodevs/img/scheduler-overview.svg
 create mode 100644 doc/guides/cryptodevs/scheduler.rst
 create mode 100644 drivers/crypto/scheduler/Makefile
 create mode 100644 drivers/crypto/sc

[dpdk-dev] [PATCH v7 01/11] cryptodev: add scheduler PMD name and type

2017-01-24 Thread Fan Zhang
This patch adds the cryptodev scheduler PMD name and type identifier to
librte_cryptodev.

Signed-off-by: Fan Zhang 
---
 lib/librte_cryptodev/rte_cryptodev.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index f284668..618f302 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -68,6 +68,8 @@ extern "C" {
 /**< KASUMI PMD device name */
 #define CRYPTODEV_NAME_ARMV8_PMD   crypto_armv8
 /**< ARMv8 Crypto PMD device name */
+#define CRYPTODEV_NAME_SCHEDULER_PMD   crypto_scheduler
+/**< Scheduler Crypto PMD device name */
 
 /** Crypto device type */
 enum rte_cryptodev_type {
@@ -80,6 +82,7 @@ enum rte_cryptodev_type {
RTE_CRYPTODEV_ZUC_PMD,  /**< ZUC PMD */
RTE_CRYPTODEV_OPENSSL_PMD,/**<  OpenSSL PMD */
RTE_CRYPTODEV_ARMV8_PMD,/**< ARMv8 crypto PMD */
+   RTE_CRYPTODEV_SCHEDULER_PMD,/**< Crypto Scheduler PMD */
 };
 
 extern const char **rte_cyptodev_names;
-- 
2.7.4



[dpdk-dev] [PATCH v7 03/11] crypto/scheduler: add internal structure declarations

2017-01-24 Thread Fan Zhang
Adds a number of internal structures for the cryptodev scheduler PMD. The
structures include the scheduler context, slave, queue pair context,
and session.

Signed-off-by: Fan Zhang 
Signed-off-by: Declan Doherty 
---
 drivers/crypto/scheduler/scheduler_pmd_private.h | 115 +++
 1 file changed, 115 insertions(+)
 create mode 100644 drivers/crypto/scheduler/scheduler_pmd_private.h

diff --git a/drivers/crypto/scheduler/scheduler_pmd_private.h 
b/drivers/crypto/scheduler/scheduler_pmd_private.h
new file mode 100644
index 000..ac4690e
--- /dev/null
+++ b/drivers/crypto/scheduler/scheduler_pmd_private.h
@@ -0,0 +1,115 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _SCHEDULER_PMD_PRIVATE_H
+#define _SCHEDULER_PMD_PRIVATE_H
+
+#include 
+#include 
+#include 
+
+/**< Maximum number of bonded devices per devices */
+#ifndef MAX_SLAVES_NUM
+#define MAX_SLAVES_NUM (8)
+#endif
+
+#define PER_SLAVE_BUFF_SIZE(256)
+
+#define CS_LOG_ERR(fmt, args...)   \
+   RTE_LOG(ERR, CRYPTODEV, "[%s] %s() line %u: " fmt "\n", \
+   RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),  \
+   __func__, __LINE__, ## args)
+
+#ifdef RTE_LIBRTE_CRYPTO_SCHEDULER_DEBUG
+#define CS_LOG_INFO(fmt, args...)  \
+   RTE_LOG(INFO, CRYPTODEV, "[%s] %s() line %u: " fmt "\n",\
+   RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),  \
+   __func__, __LINE__, ## args)
+
+#define CS_LOG_DBG(fmt, args...)   \
+   RTE_LOG(DEBUG, CRYPTODEV, "[%s] %s() line %u: " fmt "\n",   \
+   RTE_STR(CRYPTODEV_NAME_SCHEDULER_PMD),  \
+   __func__, __LINE__, ## args)
+#else
+#define CS_LOG_INFO(fmt, args...)
+#define CS_LOG_DBG(fmt, args...)
+#endif
+
+struct scheduler_slave {
+   uint8_t dev_id;
+   uint16_t qp_id;
+   uint32_t nb_inflight_cops;
+
+   enum rte_cryptodev_type dev_type;
+};
+
+struct scheduler_ctx {
+   void *private_ctx;
+   /**< private scheduler context pointer */
+
+   struct rte_cryptodev_capabilities *capabilities;
+   uint32_t nb_capabilities;
+
+   uint32_t max_nb_queue_pairs;
+
+   struct scheduler_slave slaves[MAX_SLAVES_NUM];
+   uint32_t nb_slaves;
+
+   enum rte_cryptodev_scheduler_mode mode;
+
+   struct rte_cryptodev_scheduler_ops ops;
+
+   uint8_t reordering_enabled;
+
+   char name[RTE_CRYPTODEV_SCHEDULER_NAME_MAX_LEN];
+   char description[RTE_CRYPTODEV_SCHEDULER_DESC_MAX_LEN];
+} __rte_cache_aligned;
+
+struct scheduler_qp_ctx {
+   void *private_qp_ctx;
+
+   rte_cryptodev_scheduler_burst_enqueue_t schedule_enqueue;
+   rte_cryptodev_scheduler_burst_dequeue_t schedule_dequeue;
+
+   struct rte_reorder_buffer *reorder_buf;
+   uint32_t seqn;
+} __rte_cache_aligned;
+
+struct scheduler_session {
+   struct rte_cryptodev_sym_session *sessions[MAX_SLAVES_NUM];
+};
+
+/** device specific operations function pointer structure */
+extern struct rte_cryptodev_ops *rte_crypto_scheduler_pmd_ops;
+
+#endif /* _SCHEDULER_PMD_PRIVATE_H */
-- 
2.7.4



[dpdk-dev] [PATCH v7 04/11] crypto/scheduler: add scheduler API implementations

2017-01-24 Thread Fan Zhang
Adds the implementations of the APIs for scheduler cryptodev PMD.

Signed-off-by: Fan Zhang 
---
 drivers/crypto/scheduler/rte_cryptodev_scheduler.c | 464 +
 1 file changed, 464 insertions(+)
 create mode 100644 drivers/crypto/scheduler/rte_cryptodev_scheduler.c

diff --git a/drivers/crypto/scheduler/rte_cryptodev_scheduler.c 
b/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
new file mode 100644
index 000..14f0983
--- /dev/null
+++ b/drivers/crypto/scheduler/rte_cryptodev_scheduler.c
@@ -0,0 +1,464 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "scheduler_pmd_private.h"
+
+/** update the scheduler pmd's capability with attaching device's
+ *  capability.
+ *  For each device to be attached, the scheduler's capability should be
+ *  the common capability set of all slaves
+ **/
+static uint32_t
+sync_caps(struct rte_cryptodev_capabilities *caps,
+   uint32_t nb_caps,
+   const struct rte_cryptodev_capabilities *slave_caps)
+{
+   uint32_t sync_nb_caps = nb_caps, nb_slave_caps = 0;
+   uint32_t i;
+
+   while (slave_caps[nb_slave_caps].op != RTE_CRYPTO_OP_TYPE_UNDEFINED)
+   nb_slave_caps++;
+
+   if (nb_caps == 0) {
+   rte_memcpy(caps, slave_caps, sizeof(*caps) * nb_slave_caps);
+   return nb_slave_caps;
+   }
+
+   for (i = 0; i < sync_nb_caps; i++) {
+   struct rte_cryptodev_capabilities *cap = &caps[i];
+   uint32_t j;
+
+   for (j = 0; j < nb_slave_caps; j++) {
+   const struct rte_cryptodev_capabilities *s_cap =
+   &slave_caps[i];
+
+   if (s_cap->op != cap->op || s_cap->sym.xform_type !=
+   cap->sym.xform_type)
+   continue;
+
+   if (s_cap->sym.xform_type ==
+   RTE_CRYPTO_SYM_XFORM_AUTH) {
+   if (s_cap->sym.auth.algo !=
+   cap->sym.auth.algo)
+   continue;
+
+   cap->sym.auth.digest_size.min =
+   s_cap->sym.auth.digest_size.min <
+   cap->sym.auth.digest_size.min ?
+   s_cap->sym.auth.digest_size.min :
+   cap->sym.auth.digest_size.min;
+   cap->sym.auth.digest_size.max =
+   s_cap->sym.auth.digest_size.max <
+   cap->sym.auth.digest_size.max ?
+   s_cap->sym.auth.digest_size.max :
+   cap->sym.auth.digest_size.max;
+
+   }
+
+   if (s_cap->sym.xform_type ==
+   RTE_CRYPTO_SYM_XFORM_CIPHER)
+   if (s_cap->sym.cipher.algo !=
+   cap->sym.cipher.algo)
+   continue;
+
+   /* no common cap found */
+   break;
+

  1   2   >