[dpdk-dev] [PATCH v2] mem: limit use of address hint

2019-03-31 Thread Shahaf Shuler
patch[1] added an address hint as starting address for 64 bit systems in
case an explicit base virtual address was not set by the user.

The justification for such hint was to help devices that work in VA
mode and has a address range limitation to work smoothly with the eal
memory subsystem.

While the base address value selected may work fine for the eal
initialization, it easily breaks when trying to register external memory
using rte_extmem_register API.

Trying to register anonymous memory on RH x86_64 machine took several
minutes, during them the function eal_get_virtual_area repeatedly
scanned for a good VA candidate.

The attempt to guess which VA address will be free for mapping will
always result in not portable, error prone code:
* different application may use different libraries along w/ DPDK. One
  can never guess which library was called first and how much virtual
  memory it consumed.
* external memory can be registered at any time in the application run
  time.

In order not to break the existing secondary process design, this patch
only limits the max number of tries that will be done with the
address hint.
When the number of tries exceeds the threshold the code
will use the suggested address from kernel.

Fixes: 1df21702873d ("mem: use address hint for mapping hugepages")
Cc: sta...@dpdk.org
Cc: alejandro.luc...@netronome.com

[1] commit 1df21702873d ("mem: use address hint for mapping hugepages")

Signed-off-by: Shahaf Shuler 
---

On v2:
 * instead of a complete remove of the hint limit the number of tries we allow.
---
 lib/librte_eal/common/eal_common_memory.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c 
b/lib/librte_eal/common/eal_common_memory.c
index c9da69b164..5ae8d0124d 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -56,6 +56,7 @@ static uint64_t system_page_sz;
 static uint64_t baseaddr = 0x1;
 #endif
 
+#define MAX_MMAP_WITH_DEFINED_ADDR_TRIES 5
 void *
 eal_get_virtual_area(void *requested_addr, size_t *size,
size_t page_sz, int flags, int mmap_flags)
@@ -63,6 +64,7 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
bool addr_is_hint, allow_shrink, unmap, no_align;
uint64_t map_sz;
void *mapped_addr, *aligned_addr;
+   uint8_t try = 0;
 
if (system_page_sz == 0)
system_page_sz = sysconf(_SC_PAGESIZE);
@@ -118,11 +120,14 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
 
if (mapped_addr != MAP_FAILED && addr_is_hint &&
mapped_addr != requested_addr) {
-   /* hint was not used. Try with another offset */
-   munmap(mapped_addr, map_sz);
-   mapped_addr = MAP_FAILED;
+   try++;
next_baseaddr = RTE_PTR_ADD(next_baseaddr, page_sz);
-   requested_addr = next_baseaddr;
+   if (try <= MAX_MMAP_WITH_DEFINED_ADDR_TRIES) {
+   /* hint was not used. Try with another offset */
+   munmap(mapped_addr, map_sz);
+   mapped_addr = MAP_FAILED;
+   requested_addr = next_baseaddr;
+   }
}
} while ((allow_shrink || addr_is_hint) &&
 mapped_addr == MAP_FAILED && *size > 0);
-- 
2.12.0



[dpdk-dev] [PATCH] net/mlx5: fix typos in comments

2019-03-31 Thread Dekel Peled
Correct typing mistake in several locations:
ernno ==> errno

Fixes: 23c1d42c7138 ("net/mlx5: split flow validation to dedicated function")
Cc: sta...@dpdk.org

Signed-off-by: Dekel Peled 
---
 drivers/net/mlx5/mlx5_flow.c   | 14 +++---
 drivers/net/mlx5/mlx5_flow_dv.c|  6 +++---
 drivers/net/mlx5/mlx5_flow_verbs.c |  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index dea38e2..0b7b569 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -786,7 +786,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, 
int32_t priority,
  *   Pointer to error structure.
  *
  * @return
- *   0 on success, a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 int
 mlx5_flow_validate_action_drop(uint64_t action_flags,
@@ -829,7 +829,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, 
int32_t priority,
  *   Pointer to error structure.
  *
  * @return
- *   0 on success, a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 int
 mlx5_flow_validate_action_queue(const struct rte_flow_action *action,
@@ -883,7 +883,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, 
int32_t priority,
  *   Pointer to error structure.
  *
  * @return
- *   0 on success, a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 int
 mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
@@ -976,7 +976,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, 
int32_t priority,
  *   Pointer to error structure.
  *
  * @return
- *   0 on success, a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 int
 mlx5_flow_validate_action_count(struct rte_eth_dev *dev __rte_unused,
@@ -1798,7 +1798,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev 
*dev, int32_t priority,
  *   Pointer to the error structure.
  *
  * @return
- *   0 on success, a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static inline int
 flow_drv_validate(struct rte_eth_dev *dev,
@@ -1837,7 +1837,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev 
*dev, int32_t priority,
  *   Pointer to the error structure.
  *
  * @return
- *   Pointer to device flow on success, otherwise NULL and rte_ernno is set.
+ *   Pointer to device flow on success, otherwise NULL and rte_errno is set.
  */
 static inline struct mlx5_flow *
 flow_drv_prepare(const struct rte_flow *flow,
@@ -1881,7 +1881,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev 
*dev, int32_t priority,
  *   Pointer to the error structure.
  *
  * @return
- *   0 on success, a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static inline int
 flow_drv_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 7324e33..2b87a96 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1815,7 +1815,7 @@ struct field_modify_info modify_tcp[] = {
  *   Pointer to the error structure.
  *
  * @return
- *   0 on success, a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
 flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr,
@@ -2209,7 +2209,7 @@ struct field_modify_info modify_tcp[] = {
  *
  * @return
  *   Pointer to mlx5_flow object on success,
- *   otherwise NULL and rte_ernno is set.
+ *   otherwise NULL and rte_errno is set.
  */
 static struct mlx5_flow *
 flow_dv_prepare(const struct rte_flow_attr *attr __rte_unused,
@@ -3039,7 +3039,7 @@ struct field_modify_info modify_tcp[] = {
  *   Pointer to the error structure.
  *
  * @return
- *   0 on success, a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
 flow_dv_translate(struct rte_eth_dev *dev,
diff --git a/drivers/net/mlx5/mlx5_flow_verbs.c 
b/drivers/net/mlx5/mlx5_flow_verbs.c
index 6c4f52f..82a1750 100644
--- a/drivers/net/mlx5/mlx5_flow_verbs.c
+++ b/drivers/net/mlx5/mlx5_flow_verbs.c
@@ -1383,7 +1383,7 @@
  *   Pointer to the error structure.
  *
  * @return
- *   0 on success, else a negative errno value otherwise and rte_ernno is set.
+ *   0 on success, else a negative errno value otherwise and rte_errno is set.
  */
 static int
 flow_verbs_translate(struct rte_eth_dev *dev,
-- 
1.8.3.1



[dpdk-dev] [PATCH] doc: fix typos in mlx5 doc

2019-03-31 Thread Dekel Peled
Correct typing mistakes:
appiled ==> applied
tarffic ==> traffic

Fixes: 0280f2812284 ("doc: add mlx5 E-Switch VXLAN tunnels limitations")
Cc: sta...@dpdk.org

Signed-off-by: Dekel Peled 
---
 doc/guides/nics/mlx5.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 0200373..f4db921 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -149,7 +149,7 @@ Limitations
 
 - E-Switch VXLAN decapsulation Flow:
 
-  - can be appiled to PF port only.
+  - can be applied to PF port only.
   - must specify VF port action (packet redirection from PF to VF).
   - must specify tunnel outer UDP local (destination) port, wildcards not 
allowed.
   - must specify tunnel outer VNI, wildcards not allowed.
@@ -319,7 +319,7 @@ Run-time configuration
   buffers per a packet, one large buffer is posted in order to receive multiple
   packets on the buffer. A MPRQ buffer consists of multiple fixed-size strides
   and each stride receives one packet. MPRQ can improve throughput for
-  small-packet tarffic.
+  small-packet traffic.
 
   When MPRQ is enabled, max_rx_pkt_len can be larger than the size of
   user-provided mbuf even if DEV_RX_OFFLOAD_SCATTER isn't enabled. PMD will
@@ -330,7 +330,7 @@ Run-time configuration
 - ``mprq_log_stride_num`` parameter [int]
 
   Log 2 of the number of strides for Multi-Packet Rx queue. Configuring more
-  strides can reduce PCIe tarffic further. If configured value is not in the
+  strides can reduce PCIe traffic further. If configured value is not in the
   range of device capability, the default value will be set with a warning
   message. The default value is 4 which is 16 strides per a buffer, valid only
   if ``mprq_en`` is set.
-- 
1.8.3.1



[dpdk-dev] [PATCH] doc: fix typos in testpmd user guide

2019-03-31 Thread Dekel Peled
Correct typing mistakes in several places.
1) "Those command will" ==> "These commands will"
2) icmp6_nd_opt_sla_eth ==> icmp6_nd_opt_tla_eth
   (change 's' to 't' for 'target' where applicable)

Fixes: 1960be7d32f8 ("app/testpmd: add VXLAN encap/decap")
Cc: sta...@dpdk.org

Signed-off-by: Dekel Peled 
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst 
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index c6f8b2c..9de3bf7 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1587,7 +1587,7 @@ Configure the outer layer to encapsulate a packet inside 
a VXLAN tunnel::
  udp-dst (udp-dst) ip-tos (ip-tos) ip-ttl (ip-ttl) ip-src (ip-src) \
  ip-dst (ip-dst) eth-src (eth-src) eth-dst (eth-dst)
 
-Those command will set an internal configuration inside testpmd, any following
+These commands will set an internal configuration inside testpmd, any following
 flow rule using the action vxlan_encap will use the last configuration set.
 To have a different encapsulation header, one of those commands must be called
 before the flow rule creation.
@@ -1602,7 +1602,7 @@ Configure the outer layer to encapsulate a packet inside 
a NVGRE tunnel::
  set nvgre-with-vlan ip-version (ipv4|ipv6) tni (tni) ip-src (ip-src) \
 ip-dst (ip-dst) vlan-tci (vlan-tci) eth-src (eth-src) eth-dst (eth-dst)
 
-Those command will set an internal configuration inside testpmd, any following
+These commands will set an internal configuration inside testpmd, any following
 flow rule using the action nvgre_encap will use the last configuration set.
 To have a different encapsulation header, one of those commands must be called
 before the flow rule creation.
@@ -1645,7 +1645,7 @@ Configure the outer layer to encapsulate a packet inside 
a MPLSoGRE tunnel::
 ip-src (ip-src) ip-dst (ip-dst) vlan-tci (vlan-tci) \
 eth-src (eth-src) eth-dst (eth-dst)
 
-Those command will set an internal configuration inside testpmd, any following
+These commands will set an internal configuration inside testpmd, any following
 flow rule using the action mplsogre_encap will use the last configuration set.
 To have a different encapsulation header, one of those commands must be called
 before the flow rule creation.
@@ -1658,7 +1658,7 @@ Configure the outer layer to decapsulate MPLSoGRE packet::
  set mplsogre_decap ip-version (ipv4|ipv6)
  set mplsogre_decap-with-vlan ip-version (ipv4|ipv6)
 
-Those command will set an internal configuration inside testpmd, any following
+These commands will set an internal configuration inside testpmd, any following
 flow rule using the action mplsogre_decap will use the last configuration set.
 To have a different decapsulation header, one of those commands must be called
 before the flow rule creation.
@@ -1675,7 +1675,7 @@ Configure the outer layer to encapsulate a packet inside 
a MPLSoUDP tunnel::
 udp-src (udp-src) udp-dst (udp-dst) ip-src (ip-src) ip-dst (ip-dst) \
 vlan-tci (vlan-tci) eth-src (eth-src) eth-dst (eth-dst)
 
-Those command will set an internal configuration inside testpmd, any following
+These commands will set an internal configuration inside testpmd, any following
 flow rule using the action mplsoudp_encap will use the last configuration set.
 To have a different encapsulation header, one of those commands must be called
 before the flow rule creation.
@@ -1688,7 +1688,7 @@ Configure the outer layer to decapsulate MPLSoUDP packet::
  set mplsoudp_decap ip-version (ipv4|ipv6)
  set mplsoudp_decap-with-vlan ip-version (ipv4|ipv6)
 
-Those command will set an internal configuration inside testpmd, any following
+These commands will set an internal configuration inside testpmd, any following
 flow rule using the action mplsoudp_decap will use the last configuration set.
 To have a different decapsulation header, one of those commands must be called
 before the flow rule creation.
@@ -3709,7 +3709,7 @@ This section lists supported pattern items and their 
attributes, if any.
 
   - ``sla {MAC-48}``: source Ethernet LLA.
 
-- ``icmp6_nd_opt_sla_eth``: match ICMPv6 neighbor discovery target Ethernet
+- ``icmp6_nd_opt_tla_eth``: match ICMPv6 neighbor discovery target Ethernet
   link-layer address option.
 
   - ``tla {MAC-48}``: target Ethernet LLA.
-- 
1.8.3.1



[dpdk-dev] [PATCH] app/testpmd: fix typo in comment

2019-03-31 Thread Dekel Peled
Correct minor typing mistake:
then ==> the

Fixes: 79bec05b32b7 ("app/testpmd: add ability to split outgoing packets")
Cc: sta...@dpdk.org

Signed-off-by: Dekel Peled 
---
 app/test-pmd/csumonly.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index ffeee20..f4f2a7b 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -575,7 +575,7 @@ struct simple_gre_hdr {
 
 /*
  * Allocate a new mbuf with up to tx_pkt_nb_segs segments.
- * Copy packet contents and offload information into then new segmented mbuf.
+ * Copy packet contents and offload information into the new segmented mbuf.
  */
 static struct rte_mbuf *
 pkt_copy_split(const struct rte_mbuf *pkt)
-- 
1.8.3.1



[dpdk-dev] [PATCH] rte_ethdev: fix typos in error log message

2019-03-31 Thread Dekel Peled
Correct minor typing mistake:
pre-queue ==> per-queue

Fixes: bea1e0c70cfc ("ethdev: convert static log type usage to dynamic")
Cc: sta...@dpdk.org

Signed-off-by: Dekel Peled 
---
 lib/librte_ethdev/rte_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 12b66b6..a5462b7 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -1651,7 +1651,7 @@ struct rte_eth_dev *
 local_conf.offloads) {
RTE_ETHDEV_LOG(ERR,
"Ethdev port_id=%d rx_queue_id=%d, new added offloads 
0x%"PRIx64" must be "
-   "within pre-queue offload capabilities 0x%"PRIx64" in 
%s()\n",
+   "within per-queue offload capabilities 0x%"PRIx64" in 
%s()\n",
port_id, rx_queue_id, local_conf.offloads,
dev_info.rx_queue_offload_capa,
__func__);
@@ -1755,7 +1755,7 @@ struct rte_eth_dev *
 local_conf.offloads) {
RTE_ETHDEV_LOG(ERR,
"Ethdev port_id=%d tx_queue_id=%d, new added offloads 
0x%"PRIx64" must be "
-   "within pre-queue offload capabilities 0x%"PRIx64" in 
%s()\n",
+   "within per-queue offload capabilities 0x%"PRIx64" in 
%s()\n",
port_id, tx_queue_id, local_conf.offloads,
dev_info.tx_queue_offload_capa,
__func__);
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH v3] app/eventdev: add option to set global dequeue tmo

2019-03-31 Thread Jerin Jacob Kollanukkaran
On Fri, 2019-03-29 at 07:11 +, Pavan Nikhilesh Bhagavatula wrote:
> From: Pavan Nikhilesh 
> 
> Add option to provide a global dequeue timeout that is used to create
> the eventdev.
> The dequeue timeout provided will be common across all the worker
> ports. If the eventdev hardware supports power management through
> dequeue timeout then this option can be used for verifying power
> demands at various packet rates.
> 
> Signed-off-by: Pavan Nikhilesh 
> ---
>  v3 Changes:
>  - Move common changes to evt_common.h.
> 
>  v2 Changes:
>  - Update documentation and fix indentation.
>  - Add check to see if provided dequeue timeout is within range and
> adjust as
>  required

Applied to dpdk-next-eventdev/master. Thanks.




Re: [dpdk-dev] [PATCH v6 4/5] net/af_xdp: use mbuf mempool for buffer management

2019-03-31 Thread Ye Xiaolong
Hi, Olivier

Thanks for the comments.

On 03/29, Olivier Matz wrote:
>Hi Xiaolong,
>
>On Tue, Mar 26, 2019 at 08:20:28PM +0800, Xiaolong Ye wrote:
>> Now, af_xdp registered memory buffer is managed by rte_mempool. mbuf
>> allocated from rte_mempool can be converted to xdp_desc's address and
>> vice versa.
>> 
>> Signed-off-by: Xiaolong Ye 
>> ---
>>  drivers/net/af_xdp/rte_eth_af_xdp.c | 117 +---
>>  1 file changed, 72 insertions(+), 45 deletions(-)
>> 
>> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
>> b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> index 47a496ed7..a1fda9212 100644
>> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> @@ -48,7 +48,11 @@ static int af_xdp_logtype;
>>  
>>  #define ETH_AF_XDP_FRAME_SIZE   XSK_UMEM__DEFAULT_FRAME_SIZE
>>  #define ETH_AF_XDP_NUM_BUFFERS  4096
>> -#define ETH_AF_XDP_DATA_HEADROOM0
>> +/* mempool hdrobj size (64 bytes) + sizeof(struct rte_mbuf) (128 bytes) */
>> +#define ETH_AF_XDP_MBUF_OVERHEAD192
>> +/* data start from offset 320 (192 + 128) bytes */
>> +#define ETH_AF_XDP_DATA_HEADROOM\
>> +(ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM)
>
>Having these constants looks quite dangerous too me. It imposes the size
>of the mbuf, and the mempool header size. It would at least require
>compilation checks.
>
>[...]
>
>> +umem->mb_pool = rte_pktmbuf_pool_create_with_flags("af_xdp_mempool",
>> +ETH_AF_XDP_NUM_BUFFERS,
>> +250, 0,
>> +ETH_AF_XDP_FRAME_SIZE -
>> +ETH_AF_XDP_MBUF_OVERHEAD,
>> +MEMPOOL_F_NO_SPREAD | MEMPOOL_CHUNK_F_PAGE_ALIGN,
>> +SOCKET_ID_ANY);
>> +if (umem->mb_pool == NULL || umem->mb_pool->nb_mem_chunks != 1) {
>> +AF_XDP_LOG(ERR, "Failed to create mempool\n");
>>  goto err;
>>  }
>> +base_addr = (void *)get_base_addr(umem->mb_pool);
>
>Creating a mempool in the pmd like this does not look good to me for
>several reasons:
>- the user application creates its mempool with a specific private
>  area in its mbufs. Here there is no private area, so it will break
>  applications doing this.
>- in patch 3 (mempool), you ensure that the chunk starts at a
>  page-aligned address, and you expect that given the other flags and
>  the constants at the top of the file, the data will be aligned. In
>  my opinion it is not ideal.
>- the user application may create a large number of mbufs, for instance
>  if the application manages large reassembly queues, or tcp sockets.
>  Here the driver limits the number of mbufs to 4k per rx queue.
>- the socket_id is any, so it won't be efficient on numa architectures.

Our mbuf/mempool change regarding to zero copy does have limitations.

>
>May I suggest another idea?
>
>Allocate the xsk_umem almost like in patch 1, but using rte_memzone
>allocation instead of posix_memalign() (and it will be faster, because
>it will use hugepages). And do not allocate any mempool in the driver.
>

rte_memzone_reserve_aligned is better than posix_memalign, I'll use it in my
first patch.

>When you receive a packet in the xsk_umem, allocate a new mbuf from
>the standard pool. Then, use rte_pktmbuf_attach_extbuf() to attach the
>xsk memory to the mbuf. You will have to register a callback to return
>the xsk memory when the mbuf is transmitted or freed.

I'll try to investigate how to implement it.

>
>This is, by the way, something I don't understand in your current
>implementation: what happens if a mbuf is received in the af_xdp driver,
>and freed by the application? How does the xsk buffer is returned?

It is coordinated by the fill ring. The fill ring is used by the application (
user space) to send down addr for the kernel to fill in with Rx packet data.
So for the free side, we just return it to the mempool, and each time in 
rx_pkt_burst, we would allocate new mbufs and submit corresponding addrs to 
fill 
ring, that's how we return the xsk buffer to kernel.

>
>Using rte_pktmbuf_attach_extbuf() would remove changes in mbuf and
>mempool, at the price of (maybe) decreasing the performance. But I think
>there are some places where it can be optimized.
>
>I understand my feedback comes late -- as usual :( -- but if you are in

Sorry for not Cc you in my patch set.

>a hurry for RC1, maybe we can consider to put the 1st patch only, and
>add the zero-copy mode in a second patch later. What do you think?

Sounds a sensible plan, I'll try to exteranl mbuf buffer scheme first.


Thanks,
Xiaolong

>
>Regards,
>Olivier
>
>


Re: [dpdk-dev] [PATCH 1/3] ethdev: add actions to modify TCP header fields

2019-03-31 Thread Dekel Peled
Thanks, PSB.

> -Original Message-
> From: Adrien Mazarguil 
> Sent: Friday, March 29, 2019 4:59 PM
> To: Dekel Peled 
> Cc: wenzhuo...@intel.com; jingjing...@intel.com;
> bernard.iremon...@intel.com; Yongseok Koh ;
> Shahaf Shuler ; dev@dpdk.org; Ori Kam
> 
> Subject: Re: [PATCH 1/3] ethdev: add actions to modify TCP header fields
> 
> Hi Derek,

It's Dekel, not Derek.

> 
> It's been a while since I last reviewed something, sorry it took so long.
> Some comments below.
> 
> On Thu, Mar 21, 2019 at 04:18:35PM +0200, Dekel Peled wrote:
> > Add actions:
> > - INC_TCP_SEQ - Increase sequence number in the outermost TCP header.
> > - DEC_TCP_SEQ - Decrease sequence number in the outermost TCP
> header.
> > - INC_TCP_ACK - Increase acknowledgment number in the outermost TCP
> > header.
> > - DEC_TCP_ACK - Decrease acknowledgment number in the outermost TCP
> > header.
> 
> Out of curiosity, are these upcoming/existing OpenFlow actions? If so you
> should consider prefixing them with "OF_", otherwise leave them as is.

Not related to OpenFlow.

> 
> > Original work by Xiaoyu Min.
> >
> > Signed-off-by: Dekel Peled 
> > ---
> >  doc/guides/prog_guide/rte_flow.rst | 72
> ++
> >  lib/librte_ethdev/rte_flow.c   |  8 +
> >  lib/librte_ethdev/rte_flow.h   | 60
> +++
> >  3 files changed, 140 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > index 0203f4f..bdb817a 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -2345,6 +2345,78 @@ Otherwise, RTE_FLOW_ERROR_TYPE_ACTION
> error will be returned.
> > | ``mac_addr`` | MAC address   |
> > +--+---+
> >
> > +Action: ``INC_TCP_SEQ``
> > +^^^
> > +
> > +Increase sequence number in the outermost TCP header.
> > +
> > +It must be used with a valid RTE_FLOW_ITEM_TYPE_TCP flow pattern
> item.
> > +Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
> 
> I know this is already present for various SET* actions, but please do not tie
> actions to the presence of specific pattern items at the API level. You can
> describe that the lack of a TCP header in matched *traffic* results in
> unspecified behavior though.

I accept, will change it in all relevant locations.

> 
> Then PMD documentation can describe that the lack of a TCP pattern item
> with this action results in RTE_FLOW_ERROR_TYPE_ACTION as a PMD-specific
> constraint.
> 
> > +
> > +.. _table_rte_flow_action_inc_tcp_seq:
> > +
> > +.. table:: INC_TCP_SEQ
> > +
> > +   +---++
> > +   | Field | Value  |
> > +
> +===++
> > +   | ``value`` | Value to increase TCP sequence number by   |
> > +   +---++
> 
> Nit: unnecessary empty space after "by".

Will remove.

> 
> > +
> > +Action: ``DEC_TCP_SEQ``
> > +^^^
> > +
> > +Decrease sequence number in the outermost TCP header.
> > +
> > +It must be used with a valid RTE_FLOW_ITEM_TYPE_TCP flow pattern
> item.
> > +Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
> 
> Same comment as above.
> 
> > +
> > +.. _table_rte_flow_action_dec_tcp_seq:
> > +
> > +.. table:: DEC_TCP_SEQ
> > +
> > +   +---++
> > +   | Field | Value  |
> > +
> +===++
> > +   | ``value`` | Value to decrease TCP sequence number by   |
> > +   +---++
> > +
> > +Action: ``INC_TCP_ACK``
> > +^^^
> > +
> > +Increase acknowledgment number in the outermost TCP header.
> > +
> > +It must be used with a valid RTE_FLOW_ITEM_TYPE_TCP flow pattern
> item.
> > +Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
> 
> Ditto.
> 
> > +
> > +.. _table_rte_flow_action_inc_tcp_ack:
> > +
> > +.. table:: INC_TCP_ACK
> > +
> > +   +---+--+
> > +   | Field | Value|
> > +
> +===+=
> =+
> > +   | ``value`` | Value to increase TCP acknowledgment number by   |
> > +   +---+--+
> > +
> > +Action: ``DEC_TCP_ACK``
> > +^^^
> > +
> > +Decrease acknowledgment number in the outermost TCP header.
> > +
> > +It must be used with a valid RTE_FLOW_ITEM_TYPE_TCP flow pattern
> item.
> > +Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
> 
> Ditto.
> 
> > +
> > +.. _table_rte_flow_action_dec_tcp_ack:
> > +
> > +.. table:: DEC_TCP_ACK
> > +
> > +   +---+--

Re: [dpdk-dev] [PATCH 2/3] app/testpmd: add actions to modify TCP header fields

2019-03-31 Thread Dekel Peled
Thanks, PSB.

> -Original Message-
> From: Adrien Mazarguil 
> Sent: Friday, March 29, 2019 4:59 PM
> To: Dekel Peled 
> Cc: wenzhuo...@intel.com; jingjing...@intel.com;
> bernard.iremon...@intel.com; Yongseok Koh ;
> Shahaf Shuler ; dev@dpdk.org; Ori Kam
> 
> Subject: Re: [PATCH 2/3] app/testpmd: add actions to modify TCP header
> fields
> 
> On Thu, Mar 21, 2019 at 04:18:36PM +0200, Dekel Peled wrote:
> > Add actions:
> > - INC_TCP_SEQ - Increase sequence number in the outermost TCP header.
> > - DEC_TCP_SEQ - Decrease sequence number in the outermost TCP
> header.
> > - INC_TCP_ACK - Increase acknowledgment number in the outermost TCP
> > header.
> > - DEC_TCP_ACK - Decrease acknowledgment number in the outermost TCP
> > header.
> >
> > Original work by Xiaoyu Min.
> >
> > Signed-off-by: Dekel Peled 
> 
> I suggest to merge this patch into the previous one. No reason for testpmd
> to be updated separately.
> 

I prefer it in separate patch since it is part of different SW module.

> Some comments below.
> 
> > ---
> >  app/test-pmd/cmdline_flow.c | 100
> 
> >  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  16 +
> >  2 files changed, 116 insertions(+)
> >
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index 36659a6..5cd4ceb 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -271,6 +271,14 @@ enum index {
> > ACTION_SET_MAC_SRC_MAC_SRC,
> > ACTION_SET_MAC_DST,
> > ACTION_SET_MAC_DST_MAC_DST,
> > +   ACTION_INC_TCP_SEQ,
> > +   ACTION_INC_TCP_SEQ_VALUE,
> > +   ACTION_DEC_TCP_SEQ,
> > +   ACTION_DEC_TCP_SEQ_VALUE,
> > +   ACTION_INC_TCP_ACK,
> > +   ACTION_INC_TCP_ACK_VALUE,
> > +   ACTION_DEC_TCP_ACK,
> > +   ACTION_DEC_TCP_ACK_VALUE,
> >  };
> >
> >  /** Maximum size for pattern in struct rte_flow_item_raw. */ @@
> > -884,6 +892,10 @@ struct parse_action_priv {
> > ACTION_SET_TTL,
> > ACTION_SET_MAC_SRC,
> > ACTION_SET_MAC_DST,
> > +   ACTION_INC_TCP_SEQ,
> > +   ACTION_DEC_TCP_SEQ,
> > +   ACTION_INC_TCP_ACK,
> > +   ACTION_DEC_TCP_ACK,
> > ZERO,
> >  };
> >
> > @@ -1046,6 +1058,30 @@ struct parse_action_priv {
> > ZERO,
> >  };
> >
> > +static const enum index action_inc_tcp_seq[] = {
> > +   ACTION_INC_TCP_SEQ_VALUE,
> > +   ACTION_NEXT,
> > +   ZERO,
> > +};
> > +
> > +static const enum index action_dec_tcp_seq[] = {
> > +   ACTION_DEC_TCP_SEQ_VALUE,
> > +   ACTION_NEXT,
> > +   ZERO,
> > +};
> > +
> > +static const enum index action_inc_tcp_ack[] = {
> > +   ACTION_INC_TCP_ACK_VALUE,
> > +   ACTION_NEXT,
> > +   ZERO,
> > +};
> > +
> > +static const enum index action_dec_tcp_ack[] = {
> > +   ACTION_DEC_TCP_ACK_VALUE,
> > +   ACTION_NEXT,
> > +   ZERO,
> > +};
> > +
> >  static int parse_init(struct context *, const struct token *,
> >   const char *, unsigned int,
> >   void *, unsigned int);
> > @@ -2843,6 +2879,70 @@ static int comp_vc_action_rss_queue(struct
> context *, const struct token *,
> >  (struct rte_flow_action_set_mac, mac_addr)),
> > .call = parse_vc_conf,
> > },
> > +   [ACTION_INC_TCP_SEQ] = {
> > +   .name = "inc_tcp_seq",
> > +   .help = "increase TCP sequence number",
> > +   .priv = PRIV_ACTION(INC_TCP_SEQ,
> > +   sizeof(struct rte_flow_action_modify_tcp_seq)),
> > +   .next = NEXT(action_inc_tcp_seq),
> > +   .call = parse_vc,
> > +   },
> > +   [ACTION_INC_TCP_SEQ_VALUE] = {
> > +   .name = "value",
> > +   .help = "the value to increase TCP sequence number by",
> > +   .next = NEXT(action_inc_tcp_seq, NEXT_ENTRY(UNSIGNED)),
> > +   .args = ARGS(ARGS_ENTRY_HTON
> > +   (struct rte_flow_action_modify_tcp_seq, value)),
> 
> You may need to remove HTON here depending on the chosen endian for
> the API, see my comments on previous patch.
> 
> > +   .call = parse_vc_conf,
> > +   },
> > +   [ACTION_DEC_TCP_SEQ] = {
> > +   .name = "dec_tcp_seq",
> > +   .help = "decrease TCP sequence number",
> > +   .priv = PRIV_ACTION(DEC_TCP_SEQ,
> > +   sizeof(struct rte_flow_action_modify_tcp_seq)),
> > +   .next = NEXT(action_dec_tcp_seq),
> > +   .call = parse_vc,
> > +   },
> > +   [ACTION_DEC_TCP_SEQ_VALUE] = {
> > +   .name = "value",
> > +   .help = "the value to decrease TCP sequence number by",
> > +   .next = NEXT(action_dec_tcp_seq,
> NEXT_ENTRY(UNSIGNED)),
> > +   .args = ARGS(ARGS_ENTRY_HTON
> > +   (struct rte_flow_action_modify_tcp_seq, value)),
> 
> Same here.
> 
> > +   .call = parse_vc_conf,
> > +   },
> > +   [ACTION_INC_TCP_ACK] = {
> > +   .name = "inc_tcp_ack",
> > +   .help = "increase TCP acknowledgment number",
> > +   .priv = PRIV_ACTION(INC_TCP_ACK,
> > +  

[dpdk-dev] [PATCH v5 2/2] app/testpmd: add mempool bulk get for txonly mode

2019-03-31 Thread Pavan Nikhilesh Bhagavatula
From: Pavan Nikhilesh 

Use mempool bulk get ops to alloc burst of packets and process them.
If bulk get fails fallback to rte_mbuf_raw_alloc.

Tested-by: Yingya Han 
Suggested-by: Andrew Rybchenko 
Signed-off-by: Pavan Nikhilesh 
---
 app/test-pmd/txonly.c | 37 ++---
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 9c0147089..4c34ef128 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -250,18 +250,33 @@ pkt_burst_transmit(struct fwd_stream *fs)
ether_addr_copy(&ports[fs->tx_port].eth_addr, ð_hdr.s_addr);
eth_hdr.ether_type = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
 
-   for (nb_pkt = 0; nb_pkt < nb_pkt_per_burst; nb_pkt++) {
-   pkt = rte_mbuf_raw_alloc(mbp);
-   if (pkt == NULL)
-   break;
-   if (unlikely(!pkt_burst_prepare(pkt, mbp,
-   ð_hdr, vlan_tci,
-   vlan_tci_outer,
-   ol_flags))) {
-   rte_mempool_put(mbp, pkt);
-   break;
+   if (rte_mempool_get_bulk(mbp, (void **)pkts_burst,
+   nb_pkt_per_burst) == 0) {
+   for (nb_pkt = 0; nb_pkt < nb_pkt_per_burst; nb_pkt++) {
+   if (unlikely(!pkt_burst_prepare(pkts_burst[nb_pkt], mbp,
+   ð_hdr, vlan_tci,
+   vlan_tci_outer,
+   ol_flags))) {
+   rte_mempool_put_bulk(mbp,
+   (void **)&pkts_burst[nb_pkt],
+   nb_pkt_per_burst - nb_pkt);
+   break;
+   }
+   }
+   } else {
+   for (nb_pkt = 0; nb_pkt < nb_pkt_per_burst; nb_pkt++) {
+   pkt = rte_mbuf_raw_alloc(mbp);
+   if (pkt == NULL)
+   break;
+   if (unlikely(!pkt_burst_prepare(pkt, mbp,
+   ð_hdr, vlan_tci,
+   vlan_tci_outer,
+   ol_flags))) {
+   rte_mempool_put(mbp, pkt);
+   break;
+   }
+   pkts_burst[nb_pkt] = pkt;
}
-   pkts_burst[nb_pkt] = pkt;
}
 
if (nb_pkt == 0)
-- 
2.21.0



[dpdk-dev] [PATCH v5 1/2] app/testpmd: optimize testpmd txonly mode

2019-03-31 Thread Pavan Nikhilesh Bhagavatula
From: Pavan Nikhilesh 

Optimize testpmd txonly mode by
1. Moving per packet ethernet header copy above the loop.
2. Use bulk ops for allocating segments instead of having a inner loop
for every segment.

Also, move the packet prepare logic into a separate function so that it
can be reused later.

Signed-off-by: Pavan Nikhilesh 
---
 v5 Changes
 - Remove unnecessary change to struct rte_port *txp (movement). (Bernard)

 v4 Changes:
 - Fix packet len calculation.

 v3 Changes:
 - Split the patches for easier review. (Thomas)
 - Remove unnecessary assignments to 0. (Bernard)

 v2 Changes:
 - Use bulk ops for fetching segments. (Andrew Rybchenko)
 - Fallback to rte_mbuf_raw_alloc if bulk get fails. (Andrew Rybchenko)
 - Fix mbufs not being freed when there is no more mbufs available for
 segments. (Andrew Rybchenko)

 app/test-pmd/txonly.c | 139 +++---
 1 file changed, 76 insertions(+), 63 deletions(-)

diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 1f08b6ed3..9c0147089 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -147,6 +147,63 @@ setup_pkt_udp_ip_headers(struct ipv4_hdr *ip_hdr,
ip_hdr->hdr_checksum = (uint16_t) ip_cksum;
 }

+static inline bool
+pkt_burst_prepare(struct rte_mbuf *pkt, struct rte_mempool *mbp,
+   struct ether_hdr *eth_hdr, const uint16_t vlan_tci,
+   const uint16_t vlan_tci_outer, const uint64_t ol_flags)
+{
+   struct rte_mbuf *pkt_segs[RTE_MAX_SEGS_PER_PKT];
+   struct rte_mbuf *pkt_seg;
+   uint32_t nb_segs, pkt_len;
+   uint8_t i;
+
+   if (unlikely(tx_pkt_split == TX_PKT_SPLIT_RND))
+   nb_segs = random() % tx_pkt_nb_segs + 1;
+   else
+   nb_segs = tx_pkt_nb_segs;
+
+   if (nb_segs > 1) {
+   if (rte_mempool_get_bulk(mbp, (void **)pkt_segs, nb_segs))
+   return false;
+   }
+
+   rte_pktmbuf_reset_headroom(pkt);
+   pkt->data_len = tx_pkt_seg_lengths[0];
+   pkt->ol_flags = ol_flags;
+   pkt->vlan_tci = vlan_tci;
+   pkt->vlan_tci_outer = vlan_tci_outer;
+   pkt->l2_len = sizeof(struct ether_hdr);
+   pkt->l3_len = sizeof(struct ipv4_hdr);
+
+   pkt_len = pkt->data_len;
+   pkt_seg = pkt;
+   for (i = 1; i < nb_segs; i++) {
+   pkt_seg->next = pkt_segs[i - 1];
+   pkt_seg = pkt_seg->next;
+   pkt_seg->data_len = tx_pkt_seg_lengths[i];
+   pkt_len += pkt_seg->data_len;
+   }
+   pkt_seg->next = NULL; /* Last segment of packet. */
+   /*
+* Copy headers in first packet segment(s).
+*/
+   copy_buf_to_pkt(eth_hdr, sizeof(eth_hdr), pkt, 0);
+   copy_buf_to_pkt(&pkt_ip_hdr, sizeof(pkt_ip_hdr), pkt,
+   sizeof(struct ether_hdr));
+   copy_buf_to_pkt(&pkt_udp_hdr, sizeof(pkt_udp_hdr), pkt,
+   sizeof(struct ether_hdr) +
+   sizeof(struct ipv4_hdr));
+
+   /*
+* Complete first mbuf of packet and append it to the
+* burst of packets to be transmitted.
+*/
+   pkt->nb_segs = nb_segs;
+   pkt->pkt_len = pkt_len;
+
+   return true;
+}
+
 /*
  * Transmit a burst of multi-segments packets.
  */
@@ -156,7 +213,6 @@ pkt_burst_transmit(struct fwd_stream *fs)
struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
struct rte_port *txp;
struct rte_mbuf *pkt;
-   struct rte_mbuf *pkt_seg;
struct rte_mempool *mbp;
struct ether_hdr eth_hdr;
uint16_t nb_tx;
@@ -164,14 +220,12 @@ pkt_burst_transmit(struct fwd_stream *fs)
uint16_t vlan_tci, vlan_tci_outer;
uint32_t retry;
uint64_t ol_flags = 0;
-   uint8_t  i;
uint64_t tx_offloads;
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
uint64_t start_tsc;
uint64_t end_tsc;
uint64_t core_cycles;
 #endif
-   uint32_t nb_segs, pkt_len;

 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
start_tsc = rte_rdtsc();
@@ -188,72 +242,31 @@ pkt_burst_transmit(struct fwd_stream *fs)
ol_flags |= PKT_TX_QINQ_PKT;
if (tx_offloads & DEV_TX_OFFLOAD_MACSEC_INSERT)
ol_flags |= PKT_TX_MACSEC;
+
+   /*
+* Initialize Ethernet header.
+*/
+   ether_addr_copy(&peer_eth_addrs[fs->peer_addr], ð_hdr.d_addr);
+   ether_addr_copy(&ports[fs->tx_port].eth_addr, ð_hdr.s_addr);
+   eth_hdr.ether_type = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
+
for (nb_pkt = 0; nb_pkt < nb_pkt_per_burst; nb_pkt++) {
pkt = rte_mbuf_raw_alloc(mbp);
-   if (pkt == NULL) {
-   nomore_mbuf:
-   if (nb_pkt == 0)
-   return;
+   if (pkt == NULL)
+   break;
+   if (unlikely(!pkt_burst_prepare(pkt, mbp,
+   ð_hdr, vlan_tci,
+  

Re: [dpdk-dev] [PATCH v2 00/15] Add patch set for IPN3KE

2019-03-31 Thread Xu, Rosen
Hi Ferruh,

My reply is online.

> -Original Message-
> From: Yigit, Ferruh
> Sent: Saturday, March 30, 2019 2:59
> To: Xu, Rosen ; dev@dpdk.org
> Cc: Zhang, Tianfei ; Wei, Dan
> ; Pei, Andy ; Yang, Qiming
> ; Wang, Haiyue ; Chen,
> Santos ; Zhang, Zhang 
> Subject: Re: [PATCH v2 00/15] Add patch set for IPN3KE
> 
> On 3/29/2019 3:58 PM, Rosen Xu wrote:
> > v2 updates:
> > ==
> >  - Fix v1 comments
> >  - Add support for 10G Base Line Design Bitstream
> >  - Add support for 25G Base Line Design Bitstream
> >
> > This patch set adds the support of a new net PMD, Intel® FPGA
> > Programmable Acceleration Card N3000, also called ipn3ke.
> >
> > The ipn3ke PMD (librte_pmd_ipn3ke) provides poll mode driver support
> > for Intel® FPGA PAC(Programmable Acceleration Card) N3000 based on the
> > Intel Ethernet Controller X710/XXV710 and Intel Arria 10 FPGA.
> >
> > In this card, FPGA is an acceleration bridge between network interface
> > and the Intel Ethernet Controller. Although both FPGA and Ethernet
> > Controllers are connected to CPU with PCIe Gen3x16 Switch, all the
> > packet RX/TX is handled by Intel Ethernet Controller. So from
> > application point of view the data path is still the legacy Intel
> > Ethernet Controller
> > X710/XXV710 PMD. Besides this, users can enable more acceleration
> > features by FPGA IP.
> >
> > Rosen Xu (7):
> >   drivers/bus/ifpga: add AFU shared data
> >   drivers/bus/ifpga: add function for AFU search by name
> >   drivers/net/ipn3ke: add IPN3KE ethdev PMD driver
> >   drivers/net/ipn3ke: add IPN3KE representor of PMD driver
> >   drivers/net/ipn3ke: add IPN3KE TM of PMD driver
> >   drivers/net/ipn3ke: add IPN3KE Flow of PMD driver
> >   drivers/raw/ifpga_rawdev: add IPN3KE support for IFPGA Rawdev
> >
> > Zhang, Tianfei (8):
> >   raw/ifpga/base: clean up code for ifpga share code
> >   raw/ifpga/base: store private features in FME and Port list
> >   raw/ifpga/base: add SPI and MAX10 device driver
> >   raw/ifpga/base: add I2C and at24 EEPROM driver
> >   raw/ifpga/base: add eth group driver
> >   raw/ifpga/base: add device tree support
> >   raw/ifpga/base: add version description on README
> >   raw/ifpga/base: using prefix name "ifpga_" for feature and feature_ops
> > data struct
> 
> Hi Rosen,
> 
> I am seeing some build errors, for 32bit [1], cross compilation [2] and 
> linkage
> errors [3], can you please check them?

I will fix it ASAP in v3.

> Is "libfdt.h" dependency in [2] documented in driver doc?

OPAE share code has dependency on libfdt, I will add it in v3.
 
> 
> [1]
> Building i686-native-linuxapp-gcc ...
> .../dpdk/drivers/net/ipn3ke/ipn3ke_ethdev.c: In function
> ‘ipn3ke_indirect_read’:
> .../dpdk/drivers/net/ipn3ke/ipn3ke_ethdev.c:68:19: error: cast to pointer
> from integer of different size [-Werror=int-to-pointer-cast]
>   indirect_addrs = (volatile void *)(base_addr | 0x10);
>^
> .../dpdk/drivers/net/ipn3ke/ipn3ke_ethdev.c:76:19: error: cast to pointer
> from integer of different size [-Werror=int-to-pointer-cast]
>   indirect_addrs = (volatile void *)(base_addr | 0x18);
>^
> .../dpdk/drivers/net/ipn3ke/ipn3ke_ethdev.c: In function
> ‘ipn3ke_indirect_write’:
> .../dpdk/drivers/net/ipn3ke/ipn3ke_ethdev.c:118:19: error: cast to pointer
> from integer of different size [-Werror=int-to-pointer-cast]
>   indirect_addrs = (volatile void *)(base_addr | 0x10);
>^
> cc1: all warnings being treated as errors
> make[7]: *** [.../dpdk/mk/internal/rte.compile-pre.mk:116:
> ipn3ke_ethdev.o] Error 1
> make[7]: *** Waiting for unfinished jobs
> In file included from .../dpdk/drivers/net/ipn3ke/ipn3ke_representor.c:25:
> .../dpdk/drivers/net/ipn3ke/ipn3ke_representor.c: In function
> ‘ipn3ke_update_link’:
> .../dpdk/drivers/net/ipn3ke/ipn3ke_logs.h:13:49: error: format ‘%lx’ expects
> argument of type ‘long unsigned int’, but argument 4 has type ‘uint64_t’ {aka
> ‘long long unsigned int’} [-Werror=format=]
>   rte_log(RTE_LOG_ ## level, ipn3ke_afu_logtype, "ipn3ke_afu: " fmt, \
>  ^~
> .../dpdk/drivers/net/ipn3ke/ipn3ke_logs.h:19:2: note: in expansion of macro
> ‘IPN3KE_AFU_PMD_LOG’
>   IPN3KE_AFU_PMD_LOG(DEBUG, fmt, ## args)
>   ^~
> .../dpdk/drivers/net/ipn3ke/ipn3ke_representor.c:388:2: note: in expansion
> of macro ‘IPN3KE_AFU_PMD_DEBUG’
>   IPN3KE_AFU_PMD_DEBUG("line_link_bitmap is %lx\n", line_link_bitmap);
>   ^~~~
> .../dpdk/drivers/net/ipn3ke/ipn3ke_representor.c:388:46: note: format
> string is defined here
>   IPN3KE_AFU_PMD_DEBUG("line_link_bitmap is %lx\n", line_link_bitmap);
> ~~^
> %llx
> cc1: all warnings being treated as errors
> 
> 
> 
> [2]
> .../dpdk/drivers/raw/ifpga_rawdev/base/opae_intel_max10.c:6:10: fatal
> error:
> libfdt.h: No such file or directory
>  #include 
>   ^~~~

[dpdk-dev] [PATCH v3] eal: increase max number of interrupt vectors

2019-03-31 Thread Pavan Nikhilesh Bhagavatula
From: Pavan Nikhilesh 

MSI-X permits a device to allocate up to 2048 interrupts as per PCIe
spec.
Increase the max number of vectors to a reasonable value of 512.

Signed-off-by: Pavan Nikhilesh 
---
 v3 Changes:
 - Instead of making it a configurable option set max vectors to a reasonable
 value of 512.

 v2 Changes:
 - Add defaults for meson build. (Jerin Jacob)

 lib/librte_eal/common/include/rte_eal_interrupts.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_eal_interrupts.h 
b/lib/librte_eal/common/include/rte_eal_interrupts.h
index 9d302f412..b370c0d26 100644
--- a/lib/librte_eal/common/include/rte_eal_interrupts.h
+++ b/lib/librte_eal/common/include/rte_eal_interrupts.h
@@ -17,7 +17,7 @@
 #ifndef _RTE_EAL_INTERRUPTS_H_
 #define _RTE_EAL_INTERRUPTS_H_

-#define RTE_MAX_RXTX_INTR_VEC_ID 32
+#define RTE_MAX_RXTX_INTR_VEC_ID  512
 #define RTE_INTR_VEC_ZERO_OFFSET  0
 #define RTE_INTR_VEC_RXTX_OFFSET  1

--
2.21.0



Re: [dpdk-dev] [PATCH v7 0/8] Support vector instructions on ICE

2019-03-31 Thread Thomas Monjalon
26/03/2019 10:50, Ferruh Yigit:
> > Wenzhuo Lu (8):
> >   net/ice: fix Tx function setting
> >   net/ice: add pointer for queue buffer release
> >   net/ice: support vector SSE in RX
> >   net/ice: support Rx scatter SSE vector
> >   net/ice: support Tx SSE vector
> >   net/ice: support Rx AVX2 vector
> >   net/ice: support Rx scatter AVX2 vector
> >   net/ice: support vector AVX2 in TX
> 
> This version (v7) pulled from next-net-intel to next-net.

I assume these patches have been tested, or at least compiled.
However, when running devtools/test-meson-builds.sh, there is a
compilation error for build-x86-default:

In file included from ../drivers/net/ice/ice_ethdev.h:10:
rte_ethdev_pci.h:38:10: fatal error: 'rte_pci.h' file not found

It can be fixed in
net/ice: support Rx AVX2 vector
by adding static_rte_pci and static_rte_bus_pci to the dependencies.
I fixed it even better in
net/ice: support vector SSE in Rx
by replacing the useless include of rte_ethdev_pci.h in ice_ethdev.h
with rte_ethdev_driver.h.

I could just reject the next-net tree, but I don't really have such option
if we want to close 19.05-rc1 quickly.

In summary, I am spending my Sunday hours to fix the mess in your driver
which was supposed to be tested before submitting, plus before merge in
next-net-intel, plus compilation-tested before pull in next-net.
I don't know what failed in the process, but I really don't like it.
I don't want to see any new patch for ice PMD in 19.05 cycle.
If you really need some fixes in 19.05 (very likely given the mass
code drop you are doing few days before the -rc1 deadline),
then I advise you to double check everything and make commits fully
justified and explained.

Sorry for the bad mood, and I hope it won't happen again soon.




[dpdk-dev] [PATCH] ethdev: fix DMA zone reserve not honoring size

2019-03-31 Thread Pavan Nikhilesh Bhagavatula
From: Pavan Nikhilesh 

The `rte_eth_dma_zone_reserve()` is generally used to create HW rings.
In some scenarios when a driver needs to reconfigure the ring size
since the named memzone already exists it returns the previous memzone
without checking if a different sized ring is requested.

Introduce a check to see if the ring size requested is different from the
previously created memzone length.

Fixes: 719dbebceb81 ("xen: allow determining DOM0 at runtime")
Cc: sta...@dpdk.org

Signed-off-by: Pavan Nikhilesh 
---
 lib/librte_ethdev/rte_ethdev.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 12b66b68c..4ae12e43b 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -3604,9 +3604,12 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, 
const char *ring_name,
}
 
mz = rte_memzone_lookup(z_name);
-   if (mz)
+   if (mz && (mz->len == size))
return mz;
 
+   if (mz)
+   rte_memzone_free(mz);
+
return rte_memzone_reserve_aligned(z_name, size, socket_id,
RTE_MEMZONE_IOVA_CONTIG, align);
 }
-- 
2.21.0



Re: [dpdk-dev] [PATCH v4 00/38] ice share code update.

2019-03-31 Thread Thomas Monjalon
25/03/2019 08:07, Zhang, Qi Z:
> > Sync to latest kernel driver, main changes:
> > 1. add DCB/FDIR support.
> > 2. add more APIs in switch module.
> > 3. code clean and bug fix.
> > 
> > Qi Zhang (38):
> >   net/ice/base: add switch resource allocation and free
> >   net/ice/base: improve comments
> >   net/ice/base: add two helper functions
> >   net/ice/base: add helper macros
> >   net/ice/base: allow package copy to be used after resets
> >   net/ice/base: clean code
> >   net/ice/base: declare functions as external
> >   net/ice/base: add more APIs in switch module
> >   net/ice/base: add VSI queue context framework
> >   net/ice/base: add APIs to add remove ethertype filter
> >   net/ice/base: add APIs to get allocated resources
> >   net/ice/base: add APIs to alloc/free resource counter
> >   net/ice/base: add APIs to get VSI promiscuous mode
> >   net/ice/base: add MAC filter with marker and counter
> >   net/ice/base: add two helper functions for flow management
> >   net/ice/base: fix minor issues
> >   net/ice/base: update macros
> >   net/ice/base: clean code
> >   net/ice/base: enable VSI queue context
> >   net/ice/base: ensure only valid bits are set
> >   net/ice/base: enhance get link status command
> >   net/ice/base: add RSS key related macro and structures
> >   net/ice/base: do not write TCAM entries back
> >   net/ice/base: remove local VSIG allocations
> >   net/ice/base: fix minor issues
> >   net/ice/base: update copyright time
> >   net/ice/base: fix Klockwork analysis reported issues
> >   net/ice/base: return config error without queue to disable
> >   net/ice/base: add function to check FW recovery mode
> >   net/ice/base: change profile id reference counting
> >   net/ice/base: add DCB support
> >   net/ice/base: add FDIR support
> >   net/ice/base: change profile priority for RSS reply
> >   net/ice/base: fix duplicate resource allocations
> >   net/ice/base: fix minor issues
> >   net/ice/base: increase prototol offset size
> >   net/ice/base: revert the workaround for resource allocation
> >   net/ice/base: rework on bit ops
> 
> Applied to dpdk-next-net-intel.
> 
> Thanks
> Qi

3 commits have the title "fix minor issues",
that's a funny performance.
When the ice PMD will enter in a serious phase,
you should write some real commit titles
and explain what are the fixed issues.
For DPDK 19.05, I guess it's fine because ice PMD
is not going to be really used soon, right?




Re: [dpdk-dev] [EXT] Re: [PATCH v2 1/2] app/testpmd: fix tx vlan and qinq dependency

2019-03-31 Thread Nithin Kumar Dabilpuram
Hi Ferruh Yigit,

Sorry, missed to see your inline comment about the check in v1.

>> @@ tx_vlan_set(portid_t port_id, uint16_t vlan_id)
>>  if (vlan_id_is_invalid(vlan_id))
>>  return;
>>  
>> -vlan_offload = rte_eth_dev_get_vlan_offload(port_id);
>> -if (vlan_offload & ETH_VLAN_EXTEND_OFFLOAD) {
>> -printf("Error, as QinQ has been enabled.\n");
>> -return;
>> -}

> Here I think intention is if QINQ is enabled 'tx_vlan_id' & 'tx_vlan_id_outer'
> should be set by tx_qinq_set() not 'tx_vlan_set', and check is for that.

> What do you think keeping same logic?
> But of course check should be if 'ports[port_id].dev_conf.txmode.offloads' 
> has 'DEV_TX_OFFLOAD_QINQ_INSERT' instead of above check.

Since tx_vlan_set() and tx_qinq_set() are they themselves are resetting/setting 
those flags in 'ports[port_id].dev_conf.txmode.offloads', 
do you think having such a check before clearing and setting flags be 
consistent ?
Current behavior is to override last setting with new setting. All the settings 
could be completely disabled by cmdline "tx vlan reset"

--
Thanks
Nithin



-Original Message-
From: Ferruh Yigit  
Sent: Thursday, March 21, 2019 10:41 PM
To: Nithin Kumar Dabilpuram ; Wenzhuo Lu 
; Jingjing Wu ; Bernard Iremonger 

Cc: dev@dpdk.org; xiao.w.w...@intel.com
Subject: [EXT] Re: [dpdk-dev] [PATCH v2 1/2] app/testpmd: fix tx vlan and qinq 
dependency

External Email

--
On 3/18/2019 9:56 AM, Nithin Kumar Dabilpuram wrote:
> Tx VLAN & QinQ insert enable need not depend on Rx VLAN offload 
> ETH_VLAN_EXTEND_OFFLOAD.
> 
> Fixes: 6a34f91690d0 ("app/testpmd: fix error message when setting Tx 
> VLAN")
> Cc: xiao.w.w...@intel.com
> 
> Signed-off-by: Nithin Dabilpuram 

<...>

> @@ -2963,11 +2962,6 @@ tx_vlan_set(portid_t port_id, uint16_t vlan_id)
>   if (vlan_id_is_invalid(vlan_id))
>   return;
>  
> - vlan_offload = rte_eth_dev_get_vlan_offload(port_id);
> - if (vlan_offload & ETH_VLAN_EXTEND_OFFLOAD) {
> - printf("Error, as QinQ has been enabled.\n");
> - return;
> - }

It looks like you didn't take account comment on this in previous version, can 
you please check it?


Re: [dpdk-dev] [pull-request] next-qos 19.05 PRE-RC1

2019-03-31 Thread Thomas Monjalon
29/03/2019 21:02, Cristian Dumitrescu:
>   http://dpdk.org/git/next/dpdk-next-qos 

Pulled, thanks




Re: [dpdk-dev] [PATCH v3 6/8] stack: add C11 atomic implementation

2019-03-31 Thread Eads, Gage



> -Original Message-
> From: Eads, Gage
> Sent: Friday, March 29, 2019 2:25 PM
> To: Honnappa Nagarahalli ;
> dev@dpdk.org
> Cc: olivier.m...@6wind.com; arybche...@solarflare.com; Richardson, Bruce
> ; Ananyev, Konstantin
> ; Gavin Hu (Arm Technology China)
> ; nd ; tho...@monjalon.net; nd
> ; Thomas Monjalon 
> Subject: RE: [PATCH v3 6/8] stack: add C11 atomic implementation
> 
> [snip]
> 
> > > +static __rte_always_inline void
> > > +__rte_stack_lf_push(struct rte_stack_lf_list *list,
> > > + struct rte_stack_lf_elem *first,
> > > + struct rte_stack_lf_elem *last,
> > > + unsigned int num)
> > > +{
> > > +#ifndef RTE_ARCH_X86_64
> > > + RTE_SET_USED(first);
> > > + RTE_SET_USED(last);
> > > + RTE_SET_USED(list);
> > > + RTE_SET_USED(num);
> > > +#else
> > > + struct rte_stack_lf_head old_head;
> > > + int success;
> > > +
> > > + old_head = list->head;
> > This can be a torn read (same as you have mentioned in
> > __rte_stack_lf_pop). I suggest we use acquire thread fence here as
> > well (please see the comments in __rte_stack_lf_pop).
> 
> Agreed. I'll add the acquire fence.
> 

On second thought, an acquire fence isn't necessary. The acquire fence in 
__rte_stack_lf_pop() ensures the list->head is ordered before the list element 
reads. That isn't necessary here; we need to ensure that the last->next write 
occurs (and is observed) before the list->head write, which the CAS's RELEASE 
success memorder accomplishes.

If a torn read occurs, the CAS will fail and will atomically re-load &old_head.


[dpdk-dev] [PATCH v5 2/8] mempool/stack: convert mempool to use rte stack

2019-03-31 Thread Gage Eads
The new rte_stack library is derived from the mempool handler, so this
commit removes duplicated code and simplifies the handler by migrating it
to this new API.

Signed-off-by: Gage Eads 
Reviewed-by: Olivier Matz 
---
 MAINTAINERS   |  2 +-
 drivers/mempool/stack/Makefile|  3 +-
 drivers/mempool/stack/meson.build |  6 +-
 drivers/mempool/stack/rte_mempool_stack.c | 93 +--
 4 files changed, 33 insertions(+), 71 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 09fd99dbf..13fe49e2b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -293,7 +293,6 @@ M: Andrew Rybchenko 
 F: lib/librte_mempool/
 F: drivers/mempool/Makefile
 F: drivers/mempool/ring/
-F: drivers/mempool/stack/
 F: doc/guides/prog_guide/mempool_lib.rst
 F: app/test/test_mempool*
 F: app/test/test_func_reentrancy.c
@@ -421,6 +420,7 @@ M: Gage Eads 
 M: Olivier Matz 
 F: lib/librte_stack/
 F: doc/guides/prog_guide/stack_lib.rst
+F: drivers/mempool/stack/
 
 
 Memory Pool Drivers
diff --git a/drivers/mempool/stack/Makefile b/drivers/mempool/stack/Makefile
index 0444aedad..1681a62bc 100644
--- a/drivers/mempool/stack/Makefile
+++ b/drivers/mempool/stack/Makefile
@@ -10,10 +10,11 @@ LIB = librte_mempool_stack.a
 
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 
 # Headers
 CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
-LDLIBS += -lrte_eal -lrte_mempool -lrte_ring
+LDLIBS += -lrte_eal -lrte_mempool -lrte_stack
 
 EXPORT_MAP := rte_mempool_stack_version.map
 
diff --git a/drivers/mempool/stack/meson.build 
b/drivers/mempool/stack/meson.build
index b75a3bb56..03e369a41 100644
--- a/drivers/mempool/stack/meson.build
+++ b/drivers/mempool/stack/meson.build
@@ -1,4 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2017-2019 Intel Corporation
+
+allow_experimental_apis = true
 
 sources = files('rte_mempool_stack.c')
+
+deps += ['stack']
diff --git a/drivers/mempool/stack/rte_mempool_stack.c 
b/drivers/mempool/stack/rte_mempool_stack.c
index e6d504af5..25ccdb9af 100644
--- a/drivers/mempool/stack/rte_mempool_stack.c
+++ b/drivers/mempool/stack/rte_mempool_stack.c
@@ -1,39 +1,29 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2016 Intel Corporation
+ * Copyright(c) 2016-2019 Intel Corporation
  */
 
 #include 
 #include 
-#include 
-
-struct rte_mempool_stack {
-   rte_spinlock_t sl;
-
-   uint32_t size;
-   uint32_t len;
-   void *objs[];
-};
+#include 
 
 static int
 stack_alloc(struct rte_mempool *mp)
 {
-   struct rte_mempool_stack *s;
-   unsigned n = mp->size;
-   int size = sizeof(*s) + (n+16)*sizeof(void *);
-
-   /* Allocate our local memory structure */
-   s = rte_zmalloc_socket("mempool-stack",
-   size,
-   RTE_CACHE_LINE_SIZE,
-   mp->socket_id);
-   if (s == NULL) {
-   RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n");
-   return -ENOMEM;
+   char name[RTE_STACK_NAMESIZE];
+   struct rte_stack *s;
+   int ret;
+
+   ret = snprintf(name, sizeof(name),
+  RTE_MEMPOOL_MZ_FORMAT, mp->name);
+   if (ret < 0 || ret >= (int)sizeof(name)) {
+   rte_errno = ENAMETOOLONG;
+   return -rte_errno;
}
 
-   rte_spinlock_init(&s->sl);
+   s = rte_stack_create(name, mp->size, mp->socket_id, 0);
+   if (s == NULL)
+   return -rte_errno;
 
-   s->size = n;
mp->pool_data = s;
 
return 0;
@@ -41,69 +31,36 @@ stack_alloc(struct rte_mempool *mp)
 
 static int
 stack_enqueue(struct rte_mempool *mp, void * const *obj_table,
-   unsigned n)
+ unsigned int n)
 {
-   struct rte_mempool_stack *s = mp->pool_data;
-   void **cache_objs;
-   unsigned index;
-
-   rte_spinlock_lock(&s->sl);
-   cache_objs = &s->objs[s->len];
-
-   /* Is there sufficient space in the stack ? */
-   if ((s->len + n) > s->size) {
-   rte_spinlock_unlock(&s->sl);
-   return -ENOBUFS;
-   }
-
-   /* Add elements back into the cache */
-   for (index = 0; index < n; ++index, obj_table++)
-   cache_objs[index] = *obj_table;
-
-   s->len += n;
+   struct rte_stack *s = mp->pool_data;
 
-   rte_spinlock_unlock(&s->sl);
-   return 0;
+   return rte_stack_push(s, obj_table, n) == 0 ? -ENOBUFS : 0;
 }
 
 static int
 stack_dequeue(struct rte_mempool *mp, void **obj_table,
-   unsigned n)
+ unsigned int n)
 {
-   struct rte_mempool_stack *s = mp->pool_data;
-   void **cache_objs;
-   unsigned index, len;
-
-   rte_spinlock_lock(&s->sl);
-
-   if (unlikely(n > s->len)) {
-   rte_spinlock_unlock(&s->sl);
-   return -ENOENT;
-   }
+   struct rte_stack *s = mp->pool_data;
 
-   cache_objs = s->objs;
-
-  

[dpdk-dev] [PATCH v5 0/8] Add stack library and new mempool handler

2019-03-31 Thread Gage Eads
This patchset introduces a stack library, supporting both lock-based and
lock-free stacks, and a lock-free stack mempool handler.

The lock-based stack code is derived from the existing stack mempool handler,
and that handler is refactored to use the stack library.

The lock-free stack mempool handler is intended for usages where the rte
ring's "non-preemptive" constraint is not acceptable; for example, if the
application uses a mixture of pinned high-priority threads and multiplexed
low-priority threads that share a mempool.

Note that the lock-free algorithm relies on a 128-bit compare-and-swap[1],
so it is currently limited to the x86_64 platform.

This patchset is the successor to a patchset containing only the new mempool
handler[2].

[1] http://mails.dpdk.org/archives/dev/2019-March/125751.html
[2] http://mails.dpdk.org/archives/dev/2019-January/123555.html

---
v5:
- Add comment to explain padding in *_get_memsize() functions
- Prefix internal functions with '__'
- Use RTE_ASSERT for performance critical run-time checks
- Don't use __atomic_load in the C11 pop_elems function, and put an acquire
  thread fence at the start of the 2nd do-while loop
- Change pop_elems 128b CAS success memorder to RELEASE and failure memorder to
  RELAXED
- Change compile-time assertion to run for all 64-bit architectures
- Reorganize the code with standard and lock-free .c and .h files

v4:
- Fix 32-bit build error in test_stack.c by using %zu format specifier for
  size_t
- Rebase onto master

v3:
- Rebase patchset onto master (test/test/ -> app/test/)
- Fix rte_stack_std_push() segfault introduced in v2

v2:
- Reworked structure and function naming to use rte_stack_{std, lf}_...
- Updated to the latest rte_atomic128_cmp_exchange() interface.
- Rename STACK_F_NB -> RTE_STACK_F_LF.
- Remove rte_rmb() and rte_wmb() from the generic push and pop implementations.
  These are obviated by rte_atomic128_cmp_exchange()'s two memorder arguments.
- Edit stack_lib.rst text to 80 chars/line.
- Fix rte_stack.h doxygen formatting.
- Allocate popped_objs array from the heap
- Fix stack_thread_push_pop bug ("&t->sz" -> "t->sz")
- Remove unnecessary NULL check from test_stack_basic
- Properly terminate the name string in test_stack_name_length
- Add an empty array of struct rte_nb_lifo_elem elements
- In rte_nb_lifo_push(), retrieve the last element from __nb_lifo_pop()
- Split C11 implementation into a separate patchset

Gage Eads (8):
  stack: introduce rte stack library
  mempool/stack: convert mempool to use rte stack
  test/stack: add stack test
  test/stack: add stack perf test
  stack: add lock-free stack implementation
  stack: add C11 atomic implementation
  test/stack: add lock-free stack tests
  mempool/stack: add lock-free stack mempool handler

 MAINTAINERS |   9 +-
 app/test/Makefile   |   3 +
 app/test/meson.build|   7 +
 app/test/test_stack.c   | 423 
 app/test/test_stack_perf.c  | 356 
 config/common_base  |   5 +
 doc/api/doxy-api-index.md   |   1 +
 doc/api/doxy-api.conf.in|   1 +
 doc/guides/prog_guide/env_abstraction_layer.rst |  10 +
 doc/guides/prog_guide/index.rst |   1 +
 doc/guides/prog_guide/stack_lib.rst |  83 +
 doc/guides/rel_notes/release_19_05.rst  |  13 +
 drivers/mempool/stack/Makefile  |   3 +-
 drivers/mempool/stack/meson.build   |   6 +-
 drivers/mempool/stack/rte_mempool_stack.c   | 115 +++
 lib/Makefile|   2 +
 lib/librte_stack/Makefile   |  29 ++
 lib/librte_stack/meson.build|  12 +
 lib/librte_stack/rte_stack.c| 196 +++
 lib/librte_stack/rte_stack.h| 259 +++
 lib/librte_stack/rte_stack_lf.c |  31 ++
 lib/librte_stack/rte_stack_lf.h | 106 ++
 lib/librte_stack/rte_stack_lf_c11.h | 169 ++
 lib/librte_stack/rte_stack_lf_generic.h | 151 +
 lib/librte_stack/rte_stack_pvt.h|  34 ++
 lib/librte_stack/rte_stack_std.c|  26 ++
 lib/librte_stack/rte_stack_std.h| 119 +++
 lib/librte_stack/rte_stack_version.map  |   9 +
 lib/meson.build |   2 +-
 mk/rte.app.mk   |   1 +
 30 files changed, 2110 insertions(+), 72 deletions(-)
 create mode 100644 app/test/test_stack.c
 create mode 100644 app/test/test_stack_perf.c
 create mode 100644 doc/guides/prog_guide/stack_lib.rst
 create mode 100644 lib/librte_stack/Makefile
 create mode 100644 lib/librte_stack/meson.build
 create mode 100644 lib/librte_stack/rte_stack.c
 create mode 100644 lib/librte_stack/rte_stack.h
 crea

[dpdk-dev] [PATCH v5 1/8] stack: introduce rte stack library

2019-03-31 Thread Gage Eads
The rte_stack library provides an API for configuration and use of a
bounded stack of pointers. Push and pop operations are MT-safe, allowing
concurrent access, and the interface supports pushing and popping multiple
pointers at a time.

The library's interface is modeled after another DPDK data structure,
rte_ring, and its lock-based implementation is derived from the stack
mempool handler. An upcoming commit will migrate the stack mempool handler
to rte_stack.

Signed-off-by: Gage Eads 
Reviewed-by: Olivier Matz 
---
 MAINTAINERS|   6 +
 config/common_base |   5 +
 doc/api/doxy-api-index.md  |   1 +
 doc/api/doxy-api.conf.in   |   1 +
 doc/guides/prog_guide/index.rst|   1 +
 doc/guides/prog_guide/stack_lib.rst|  28 +
 doc/guides/rel_notes/release_19_05.rst |   5 +
 lib/Makefile   |   2 +
 lib/librte_stack/Makefile  |  25 
 lib/librte_stack/meson.build   |   8 ++
 lib/librte_stack/rte_stack.c   | 182 +
 lib/librte_stack/rte_stack.h   | 207 +
 lib/librte_stack/rte_stack_pvt.h   |  34 ++
 lib/librte_stack/rte_stack_std.c   |  26 +
 lib/librte_stack/rte_stack_std.h   | 119 +++
 lib/librte_stack/rte_stack_version.map |   9 ++
 lib/meson.build|   2 +-
 mk/rte.app.mk  |   1 +
 18 files changed, 661 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/prog_guide/stack_lib.rst
 create mode 100644 lib/librte_stack/Makefile
 create mode 100644 lib/librte_stack/meson.build
 create mode 100644 lib/librte_stack/rte_stack.c
 create mode 100644 lib/librte_stack/rte_stack.h
 create mode 100644 lib/librte_stack/rte_stack_pvt.h
 create mode 100644 lib/librte_stack/rte_stack_std.c
 create mode 100644 lib/librte_stack/rte_stack_std.h
 create mode 100644 lib/librte_stack/rte_stack_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index e9ff2b4c2..09fd99dbf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -416,6 +416,12 @@ F: drivers/raw/skeleton_rawdev/
 F: app/test/test_rawdev.c
 F: doc/guides/prog_guide/rawdev.rst
 
+Stack API - EXPERIMENTAL
+M: Gage Eads 
+M: Olivier Matz 
+F: lib/librte_stack/
+F: doc/guides/prog_guide/stack_lib.rst
+
 
 Memory Pool Drivers
 ---
diff --git a/config/common_base b/config/common_base
index 6292bc4af..fc8dba69d 100644
--- a/config/common_base
+++ b/config/common_base
@@ -994,3 +994,8 @@ CONFIG_RTE_APP_CRYPTO_PERF=y
 # Compile the eventdev application
 #
 CONFIG_RTE_APP_EVENTDEV=y
+
+#
+# Compile librte_stack
+#
+CONFIG_RTE_LIBRTE_STACK=y
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index aacc66bd8..de1e215dd 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -125,6 +125,7 @@ The public API headers are grouped by topics:
   [mbuf]   (@ref rte_mbuf.h),
   [mbuf pool ops]  (@ref rte_mbuf_pool_ops.h),
   [ring]   (@ref rte_ring.h),
+  [stack]  (@ref rte_stack.h),
   [tailq]  (@ref rte_tailq.h),
   [bitmap] (@ref rte_bitmap.h)
 
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index a365e669b..7722fc3e9 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -55,6 +55,7 @@ INPUT   = @TOPDIR@/doc/api/doxy-api-index.md \
   @TOPDIR@/lib/librte_ring \
   @TOPDIR@/lib/librte_sched \
   @TOPDIR@/lib/librte_security \
+  @TOPDIR@/lib/librte_stack \
   @TOPDIR@/lib/librte_table \
   @TOPDIR@/lib/librte_telemetry \
   @TOPDIR@/lib/librte_timer \
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 6726b1e8d..f4f60862f 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -55,6 +55,7 @@ Programmer's Guide
 metrics_lib
 bpf_lib
 ipsec_lib
+stack_lib
 source_org
 dev_kit_build_system
 dev_kit_root_make_help
diff --git a/doc/guides/prog_guide/stack_lib.rst 
b/doc/guides/prog_guide/stack_lib.rst
new file mode 100644
index 0..25a8cc38a
--- /dev/null
+++ b/doc/guides/prog_guide/stack_lib.rst
@@ -0,0 +1,28 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+Copyright(c) 2019 Intel Corporation.
+
+Stack Library
+=
+
+DPDK's stack library provides an API for configuration and use of a bounded
+stack of pointers.
+
+The stack library provides the following basic operations:
+
+*  Create a uniquely named stack of a user-specified size and using a
+   user-specified socket.
+
+*  Push and pop a burst of one or more stack objects (pointers). These function
+   are multi-threading safe.
+
+*  Free a previously created stack.
+
+*  Lookup a pointer to a stack by its name.
+
+*  Que

[dpdk-dev] [PATCH v5 4/8] test/stack: add stack perf test

2019-03-31 Thread Gage Eads
stack_perf_autotest tests the following with one lcore:
- Cycles to attempt to pop an empty stack
- Cycles to push then pop a single object
- Cycles to push then pop a burst of 32 objects

It also tests the cycles to push then pop a burst of 8 and 32 objects with
the following lcore combinations (if possible):
- Two hyperthreads
- Two physical cores
- Two physical cores on separate NUMA nodes
- All available lcores

Signed-off-by: Gage Eads 
Reviewed-by: Olivier Matz 
---
 app/test/Makefile  |   1 +
 app/test/meson.build   |   2 +
 app/test/test_stack_perf.c | 343 +
 3 files changed, 346 insertions(+)
 create mode 100644 app/test/test_stack_perf.c

diff --git a/app/test/Makefile b/app/test/Makefile
index e5bde81af..b28bed2d4 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -91,6 +91,7 @@ endif
 SRCS-y += test_rwlock.c
 
 SRCS-$(CONFIG_RTE_LIBRTE_STACK) += test_stack.c
+SRCS-$(CONFIG_RTE_LIBRTE_STACK) += test_stack_perf.c
 
 SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer.c
 SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_perf.c
diff --git a/app/test/meson.build b/app/test/meson.build
index 56ea13f53..02eb788a4 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -96,6 +96,7 @@ test_sources = files('commands.c',
'test_service_cores.c',
'test_spinlock.c',
'test_stack.c',
+   'test_stack_perf.c',
'test_string_fns.c',
'test_table.c',
'test_table_acl.c',
@@ -241,6 +242,7 @@ perf_test_names = [
 'distributor_perf_autotest',
 'ring_pmd_perf_autotest',
 'pmd_perf_autotest',
+'stack_perf_autotest',
 ]
 
 # All test cases in driver_test_names list are non-parallel
diff --git a/app/test/test_stack_perf.c b/app/test/test_stack_perf.c
new file mode 100644
index 0..484370d30
--- /dev/null
+++ b/app/test/test_stack_perf.c
@@ -0,0 +1,343 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#define STACK_NAME "STACK_PERF"
+#define MAX_BURST 32
+#define STACK_SIZE (RTE_MAX_LCORE * MAX_BURST)
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+/*
+ * Push/pop bulk sizes, marked volatile so they aren't treated as compile-time
+ * constants.
+ */
+static volatile unsigned int bulk_sizes[] = {8, MAX_BURST};
+
+static rte_atomic32_t lcore_barrier;
+
+struct lcore_pair {
+   unsigned int c1;
+   unsigned int c2;
+};
+
+static int
+get_two_hyperthreads(struct lcore_pair *lcp)
+{
+   unsigned int socket[2];
+   unsigned int core[2];
+   unsigned int id[2];
+
+   RTE_LCORE_FOREACH(id[0]) {
+   RTE_LCORE_FOREACH(id[1]) {
+   if (id[0] == id[1])
+   continue;
+   core[0] = lcore_config[id[0]].core_id;
+   core[1] = lcore_config[id[1]].core_id;
+   socket[0] = lcore_config[id[0]].socket_id;
+   socket[1] = lcore_config[id[1]].socket_id;
+   if ((core[0] == core[1]) && (socket[0] == socket[1])) {
+   lcp->c1 = id[0];
+   lcp->c2 = id[1];
+   return 0;
+   }
+   }
+   }
+
+   return 1;
+}
+
+static int
+get_two_cores(struct lcore_pair *lcp)
+{
+   unsigned int socket[2];
+   unsigned int core[2];
+   unsigned int id[2];
+
+   RTE_LCORE_FOREACH(id[0]) {
+   RTE_LCORE_FOREACH(id[1]) {
+   if (id[0] == id[1])
+   continue;
+   core[0] = lcore_config[id[0]].core_id;
+   core[1] = lcore_config[id[1]].core_id;
+   socket[0] = lcore_config[id[0]].socket_id;
+   socket[1] = lcore_config[id[1]].socket_id;
+   if ((core[0] != core[1]) && (socket[0] == socket[1])) {
+   lcp->c1 = id[0];
+   lcp->c2 = id[1];
+   return 0;
+   }
+   }
+   }
+
+   return 1;
+}
+
+static int
+get_two_sockets(struct lcore_pair *lcp)
+{
+   unsigned int socket[2];
+   unsigned int id[2];
+
+   RTE_LCORE_FOREACH(id[0]) {
+   RTE_LCORE_FOREACH(id[1]) {
+   if (id[0] == id[1])
+   continue;
+   socket[0] = lcore_config[id[0]].socket_id;
+   socket[1] = lcore_config[id[1]].socket_id;
+   if (socket[0] != socket[1]) {
+   lcp->c1 = id[0];
+   lcp->c2 = id[1];
+   return 0;
+   }
+   }
+   }
+
+   return 1;
+}
+
+/* Measure the cycle c

[dpdk-dev] [PATCH v5 3/8] test/stack: add stack test

2019-03-31 Thread Gage Eads
stack_autotest performs positive and negative testing of the stack API, and
exercises the push and pop datapath functions with all available lcores.

Signed-off-by: Gage Eads 
Reviewed-by: Olivier Matz 
---
 MAINTAINERS   |   1 +
 app/test/Makefile |   2 +
 app/test/meson.build  |   3 +
 app/test/test_stack.c | 410 ++
 4 files changed, 416 insertions(+)
 create mode 100644 app/test/test_stack.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 13fe49e2b..2842f07ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -421,6 +421,7 @@ M: Olivier Matz 
 F: lib/librte_stack/
 F: doc/guides/prog_guide/stack_lib.rst
 F: drivers/mempool/stack/
+F: test/test/*stack*
 
 
 Memory Pool Drivers
diff --git a/app/test/Makefile b/app/test/Makefile
index d6aa28bad..e5bde81af 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -90,6 +90,8 @@ endif
 
 SRCS-y += test_rwlock.c
 
+SRCS-$(CONFIG_RTE_LIBRTE_STACK) += test_stack.c
+
 SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer.c
 SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_TIMER) += test_timer_racecond.c
diff --git a/app/test/meson.build b/app/test/meson.build
index c5e65fe66..56ea13f53 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -95,6 +95,7 @@ test_sources = files('commands.c',
'test_sched.c',
'test_service_cores.c',
'test_spinlock.c',
+   'test_stack.c',
'test_string_fns.c',
'test_table.c',
'test_table_acl.c',
@@ -133,6 +134,7 @@ test_deps = ['acl',
'port',
'reorder',
'ring',
+   'stack',
'timer'
 ]
 
@@ -174,6 +176,7 @@ fast_parallel_test_names = [
 'rwlock_autotest',
 'sched_autotest',
 'spinlock_autotest',
+'stack_autotest',
 'string_autotest',
 'table_autotest',
 'tailq_autotest',
diff --git a/app/test/test_stack.c b/app/test/test_stack.c
new file mode 100644
index 0..8392e4e4d
--- /dev/null
+++ b/app/test/test_stack.c
@@ -0,0 +1,410 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#define STACK_SIZE 4096
+#define MAX_BULK 32
+
+static int
+test_stack_push_pop(struct rte_stack *s, void **obj_table, unsigned int 
bulk_sz)
+{
+   unsigned int i, ret;
+   void **popped_objs;
+
+   popped_objs = rte_calloc(NULL, STACK_SIZE, sizeof(void *), 0);
+   if (popped_objs == NULL) {
+   printf("[%s():%u] failed to calloc %zu bytes\n",
+  __func__, __LINE__, STACK_SIZE * sizeof(void *));
+   return -1;
+   }
+
+   for (i = 0; i < STACK_SIZE; i += bulk_sz) {
+   ret = rte_stack_push(s, &obj_table[i], bulk_sz);
+
+   if (ret != bulk_sz) {
+   printf("[%s():%u] push returned: %d (expected %u)\n",
+  __func__, __LINE__, ret, bulk_sz);
+   rte_free(popped_objs);
+   return -1;
+   }
+
+   if (rte_stack_count(s) != i + bulk_sz) {
+   printf("[%s():%u] stack count: %u (expected %u)\n",
+  __func__, __LINE__, rte_stack_count(s),
+  i + bulk_sz);
+   rte_free(popped_objs);
+   return -1;
+   }
+
+   if (rte_stack_free_count(s) != STACK_SIZE - i - bulk_sz) {
+   printf("[%s():%u] stack free count: %u (expected %u)\n",
+  __func__, __LINE__, rte_stack_count(s),
+  STACK_SIZE - i - bulk_sz);
+   rte_free(popped_objs);
+   return -1;
+   }
+   }
+
+   for (i = 0; i < STACK_SIZE; i += bulk_sz) {
+   ret = rte_stack_pop(s, &popped_objs[i], bulk_sz);
+
+   if (ret != bulk_sz) {
+   printf("[%s():%u] pop returned: %d (expected %u)\n",
+  __func__, __LINE__, ret, bulk_sz);
+   rte_free(popped_objs);
+   return -1;
+   }
+
+   if (rte_stack_count(s) != STACK_SIZE - i - bulk_sz) {
+   printf("[%s():%u] stack count: %u (expected %u)\n",
+  __func__, __LINE__, rte_stack_count(s),
+  STACK_SIZE - i - bulk_sz);
+   rte_free(popped_objs);
+   return -1;
+   }
+
+   if (rte_stack_free_count(s) != i + bulk_sz) {
+   printf("[%s():%u] stack free count: %u (expected %u)\n",
+  __func__, __LINE__, rte_stack_count(s),
+  i + bulk_sz);
+   rte_free(popped_objs);
+   return -1;
+ 

[dpdk-dev] [PATCH v5 5/8] stack: add lock-free stack implementation

2019-03-31 Thread Gage Eads
This commit adds support for a lock-free (linked list based) stack to the
stack API. This behavior is selected through a new rte_stack_create() flag,
RTE_STACK_F_LF.

The stack consists of a linked list of elements, each containing a data
pointer and a next pointer, and an atomic stack depth counter.

The lock-free push operation enqueues a linked list of pointers by pointing
the tail of the list to the current stack head, and using a CAS to swing
the stack head pointer to the head of the list. The operation retries if it
is unsuccessful (i.e. the list changed between reading the head and
modifying it), else it adjusts the stack length and returns.

The lock-free pop operation first reserves num elements by adjusting the
stack length, to ensure the dequeue operation will succeed without
blocking. It then dequeues pointers by walking the list -- starting from
the head -- then swinging the head pointer (using a CAS as well). While
walking the list, the data pointers are recorded in an object table.

This algorithm stack uses a 128-bit compare-and-swap instruction, which
atomically updates the stack top pointer and a modification counter, to
protect against the ABA problem.

The linked list elements themselves are maintained in a lock-free LIFO
list, and are allocated before stack pushes and freed after stack pops.
Since the stack has a fixed maximum depth, these elements do not need to be
dynamically created.

Signed-off-by: Gage Eads 
Reviewed-by: Olivier Matz 
---
 doc/guides/prog_guide/stack_lib.rst |  61 -
 doc/guides/rel_notes/release_19_05.rst  |   3 +
 lib/librte_stack/Makefile   |   7 +-
 lib/librte_stack/meson.build|   7 +-
 lib/librte_stack/rte_stack.c|  28 --
 lib/librte_stack/rte_stack.h|  62 +++--
 lib/librte_stack/rte_stack_lf.c |  31 +++
 lib/librte_stack/rte_stack_lf.h | 102 +
 lib/librte_stack/rte_stack_lf_generic.h | 151 
 9 files changed, 433 insertions(+), 19 deletions(-)
 create mode 100644 lib/librte_stack/rte_stack_lf.c
 create mode 100644 lib/librte_stack/rte_stack_lf.h
 create mode 100644 lib/librte_stack/rte_stack_lf_generic.h

diff --git a/doc/guides/prog_guide/stack_lib.rst 
b/doc/guides/prog_guide/stack_lib.rst
index 25a8cc38a..8fe8804e3 100644
--- a/doc/guides/prog_guide/stack_lib.rst
+++ b/doc/guides/prog_guide/stack_lib.rst
@@ -10,7 +10,8 @@ stack of pointers.
 The stack library provides the following basic operations:
 
 *  Create a uniquely named stack of a user-specified size and using a
-   user-specified socket.
+   user-specified socket, with either standard (lock-based) or lock-free
+   behavior.
 
 *  Push and pop a burst of one or more stack objects (pointers). These function
are multi-threading safe.
@@ -24,5 +25,59 @@ The stack library provides the following basic operations:
 Implementation
 ~~
 
-The stack consists of a contiguous array of pointers, a current index, and a
-spinlock. Accesses to the stack are made multi-thread safe by the spinlock.
+The library supports two types of stacks: standard (lock-based) and lock-free.
+Both types use the same set of interfaces, but their implementations differ.
+
+Lock-based Stack
+
+
+The lock-based stack consists of a contiguous array of pointers, a current
+index, and a spinlock. Accesses to the stack are made multi-thread safe by the
+spinlock.
+
+Lock-free Stack
+--
+
+The lock-free stack consists of a linked list of elements, each containing a
+data pointer and a next pointer, and an atomic stack depth counter. The
+lock-free property means that multiple threads can push and pop simultaneously,
+and one thread being preempted/delayed in a push or pop operation will not
+impede the forward progress of any other thread.
+
+The lock-free push operation enqueues a linked list of pointers by pointing the
+list's tail to the current stack head, and using a CAS to swing the stack head
+pointer to the head of the list. The operation retries if it is unsuccessful
+(i.e. the list changed between reading the head and modifying it), else it
+adjusts the stack length and returns.
+
+The lock-free pop operation first reserves one or more list elements by
+adjusting the stack length, to ensure the dequeue operation will succeed
+without blocking. It then dequeues pointers by walking the list -- starting
+from the head -- then swinging the head pointer (using a CAS as well). While
+walking the list, the data pointers are recorded in an object table.
+
+The linked list elements themselves are maintained in a lock-free LIFO, and are
+allocated before stack pushes and freed after stack pops. Since the stack has a
+fixed maximum depth, these elements do not need to be dynamically created.
+
+The lock-free behavior is selected by passing the *RTE_STACK_F_LF* flag to
+rte_stack_create().
+
+Preventing the ABA Problem
+^^
+

[dpdk-dev] [PATCH v5 8/8] mempool/stack: add lock-free stack mempool handler

2019-03-31 Thread Gage Eads
This commit adds support for lock-free (linked list based) stack mempool
handler.

In mempool_perf_autotest the lock-based stack outperforms the
lock-free handler for certain lcore/alloc count/free count
combinations*, however:
- For applications with preemptible pthreads, a standard (lock-based)
  stack's worst-case performance (i.e. one thread being preempted while
  holding the spinlock) is much worse than the lock-free stack's.
- Using per-thread mempool caches will largely mitigate the performance
  difference.

*Test setup: x86_64 build with default config, dual-socket Xeon E5-2699 v4,
running on isolcpus cores with a tickless scheduler. The lock-based stack's
rate_persec was 0.6x-3.5x the lock-free stack's.

Signed-off-by: Gage Eads 
Reviewed-by: Olivier Matz 
---
 doc/guides/prog_guide/env_abstraction_layer.rst | 10 ++
 doc/guides/rel_notes/release_19_05.rst  |  5 +
 drivers/mempool/stack/rte_mempool_stack.c   | 26 +++--
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst 
b/doc/guides/prog_guide/env_abstraction_layer.rst
index c1346363b..1a4391898 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -563,6 +563,16 @@ Known Issues
 
   5. It MUST not be used by multi-producer/consumer pthreads, whose scheduling 
policies are SCHED_FIFO or SCHED_RR.
 
+  Alternatively, applications can use the lock-free stack mempool handler. When
+  considering this handler, note that:
+
+  - It is currently limited to the x86_64 platform, because it uses an
+instruction (16-byte compare-and-swap) that is not yet available on other
+platforms.
+  - It has worse average-case performance than the non-preemptive rte_ring, but
+software caching (e.g. the mempool cache) can mitigate this by reducing the
+number of stack accesses.
+
 + rte_timer
 
   Running  ``rte_timer_manage()`` on a non-EAL pthread is not allowed. 
However, resetting/stopping the timer from a non-EAL pthread is allowed.
diff --git a/doc/guides/rel_notes/release_19_05.rst 
b/doc/guides/rel_notes/release_19_05.rst
index 3b115b5f6..f873984ad 100644
--- a/doc/guides/rel_notes/release_19_05.rst
+++ b/doc/guides/rel_notes/release_19_05.rst
@@ -130,6 +130,11 @@ New Features
   The library supports two stack implementations: standard (lock-based) and 
lock-free.
   The lock-free implementation is currently limited to x86-64 platforms.
 
+* **Added Lock-Free Stack Mempool Handler.**
+
+  Added a new lock-free stack handler, which uses the newly added stack
+  library.
+
 Removed Items
 -
 
diff --git a/drivers/mempool/stack/rte_mempool_stack.c 
b/drivers/mempool/stack/rte_mempool_stack.c
index 25ccdb9af..7e85c8d6b 100644
--- a/drivers/mempool/stack/rte_mempool_stack.c
+++ b/drivers/mempool/stack/rte_mempool_stack.c
@@ -7,7 +7,7 @@
 #include 
 
 static int
-stack_alloc(struct rte_mempool *mp)
+__stack_alloc(struct rte_mempool *mp, uint32_t flags)
 {
char name[RTE_STACK_NAMESIZE];
struct rte_stack *s;
@@ -20,7 +20,7 @@ stack_alloc(struct rte_mempool *mp)
return -rte_errno;
}
 
-   s = rte_stack_create(name, mp->size, mp->socket_id, 0);
+   s = rte_stack_create(name, mp->size, mp->socket_id, flags);
if (s == NULL)
return -rte_errno;
 
@@ -30,6 +30,18 @@ stack_alloc(struct rte_mempool *mp)
 }
 
 static int
+stack_alloc(struct rte_mempool *mp)
+{
+   return __stack_alloc(mp, 0);
+}
+
+static int
+lf_stack_alloc(struct rte_mempool *mp)
+{
+   return __stack_alloc(mp, RTE_STACK_F_LF);
+}
+
+static int
 stack_enqueue(struct rte_mempool *mp, void * const *obj_table,
  unsigned int n)
 {
@@ -72,4 +84,14 @@ static struct rte_mempool_ops ops_stack = {
.get_count = stack_get_count
 };
 
+static struct rte_mempool_ops ops_lf_stack = {
+   .name = "lf_stack",
+   .alloc = lf_stack_alloc,
+   .free = stack_free,
+   .enqueue = stack_enqueue,
+   .dequeue = stack_dequeue,
+   .get_count = stack_get_count
+};
+
 MEMPOOL_REGISTER_OPS(ops_stack);
+MEMPOOL_REGISTER_OPS(ops_lf_stack);
-- 
2.13.6



[dpdk-dev] [PATCH v5 6/8] stack: add C11 atomic implementation

2019-03-31 Thread Gage Eads
This commit adds an implementation of the lock-free stack push, pop, and
length functions that use __atomic builtins, for systems that benefit from
the finer-grained memory ordering control.

Signed-off-by: Gage Eads 
Reviewed-by: Olivier Matz 
---
 lib/librte_stack/Makefile   |   3 +-
 lib/librte_stack/meson.build|   3 +-
 lib/librte_stack/rte_stack_lf.h |   4 +
 lib/librte_stack/rte_stack_lf_c11.h | 169 
 4 files changed, 177 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_stack/rte_stack_lf_c11.h

diff --git a/lib/librte_stack/Makefile b/lib/librte_stack/Makefile
index 311edd997..8d18ce520 100644
--- a/lib/librte_stack/Makefile
+++ b/lib/librte_stack/Makefile
@@ -23,6 +23,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_STACK) := rte_stack.c \
 SYMLINK-$(CONFIG_RTE_LIBRTE_STACK)-include := rte_stack.h \
  rte_stack_std.h \
  rte_stack_lf.h \
- rte_stack_lf_generic.h
+ rte_stack_lf_generic.h \
+ rte_stack_lf_c11.h
 
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_stack/meson.build b/lib/librte_stack/meson.build
index 7a09a5d66..46fce0c20 100644
--- a/lib/librte_stack/meson.build
+++ b/lib/librte_stack/meson.build
@@ -8,4 +8,5 @@ sources = files('rte_stack.c', 'rte_stack_std.c', 
'rte_stack_lf.c')
 headers = files('rte_stack.h',
'rte_stack_std.h',
'rte_stack_lf.h',
-   'rte_stack_lf_generic.h')
+   'rte_stack_lf_generic.h',
+   'rte_stack_lf_c11.h')
diff --git a/lib/librte_stack/rte_stack_lf.h b/lib/librte_stack/rte_stack_lf.h
index bfd680133..518889a05 100644
--- a/lib/librte_stack/rte_stack_lf.h
+++ b/lib/librte_stack/rte_stack_lf.h
@@ -5,7 +5,11 @@
 #ifndef _RTE_STACK_LF_H_
 #define _RTE_STACK_LF_H_
 
+#ifdef RTE_USE_C11_MEM_MODEL
+#include "rte_stack_lf_c11.h"
+#else
 #include "rte_stack_lf_generic.h"
+#endif
 
 /**
  * @internal Push several objects on the lock-free stack (MT-safe).
diff --git a/lib/librte_stack/rte_stack_lf_c11.h 
b/lib/librte_stack/rte_stack_lf_c11.h
new file mode 100644
index 0..74e3d8eb4
--- /dev/null
+++ b/lib/librte_stack/rte_stack_lf_c11.h
@@ -0,0 +1,169 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+#ifndef _RTE_STACK_LF_C11_H_
+#define _RTE_STACK_LF_C11_H_
+
+#include 
+#include 
+
+static __rte_always_inline unsigned int
+__rte_stack_lf_count(struct rte_stack *s)
+{
+   /* stack_lf_push() and stack_lf_pop() do not update the list's contents
+* and stack_lf->len atomically, which can cause the list to appear
+* shorter than it actually is if this function is called while other
+* threads are modifying the list.
+*
+* However, given the inherently approximate nature of the get_count
+* callback -- even if the list and its size were updated atomically,
+* the size could change between when get_count executes and when the
+* value is returned to the caller -- this is acceptable.
+*
+* The stack_lf->len updates are placed such that the list may appear to
+* have fewer elements than it does, but will never appear to have more
+* elements. If the mempool is near-empty to the point that this is a
+* concern, the user should consider increasing the mempool size.
+*/
+   return (unsigned int)__atomic_load_n(&s->stack_lf.used.len.cnt,
+__ATOMIC_RELAXED);
+}
+
+static __rte_always_inline void
+__rte_stack_lf_push_elems(struct rte_stack_lf_list *list,
+ struct rte_stack_lf_elem *first,
+ struct rte_stack_lf_elem *last,
+ unsigned int num)
+{
+#ifndef RTE_ARCH_X86_64
+   RTE_SET_USED(first);
+   RTE_SET_USED(last);
+   RTE_SET_USED(list);
+   RTE_SET_USED(num);
+#else
+   struct rte_stack_lf_head old_head;
+   int success;
+
+   old_head = list->head;
+
+   do {
+   struct rte_stack_lf_head new_head;
+
+   /* Swing the top pointer to the first element in the list and
+* make the last element point to the old top.
+*/
+   new_head.top = first;
+   new_head.cnt = old_head.cnt + 1;
+
+   last->next = old_head.top;
+
+   /* Use the release memmodel to ensure the writes to the LF LIFO
+* elements are visible before the head pointer write.
+*/
+   success = rte_atomic128_cmp_exchange(
+   (rte_int128_t *)&list->head,
+   (rte_int128_t *)&old_head,
+   (rte_int128_t *)&new_head,
+  

[dpdk-dev] [PATCH v5 7/8] test/stack: add lock-free stack tests

2019-03-31 Thread Gage Eads
This commit adds lock-free stack variants of stack_autotest
(stack_lf_autotest) and stack_perf_autotest (stack_lf_perf_autotest), which
differ only in that the lock-free versions pass the RTE_STACK_F_LF flag to
all rte_stack_create() calls.

Signed-off-by: Gage Eads 
Reviewed-by: Olivier Matz 
---
 app/test/meson.build   |  2 ++
 app/test/test_stack.c  | 41 +++--
 app/test/test_stack_perf.c | 17 +++--
 3 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 02eb788a4..867cc5863 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -178,6 +178,7 @@ fast_parallel_test_names = [
 'sched_autotest',
 'spinlock_autotest',
 'stack_autotest',
+'stack_nb_autotest',
 'string_autotest',
 'table_autotest',
 'tailq_autotest',
@@ -243,6 +244,7 @@ perf_test_names = [
 'ring_pmd_perf_autotest',
 'pmd_perf_autotest',
 'stack_perf_autotest',
+'stack_nb_perf_autotest',
 ]
 
 # All test cases in driver_test_names list are non-parallel
diff --git a/app/test/test_stack.c b/app/test/test_stack.c
index 8392e4e4d..f199136aa 100644
--- a/app/test/test_stack.c
+++ b/app/test/test_stack.c
@@ -97,7 +97,7 @@ test_stack_push_pop(struct rte_stack *s, void **obj_table, 
unsigned int bulk_sz)
 }
 
 static int
-test_stack_basic(void)
+test_stack_basic(uint32_t flags)
 {
struct rte_stack *s = NULL;
void **obj_table = NULL;
@@ -113,7 +113,7 @@ test_stack_basic(void)
for (i = 0; i < STACK_SIZE; i++)
obj_table[i] = (void *)(uintptr_t)i;
 
-   s = rte_stack_create(__func__, STACK_SIZE, rte_socket_id(), 0);
+   s = rte_stack_create(__func__, STACK_SIZE, rte_socket_id(), flags);
if (s == NULL) {
printf("[%s():%u] failed to create a stack\n",
   __func__, __LINE__);
@@ -177,18 +177,18 @@ test_stack_basic(void)
 }
 
 static int
-test_stack_name_reuse(void)
+test_stack_name_reuse(uint32_t flags)
 {
struct rte_stack *s[2];
 
-   s[0] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0);
+   s[0] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), flags);
if (s[0] == NULL) {
printf("[%s():%u] Failed to create a stack\n",
   __func__, __LINE__);
return -1;
}
 
-   s[1] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0);
+   s[1] = rte_stack_create("test", STACK_SIZE, rte_socket_id(), flags);
if (s[1] != NULL) {
printf("[%s():%u] Failed to detect re-used name\n",
   __func__, __LINE__);
@@ -201,7 +201,7 @@ test_stack_name_reuse(void)
 }
 
 static int
-test_stack_name_length(void)
+test_stack_name_length(uint32_t flags)
 {
char name[RTE_STACK_NAMESIZE + 1];
struct rte_stack *s;
@@ -209,7 +209,7 @@ test_stack_name_length(void)
memset(name, 's', sizeof(name));
name[RTE_STACK_NAMESIZE] = '\0';
 
-   s = rte_stack_create(name, STACK_SIZE, rte_socket_id(), 0);
+   s = rte_stack_create(name, STACK_SIZE, rte_socket_id(), flags);
if (s != NULL) {
printf("[%s():%u] Failed to prevent long name\n",
   __func__, __LINE__);
@@ -328,7 +328,7 @@ stack_thread_push_pop(void *args)
 }
 
 static int
-test_stack_multithreaded(void)
+test_stack_multithreaded(uint32_t flags)
 {
struct test_args *args;
unsigned int lcore_id;
@@ -349,7 +349,7 @@ test_stack_multithreaded(void)
return -1;
}
 
-   s = rte_stack_create("test", STACK_SIZE, rte_socket_id(), 0);
+   s = rte_stack_create("test", STACK_SIZE, rte_socket_id(), flags);
if (s == NULL) {
printf("[%s():%u] Failed to create a stack\n",
   __func__, __LINE__);
@@ -384,9 +384,9 @@ test_stack_multithreaded(void)
 }
 
 static int
-test_stack(void)
+__test_stack(uint32_t flags)
 {
-   if (test_stack_basic() < 0)
+   if (test_stack_basic(flags) < 0)
return -1;
 
if (test_lookup_null() < 0)
@@ -395,16 +395,29 @@ test_stack(void)
if (test_free_null() < 0)
return -1;
 
-   if (test_stack_name_reuse() < 0)
+   if (test_stack_name_reuse(flags) < 0)
return -1;
 
-   if (test_stack_name_length() < 0)
+   if (test_stack_name_length(flags) < 0)
return -1;
 
-   if (test_stack_multithreaded() < 0)
+   if (test_stack_multithreaded(flags) < 0)
return -1;
 
return 0;
 }
 
+static int
+test_stack(void)
+{
+   return __test_stack(0);
+}
+
+static int
+test_lf_stack(void)
+{
+   return __test_stack(RTE_STACK_F_LF);
+}
+
 REGISTER_TEST_COMMAND(stack_autotest, test_stack);
+REGISTER_TEST_COMMAND(stack_lf_autotest, test_lf_stack);
diff --git a/app/test/test_stack_perf.c b/app/test/test_st

Re: [dpdk-dev] [PATCH v2 2/4] ethdev: add siblings iterators

2019-03-31 Thread Thomas Monjalon
27/02/2019 11:51, Thomas Monjalon:
> 27/02/2019 11:07, Gaëtan Rivet:
> > On Wed, Feb 20, 2019 at 11:10:49PM +0100, Thomas Monjalon wrote:
> > > +uint16_t __rte_experimental
> > > +rte_eth_find_next_of(uint16_t port_id, const struct rte_device *parent)
> > > +{
> > > + while (port_id < RTE_MAX_ETHPORTS &&
> > > + rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED &&
> > > + rte_eth_devices[port_id].device != parent)
> > > + port_id++;
> > 
> > Why not call rte_eth_find_next directly from this function, and
> > add your specific test on top of it?
> > 
> > Something like:
> > 
> > while (port_id < RTE_MAX_ETHPORTS &&
> >rte_eth_devices[port_id].device != parent)
> > port_id = rte_eth_find_next(port_id + 1);
> > 
> > this way you won't have to rewrite the test on the device state. Having the
> > logic expressed in several places would make reworking the device states 
> > more
> > complicated than necessary if it were to happen (just as you did when 
> > switching
> > the test from !(ATTACHED || REMOVED) to (UNUSED).
> 
> About the intent, you are right.
> About the solution, it seems buggy. We can try to find another way
> of coding this loop by using rte_eth_find_next()
> and adding the parent condition.

Your proposal is correct if adding a first call before the loop:
port_id = rte_eth_find_next(port_id);





Re: [dpdk-dev] [PATCH v2 2/4] ethdev: add siblings iterators

2019-03-31 Thread Thomas Monjalon
19/03/2019 19:04, Ferruh Yigit:
> On 3/19/2019 5:34 PM, Thomas Monjalon wrote:
> >>> +uint16_t __rte_experimental
> >> 
> >> Do we need _rte_experimental on function definitions? I guess only in .h 
> >> file,
> >> function declaration is enough.
> > 
> > Yes we need them both in .h and .c files.
> 
> Why we need them in .c file?
> I think the compiler is interested in ones in .h file, because of the
> experimental checks.

We need the tag in .c file because a check is done in the ELF object
by buildtools/check-experimental-syms.sh

David tried a replacement of this script to run on header files,
but it looks a bit slow:
https://patches.dpdk.org/patch/49118/





[dpdk-dev] [PATCH v3 0/4] ethdev iterators for multi-ports device

2019-03-31 Thread Thomas Monjalon
Add port iterators in order to allow listing easily
the ports of the same device.

The iterators can be tested by using mlx5 or testpmd.


v3: changes only in the (main) patch 2


Thomas Monjalon (4):
  ethdev: simplify port state comparisons
  ethdev: add siblings iterators
  net/mlx5: use port sibling iterators
  app/testpmd: use port sibling iterator in device cleanup

 app/test-pmd/testpmd.c   |  4 +-
 drivers/net/mlx5/mlx5.c  | 34 +
 drivers/net/mlx5/mlx5_ethdev.c   |  6 +--
 lib/librte_ethdev/rte_ethdev.c   | 25 --
 lib/librte_ethdev/rte_ethdev.h   | 63 
 lib/librte_ethdev/rte_ethdev_version.map |  2 +
 6 files changed, 101 insertions(+), 33 deletions(-)

-- 
2.21.0



[dpdk-dev] [PATCH v3 1/4] ethdev: simplify port state comparisons

2019-03-31 Thread Thomas Monjalon
There are three states for an ethdev port.
Checking that the port is unused looks simpler than
checking it is neither attached nor removed.

Signed-off-by: Thomas Monjalon 
Reviewed-by: Andrew Rybchenko 
---
 lib/librte_ethdev/rte_ethdev.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 10bdfb37e..33cffc498 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -330,8 +330,7 @@ uint16_t
 rte_eth_find_next(uint16_t port_id)
 {
while (port_id < RTE_MAX_ETHPORTS &&
-  rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
-  rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED)
+   rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED)
port_id++;
 
if (port_id >= RTE_MAX_ETHPORTS)
@@ -567,8 +566,7 @@ uint64_t
 rte_eth_find_next_owned_by(uint16_t port_id, const uint64_t owner_id)
 {
while (port_id < RTE_MAX_ETHPORTS &&
-  ((rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
-  rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED) ||
+  (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED ||
   rte_eth_devices[port_id].data->owner.id != owner_id))
port_id++;
 
-- 
2.21.0



[dpdk-dev] [PATCH v3 2/4] ethdev: add siblings iterators

2019-03-31 Thread Thomas Monjalon
If multiple ports share the same hardware device (rte_device),
they are siblings and can be found thanks to the new functions
and loop macros.
One iterator takes a port id as reference,
while the other one directly refers to the parent device.

The ownership is not checked because siblings may have
different owners.

Signed-off-by: Thomas Monjalon 
---
v2: Reviewed-by: Andrew Rybchenko - not kept because of changes in v3

v3:
- fix logic + re-use rte_eth_find_next()
- longer parameter names
- more and better doxygen comments
---
 lib/librte_ethdev/rte_ethdev.c   | 19 +++
 lib/librte_ethdev/rte_ethdev.h   | 63 
 lib/librte_ethdev/rte_ethdev_version.map |  2 +
 3 files changed, 84 insertions(+)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 33cffc498..3b125a642 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -339,6 +339,25 @@ rte_eth_find_next(uint16_t port_id)
return port_id;
 }
 
+uint16_t __rte_experimental
+rte_eth_find_next_of(uint16_t port_id, const struct rte_device *parent)
+{
+   port_id = rte_eth_find_next(port_id);
+   while (port_id < RTE_MAX_ETHPORTS &&
+   rte_eth_devices[port_id].device != parent)
+   port_id = rte_eth_find_next(port_id + 1);
+
+   return port_id;
+}
+
+uint16_t __rte_experimental
+rte_eth_find_next_sibling(uint16_t port_id, uint16_t ref_port_id)
+{
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(ref_port_id, RTE_MAX_ETHPORTS);
+   return rte_eth_find_next_of(port_id,
+   rte_eth_devices[ref_port_id].device);
+}
+
 static void
 rte_eth_dev_shared_data_prepare(void)
 {
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index b6023c050..3d5bacaee 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1387,6 +1387,69 @@ uint16_t rte_eth_find_next(uint16_t port_id);
 #define RTE_ETH_FOREACH_DEV(p) \
RTE_ETH_FOREACH_DEV_OWNED_BY(p, RTE_ETH_DEV_NO_OWNER)
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterates over ethdev ports of a specified device.
+ *
+ * @param port_id_start
+ *   The id of the next possible valid port.
+ * @param parent
+ *   The generic device behind the ports to iterate.
+ * @return
+ *   Next port id of the device, possibly port_id_start,
+ *   RTE_MAX_ETHPORTS if there is none.
+ */
+__rte_experimental
+uint16_t rte_eth_find_next_of(uint16_t port_id_start,
+   const struct rte_device *parent);
+
+/**
+ * Macro to iterate over all ethdev ports of a specified device.
+ *
+ * @param port_id
+ *   The id of the matching port being iterated.
+ * @param parent
+ *   The rte_device pointer matching the iterated ports.
+ */
+#define RTE_ETH_FOREACH_DEV_OF(port_id, parent) \
+   for (port_id = rte_eth_find_next_of(0, parent); \
+   port_id < RTE_MAX_ETHPORTS; \
+   port_id = rte_eth_find_next_of(port_id + 1, parent))
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterates over sibling ethdev ports (i.e. sharing the same rte_device).
+ *
+ * @param port_id_start
+ *   The id of the next possible valid sibling port.
+ * @param ref_port_id
+ *   The id of a reference port to compare rte_device with.
+ * @return
+ *   Next sibling port id, possibly port_id_start or ref_port_id itself,
+ *   RTE_MAX_ETHPORTS if there is none.
+ */
+__rte_experimental
+uint16_t rte_eth_find_next_sibling(uint16_t port_id_start,
+   uint16_t ref_port_id);
+
+/**
+ * Macro to iterate over all ethdev ports sharing the same rte_device
+ * as the specified port.
+ * Note: the specified reference port is part of the loop iterations.
+ *
+ * @param port_id
+ *   The id of the matching port being iterated.
+ * @param ref_port_id
+ *   The id of the port being compared.
+ */
+#define RTE_ETH_FOREACH_DEV_SIBLING(port_id, ref_port_id) \
+   for (port_id = rte_eth_find_next_sibling(0, ref_port_id); \
+   port_id < RTE_MAX_ETHPORTS; \
+   port_id = rte_eth_find_next_sibling(port_id + 1, ref_port_id))
 
 /**
  * @warning
diff --git a/lib/librte_ethdev/rte_ethdev_version.map 
b/lib/librte_ethdev/rte_ethdev_version.map
index 92ac3de25..b37a4167d 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -245,6 +245,8 @@ EXPERIMENTAL {
rte_eth_dev_owner_set;
rte_eth_dev_owner_unset;
rte_eth_dev_rx_intr_ctl_q_get_fd;
+   rte_eth_find_next_of;
+   rte_eth_find_next_sibling;
rte_eth_switch_domain_alloc;
rte_eth_switch_domain_free;
rte_flow_conv;
-- 
2.21.0



[dpdk-dev] [PATCH v3 4/4] app/testpmd: use port sibling iterator in device cleanup

2019-03-31 Thread Thomas Monjalon
When removing a rte_device on a port-based request,
all the sibling ports must be marked as closed.
The iterator loop can be simplified by using the dedicated macro.

Signed-off-by: Thomas Monjalon 
---
 app/test-pmd/testpmd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 40c873b97..aeaa74c98 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2370,9 +2370,7 @@ detach_port_device(portid_t port_id)
return;
}
 
-   for (sibling = 0; sibling < RTE_MAX_ETHPORTS; sibling++) {
-   if (rte_eth_devices[sibling].device != dev)
-   continue;
+   RTE_ETH_FOREACH_DEV_SIBLING(sibling, port_id) {
/* reset mapping between old ports and removed device */
rte_eth_devices[sibling].device = NULL;
if (ports[sibling].port_status != RTE_PORT_CLOSED) {
-- 
2.21.0



[dpdk-dev] [PATCH v3 3/4] net/mlx5: use port sibling iterators

2019-03-31 Thread Thomas Monjalon
Iterating over siblings was done with RTE_ETH_FOREACH_DEV()
which skips the owned ports.
The new iterators RTE_ETH_FOREACH_DEV_SIBLING()
and RTE_ETH_FOREACH_DEV_OF() are more appropriate and more correct.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/mlx5/mlx5.c| 34 +-
 drivers/net/mlx5/mlx5_ethdev.c |  6 +-
 2 files changed, 14 insertions(+), 26 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1d7ca615b..3287a3d78 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -493,17 +493,15 @@ mlx5_dev_close(struct rte_eth_dev *dev)
dev->data->port_id);
if (priv->domain_id != RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID) {
unsigned int c = 0;
-   unsigned int i = mlx5_dev_to_port_id(dev->device, NULL, 0);
-   uint16_t port_id[i];
+   uint16_t port_id;
 
-   i = RTE_MIN(mlx5_dev_to_port_id(dev->device, port_id, i), i);
-   while (i--) {
+   RTE_ETH_FOREACH_DEV_OF(port_id, dev->device) {
struct mlx5_priv *opriv =
-   rte_eth_devices[port_id[i]].data->dev_private;
+   rte_eth_devices[port_id].data->dev_private;
 
if (!opriv ||
opriv->domain_id != priv->domain_id ||
-   &rte_eth_devices[port_id[i]] == dev)
+   &rte_eth_devices[port_id] == dev)
continue;
++c;
}
@@ -1147,22 +1145,16 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 * Look for sibling devices in order to reuse their switch domain
 * if any, otherwise allocate one.
 */
-   i = mlx5_dev_to_port_id(dpdk_dev, NULL, 0);
-   if (i > 0) {
-   uint16_t port_id[i];
+   RTE_ETH_FOREACH_DEV_OF(port_id, dpdk_dev) {
+   const struct mlx5_priv *opriv =
+   rte_eth_devices[port_id].data->dev_private;
 
-   i = RTE_MIN(mlx5_dev_to_port_id(dpdk_dev, port_id, i), i);
-   while (i--) {
-   const struct mlx5_priv *opriv =
-   rte_eth_devices[port_id[i]].data->dev_private;
-
-   if (!opriv ||
-   opriv->domain_id ==
-   RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID)
-   continue;
-   priv->domain_id = opriv->domain_id;
-   break;
-   }
+   if (!opriv ||
+   opriv->domain_id ==
+   RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID)
+   continue;
+   priv->domain_id = opriv->domain_id;
+   break;
}
if (priv->domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID) {
err = rte_eth_switch_domain_alloc(&priv->domain_id);
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7273bd940..e01b698fc 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1398,11 +1398,7 @@ mlx5_dev_to_port_id(const struct rte_device *dev, 
uint16_t *port_list,
uint16_t id;
unsigned int n = 0;
 
-   RTE_ETH_FOREACH_DEV(id) {
-   struct rte_eth_dev *ldev = &rte_eth_devices[id];
-
-   if (ldev->device != dev)
-   continue;
+   RTE_ETH_FOREACH_DEV_OF(id, dev) {
if (n < port_list_n)
port_list[n] = id;
n++;
-- 
2.21.0



Re: [dpdk-dev] [PATCH v3] app/pdump: enhance to support multi-core capture

2019-03-31 Thread Varghese, Vipin
Hi Reshma & Ferruh,

Summarizing the discussion with maintainer and proposed changes below.

1. Agreed to make changes for migrating 'strlcpy' to 'rte_strlcpy'.
2. Agreed to make changes for spelling error.
3. Informed it is user decision to enable multiple core capture for queue-pair.
4. Informed master core cannot participate in capture, as it requires book 
keeping for future enhancements like file size, and packet count.
5. Disagreed to the statement '-l cores' option is confusing as user should 
have the option to specify the cores.
6. Disagreed to the option suggested from maintainer to enhance '--pdump to add 
core', as it leads to combinations when option is passed for a few and not for 
others.
7. Printing port-queue pair instance with core as debug is agreed, but sharing 
information once capture is stopped does not look useful. But will enhance to 
do same.
8. In my humble opinion, there should be default core for call back which is 
core 0. Removing 'c_flag' is not right way after rte_eal_init is not correct. 
Hence user arguments especially (if pdump is to run on multiple cores) should 
be checked before rte_eal_init.

Action Item:
1. Send or wait for patch, to remove the core default value.
2. Send v5 for the above agreed points.

Thanks
Vipin Varghese


> -Original Message-
> From: Yigit, Ferruh
> Sent: Friday, March 29, 2019 10:33 PM
> To: Varghese, Vipin ; Pattan, Reshma
> ; dev@dpdk.org
> Cc: Wiles, Keith ; Mcnamara, John
> ; Byrne, Stephen1 ;
> Tamboli, Amit S ; Padubidri, Sanjay A
> ; Patel, Amol ; Kovacevic,
> Marko 
> Subject: Re: [dpdk-dev] [PATCH v3] app/pdump: enhance to support multi-core
> capture
> 
> On 3/29/2019 10:22 AM, Varghese, Vipin wrote:
> > Hi Reshma,
> >
> > snipped
> >>>
> >>>  /* true if x is a power of 2 */
> >>>  #define POWEROF2(x) x)-1) & (x)) == 0) @@ -144,7 +145,7 @@
> >>> static volatile uint8_t quit_signal;  static void  pdump_usage(const char
> *prgname)  {
> >>> - printf("usage: %s [EAL options] -- --pdump "
> >>> + printf("usage: %s [EAL options] -- [-l ] --pdump "
> >>
> >> Using -l option same as eal is confusing. Please use other name.
> > Current implementation passes core-mask '-cx1' as EAL argument. The check 
> > for
> user argument '-l  identified
> it is replaced with c_flag.
> >
> > Hence I disagree to the point it is confusing.
> 
> I agree with Reshma, if there is a need to run in different cores, lets 
> remove the
> hardcoded core information from application and manage the core selection in
> eal level, instead of having this in application.
> 
> And in app level, you can say which core to use for that specific pdump, 
> overall
> something like:
> 
> dpdk-pdump -l 20,23 -- --pdump 'port=0,queue=*,core=21,rx-
> dev=/tmp/rx.pcap'
> --pdump 'port=1,queue=*,core=22,tx-dev=/tmp/tx.pcap'
> 
> 
> >
> >> Also how about moving this  new option inside --pdump"" so it will be
> >> clearly known that the particular core will be associated to that tuple.
> > Yes, this can be done.
> >
> >>
> >> Also, I have some major concern, check my below comments.
> > Thanks for your concerns, let me try to address them below.
> >
> >>
> >>>   "'(port= | device_id=),"
> >>>   "(queue=),"
> >>>   "(rx-dev= |"
> >>> @@ -415,6 +416,7 @@ print_pdump_stats(void)
> >>>   for (i = 0; i < num_tuples; i++) {
> >>>   printf("# PDUMP DEBUG STATS #\n");
> >>>   pt = &pdump_t[i];
> >>> + printf(" == DPDK interface (%d) ==\n", i);
> >>
> >> Here good to print the portid/deviceid and queue info, instead of
> >> printing pdump tuple index  i? User might not understand that.
> > I am not sure, why you mention that I am displaying tuple index with I here?
> >
> >> Use ### instead of === as above.
> > I can do this, but is there specific reasoning for "" as it is used to 
> > represent
> main header?
> >
> >>
> >>> +
> >>>  static inline void
> >>>  dump_packets(void)
> >>>  {
> >>>   int i;
> >>> - struct pdump_tuples *pt;
> >>> + uint32_t lcore_id = 0;
> >>> +
> >>> + lcore_id = rte_get_next_lcore(lcore_id, 1, 1);
> >>> +
> >>> + if (rte_lcore_count() == 1) {
> >>> + while (!quit_signal) {
> >>> + for (i = 0; i < num_tuples; i++) {
> >>> + struct pdump_tuples *pt = &pdump_t[i];
> >>> + pdump_packets(pt);
> >>> + }
> >>> + }
> >>> + } else {
> >>> + printf(" Tuples (%u) lcores (%u)\n",
> >>> + num_tuples, rte_lcore_count());
> >>> +
> >>> + if ((uint32_t)num_tuples >= rte_lcore_count()) {
> >>> + printf("Insufficent Cores\n");
> >> Typo %s/Insufficent/
> > Ok
> >
> >>
> >>
> >>> + for (i = 0; i < argc; i++) {
> >>> + if (strstr(argv[i], "-l")) {
> >>> + snprintf(c_flag, RTE_DIM(c_flag), "-l %s", argv[i+1]);
> >>
> >> You are taking this as application arguments then making it as eal
> >> argument  to run the application.
> 

Re: [dpdk-dev] [PATCH v7 0/2] guide to debug and troubleshoot.

2019-03-31 Thread Varghese, Vipin
Hi John and Marko,

Waiting for updates for changes if any, as these are not pushed for 19.02. 

Please do let me know if there are changes required any more?

Thanks
Vipin Varghese

> -Original Message-
> From: Varghese, Vipin
> Sent: Monday, February 25, 2019 10:42 PM
> To: Mcnamara, John ; Kovacevic, Marko
> ; tho...@monjalon.net; Yigit, Ferruh
> ; shreyansh.j...@nxp.com; dev@dpdk.org
> Cc: Padubidri, Sanjay A ; Patel, Amol
> ; Varghese, Vipin 
> Subject: [PATCH v7 0/2] guide to debug and troubleshoot.
> 
> The patch series adds a how-to guide for debugging and troubleshooting tips.
> 
> Motivation
> ==
> 
> DPDK proc-info tool is been enhanced to accommodate the debug information
> for the port, traffic manager crypto, ring and mempool contents. With these
> additional information, it becomes easy to analyze issues and performance
> variance.
> 
> But applications are designed based on the target platform, workload, poll 
> mode
> drivers, and multi-process. This raises variance in debugging and collecting 
> data.
> Hence attempt of patch series is identified such symptoms and share step by 
> step
> guide to cover the cases.
> 
> Not all possible cases could be covered in a single attempt. But with 
> feedback and
> support from the community, this can be expanded.
> 
> Status
> ==
> 
> Reviews and changes accommodated. ACK received for documentation and SVG
> files.
> 
> Change Log:
> ==
> 
> V7:
>  - add space to note and indent - John Macnamara
> 
> V6:
>  - correction for word style and grammar - Thomas Monjalon
>  - add license for svg files - Vipin Varghese
> 
> v5:
>  - rework of content - Vipin Varghese
> 
> V4:
>  - Correction for word style - Shreyansh Jain
> 
> V3:
>  - reorder for removing warning in 'make doc-guides-html' - Thomas Monjalon
> 
> V2:
>  - add offload flag check - Vipin Varghese
>  - change tab to space - Marko Kovacevic
>  - spelling correction - Marko Kovacevic
>  - remove extra characters - Marko Kovacevic
>  - add ACK by Marko - Vipn Varghese
>  - add ACK from Marko - Vipin Varghese
> 
> 
> Vipin Varghese (2):
>   doc: add svg for debug and troubleshoot guide
>   doc: add guide for debug and troubleshoot
> 
>  doc/guides/howto/debug_troubleshoot_guide.rst | 465 ++
>  doc/guides/howto/img/dtg_consumer_ring.svg|  24 +
>  doc/guides/howto/img/dtg_crypto.svg   |  21 +
>  .../howto/img/dtg_distributor_worker.svg  |  36 ++
>  doc/guides/howto/img/dtg_mempool.svg  |  27 +
>  doc/guides/howto/img/dtg_pdump.svg|  33 ++
>  doc/guides/howto/img/dtg_producer_ring.svg|  24 +
>  doc/guides/howto/img/dtg_qos_tx.svg   |  29 ++
>  doc/guides/howto/img/dtg_rx_rate.svg  |  25 +
>  doc/guides/howto/img/dtg_rx_tx_drop.svg   |  33 ++
>  doc/guides/howto/img/dtg_sample_app_model.svg | 110 +
>  doc/guides/howto/img/dtg_service.svg  |  20 +
>  doc/guides/howto/index.rst|   1 +
>  13 files changed, 848 insertions(+)
>  create mode 100644 doc/guides/howto/debug_troubleshoot_guide.rst
>  create mode 100644 doc/guides/howto/img/dtg_consumer_ring.svg
>  create mode 100644 doc/guides/howto/img/dtg_crypto.svg
>  create mode 100644 doc/guides/howto/img/dtg_distributor_worker.svg
>  create mode 100644 doc/guides/howto/img/dtg_mempool.svg
>  create mode 100644 doc/guides/howto/img/dtg_pdump.svg
>  create mode 100644 doc/guides/howto/img/dtg_producer_ring.svg
>  create mode 100644 doc/guides/howto/img/dtg_qos_tx.svg
>  create mode 100644 doc/guides/howto/img/dtg_rx_rate.svg
>  create mode 100644 doc/guides/howto/img/dtg_rx_tx_drop.svg
>  create mode 100644 doc/guides/howto/img/dtg_sample_app_model.svg
>  create mode 100644 doc/guides/howto/img/dtg_service.svg
> 
> --
> 2.17.1



Re: [dpdk-dev] [PATCH v4 00/38] ice share code update.

2019-03-31 Thread Zhang, Qi Z



> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Monday, April 1, 2019 1:06 AM
> To: Zhang, Qi Z 
> Cc: dev@dpdk.org; Lu, Wenzhuo ; Yang, Qiming
> ; Stillwell Jr, Paul M ;
> Yigit, Ferruh ; O'Hare, Cathal 
> ;
> Mcnamara, John 
> Subject: Re: [dpdk-dev] [PATCH v4 00/38] ice share code update.
> 
> 25/03/2019 08:07, Zhang, Qi Z:
> > > Sync to latest kernel driver, main changes:
> > > 1. add DCB/FDIR support.
> > > 2. add more APIs in switch module.
> > > 3. code clean and bug fix.
> > >
> > > Qi Zhang (38):
> > >   net/ice/base: add switch resource allocation and free
> > >   net/ice/base: improve comments
> > >   net/ice/base: add two helper functions
> > >   net/ice/base: add helper macros
> > >   net/ice/base: allow package copy to be used after resets
> > >   net/ice/base: clean code
> > >   net/ice/base: declare functions as external
> > >   net/ice/base: add more APIs in switch module
> > >   net/ice/base: add VSI queue context framework
> > >   net/ice/base: add APIs to add remove ethertype filter
> > >   net/ice/base: add APIs to get allocated resources
> > >   net/ice/base: add APIs to alloc/free resource counter
> > >   net/ice/base: add APIs to get VSI promiscuous mode
> > >   net/ice/base: add MAC filter with marker and counter
> > >   net/ice/base: add two helper functions for flow management
> > >   net/ice/base: fix minor issues
> > >   net/ice/base: update macros
> > >   net/ice/base: clean code
> > >   net/ice/base: enable VSI queue context
> > >   net/ice/base: ensure only valid bits are set
> > >   net/ice/base: enhance get link status command
> > >   net/ice/base: add RSS key related macro and structures
> > >   net/ice/base: do not write TCAM entries back
> > >   net/ice/base: remove local VSIG allocations
> > >   net/ice/base: fix minor issues
> > >   net/ice/base: update copyright time
> > >   net/ice/base: fix Klockwork analysis reported issues
> > >   net/ice/base: return config error without queue to disable
> > >   net/ice/base: add function to check FW recovery mode
> > >   net/ice/base: change profile id reference counting
> > >   net/ice/base: add DCB support
> > >   net/ice/base: add FDIR support
> > >   net/ice/base: change profile priority for RSS reply
> > >   net/ice/base: fix duplicate resource allocations
> > >   net/ice/base: fix minor issues
> > >   net/ice/base: increase prototol offset size
> > >   net/ice/base: revert the workaround for resource allocation
> > >   net/ice/base: rework on bit ops
> >
> > Applied to dpdk-next-net-intel.
> >
> > Thanks
> > Qi
> 
> 3 commits have the title "fix minor issues", that's a funny performance.
> When the ice PMD will enter in a serious phase, you should write some real
> commit titles and explain what are the fixed issues.
> For DPDK 19.05, I guess it's fine because ice PMD is not going to be really 
> used
> soon, right?

Yes, the share code in 19.05 can be regarded as an initial version, the commit 
log is not quite sensitive for developers.
Thanks to forgive this rough, this will not happen in our following share code 
update :)


> 



Re: [dpdk-dev] [PATCH v7 0/8] Support vector instructions on ICE

2019-03-31 Thread Lu, Wenzhuo
Hi Thomas,


> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Sunday, March 31, 2019 11:52 PM
> To: Yigit, Ferruh ; Lu, Wenzhuo
> ; Zhang, Qi Z 
> Cc: dev@dpdk.org; O'Hare, Cathal ; Mcnamara,
> John 
> Subject: Re: [dpdk-dev] [PATCH v7 0/8] Support vector instructions on ICE
> 
> 26/03/2019 10:50, Ferruh Yigit:
> > > Wenzhuo Lu (8):
> > >   net/ice: fix Tx function setting
> > >   net/ice: add pointer for queue buffer release
> > >   net/ice: support vector SSE in RX
> > >   net/ice: support Rx scatter SSE vector
> > >   net/ice: support Tx SSE vector
> > >   net/ice: support Rx AVX2 vector
> > >   net/ice: support Rx scatter AVX2 vector
> > >   net/ice: support vector AVX2 in TX
> >
> > This version (v7) pulled from next-net-intel to next-net.
> 
> I assume these patches have been tested, or at least compiled.
> However, when running devtools/test-meson-builds.sh, there is a
> compilation error for build-x86-default:
> 
> In file included from ../drivers/net/ice/ice_ethdev.h:10:
> rte_ethdev_pci.h:38:10: fatal error: 'rte_pci.h' file not found
> 
> It can be fixed in
>   net/ice: support Rx AVX2 vector
> by adding static_rte_pci and static_rte_bus_pci to the dependencies.
> I fixed it even better in
>   net/ice: support vector SSE in Rx
> by replacing the useless include of rte_ethdev_pci.h in ice_ethdev.h with
> rte_ethdev_driver.h.
Really sorry for this. Although I don't understand the issue. I do use meson 
build and it works.
In my server,  no matter using " rte_ethdev_pci.h " or " rte_ethdev_driver.h ", 
it works fine.
To be honest, the compile error looks weird to me. Looks like any file which 
includes " rte_ethdev_pci.h " can hit the same problem. But I cannot tell 
anything, as I cannot reproduce the error.
Again, really appreciate for root causing and fixing the error but not 
rejecting the patches.

> 
> I could just reject the next-net tree, but I don't really have such option if 
> we
> want to close 19.05-rc1 quickly.
> 
> In summary, I am spending my Sunday hours to fix the mess in your driver
> which was supposed to be tested before submitting, plus before merge in
> next-net-intel, plus compilation-tested before pull in next-net.
> I don't know what failed in the process, but I really don't like it.
> I don't want to see any new patch for ice PMD in 19.05 cycle.
> If you really need some fixes in 19.05 (very likely given the mass code drop
> you are doing few days before the -rc1 deadline), then I advise you to
> double check everything and make commits fully justified and explained.
> 
> Sorry for the bad mood, and I hope it won't happen again soon.
> 



Re: [dpdk-dev] [PATCH v2] net/ixgbe: enable 10Mb/s link setup for x553

2019-03-31 Thread Zhao1, Wei



> -Original Message-
> From: Stillwell Jr, Paul M
> Sent: Saturday, March 30, 2019 2:47 AM
> To: Zhao1, Wei ; dev@dpdk.org
> Cc: Lu, Wenzhuo ; sta...@dpdk.org; Zhang, Qi Z
> ; Zhao1, Wei 
> Subject: RE: [dpdk-dev] [PATCH v2] net/ixgbe: enable 10Mb/s link setup for
> x553
> 
> 
> > -Original Message-
> > From: dev  On Behalf Of Wei Zhao
> > Sent: Friday, March 29, 2019 1:16 AM
> > To: dev@dpdk.org
> > Cc: Lu, Wenzhuo ; sta...@dpdk.org; Zhang, Qi Z
> > ; Zhao1, Wei 
> > Subject: [dpdk-dev] [PATCH v2] net/ixgbe: enable 10Mb/s link setup for
> > x553
> >
> > There is need to eanble 10Mb/s link foixgbe NIC of x553.
> 
> Couple of typos, how about:
> 
> Enable 10Mb/s link for ixgbe x553.


Good,  update in v2

> 
> > This new device has own device id of 0x15E4 and 0x15E5, so ixgbe PMD
> > driver need to special check when setup link for these two types of device.
> >
> > Signed-off-by: Wei Zhao 
> >
> > ---
> >
> > v2:
> > delete test code and change permit link speed
> > ---
> >  drivers/net/ixgbe/ixgbe_ethdev.c | 12 +++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > index 97e1021..0aef577 100644
> > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > @@ -2701,6 +2701,10 @@ static int eth_ixgbevf_pci_remove(struct
> > rte_pci_device *pci_dev)
> > allowed_speeds = ETH_LINK_SPEED_100M |
> ETH_LINK_SPEED_1G |
> > ETH_LINK_SPEED_2_5G |  ETH_LINK_SPEED_5G |
> > ETH_LINK_SPEED_10G;
> > +   if (hw->device_id == IXGBE_DEV_ID_X550EM_A_1G_T ||
> > +   hw->device_id ==
> > IXGBE_DEV_ID_X550EM_A_1G_T_L)
> > +   allowed_speeds = ETH_LINK_SPEED_10M |
> > +   ETH_LINK_SPEED_100M |
> > ETH_LINK_SPEED_1G;
> > break;
> > default:
> > allowed_speeds = ETH_LINK_SPEED_100M |
> ETH_LINK_SPEED_1G | @@
> > -2742,6 +2746,8 @@ static int eth_ixgbevf_pci_remove(struct
> > rte_pci_device *pci_dev)
> > speed |= IXGBE_LINK_SPEED_1GB_FULL;
> > if (*link_speeds & ETH_LINK_SPEED_100M)
> > speed |= IXGBE_LINK_SPEED_100_FULL;
> > +   if (*link_speeds & ETH_LINK_SPEED_10M)
> > +   speed |= IXGBE_LINK_SPEED_10_FULL;
> > }
> >
> > err = ixgbe_setup_link(hw, speed, link_up); @@ -4061,8 +4067,12
> @@
> > static int ixgbevf_dev_xstats_get_names(__rte_unused struct
> > rte_eth_dev *dev,
> > switch (link_speed) {
> > default:
> > case IXGBE_LINK_SPEED_UNKNOWN:
> > +   if (hw->device_id == IXGBE_DEV_ID_X550EM_A_1G_T ||
> > +   hw->device_id ==
> > IXGBE_DEV_ID_X550EM_A_1G_T_L)
> > +   link.link_speed = ETH_SPEED_NUM_10M;
> > +   else
> > +   link.link_speed = ETH_SPEED_NUM_100M;
> > link.link_duplex = ETH_LINK_FULL_DUPLEX;
> > -   link.link_speed = ETH_SPEED_NUM_100M;
> > break;
> >
> > case IXGBE_LINK_SPEED_100_FULL:
> > --
> > 1.8.3.1



Re: [dpdk-dev] [PATCH v2 2/4] ethdev: add siblings iterators

2019-03-31 Thread David Marchand
On Mon, Apr 1, 2019 at 4:16 AM Thomas Monjalon  wrote:

> 19/03/2019 19:04, Ferruh Yigit:
> > On 3/19/2019 5:34 PM, Thomas Monjalon wrote:
> > >>> +uint16_t __rte_experimental
> > >>
> > >> Do we need _rte_experimental on function definitions? I guess only in
> .h file,
> > >> function declaration is enough.
> > >
> > > Yes we need them both in .h and .c files.
> >
> > Why we need them in .c file?
> > I think the compiler is interested in ones in .h file, because of the
> > experimental checks.
>
> We need the tag in .c file because a check is done in the ELF object
> by buildtools/check-experimental-syms.sh
>

?
The attribute should be inherited from the declaration in the header.
If you have a case where it does not work, I'd like to look at it.



> David tried a replacement of this script to run on header files,
> but it looks a bit slow:
> https://patches.dpdk.org/patch/49118/


Never got any review/comment, will do a quick update in this thread.


-- 
David Marchand


[dpdk-dev] [PATCH v3] net/ixgbe: enable 10Mb/s link setup for x553

2019-03-31 Thread Wei Zhao
This patch enable 10Mb/s link for ixgbe x553.
This new device has own device id of 0x15E4 and 0x15E5, so
ixgbe PMD driver need to special check when setup link for
these two types of device.

Signed-off-by: Wei Zhao 

---

v2:
delete test code and change permit link speed

V3:
update git log info
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 97e1021..0aef577 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -2701,6 +2701,10 @@ static int eth_ixgbevf_pci_remove(struct rte_pci_device 
*pci_dev)
allowed_speeds = ETH_LINK_SPEED_100M | ETH_LINK_SPEED_1G |
ETH_LINK_SPEED_2_5G |  ETH_LINK_SPEED_5G |
ETH_LINK_SPEED_10G;
+   if (hw->device_id == IXGBE_DEV_ID_X550EM_A_1G_T ||
+   hw->device_id == IXGBE_DEV_ID_X550EM_A_1G_T_L)
+   allowed_speeds = ETH_LINK_SPEED_10M |
+   ETH_LINK_SPEED_100M | ETH_LINK_SPEED_1G;
break;
default:
allowed_speeds = ETH_LINK_SPEED_100M | ETH_LINK_SPEED_1G |
@@ -2742,6 +2746,8 @@ static int eth_ixgbevf_pci_remove(struct rte_pci_device 
*pci_dev)
speed |= IXGBE_LINK_SPEED_1GB_FULL;
if (*link_speeds & ETH_LINK_SPEED_100M)
speed |= IXGBE_LINK_SPEED_100_FULL;
+   if (*link_speeds & ETH_LINK_SPEED_10M)
+   speed |= IXGBE_LINK_SPEED_10_FULL;
}
 
err = ixgbe_setup_link(hw, speed, link_up);
@@ -4061,8 +4067,12 @@ static int ixgbevf_dev_xstats_get_names(__rte_unused 
struct rte_eth_dev *dev,
switch (link_speed) {
default:
case IXGBE_LINK_SPEED_UNKNOWN:
+   if (hw->device_id == IXGBE_DEV_ID_X550EM_A_1G_T ||
+   hw->device_id == IXGBE_DEV_ID_X550EM_A_1G_T_L)
+   link.link_speed = ETH_SPEED_NUM_10M;
+   else
+   link.link_speed = ETH_SPEED_NUM_100M;
link.link_duplex = ETH_LINK_FULL_DUPLEX;
-   link.link_speed = ETH_SPEED_NUM_100M;
break;
 
case IXGBE_LINK_SPEED_100_FULL:
-- 
1.8.3.1