Re: [dpdk-dev] Retire x86 32 bit?

2018-04-17 Thread Roger B Melton

On 4/17/18 4:46 PM, Stephen Hemminger wrote:

On Tue, 17 Apr 2018 13:01:14 -0700
Jim Murphy  wrote:


Still used in certain memory constrained environments.


On Tue, Apr 17, 2018 at 11:39 AM, David Harton (dharton) 
wrote:


It is used and tested in production and non-production environments.

Regards,
Dave
  

-Original Message-
From: dev  On Behalf Of Stephen Hemminger
Sent: Tuesday, April 17, 2018 2:31 PM
To: dev@dpdk.org
Subject: [dpdk-dev] Retire x86 32 bit?

I wonder if x86 32 bit is still useful?
Many distributions no longer support it, and not sure if it is tested
througly by anyone.

Maybe time to deprecate it (gradually)?
  

Pure 32 bit, or x86-64 instructions and registers used in 32 bit mode (which 
can be faster).
.



Pure 32bit in our case.  We do not use x32 ABI.

-Roger




Re: [dpdk-dev] [PATCH v2] net/e1000: correct VLAN tag byte order for i35x LB packets

2017-10-25 Thread Roger B. Melton

On 10/25/17 4:22 PM, Ferruh Yigit wrote:

On 10/25/2017 1:16 PM, Bruce Richardson wrote:

On Wed, Oct 25, 2017 at 11:11:08AM -0700, Ferruh Yigit wrote:

On 10/23/2017 10:42 AM, Roger B. Melton wrote:

On 10/20/17 3:04 PM, Ferruh Yigit wrote:

On 10/12/2017 10:24 AM, Roger B Melton wrote:

When copying VLAN tags from the RX descriptor to the vlan_tci field
in the mbuf header,  igb_rxtx.c:eth_igb_recv_pkts() and
eth_igb_recv_scattered_pkts() both assume that the VLAN tag is always
little endian.  While i350, i354 and /i350vf VLAN non-loopback
packets are stored little endian, VLAN tags in loopback packets for
those devices are big endian.

For i350, i354 and i350vf VLAN loopback packets, swap the tag when
copying from the RX descriptor to the mbuf header.  This will ensure
that the mbuf vlan_tci is always little endian.

Signed-off-by: Roger B Melton 

<...>


@@ -946,9 +954,16 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
   
   		rxm->hash.rss = rxd.wb.lower.hi_dword.rss;

hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
-   /* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
-   rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
-
+   /*
+* The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
+* set in the pkt_flags field and must be in CPU byte order.
+*/
+   if ((staterr & rte_cpu_to_le_32(E1000_RXDEXT_STATERR_LB)) &&
+   (rxq->flags & IGB_RXQ_FLAG_LB_BSWAP_VLAN)) {

This is adding more condition checks into Rx path.
What is the performance cost of this addition?

I have not measured the performance cost, but I can collect data. What
specifically are you looking for?

To be clear the current implementation incorrect as it does not
normalize the vlan tag to CPU byte order before copying it into mbuf and
applications have no visibility to determine if the tag in the mbuf is
big or little endian.

Do you have any suggestions for an alternative approach to avoid rx
patch checks?

No suggestion indeed. And correctness matters.

But this add a cost and I wonder how much it is, based on that result it may be
possible to do more investigation for alternate solutions or trade-offs.

Konstantin, Bruce, Wenzhuo,

What do you think, do you have any comment?


For a 1G driver, is performance really that big an issue?

I don't know. So is this an Ack from you for the patch?


I can tell you that from the perspective of my application the 
performance impact for 1G is not a concern.


FWIW, I did go through a few iterations with Wenzhou to minimize the 
performance impact before we settled on this implementation, and Wenzhou 
did Ack it btw.


I'm hoping we can get this into 17.11.

Thanks,
-Roger




Unless you
have a *lot* of 1G ports, I would expect most platforms not to notice an
extra couple of cycles when dealing with 1G line rates.

/Bruce


.





Re: [dpdk-dev] [PATCH v2] net/e1000: correct VLAN tag byte order for i35x LB packets

2017-10-23 Thread Roger B. Melton

On 10/20/17 3:04 PM, Ferruh Yigit wrote:

On 10/12/2017 10:24 AM, Roger B Melton wrote:

When copying VLAN tags from the RX descriptor to the vlan_tci field
in the mbuf header,  igb_rxtx.c:eth_igb_recv_pkts() and
eth_igb_recv_scattered_pkts() both assume that the VLAN tag is always
little endian.  While i350, i354 and /i350vf VLAN non-loopback
packets are stored little endian, VLAN tags in loopback packets for
those devices are big endian.

For i350, i354 and i350vf VLAN loopback packets, swap the tag when
copying from the RX descriptor to the mbuf header.  This will ensure
that the mbuf vlan_tci is always little endian.

Signed-off-by: Roger B Melton 

<...>


@@ -946,9 +954,16 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
  
  		rxm->hash.rss = rxd.wb.lower.hi_dword.rss;

hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
-   /* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
-   rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
-
+   /*
+* The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
+* set in the pkt_flags field and must be in CPU byte order.
+*/
+   if ((staterr & rte_cpu_to_le_32(E1000_RXDEXT_STATERR_LB)) &&
+   (rxq->flags & IGB_RXQ_FLAG_LB_BSWAP_VLAN)) {

This is adding more condition checks into Rx path.
What is the performance cost of this addition?


I have not measured the performance cost, but I can collect data. What 
specifically are you looking for?


To be clear the current implementation incorrect as it does not 
normalize the vlan tag to CPU byte order before copying it into mbuf and 
applications have no visibility to determine if the tag in the mbuf is 
big or little endian.


Do you have any suggestions for an alternative approach to avoid rx 
patch checks?


Thanks,
Roger





+   rxm->vlan_tci = rte_be_to_cpu_16(rxd.wb.upper.vlan);
+   } else {
+   rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
+   }
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(rxq, 
hlen_type_rss);
pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);

<...>
.






[dpdk-dev] [PATCH v2] net/e1000: correct VLAN tag byte order for i35x LB packets

2017-10-12 Thread Roger B Melton
When copying VLAN tags from the RX descriptor to the vlan_tci field
in the mbuf header,  igb_rxtx.c:eth_igb_recv_pkts() and
eth_igb_recv_scattered_pkts() both assume that the VLAN tag is always
little endian.  While i350, i354 and /i350vf VLAN non-loopback
packets are stored little endian, VLAN tags in loopback packets for
those devices are big endian.

For i350, i354 and i350vf VLAN loopback packets, swap the tag when
copying from the RX descriptor to the mbuf header.  This will ensure
that the mbuf vlan_tci is always little endian.

Signed-off-by: Roger B Melton 
---
v2:
* PF: Limit swap to i350 and i354
* VF: Limit swap to i350VF

 drivers/net/e1000/igb_rxtx.c | 55 
 1 file changed, 50 insertions(+), 5 deletions(-)

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 1c80a2a..3eb547b 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -105,6 +105,13 @@ struct igb_tx_entry {
 };
 
 /**
+ * rx queue flags
+ */
+enum igb_rxq_flags {
+   IGB_RXQ_FLAG_LB_BSWAP_VLAN = 0x01,
+};
+
+/**
  * Structure associated with each RX queue.
  */
 struct igb_rx_queue {
@@ -128,6 +135,7 @@ struct igb_rx_queue {
uint8_t wthresh;/**< Write-back threshold register. */
uint8_t crc_len;/**< 0 if CRC stripped, 4 otherwise. */
uint8_t drop_en;  /**< If not 0, set SRRCTL.Drop_En. */
+   uint32_tflags;  /**< RX flags. */
 };
 
 /**
@@ -946,9 +954,16 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
 
rxm->hash.rss = rxd.wb.lower.hi_dword.rss;
hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
-   /* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
-   rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
-
+   /*
+* The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
+* set in the pkt_flags field and must be in CPU byte order.
+*/
+   if ((staterr & rte_cpu_to_le_32(E1000_RXDEXT_STATERR_LB)) &&
+   (rxq->flags & IGB_RXQ_FLAG_LB_BSWAP_VLAN)) {
+   rxm->vlan_tci = rte_be_to_cpu_16(rxd.wb.upper.vlan);
+   } else {
+   rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
+   }
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(rxq, 
hlen_type_rss);
pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
@@ -1181,9 +1196,16 @@ eth_igb_recv_scattered_pkts(void *rx_queue, struct 
rte_mbuf **rx_pkts,
 
/*
 * The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
-* set in the pkt_flags field.
+* set in the pkt_flags field and must be in CPU byte order.
 */
-   first_seg->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
+   if ((staterr & rte_cpu_to_le_32(E1000_RXDEXT_STATERR_LB)) &&
+   (rxq->flags & IGB_RXQ_FLAG_LB_BSWAP_VLAN)) {
+   first_seg->vlan_tci =
+   rte_be_to_cpu_16(rxd.wb.upper.vlan);
+   } else {
+   first_seg->vlan_tci =
+   rte_le_to_cpu_16(rxd.wb.upper.vlan);
+   }
hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(rxq, 
hlen_type_rss);
pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
@@ -2278,6 +2300,18 @@ eth_igb_rx_init(struct rte_eth_dev *dev)
 
rxq = dev->data->rx_queues[i];
 
+   rxq->flags = 0;
+   /*
+* i350 and i354 vlan packets have vlan tags byte swapped.
+*/
+   if ((hw->mac.type == e1000_i350) ||
+   (hw->mac.type == e1000_i354)) {
+   rxq->flags |= IGB_RXQ_FLAG_LB_BSWAP_VLAN;
+   PMD_INIT_LOG(DEBUG, "IGB rx vlan bswap required");
+   } else {
+   PMD_INIT_LOG(DEBUG, "IGB rx vlan bswap not required");
+   }
+
/* Allocate buffers for descriptor rings and set up queue */
ret = igb_alloc_rx_queue_mbufs(rxq);
if (ret)
@@ -2557,6 +2591,17 @@ eth_igbvf_rx_init(struct rte_eth_dev *dev)
 
rxq = dev->data->rx_queues[i];
 
+   rxq->flags = 0;
+   /*
+* i350VF LB vlan packets have vlan tags byte swapped.
+*/
+   if (hw->mac.type == e1000_vfadapt_i350) {
+   rxq->flags |= I

Re: [dpdk-dev] [dpdk-stable] [PATCH] net/i40e: fix unexpected mbuf free in vPMD

2017-10-12 Thread Roger B. Melton

On 10/10/17 7:47 AM, Roger B. Melton wrote:

On 10/10/17 7:39 AM, Bruce Richardson wrote:

On Tue, Oct 10, 2017 at 07:05:33AM -0400, Roger B. Melton wrote:

Hi Bruce,

I can.  It will take a day or 2 to get the results.

Regards,
Roger


Thanks. Keep us posted. We want to ensure we have the best fix possible
for the issue in this release.


Thanks for finding a better solution ;-).  I'll let you know the 
results when I have them, just wanted you to know it may take a couple 
of days to turn around.


-Roger


Confirmed that this patch does address the issue.  Thanks again Bruce/Qi 
for finding a better alternative.


Regards, Roger




/Bruce


On 10/10/17 4:48 AM, Bruce Richardson wrote:

+Roger Melton

On Tue, Oct 10, 2017 at 09:22:05AM -0400, Qi Zhang wrote:

vPMD tx does not set sw_ring's mbuf to NULL after free it.
So to prevent same mbuf be free again, we need more carefully
check in i40e_tx_queue_release_mbufs.

Fixes: b4669bb95038 ("i40e: add vector Tx")
Cc: sta...@dpdk.org

Signed-off-by: Qi Zhang 
---
v2:
- fix at i40e_tx_queue_release_mbufs so no performance impact.

This patch also supercedes this one, 
http://dpdk.org/dev/patchwork/patch/29814/

right?
Roger, can you please confirm that this alternative fix works in your
testing and that your patch is no longer necessary too.

Thanks,
/Bruce

.


.



.





Re: [dpdk-dev] [PATCH] net/e1000: i350 VLAN tags in loopback packets are big endian

2017-10-11 Thread Roger B. Melton

Hi Wenzhou,

On 10/10/17 9:32 PM, Lu, Wenzhuo wrote:

Hi Roger,


-Original Message-
From: Roger B Melton [mailto:rmel...@cisco.com]
Sent: Saturday, October 7, 2017 7:34 AM
To: Lu, Wenzhuo 
Cc: dev@dpdk.org; Roger B Melton 
Subject: [PATCH] net/e1000: i350 VLAN tags in loopback packets are big
endian

When copying VLAN tags from the RX descriptor to the vlan_tci field in the
mbuf header,  igb_rxtx.c:eth_igb_recv_pkts() and
eth_igb_recv_scattered_pkts() both assume that the VLAN tag is always little
endian.  While i350 VLAN non-loopback packets are stored little endian,
VLAN tags for i350 VLAN loopback packets are big endian.

For i350 VLAN loopback packets, swap the tag when copying from the RX
descriptor to the mbuf header.  This will ensure that the mbuf vlan_tci is
always little endian.

Signed-off-by: Roger B Melton 
---
  drivers/net/e1000/igb_rxtx.c | 56

  1 file changed, 51 insertions(+), 5 deletions(-)

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c index
1c80a2a..8432d8b 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c



@@ -2278,6 +2300,18 @@ eth_igb_rx_init(struct rte_eth_dev *dev)

rxq = dev->data->rx_queues[i];

+   rxq->flags = 0;
+   /*
+* The i210, i211, i350, i354 and i350VF LB vlan packets
+* have vlan tags byte swapped.
+*/
+   if (hw->mac.type >= e1000_i350) {

Many thanks for the patch.
I think we need to check i350 and i354 here as LB bit is only meaningful on 
these NICs.


Agreed, I'll change this to:

   if ((hw->mac.type == e1000_i350) || (hw->mac.type == e1000_i354)) {...






+   rxq->flags |= IGB_RXQ_FLAG_LB_BSWAP_VLAN;
+   PMD_INIT_LOG(DEBUG, "IGB rx vlan bswap
required");
+   } else {
+   PMD_INIT_LOG(DEBUG, "IGB rx vlan bswap not
required");
+   }
+
/* Allocate buffers for descriptor rings and set up queue */
ret = igb_alloc_rx_queue_mbufs(rxq);
if (ret)
@@ -2557,6 +2591,18 @@ eth_igbvf_rx_init(struct rte_eth_dev *dev)

rxq = dev->data->rx_queues[i];

+   rxq->flags = 0;
+   /*
+* The i210, i211, i350, i354 and i350VF LB vlan packets
+* have vlan tags byte swapped.
+*/
+   if (hw->mac.type >= e1000_i350) {

We need to check e1000_vfadapt_i350 here to exclude e1000_vfadapt.


Agreed, I'll change this to:

   if (hw->mac.type == e1000_vfadapt_i350) {...


I'll make these changes and submit an updated patch.




+   rxq->flags |= IGB_RXQ_FLAG_LB_BSWAP_VLAN;
+   PMD_INIT_LOG(DEBUG, "IGB rx vlan bswap
required");
+   } else {
+   PMD_INIT_LOG(DEBUG, "IGB rx vlan bswap not
required");
+   }
+
/* Allocate buffers for descriptor rings and set up queue */
ret = igb_alloc_rx_queue_mbufs(rxq);
if (ret)
--
2.10.3.dirty

.





Re: [dpdk-dev] [PATCH] net/i40e: fix unexpected mbuf free in vPMD

2017-10-10 Thread Roger B. Melton

On 10/10/17 7:39 AM, Bruce Richardson wrote:

On Tue, Oct 10, 2017 at 07:05:33AM -0400, Roger B. Melton wrote:

Hi Bruce,

I can.  It will take a day or 2 to get the results.

Regards,
Roger


Thanks. Keep us posted. We want to ensure we have the best fix possible
for the issue in this release.


Thanks for finding a better solution ;-).  I'll let you know the results 
when I have them, just wanted you to know it may take a couple of days 
to turn around.


-Roger



/Bruce


On 10/10/17 4:48 AM, Bruce Richardson wrote:

+Roger Melton

On Tue, Oct 10, 2017 at 09:22:05AM -0400, Qi Zhang wrote:

vPMD tx does not set sw_ring's mbuf to NULL after free it.
So to prevent same mbuf be free again, we need more carefully
check in i40e_tx_queue_release_mbufs.

Fixes: b4669bb95038 ("i40e: add vector Tx")
Cc: sta...@dpdk.org

Signed-off-by: Qi Zhang 
---
v2:
- fix at i40e_tx_queue_release_mbufs so no performance impact.


This patch also supercedes this one, http://dpdk.org/dev/patchwork/patch/29814/
right?
Roger, can you please confirm that this alternative fix works in your
testing and that your patch is no longer necessary too.

Thanks,
/Bruce

.


.





Re: [dpdk-dev] [PATCH] net/i40e: fix unexpected mbuf free in vPMD

2017-10-10 Thread Roger B. Melton

Hi Bruce,

I can.  It will take a day or 2 to get the results.

Regards,
Roger


On 10/10/17 4:48 AM, Bruce Richardson wrote:

+Roger Melton

On Tue, Oct 10, 2017 at 09:22:05AM -0400, Qi Zhang wrote:

vPMD tx does not set sw_ring's mbuf to NULL after free it.
So to prevent same mbuf be free again, we need more carefully
check in i40e_tx_queue_release_mbufs.

Fixes: b4669bb95038 ("i40e: add vector Tx")
Cc: sta...@dpdk.org

Signed-off-by: Qi Zhang 
---
v2:
- fix at i40e_tx_queue_release_mbufs so no performance impact.


This patch also supercedes this one, http://dpdk.org/dev/patchwork/patch/29814/
right?
Roger, can you please confirm that this alternative fix works in your
testing and that your patch is no longer necessary too.

Thanks,
/Bruce

.





[dpdk-dev] [PATCH] net/e1000: i350 VLAN tags in loopback packets are big endian

2017-10-06 Thread Roger B Melton
When copying VLAN tags from the RX descriptor to the vlan_tci field in the
mbuf header,  igb_rxtx.c:eth_igb_recv_pkts() and
eth_igb_recv_scattered_pkts() both assume that the VLAN tag is always
little endian.  While i350 VLAN non-loopback packets are stored
little endian, VLAN tags for i350 VLAN loopback packets are big endian.

For i350 VLAN loopback packets, swap the tag when copying from the
RX descriptor to the mbuf header.  This will ensure that the mbuf
vlan_tci is always little endian.

Signed-off-by: Roger B Melton 
---
 drivers/net/e1000/igb_rxtx.c | 56 
 1 file changed, 51 insertions(+), 5 deletions(-)

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 1c80a2a..8432d8b 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -105,6 +105,13 @@ struct igb_tx_entry {
 };
 
 /**
+ * rx queue flags
+ */
+enum igb_rxq_flags {
+   IGB_RXQ_FLAG_LB_BSWAP_VLAN = 0x01,
+};
+
+/**
  * Structure associated with each RX queue.
  */
 struct igb_rx_queue {
@@ -128,6 +135,7 @@ struct igb_rx_queue {
uint8_t wthresh;/**< Write-back threshold register. */
uint8_t crc_len;/**< 0 if CRC stripped, 4 otherwise. */
uint8_t drop_en;  /**< If not 0, set SRRCTL.Drop_En. */
+   uint32_tflags;  /**< RX flags. */
 };
 
 /**
@@ -946,9 +954,16 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
 
rxm->hash.rss = rxd.wb.lower.hi_dword.rss;
hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
-   /* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
-   rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
-
+   /*
+* The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
+* set in the pkt_flags field and must be in CPU byte order.
+*/
+   if ((staterr & rte_cpu_to_le_32(E1000_RXDEXT_STATERR_LB)) &&
+   (rxq->flags & IGB_RXQ_FLAG_LB_BSWAP_VLAN)) {
+   rxm->vlan_tci = rte_be_to_cpu_16(rxd.wb.upper.vlan);
+   } else {
+   rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
+   }
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(rxq, 
hlen_type_rss);
pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
@@ -1181,9 +1196,16 @@ eth_igb_recv_scattered_pkts(void *rx_queue, struct 
rte_mbuf **rx_pkts,
 
/*
 * The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
-* set in the pkt_flags field.
+* set in the pkt_flags field and must be in CPU byte order.
 */
-   first_seg->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
+   if ((staterr & rte_cpu_to_le_32(E1000_RXDEXT_STATERR_LB)) &&
+   (rxq->flags & IGB_RXQ_FLAG_LB_BSWAP_VLAN)) {
+   first_seg->vlan_tci =
+   rte_be_to_cpu_16(rxd.wb.upper.vlan);
+   } else {
+   first_seg->vlan_tci =
+   rte_le_to_cpu_16(rxd.wb.upper.vlan);
+   }
hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(rxq, 
hlen_type_rss);
pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
@@ -2278,6 +2300,18 @@ eth_igb_rx_init(struct rte_eth_dev *dev)
 
rxq = dev->data->rx_queues[i];
 
+   rxq->flags = 0;
+   /*
+* The i210, i211, i350, i354 and i350VF LB vlan packets
+* have vlan tags byte swapped.
+*/
+   if (hw->mac.type >= e1000_i350) {
+   rxq->flags |= IGB_RXQ_FLAG_LB_BSWAP_VLAN;
+   PMD_INIT_LOG(DEBUG, "IGB rx vlan bswap required");
+   } else {
+   PMD_INIT_LOG(DEBUG, "IGB rx vlan bswap not required");
+   }
+
/* Allocate buffers for descriptor rings and set up queue */
ret = igb_alloc_rx_queue_mbufs(rxq);
if (ret)
@@ -2557,6 +2591,18 @@ eth_igbvf_rx_init(struct rte_eth_dev *dev)
 
rxq = dev->data->rx_queues[i];
 
+   rxq->flags = 0;
+   /*
+* The i210, i211, i350, i354 and i350VF LB vlan packets
+* have vlan tags byte swapped.
+*/
+   if (hw->mac.type >= e1000_i350) {
+   rxq->flags |= IGB_RXQ_FLAG_LB_BSWAP_VLAN;
+   PMD_INIT_LOG(DEBUG, &quo

[dpdk-dev] [PATCH v2] net/i40e: improve i40evf buffer cleanup in Tx vector mode

2017-10-06 Thread Roger B Melton
Signed-off-by: Roger B Melton 
---
v2 - Same content as v1, but properly signed-off.

i40evf tx vector logic frees mbufs, but it does not remove the
mbufs from software rings which leads to double frees.  This change
corrects that oversight.  I've validated this fix within our application.

 drivers/net/i40e/i40e_rxtx_vec_common.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h 
b/drivers/net/i40e/i40e_rxtx_vec_common.h
index 39a6da0..fdc6fce 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -127,6 +127,7 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
if (likely(m != NULL)) {
free[0] = m;
nb_free = 1;
+   txep[0].mbuf = NULL;
for (i = 1; i < n; i++) {
m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
if (likely(m != NULL)) {
@@ -139,14 +140,17 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
free[0] = m;
nb_free = 1;
}
+   txep[i].mbuf = NULL;
}
}
rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
} else {
+   txep[0].mbuf = NULL;
for (i = 1; i < n; i++) {
m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
if (m != NULL)
rte_mempool_put(m->pool, m);
+   txep[i].mbuf = NULL;
}
}
 
-- 
2.10.3.dirty



Re: [dpdk-dev] [PATCH] net/i40e: Improve i40evf buffer cleanup in tx vector mode

2017-10-06 Thread Roger B. Melton

On 10/6/17 9:51 AM, Bruce Richardson wrote:

On Fri, Oct 06, 2017 at 08:38:01AM -0400, Roger B. Melton wrote:

On 10/6/17 4:54 AM, Bruce Richardson wrote:

On Thu, Oct 05, 2017 at 03:11:11PM -0400, Roger B Melton wrote:

---

i40evf tx vector logic frees mbufs, but it does not remove the
mbufs from software rings which leads to double frees.  This change
   corrects that oversight.  We've validated this fix within our application.


Hi Roger,

I'm a little concerned here by this driver fix, since if we are getting
double frees of mbufs there must be another bug somewhere in tracking
the free ring elements. Clearing the entries to NULL introduces extra
writes to the vector code path which will likely have a performance
impact, but also should be unnecessary, given proper tracking of the
ring status.

Can you provide us with some details as to how to reproduce the issue,
especially with a public sample app? I'd really like to look more into
why this is happening, and if other fixes may work.

Thanks,
/Bruce

.


Hey Bruce,

I've not attempted to reproduce the issue using sample apps.  It was
initially difficult for us to reproduce with our application until we
stumbled on a recipe.  The symptoms of the double free are two fold:

  * Application crashes: Corrupted packet headers, walking a chain of rx
segments and loosing one in the middle,...
  * i40e adminq lockup - Meaning VF sends an OP and PF does not process it

The former has been directly correlated to tx vector mode.  We still don't
understand how this could lead to adminq lockup, but we have not observed
the issue applied the patch I submitted.  We have questions out to Intel
i40e team on this.

Here's a high level view of the scenario under which the issues are observed
and how we concluded that there were issues with tx vector free. We provided
this information to the Intel i40e DPDK team, they reviewed the tx vector
logic and suggested changes.  With the changes suggested by Intel (the patch
I submitted) we have re-enabled tx vector when MTU is < MBUF size and
observed no crashes.

Below that you will find additional detail on the procedure within our
application for changing MTU, including DPDK API calls.

Let me know if you have additional questions.

Regards,
Roger

  o Scenario:

  + Intel i40e-da4 (4x10G) or Cisco branded equivalent
  + KVM or ESXi 6.5 host
  + In a single VM we have at least 1 VF from all 4 PFs.
  + We are running traffic on all interfaces.
  + We stop traffic
  + We stop a device (VF)
  + We start the device
  + We start traffic
  + At any point after we start the device we can crash

  o Experiment (Jumping over some of the work to get to the point
where we believed that i40e driver was doing double frees):

  + Our application attaches userdata to mbufs and we put that
userdata on a linked list.
  + We observed that when we processed the userdata it had been
corrupted which lead to crashes.
  + This gave us a hint that the mbuf was on multiple lists.
  + We reviewed our application code and could not find a cause.
  + We began to suspect a double free in the i40evf PMD.

  # We disabled rx free logic and observed crashes
(intentionally leaking mbufs in search of the double free).
  # We disabled tx free logic and observed no crashes
  # This gave us a hint that the double frees were coming
from the i40evf PMD tx logic.

  + We had also observed that if we forced MTU to large always
that there were no crashes

  # A side effect of forcing large MTU is that multi-segment
is enabled.
  # This gave us a hint that enabling multi-segment was
somehow avoiding the double free.

  + We forced multi-segments regardless of MTU and permitted MTU
changes and observed no crashes.
  + We reviewed the i40evf mbuf free logic to see the effect of
enabling multi-segment and observed that when multi-segment
is enabled, rx vector was enabled but tx vector was not.
  + This lead us to examine RX vector mode free logic vs TX
vector mode free logic.

  # RX free logic has special handling for vector mode free
  # TX free logic does not have any special handling for
vector free

  o By enabling multiple segments always (even if MTU does not
require multiple segments), we effectively disabled tx vector
mode and this avoided the double free.
  o Our application no longer crashed, but it could not take
advantage of tx vector optimizations.





CP == Control Plane
DP == Data Plane

  * CP sends admin down to DP
  * DP disables RX/TX
  o Block all future tx burst calls

Re: [dpdk-dev] [PATCH] net/i40e: Improve i40evf buffer cleanup in tx vector mode

2017-10-06 Thread Roger B. Melton

On 10/6/17 4:54 AM, Bruce Richardson wrote:

On Thu, Oct 05, 2017 at 03:11:11PM -0400, Roger B Melton wrote:

---

i40evf tx vector logic frees mbufs, but it does not remove the
mbufs from software rings which leads to double frees.  This change
  corrects that oversight.  We've validated this fix within our application.


Hi Roger,

I'm a little concerned here by this driver fix, since if we are getting
double frees of mbufs there must be another bug somewhere in tracking
the free ring elements. Clearing the entries to NULL introduces extra
writes to the vector code path which will likely have a performance
impact, but also should be unnecessary, given proper tracking of the
ring status.

Can you provide us with some details as to how to reproduce the issue,
especially with a public sample app? I'd really like to look more into
why this is happening, and if other fixes may work.

Thanks,
/Bruce

.


Hey Bruce,

I've not attempted to reproduce the issue using sample apps.  It was 
initially difficult for us to reproduce with our application until we 
stumbled on a recipe.  The symptoms of the double free are two fold:


 * Application crashes: Corrupted packet headers, walking a chain of rx
   segments and loosing one in the middle,...
 * i40e adminq lockup - Meaning VF sends an OP and PF does not process it

The former has been directly correlated to tx vector mode.  We still 
don't understand how this could lead to adminq lockup, but we have not 
observed the issue applied the patch I submitted.  We have questions out 
to Intel i40e team on this.


Here's a high level view of the scenario under which the issues are 
observed and how we concluded that there were issues with tx vector 
free. We provided this information to the Intel i40e DPDK team, they 
reviewed the tx vector logic and suggested changes.  With the changes 
suggested by Intel (the patch I submitted) we have re-enabled tx vector 
when MTU is < MBUF size and observed no crashes.


Below that you will find additional detail on the procedure within our 
application for changing MTU, including DPDK API calls.


Let me know if you have additional questions.

Regards,
Roger

 o Scenario:

 + Intel i40e-da4 (4x10G) or Cisco branded equivalent
 + KVM or ESXi 6.5 host
 + In a single VM we have at least 1 VF from all 4 PFs.
 + We are running traffic on all interfaces.
 + We stop traffic
 + We stop a device (VF)
 + We start the device
 + We start traffic
 + At any point after we start the device we can crash

 o Experiment (Jumping over some of the work to get to the point
   where we believed that i40e driver was doing double frees):

 + Our application attaches userdata to mbufs and we put that
   userdata on a linked list.
 + We observed that when we processed the userdata it had been
   corrupted which lead to crashes.
 + This gave us a hint that the mbuf was on multiple lists.
 + We reviewed our application code and could not find a cause.
 + We began to suspect a double free in the i40evf PMD.

 # We disabled rx free logic and observed crashes
   (intentionally leaking mbufs in search of the double free).
 # We disabled tx free logic and observed no crashes
 # This gave us a hint that the double frees were coming
   from the i40evf PMD tx logic.

 + We had also observed that if we forced MTU to large always
   that there were no crashes

 # A side effect of forcing large MTU is that multi-segment
   is enabled.
 # This gave us a hint that enabling multi-segment was
   somehow avoiding the double free.

 + We forced multi-segments regardless of MTU and permitted MTU
   changes and observed no crashes.
 + We reviewed the i40evf mbuf free logic to see the effect of
   enabling multi-segment and observed that when multi-segment
   is enabled, rx vector was enabled but tx vector was not.
 + This lead us to examine RX vector mode free logic vs TX
   vector mode free logic.

 # RX free logic has special handling for vector mode free
 # TX free logic does not have any special handling for
   vector free

 o By enabling multiple segments always (even if MTU does not
   require multiple segments), we effectively disabled tx vector
   mode and this avoided the double free.
 o Our application no longer crashed, but it could not take
   advantage of tx vector optimizations.





CP == Control Plane
DP == Data Plane

 * CP sends admin down to DP
 * DP disables RX/TX
 o Block all future tx burst calls
 o rte_eth_dev_set_link_down() invoked
 o Block all future rx burst calls
 * DP notifies CP admin down action is complete
 * CP sends MTU change

Re: [dpdk-dev] [PATCH] net/i40e: Improve i40evf buffer cleanup in tx vector mode

2017-10-05 Thread Roger B. Melton

Hi Everyone,

As soon as I submitted the patch, I realized I neglected to signoff.  
What's the recommended procedure, resubmit the original signed off, or 
bump to v1?


Thanks,
Roger


On 10/5/17 3:11 PM, Roger B Melton wrote:

---

i40evf tx vector logic frees mbufs, but it does not remove the
mbufs from software rings which leads to double frees.  This change
  corrects that oversight.  We've validated this fix within our application.

  drivers/net/i40e/i40e_rxtx_vec_common.h | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h 
b/drivers/net/i40e/i40e_rxtx_vec_common.h
index 39a6da0..fdc6fce 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -127,6 +127,7 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
if (likely(m != NULL)) {
free[0] = m;
nb_free = 1;
+   txep[0].mbuf = NULL;
for (i = 1; i < n; i++) {
m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
if (likely(m != NULL)) {
@@ -139,14 +140,17 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
free[0] = m;
nb_free = 1;
}
+   txep[i].mbuf = NULL;
}
}
rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
} else {
+   txep[0].mbuf = NULL;
for (i = 1; i < n; i++) {
m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
if (m != NULL)
rte_mempool_put(m->pool, m);
+   txep[i].mbuf = NULL;
}
}
  





[dpdk-dev] [PATCH] net/i40e: Improve i40evf buffer cleanup in tx vector mode

2017-10-05 Thread Roger B Melton
---

i40evf tx vector logic frees mbufs, but it does not remove the
mbufs from software rings which leads to double frees.  This change
 corrects that oversight.  We've validated this fix within our application.

 drivers/net/i40e/i40e_rxtx_vec_common.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h 
b/drivers/net/i40e/i40e_rxtx_vec_common.h
index 39a6da0..fdc6fce 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -127,6 +127,7 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
if (likely(m != NULL)) {
free[0] = m;
nb_free = 1;
+   txep[0].mbuf = NULL;
for (i = 1; i < n; i++) {
m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
if (likely(m != NULL)) {
@@ -139,14 +140,17 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
free[0] = m;
nb_free = 1;
}
+   txep[i].mbuf = NULL;
}
}
rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
} else {
+   txep[0].mbuf = NULL;
for (i = 1; i < n; i++) {
m = rte_pktmbuf_prefree_seg(txep[i].mbuf);
if (m != NULL)
rte_mempool_put(m->pool, m);
+   txep[i].mbuf = NULL;
}
}
 
-- 
2.10.3.dirty



Re: [dpdk-dev] Byte order of vlan_tci of rte_mbuf is different on different source

2017-08-24 Thread Roger B Melton

Hi folks,

Resurrecting this old thread.  I ran across this issue recently in DPDK 
17.05, and it's also present in17.08.  It appears no fix was committed.  
I'm working on a solution, but if anyone has a fix in flight please let 
me know.


Regards,
Roger


On 4/29/16 3:29 AM, Olivier Matz wrote:

Hi,

On 04/25/2016 04:35 AM, zhang.xingh...@zte.com.cn wrote:

When using I350 working on SR-IOV mode, we got confused that byte order
of vlan_tci in the VF received packet descriptor is different when the
packet source is different.

1) Packets from VF to VF, the byte order is big-endian. (e.g. 0xF00)
2) Packets from PC to VF, the byte order is little-endian. (e.g. 0xF)

Below is the testing net-work:
 VM0VM1 PC
 VF0VF1  |
   | |   |
   +--+--+   |
  |  |
  PF |
  hypervisor |
  SR-IOV NIC |
  |  |
  |VLAN 15   |
  +-switch---+


We make a breakpoint at the following line of eth_igb_recv_pkts, the
vlan_tci
we observed that everytime.

uint16_t
eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts)

 /* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
 rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);

In rte_mbuf.h, it is specified that these values (vlan_tci and
vlan_tci_outer) must be stored in CPU order.

It's probably a driver or hardware issue. Note that in linux there is
something that looks similar to your issue:

http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/igb/igb_main.c#L1278

   /* On i350, i354, i210, and i211, loopback VLAN packets
* have the tag byte-swapped.
*/
   if (adapter->hw.mac.type >= e1000_i350)
   set_bit(IGB_RING_FLAG_RX_LB_VLAN_BSWAP, &ring->flags);

I think you could check if the same thing is done in the
dpdk driver.




ZTE Information Security Notice: The information contained in this mail (and 
any attachment transmitted herewith) is privileged and confidential and is 
intended for the exclusive use of the addressee(s).  If you are not an intended 
recipient, any disclosure, reproduction, distribution or other dissemination or 
use of the information contained is strictly prohibited.  If you have received 
this mail in error, please delete it and notify us immediately.


This notice should be removed in public emails.

Regards,
Olivier
.



Re: [dpdk-dev] 2nd try: problem with ixgbe_dev_link_update() for multi-speed fiber [was] Re: [PATCH v4] net/ixgbe: ensure link status is updated

2017-06-02 Thread Roger B. Melton

Hi Wei,

Have you had a chance to look at the debug output yet?  Is there any 
additional data you need?


Regards,
Roger


On 5/24/17 2:38 PM, Roger B. Melton wrote:

Hi Wei,

I am using ixgbe:

00:02.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit 
SFI/SFP+ Network Connection (rev 01)


The debug output you requested is below showing the results for both 
far end link up and far end link down.  In the case of far end link 
up, the elapse time is too small for the granularity of my logging 
system.  In the case when the far end link is down, the elapsed time 
for rte_eth_link_get_nowait() is ~1.4 seconds.  You can really see 
those 40 and 100msec delays add up when the link is down.


Let me know if you require more information.

Regards,
Roger


Far end link up:

2017/05/24 18:15:14.353 [ulord-dp]: [3706]:  UUID: 0, ra: 0 (debug): 
rmeltonDBG Start rte_eth_link_get_nowait()


2017/05/24 18:15:14.353 [ulord-dp]: [3706]:  UUID: 0, ra: 0 (debug): 
PMD: ixgbe_check_mac_link_generic(): ixgbe_check_mac_link_generic


2017/05/24 18:15:14.353 [ulord-dp]: [3706]:  UUID: 0, ra: 0 (debug): 
rmeltonDBG :End rte_eth_link_get_nowait()



Far end link down:


2017/05/24 18:15:38.502 (debug): rmeltonDBG: Start 
rte_eth_link_get_nowait()


2017/05/24 18:15:38.502 (debug): PMD: ixgbe_get_media_type_82599(): 
ixgbe_get_media_type_82599
2017/05/24 18:15:38.502 (debug): PMD: 
ixgbe_setup_mac_link_multispeed_fiber(): 
ixgbe_setup_mac_link_multispeed_fiber
2017/05/24 18:15:38.502 (debug): PMD: 
ixgbe_get_link_capabilities_82599(): ixgbe_get_link_capabilities_82599
2017/05/24 18:15:38.542 (debug): PMD: ixgbe_setup_mac_link_82599(): 
ixgbe_setup_mac_link_82599
2017/05/24 18:15:38.542 (debug): PMD: 
ixgbe_get_link_capabilities_82599(): ixgbe_get_link_capabilities_82599
2017/05/24 18:15:38.542 (debug): PMD: 
ixgbe_flap_tx_laser_multispeed_fiber(): 
ixgbe_flap_tx_laser_multispeed_fiber
2017/05/24 18:15:38.542 (debug): PMD: ixgbe_check_reset_blocked(): 
ixgbe_check_reset_blocked
2017/05/24 18:15:38.643 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:38.743 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:38.843 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:38.943 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:39.043 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:39.083 (debug): PMD: ixgbe_setup_mac_link_82599(): 
ixgbe_setup_mac_link_82599
2017/05/24 18:15:39.083 (debug): PMD: 
ixgbe_get_link_capabilities_82599(): ixgbe_get_link_capabilities_82599
2017/05/24 18:15:39.083 (debug): PMD: ixgbe_check_reset_blocked(): 
ixgbe_check_reset_blocked
2017/05/24 18:15:39.083 (debug): PMD: 
ixgbe_verify_lesm_fw_enabled_82599(): ixgbe_verify_lesm_fw_enabled_82599
2017/05/24 18:15:39.083 (debug): PMD: ixgbe_read_eeprom_82599(): 
ixgbe_read_eeprom_82599
2017/05/24 18:15:39.083 (debug): PMD: 
ixgbe_read_eerd_buffer_generic(): ixgbe_read_eerd_buffer_generic
2017/05/24 18:15:39.083 (debug): PMD: 
ixgbe_init_eeprom_params_generic(): ixgbe_init_eeprom_params_generic
2017/05/24 18:15:39.083 (debug): PMD: ixgbe_poll_eerd_eewr_done(): 
ixgbe_poll_eerd_eewr_done
2017/05/24 18:15:39.083 (debug): PMD: ixgbe_read_eeprom_82599(): 
ixgbe_read_eeprom_82599
2017/05/24 18:15:39.083 (debug): PMD: 
ixgbe_read_eerd_buffer_generic(): ixgbe_read_eerd_buffer_generic
2017/05/24 18:15:39.083 (debug): PMD: 
ixgbe_init_eeprom_params_generic(): ixgbe_init_eeprom_params_generic
2017/05/24 18:15:39.083 (debug): PMD: ixgbe_poll_eerd_eewr_done(): 
ixgbe_poll_eerd_eewr_done
2017/05/24 18:15:39.165 (debug): PMD: 
ixgbe_flap_tx_laser_multispeed_fiber(): 
ixgbe_flap_tx_laser_multispeed_fiber
2017/05/24 18:15:39.165 (debug): PMD: ixgbe_check_reset_blocked(): 
ixgbe_check_reset_blocked
2017/05/24 18:15:39.265 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:39.265 (debug): PMD: 
ixgbe_setup_mac_link_multispeed_fiber(): 
ixgbe_setup_mac_link_multispeed_fiber
2017/05/24 18:15:39.265 (debug): PMD: 
ixgbe_get_link_capabilities_82599(): ixgbe_get_link_capabilities_82599
2017/05/24 18:15:39.305 (debug): PMD: ixgbe_setup_mac_link_82599(): 
ixgbe_setup_mac_link_82599
2017/05/24 18:15:39.305 (debug): PMD: 
ixgbe_get_link_capabilities_82599(): ixgbe_get_link_capabilities_82599
2017/05/24 18:15:39.305 (debug): PMD: ixgbe_check_reset_blocked(): 
ixgbe_check_reset_blocked
2017/05/24 18:15:39.305 (debug): PMD: 
ixgbe_verify_lesm_fw_enabled_82599(): ixgbe_verify_lesm_fw_enabled_82599
2017/05/24 18:15:39.305 (debug): PMD: ixgbe_read_eeprom_82599(): 
ixgbe_read_eeprom_82599
2017/05/24 18:15:39.305 (debug): PMD: 
ixgbe_read_eerd_buffer_generic(): ixgbe_read_eerd_buffer_generic
2017/05/24 18:15:39.305 (debug): PMD: 
ixgbe_init_eeprom_params_generic(): ixgbe_init_eeprom_params_generic
2017/05/24 18:15:39.305 (debug): PMD

Re: [dpdk-dev] 2nd try: problem with ixgbe_dev_link_update() for multi-speed fiber [was] Re: [PATCH v4] net/ixgbe: ensure link status is updated

2017-05-24 Thread Roger B. Melton
 (debug): PMD: ixgbe_read_eerd_buffer_generic(): 
ixgbe_read_eerd_buffer_generic
2017/05/24 18:15:39.305 (debug): PMD: 
ixgbe_init_eeprom_params_generic(): ixgbe_init_eeprom_params_generic
2017/05/24 18:15:39.305 (debug): PMD: ixgbe_poll_eerd_eewr_done(): 
ixgbe_poll_eerd_eewr_done
2017/05/24 18:15:39.387 (debug): PMD: 
ixgbe_flap_tx_laser_multispeed_fiber(): ixgbe_flap_tx_laser_multispeed_fiber
2017/05/24 18:15:39.387 (debug): PMD: ixgbe_check_reset_blocked(): 
ixgbe_check_reset_blocked
2017/05/24 18:15:39.487 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:39.587 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:39.687 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:39.787 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:39.887 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic
2017/05/24 18:15:39.887 (debug): PMD: ixgbe_check_mac_link_generic(): 
ixgbe_check_mac_link_generic


2017/05/24 18:15:39.887 (debug): rmeltonDBG 
posix_pmd_oper_state_get_by_id:End rte_eth_link_get_nowait()



On 5/24/17 10:40 AM, Roger B. Melton wrote:

Hi Wei,

Thanks for getting back to me.  I'm also dealing with various 
priorities so it'll take me a bit to get the data, but I'll get back 
to you with the details in the next day or so.


Regards,
Roger


On 5/23/17 3:41 AM, Dai, Wei wrote:

Hi, Roger
Sorry for typo error. You should set --log-level=8

Regards
-Wei


-Original Message-
From: Dai, Wei
Sent: Tuesday, May 23, 2017 3:24 PM
To: 'Roger B. Melton' ; Laurent Hardy
; dev@dpdk.org
Cc: Yigit, Ferruh ; Zhang, Helin
; Ananyev, Konstantin
; olivier.m...@6wind.com
Subject: RE: 2nd try: problem with ixgbe_dev_link_update() for 
multi-speed
fiber [was] Re: [dpdk-dev] [PATCH v4] net/ixgbe: ensure link status 
is updated


Hi, Roger
Sorry for late response as we are busy with other higher priority task.
ixgbe_setup_mac_link_multispeed_fiber( ) in ixgbe_common.c calls
ixgbe_setup_mac_link( ) which some functions defined in ixgbe/base .
Would you please give us more info to narrow down this issue ?

What device id did you use to trigger this issue ?

To get more info, would you please change
"CONFIG_RTE_LIBRTE_IXGBE_DEBUG_DRIVER=n" to"
CONFIG_RTE_LIBRTE_IXGBE_DEBUG_DRIVER=y" in config/common_base And
run your application with EAL parameter --log-level=7  to get trace 
path of

ixgbe functions for us.

Thanks & Best Regards
-Wei



-Original Message-
From: Roger B. Melton [mailto:rmel...@cisco.com]
Sent: Tuesday, May 23, 2017 2:54 AM
To: Laurent Hardy ; dev@dpdk.org; Dai, Wei

Cc: Yigit, Ferruh ; Zhang, Helin
; Ananyev, Konstantin
; olivier.m...@6wind.com
Subject: 2nd try: problem with ixgbe_dev_link_update() for multi-speed
fiber [was] Re: [dpdk-dev] [PATCH v4] net/ixgbe: ensure link status is
updated

Hi Laurent/Wei,

Any thoughts?

Regards,
Roger


On 5/17/17 9:31 AM, Roger B Melton wrote:

Hi Laurent/Wei,

As I continue to integrate DPDK 17.05 into my application, I have
hit another issue with this patch while testing in a VM with
multispeed fiber ixgbe PCI passthrough.  My application periodically
invokes
rte_eth_link_get_nowait() to detect link state changes. If the link
is down (no cable or far end disabled), ixgbe_setup_link() will not
return for ~1.3seconds due to the link setup algorithm in
ixgbe_common.c:ixgbe_multispeed_fiber():

+if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
+hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+speed = hw->phy.autoneg_advertised;
+if (!speed)
+ixgbe_get_link_capabilities(hw, &speed, &autoneg);
+ixgbe_setup_link(hw, speed, true); +}
+

I have two questions:

  * Shouldn't we avoid the link setup cost if the caller has 
specified

not to wait_to_complete?
  * If the concern is speed may not be properly configured, shouldn't
the link setup be deferred until state changes link up thus
minimizing the delays enforced in ixgbe_multispeed_fiber()?


Regards,
-Roger




On 4/27/17 11:03 AM, Laurent Hardy wrote:

In case of fiber and link speed set to 1Gb at peer side (with
autoneg or with defined speed), link status could be not properly
updated at time cable is plugged-in.
Indeed if cable was not plugged when device has been configured and
started then link status will not be updated properly with new
speed as no link setup will be triggered.

To avoid this issue, IXGBE_FLAG_NEED_LINK_CONFIG is set to try a
link setup each time link_update() is triggered and current link
status is down. When cable is plugged-in, link setup will be
performed via ixgbe_setup_link().

Signed-off-by: Laurent Hardy 
---
Hi Wei, please find enclosed patch v4, tested using testpmd.

v1 -> v2:
- rebase on top of head (change flag to 1&l

Re: [dpdk-dev] 2nd try: problem with ixgbe_dev_link_update() for multi-speed fiber [was] Re: [PATCH v4] net/ixgbe: ensure link status is updated

2017-05-24 Thread Roger B. Melton

Hi Wei,

Thanks for getting back to me.  I'm also dealing with various priorities 
so it'll take me a bit to get the data, but I'll get back to you with 
the details in the next day or so.


Regards,
Roger


On 5/23/17 3:41 AM, Dai, Wei wrote:

Hi, Roger
Sorry for typo error. You should set --log-level=8

Regards
-Wei


-Original Message-
From: Dai, Wei
Sent: Tuesday, May 23, 2017 3:24 PM
To: 'Roger B. Melton' ; Laurent Hardy
; dev@dpdk.org
Cc: Yigit, Ferruh ; Zhang, Helin
; Ananyev, Konstantin
; olivier.m...@6wind.com
Subject: RE: 2nd try: problem with ixgbe_dev_link_update() for multi-speed
fiber [was] Re: [dpdk-dev] [PATCH v4] net/ixgbe: ensure link status is updated

Hi, Roger
Sorry for late response as we are busy with other higher priority task.
ixgbe_setup_mac_link_multispeed_fiber( ) in ixgbe_common.c calls
ixgbe_setup_mac_link( ) which some functions defined in ixgbe/base .
Would you please give us more info to narrow down this issue ?

What device id did you use to trigger this issue ?

To get more info, would you please change
"CONFIG_RTE_LIBRTE_IXGBE_DEBUG_DRIVER=n" to"
CONFIG_RTE_LIBRTE_IXGBE_DEBUG_DRIVER=y" in config/common_base And
run your application with EAL parameter --log-level=7  to get trace path of
ixgbe functions for us.

Thanks & Best Regards
-Wei



-Original Message-
From: Roger B. Melton [mailto:rmel...@cisco.com]
Sent: Tuesday, May 23, 2017 2:54 AM
To: Laurent Hardy ; dev@dpdk.org; Dai, Wei

Cc: Yigit, Ferruh ; Zhang, Helin
; Ananyev, Konstantin
; olivier.m...@6wind.com
Subject: 2nd try: problem with ixgbe_dev_link_update() for multi-speed
fiber [was] Re: [dpdk-dev] [PATCH v4] net/ixgbe: ensure link status is
updated

Hi Laurent/Wei,

Any thoughts?

Regards,
Roger


On 5/17/17 9:31 AM, Roger B Melton wrote:

Hi Laurent/Wei,

As I continue to integrate DPDK 17.05 into my application, I have
hit another issue with this patch while testing in a VM with
multispeed fiber ixgbe PCI passthrough.  My application periodically
invokes
rte_eth_link_get_nowait() to detect link state changes.  If the link
is down (no cable or far end disabled), ixgbe_setup_link() will not
return for ~1.3seconds due to the link setup algorithm in
ixgbe_common.c:ixgbe_multispeed_fiber():

+if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
+hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+speed = hw->phy.autoneg_advertised;
+if (!speed)
+ixgbe_get_link_capabilities(hw, &speed, &autoneg);
+ixgbe_setup_link(hw, speed, true); +}
+

I have two questions:

  * Shouldn't we avoid the link setup cost if the caller has specified
not to wait_to_complete?
  * If the concern is speed may not be properly configured, shouldn't
the link setup be deferred until state changes link up thus
minimizing the delays enforced in ixgbe_multispeed_fiber()?


Regards,
-Roger




On 4/27/17 11:03 AM, Laurent Hardy wrote:

In case of fiber and link speed set to 1Gb at peer side (with
autoneg or with defined speed), link status could be not properly
updated at time cable is plugged-in.
Indeed if cable was not plugged when device has been configured and
started then link status will not be updated properly with new
speed as no link setup will be triggered.

To avoid this issue, IXGBE_FLAG_NEED_LINK_CONFIG is set to try a
link setup each time link_update() is triggered and current link
status is down. When cable is plugged-in, link setup will be
performed via ixgbe_setup_link().

Signed-off-by: Laurent Hardy 
---
Hi Wei, please find enclosed patch v4, tested using testpmd.

v1 -> v2:
- rebase on top of head (change flag to 1<<4)
- fix regression with copper links: only update link for fiber
links

v2 -> v3:
- remove unnescessary check on speed mask if autoneg is false

v3 -> v4:
- remove default speed set to 10Gb if autoneg is false, rely on
ixgbe_get_link_capabilities( ) instead.
---
   drivers/net/ixgbe/ixgbe_ethdev.c | 14 ++
   drivers/net/ixgbe/ixgbe_ethdev.h |  1 +
   2 files changed, 15 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
b/drivers/net/ixgbe/ixgbe_ethdev.c
index 7b856bb..8a0c0a7 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -3782,8 +3782,12 @@ ixgbe_dev_link_update(struct rte_eth_dev

*dev,

int wait_to_complete)
   struct ixgbe_hw *hw =
IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
   struct rte_eth_link link, old;
   ixgbe_link_speed link_speed = IXGBE_LINK_SPEED_UNKNOWN;
+struct ixgbe_interrupt *intr =
+ IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
   int link_up;
   int diag;
+u32 speed = 0;
+bool autoneg = false;
 link.link_status = ETH_LINK_DOWN;
   link.link_speed = 0;
@@ -3793,6 +3797,14 @@ ixgbe_dev_link_update(struct rte_eth_dev

*dev,

int wait_to_complete)
 hw->mac.get

[dpdk-dev] 2nd try: problem with ixgbe_dev_link_update() for multi-speed fiber [was] Re: [PATCH v4] net/ixgbe: ensure link status is updated

2017-05-22 Thread Roger B. Melton

Hi Laurent/Wei,

Any thoughts?

Regards,
Roger


On 5/17/17 9:31 AM, Roger B Melton wrote:

Hi Laurent/Wei,

As I continue to integrate DPDK 17.05 into my application, I have hit 
another issue with this patch while testing in a VM with multispeed 
fiber ixgbe PCI passthrough.  My application periodically invokes 
rte_eth_link_get_nowait() to detect link state changes.  If the link 
is down (no cable or far end disabled), ixgbe_setup_link() will not 
return for ~1.3seconds due to the link setup algorithm in 
ixgbe_common.c:ixgbe_multispeed_fiber():


+if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
+hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+speed = hw->phy.autoneg_advertised;
+if (!speed)
+ixgbe_get_link_capabilities(hw, &speed, &autoneg);
+ixgbe_setup_link(hw, speed, true); +}
+

I have two questions:

 * Shouldn't we avoid the link setup cost if the caller has specified
   not to wait_to_complete?
 * If the concern is speed may not be properly configured, shouldn't
   the link setup be deferred until state changes link up thus
   minimizing the delays enforced in ixgbe_multispeed_fiber()?


Regards,
-Roger




On 4/27/17 11:03 AM, Laurent Hardy wrote:

In case of fiber and link speed set to 1Gb at peer side (with autoneg
or with defined speed), link status could be not properly updated at
time cable is plugged-in.
Indeed if cable was not plugged when device has been configured and
started then link status will not be updated properly with new speed
as no link setup will be triggered.

To avoid this issue, IXGBE_FLAG_NEED_LINK_CONFIG is set to try a link
setup each time link_update() is triggered and current link status is
down. When cable is plugged-in, link setup will be performed via
ixgbe_setup_link().

Signed-off-by: Laurent Hardy 
---
Hi Wei, please find enclosed patch v4, tested using testpmd.

v1 -> v2:
- rebase on top of head (change flag to 1<<4)
- fix regression with copper links: only update link for fiber links

v2 -> v3:
- remove unnescessary check on speed mask if autoneg is false

v3 -> v4:
- remove default speed set to 10Gb if autoneg is false, rely on
ixgbe_get_link_capabilities( ) instead.
---
  drivers/net/ixgbe/ixgbe_ethdev.c | 14 ++
  drivers/net/ixgbe/ixgbe_ethdev.h |  1 +
  2 files changed, 15 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
b/drivers/net/ixgbe/ixgbe_ethdev.c

index 7b856bb..8a0c0a7 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -3782,8 +3782,12 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, 
int wait_to_complete)
  struct ixgbe_hw *hw = 
IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);

  struct rte_eth_link link, old;
  ixgbe_link_speed link_speed = IXGBE_LINK_SPEED_UNKNOWN;
+struct ixgbe_interrupt *intr =
+ IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
  int link_up;
  int diag;
+u32 speed = 0;
+bool autoneg = false;
link.link_status = ETH_LINK_DOWN;
  link.link_speed = 0;
@@ -3793,6 +3797,14 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, 
int wait_to_complete)

hw->mac.get_link_status = true;
  +if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
+hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+speed = hw->phy.autoneg_advertised;
+if (!speed)
+ixgbe_get_link_capabilities(hw, &speed, &autoneg);
+ixgbe_setup_link(hw, speed, true);
+}
+
  /* check if it needs to wait to complete, if lsc interrupt is 
enabled */
  if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc 
!= 0)

  diag = ixgbe_check_link(hw, &link_speed, &link_up, 0);
@@ -3810,10 +3822,12 @@ ixgbe_dev_link_update(struct rte_eth_dev 
*dev, int wait_to_complete)

if (link_up == 0) {
  rte_ixgbe_dev_atomic_write_link_status(dev, &link);
+intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
  if (link.link_status == old.link_status)
  return -1;
  return 0;
  }
+intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
  link.link_status = ETH_LINK_UP;
  link.link_duplex = ETH_LINK_FULL_DUPLEX;
  diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h 
b/drivers/net/ixgbe/ixgbe_ethdev.h

index 5176b02..b576a6f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -45,6 +45,7 @@
  #define IXGBE_FLAG_MAILBOX  (uint32_t)(1 << 1)
  #define IXGBE_FLAG_PHY_INTERRUPT(uint32_t)(1 << 2)
  #define IXGBE_FLAG_MACSEC   (uint32_t)(1 << 3)
+#define IXGBE_FLAG_NEED_LINK_CONFIG (uint32_t)(1 << 4)
/*
   * Defines that were not part of ixgbe_type.h as they are not used 
by the


.





Re: [dpdk-dev] [PATCH v4] net/ixgbe: ensure link status is updated

2017-05-17 Thread Roger B Melton

Hi Laurent/Wei,

As I continue to integrate DPDK 17.05 into my application, I have hit 
another issue with this patch while testing in a VM with multispeed 
fiber ixgbe PCI passthrough.  My application periodically invokes 
rte_eth_link_get_nowait() to detect link state changes.  If the link is 
down (no cable or far end disabled), ixgbe_setup_link() will not return 
for ~1.3seconds due to the link setup algorithm in 
ixgbe_common.c:ixgbe_multispeed_fiber():


+   if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
+   hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+   speed = hw->phy.autoneg_advertised;
+   if (!speed)
+   ixgbe_get_link_capabilities(hw, &speed, &autoneg);
+   ixgbe_setup_link(hw, speed, true); +}
+

I have two questions:

 * Shouldn't we avoid the link setup cost if the caller has specified
   not to wait_to_complete?
 * If the concern is speed may not be properly configured, shouldn't
   the link setup be deferred until state changes link up thus
   minimizing the delays enforced in ixgbe_multispeed_fiber()?


Regards,
-Roger




On 4/27/17 11:03 AM, Laurent Hardy wrote:

In case of fiber and link speed set to 1Gb at peer side (with autoneg
or with defined speed), link status could be not properly updated at
time cable is plugged-in.
Indeed if cable was not plugged when device has been configured and
started then link status will not be updated properly with new speed
as no link setup will be triggered.

To avoid this issue, IXGBE_FLAG_NEED_LINK_CONFIG is set to try a link
setup each time link_update() is triggered and current link status is
down. When cable is plugged-in, link setup will be performed via
ixgbe_setup_link().

Signed-off-by: Laurent Hardy 
---
Hi Wei, please find enclosed patch v4, tested using testpmd.

v1 -> v2:
- rebase on top of head (change flag to 1<<4)
- fix regression with copper links: only update link for fiber links

v2 -> v3:
- remove unnescessary check on speed mask if autoneg is false

v3 -> v4:
- remove default speed set to 10Gb if autoneg is false, rely on
ixgbe_get_link_capabilities( ) instead.
---
  drivers/net/ixgbe/ixgbe_ethdev.c | 14 ++
  drivers/net/ixgbe/ixgbe_ethdev.h |  1 +
  2 files changed, 15 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 7b856bb..8a0c0a7 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -3782,8 +3782,12 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct rte_eth_link link, old;
ixgbe_link_speed link_speed = IXGBE_LINK_SPEED_UNKNOWN;
+   struct ixgbe_interrupt *intr =
+   IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
int link_up;
int diag;
+   u32 speed = 0;
+   bool autoneg = false;
  
  	link.link_status = ETH_LINK_DOWN;

link.link_speed = 0;
@@ -3793,6 +3797,14 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
  
  	hw->mac.get_link_status = true;
  
+	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&

+   hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+   speed = hw->phy.autoneg_advertised;
+   if (!speed)
+   ixgbe_get_link_capabilities(hw, &speed, &autoneg);
+   ixgbe_setup_link(hw, speed, true);
+   }
+
/* check if it needs to wait to complete, if lsc interrupt is enabled */
if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc != 0)
diag = ixgbe_check_link(hw, &link_speed, &link_up, 0);
@@ -3810,10 +3822,12 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
  
  	if (link_up == 0) {

rte_ixgbe_dev_atomic_write_link_status(dev, &link);
+   intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
if (link.link_status == old.link_status)
return -1;
return 0;
}
+   intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
link.link_status = ETH_LINK_UP;
link.link_duplex = ETH_LINK_FULL_DUPLEX;
  
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h

index 5176b02..b576a6f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -45,6 +45,7 @@
  #define IXGBE_FLAG_MAILBOX  (uint32_t)(1 << 1)
  #define IXGBE_FLAG_PHY_INTERRUPT(uint32_t)(1 << 2)
  #define IXGBE_FLAG_MACSEC   (uint32_t)(1 << 3)
+#define IXGBE_FLAG_NEED_LINK_CONFIG (uint32_t)(1 << 4)
  
  /*

   * Defines that were not part of ixgbe_type.h as they are not used by the




Re: [dpdk-dev] [dpdk-stable] [PATCH] net/ixgbe: fix calling null function of VF

2017-05-10 Thread Roger B Melton

-sta...@dpdk.org per Thomas

I know this patch has already been applied but to complete the loop, I 
have applied the patch to 17.05-rc4 and verified within my application 
that the NULL function pointer issue for ixgbe_vf has been resolved.



Regards,
Roger

On 5/10/17 3:00 AM, Wei Dai wrote:

hw->mac.ops.get_media-type() of ixgbe VF is NULL and should not
be called directly. It had better be replaced by calling
ixgbe_get_media_type( ) to avoid crash.

Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is updated")
Cc: sta...@dpdk.org

Signed-off-by: Wei Dai 

Tested-by Roger B Melton 

---
  drivers/net/ixgbe/ixgbe_ethdev.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index ec667d8..ed2baec 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -3799,7 +3799,7 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
hw->mac.get_link_status = true;
  
  	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&

-   hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+   ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
speed = hw->phy.autoneg_advertised;
if (!speed)
ixgbe_get_link_capabilities(hw, &speed, &autoneg);





Re: [dpdk-dev] [dpdk-stable] [PATCH] net/ixgbe: fix calling null function of VF

2017-05-10 Thread Roger B. Melton



On 5/10/17 5:50 AM, Ferruh Yigit wrote:

On 5/10/2017 8:00 AM, Wei Dai wrote:

hw->mac.ops.get_media-type() of ixgbe VF is NULL and should not
be called directly. It had better be replaced by calling
ixgbe_get_media_type( ) to avoid crash.

Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is updated")
Cc: sta...@dpdk.org

Signed-off-by: Wei Dai 

Hi Roger and Laurent,

This is very close the release, can you please support verifying this
defect?

An explicit Acked-by and/or Tested-by would help a lot.

Thanks,
ferruh


Hi Ferruh,

Happy to test it.  I'll get back to you shortly.

Regards,
Roger





---
  drivers/net/ixgbe/ixgbe_ethdev.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index ec667d8..ed2baec 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -3799,7 +3799,7 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
hw->mac.get_link_status = true;
  
  	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&

-   hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+   ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
speed = hw->phy.autoneg_advertised;
if (!speed)
ixgbe_get_link_capabilities(hw, &speed, &autoneg);


.





Re: [dpdk-dev] bug in net/ixgbe/ixgbe_ethdev.c:ixgbe_dev_link_update beginning in 17.05-rc3?

2017-05-10 Thread Roger B. Melton

Thanks  -Roger


On 5/10/17 4:20 AM, Dai, Wei wrote:

Yes, it is a bug.
The hw->mac.ops.get_media_type of ixgbe VF is NULL.
I have just submitted a patch to fix it.
http://dpdk.org/dev/patchwork/patch/24188/


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Roger B Melton
Sent: Wednesday, May 10, 2017 1:53 AM
To: dev@dpdk.org; laurent.ha...@6wind.com
Subject: [dpdk-dev] bug in net/ixgbe/ixgbe_ethdev.c:ixgbe_dev_link_update
beginning in 17.05-rc3?

After updating to 17.05-rc4 I hit a crash in
drivers/net/ixgbe/ixgbe_ethdev.c:ixgbe_dev_link_update().  The issue was
a NULL get_media_type FV for the VF driver.

Looking at recent commits, I see the following added the
get_media_type() check:

 commit c12d22f65b132c56db7b4fdbfd5ddce27d1e9572
 Author: Laurent Hardy 
 Date:   Thu Apr 27 17:03:42 2017 +0200

 @@ -3793,6 +3797,14 @@ ixgbe_dev_link_update(struct
rte_eth_dev
 *dev, int wait_to_complete)

  hw->mac.get_link_status = true;

 +   if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
 +   hw->mac.ops.get_media_type(hw) ==
 ixgbe_media_type_fiber) {
 +   speed = hw->phy.autoneg_advertised;
 +   if (!speed)
 +   ixgbe_get_link_capabilities(hw,
&speed,
 &autoneg);
 +   ixgbe_setup_link(hw, speed, true);
 +   }
 +

This is fine for the PF driver, but we shouldn't invoke get_media_type
for the VF.


Laurent,

Is this a bug, or am I missing something?  If it is a bug, what's the
proper fix?

Regards,
Roger




--





[dpdk-dev] bug in net/ixgbe/ixgbe_ethdev.c:ixgbe_dev_link_update beginning in 17.05-rc3?

2017-05-09 Thread Roger B Melton
After updating to 17.05-rc4 I hit a crash in 
drivers/net/ixgbe/ixgbe_ethdev.c:ixgbe_dev_link_update().  The issue was 
a NULL get_media_type FV for the VF driver.


Looking at recent commits, I see the following added the 
get_media_type() check:


   commit c12d22f65b132c56db7b4fdbfd5ddce27d1e9572
   Author: Laurent Hardy 
   Date:   Thu Apr 27 17:03:42 2017 +0200

   @@ -3793,6 +3797,14 @@ ixgbe_dev_link_update(struct rte_eth_dev
   *dev, int wait_to_complete)

hw->mac.get_link_status = true;

   +   if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
   +   hw->mac.ops.get_media_type(hw) ==
   ixgbe_media_type_fiber) {
   +   speed = hw->phy.autoneg_advertised;
   +   if (!speed)
   +   ixgbe_get_link_capabilities(hw, &speed,
   &autoneg);
   +   ixgbe_setup_link(hw, speed, true);
   +   }
   +

This is fine for the PF driver, but we shouldn't invoke get_media_type 
for the VF.



Laurent,

Is this a bug, or am I missing something?  If it is a bug, what's the 
proper fix?


Regards,
Roger




--
 ________
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmel...@cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |



Re: [dpdk-dev] [PATCH v6 0/5] Extended xstats API in ethdev library to allow grouping of stats

2017-04-25 Thread Roger B Melton

On 4/24/17 11:49 AM, Mcnamara, John wrote:

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
Sent: Monday, April 24, 2017 1:41 PM
To: Kozak, KubaX 
Cc: dev@dpdk.org; Olivier Matz ; Van Haaren, Harry
; Jain, Deepak K ;
Piasecki, JacekX 
Subject: Re: [dpdk-dev] [PATCH v6 0/5] Extended xstats API in ethdev
library to allow grouping of stats

24/04/2017 14:32, Olivier Matz:

Hi,


...

So, I wonder if it wouldn't be more simple to keep the old API intact
(it would avoid unannounced breakage). The new feature can be
implemented in an additional API:

  rte_eth_xstats_get_by_id(uint8_t port_id, const uint64_t *ids,
uint64_t *values, unsigned int size)
rte_eth_xstats_get_names_by_id(uint8_t port_id, const uint64_t *ids,
struct rte_eth_xstat_name *xstats_names, unsigned int size)

Or:

  rte_eth_xstats_get_by_id(uint8_t port_id, const uint64_t *ids,
struct rte_eth_xstat *values, unsigned int size)
rte_eth_xstats_get_names_by_id(uint8_t port_id, const uint64_t *ids,
struct rte_eth_xstat_name *xstats_names, unsigned int size)

  (which would allow to deprecate the old API, but I'm not sure
   we need to)


Can we fix that for 17.05?

...

Back to the issues, please try to fix it quickly or we should revert it
for 17.05-rc3.

Hi,

We'll submit a patch to change the APIs for rte_eth_xstats_get() and
rte_eth_xstats_get_names() back to their previous signature, without
symbol versions, and add the APIs suggested by Olivier.

We'll work on that as soon as possible.

John
.


Hi folks,

I adapted our application to the API changes presented in 17.05-rc2 and 
encountered problem that I don't think was pointed out in Olivier's list 
of issues.


In our application we call rte_eth_stats_get() with NULL pointers to 
determine the number of xstats.  We use the returned count  to 
pre-allocate buffers for the name strings and xstats values and then 
invoke rte_eth_xstats_get_names() and rte_eth_xstats_get() with pointers 
to appropriately sized buffers and the number of entries available in 
those buffers.


Things appeared to work fine until I requested stats for a net_ixgbe_vf 
port. In that case the initial call to rte_eth_xstats_get() did not 
account for the xstats extensions from the ixgbevf PMD.  Long story 
short, my buffers were undersized and bad things happened.


John,

If you intend to restore the previous API signature but retain the -rc2 
internal implementation, may I suggest that both APIs use the same 
internal function to determine stats count.  Otherwise the potential for 
disagreement between the APIs will remain.


Also while on this topic, IMO the above demonstrates that we really need 
an API to query the xstats count rather than relying on 
rte_eth_xstats_get() or rte_eth_xstats_get_names() with 0'd out arguments.


Regards,
Roger


[dpdk-dev] [PATCH v2] eal: fix compile error for old glibc caused by pthread_setname_np()

2015-11-25 Thread Roger B. Melton


On 11/25/15 9:03 AM, Thomas Monjalon wrote:
> 2015-11-25 08:51, Roger B. Melton:
>> Have you thought about a way to set thread name when glibc < 2.12.  I
>> also ran into the problem recently and played around with prctl()
>> (Linux) to set thread (process) name.  e.g.
>>
>>  ret = prctl(PR_SET_NAME,,0,0,0);
>>
>>
>> There are 2 issues I think:
>>
>>  1) The semantics are different than prthread_setname_np().  With
>>  pthread_setname_np() a name can be assigned to any thread, with
>>  prctl() the name is assigned to the active thread.  That would mean
>>  that rather than rte_eal_init(), rte_eal_intr_init() could not
>>  assign thread names.  Rather the threads would have to name themselves.
>>
>>  2) I think BSD lacks prctl(), but some (not all?) BSD
>>  implementations have setproctitle() to do the same thing.
>>
>>
>> It might be too late for 2.2, but something to think about for the future.
> I don't think this feature is important enough to deal with old environments
> and to risk some complicated bugs.
> Do you think it deserves more tricks?
> .
>
I agree with you Thomas.  While I am one of those living in an old 
environment, I believe that the complications of the tricks out weight 
the debug benefit.  However there may be other in the community who have 
a different view, so I thought I would at least suggest that there are 
alternatives.

Thanks,
-Roger



[dpdk-dev] [PATCH v2] eal: fix compile error for old glibc caused by pthread_setname_np()

2015-11-25 Thread Roger B. Melton


> +/**
> + * Set thread names.
> + *
> + * Macro to wrap `pthread_setname_np()` with a glibc version check.
> + * Only glibc >= 2.12 supports this feature.
> + *
> + * This macro only used for Linux, BSD does direct libc call.
> + * BSD libc version of function is `pthread_set_name_np()`.
> + */
> +#if defined(__DOXYGEN__)
> +#define rte_thread_setname(...) pthread_setname_np(__VA_ARGS__)
> +#endif
> +
> +#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
> +#if __GLIBC_PREREQ(2, 12)
> +#define rte_thread_setname(...) pthread_setname_np(__VA_ARGS__)
> +#else
> +#define rte_thread_setname(...) 0
> +#endif
> +#endif

Have you thought about a way to set thread name when glibc < 2.12.  I 
also ran into the problem recently and played around with prctl() 
(Linux) to set thread (process) name.  e.g.

ret = prctl(PR_SET_NAME,,0,0,0);


There are 2 issues I think:

1) The semantics are different than prthread_setname_np().  With
pthread_setname_np() a name can be assigned to any thread, with
prctl() the name is assigned to the active thread.  That would mean
that rather than rte_eal_init(), rte_eal_intr_init() could not
assign thread names.  Rather the threads would have to name themselves.

2) I think BSD lacks prctl(), but some (not all?) BSD
implementations have setproctitle() to do the same thing.


It might be too late for 2.2, but something to think about for the future.

Regards,
Roger







[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?

2015-11-21 Thread Roger B. Melton
David/Thomas,  in-line  -Roger


On 11/18/15 5:13 PM, Roger B. Melton wrote:
> Hi Thomas, in-line  -Roger
>
> On 11/17/15 10:46 AM, Thomas Monjalon wrote:
>> 2015-11-17 08:56, Roger B. Melton:
>>> Hi David,  in-line -Roger
>>>
>>> On 11/16/15 4:46 AM, David Marchand wrote:
>>>> Hello Roger,
>>>>
>>>> On Sun, Nov 15, 2015 at 3:45 PM, Roger B. Melton >>> <mailto:rmelton at cisco.com>> wrote:
>>>>
>>>>  I like the "-b all" and "-w none" idea, but I think it might be
>>>>  complicated to implement it the way we would need it to work.  
>>>> The
>>>>  existing -b and -w options  persist for the duration of the
>>>>  application, and we would need the "-b all"/"-w none" to persists
>>>>  only through rte_eal_init() time.  Otherwise our attempt to to
>>>>  attach a device at a later time would be blocked by the option.
>>>>
>>>> I agree, the black/white lists should only apply to initial scan.
>>>> I forgot about this problem ...
>>>> I had started some cleanup in the pci scan / attach code but this is
>>>> too late for 2.2, I will post this in the next merge window.
>>>>
>>>>
>>>>  Wouldn't it be simpler to have an option to disable the
>>>>  rte_eal_init() time the probe.  Would that address the issue with
>>>>  VFIO, prevent automatically attaching to devices while permitting
>>>>  on demand attach?
>>>>
>>>>
>>>> I suppose we can do this yes (I think Thomas once proposed off-list an
>>>> option like --no-pci-scan).
>>>> Do you think you can send a patch ?
>>> What about --no-pci-init-probe?  I know it's long, but it is more
>>> descriptive of it's purpose to disable only the init time pci probe.
>> Why not a "-b all"?
>> Making it work would also solve the case where you to scan only part of
>> the devices and initialize the blacklisted ones later.
>> .
>>
> Do you envision "-b all" setting a flag that would be used to block 
> rte_eal_init() invocation of rte_eal_pci_probe()?  e.g. If we have a 
> new API, *rte_eal_pci_blacklist_all_get()* that returns a non-zero 
> value if "-b all" was specified, then in rte_eal_init() we would have 
> something like:
>
> ...
> /* Probe & Initialize PCI devices */
> *if (!rte_eal_pci_blacklist_all_get())**  <--- New check*
> if (rte_eal_pci_probe())
> rte_panic("Cannot probe PCI\n");
> ...
>
>
> Or setting a flag that would be checked in rte_eal_probe_one() similar 
> to the existing per device blacklist check?  e.g. Again with a new 
> API, *rte_eal_pci_blacklist_all_get()* that returns a non-zero value 
> if "-b all" was specified, then in rte_eal_pci_probe_one_driver() we 
> would have something like:
>
> /* no initialization when blacklisted, return without error */
> if (*rte_eal_pci_blacklist_all_get() ||  <--- New check*
> (dev->devargs != NULL &&
>  dev->devargs->type == RTE_DEVTYPE_BLACKLISTED_PCI)) {
> RTE_LOG(DEBUG, EAL, "  Device is blacklisted, not 
> initializing\n");
> return 1;
> }
>
> The former would work, but I think it would be confusing for "-b all" 
> to only apply to init.
>
> The latter would be consistent with how "-b " works, but in 
> order to initialize devices at a later time, we would need a way to 
> clear the blacklist all state at run time so that 
> *rte_eal_pci_blacklist_all()* would return false. For example, 
> something like *rte_eal_pci_blacklist_all_clear()*.
>
> Or do you have something else in mind entirely?
>
> .
>

How about something like this:

  * Add pci_blacklist_all member to eal internal config.
  * When non-zero, pci_blacklist_all will prevent PCI probing. Result is
that probes will find no match.
  * pci_blacklist_all is temporal
  o Can be set at rte_eal_init() time with "-b all" switch.
  o Can be set or cleared at run time with
rte_eal_pci_blacklist_all_set() or
rte_eal_pci_blacklist_all_clear() respectively.
  o State can be queried with rte_eal_pci_blacklist_all_get()
  * Blacklist and whitelist switches with PCI DBDF are still permitted
as before, but they are used only when the blacklist all override is
cleared.

I've tested the implementation from the diff below in our application 
and it works well.  Note ou

[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?

2015-11-18 Thread Roger B. Melton
Hi Thomas, in-line  -Roger

On 11/17/15 10:46 AM, Thomas Monjalon wrote:
> 2015-11-17 08:56, Roger B. Melton:
>> Hi David,  in-line -Roger
>>
>> On 11/16/15 4:46 AM, David Marchand wrote:
>>> Hello Roger,
>>>
>>> On Sun, Nov 15, 2015 at 3:45 PM, Roger B. Melton >> <mailto:rmelton at cisco.com>> wrote:
>>>
>>>  I like the "-b all" and "-w none" idea, but I think it might be
>>>  complicated to implement it the way we would need it to work.  The
>>>  existing -b and -w options  persist for the duration of the
>>>  application, and we would need the "-b all"/"-w none" to persists
>>>  only through rte_eal_init() time.  Otherwise our attempt to to
>>>  attach a device at a later time would be blocked by the option.
>>>
>>> I agree, the black/white lists should only apply to initial scan.
>>> I forgot about this problem ...
>>> I had started some cleanup in the pci scan / attach code but this is
>>> too late for 2.2, I will post this in the next merge window.
>>>
>>>
>>>  Wouldn't it be simpler to have an option to disable the
>>>  rte_eal_init() time the probe.  Would that address the issue with
>>>  VFIO, prevent automatically attaching to devices while permitting
>>>  on demand attach?
>>>
>>>
>>> I suppose we can do this yes (I think Thomas once proposed off-list an
>>> option like --no-pci-scan).
>>> Do you think you can send a patch ?
>> What about --no-pci-init-probe?  I know it's long, but it is more
>> descriptive of it's purpose to disable only the init time pci probe.
> Why not a "-b all"?
> Making it work would also solve the case where you to scan only part of
> the devices and initialize the blacklisted ones later.
> .
>
Do you envision "-b all" setting a flag that would be used to block 
rte_eal_init() invocation of rte_eal_pci_probe()?  e.g. If we have a new 
API, *rte_eal_pci_blacklist_all_get()* that returns a non-zero value if 
"-b all" was specified, then in rte_eal_init() we would have something like:

...
 /* Probe & Initialize PCI devices */
*if (!rte_eal_pci_blacklist_all_get())**  <--- New check*
 if (rte_eal_pci_probe())
 rte_panic("Cannot probe PCI\n");
...


Or setting a flag that would be checked in rte_eal_probe_one() similar 
to the existing per device blacklist check?  e.g. Again with a new API, 
*rte_eal_pci_blacklist_all_get()* that returns a non-zero value if "-b 
all" was specified, then in rte_eal_pci_probe_one_driver() we would have 
something like:

 /* no initialization when blacklisted, return without error */
 if (*rte_eal_pci_blacklist_all_get() ||  <--- New check*
 (dev->devargs != NULL &&
  dev->devargs->type == RTE_DEVTYPE_BLACKLISTED_PCI)) {
 RTE_LOG(DEBUG, EAL, "  Device is blacklisted, not 
initializing\n");
 return 1;
 }

The former would work, but I think it would be confusing for "-b all" to 
only apply to init.

The latter would be consistent with how "-b " works, but in 
order to initialize devices at a later time, we would need a way to 
clear the blacklist all state at run time so that 
*rte_eal_pci_blacklist_all()* would return false. For example, something 
like *rte_eal_pci_blacklist_all_clear()*.

Or do you have something else in mind entirely?



[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?

2015-11-17 Thread Roger B. Melton
Hi David,  in-line -Roger

On 11/16/15 4:46 AM, David Marchand wrote:
> Hello Roger,
>
> On Sun, Nov 15, 2015 at 3:45 PM, Roger B. Melton  <mailto:rmelton at cisco.com>> wrote:
>
> I like the "-b all" and "-w none" idea, but I think it might be
> complicated to implement it the way we would need it to work.  The
> existing -b and -w options  persist for the duration of the
> application, and we would need the "-b all"/"-w none" to persists
> only through rte_eal_init() time.  Otherwise our attempt to to
> attach a device at a later time would be blocked by the option.
>
>
> I agree, the black/white lists should only apply to initial scan.
> I forgot about this problem ...
> I had started some cleanup in the pci scan / attach code but this is 
> too late for 2.2, I will post this in the next merge window.
>
>
> Wouldn't it be simpler to have an option to disable the
> rte_eal_init() time the probe.  Would that address the issue with
> VFIO, prevent automatically attaching to devices while permitting
> on demand attach?
>
>
> I suppose we can do this yes (I think Thomas once proposed off-list an 
> option like --no-pci-scan).
> Do you think you can send a patch ?

What about --no-pci-init-probe?  I know it's long, but it is more 
descriptive of it's purpose to disable only the init time pci probe.

I code and test and have it ready.  I'm still working through internal 
processes to allow me to submit patches, but I hope to have that 
resolved in the next few weeks and at that time I can submit the patch.
>
>
> -- 
> David Marchand
>



[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?

2015-11-15 Thread Roger B. Melton
Hi David,

We are on a very old kernel (2.6.xx) that lacks VFIO.  In the future 
however, after migration to a newer kernel it will be an option.

I like the "-b all" and "-w none" idea, but I think it might be 
complicated to implement it the way we would need it to work.  The 
existing -b and -w options  persist for the duration of the application, 
and we would need the "-b all"/"-w none" to persists only through 
rte_eal_init() time.  Otherwise our attempt to to attach a device at a 
later time would be blocked by the option.

Wouldn't it be simpler to have an option to disable the rte_eal_init() 
time the probe.  Would that address the issue with VFIO, prevent 
automatically attaching to devices while permitting on demand attach?

Thanks again.

Regards,
-Roger



On 11/14/15 12:51 PM, David Marchand wrote:
> Hello Roger,
>
> On Sat, Nov 14, 2015 at 4:55 PM, Roger B. Melton  <mailto:rmelton at cisco.com>> wrote:
>
> Agreed.  For our application, the debug case would be to _enable_
> the PCI scan.
>
> Again, thanks David for pointing it out.  It did solve our problem.
>
>
> The only problem with --no-pci is that I think that vfio won't work 
> properly if used.
>
> Did you try to blacklist all your devices then attach them later ?
> I would say what you need here is to "blacklist all" or "whitelist 
> none" at startup, so maybe a special keyword for -b/-w options.
>
>
> -- 
> David Marchand

-- 
  
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmelton at cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |



[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?

2015-11-14 Thread Roger B. Melton
Agreed.  For our application, the debug case would be to _enable_ the 
PCI scan.

Again, thanks David for pointing it out.  It did solve our problem.

Regards,
Roger


On 11/13/15 5:58 PM, Don Provan wrote:
> --no-pci is cool. I'm pretty sure that wasn't there when the PCI scan was 
> first added to
> the library init routine. I'm glad to see it, so thanks for pointing it out.
>
> Just for the record: The comment says, "for debug purposes, PCI can be 
> disabled".
> This exhibits one of those classic DPDK blindspots. The library can be used 
> for many
> things entirely unrelated to hardware. My project has several DPDK 
> applications
> intended to be used by users that do not have privs to mess around with PCI
> hardware, so, for them, turning off PCI wouldn't be just for debugging 
> purposes.
>
> -don provan
> dprovan at bivio.net
>
> -Original Message-
> From: David Marchand [mailto:david.marchand at 6wind.com]
> Sent: Friday, November 13, 2015 12:50 AM
> To: Roger B Melton 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?
>
> ...
> Did you try the --no-pci option ?
> It avoids the initial sysfs scan, so with no pci device, the initial 
> pci_probe should do nothing.
> ...
>


[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?

2015-11-13 Thread Roger B. Melton
Hi David,


On 11/13/15 3:49 AM, David Marchand wrote:
> Hello Roger,
>
> On Thu, Nov 12, 2015 at 11:43 PM, Roger B Melton  <mailto:rmelton at cisco.com>> wrote:
>
> Hi folks,
>
> With the addition of hot plug support we have been migrating away
> from device discovery and attach at initialization time to a model
> where it is controlled from a separate process.  The separate
> process manages the binding of devices to UIO and instructs the
> DPDK process when to attach.  One of the problems we stumbled onto
> was that if our control process discovered devices and bound them
> to UIO before our DPDK process started, then rte_eal_init() would
> discover and attach to those devices via the rte_eal_pci_probe()
> invocation. This caused problems later on when when our control
> process, instructed our DPDK process to attach to a device.
>
> There are a number of ways we could address this, but the simplest
> is to prevent the rte_eal_pci_probe() at rte_eal_init() time.  In
> our model we will never need it and I suspect others may also be
> in that boat.
>
> What are your thoughts on adding an argument to instruct
> rte_eal_init() to skip the PCI probe?
>
>
> Did you try the --no-pci option ?
> It avoids the initial sysfs scan, so with no pci device, the initial 
> pci_probe should do nothing.
>
> Attaching devices later will trigger this sysfs scan and only probe 
> the requested device.
> I am not totally happy with the way it is done right now, but I think 
> this should work for you.

I saw it, but I was so caught up in the probe that I didn't consider 
that delaying the scan until attach time might solve the problem.

I'll give it shot.  Thanks for pointing it out David.

Regards,
-Roger

>
>
> -- 
> David Marchand

-- 
  
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmelton at cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |



[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?

2015-11-12 Thread Roger B Melton
Hi folks,

With the addition of hot plug support we have been migrating away from 
device discovery and attach at initialization time to a model where it 
is controlled from a separate process.  The separate process manages the 
binding of devices to UIO and instructs the DPDK process when to 
attach.  One of the problems we stumbled onto was that if our control 
process discovered devices and bound them to UIO before our DPDK process 
started, then rte_eal_init() would discover and attach to those devices 
via the rte_eal_pci_probe() invocation. This caused problems later on 
when when our control process, instructed our DPDK process to attach to 
a device.

There are a number of ways we could address this, but the simplest is to 
prevent the rte_eal_pci_probe() at rte_eal_init() time.  In our model we 
will never need it and I suspect others may also be in that boat.

What are your thoughts on adding an argument to instruct rte_eal_init() 
to skip the PCI probe?

Thanks,
-Roger



[dpdk-dev] [PATCH 2/2] igb: fix VF statistic wraparound handling macro

2015-10-14 Thread Roger B. Melton


On 10/12/15 12:45 PM, Harry van Haaren wrote:
> Fix a misinterpreatation of VF statistic macro in e1000/igb.
>
> Signed-off-by: Harry van Haaren 

Acked-by: Roger Melton 


[dpdk-dev] [PATCH 1/2] ixgbe: fix VF statistic wraparound handling macro

2015-10-14 Thread Roger B. Melton


On 10/12/15 12:45 PM, Harry van Haaren wrote:
> Fix a misinterpretation of VF stats in ixgbe
>
> Signed-off-by: Harry van Haaren 

Acked-by: Roger Melton 


[dpdk-dev] [PATCH 2/2] igb: fix VF statistic wraparound handling macro

2015-10-13 Thread Roger B. Melton
ack

On 10/12/15 12:45 PM, Harry van Haaren wrote:
> Fix a misinterpreatation of VF statistic macro in e1000/igb.
>
> Signed-off-by: Harry van Haaren 
> ---
>   drivers/net/e1000/igb_ethdev.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
> index 848ef6e..e4911fc 100644
> --- a/drivers/net/e1000/igb_ethdev.c
> +++ b/drivers/net/e1000/igb_ethdev.c
> @@ -246,11 +246,10 @@ static void eth_igb_configure_msix_intr(struct 
> rte_eth_dev *dev);
>   #define UPDATE_VF_STAT(reg, last, cur)\
>   { \
>   u32 latest = E1000_READ_REG(hw, reg); \
> - cur += latest - last; \
> + cur += (latest-last) & UINT_MAX;  \
>   last = latest;\
>   }
>   
> -
>   #define IGB_FC_PAUSE_TIME 0x0680
>   #define IGB_LINK_UPDATE_CHECK_TIMEOUT  90  /* 9s */
>   #define IGB_LINK_UPDATE_CHECK_INTERVAL 100 /* ms */

-- 
  ____
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmelton at cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |



[dpdk-dev] [PATCH 1/2] ixgbe: fix VF statistic wraparound handling macro

2015-10-13 Thread Roger B. Melton
ack

On 10/12/15 12:45 PM, Harry van Haaren wrote:
> Fix a misinterpretation of VF stats in ixgbe
>
> Signed-off-by: Harry van Haaren 
> ---
>   drivers/net/ixgbe/ixgbe_ethdev.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index ec2918c..86dcd87 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -329,10 +329,10 @@ static int ixgbe_timesync_read_tx_timestamp(struct 
> rte_eth_dev *dev,
>   /*
>* Define VF Stats MACRO for Non "cleared on read" register
>*/
> -#define UPDATE_VF_STAT(reg, last, cur)   \
> +#define UPDATE_VF_STAT(reg, last, cur)  \
>   {   \
>   uint32_t latest = IXGBE_READ_REG(hw, reg);  \
> - cur += latest - last;   \
> + cur += (latest-last) & UINT_MAX;\
>   last = latest;  \
>   }
>   

-- 
  
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmelton at cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |



[dpdk-dev] [PATCH 1/2] ixgbe: fix VF statistic wraparound handling macro

2015-10-13 Thread Roger B. Melton
Agreed, this handles the off by one error on wrap around and should be 
faster.

-Roger


On 10/12/15 11:41 AM, Alexander Duyck wrote:
> On 10/12/2015 06:33 AM, Harry van Haaren wrote:
>> Fix a misinterpretation of VF stats in ixgbe
>>
>> Signed-off-by: Harry van Haaren 
>> ---
>>   drivers/net/ixgbe/ixgbe_ethdev.c | 8 ++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
>> b/drivers/net/ixgbe/ixgbe_ethdev.c
>> index ec2918c..d226e8d 100644
>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
>> @@ -329,10 +329,14 @@ static int 
>> ixgbe_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
>>   /*
>>* Define VF Stats MACRO for Non "cleared on read" register
>>*/
>> -#define UPDATE_VF_STAT(reg, last, cur)\
>> +#define UPDATE_VF_STAT(reg, last, cur) \
>> { \
>>   uint32_t latest = IXGBE_READ_REG(hw, reg);  \
>> -cur += latest - last;   \
>> +if(likely(latest > last)) { \
>> +cur += latest - last;   \
>> +} else {\
>> +cur += (UINT_MAX - last) + latest;  \
>> +}   \
>>   last = latest;  \
>>   }
>
> From what I can tell your math is adding an off by one error.  You 
> should probably be using UINT_MAX as a mask for the result, not as a 
> part of the calculation itself.
>
> So the correct way to compute this would be "cur += (latest - last) & 
> UINT_MAX".  Also the mask approach should be faster as it avoids any 
> conditional jumps.
>
> - Alex
> .
>

-- 
  
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmelton at cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |



[dpdk-dev] [PATCH 2/2] igb: fix VF statistic wraparound handling macro

2015-10-12 Thread Roger B. Melton
ack

On 10/12/15 9:33 AM, Harry van Haaren wrote:
> Fix a misinterpreatation of VF statistic macro in e1000/igb.
>
> Signed-off-by: Harry van Haaren 
> ---
>   drivers/net/e1000/igb_ethdev.c | 6 +-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
> index 848ef6e..e3f7402 100644
> --- a/drivers/net/e1000/igb_ethdev.c
> +++ b/drivers/net/e1000/igb_ethdev.c
> @@ -246,7 +246,11 @@ static void eth_igb_configure_msix_intr(struct 
> rte_eth_dev *dev);
>   #define UPDATE_VF_STAT(reg, last, cur)\
>   { \
>   u32 latest = E1000_READ_REG(hw, reg); \
> - cur += latest - last; \
> + if(likely(latest > last)) {   \
> + cur += latest - last; \
> + } else {  \
> + cur += (UINT_MAX - last) + latest;\
> + } \
>   last = latest;\
>   }
>   

-- 
  ____
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmelton at cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |



[dpdk-dev] [PATCH 1/2] ixgbe: fix VF statistic wraparound handling macro

2015-10-12 Thread Roger B. Melton
ack

On 10/12/15 9:33 AM, Harry van Haaren wrote:
> Fix a misinterpretation of VF stats in ixgbe
>
> Signed-off-by: Harry van Haaren 
> ---
>   drivers/net/ixgbe/ixgbe_ethdev.c | 8 ++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index ec2918c..d226e8d 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -329,10 +329,14 @@ static int ixgbe_timesync_read_tx_timestamp(struct 
> rte_eth_dev *dev,
>   /*
>* Define VF Stats MACRO for Non "cleared on read" register
>*/
> -#define UPDATE_VF_STAT(reg, last, cur)   \
> +#define UPDATE_VF_STAT(reg, last, cur)  \
>   {   \
>   uint32_t latest = IXGBE_READ_REG(hw, reg);  \
> - cur += latest - last;   \
> + if(likely(latest > last)) { \
> + cur += latest - last;   \
> + } else {\
> + cur += (UINT_MAX - last) + latest;  \
> + }   \
>   last = latest;      \
>   }
>   

-- 
  
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmelton at cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |



[dpdk-dev] [PATCH 4/4] ethdev: check support for rx_queue_count and descriptor_done fns

2015-06-12 Thread Roger B. Melton
Hi Bruce,  Comment in-line.  Regards, Roger

On 6/12/15 7:28 AM, Bruce Richardson wrote:
> The functions rte_eth_rx_queue_count and rte_eth_descriptor_done are
> supported by very few PMDs. Therefore, it is best to check for support
> for the functions in the ethdev library, so as to avoid crashes
> at run-time if the application goes to use those APIs. The performance
> impact of this change should be very small as this is a predictable
> branch in the function.
>
> Signed-off-by: Bruce Richardson 
> ---
>   lib/librte_ether/rte_ethdev.h | 8 ++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 827ca3e..9ad1b6a 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -2496,6 +2496,8 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
>*  The queue id on the specific port.
>* @return
>*  The number of used descriptors in the specific queue.
> + *  NOTE: if function is not supported by device this call
> + *returns (uint32_t)-ENOTSUP
>*/
>   static inline uint32_t

Why not change the return type to int32_t?
In this way, the caller isn't required to make the assumption that a 
large queue count indicates an error.  < 0 means error, other wise it's 
a valid queue count.

This approach would be consistent with other APIs.

>   rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
> @@ -2507,8 +2509,9 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t 
> queue_id)
>   RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>   return 0;
>   }
> - RTE_ETH_FPTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
>   #endif
> + RTE_ETH_FPTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 
> (uint32_t)-ENOTSUP);
> +
>   return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
>   }
>   
> @@ -2525,6 +2528,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t 
> queue_id)
>*  - (1) if the specific DD bit is set.
>*  - (0) if the specific DD bit is not set.
>*  - (-ENODEV) if *port_id* invalid.
> + *  - (-ENOTSUP) if the device does not support this function
>*/
>   static inline int
>   rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t queue_id, uint16_t 
> offset)
> @@ -2536,8 +2540,8 @@ rte_eth_rx_descriptor_done(uint8_t port_id, uint16_t 
> queue_id, uint16_t offset)
>   RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>   return -ENODEV;
>   }
> - RTE_ETH_FPTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_done, -ENOTSUP);
>   #endif
> + RTE_ETH_FPTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_done, -ENOTSUP);
>   
>   return (*dev->dev_ops->rx_descriptor_done)( \
>   dev->data->rx_queues[queue_id], offset);

-- 
  
|Roger B. Melton|  |  Cisco Systems  |
|CPP Software  :|::|: 7100 Kit Creek Rd  |
|+1.919.476.2332 phone:|||:  :|||:RTP, NC 27709-4987 |
|+1.919.392.1094 fax   .:|||:..:|||:. rmelton at cisco.com  |
||
| This email may contain confidential and privileged material for the|
| sole use of the intended recipient. Any review, use, distribution  |
| or disclosure by others is strictly prohibited. If you are not the |
| intended recipient (or authorized to receive for the recipient),   |
| please contact the sender by reply email and delete all copies of  |
| this message.  |
||
| For corporate legal information go to: |
| http://www.cisco.com/web/about/doing_business/legal/cri/index.html |
|__ http://www.cisco.com |