[dpdk-dev] [PATCH v2] app/testpmd Fix max_socket detection

2016-01-15 Thread Stephen Hurd
Yes, this is causing a bug which prevents cross-socket testing, per michael.qiu 
at intel.com on Fri 12/18/2015 5:33 PM (I was unable to find the email on the list):

>> Hi, Stephen
>> 
>> I just saw this patch and found an issue with it.
>> 
>> When I start testpmd with -c 0x3 but with socket-num 1, that means the
>> lcores run on socket 0 but I want hugepages allocated on socket 1.
>> Previously this worked; I don't know why you force it onto the socket
>> where the lcores are located. I would prefer a warning instead of a
>> failure, just like before:
>> 
>> EAL: Requesting 512 pages of size 2MB from socket 1
>> EAL: TSC frequency is ~2294689 KHz
>> EAL: WARNING: Master core has no memory on local socket!
>> 
>> After your patch:
>> 
>> EAL: No probed ethernet devices
>> Interactive-mode selected
>> EAL: Error - exiting with code: 1
>>  Cause: The socket number should be < 1
>>

His example command was:
./testpmd -c 0x3 -n 4 -- -i --socket-num=1
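
The two behaviours discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions, not the testpmd code: the lcore_info table and both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one logical core. */
struct lcore_info {
	unsigned int socket_id; /* socket the core sits on */
	int enabled;            /* non-zero if selected by the core mask */
};

/* Old behaviour: highest socket that has at least one enabled lcore. */
static unsigned int
max_socket_enabled_only(const struct lcore_info *lc, size_t n)
{
	unsigned int max_socket = 0;
	size_t i;

	assert(lc != NULL);
	for (i = 0; i < n; i++) {
		if (lc[i].enabled && lc[i].socket_id > max_socket)
			max_socket = lc[i].socket_id;
	}
	return max_socket;
}

/* Intended behaviour: highest socket present, enabled or not. */
static unsigned int
max_socket_any(const struct lcore_info *lc, size_t n)
{
	unsigned int max_socket = 0;
	size_t i;

	assert(lc != NULL);
	for (i = 0; i < n; i++) {
		if (lc[i].socket_id > max_socket)
			max_socket = lc[i].socket_id;
	}
	return max_socket;
}
```

On an assumed two-socket machine where only the socket 0 cores are enabled, the first helper reports socket 0 as the maximum, so a request for socket 1 is rejected; the second reports socket 1.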


-- Stephen Hurd


-----Original Message-----
From: Bruce Richardson [mailto:bruce.richard...@intel.com] 
Sent: Thursday, January 14, 2016 5:44 AM
To: Stephen Hurd
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] app/testpmd Fix max_socket detection

On Wed, Jan 13, 2016 at 02:23:36PM -0800, Stephen Hurd wrote:
> Previously, max_socket was set to the highest numbered socket with
> an enabled lcore.  The intent is to set it to the highest socket
> regardless of it being enabled.
> 

Can you clarify why this change is necessary? Is it causing a bug somewhere?

thanks,
/Bruce




[dpdk-dev] [PATCH v2] app/testpmd Fix max_socket detection

2016-01-15 Thread Stephen Hurd
Found it...
http://www.dpdk.org/ml/archives/dev/2015-December/030564.html


-- Stephen Hurd


-----Original Message-----
From: Bruce Richardson [mailto:bruce.richard...@intel.com] 
Sent: Thursday, January 14, 2016 5:44 AM
To: Stephen Hurd
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] app/testpmd Fix max_socket detection

On Wed, Jan 13, 2016 at 02:23:36PM -0800, Stephen Hurd wrote:
> Previously, max_socket was set to the highest numbered socket with
> an enabled lcore.  The intent is to set it to the highest socket
> regardless of it being enabled.
> 

Can you clarify why this change is necessary? Is it causing a bug somewhere?

thanks,
/Bruce




[dpdk-dev] [RFC v2 1/2] ethdev: add packet filter flow and new behavior switch to fdir

2016-01-15 Thread Wu, Jingjing
Hi, Rahul

> This approach seems generic enough to allow any vendor specific data
> to be passed in filter as well.  However, 80 seems to be too low for
> multiple flow types that can be combined in the same filter rule.
> I think size of 256 seems reasonable.
>
Yes, 80 is just an example. 
> Could the same thing be done for action arguments as well? Can we add
> the same generic info to rte_eth_fdir_action too?
> 
> struct rte_eth_fdir_action {
>   uint16_t rx_queue;
>   enum rte_eth_fdir_behavior behavior;
>   enum rte_eth_fdir_status report_status;
>   uint8_t flex_off;
> +   uint8_t behavior_arg[256];
> };
> 
> This way, we can pass vendor specific action arguments too. What do
> you think?
Yes, it also makes sense.
> Also, now if we take this approach then, I am wondering, that all
> vendors would need to document their own vendor-specific format of
> taking filter match and filter action arguments, right?
> 
> And probably, even come up with their own example application showing
> how to apply filters via dpdk on their card?
Yes, I think it would be better to document it or provide an example. Even today,
different kinds of NIC may need different configuration.
Or you could add a description (how to configure it) in your driver's commit log?
I'm not sure what others think.

Thanks
Jingjing
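
For illustration, passing vendor-specific action arguments through an opaque byte array could look like the sketch below. It is not the ethdev API: fdir_action_sketch is a simplified stand-in for the struct quoted above (the enum fields are replaced by plain integers), and set_behavior_arg() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BEHAVIOR_ARG_LEN 256

/* Simplified stand-in for the action struct proposed in the thread. */
struct fdir_action_sketch {
	uint16_t rx_queue;
	int behavior;
	int report_status;
	uint8_t flex_off;
	uint8_t behavior_arg[BEHAVIOR_ARG_LEN]; /* opaque, vendor-defined */
};

/* Copy a vendor-defined argument blob into the action; reject blobs
 * that do not fit in the fixed-size array. */
static int
set_behavior_arg(struct fdir_action_sketch *a, const void *arg, size_t len)
{
	assert(a != NULL && arg != NULL);
	if (len > sizeof(a->behavior_arg))
		return -1;
	memset(a->behavior_arg, 0, sizeof(a->behavior_arg));
	memcpy(a->behavior_arg, arg, len);
	return 0;
}
```

The layout of the blob is exactly what each vendor would have to document, which is the concern raised above.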


[dpdk-dev] [PATCH 00/29] i40e base driver update

2016-01-15 Thread Helin Zhang
The i40e base driver is updated to support new X722 device IDs and to
use rx control AQ commands to read/write rx control registers.
In addition, fixes and enhancements are included, as listed below.

Helin Zhang (29):
  i40e/base: use explicit cast from u16 to u8
  i40e/base: Acquire NVM, before issuing an AQ read nvm command
  i40e/base: add hw flag for doing the SRCTL access using AQ for X722
  i40e/base: add changes in nvm read to support X722
  i40e/base: Limit DCB FW version checks to XL710/X710 devices
  i40e/base: check for stopped admin queue
  i40e/base: set aq count after memory allocation
  i40e/base: clean event descriptor before use
  i40e/base: add new device IDs and delete deprecated one
  i40e/base: fix up recent proxy and wol bits for X722_SUPPORT
  i40e/base: define function capabilities in only one place
  i40e/base: Fix for PHY NVM interaction problem
  i40e/base: set shared bit for multicast filters
  i40e/base: add APIs to Add/remove port mirroring rules
  i40e/base: add VEB stat control and remove L2 cloud filter
  i40e/base: implement the API function for aq_set_switch_config
  i40e/base: Add functions to blink led on Coppervale PHY
  i40e/base: When in promisc mode apply promisc mode to Tx Traffic as well
  i40e/base: Increase timeout when checking GLGEN_RSTAT_DEVSTATE bit
  i40e/base: Save off VSI resource count when updating VSI
  i40e/base: coding style fixes
  i40e/base: use FW to read/write rx control registers
  i40e/base: expose some registers to program parser, FD and RSS logic
  i40e/base: Add a Virtchnl offload for RSS PCTYPE V2
  i40e/base: add AQ thermal sensor control struct
  i40e/base: add/update structure and macro definitions
  i40e: add base driver release info
  i40e: add/remove new device IDs
  i40e: use rx control function for rx control registers

 doc/guides/rel_notes/release_2_3.rst            |  15 +
 drivers/net/i40e/Makefile                       |   1 +
 drivers/net/i40e/base/i40e_adminq.c             |  27 +-
 drivers/net/i40e/base/i40e_adminq_cmd.h         | 234 +++---
 drivers/net/i40e/base/i40e_common.c             | 942 +---
 drivers/net/i40e/base/i40e_dcb.c                |  34 +-
 drivers/net/i40e/base/i40e_devids.h             |  10 +-
 drivers/net/i40e/base/i40e_lan_hmc.c            |   4 +-
 drivers/net/i40e/base/i40e_nvm.c                | 142 +++-
 drivers/net/i40e/base/i40e_osdep.h              |  36 +
 drivers/net/i40e/base/i40e_prototype.h          |  48 +-
 drivers/net/i40e/base/i40e_register.h           |  48 ++
 drivers/net/i40e/base/i40e_type.h               |  24 +-
 drivers/net/i40e/base/i40e_virtchnl.h           |   1 +
 drivers/net/i40e/i40e_ethdev.c                  |  81 +-
 drivers/net/i40e/i40e_ethdev_vf.c               |  28 +-
 drivers/net/i40e/i40e_fdir.c                    |  13 +-
 drivers/net/i40e/i40e_pf.c                      |   6 +-
 lib/librte_eal/common/include/rte_pci_dev_ids.h |   8 +-
 19 files changed, 1385 insertions(+), 317 deletions(-)

-- 
1.9.3



[dpdk-dev] [PATCH 01/29] i40e/base: use explicit cast from u16 to u8

2016-01-15 Thread Helin Zhang
The current implementation generates compilation warnings because the
mask value is implicitly narrowed to u8; cast it explicitly.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_lan_hmc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_lan_hmc.c b/drivers/net/i40e/base/i40e_lan_hmc.c
index 6511767..2260648 100644
--- a/drivers/net/i40e/base/i40e_lan_hmc.c
+++ b/drivers/net/i40e/base/i40e_lan_hmc.c
@@ -769,7 +769,7 @@ static void i40e_write_byte(u8 *hmc_bits,

/* prepare the bits and mask */
shift_width = ce_info->lsb % 8;
-   mask = BIT(ce_info->width) - 1;
+   mask = (u8)(BIT(ce_info->width) - 1);

src_byte = *from;
src_byte &= mask;
@@ -954,7 +954,7 @@ static void i40e_read_byte(u8 *hmc_bits,

/* prepare the bits and mask */
shift_width = ce_info->lsb % 8;
-   mask = BIT(ce_info->width) - 1;
+   mask = (u8)(BIT(ce_info->width) - 1);

/* shift to correct alignment */
mask <<= shift_width;
-- 
1.9.3
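
The narrowing that this patch makes explicit can be shown in isolation. This is a minimal sketch: BIT() here is a stand-in macro assumed to produce a value wider than 8 bits, and byte_mask() is a hypothetical helper, not a base driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the base driver's BIT() macro; assumed to yield a value
 * wider than 8 bits, which is what triggers the narrowing warning. */
#define BIT(n) (1UL << (n))

/* Build a mask of 'width' low bits for a field that fits in one byte. */
static uint8_t
byte_mask(unsigned int width)
{
	assert(width <= 8);
	/* Explicit cast: the truncation to 8 bits is intentional. */
	return (uint8_t)(BIT(width) - 1);
}
```

The cast does not change the computed value for widths up to 8; it only tells the compiler that the truncation is deliberate.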



[dpdk-dev] [PATCH 02/29] i40e/base: Acquire NVM, before issuing an AQ read nvm command

2016-01-15 Thread Helin Zhang
The driver needs to acquire the NVM before issuing an AQ read to the
X722 NVM; otherwise it will get EBUSY from the FW. The NVM is released
when the read is done.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_nvm.c | 35 +--
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_nvm.c b/drivers/net/i40e/base/i40e_nvm.c
index 60f2bb9..bfa3315 100644
--- a/drivers/net/i40e/base/i40e_nvm.c
+++ b/drivers/net/i40e/base/i40e_nvm.c
@@ -217,11 +217,22 @@ static enum i40e_status_code i40e_poll_sr_srctl_done_bit(struct i40e_hw *hw)
 enum i40e_status_code i40e_read_nvm_word(struct i40e_hw *hw, u16 offset,
 u16 *data)
 {
+   enum i40e_status_code ret_code = I40E_SUCCESS;
+
 #ifdef X722_SUPPORT
-   if (hw->mac.type == I40E_MAC_X722)
-   return i40e_read_nvm_word_aq(hw, offset, data);
+   if (hw->mac.type == I40E_MAC_X722) {
+   ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ);
+   if (!ret_code) {
+   ret_code = i40e_read_nvm_word_aq(hw, offset, data);
+   i40e_release_nvm(hw);
+   }
+   } else {
+   ret_code = i40e_read_nvm_word_srctl(hw, offset, data);
+   }
+#else
+   ret_code = i40e_read_nvm_word_srctl(hw, offset, data);
 #endif
-   return i40e_read_nvm_word_srctl(hw, offset, data);
+   return ret_code;
 }

 /**
@@ -309,11 +320,23 @@ enum i40e_status_code i40e_read_nvm_word_aq(struct i40e_hw *hw, u16 offset,
 enum i40e_status_code i40e_read_nvm_buffer(struct i40e_hw *hw, u16 offset,
   u16 *words, u16 *data)
 {
+   enum i40e_status_code ret_code = I40E_SUCCESS;
+
 #ifdef X722_SUPPORT
-   if (hw->mac.type == I40E_MAC_X722)
-   return i40e_read_nvm_buffer_aq(hw, offset, words, data);
+   if (hw->mac.type == I40E_MAC_X722) {
+   ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ);
+   if (!ret_code) {
+   ret_code = i40e_read_nvm_buffer_aq(hw, offset, words,
+  data);
+   i40e_release_nvm(hw);
+   }
+   } else {
+   ret_code = i40e_read_nvm_buffer_srctl(hw, offset, words, data);
+   }
+#else
+   ret_code = i40e_read_nvm_buffer_srctl(hw, offset, words, data);
 #endif
-   return i40e_read_nvm_buffer_srctl(hw, offset, words, data);
+   return ret_code;
 }

 /**
-- 
1.9.3
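
The acquire, read, release sequence added by this patch follows a common pattern, sketched below with mock helpers. Everything here (fake_hw and the fake_* functions) is hypothetical and only models the control flow, not the firmware interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware state. */
struct fake_hw {
	int nvm_held;      /* 1 while the NVM resource is owned */
	int acquire_fails; /* force the acquire step to fail */
	uint16_t word;     /* value the fake NVM returns */
};

static int fake_acquire(struct fake_hw *hw)
{
	if (hw->acquire_fails)
		return -1;
	hw->nvm_held = 1;
	return 0;
}

static void fake_release(struct fake_hw *hw)
{
	hw->nvm_held = 0;
}

static int fake_read_aq(struct fake_hw *hw, uint16_t *data)
{
	assert(hw->nvm_held); /* a read without ownership would get EBUSY */
	*data = hw->word;
	return 0;
}

/* Acquire, read, and always release once the acquire has succeeded. */
static int read_word_locked(struct fake_hw *hw, uint16_t *data)
{
	int ret = fake_acquire(hw);

	if (!ret) {
		ret = fake_read_aq(hw, data);
		fake_release(hw);
	}
	return ret;
}
```

The release sits inside the success branch of the acquire, so a failed acquire never triggers a release, and a failed read still does.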



[dpdk-dev] [PATCH 03/29] i40e/base: add hw flag for doing the SRCTL access using AQ for X722

2016-01-15 Thread Helin Zhang
Instead of doing the MAC check, use a flag that gets set per
MAC. This way there is less chance of user error, and the
capability can be enabled for multiple MACs in a single place
rather than cluttering the code with MAC checks.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_common.c | 5 +
 drivers/net/i40e/base/i40e_nvm.c    | 4 ++--
 drivers/net/i40e/base/i40e_type.h   | 3 +++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index d7c940d..5e1b39e 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -1032,6 +1032,11 @@ enum i40e_status_code i40e_init_shared_code(struct i40e_hw *hw)
else
hw->pf_id = (u8)(func_rid & 0x7);

+#ifdef X722_SUPPORT
+   if (hw->mac.type == I40E_MAC_X722)
+   hw->flags |= I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE;
+
+#endif
status = i40e_init_nvm(hw);
return status;
 }
diff --git a/drivers/net/i40e/base/i40e_nvm.c b/drivers/net/i40e/base/i40e_nvm.c
index bfa3315..a1b150a 100644
--- a/drivers/net/i40e/base/i40e_nvm.c
+++ b/drivers/net/i40e/base/i40e_nvm.c
@@ -220,7 +220,7 @@ enum i40e_status_code i40e_read_nvm_word(struct i40e_hw *hw, u16 offset,
enum i40e_status_code ret_code = I40E_SUCCESS;

 #ifdef X722_SUPPORT
-   if (hw->mac.type == I40E_MAC_X722) {
+   if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE) {
ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ);
if (!ret_code) {
ret_code = i40e_read_nvm_word_aq(hw, offset, data);
@@ -323,7 +323,7 @@ enum i40e_status_code i40e_read_nvm_buffer(struct i40e_hw *hw, u16 offset,
enum i40e_status_code ret_code = I40E_SUCCESS;

 #ifdef X722_SUPPORT
-   if (hw->mac.type == I40E_MAC_X722) {
+   if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE) {
ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ);
if (!ret_code) {
ret_code = i40e_read_nvm_buffer_aq(hw, offset, words,
diff --git a/drivers/net/i40e/base/i40e_type.h b/drivers/net/i40e/base/i40e_type.h
index 9483884..f566e30 100644
--- a/drivers/net/i40e/base/i40e_type.h
+++ b/drivers/net/i40e/base/i40e_type.h
@@ -658,6 +658,9 @@ struct i40e_hw {
u16 wol_proxy_vsi_seid;

 #endif
+#define I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE BIT_ULL(0)
+   u64 flags;
+
/* debug mask */
u32 debug_mask;
 #ifndef I40E_NDIS_SUPPORT
-- 
1.9.3
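
The design choice in this patch, mapping MAC types to a capability flag once and testing the flag everywhere else, can be sketched as below. The types and names are hypothetical stand-ins; MAC_SKETCH_FUTURE is an invented second MAC type used only to show why a flag scales better than repeated MAC checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MAC types and capability flag, modelled on the patch. */
enum mac_sketch { MAC_SKETCH_XL710, MAC_SKETCH_X722, MAC_SKETCH_FUTURE };
#define FLAG_AQ_SRCTL_ACCESS (1ULL << 0)

struct hw_sketch {
	enum mac_sketch mac;
	uint64_t flags;
};

/* The single place where MAC types are mapped to capabilities. */
static void init_flags(struct hw_sketch *hw)
{
	assert(hw != NULL);
	hw->flags = 0;
	if (hw->mac == MAC_SKETCH_X722 || hw->mac == MAC_SKETCH_FUTURE)
		hw->flags |= FLAG_AQ_SRCTL_ACCESS;
}

/* Callers test the capability, not the MAC type. */
static int uses_aq_for_nvm(const struct hw_sketch *hw)
{
	return (hw->flags & FLAG_AQ_SRCTL_ACCESS) != 0;
}
```

Adding another MAC with the same capability then touches only init_flags(); the read paths stay unchanged.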



[dpdk-dev] [PATCH 04/29] i40e/base: add changes in nvm read to support X722

2016-01-15 Thread Helin Zhang
On X722, NVM reads can't be done through the SRCTL registers; they
require AQ calls, which in turn require holding the NVM lock.
Unfortunately, some paths need to acquire the lock once, perform
many operations, and then release it.
This patch creates unlocked ("unsafe") versions of the read calls so
that they can be called from the paths that need bulk access.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_nvm.c   | 109 ++---
 drivers/net/i40e/base/i40e_prototype.h |   8 ++-
 2 files changed, 92 insertions(+), 25 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_nvm.c b/drivers/net/i40e/base/i40e_nvm.c
index a1b150a..f4e4eaa 100644
--- a/drivers/net/i40e/base/i40e_nvm.c
+++ b/drivers/net/i40e/base/i40e_nvm.c
@@ -53,7 +53,7 @@ enum i40e_status_code i40e_read_nvm_aq(struct i40e_hw *hw, u8 module_pointer,
  * once per NVM initialization, e.g. inside the i40e_init_shared_code().
  * Please notice that the NVM term is used here (& in all methods covered
  * in this file) as an equivalent of the FLASH part mapped into the SR.
- * We are accessing FLASH always thru the Shadow RAM.
+ * We are accessing FLASH always through the Shadow RAM.
  **/
 enum i40e_status_code i40e_init_nvm(struct i40e_hw *hw)
 {
@@ -207,7 +207,7 @@ static enum i40e_status_code i40e_poll_sr_srctl_done_bit(struct i40e_hw *hw)
 }

 /**
- * i40e_read_nvm_word - Reads Shadow RAM
+ * i40e_read_nvm_word - Reads nvm word and acquire lock if necessary
  * @hw: pointer to the HW structure
  * @offset: offset of the Shadow RAM word to read (0x00 - 0x001FFF)
  * @data: word read from the Shadow RAM
@@ -236,6 +236,31 @@ enum i40e_status_code i40e_read_nvm_word(struct i40e_hw *hw, u16 offset,
 }

 /**
+ * __i40e_read_nvm_word - Reads nvm word, assumes caller does the locking
+ * @hw: pointer to the HW structure
+ * @offset: offset of the Shadow RAM word to read (0x00 - 0x001FFF)
+ * @data: word read from the Shadow RAM
+ *
+ * Reads one 16 bit word from the Shadow RAM using the GLNVM_SRCTL register.
+ **/
+enum i40e_status_code __i40e_read_nvm_word(struct i40e_hw *hw,
+  u16 offset,
+  u16 *data)
+{
+   enum i40e_status_code ret_code = I40E_SUCCESS;
+
+#ifdef X722_SUPPORT
+   if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE)
+   ret_code = i40e_read_nvm_word_aq(hw, offset, data);
+   else
+   ret_code = i40e_read_nvm_word_srctl(hw, offset, data);
+#else
+   ret_code = i40e_read_nvm_word_srctl(hw, offset, data);
+#endif
+   return ret_code;
+}
+
+/**
  * i40e_read_nvm_word_srctl - Reads Shadow RAM via SRCTL register
  * @hw: pointer to the HW structure
  * @offset: offset of the Shadow RAM word to read (0x00 - 0x001FFF)
@@ -307,7 +332,35 @@ enum i40e_status_code i40e_read_nvm_word_aq(struct i40e_hw *hw, u16 offset,
 }

 /**
- * i40e_read_nvm_buffer - Reads Shadow RAM buffer
+ * __i40e_read_nvm_buffer - Reads nvm buffer, caller must acquire lock
+ * @hw: pointer to the HW structure
+ * @offset: offset of the Shadow RAM word to read (0x00 - 0x001FFF).
+ * @words: (in) number of words to read; (out) number of words actually read
+ * @data: words read from the Shadow RAM
+ *
+ * Reads 16 bit words (data buffer) from the SR using the i40e_read_nvm_srrd()
+ * method. The buffer read is preceded by the NVM ownership take
+ * and followed by the release.
+ **/
+enum i40e_status_code __i40e_read_nvm_buffer(struct i40e_hw *hw,
+u16 offset,
+u16 *words, u16 *data)
+{
+   enum i40e_status_code ret_code = I40E_SUCCESS;
+
+#ifdef X722_SUPPORT
+   if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE)
+   ret_code = i40e_read_nvm_buffer_aq(hw, offset, words, data);
+   else
+   ret_code = i40e_read_nvm_buffer_srctl(hw, offset, words, data);
+#else
+   ret_code = i40e_read_nvm_buffer_srctl(hw, offset, words, data);
+#endif
+   return ret_code;
+}
+
+/**
+ * i40e_read_nvm_buffer - Reads Shadow RAM buffer and acquire lock if necessary
  * @hw: pointer to the HW structure
  * @offset: offset of the Shadow RAM word to read (0x00 - 0x001FFF).
  * @words: (in) number of words to read; (out) number of words actually read
@@ -327,7 +380,7 @@ enum i40e_status_code i40e_read_nvm_buffer(struct i40e_hw *hw, u16 offset,
ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ);
if (!ret_code) {
ret_code = i40e_read_nvm_buffer_aq(hw, offset, words,
-  data);
+data);
i40e_release_nvm(hw);
}
} else {
@@ -358,7 +411,7 @@ enum i40e_status_code i40e_read_nvm_buffer_srctl(struct i40e_hw *hw, u16 offset,

DEBUGFUNC("i40e_read_nvm_buffer_srctl");

-   /* Loop t

[dpdk-dev] [PATCH 05/29] i40e/base: Limit DCB FW version checks to XL710/X710 devices

2016-01-15 Thread Helin Zhang
XL710/X710 devices require FW version checks to properly handle
DCB configurations from the FW, while other devices (e.g. X722)
do not, so limit these checks to XL710/X710 only.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_dcb.c | 34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_dcb.c b/drivers/net/i40e/base/i40e_dcb.c
index d71387f..26c344f 100644
--- a/drivers/net/i40e/base/i40e_dcb.c
+++ b/drivers/net/i40e/base/i40e_dcb.c
@@ -387,32 +387,40 @@ static void i40e_parse_cee_app_tlv(struct i40e_cee_feat_tlv *tlv,
 {
u16 length, typelength, offset = 0;
struct i40e_cee_app_prio *app;
-   u8 i, up, selector;
+   u8 i;

typelength = I40E_NTOHS(tlv->hdr.typelen);
length = (u16)((typelength & I40E_LLDP_TLV_LEN_MASK) >>
   I40E_LLDP_TLV_LEN_SHIFT);

-   dcbcfg->numapps = length/sizeof(*app);
+   dcbcfg->numapps = length / sizeof(*app);
if (!dcbcfg->numapps)
return;

for (i = 0; i < dcbcfg->numapps; i++) {
+   u8 up, selector;
+
app = (struct i40e_cee_app_prio *)(tlv->tlvinfo + offset);
for (up = 0; up < I40E_MAX_USER_PRIORITY; up++) {
if (app->prio_map & BIT(up))
break;
}
dcbcfg->app[i].priority = up;
+
/* Get Selector from lower 2 bits, and convert to IEEE */
selector = (app->upper_oui_sel & I40E_CEE_APP_SELECTOR_MASK);
-   if (selector == I40E_CEE_APP_SEL_ETHTYPE)
+   switch (selector) {
+   case I40E_CEE_APP_SEL_ETHTYPE:
dcbcfg->app[i].selector = I40E_APP_SEL_ETHTYPE;
-   else if (selector == I40E_CEE_APP_SEL_TCPIP)
+   break;
+   case I40E_CEE_APP_SEL_TCPIP:
dcbcfg->app[i].selector = I40E_APP_SEL_TCPIP;
-   else
+   break;
+   default:
/* Keep selector as it is for unknown types */
dcbcfg->app[i].selector = selector;
+   }
+
dcbcfg->app[i].protocolid = I40E_NTOHS(app->protocol);
/* Move to next app */
offset += sizeof(*app);
@@ -822,13 +830,15 @@ enum i40e_status_code i40e_get_dcb_config(struct i40e_hw *hw)
struct i40e_aqc_get_cee_dcb_cfg_resp cee_cfg;
struct i40e_aqc_get_cee_dcb_cfg_v1_resp cee_v1_cfg;

-   /* If Firmware version < v4.33 IEEE only */
-   if (((hw->aq.fw_maj_ver == 4) && (hw->aq.fw_min_ver < 33)) ||
-   (hw->aq.fw_maj_ver < 4))
+   /* If Firmware version < v4.33 on X710/XL710, IEEE only */
+   if ((hw->mac.type == I40E_MAC_XL710) &&
+   (((hw->aq.fw_maj_ver == 4) && (hw->aq.fw_min_ver < 33)) ||
+ (hw->aq.fw_maj_ver < 4)))
return i40e_get_ieee_dcb_config(hw);

-   /* If Firmware version == v4.33 use old CEE struct */
-   if ((hw->aq.fw_maj_ver == 4) && (hw->aq.fw_min_ver == 33)) {
+   /* If Firmware version == v4.33 on X710/XL710, use old CEE struct */
+   if ((hw->mac.type == I40E_MAC_XL710) &&
+   ((hw->aq.fw_maj_ver == 4) && (hw->aq.fw_min_ver == 33))) {
ret = i40e_aq_get_cee_dcb_config(hw, &cee_v1_cfg,
 sizeof(cee_v1_cfg), NULL);
if (ret == I40E_SUCCESS) {
@@ -1240,14 +1250,12 @@ enum i40e_status_code i40e_dcb_config_to_lldp(u8 *lldpmib, u16 *miblen,
u16 length, offset = 0, tlvid = I40E_TLV_ID_START;
enum i40e_status_code ret = I40E_SUCCESS;
struct i40e_lldp_org_tlv *tlv;
-   u16 type, typelength;
+   u16 typelength;

tlv = (struct i40e_lldp_org_tlv *)lldpmib;
while (1) {
i40e_add_dcb_tlv(tlv, dcbcfg, tlvid++);
typelength = I40E_NTOHS(tlv->typelength);
-   type = (u16)((typelength & I40E_LLDP_TLV_TYPE_MASK) >>
-   I40E_LLDP_TLV_TYPE_SHIFT);
length = (u16)((typelength & I40E_LLDP_TLV_LEN_MASK) >>
I40E_LLDP_TLV_LEN_SHIFT);
if (length)
-- 
1.9.3
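
The firmware version gating in i40e_get_dcb_config() after this patch can be condensed into the decision function below. It is a sketch under assumptions: the enum names and pick_dcb_path() are hypothetical, and the last return value only says that neither special case in the hunk applied, since the rest of the function is not shown here.

```c
#include <assert.h>
#include <stddef.h>

enum mac_sketch { MAC_SKETCH_XL710, MAC_SKETCH_X722 };
enum dcb_path { DCB_IEEE_ONLY, DCB_OLD_CEE_STRUCT, DCB_NO_SPECIAL_CASE };

struct fw_version {
	unsigned int maj;
	unsigned int min;
};

static enum dcb_path
pick_dcb_path(enum mac_sketch mac, const struct fw_version *fw)
{
	assert(fw != NULL);

	/* The version checks now apply to XL710/X710 only. */
	if (mac == MAC_SKETCH_XL710) {
		if (fw->maj < 4 || (fw->maj == 4 && fw->min < 33))
			return DCB_IEEE_ONLY;      /* FW older than v4.33 */
		if (fw->maj == 4 && fw->min == 33)
			return DCB_OLD_CEE_STRUCT; /* FW exactly v4.33 */
	}
	return DCB_NO_SPECIAL_CASE;
}
```

With this shape, an X722 reporting a low firmware version number is no longer forced onto the IEEE-only path.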



[dpdk-dev] [PATCH 06/29] i40e/base: check for stopped admin queue

2016-01-15 Thread Helin Zhang
It's possible that while one context is waiting for the spinlock,
another entity (that owns the spinlock) has shut down the admin queue.
If the waiter then attempts to use the queue, it will panic.
This patch adds a check for this condition on the receive side. This
matches an existing check on the send queue side.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/i40e/base/i40e_adminq.c b/drivers/net/i40e/base/i40e_adminq.c
index 998582c..e1a162e 100644
--- a/drivers/net/i40e/base/i40e_adminq.c
+++ b/drivers/net/i40e/base/i40e_adminq.c
@@ -1035,6 +1035,13 @@ enum i40e_status_code i40e_clean_arq_element(struct i40e_hw *hw,
/* take the lock before we start messing with the ring */
i40e_acquire_spinlock(&hw->aq.arq_spinlock);

+   if (hw->aq.arq.count == 0) {
+   i40e_debug(hw, I40E_DEBUG_AQ_MESSAGE,
+  "AQRX: Admin queue not initialized.\n");
+   ret_code = I40E_ERR_QUEUE_EMPTY;
+   goto clean_arq_element_err;
+   }
+
/* set next_to_use to head */
 #ifdef PF_DRIVER
 #ifdef INTEGRATED_VF
@@ -1113,6 +1120,7 @@ clean_arq_element_out:
/* Set pending if needed, unlock and return */
if (pending != NULL)
*pending = (ntc > ntu ? hw->aq.arq.count : 0) + (ntu - ntc);
+clean_arq_element_err:
i40e_release_spinlock(&hw->aq.arq_spinlock);

 #ifdef PF_DRIVER
-- 
1.9.3
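
The check added here has to run after the lock is taken, and its error path still has to release the lock. A minimal sketch of that control flow, with a plain integer standing in for the spinlock and hypothetical names throughout:

```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_OK = 0, SKETCH_ERR_QUEUE_EMPTY = -1 };

struct ring_sketch {
	int locked;         /* stand-in for the spinlock */
	unsigned int count; /* 0 means the queue has been shut down */
};

static int clean_element(struct ring_sketch *r)
{
	int ret = SKETCH_OK;

	assert(r != NULL);
	r->locked = 1; /* take the lock */

	/* The queue may have been shut down while we waited for the lock. */
	if (r->count == 0) {
		ret = SKETCH_ERR_QUEUE_EMPTY;
		goto err;
	}

	/* ... normal descriptor processing would go here ... */

err:
	r->locked = 0; /* every path releases the lock */
	return ret;
}
```

Checking the count before taking the lock would leave a window in which the owner could still shut the queue down.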



[dpdk-dev] [PATCH 07/29] i40e/base: set aq count after memory allocation

2016-01-15 Thread Helin Zhang
The standard way to check if the AQ is enabled is to look at the
count field. So this field should only be set after memory has been
successfully allocated. To do otherwise is to incite
panic among the populace.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_adminq.c b/drivers/net/i40e/base/i40e_adminq.c
index e1a162e..ee563e4 100644
--- a/drivers/net/i40e/base/i40e_adminq.c
+++ b/drivers/net/i40e/base/i40e_adminq.c
@@ -431,7 +431,6 @@ enum i40e_status_code i40e_init_asq(struct i40e_hw *hw)

hw->aq.asq.next_to_use = 0;
hw->aq.asq.next_to_clean = 0;
-   hw->aq.asq.count = hw->aq.num_asq_entries;

/* allocate the ring memory */
ret_code = i40e_alloc_adminq_asq_ring(hw);
@@ -449,6 +448,7 @@ enum i40e_status_code i40e_init_asq(struct i40e_hw *hw)
goto init_adminq_free_rings;

/* success! */
+   hw->aq.asq.count = hw->aq.num_asq_entries;
goto init_adminq_exit;

 init_adminq_free_rings:
@@ -490,7 +490,6 @@ enum i40e_status_code i40e_init_arq(struct i40e_hw *hw)

hw->aq.arq.next_to_use = 0;
hw->aq.arq.next_to_clean = 0;
-   hw->aq.arq.count = hw->aq.num_arq_entries;

/* allocate the ring memory */
ret_code = i40e_alloc_adminq_arq_ring(hw);
@@ -508,6 +507,7 @@ enum i40e_status_code i40e_init_arq(struct i40e_hw *hw)
goto init_adminq_free_rings;

/* success! */
+   hw->aq.arq.count = hw->aq.num_arq_entries;
goto init_adminq_exit;

 init_adminq_free_rings:
-- 
1.9.3
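
The ordering this patch enforces, publishing the count only after allocation has succeeded, can be sketched as below. queue_sketch and init_queue() are hypothetical, and the fail_alloc parameter exists only to simulate an allocation failure.

```c
#include <assert.h>
#include <stdlib.h>

struct queue_sketch {
	void *ring;
	unsigned int count; /* non-zero means the queue is usable */
};

static int
init_queue(struct queue_sketch *q, unsigned int entries, int fail_alloc)
{
	assert(q != NULL && entries > 0);

	q->count = 0;
	q->ring = fail_alloc ? NULL : calloc(entries, sizeof(unsigned int));
	if (q->ring == NULL)
		return -1;   /* count stays 0: queue reads as disabled */

	q->count = entries;  /* publish only after success */
	return 0;
}
```

Any code that treats a non-zero count as "enabled" then never sees a queue whose ring memory is missing.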



[dpdk-dev] [PATCH 08/29] i40e/base: clean event descriptor before use

2016-01-15 Thread Helin Zhang
In one obscure corner case, it was possible to clear the NVM update
wait flag when no update_done message was actually received. This
patch cleans the event descriptor before use, and moves the opcode
check to where it won't get done if there was no event to clean.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_adminq.c b/drivers/net/i40e/base/i40e_adminq.c
index ee563e4..222add4 100644
--- a/drivers/net/i40e/base/i40e_adminq.c
+++ b/drivers/net/i40e/base/i40e_adminq.c
@@ -1032,6 +1032,9 @@ enum i40e_status_code i40e_clean_arq_element(struct i40e_hw *hw,
u16 flags;
u16 ntu;

+   /* pre-clean the event info */
+   i40e_memset(&e->desc, 0, sizeof(e->desc), I40E_NONDMA_MEM);
+
/* take the lock before we start messing with the ring */
i40e_acquire_spinlock(&hw->aq.arq_spinlock);

@@ -1116,13 +1119,6 @@ enum i40e_status_code i40e_clean_arq_element(struct i40e_hw *hw,
hw->aq.arq.next_to_clean = ntc;
hw->aq.arq.next_to_use = ntu;

-clean_arq_element_out:
-   /* Set pending if needed, unlock and return */
-   if (pending != NULL)
-   *pending = (ntc > ntu ? hw->aq.arq.count : 0) + (ntu - ntc);
-clean_arq_element_err:
-   i40e_release_spinlock(&hw->aq.arq_spinlock);
-
 #ifdef PF_DRIVER
if (i40e_is_nvm_update_op(&e->desc)) {
if (hw->aq.nvm_release_on_done) {
@@ -1145,6 +1141,13 @@ clean_arq_element_err:
}

 #endif
+clean_arq_element_out:
+   /* Set pending if needed, unlock and return */
+   if (pending != NULL)
+   *pending = (ntc > ntu ? hw->aq.arq.count : 0) + (ntu - ntc);
+clean_arq_element_err:
+   i40e_release_spinlock(&hw->aq.arq_spinlock);
+
return ret_code;
 }

-- 
1.9.3
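
The pending-count expression that this patch moves is a wrap-around distance on the ring, from next_to_clean (ntc) to next_to_use (ntu). A worked sketch, with ring_pending() as a hypothetical helper wrapping the same arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* Distance from ntc to ntu on a ring of 'count' entries. When ntc is
 * ahead of ntu the distance wraps, so 'count' is added back in. */
static uint16_t
ring_pending(uint16_t ntc, uint16_t ntu, uint16_t count)
{
	assert(ntc < count && ntu < count);
	/* The operands are promoted to int, so (ntu - ntc) may be
	 * negative before 'count' is added. */
	return (uint16_t)((ntc > ntu ? count : 0) + (ntu - ntc));
}
```

For a ring of 8 entries, ntc = 2 and ntu = 5 give 3, and ntc = 6 and ntu = 1 also give 3 (8 + 1 - 6).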



[dpdk-dev] [PATCH 09/29] i40e/base: add new device IDs and delete deprecated one

2016-01-15 Thread Helin Zhang
Add new device IDs for backplane and QSFP+ adapters, and delete the
deprecated one for backplane.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_common.c | 12 ++--
 drivers/net/i40e/base/i40e_devids.h | 10 +-
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index 5e1b39e..67a5e21 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -58,7 +58,6 @@ STATIC enum i40e_status_code i40e_set_mac_type(struct i40e_hw *hw)
switch (hw->device_id) {
case I40E_DEV_ID_SFP_XL710:
case I40E_DEV_ID_QEMU:
-   case I40E_DEV_ID_KX_A:
case I40E_DEV_ID_KX_B:
case I40E_DEV_ID_KX_C:
case I40E_DEV_ID_QSFP_A:
@@ -74,6 +73,8 @@ STATIC enum i40e_status_code i40e_set_mac_type(struct i40e_hw *hw)
 #ifdef X722_A0_SUPPORT
case I40E_DEV_ID_X722_A0:
 #endif
+   case I40E_DEV_ID_KX_X722:
+   case I40E_DEV_ID_QSFP_X722:
case I40E_DEV_ID_SFP_X722:
case I40E_DEV_ID_1G_BASE_T_X722:
case I40E_DEV_ID_10G_BASE_T_X722:
@@ -81,15 +82,22 @@ STATIC enum i40e_status_code i40e_set_mac_type(struct i40e_hw *hw)
break;
 #endif
 #ifdef X722_SUPPORT
+#if defined(INTEGRATED_VF) || defined(VF_DRIVER)
case I40E_DEV_ID_X722_VF:
case I40E_DEV_ID_X722_VF_HV:
+#ifdef X722_A0_SUPPORT
+   case I40E_DEV_ID_X722_A0_VF:
+#endif
hw->mac.type = I40E_MAC_X722_VF;
break;
-#endif
+#endif /* INTEGRATED_VF || VF_DRIVER */
+#endif /* X722_SUPPORT */
+#if defined(INTEGRATED_VF) || defined(VF_DRIVER)
case I40E_DEV_ID_VF:
case I40E_DEV_ID_VF_HV:
hw->mac.type = I40E_MAC_VF;
break;
+#endif
default:
hw->mac.type = I40E_MAC_GENERIC;
break;
diff --git a/drivers/net/i40e/base/i40e_devids.h b/drivers/net/i40e/base/i40e_devids.h
index 26cfd54..f844340 100644
--- a/drivers/net/i40e/base/i40e_devids.h
+++ b/drivers/net/i40e/base/i40e_devids.h
@@ -40,7 +40,6 @@ POSSIBILITY OF SUCH DAMAGE.
 /* Device IDs */
 #define I40E_DEV_ID_SFP_XL710  0x1572
 #define I40E_DEV_ID_QEMU   0x1574
-#define I40E_DEV_ID_KX_A   0x157F
 #define I40E_DEV_ID_KX_B   0x1580
 #define I40E_DEV_ID_KX_C   0x1581
 #define I40E_DEV_ID_QSFP_A 0x1583
@@ -50,17 +49,26 @@ POSSIBILITY OF SUCH DAMAGE.
 #define I40E_DEV_ID_20G_KR2 0x1587
 #define I40E_DEV_ID_20G_KR2_A  0x1588
 #define I40E_DEV_ID_10G_BASE_T4 0x1589
+#if defined(INTEGRATED_VF) || defined(VF_DRIVER) || defined(I40E_NDIS_SUPPORT)
 #define I40E_DEV_ID_VF 0x154C
 #define I40E_DEV_ID_VF_HV  0x1571
+#endif /* VF_DRIVER */
 #ifdef X722_SUPPORT
 #ifdef X722_A0_SUPPORT
 #define I40E_DEV_ID_X722_A0 0x374C
+#if defined(INTEGRATED_VF) || defined(VF_DRIVER)
+#define I40E_DEV_ID_X722_A0_VF 0x374D
 #endif
+#endif
+#define I40E_DEV_ID_KX_X722 0x37CE
+#define I40E_DEV_ID_QSFP_X722  0x37CF
 #define I40E_DEV_ID_SFP_X722   0x37D0
 #define I40E_DEV_ID_1G_BASE_T_X722 0x37D1
 #define I40E_DEV_ID_10G_BASE_T_X722 0x37D2
+#if defined(INTEGRATED_VF) || defined(VF_DRIVER) || defined(I40E_NDIS_SUPPORT)
 #define I40E_DEV_ID_X722_VF 0x37CD
 #define I40E_DEV_ID_X722_VF_HV 0x37D9
+#endif /* VF_DRIVER */
 #endif /* X722_SUPPORT */

 #define i40e_is_40G_device(d)  ((d) == I40E_DEV_ID_QSFP_A  || \
-- 
1.9.3



[dpdk-dev] [PATCH 10/29] i40e/base: fix up recent proxy and wol bits for X722_SUPPORT

2016-01-15 Thread Helin Zhang
The recently added opcodes should be available only with
X722_SUPPORT, so move them into the #ifdef, and reorder them to be
in numerical order with the rest of the opcodes. Several structs
that were added are unnecessary, so they are removed here.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h | 138 
 drivers/net/i40e/base/i40e_common.c |  14 ++--
 2 files changed, 56 insertions(+), 96 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h b/drivers/net/i40e/base/i40e_adminq_cmd.h
index 1874653..ff6449c 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -139,6 +139,12 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_list_func_capabilities = 0x000A,
i40e_aqc_opc_list_dev_capabilities  = 0x000B,

+#ifdef X722_SUPPORT
+   /* Proxy commands */
+   i40e_aqc_opc_set_proxy_config   = 0x0104,
+   i40e_aqc_opc_set_ns_proxy_table_entry   = 0x0105,
+
+#endif
/* LAA */
i40e_aqc_opc_mac_address_read   = 0x0107,
i40e_aqc_opc_mac_address_write  = 0x0108,
@@ -146,6 +152,12 @@ enum i40e_admin_queue_opc {
/* PXE */
i40e_aqc_opc_clear_pxe_mode = 0x0110,

+#ifdef X722_SUPPORT
+   /* WoL commands */
+   i40e_aqc_opc_set_wol_filter = 0x0120,
+   i40e_aqc_opc_get_wake_reason = 0x0121,
+
+#endif
/* internal switch commands */
i40e_aqc_opc_get_switch_config  = 0x0200,
i40e_aqc_opc_add_statistics = 0x0201,
@@ -270,16 +282,8 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_set_rss_lut = 0x0B03,
i40e_aqc_opc_get_rss_key = 0x0B04,
i40e_aqc_opc_get_rss_lut = 0x0B05,
-
-   /* WoL commands */
-   i40e_aqc_opc_set_wol_filter = 0x0120,
-   i40e_aqc_opc_get_wake_reason = 0x0121,
 #endif

-   /* Proxy commands */
-   i40e_aqc_opc_set_proxy_config   = 0x0104,
-   i40e_aqc_opc_set_ns_proxy_table_entry   = 0x0105,
-
/* Async Events */
i40e_aqc_opc_event_lan_overflow = 0x1001,

@@ -419,6 +423,7 @@ struct i40e_aqc_list_capabilities_element_resp {
 #define I40E_AQ_CAP_ID_OS2BMC_CAP  0x0004
 #define I40E_AQ_CAP_ID_FUNCTIONS_VALID 0x0005
 #define I40E_AQ_CAP_ID_ALTERNATE_RAM   0x0006
+#define I40E_AQ_CAP_ID_WOL_AND_PROXY   0x0008
 #define I40E_AQ_CAP_ID_SRIOV   0x0012
 #define I40E_AQ_CAP_ID_VF  0x0013
 #define I40E_AQ_CAP_ID_VMDQ 0x0014
@@ -567,6 +572,43 @@ struct i40e_aqc_clear_pxe {

 I40E_CHECK_CMD_LENGTH(i40e_aqc_clear_pxe);

+#ifdef X722_SUPPORT
+/* Set WoL Filter (0x0120) */
+
+struct i40e_aqc_set_wol_filter {
+   __le16 filter_index;
+#define I40E_AQC_MAX_NUM_WOL_FILTERS   8
+   __le16 cmd_flags;
+#define I40E_AQC_SET_WOL_FILTER 0x8000
+#define I40E_AQC_SET_WOL_FILTER_NO_TCO_WOL 0x4000
+   __le16 valid_flags;
+#define I40E_AQC_SET_WOL_FILTER_ACTION_VALID   0x8000
+#define I40E_AQC_SET_WOL_FILTER_NO_TCO_ACTION_VALID 0x4000
+   u8 reserved[2];
+   __le32  address_high;
+   __le32  address_low;
+};
+
+I40E_CHECK_CMD_LENGTH(i40e_aqc_set_wol_filter);
+
+/* Get Wake Reason (0x0121) */
+
+struct i40e_aqc_get_wake_reason_completion {
+   u8 reserved_1[2];
+   __le16 wake_reason;
+   u8 reserved_2[12];
+};
+
+I40E_CHECK_CMD_LENGTH(i40e_aqc_get_wake_reason_completion);
+
+struct i40e_aqc_set_wol_filter_data {
+   u8 filter[128];
+   u8 mask[16];
+};
+
+I40E_CHECK_STRUCT_LEN(0x90, i40e_aqc_set_wol_filter_data);
+
+#endif /* X722_SUPPORT */
 /* Switch configuration commands (0x02xx) */

 /* Used by many indirect commands that only pass an seid and a buffer in the
@@ -2426,84 +2468,4 @@ struct i40e_aqc_debug_modify_internals {

 I40E_CHECK_CMD_LENGTH(i40e_aqc_debug_modify_internals);

-#ifdef X722_SUPPORT
-struct i40e_aqc_set_proxy_config {
-   u8 reserved_1[4];
-   u8 reserved_2[4];
-   __le32  address_high;
-   __le32  address_low;
-};
-
-I40E_CHECK_CMD_LENGTH(i40e_aqc_set_proxy_config);
-
-struct i40e_aqc_set_proxy_config_resp {
-   u8 reserved[8];
-   __le32  address_high;
-   __le32  address_low;
-};
-
-I40E_CHECK_CMD_LENGTH(i40e_aqc_set_proxy_config_resp);
-
-struct i40e_aqc_set_ns_proxy_table_entry {
-   u8 reserved_1[4];
-   u8 reserved_2[4];
-   __le32  address_high;
-   __le32  address_low;
-};
-
-I40E_CHECK_CMD_LENGTH(i40e_aqc_set_ns_proxy_table_entry);
-
-struct i40e_aqc_set_ns_proxy_table_entry_resp {
-   u8 reserved_1[4];
-   u8 reserved_2[4];
-   __le32  address_high;
-   __le32  address_low;
-};
-
-I40E_CHECK_CMD_LENGTH(i40e_aqc_set_ns_proxy_table_entry_resp);
-
-struct i40e_aqc_set_wol_filter {
-   __le16 filter_index;
-#define I40E_AQC_MAX_NUM_WOL_FILTERS   8
-   __le16 cmd_flags;
-#define I40E_AQC_SET_WOL_FILTER0x8000
-#define I40E_AQC_SET_WOL_FILTER_N

[dpdk-dev] [PATCH 12/29] i40e/base: Fix for PHY NVM interaction problem

2016-01-15 Thread Helin Zhang
This patch fixes a problem where the NVMUpdate Tool, when
using the PHY NVM feature, gets bad data from the PHY because
of contention on the MDIO interface from get phy capability
calls from the driver during regular operations. The problem
is fixed by checking whether media is available before calling
the get phy capability function, because that bit is not set
when the device is in PHY interaction mode.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_common.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index 8d2f2c7..cc8a63e 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -2607,17 +2607,19 @@ enum i40e_status_code i40e_update_link_info(struct i40e_hw *hw)
if (status)
return status;

-   status = i40e_aq_get_phy_capabilities(hw, false, false, &abilities,
- NULL);
-   if (status)
-   return status;
-
-   memcpy(hw->phy.link_info.module_type, &abilities.module_type,
-   sizeof(hw->phy.link_info.module_type));
+   if (hw->phy.link_info.link_info & I40E_AQ_MEDIA_AVAILABLE) {
+   status = i40e_aq_get_phy_capabilities(hw, false, false,
+ &abilities, NULL);
+   if (status)
+   return status;

+   memcpy(hw->phy.link_info.module_type, &abilities.module_type,
+   sizeof(hw->phy.link_info.module_type));
+   }
return status;
 }

+
 /**
  * i40e_get_link_speed
  * @hw: pointer to the hw struct
-- 
1.9.3
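
The gating described in the patch above can be sketched in isolation as
follows. This is a minimal illustration, not the driver code: the struct,
the function names and the value of SKETCH_MEDIA_AVAILABLE are hypothetical
stand-ins for hw->phy.link_info, i40e_aq_get_phy_capabilities() and
I40E_AQ_MEDIA_AVAILABLE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for I40E_AQ_MEDIA_AVAILABLE; the value is an assumption. */
#define SKETCH_MEDIA_AVAILABLE 0x40u

/* Simplified, hypothetical link state; not the driver's struct i40e_hw. */
struct sketch_link {
	uint8_t link_info;   /* status bits from the last link-status query */
	int phy_queries;     /* how many times the PHY has been queried */
};

/* Stand-in for the PHY capability query; it only counts invocations. */
static int sketch_query_phy(struct sketch_link *link)
{
	link->phy_queries++;
	return 0;
}

/* Query the PHY only when the media-available bit is set. */
int sketch_update_link(struct sketch_link *link)
{
	assert(link != NULL);
	if (link->link_info & SKETCH_MEDIA_AVAILABLE)
		return sketch_query_phy(link);
	return 0; /* bit clear (e.g. PHY interaction mode): skip the query */
}
```

With the bit clear the sketch returns success without touching the PHY,
which mirrors the intent of keeping the MDIO interface free for the
NVMUpdate tool.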



[dpdk-dev] [PATCH 11/29] i40e/base: define function capabilities in only one place

2016-01-15 Thread Helin Zhang
The device capabilities were defined in two places, and neither had
all the definitions. They really belong with the AQ API definition,
so this patch removes the other set of definitions and fills in the
missing item.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h |   1 +
 drivers/net/i40e/base/i40e_common.c | 191 ++--
 2 files changed, 131 insertions(+), 61 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h b/drivers/net/i40e/base/i40e_adminq_cmd.h
index ff6449c..aa11bcd 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -444,6 +444,7 @@ struct i40e_aqc_list_capabilities_element_resp {
 #define I40E_AQ_CAP_ID_LED 0x0061
 #define I40E_AQ_CAP_ID_SDP 0x0062
 #define I40E_AQ_CAP_ID_MDIO            0x0063
+#define I40E_AQ_CAP_ID_WSR_PROT        0x0064
 #define I40E_AQ_CAP_ID_FLEX10  0x00F1
 #define I40E_AQ_CAP_ID_CEM 0x00F2

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index cfe071b..8d2f2c7 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -3342,38 +3342,6 @@ i40e_aq_erase_nvm_exit:
return status;
 }

-#define I40E_DEV_FUNC_CAP_SWITCH_MODE    0x01
-#define I40E_DEV_FUNC_CAP_MGMT_MODE      0x02
-#define I40E_DEV_FUNC_CAP_NPAR           0x03
-#define I40E_DEV_FUNC_CAP_OS2BMC         0x04
-#define I40E_DEV_FUNC_CAP_VALID_FUNC     0x05
-#ifdef X722_SUPPORT
-#define I40E_DEV_FUNC_CAP_WOL_PROXY      0x08
-#endif
-#define I40E_DEV_FUNC_CAP_SRIOV_1_1      0x12
-#define I40E_DEV_FUNC_CAP_VF             0x13
-#define I40E_DEV_FUNC_CAP_VMDQ           0x14
-#define I40E_DEV_FUNC_CAP_802_1_QBG      0x15
-#define I40E_DEV_FUNC_CAP_802_1_QBH      0x16
-#define I40E_DEV_FUNC_CAP_VSI            0x17
-#define I40E_DEV_FUNC_CAP_DCB            0x18
-#define I40E_DEV_FUNC_CAP_FCOE           0x21
-#define I40E_DEV_FUNC_CAP_ISCSI          0x22
-#define I40E_DEV_FUNC_CAP_RSS            0x40
-#define I40E_DEV_FUNC_CAP_RX_QUEUES      0x41
-#define I40E_DEV_FUNC_CAP_TX_QUEUES      0x42
-#define I40E_DEV_FUNC_CAP_MSIX           0x43
-#define I40E_DEV_FUNC_CAP_MSIX_VF        0x44
-#define I40E_DEV_FUNC_CAP_FLOW_DIRECTOR  0x45
-#define I40E_DEV_FUNC_CAP_IEEE_1588      0x46
-#define I40E_DEV_FUNC_CAP_FLEX10         0xF1
-#define I40E_DEV_FUNC_CAP_CEM            0xF2
-#define I40E_DEV_FUNC_CAP_IWARP          0x51
-#define I40E_DEV_FUNC_CAP_LED            0x61
-#define I40E_DEV_FUNC_CAP_SDP            0x62
-#define I40E_DEV_FUNC_CAP_MDIO           0x63
-#define I40E_DEV_FUNC_CAP_WR_CSR_PROT    0x64
-
 /**
  * i40e_parse_discover_capabilities
  * @hw: pointer to the hw struct
@@ -3412,79 +3380,146 @@ STATIC void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
major_rev = cap->major_rev;

switch (id) {
-   case I40E_DEV_FUNC_CAP_SWITCH_MODE:
+   case I40E_AQ_CAP_ID_SWITCH_MODE:
p->switch_mode = number;
+   i40e_debug(hw, I40E_DEBUG_INIT,
+  "HW Capability: Switch mode = %d\n",
+  p->switch_mode);
break;
-   case I40E_DEV_FUNC_CAP_MGMT_MODE:
+   case I40E_AQ_CAP_ID_MNG_MODE:
p->management_mode = number;
+   i40e_debug(hw, I40E_DEBUG_INIT,
+  "HW Capability: Management Mode = %d\n",
+  p->management_mode);
break;
-   case I40E_DEV_FUNC_CAP_NPAR:
+   case I40E_AQ_CAP_ID_NPAR_ACTIVE:
p->npar_enable = number;
+   i40e_debug(hw, I40E_DEBUG_INIT,
+  "HW Capability: NPAR enable = %d\n",
+  p->npar_enable);
break;
-   case I40E_DEV_FUNC_CAP_OS2BMC:
+   case I40E_AQ_CAP_ID_OS2BMC_CAP:
p->os2bmc = number;
+   i40e_debug(hw, I40E_DEBUG_INIT,
+  "HW Capability: OS2BMC = %d\n", p->os2bmc);
break;
-   case I40E_DEV_FUNC_CAP_VALID_FUNC:
+   case I40E_AQ_CAP_ID_FUNCTIONS_VALID:
p->valid_functions = number;
+   i40e_debug(hw, I40E_DEBUG_INIT,
+  "HW Capability: Valid Functions = %d\n",
+  p->valid_functions);
break;
-   case I40E_DEV_FUNC_CAP_SRIOV_1_1:
+   case I40E_AQ_CAP_ID_SRIOV:
if (number == 1)
p->sr_iov_1_1 = true;
+   i40e_debug(hw, I40E_DEBUG_INIT,
+  "HW Capability: SR-IOV = %d\n",
+ 

[dpdk-dev] [PATCH 14/29] i40e/base: add APIs to Add/remove port mirroring rules

2016-01-15 Thread Helin Zhang
This patch implements the functions needed for port mirroring:
adding and deleting a mirror rule, and setting promiscuous
VLAN mode on a VSI when the mirror rule_type is
"VLAN Mirroring".

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_common.c| 162 +
 drivers/net/i40e/base/i40e_prototype.h |  12 +++
 2 files changed, 174 insertions(+)

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index 44855b3..b1d063f 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -2374,6 +2374,37 @@ enum i40e_status_code i40e_aq_set_vsi_broadcast(struct i40e_hw *hw,
 }

 /**
+ * i40e_aq_set_vsi_vlan_promisc - control the VLAN promiscuous setting
+ * @hw: pointer to the hw struct
+ * @seid: vsi number
+ * @enable: set MAC L2 layer unicast promiscuous enable/disable for a given VLAN
+ * @cmd_details: pointer to command details structure or NULL
+ **/
+enum i40e_status_code i40e_aq_set_vsi_vlan_promisc(struct i40e_hw *hw,
+   u16 seid, bool enable,
+   struct i40e_asq_cmd_details *cmd_details)
+{
+   struct i40e_aq_desc desc;
+   struct i40e_aqc_set_vsi_promiscuous_modes *cmd =
+   (struct i40e_aqc_set_vsi_promiscuous_modes *)&desc.params.raw;
+   enum i40e_status_code status;
+   u16 flags = 0;
+
+   i40e_fill_default_direct_cmd_desc(&desc,
+   i40e_aqc_opc_set_vsi_promiscuous_modes);
+   if (enable)
+   flags |= I40E_AQC_SET_VSI_PROMISC_VLAN;
+
+   cmd->promiscuous_flags = CPU_TO_LE16(flags);
+   cmd->valid_flags = CPU_TO_LE16(I40E_AQC_SET_VSI_PROMISC_VLAN);
+   cmd->seid = CPU_TO_LE16(seid);
+
+   status = i40e_asq_send_command(hw, &desc, NULL, 0, cmd_details);
+
+   return status;
+}
+
+/**
  * i40e_get_vsi_params - get VSI configuration info
  * @hw: pointer to the hw struct
  * @vsi_ctx: pointer to a vsi context struct
@@ -2849,6 +2880,137 @@ enum i40e_status_code i40e_aq_remove_macvlan(struct i40e_hw *hw, u16 seid,
 }

 /**
+ * i40e_mirrorrule_op - Internal helper function to add/delete mirror rule
+ * @hw: pointer to the hw struct
+ * @opcode: AQ opcode for add or delete mirror rule
+ * @sw_seid: Switch SEID (to which rule refers)
+ * @rule_type: Rule Type (ingress/egress/VLAN)
+ * @id: Destination VSI SEID or Rule ID
+ * @count: length of the list
+ * @mr_list: list of mirrored VSI SEIDs or VLAN IDs
+ * @cmd_details: pointer to command details structure or NULL
+ * @rule_id: Rule ID returned from FW
+ * @rules_used: Number of rules used in internal switch
+ * @rules_free: Number of rules free in internal switch
+ *
+ * Add/Delete a mirror rule to a specific switch. Mirror rules are supported for
+ * VEBs/VEPA elements only
+ **/
+static enum i40e_status_code i40e_mirrorrule_op(struct i40e_hw *hw,
+   u16 opcode, u16 sw_seid, u16 rule_type, u16 id,
+   u16 count, __le16 *mr_list,
+   struct i40e_asq_cmd_details *cmd_details,
+   u16 *rule_id, u16 *rules_used, u16 *rules_free)
+{
+   struct i40e_aq_desc desc;
+   struct i40e_aqc_add_delete_mirror_rule *cmd =
+   (struct i40e_aqc_add_delete_mirror_rule *)&desc.params.raw;
+   struct i40e_aqc_add_delete_mirror_rule_completion *resp =
+   (struct i40e_aqc_add_delete_mirror_rule_completion *)&desc.params.raw;
+   enum i40e_status_code status;
+   u16 buf_size;
+
+   buf_size = count * sizeof(*mr_list);
+
+   /* prep the rest of the request */
+   i40e_fill_default_direct_cmd_desc(&desc, opcode);
+   cmd->seid = CPU_TO_LE16(sw_seid);
+   cmd->rule_type = CPU_TO_LE16(rule_type &
+I40E_AQC_MIRROR_RULE_TYPE_MASK);
+   cmd->num_entries = CPU_TO_LE16(count);
+   /* Dest VSI for add, rule_id for delete */
+   cmd->destination = CPU_TO_LE16(id);
+   if (mr_list) {
+   desc.flags |= CPU_TO_LE16((u16)(I40E_AQ_FLAG_BUF |
+   I40E_AQ_FLAG_RD));
+   if (buf_size > I40E_AQ_LARGE_BUF)
+   desc.flags |= CPU_TO_LE16((u16)I40E_AQ_FLAG_LB);
+   }
+
+   status = i40e_asq_send_command(hw, &desc, mr_list, buf_size,
+  cmd_details);
+   if (status == I40E_SUCCESS ||
+   hw->aq.asq_last_status == I40E_AQ_RC_ENOSPC) {
+   if (rule_id)
+   *rule_id = LE16_TO_CPU(resp->rule_id);
+   if (rules_used)
+   *rules_used = LE16_TO_CPU(resp->mirror_rules_used);
+   if (rules_free)
+   *rules_free = LE16_TO_CPU(resp->mirror_rules_free);
+   }
+   return status;
+}
+
+/**
+ * i40e_aq_add_mirrorrule - add a mirror rule
+ * @hw: pointer to the hw struct
+ * @sw_seid: Switch SEID (to which rule 

[dpdk-dev] [PATCH 13/29] i40e/base: set shared bit for multicast filters

2016-01-15 Thread Helin Zhang
Add the use of the new Shared MAC filter bit for multicast
and broadcast filters in order to make better use of the
filters available from the device. The FW folks have assured us
that setting this bit on older FW will have no effect, so it
doesn't need a version check.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h | 1 +
 drivers/net/i40e/base/i40e_common.c | 8 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h b/drivers/net/i40e/base/i40e_adminq_cmd.h
index aa11bcd..cd55a36 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -1033,6 +1033,7 @@ struct i40e_aqc_add_macvlan_element_data {
 #define I40E_AQC_MACVLAN_ADD_HASH_MATCH        0x0002
 #define I40E_AQC_MACVLAN_ADD_IGNORE_VLAN       0x0004
 #define I40E_AQC_MACVLAN_ADD_TO_QUEUE          0x0008
+#define I40E_AQC_MACVLAN_ADD_USE_SHARED_MAC    0x0010
__le16  queue_number;
 #define I40E_AQC_MACVLAN_CMD_QUEUE_SHIFT   0
 #define I40E_AQC_MACVLAN_CMD_QUEUE_MASK        (0x7FF << \
diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index cc8a63e..44855b3 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -2777,6 +2777,7 @@ enum i40e_status_code i40e_aq_add_macvlan(struct i40e_hw *hw, u16 seid,
(struct i40e_aqc_macvlan *)&desc.params.raw;
enum i40e_status_code status;
u16 buf_size;
+   int i;

if (count == 0 || !mv_list || !hw)
return I40E_ERR_PARAM;
@@ -2790,12 +2791,17 @@ enum i40e_status_code i40e_aq_add_macvlan(struct i40e_hw *hw, u16 seid,
cmd->seid[1] = 0;
cmd->seid[2] = 0;

+   for (i = 0; i < count; i++)
+   if (I40E_IS_MULTICAST(mv_list[i].mac_addr))
+   mv_list[i].flags |=
+   CPU_TO_LE16(I40E_AQC_MACVLAN_ADD_USE_SHARED_MAC);
+
desc.flags |= CPU_TO_LE16((u16)(I40E_AQ_FLAG_BUF | I40E_AQ_FLAG_RD));
if (buf_size > I40E_AQ_LARGE_BUF)
desc.flags |= CPU_TO_LE16((u16)I40E_AQ_FLAG_LB);

status = i40e_asq_send_command(hw, &desc, mv_list, buf_size,
-   cmd_details);
+  cmd_details);

return status;
 }
-- 
1.9.3
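
As a standalone illustration of the loop added in the patch above, the
sketch below marks every multicast or broadcast entry in a filter list with
a shared-MAC flag. The struct and function names are hypothetical, the flag
value is copied from the patch, and the sketch keeps flags in host byte
order, whereas the driver wraps the constant in CPU_TO_LE16(). The
multicast test (low bit of the first octet) is the standard Ethernet
group-address check and is assumed to match what I40E_IS_MULTICAST does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as I40E_AQC_MACVLAN_ADD_USE_SHARED_MAC in the patch. */
#define SKETCH_ADD_USE_SHARED_MAC 0x0010

/* Hypothetical, simplified filter entry; not the driver's element layout. */
struct sketch_filter {
	uint8_t mac[6];
	uint16_t flags;
};

/* Group (multicast/broadcast) addresses have the low bit of octet 0 set. */
static bool sketch_is_multicast(const uint8_t mac[6])
{
	return (mac[0] & 0x01) != 0;
}

/* Set the shared-MAC flag on every group-address entry in the list. */
void sketch_mark_shared(struct sketch_filter *list, size_t count)
{
	assert(list != NULL || count == 0);
	for (size_t i = 0; i < count; i++)
		if (sketch_is_multicast(list[i].mac))
			list[i].flags |= SKETCH_ADD_USE_SHARED_MAC;
}
```

Unicast entries are left untouched, so only group addresses consume the
shared filter resource.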



[dpdk-dev] [PATCH 15/29] i40e/base: add VEB stat control and remove L2 cloud filter

2016-01-15 Thread Helin Zhang
With the latest firmware, statistics gathering can now be enabled and
disabled in the HW switch, so we need to add a parameter to allow the
driver to set it as desired. At the same time, the L2 cloud filtering
parameter has been removed as it was never used.
Older drivers working with the newer firmware and newer drivers working
with older firmware will not run into problems with these bits as the
defaults are reasonable and there is no overlap in the bit definitions.
Also, newer drivers will be forced to update because of the change in
function call parameters, a reminder that the functionality exists.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h |  3 ++-
 drivers/net/i40e/base/i40e_common.c | 11 ++-
 drivers/net/i40e/base/i40e_prototype.h  |  4 ++--
 drivers/net/i40e/i40e_ethdev.c  |  2 +-
 4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h b/drivers/net/i40e/base/i40e_adminq_cmd.h
index cd55a36..6ec29a0 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -966,7 +966,8 @@ struct i40e_aqc_add_veb {
I40E_AQC_ADD_VEB_PORT_TYPE_SHIFT)
 #define I40E_AQC_ADD_VEB_PORT_TYPE_DEFAULT 0x2
 #define I40E_AQC_ADD_VEB_PORT_TYPE_DATA    0x4
-#define I40E_AQC_ADD_VEB_ENABLE_L2_FILTER  0x8
+#define I40E_AQC_ADD_VEB_ENABLE_L2_FILTER  0x8 /* deprecated */
+#define I40E_AQC_ADD_VEB_ENABLE_DISABLE_STATS  0x10
u8  enable_tcs;
u8  reserved[9];
 };
diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index b1d063f..fdd4de7 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -2682,8 +2682,8 @@ i40e_link_speed_exit:
  * @downlink_seid: the VSI SEID
  * @enabled_tc: bitmap of TCs to be enabled
  * @default_port: true for default port VSI, false for control port
- * @enable_l2_filtering: true to add L2 filter table rules to regular forwarding rules for cloud support
  * @veb_seid: pointer to where to put the resulting VEB SEID
+ * @enable_stats: true to turn on VEB stats
  * @cmd_details: pointer to command details structure or NULL
  *
  * This asks the FW to add a VEB between the uplink and downlink
@@ -2691,8 +2691,8 @@ i40e_link_speed_exit:
  **/
 enum i40e_status_code i40e_aq_add_veb(struct i40e_hw *hw, u16 uplink_seid,
u16 downlink_seid, u8 enabled_tc,
-   bool default_port, bool enable_l2_filtering,
-   u16 *veb_seid,
+   bool default_port, u16 *veb_seid,
+   bool enable_stats,
struct i40e_asq_cmd_details *cmd_details)
 {
struct i40e_aq_desc desc;
@@ -2719,8 +2719,9 @@ enum i40e_status_code i40e_aq_add_veb(struct i40e_hw *hw, u16 uplink_seid,
else
veb_flags |= I40E_AQC_ADD_VEB_PORT_TYPE_DATA;

-   if (enable_l2_filtering)
-   veb_flags |= I40E_AQC_ADD_VEB_ENABLE_L2_FILTER;
+   /* reverse logic here: set the bitflag to disable the stats */
+   if (!enable_stats)
+   veb_flags |= I40E_AQC_ADD_VEB_ENABLE_DISABLE_STATS;

cmd->veb_flags = CPU_TO_LE16(veb_flags);

diff --git a/drivers/net/i40e/base/i40e_prototype.h b/drivers/net/i40e/base/i40e_prototype.h
index b5b8935..81ccc96 100644
--- a/drivers/net/i40e/base/i40e_prototype.h
+++ b/drivers/net/i40e/base/i40e_prototype.h
@@ -179,8 +179,8 @@ enum i40e_status_code i40e_aq_update_vsi_params(struct i40e_hw *hw,
struct i40e_asq_cmd_details *cmd_details);
 enum i40e_status_code i40e_aq_add_veb(struct i40e_hw *hw, u16 uplink_seid,
u16 downlink_seid, u8 enabled_tc,
-   bool default_port, bool enable_l2_filtering,
-   u16 *pveb_seid,
+   bool default_port, u16 *pveb_seid,
+   bool enable_stats,
struct i40e_asq_cmd_details *cmd_details);
 enum i40e_status_code i40e_aq_get_veb_parameters(struct i40e_hw *hw,
u16 veb_seid, u16 *switch_id, bool *floating,
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index bf6220d..7fc211d 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -3636,7 +3636,7 @@ i40e_veb_setup(struct i40e_pf *pf, struct i40e_vsi *vsi)
veb->uplink_seid = vsi->uplink_seid;

ret = i40e_aq_add_veb(hw, veb->uplink_seid, vsi->seid,
-   I40E_DEFAULT_TCMAP, false, false, &veb->seid, NULL);
+   I40E_DEFAULT_TCMAP, false, &veb->seid, false, NULL);

if (ret != I40E_SUCCESS) {
PMD_DRV_LOG(ERR, "Add veb failed, aq_err: %d",
-- 
1.9.3
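
The "reverse logic" noted in the patch above is easy to get backwards, so
here is a small sketch of how the VEB flags word is composed. The function
and macro names are hypothetical; the bit values are the ones shown in the
patch. The real i40e_aq_add_veb() takes more inputs than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as shown in the patch; names shortened for the sketch. */
#define SKETCH_VEB_PORT_TYPE_DEFAULT 0x2
#define SKETCH_VEB_PORT_TYPE_DATA    0x4
#define SKETCH_VEB_DISABLE_STATS     0x10

/* Build the flags word: the stats bit is set to DISABLE statistics. */
uint16_t sketch_veb_flags(bool default_port, bool enable_stats)
{
	uint16_t flags = default_port ? SKETCH_VEB_PORT_TYPE_DEFAULT
				      : SKETCH_VEB_PORT_TYPE_DATA;

	if (!enable_stats)
		flags |= SKETCH_VEB_DISABLE_STATS;

	/* Exactly one of the two port-type bits must be set. */
	assert(((flags & SKETCH_VEB_PORT_TYPE_DEFAULT) != 0) !=
	       ((flags & SKETCH_VEB_PORT_TYPE_DATA) != 0));
	return flags;
}
```

Passing enable_stats = false, as the i40e_ethdev.c caller in the patch
does, therefore sets bit 0x10 in the command.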



[dpdk-dev] [PATCH 16/29] i40e/base: implement the API function for aq_set_switch_config

2016-01-15 Thread Helin Zhang
Add the support code for calling the AdminQ API call
aq_set_switch_config.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h | 12 
 drivers/net/i40e/base/i40e_common.c | 28 
 drivers/net/i40e/base/i40e_prototype.h  |  3 +++
 3 files changed, 43 insertions(+)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h b/drivers/net/i40e/base/i40e_adminq_cmd.h
index 6ec29a0..c84e0ec 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -164,6 +164,7 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_remove_statistics  = 0x0202,
i40e_aqc_opc_set_port_parameters= 0x0203,
i40e_aqc_opc_get_switch_resource_alloc  = 0x0204,
+   i40e_aqc_opc_set_switch_config  = 0x0205,

i40e_aqc_opc_add_vsi= 0x0210,
i40e_aqc_opc_update_vsi_parameters  = 0x0211,
@@ -740,6 +741,17 @@ struct i40e_aqc_switch_resource_alloc_element_resp {

 I40E_CHECK_STRUCT_LEN(0x10, i40e_aqc_switch_resource_alloc_element_resp);

+/* Set Switch Configuration (direct 0x0205) */
+struct i40e_aqc_set_switch_config {
+   __le16  flags;
+#define I40E_AQ_SET_SWITCH_CFG_PROMISC 0x0001
+#define I40E_AQ_SET_SWITCH_CFG_L2_FILTER   0x0002
+   __le16  valid_flags;
+   u8  reserved[12];
+};
+
+I40E_CHECK_CMD_LENGTH(i40e_aqc_set_switch_config);
+
 /* Add VSI (indirect 0x0210)
  *this indirect command uses struct i40e_aqc_vsi_properties_data
  *as the indirect buffer (128 bytes)
diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index fdd4de7..c800fd8 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -2508,6 +2508,34 @@ enum i40e_status_code i40e_aq_get_switch_config(struct i40e_hw *hw,
 }

 /**
+ * i40e_aq_set_switch_config
+ * @hw: pointer to the hardware structure
+ * @flags: bit flag values to set
+ * @valid_flags: which bit flags to set
+ * @cmd_details: pointer to command details structure or NULL
+ *
+ * Set switch configuration bits
+ **/
+enum i40e_status_code i40e_aq_set_switch_config(struct i40e_hw *hw,
+   u16 flags, u16 valid_flags,
+   struct i40e_asq_cmd_details *cmd_details)
+{
+   struct i40e_aq_desc desc;
+   struct i40e_aqc_set_switch_config *scfg =
+   (struct i40e_aqc_set_switch_config *)&desc.params.raw;
+   enum i40e_status_code status;
+
+   i40e_fill_default_direct_cmd_desc(&desc,
+ i40e_aqc_opc_set_switch_config);
+   scfg->flags = CPU_TO_LE16(flags);
+   scfg->valid_flags = CPU_TO_LE16(valid_flags);
+
+   status = i40e_asq_send_command(hw, &desc, NULL, 0, cmd_details);
+
+   return status;
+}
+
+/**
  * i40e_aq_get_firmware_version
  * @hw: pointer to the hw struct
  * @fw_major_version: firmware major version
diff --git a/drivers/net/i40e/base/i40e_prototype.h b/drivers/net/i40e/base/i40e_prototype.h
index 81ccc96..cbe9961 100644
--- a/drivers/net/i40e/base/i40e_prototype.h
+++ b/drivers/net/i40e/base/i40e_prototype.h
@@ -215,6 +215,9 @@ enum i40e_status_code i40e_aq_get_switch_config(struct i40e_hw *hw,
struct i40e_aqc_get_switch_config_resp *buf,
u16 buf_size, u16 *start_seid,
struct i40e_asq_cmd_details *cmd_details);
+enum i40e_status_code i40e_aq_set_switch_config(struct i40e_hw *hw,
+   u16 flags, u16 valid_flags,
+   struct i40e_asq_cmd_details *cmd_details);
 enum i40e_status_code i40e_aq_request_resource(struct i40e_hw *hw,
enum i40e_aq_resources_ids resource,
enum i40e_aq_resource_access_type access,
-- 
1.9.3
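
The patch above documents flags as "bit flag values to set" and valid_flags
as "which bit flags to set". One plausible reading is the usual
value-plus-mask update, sketched below. This is an assumption about how the
firmware interprets the pair, not something the patch states; the function
name is hypothetical and only the two bit values come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values as shown in the patch; names shortened for the sketch. */
#define SKETCH_CFG_PROMISC   0x0001
#define SKETCH_CFG_L2_FILTER 0x0002

/*
 * Assumed semantics: bits selected by valid_flags take their value from
 * flags, all other bits keep their current value.
 */
uint16_t sketch_apply_switch_cfg(uint16_t current, uint16_t flags,
				 uint16_t valid_flags)
{
	uint16_t result = (uint16_t)((current & ~valid_flags) |
				     (flags & valid_flags));

	/* Bits outside the mask must be unchanged. */
	assert((result & ~valid_flags) == (current & ~valid_flags));
	return result;
}
```

Under that reading, a caller that wants to change only the L2 filter bit
would pass I40E_AQ_SET_SWITCH_CFG_L2_FILTER in valid_flags and leave the
promiscuous bit out of the mask.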



[dpdk-dev] [PATCH 17/29] i40e/base: Add functions to blink led on Coppervale PHY

2016-01-15 Thread Helin Zhang
This patch adds functions to blink the LED on devices using
the Coppervale PHY, since the MAC registers used in other
designs do not work in this device configuration.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_common.c| 329 +
 drivers/net/i40e/base/i40e_prototype.h |  13 ++
 drivers/net/i40e/base/i40e_type.h  |  16 ++
 3 files changed, 358 insertions(+)

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index c800fd8..2383153 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -5969,6 +5969,335 @@ enum i40e_status_code i40e_aq_configure_partition_bw(struct i40e_hw *hw,

return status;
 }
+
+/**
+ * i40e_read_phy_register
+ * @hw: pointer to the HW structure
+ * @page: registers page number
+ * @reg: register address in the page
+ * @phy_addr: PHY address on MDIO interface
+ * @value: PHY register value
+ *
+ * Reads specified PHY register value
+ **/
+enum i40e_status_code i40e_read_phy_register(struct i40e_hw *hw,
+u8 page, u16 reg, u8 phy_addr,
+u16 *value)
+{
+   enum i40e_status_code status = I40E_ERR_TIMEOUT;
+   u32 command  = 0;
+   u16 retry = 1000;
+   u8 port_num = (u8)hw->func_caps.mdio_port_num;
+
+   command = (reg << I40E_GLGEN_MSCA_MDIADD_SHIFT) |
+ (page << I40E_GLGEN_MSCA_DEVADD_SHIFT) |
+ (phy_addr << I40E_GLGEN_MSCA_PHYADD_SHIFT) |
+ (I40E_MDIO_OPCODE_ADDRESS) |
+ (I40E_MDIO_STCODE) |
+ (I40E_GLGEN_MSCA_MDICMD_MASK) |
+ (I40E_GLGEN_MSCA_MDIINPROGEN_MASK);
+   wr32(hw, I40E_GLGEN_MSCA(port_num), command);
+   do {
+   command = rd32(hw, I40E_GLGEN_MSCA(port_num));
+   if (!(command & I40E_GLGEN_MSCA_MDICMD_MASK)) {
+   status = I40E_SUCCESS;
+   break;
+   }
+   i40e_usec_delay(10);
+   retry--;
+   } while (retry);
+
+   if (status) {
+   i40e_debug(hw, I40E_DEBUG_PHY,
+  "PHY: Can't write command to external PHY.\n");
+   goto phy_read_end;
+   }
+
+   command = (page << I40E_GLGEN_MSCA_DEVADD_SHIFT) |
+ (phy_addr << I40E_GLGEN_MSCA_PHYADD_SHIFT) |
+ (I40E_MDIO_OPCODE_READ) |
+ (I40E_MDIO_STCODE) |
+ (I40E_GLGEN_MSCA_MDICMD_MASK) |
+ (I40E_GLGEN_MSCA_MDIINPROGEN_MASK);
+   status = I40E_ERR_TIMEOUT;
+   retry = 1000;
+   wr32(hw, I40E_GLGEN_MSCA(port_num), command);
+   do {
+   command = rd32(hw, I40E_GLGEN_MSCA(port_num));
+   if (!(command & I40E_GLGEN_MSCA_MDICMD_MASK)) {
+   status = I40E_SUCCESS;
+   break;
+   }
+   i40e_usec_delay(10);
+   retry--;
+   } while (retry);
+
+   if (!status) {
+   command = rd32(hw, I40E_GLGEN_MSRWD(port_num));
+   *value = (command & I40E_GLGEN_MSRWD_MDIRDDATA_MASK) >>
+I40E_GLGEN_MSRWD_MDIRDDATA_SHIFT;
+   } else {
+   i40e_debug(hw, I40E_DEBUG_PHY,
+  "PHY: Can't read register value from external PHY.\n");
+   }
+
+phy_read_end:
+   return status;
+}
+
+/**
+ * i40e_write_phy_register
+ * @hw: pointer to the HW structure
+ * @page: registers page number
+ * @reg: register address in the page
+ * @phy_addr: PHY address on MDIO interface
+ * @value: PHY register value
+ *
+ * Writes value to specified PHY register
+ **/
+enum i40e_status_code i40e_write_phy_register(struct i40e_hw *hw,
+ u8 page, u16 reg, u8 phy_addr,
+ u16 value)
+{
+   enum i40e_status_code status = I40E_ERR_TIMEOUT;
+   u32 command  = 0;
+   u16 retry = 1000;
+   u8 port_num = (u8)hw->func_caps.mdio_port_num;
+
+   command = (reg << I40E_GLGEN_MSCA_MDIADD_SHIFT) |
+ (page << I40E_GLGEN_MSCA_DEVADD_SHIFT) |
+ (phy_addr << I40E_GLGEN_MSCA_PHYADD_SHIFT) |
+ (I40E_MDIO_OPCODE_ADDRESS) |
+ (I40E_MDIO_STCODE) |
+ (I40E_GLGEN_MSCA_MDICMD_MASK) |
+ (I40E_GLGEN_MSCA_MDIINPROGEN_MASK);
+   wr32(hw, I40E_GLGEN_MSCA(port_num), command);
+   do {
+   command = rd32(hw, I40E_GLGEN_MSCA(port_num));
+   if (!(command & I40E_GLGEN_MSCA_MDICMD_MASK)) {
+   status = I40E_SUCCESS;
+   break;
+   }
+   i40e_usec_delay(10);
+   retry--;
+   } while (retry);
+   if (status) {
+   i40e_debug(hw, I40E_DEBUG_PHY,
+  "PHY: Can't write command to external PHY.\n"

[dpdk-dev] [PATCH 18/29] i40e/base: When in promisc mode apply promisc mode to Tx Traffic as well

2016-01-15 Thread Helin Zhang
In MFP mode, particularly when setting the PF VSI to limited
promiscuous mode, the HW switch was still mirroring the outgoing
packets from other VSIs (VF/VMDq) onto the PF VSI.
With this new bit set, the mirroring doesn't happen any more and so
we are in limited promiscuous on the PF VSI in MFP which is similar
to defport.
An API check is not required, since this bit is reserved for FW API
version < 1.5.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h | 1 +
 drivers/net/i40e/base/i40e_common.c | 9 -
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h b/drivers/net/i40e/base/i40e_adminq_cmd.h
index c84e0ec..165df9b 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -1143,6 +1143,7 @@ struct i40e_aqc_set_vsi_promiscuous_modes {
 #define I40E_AQC_SET_VSI_PROMISC_BROADCAST 0x04
 #define I40E_AQC_SET_VSI_DEFAULT   0x08
 #define I40E_AQC_SET_VSI_PROMISC_VLAN  0x10
+#define I40E_AQC_SET_VSI_PROMISC_TX        0x8000
__le16  seid;
 #define I40E_AQC_VSI_PROM_CMD_SEID_MASK    0x3FF
__le16  vlan_tag;
diff --git a/drivers/net/i40e/base/i40e_common.c 
b/drivers/net/i40e/base/i40e_common.c
index 2383153..a4cf5cf 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -2225,12 +2225,19 @@ enum i40e_status_code 
i40e_aq_set_vsi_unicast_promiscuous(struct i40e_hw *hw,
i40e_fill_default_direct_cmd_desc(&desc,
i40e_aqc_opc_set_vsi_promiscuous_modes);

-   if (set)
+   if (set) {
flags |= I40E_AQC_SET_VSI_PROMISC_UNICAST;
+   if (((hw->aq.api_maj_ver == 1) && (hw->aq.api_min_ver >= 5)) ||
+(hw->aq.api_maj_ver > 1))
+   flags |= I40E_AQC_SET_VSI_PROMISC_TX;
+   }

cmd->promiscuous_flags = CPU_TO_LE16(flags);

cmd->valid_flags = CPU_TO_LE16(I40E_AQC_SET_VSI_PROMISC_UNICAST);
+   if (((hw->aq.api_maj_ver >= 1) && (hw->aq.api_min_ver >= 5)) ||
+(hw->aq.api_maj_ver > 1))
+   cmd->valid_flags |= CPU_TO_LE16(I40E_AQC_SET_VSI_PROMISC_TX);

cmd->seid = CPU_TO_LE16(seid);
status = i40e_asq_send_command(hw, &desc, NULL, 0, cmd_details);
-- 
1.9.3
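
The patch above spells out the "API version 1.5 or later" test twice, once
with == and once with >= on the major version. Both forms give the same
result, but a single helper makes that easier to see. The sketch below is
one way to factor it; the helper and function names are hypothetical, the
TX bit value is from the patch, and the unicast bit value (0x01) is an
assumption since the patch does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PROMISC_UNICAST 0x01    /* assumed value, not shown in the patch */
#define SKETCH_PROMISC_TX      0x8000  /* value from the patch */

/* True when (maj, min) is at least (want_maj, want_min). */
bool sketch_api_at_least(uint16_t maj, uint16_t min,
			 uint16_t want_maj, uint16_t want_min)
{
	return maj > want_maj || (maj == want_maj && min >= want_min);
}

/* Flags for unicast promiscuous; TX bit only on API 1.5 and later. */
uint16_t sketch_unicast_promisc_flags(bool set, uint16_t api_maj,
				      uint16_t api_min)
{
	uint16_t flags = 0;

	if (set) {
		flags |= SKETCH_PROMISC_UNICAST;
		if (sketch_api_at_least(api_maj, api_min, 1, 5))
			flags |= SKETCH_PROMISC_TX;
	}

	/* Disabling promiscuous mode never sets any bit. */
	assert(set || flags == 0);
	return flags;
}
```

The comparison treats a higher major version as newer regardless of the
minor number, which is what both expressions in the patch evaluate to.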



[dpdk-dev] [PATCH 19/29] i40e/base: Increase timeout when checking GLGEN_RSTAT_DEVSTATE bit

2016-01-15 Thread Helin Zhang
When linking with particular PHY types (ex: copper PHY), the amount of
time it takes for the GLGEN_RSTAT_DEVSTATE to be set increases greatly,
which can lead to a timeout and failure to load the driver.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_common.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index a4cf5cf..925bb1c 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -1316,11 +1316,11 @@ enum i40e_status_code i40e_pf_reset(struct i40e_hw *hw)
grst_del = (rd32(hw, I40E_GLGEN_RSTCTL) &
I40E_GLGEN_RSTCTL_GRSTDEL_MASK) >>
I40E_GLGEN_RSTCTL_GRSTDEL_SHIFT;
-#ifdef I40E_ESS_SUPPORT
+
/* It can take upto 15 secs for GRST steady state */
grst_del = grst_del * 20; /* bump it to 16 secs max to be safe */
-#endif
-   for (cnt = 0; cnt < grst_del + 10; cnt++) {
+
+   for (cnt = 0; cnt < grst_del; cnt++) {
reg = rd32(hw, I40E_GLGEN_RSTAT);
if (!(reg & I40E_GLGEN_RSTAT_DEVSTATE_MASK))
break;
-- 
1.9.3
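
The loop touched by the patch above is a bounded poll: read a status
register until a bit clears or a maximum number of iterations is reached.
A generic sketch of that pattern follows. All names are hypothetical, the
register is simulated by a counter, and the per-iteration delay that the
driver performs is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask standing in for I40E_GLGEN_RSTAT_DEVSTATE_MASK. */
#define SKETCH_DEVSTATE_MASK 0x3u

/* Callback type for reading the status register. */
typedef uint32_t (*sketch_read_fn)(void *ctx);

/* Simulated register: reports "busy" for a fixed number of reads. */
struct sketch_fake_reg {
	unsigned busy_reads;
};

uint32_t sketch_fake_read(void *ctx)
{
	struct sketch_fake_reg *reg = ctx;

	if (reg->busy_reads > 0) {
		reg->busy_reads--;
		return SKETCH_DEVSTATE_MASK;
	}
	return 0;
}

/*
 * Poll until (value & mask) is zero. Returns the number of busy polls
 * seen before the bit cleared, or -1 if max_polls was exhausted.
 */
int sketch_wait_clear(sketch_read_fn rd, void *ctx, uint32_t mask,
		      unsigned max_polls)
{
	assert(rd != NULL);
	for (unsigned cnt = 0; cnt < max_polls; cnt++) {
		if (!(rd(ctx) & mask))
			return (int)cnt;
		/* a real driver would sleep here between polls */
	}
	return -1;
}
```

Raising max_polls, as the patch does by scaling grst_del, lengthens the
window before the caller gives up without changing the poll interval.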



[dpdk-dev] [PATCH 20/29] i40e/base: Save off VSI resource count when updating VSI

2016-01-15 Thread Helin Zhang
When updating a VSI, save off the number of allocated and
unallocated VSIs as we do when adding a VSI.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_common.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index 925bb1c..9a0b787 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -2467,6 +2467,9 @@ enum i40e_status_code i40e_aq_update_vsi_params(struct i40e_hw *hw,
struct i40e_aq_desc desc;
struct i40e_aqc_add_get_update_vsi *cmd =
(struct i40e_aqc_add_get_update_vsi *)&desc.params.raw;
+   struct i40e_aqc_add_get_update_vsi_completion *resp =
+   (struct i40e_aqc_add_get_update_vsi_completion *)
+   &desc.params.raw;
enum i40e_status_code status;

i40e_fill_default_direct_cmd_desc(&desc,
@@ -2478,6 +2481,9 @@ enum i40e_status_code i40e_aq_update_vsi_params(struct i40e_hw *hw,
status = i40e_asq_send_command(hw, &desc, &vsi_ctx->info,
sizeof(vsi_ctx->info), cmd_details);

+   vsi_ctx->vsis_allocated = LE16_TO_CPU(resp->vsi_used);
+   vsi_ctx->vsis_unallocated = LE16_TO_CPU(resp->vsi_free);
+
return status;
 }

-- 
1.9.3



[dpdk-dev] [PATCH 21/29] i40e/base: coding style fixes

2016-01-15 Thread Helin Zhang
This patch applies coding style fixes.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_common.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_common.c b/drivers/net/i40e/base/i40e_common.c
index 9a0b787..e94f726 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -1540,9 +1540,11 @@ u32 i40e_led_get(struct i40e_hw *hw)
if (!gpio_val)
continue;

-   /* ignore gpio LED src mode entries related to the activity LEDs */
-   current_mode = ((gpio_val & I40E_GLGEN_GPIO_CTL_LED_MODE_MASK) >>
-   I40E_GLGEN_GPIO_CTL_LED_MODE_SHIFT);
+   /* ignore gpio LED src mode entries related to the activity
+*  LEDs
+*/
+   current_mode = ((gpio_val & I40E_GLGEN_GPIO_CTL_LED_MODE_MASK)
+   >> I40E_GLGEN_GPIO_CTL_LED_MODE_SHIFT);
switch (current_mode) {
case I40E_COMBINED_ACTIVITY:
case I40E_FILTER_ACTIVITY:
@@ -1586,9 +1588,11 @@ void i40e_led_set(struct i40e_hw *hw, u32 mode, bool blink)
if (!gpio_val)
continue;

-   /* ignore gpio LED src mode entries related to the activity LEDs */
-   current_mode = ((gpio_val & I40E_GLGEN_GPIO_CTL_LED_MODE_MASK) >>
-   I40E_GLGEN_GPIO_CTL_LED_MODE_SHIFT);
+   /* ignore gpio LED src mode entries related to the activity
+* LEDs
+*/
+   current_mode = ((gpio_val & I40E_GLGEN_GPIO_CTL_LED_MODE_MASK)
+   >> I40E_GLGEN_GPIO_CTL_LED_MODE_SHIFT);
switch (current_mode) {
case I40E_COMBINED_ACTIVITY:
case I40E_FILTER_ACTIVITY:
@@ -2821,6 +2825,7 @@ enum i40e_status_code i40e_aq_get_veb_parameters(struct i40e_hw *hw,
*vebs_free = LE16_TO_CPU(cmd_resp->vebs_free);
if (floating) {
u16 flags = LE16_TO_CPU(cmd_resp->veb_flags);
+
if (flags & I40E_AQC_ADD_VEB_FLOATING)
*floating = true;
else
@@ -5471,7 +5476,7 @@ void i40e_add_filter_to_drop_tx_flow_control_frames(struct i40e_hw *hw,
u16 ethtype = I40E_FLOW_CONTROL_ETHTYPE;
enum i40e_status_code status;

-   status = i40e_aq_add_rem_control_packet_filter(hw, 0, ethtype, flag,
+   status = i40e_aq_add_rem_control_packet_filter(hw, NULL, ethtype, flag,
   seid, 0, true, NULL,
   NULL);
if (status)
-- 
1.9.3



[dpdk-dev] [PATCH 22/29] i40e/base: use FW to read/write rx control registers

2016-01-15 Thread Helin Zhang
RX control register read/write functions are added, because direct
reads/writes may fail when the Rx unit is under heavy load from
small-packet traffic. After the adminq is ready, all rx control
registers should be read/written through these dedicated functions.
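
The control flow this commit message describes (ask the firmware first, retry while it reports busy, then fall back to a direct register read) can be sketched in isolation. This is a minimal sketch, not the driver code: `struct mock_hw`, `mock_fw_read()`, `mock_direct_read()` and `read_rx_ctl_sketch()` are hypothetical stand-ins, and the 1 ms delay between retries is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not the i40e types. */
enum fw_status { FW_OK = 0, FW_EAGAIN, FW_ERROR };

struct mock_hw {
	unsigned int api_maj_ver; /* firmware API major version */
	unsigned int api_min_ver; /* firmware API minor version */
	int busy_replies;         /* times the mock firmware answers EAGAIN */
	bool fw_broken;           /* make every firmware request fail */
	uint32_t reg_value;       /* contents of the single mock register */
	int fw_calls;             /* firmware requests issued so far */
	int direct_reads;         /* direct register reads issued so far */
};

/* Pretend admin-queue read: may be busy, may fail, else returns the value. */
static enum fw_status mock_fw_read(struct mock_hw *hw, uint32_t *val)
{
	hw->fw_calls++;
	if (hw->fw_broken)
		return FW_ERROR;
	if (hw->busy_replies > 0) {
		hw->busy_replies--;
		return FW_EAGAIN;
	}
	*val = hw->reg_value;
	return FW_OK;
}

/* Pretend direct (MMIO-style) register read. */
static uint32_t mock_direct_read(struct mock_hw *hw)
{
	hw->direct_reads++;
	return hw->reg_value;
}

uint32_t read_rx_ctl_sketch(struct mock_hw *hw)
{
	enum fw_status status = FW_OK;
	bool use_register;
	int retry = 5;
	uint32_t val = 0;

	assert(hw != NULL);

	/* Same version test as the patch: API 1.x with x < 5 reads directly. */
	use_register = (hw->api_maj_ver == 1) && (hw->api_min_ver < 5);
	if (!use_register) {
		/* One initial attempt plus up to five retries while busy. */
		do {
			status = mock_fw_read(hw, &val);
		} while (status == FW_EAGAIN && retry-- > 0);
	}

	/* If the firmware path failed or was skipped, read the register. */
	if (status != FW_OK || use_register)
		val = mock_direct_read(hw);

	return val;
}
```

With a mock that answers busy twice, the sketch issues three firmware requests and no direct read; with API version 1.4 it skips the firmware entirely.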

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h |  16 
 drivers/net/i40e/base/i40e_common.c | 126 +++-
 drivers/net/i40e/base/i40e_osdep.h  |  36 +
 drivers/net/i40e/base/i40e_prototype.h  |   8 ++
 4 files changed, 184 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h 
b/drivers/net/i40e/base/i40e_adminq_cmd.h
index 165df9b..12ebd35 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -165,6 +165,8 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_set_port_parameters= 0x0203,
i40e_aqc_opc_get_switch_resource_alloc  = 0x0204,
i40e_aqc_opc_set_switch_config  = 0x0205,
+   i40e_aqc_opc_rx_ctl_reg_read= 0x0206,
+   i40e_aqc_opc_rx_ctl_reg_write   = 0x0207,

i40e_aqc_opc_add_vsi= 0x0210,
i40e_aqc_opc_update_vsi_parameters  = 0x0211,
@@ -752,6 +754,20 @@ struct i40e_aqc_set_switch_config {

 I40E_CHECK_CMD_LENGTH(i40e_aqc_set_switch_config);

+/* Read Receive control registers  (direct 0x0206)
+ * Write Receive control registers (direct 0x0207)
+ * used for accessing Rx control registers that can be
+ * slow and need special handling when under high Rx load
+ */
+struct i40e_aqc_rx_ctl_reg_read_write {
+   __le32 reserved1;
+   __le32 address;
+   __le32 reserved2;
+   __le32 value;
+};
+
+I40E_CHECK_CMD_LENGTH(i40e_aqc_rx_ctl_reg_read_write);
+
 /* Add VSI (indirect 0x0210)
  *this indirect command uses struct i40e_aqc_vsi_properties_data
  *as the indirect buffer (128 bytes)
diff --git a/drivers/net/i40e/base/i40e_common.c 
b/drivers/net/i40e/base/i40e_common.c
index e94f726..ef3425e 100644
--- a/drivers/net/i40e/base/i40e_common.c
+++ b/drivers/net/i40e/base/i40e_common.c
@@ -5356,7 +5356,7 @@ enum i40e_status_code i40e_set_filter_control(struct 
i40e_hw *hw,
return ret;

/* Read the PF Queue Filter control register */
-   val = rd32(hw, I40E_PFQF_CTL_0);
+   val = i40e_read_rx_ctl(hw, I40E_PFQF_CTL_0);

/* Program required PE hash buckets for the PF */
val &= ~I40E_PFQF_CTL_0_PEHSIZE_MASK;
@@ -5393,7 +5393,7 @@ enum i40e_status_code i40e_set_filter_control(struct 
i40e_hw *hw,
if (settings->enable_macvlan)
val |= I40E_PFQF_CTL_0_MACVLAN_ENA_MASK;

-   wr32(hw, I40E_PFQF_CTL_0, val);
+   i40e_write_rx_ctl(hw, I40E_PFQF_CTL_0, val);

return I40E_SUCCESS;
 }
@@ -6317,6 +6317,128 @@ restore_config:
return status;
 }
 #endif /* PF_DRIVER */
+
+/**
+ * i40e_aq_rx_ctl_read_register - use FW to read from an Rx control register
+ * @hw: pointer to the hw struct
+ * @reg_addr: register address
+ * @reg_val: ptr to register value
+ * @cmd_details: pointer to command details structure or NULL
+ *
+ * Use the firmware to read the Rx control register,
+ * especially useful if the Rx unit is under heavy pressure
+ **/
+enum i40e_status_code i40e_aq_rx_ctl_read_register(struct i40e_hw *hw,
+   u32 reg_addr, u32 *reg_val,
+   struct i40e_asq_cmd_details *cmd_details)
+{
+   struct i40e_aq_desc desc;
+   struct i40e_aqc_rx_ctl_reg_read_write *cmd_resp =
+   (struct i40e_aqc_rx_ctl_reg_read_write *)&desc.params.raw;
+   enum i40e_status_code status;
+
+   if (reg_val == NULL)
+   return I40E_ERR_PARAM;
+
+   i40e_fill_default_direct_cmd_desc(&desc, i40e_aqc_opc_rx_ctl_reg_read);
+
+   cmd_resp->address = CPU_TO_LE32(reg_addr);
+
+   status = i40e_asq_send_command(hw, &desc, NULL, 0, cmd_details);
+
+   if (status == I40E_SUCCESS)
+   *reg_val = LE32_TO_CPU(cmd_resp->value);
+
+   return status;
+}
+
+/**
+ * i40e_read_rx_ctl - read from an Rx control register
+ * @hw: pointer to the hw struct
+ * @reg_addr: register address
+ **/
+u32 i40e_read_rx_ctl(struct i40e_hw *hw, u32 reg_addr)
+{
+   enum i40e_status_code status = I40E_SUCCESS;
+   bool use_register;
+   int retry = 5;
+   u32 val = 0;
+
+   use_register = (hw->aq.api_maj_ver == 1) && (hw->aq.api_min_ver < 5);
+   if (!use_register) {
+do_retry:
+   status = i40e_aq_rx_ctl_read_register(hw, reg_addr, &val, NULL);
+   if (hw->aq.asq_last_status == I40E_AQ_RC_EAGAIN && retry) {
+   i40e_msec_delay(1);
+   retry--;
+   goto do_retry;
+   }
+   }
+
+   /* if the AQ access failed, try the old-fashioned way */
+   if (status || use_register)
+   val = rd32(hw, reg_addr);
+
+   return val;
+}
+
+/**
+ * i40e_aq_r

[dpdk-dev] [PATCH 23/29] i40e/base: expose some registers to program parser, FD and RSS logic

2016-01-15 Thread Helin Zhang
This patch adds 7 new register definitions for programming the
parser, flow director and RSS blocks in the HW.
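
Each register in this patch is described by SHIFT/MASK pairs, one pair per field. As a rough illustration of how such pairs are typically consumed, the sketch below packs and unpacks the three GLQF_ORT fields using the shift and width values visible in the diff (0x1F at bit 0, 0x3 at bit 5, 0x1 at bit 7). `SKETCH_MASK`, `field_get()`, `field_set()` and `ort_pack()` are hypothetical names, and treating the mask macro as "mask shifted left by shift" is an assumption about I40E_MASK.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of the mask macro: the field mask moved into position. */
#define SKETCH_MASK(mask, shift) ((uint32_t)(mask) << (shift))

/* Field layout copied from the GLQF_ORT definitions in the diff. */
#define ORT_PIT_INDX_SHIFT    0
#define ORT_PIT_INDX_MASK     SKETCH_MASK(0x1F, ORT_PIT_INDX_SHIFT)
#define ORT_FIELD_CNT_SHIFT   5
#define ORT_FIELD_CNT_MASK    SKETCH_MASK(0x3, ORT_FIELD_CNT_SHIFT)
#define ORT_FLX_PAYLOAD_SHIFT 7
#define ORT_FLX_PAYLOAD_MASK  SKETCH_MASK(0x1, ORT_FLX_PAYLOAD_SHIFT)

/* Extract one field from a register value. */
uint32_t field_get(uint32_t reg, uint32_t mask, unsigned int shift)
{
	return (reg & mask) >> shift;
}

/* Replace one field in a register value, leaving the other bits alone. */
uint32_t field_set(uint32_t reg, uint32_t mask, unsigned int shift,
		   uint32_t value)
{
	/* The value must fit inside the field. */
	assert(((value << shift) & ~mask) == 0);
	return (reg & ~mask) | ((value << shift) & mask);
}

/* Build a full register value from its three fields. */
uint32_t ort_pack(uint32_t pit_indx, uint32_t field_cnt, uint32_t flx_payload)
{
	uint32_t reg = 0;

	reg = field_set(reg, ORT_PIT_INDX_MASK, ORT_PIT_INDX_SHIFT, pit_indx);
	reg = field_set(reg, ORT_FIELD_CNT_MASK, ORT_FIELD_CNT_SHIFT, field_cnt);
	reg = field_set(reg, ORT_FLX_PAYLOAD_MASK, ORT_FLX_PAYLOAD_SHIFT,
			flx_payload);
	return reg;
}
```

Keeping the shift and the mask as separate macros lets the same two helpers serve every field, at the cost of passing both arguments on each call.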

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_register.h | 48 +++
 drivers/net/i40e/i40e_ethdev.c| 11 ++--
 2 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_register.h 
b/drivers/net/i40e/base/i40e_register.h
index 6e56620..fd0a723 100644
--- a/drivers/net/i40e/base/i40e_register.h
+++ b/drivers/net/i40e/base/i40e_register.h
@@ -2056,6 +2056,14 @@ POSSIBILITY OF SUCH DAMAGE.
 #define I40E_PRTPM_TLPIC  0x001E43C0 /* Reset: GLOBR */
 #define I40E_PRTPM_TLPIC_ETLPIC_SHIFT 0
 #define I40E_PRTPM_TLPIC_ETLPIC_MASK  I40E_MASK(0x, 
I40E_PRTPM_TLPIC_ETLPIC_SHIFT)
+#define I40E_GL_PRS_FVBM(_i) (0x00269760 + ((_i) * 4)) /* 
_i=0...3 */ /* Reset: CORER */
+#define I40E_GL_PRS_FVBM_MAX_INDEX   3
+#define I40E_GL_PRS_FVBM_FV_BYTE_INDX_SHIFT  0
+#define I40E_GL_PRS_FVBM_FV_BYTE_INDX_MASK   I40E_MASK(0x7F, 
I40E_GL_PRS_FVBM_FV_BYTE_INDX_SHIFT)
+#define I40E_GL_PRS_FVBM_RULE_BUS_INDX_SHIFT 8
+#define I40E_GL_PRS_FVBM_RULE_BUS_INDX_MASK  I40E_MASK(0x3F, 
I40E_GL_PRS_FVBM_RULE_BUS_INDX_SHIFT)
+#define I40E_GL_PRS_FVBM_MSK_ENA_SHIFT   31
+#define I40E_GL_PRS_FVBM_MSK_ENA_MASKI40E_MASK(0x1, 
I40E_GL_PRS_FVBM_MSK_ENA_SHIFT)
 #define I40E_GLRPB_DPSS   0x000AC828 /* Reset: CORER */
 #define I40E_GLRPB_DPSS_DPS_TCN_SHIFT 0
 #define I40E_GLRPB_DPSS_DPS_TCN_MASK  I40E_MASK(0xF, 
I40E_GLRPB_DPSS_DPS_TCN_SHIFT)
@@ -2227,6 +2235,14 @@ POSSIBILITY OF SUCH DAMAGE.
 #define I40E_PRTQF_FD_FLXINSET_MAX_INDEX   63
 #define I40E_PRTQF_FD_FLXINSET_INSET_SHIFT 0
 #define I40E_PRTQF_FD_FLXINSET_INSET_MASK  I40E_MASK(0xFF, 
I40E_PRTQF_FD_FLXINSET_INSET_SHIFT)
+#define I40E_PRTQF_FD_INSET(_i, _j)  (0x0025 + ((_i) * 64 + (_j) * 
32)) /* _i=0...63, _j=0...1 */ /* Reset: CORER */
+#define I40E_PRTQF_FD_INSET_MAX_INDEX   63
+#define I40E_PRTQF_FD_INSET_INSET_SHIFT 0
+#define I40E_PRTQF_FD_INSET_INSET_MASK  I40E_MASK(0x, 
I40E_PRTQF_FD_INSET_INSET_SHIFT)
+#define I40E_PRTQF_FD_INSET(_i, _j)  (0x0025 + ((_i) * 64 + (_j) * 
32)) /* _i=0...63, _j=0...1 */ /* Reset: CORER */
+#define I40E_PRTQF_FD_INSET_MAX_INDEX   63
+#define I40E_PRTQF_FD_INSET_INSET_SHIFT 0
+#define I40E_PRTQF_FD_INSET_INSET_MASK  I40E_MASK(0x, 
I40E_PRTQF_FD_INSET_INSET_SHIFT)
 #define I40E_PRTQF_FD_MSK(_i, _j)   (0x00252000 + ((_i) * 64 + (_j) * 32)) 
/* _i=0...63, _j=0...1 */ /* Reset: CORER */
 #define I40E_PRTQF_FD_MSK_MAX_INDEX63
 #define I40E_PRTQF_FD_MSK_MASK_SHIFT   0
@@ -5169,6 +5185,38 @@ POSSIBILITY OF SUCH DAMAGE.
 #define I40E_GLQF_FD_PCTYPES_MAX_INDEX   63
 #define I40E_GLQF_FD_PCTYPES_FD_PCTYPE_SHIFT 0
 #define I40E_GLQF_FD_PCTYPES_FD_PCTYPE_MASK  I40E_MASK(0x3F, 
I40E_GLQF_FD_PCTYPES_FD_PCTYPE_SHIFT)
+#define I40E_GLQF_FD_MSK(_i, _j)   (0x00267200 + ((_i) * 4 + (_j) * 8)) /* 
_i=0...1, _j=0...63 */ /* Reset: CORER */
+#define I40E_GLQF_FD_MSK_MAX_INDEX1
+#define I40E_GLQF_FD_MSK_MASK_SHIFT   0
+#define I40E_GLQF_FD_MSK_MASK_MASKI40E_MASK(0x, 
I40E_GLQF_FD_MSK_MASK_SHIFT)
+#define I40E_GLQF_FD_MSK_OFFSET_SHIFT 16
+#define I40E_GLQF_FD_MSK_OFFSET_MASK  I40E_MASK(0x3F, 
I40E_GLQF_FD_MSK_OFFSET_SHIFT)
+#define I40E_GLQF_HASH_INSET(_i, _j)  (0x00267600 + ((_i) * 4 + (_j) * 8)) 
/* _i=0...1, _j=0...63 */ /* Reset: CORER */
+#define I40E_GLQF_HASH_INSET_MAX_INDEX   1
+#define I40E_GLQF_HASH_INSET_INSET_SHIFT 0
+#define I40E_GLQF_HASH_INSET_INSET_MASK  I40E_MASK(0x, 
I40E_GLQF_HASH_INSET_INSET_SHIFT)
+#define I40E_GLQF_HASH_MSK(_i, _j)   (0x00267A00 + ((_i) * 4 + (_j) * 8)) 
/* _i=0...1, _j=0...63 */ /* Reset: CORER */
+#define I40E_GLQF_HASH_MSK_MAX_INDEX1
+#define I40E_GLQF_HASH_MSK_MASK_SHIFT   0
+#define I40E_GLQF_HASH_MSK_MASK_MASKI40E_MASK(0x, 
I40E_GLQF_HASH_MSK_MASK_SHIFT)
+#define I40E_GLQF_HASH_MSK_OFFSET_SHIFT 16
+#define I40E_GLQF_HASH_MSK_OFFSET_MASK  I40E_MASK(0x3F, 
I40E_GLQF_HASH_MSK_OFFSET_SHIFT)
+#define I40E_GLQF_ORT(_i)   (0x00268900 + ((_i) * 4)) /* _i=0...63 
*/ /* Reset: CORER */
+#define I40E_GLQF_ORT_MAX_INDEX 63
+#define I40E_GLQF_ORT_PIT_INDX_SHIFT0
+#define I40E_GLQF_ORT_PIT_INDX_MASK I40E_MASK(0x1F, 
I40E_GLQF_ORT_PIT_INDX_SHIFT)
+#define I40E_GLQF_ORT_FIELD_CNT_SHIFT   5
+#define I40E_GLQF_ORT_FIELD_CNT_MASKI40E_MASK(0x3, 
I40E_GLQF_ORT_FIELD_CNT_SHIFT)
+#define I40E_GLQF_ORT_FLX_PAYLOAD_SHIFT 7
+#define I40E_GLQF_ORT_FLX_PAYLOAD_MASK  I40E_MASK(0x1, 
I40E_GLQF_ORT_FLX_PAYLOAD_SHIFT)
+#define I40E_GLQF_PIT(_i)  (0x00268C80 + ((_i) * 4)) /* _i=0...23 
*/ /* Reset: CORER */
+#define I40E_GLQF_PIT_MAX_INDEX23
+#define I40E_GLQF_PIT_SOURCE_OFF_SHIFT 0
+#define I40E_GLQF_PIT_SOURCE_OFF_MASK  I40E_MASK(0x1F, 
I40E_GLQF_PIT_SOURCE_OFF_SHIFT)
+#define I40E_GLQF_PIT_FSIZE_SHIFT  5
+#define I40E_GLQF_PIT_FSIZE_MASK   I40E_MASK(0x1F, 

[dpdk-dev] [PATCH 25/29] i40e/base: add AQ thermal sensor control struct

2016-01-15 Thread Helin Zhang
It adds the new AQ command and struct for managing a
thermal sensor.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h 
b/drivers/net/i40e/base/i40e_adminq_cmd.h
index 12ebd35..5236333 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -250,6 +250,7 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_nvm_config_read= 0x0704,
i40e_aqc_opc_nvm_config_write   = 0x0705,
i40e_aqc_opc_oem_post_update= 0x0720,
+   i40e_aqc_opc_thermal_sensor = 0x0721,

/* virtualization commands */
i40e_aqc_opc_send_msg_to_pf = 0x0801,
@@ -2001,6 +2002,22 @@ struct i40e_aqc_nvm_oem_post_update_buffer {

 I40E_CHECK_STRUCT_LEN(0x28, i40e_aqc_nvm_oem_post_update_buffer);

+/* Thermal Sensor (indirect 0x0721)
+ * read or set thermal sensor configs and values
+ * takes a sensor and command specific data buffer, not detailed here
+ */
+struct i40e_aqc_thermal_sensor {
+   u8 sensor_action;
+#define I40E_AQ_THERMAL_SENSOR_READ_CONFIG 0
+#define I40E_AQ_THERMAL_SENSOR_SET_CONFIG  1
+#define I40E_AQ_THERMAL_SENSOR_READ_TEMP   2
+   u8 reserved[7];
+   __le32  addr_high;
+   __le32  addr_low;
+};
+
+I40E_CHECK_CMD_LENGTH(i40e_aqc_thermal_sensor);
+
 /* Send to PF command (indirect 0x0801) id is only used by PF
  * Send to VF command (indirect 0x0802) id is only used by PF
  * Send to Peer PF command (indirect 0x0803)
-- 
1.9.3



[dpdk-dev] [PATCH 26/29] i40e/base: add/update structure and macro definitions

2016-01-15 Thread Helin Zhang
Several structures and macros are added or updated, such
as 'struct i40e_aqc_get_link_status',
'struct i40e_aqc_run_phy_activity' and
'struct i40e_aqc_lldp_set_local_mib_resp'.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_adminq_cmd.h | 45 ++---
 drivers/net/i40e/base/i40e_type.h   |  5 ++--
 drivers/net/i40e/i40e_ethdev.c  |  2 +-
 3 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/drivers/net/i40e/base/i40e_adminq_cmd.h 
b/drivers/net/i40e/base/i40e_adminq_cmd.h
index 5236333..fe9d5b5 100644
--- a/drivers/net/i40e/base/i40e_adminq_cmd.h
+++ b/drivers/net/i40e/base/i40e_adminq_cmd.h
@@ -41,7 +41,7 @@ POSSIBILITY OF SUCH DAMAGE.
  */

 #define I40E_FW_API_VERSION_MAJOR  0x0001
-#define I40E_FW_API_VERSION_MINOR  0x0004
+#define I40E_FW_API_VERSION_MINOR  0x0005

 struct i40e_aq_desc {
__le16 flags;
@@ -242,6 +242,7 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_get_phy_wol_caps   = 0x0621,
i40e_aqc_opc_set_phy_debug  = 0x0622,
i40e_aqc_opc_upload_ext_phy_fm  = 0x0625,
+   i40e_aqc_opc_run_phy_activity   = 0x0626,

/* NVM commands */
i40e_aqc_opc_nvm_read   = 0x0701,
@@ -915,6 +916,10 @@ struct i40e_aqc_vsi_properties_data {
 I40E_AQ_VSI_TC_QUE_NUMBER_SHIFT)
/* queueing option section */
u8  queueing_opt_flags;
+#ifdef X722_SUPPORT
+#define I40E_AQ_VSI_QUE_OPT_MULTICAST_UDP_ENA  0x04
+#define I40E_AQ_VSI_QUE_OPT_UNICAST_UDP_ENA0x08
+#endif
 #define I40E_AQ_VSI_QUE_OPT_TCP_ENA0x10
 #define I40E_AQ_VSI_QUE_OPT_FCOE_ENA   0x20
 #ifdef X722_SUPPORT
@@ -1349,10 +1354,16 @@ struct i40e_aqc_add_remove_cloud_filters_element_data {

 #define I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT  9
 #define I40E_AQC_ADD_CLOUD_TNL_TYPE_MASK   0x1E00
-#define I40E_AQC_ADD_CLOUD_TNL_TYPE_XVLAN  0
+#define I40E_AQC_ADD_CLOUD_TNL_TYPE_VXLAN  0
 #define I40E_AQC_ADD_CLOUD_TNL_TYPE_NVGRE_OMAC 1
-#define I40E_AQC_ADD_CLOUD_TNL_TYPE_NGE2
+#define I40E_AQC_ADD_CLOUD_TNL_TYPE_GENEVE 2
 #define I40E_AQC_ADD_CLOUD_TNL_TYPE_IP 3
+#define I40E_AQC_ADD_CLOUD_TNL_TYPE_RESERVED   4
+#define I40E_AQC_ADD_CLOUD_TNL_TYPE_VXLAN_GPE  5
+
+#define I40E_AQC_ADD_CLOUD_FLAGS_SHARED_OUTER_MAC  0x2000
+#define I40E_AQC_ADD_CLOUD_FLAGS_SHARED_INNER_MAC  0x4000
+#define I40E_AQC_ADD_CLOUD_FLAGS_SHARED_OUTER_IP   0x8000

__le32  tenant_id;
u8  reserved[4];
@@ -1846,7 +1857,12 @@ struct i40e_aqc_get_link_status {
u8  config;
 #define I40E_AQ_CONFIG_CRC_ENA 0x04
 #define I40E_AQ_CONFIG_PACING_MASK 0x78
-   u8  reserved[5];
+   u8  external_power_ability;
+#define I40E_AQ_LINK_POWER_CLASS_1 0x00
+#define I40E_AQ_LINK_POWER_CLASS_2 0x01
+#define I40E_AQ_LINK_POWER_CLASS_3 0x02
+#define I40E_AQ_LINK_POWER_CLASS_4 0x03
+   u8  reserved[4];
 };

 I40E_CHECK_CMD_LENGTH(i40e_aqc_get_link_status);
@@ -1914,6 +1930,18 @@ enum i40e_aq_phy_reg_type {
I40E_AQC_PHY_REG_EXERNAL_MODULE = 0x3
 };

+/* Run PHY Activity (0x0626) */
+struct i40e_aqc_run_phy_activity {
+   __le16  activity_id;
+   u8  flags;
+   u8  reserved1;
+   __le32  control;
+   __le32  data;
+   u8  reserved2[4];
+};
+
+I40E_CHECK_CMD_LENGTH(i40e_aqc_run_phy_activity);
+
 /* NVM Read command (indirect 0x0701)
  * NVM Erase commands (direct 0x0702)
  * NVM Update commands (indirect 0x0703)
@@ -2262,6 +2290,14 @@ struct i40e_aqc_lldp_set_local_mib {

 I40E_CHECK_CMD_LENGTH(i40e_aqc_lldp_set_local_mib);

+struct i40e_aqc_lldp_set_local_mib_resp {
+#define SET_LOCAL_MIB_RESP_EVENT_TRIGGERED_MASK  0x01
+   u8  status;
+   u8  reserved[15];
+};
+
+I40E_CHECK_STRUCT_LEN(0x10, i40e_aqc_lldp_set_local_mib_resp);
+
 /* Stop/Start LLDP Agent (direct 0x0A09)
  * Used for stopping/starting specific LLDP agent. e.g. DCBx
  */
@@ -2282,6 +2318,7 @@ struct i40e_aqc_add_udp_tunnel {
 #define I40E_AQC_TUNNEL_TYPE_VXLAN 0x00
 #define I40E_AQC_TUNNEL_TYPE_NGE   0x01
 #define I40E_AQC_TUNNEL_TYPE_TEREDO0x10
+#define I40E_AQC_TUNNEL_TYPE_VXLAN_GPE 0x11
u8  reserved1[10];
 };

diff --git a/drivers/net/i40e/base/i40e_type.h 
b/drivers/net/i40e/base/i40e_type.h
index 61ee166..d5ca67a 100644
--- a/drivers/net/i40e/base/i40e_type.h
+++ b/drivers/net/i40e/base/i40e_type.h
@@ -187,11 +187,10 @@ enum i40e_memcpy_type {
I40E_DMA_TO_NONDMA
 };

-
 #ifdef X722_SUPPORT
-#define I40E_FW_API_VERSION_MINOR_X722 0x0003
+#define I40E_FW_API_VERSION_MINOR_X722 0x0004
 #endif
-#define I40E_FW_API_VERSION_MINOR_X710 0x0004
+#define I40E_FW_API_VERSION_MINOR_X710 0x0005


 /* These are structs for managing the hardware information and the operations.
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers

[dpdk-dev] [PATCH 27/29] i40e: add base driver release info

2016-01-15 Thread Helin Zhang
It adds base driver release information such as release date,
for better tracking in the future.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/i40e/Makefile b/drivers/net/i40e/Makefile
index 033ee4a..6dd6eaa 100644
--- a/drivers/net/i40e/Makefile
+++ b/drivers/net/i40e/Makefile
@@ -85,6 +85,7 @@ VPATH += $(SRCDIR)/base

 #
 # all source are stored in SRCS-y
+# base driver is based on the package of dpdk-i40e.2016.01.07.14.tar.gz.
 #
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_adminq.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_common.c
-- 
1.9.3



[dpdk-dev] [PATCH 28/29] i40e: add/remove new device IDs

2016-01-15 Thread Helin Zhang
Add several new device IDs, and remove one that was never
used.

Signed-off-by: Helin Zhang 
---
 doc/guides/rel_notes/release_2_3.rst| 15 +++
 lib/librte_eal/common/include/rte_pci_dev_ids.h |  8 ++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index 99de186..fec5e4d 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -4,6 +4,15 @@ DPDK Release 2.3
 New Features
 

+**Updated the i40e base driver.**
+
+  The i40e base driver was updated with changes including the
+  following:
+
+  * Use rx control AQ commands to read/write rx control registers.
+  * Add new X722 device IDs, and removed X710 one was never used.
+  * Expose registers for HASH/FD input set configuring.
+

 Resolved Issues
 ---
@@ -15,6 +24,12 @@ EAL
 Drivers
 ~~~

+* **i40e: Fixed failure of reading/writing rx control registers.**
+
+  Fixed i40e issue failing to read/write rx control registers when
+  under stress small traffic, which will result in application launch
+  failure.
+

 Libraries
 ~
diff --git a/lib/librte_eal/common/include/rte_pci_dev_ids.h 
b/lib/librte_eal/common/include/rte_pci_dev_ids.h
index d088191..40ba98c 100644
--- a/lib/librte_eal/common/include/rte_pci_dev_ids.h
+++ b/lib/librte_eal/common/include/rte_pci_dev_ids.h
@@ -496,7 +496,6 @@ RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, 
IXGBE_DEV_ID_82599_BYPASS)

 #define I40E_DEV_ID_SFP_XL710   0x1572
 #define I40E_DEV_ID_QEMU0x1574
-#define I40E_DEV_ID_KX_A0x157F
 #define I40E_DEV_ID_KX_B0x1580
 #define I40E_DEV_ID_KX_C0x1581
 #define I40E_DEV_ID_QSFP_A  0x1583
@@ -507,13 +506,14 @@ RTE_PCI_DEV_ID_DECL_IXGBE(PCI_VENDOR_ID_INTEL, 
IXGBE_DEV_ID_82599_BYPASS)
 #define I40E_DEV_ID_20G_KR2_A   0x1588
 #define I40E_DEV_ID_10G_BASE_T4 0x1589
 #define I40E_DEV_ID_X722_A0 0x374C
+#define I40E_DEV_ID_KX_X722 0x37CE
+#define I40E_DEV_ID_QSFP_X722   0x37CF
 #define I40E_DEV_ID_SFP_X7220x37D0
 #define I40E_DEV_ID_1G_BASE_T_X722  0x37D1
 #define I40E_DEV_ID_10G_BASE_T_X722 0x37D2

 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_SFP_XL710)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_QEMU)
-RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_KX_A)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_KX_B)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_KX_C)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_QSFP_A)
@@ -524,6 +524,8 @@ RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, 
I40E_DEV_ID_20G_KR2)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_20G_KR2_A)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_10G_BASE_T4)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_X722_A0)
+RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_KX_X722)
+RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_QSFP_X722)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_SFP_X722)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_1G_BASE_T_X722)
 RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_10G_BASE_T_X722)
@@ -572,11 +574,13 @@ RTE_PCI_DEV_ID_DECL_IXGBEVF(PCI_VENDOR_ID_INTEL, 
IXGBE_DEV_ID_X550EM_X_VF_HV)

 #define I40E_DEV_ID_VF  0x154C
 #define I40E_DEV_ID_VF_HV   0x1571
+#define I40E_DEV_ID_X722_A0_VF  0x374D
 #define I40E_DEV_ID_X722_VF 0x37CD
 #define I40E_DEV_ID_X722_VF_HV  0x37D9

 RTE_PCI_DEV_ID_DECL_I40EVF(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_VF)
 RTE_PCI_DEV_ID_DECL_I40EVF(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_VF_HV)
+RTE_PCI_DEV_ID_DECL_I40EVF(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_X722_A0_VF)
 RTE_PCI_DEV_ID_DECL_I40EVF(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_X722_VF)
 RTE_PCI_DEV_ID_DECL_I40EVF(PCI_VENDOR_ID_INTEL, I40E_DEV_ID_X722_VF_HV)

-- 
1.9.3



[dpdk-dev] [PATCH 29/29] i40e: use rx control function for rx control registers

2016-01-15 Thread Helin Zhang
As required, rx control registers have to be read/written by
rx control functions, otherwise the read/write may fail
under heavy small-packet traffic.
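
Several hunks in this patch treat the hash-enable (HENA) setting as one 64-bit value stored in two 32-bit registers, and update it with a read-modify-write. A minimal sketch of that pattern follows, assuming a two-entry mock register file; `mock_hena`, `mock_read()`, `mock_write()` and the `hena_*()` helpers are hypothetical and only stand in for the accessor functions the patch switches to.

```c
#include <assert.h>
#include <stdint.h>

/* Mock register file: index 0 holds the low word, index 1 the high word. */
uint32_t mock_hena[2];

uint32_t mock_read(unsigned int idx)
{
	assert(idx < 2);
	return mock_hena[idx];
}

void mock_write(unsigned int idx, uint32_t val)
{
	assert(idx < 2);
	mock_hena[idx] = val;
}

/* Combine the two 32-bit halves into one 64-bit value. */
uint64_t hena_read(void)
{
	uint64_t hena = (uint64_t)mock_read(0);

	/* Widen before shifting so the high word is not lost. */
	hena |= ((uint64_t)mock_read(1)) << 32;
	return hena;
}

/* Split a 64-bit value back into the two 32-bit registers. */
void hena_write(uint64_t hena)
{
	mock_write(0, (uint32_t)hena);
	mock_write(1, (uint32_t)(hena >> 32));
}

/* Read-modify-write: clear the requested bits and keep the rest. */
void hena_clear_bits(uint64_t bits)
{
	hena_write(hena_read() & ~bits);
}
```

The cast to `uint64_t` before the shift matters: shifting a 32-bit value by 32 is undefined in C.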

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/i40e_ethdev.c| 66 ---
 drivers/net/i40e/i40e_ethdev_vf.c | 28 -
 drivers/net/i40e/i40e_fdir.c  | 13 
 drivers/net/i40e/i40e_pf.c|  6 ++--
 4 files changed, 58 insertions(+), 55 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 3cdceb6..e3519a7 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -5632,11 +5632,11 @@ i40e_pf_disable_rss(struct i40e_pf *pf)
struct i40e_hw *hw = I40E_PF_TO_HW(pf);
uint64_t hena;

-   hena = (uint64_t)I40E_READ_REG(hw, I40E_PFQF_HENA(0));
-   hena |= ((uint64_t)I40E_READ_REG(hw, I40E_PFQF_HENA(1))) << 32;
+   hena = (uint64_t)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(0));
+   hena |= ((uint64_t)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1))) << 32;
hena &= ~I40E_RSS_HENA_ALL;
-   I40E_WRITE_REG(hw, I40E_PFQF_HENA(0), (uint32_t)hena);
-   I40E_WRITE_REG(hw, I40E_PFQF_HENA(1), (uint32_t)(hena >> 32));
+   i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (uint32_t)hena);
+   i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (uint32_t)(hena >> 32));
I40E_WRITE_FLUSH(hw);
 }

@@ -5669,7 +5669,7 @@ i40e_set_rss_key(struct i40e_vsi *vsi, uint8_t *key, 
uint8_t key_len)
uint16_t i;

for (i = 0; i <= I40E_PFQF_HKEY_MAX_INDEX; i++)
-   I40E_WRITE_REG(hw, I40E_PFQF_HKEY(i), hash_key[i]);
+   i40e_write_rx_ctl(hw, I40E_PFQF_HKEY(i), hash_key[i]);
I40E_WRITE_FLUSH(hw);
}

@@ -5698,7 +5698,7 @@ i40e_get_rss_key(struct i40e_vsi *vsi, uint8_t *key, 
uint8_t *key_len)
uint16_t i;

for (i = 0; i <= I40E_PFQF_HKEY_MAX_INDEX; i++)
-   key_dw[i] = I40E_READ_REG(hw, I40E_PFQF_HKEY(i));
+   key_dw[i] = i40e_read_rx_ctl(hw, I40E_PFQF_HKEY(i));
}
*key_len = (I40E_PFQF_HKEY_MAX_INDEX + 1) * sizeof(uint32_t);

@@ -5719,12 +5719,12 @@ i40e_hw_rss_hash_set(struct i40e_pf *pf, struct 
rte_eth_rss_conf *rss_conf)
return ret;

rss_hf = rss_conf->rss_hf;
-   hena = (uint64_t)I40E_READ_REG(hw, I40E_PFQF_HENA(0));
-   hena |= ((uint64_t)I40E_READ_REG(hw, I40E_PFQF_HENA(1))) << 32;
+   hena = (uint64_t)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(0));
+   hena |= ((uint64_t)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1))) << 32;
hena &= ~I40E_RSS_HENA_ALL;
hena |= i40e_config_hena(rss_hf);
-   I40E_WRITE_REG(hw, I40E_PFQF_HENA(0), (uint32_t)hena);
-   I40E_WRITE_REG(hw, I40E_PFQF_HENA(1), (uint32_t)(hena >> 32));
+   i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (uint32_t)hena);
+   i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (uint32_t)(hena >> 32));
I40E_WRITE_FLUSH(hw);

return 0;
@@ -5739,8 +5739,8 @@ i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
uint64_t rss_hf = rss_conf->rss_hf & I40E_RSS_OFFLOAD_ALL;
uint64_t hena;

-   hena = (uint64_t)I40E_READ_REG(hw, I40E_PFQF_HENA(0));
-   hena |= ((uint64_t)I40E_READ_REG(hw, I40E_PFQF_HENA(1))) << 32;
+   hena = (uint64_t)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(0));
+   hena |= ((uint64_t)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1))) << 32;
if (!(hena & I40E_RSS_HENA_ALL)) { /* RSS disabled */
if (rss_hf != 0) /* Enable RSS */
return -EINVAL;
@@ -5764,8 +5764,8 @@ i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
i40e_get_rss_key(pf->main_vsi, rss_conf->rss_key,
 &rss_conf->rss_key_len);

-   hena = (uint64_t)I40E_READ_REG(hw, I40E_PFQF_HENA(0));
-   hena |= ((uint64_t)I40E_READ_REG(hw, I40E_PFQF_HENA(1))) << 32;
+   hena = (uint64_t)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(0));
+   hena |= ((uint64_t)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1))) << 32;
rss_conf->rss_hf = i40e_parse_hena(hena);

return 0;
@@ -6266,7 +6266,7 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf)
 static void
 i40e_get_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t *enable)
 {
-   uint32_t reg = I40E_READ_REG(hw, I40E_PRTQF_CTL_0);
+   uint32_t reg = i40e_read_rx_ctl(hw, I40E_PRTQF_CTL_0);

*enable = reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK ? 1 : 0;
 }
@@ -6275,7 +6275,7 @@ i40e_get_symmetric_hash_enable_per_port(struct i40e_hw 
*hw, uint8_t *enable)
 static void
 i40e_set_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t enable)
 {
-   uint32_t reg = I40E_READ_REG(hw, I40E_PRTQF_CTL_0);
+   uint32_t reg = i40e_read_rx_ctl(hw, I40E_PRTQF_CTL_0);

if (enable > 0) {
if (reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK) {
@@ -6292,7 +6292,7 @@ i40e_set_symmetric_hash_enable_per_port(struct i4

[dpdk-dev] [PATCH 24/29] i40e/base: Add a Virtchnl offload for RSS PCTYPE V2

2016-01-15 Thread Helin Zhang
X722 supports an expanded set of TCP and UDP PCTYPEs for RSS.
Add a virtchnl offload flag to support this.
Without this patch, VF drivers will not be able to support
the correct PCTYPEs for X722, and UDP flows will not fan out.

Signed-off-by: Helin Zhang 
---
 drivers/net/i40e/base/i40e_virtchnl.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/i40e/base/i40e_virtchnl.h 
b/drivers/net/i40e/base/i40e_virtchnl.h
index 8106582..26208f3 100644
--- a/drivers/net/i40e/base/i40e_virtchnl.h
+++ b/drivers/net/i40e/base/i40e_virtchnl.h
@@ -163,6 +163,7 @@ struct i40e_virtchnl_vsi_resource {
 #define I40E_VIRTCHNL_VF_OFFLOAD_WB_ON_ITR 0x0020
 #define I40E_VIRTCHNL_VF_OFFLOAD_VLAN  0x0001
 #define I40E_VIRTCHNL_VF_OFFLOAD_RX_POLLING0x0002
+#define I40E_VIRTCHNL_VF_OFFLOAD_RSS_PCTYPE_V2 0x0004

 struct i40e_virtchnl_vf_resource {
u16 num_vsis;
-- 
1.9.3



[dpdk-dev] [PATCH v4 0/8] virtio 1.0 enabling for virtio pmd driver

2016-01-15 Thread Yuanhan Liu

v4: - mark "src" arg as const for write_dev_cfg operation

- remove unnecessary inline, and likely/unlikely

v3: - export pci_unmap_device as well; and invoke it at virtio
  uninit stage.

- fixed the same data corruption bug reported by Qian in the simple
  rxtx code path.

- move VIRTIO_READ/WRITE_REG_X to virtio_pci.c

v2: - fix a data corruption reported by Qian, due to hdr size mismatch.
  check details at patch 5.

- Add missing config_irq and isr reading support from v1.

- fix comments from v1.

Almost all of the differences introduced by virtio 1.0 come from the PCI
layout change: the major configuration structures are stored in BAR space,
and their locations are stored in the corresponding PCI capability
structures. Reading/parsing them is one of the major tasks of patch 7.
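
To make the capability lookup concrete, here is a rough sketch of walking the PCI capability list in a copy of config space and picking out one virtio vendor-specific capability. It is not the code from patch 7. The field offsets follow the virtio_pci_cap layout in the virtio 1.0 specification (cfg_type at byte 3, bar at byte 4, offset at byte 8, length at byte 12); `struct cap_location`, `read_le32()` and `find_virtio_cap()` are hypothetical names, and the 256-byte buffer stands in for real config-space reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_CAPABILITY_LIST 0x34 /* offset of the first capability pointer */
#define PCI_CAP_ID_VNDR     0x09 /* vendor-specific capability ID */
#define CFG_SPACE_SIZE      256
#define VIRTIO_CAP_SIZE     16   /* bytes read from each virtio capability */

/* Where a configuration structure lives: which BAR, and where inside it. */
struct cap_location {
	uint8_t bar;
	uint32_t offset;
	uint32_t length;
};

/* Config space is little-endian; assemble the value byte by byte. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *out if a capability of cfg_type exists, else -1. */
int find_virtio_cap(const uint8_t *cfg, uint8_t cfg_type,
		    struct cap_location *out)
{
	uint8_t pos;
	int ttl = 48; /* guard against a looping capability list */

	assert(cfg != NULL && out != NULL);

	/* The low two bits of capability pointers are reserved. */
	pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
	while (pos != 0 && ttl-- > 0) {
		if (pos > CFG_SPACE_SIZE - 2)
			break; /* the "next" byte would be out of bounds */

		if (cfg[pos] == PCI_CAP_ID_VNDR &&
		    pos <= CFG_SPACE_SIZE - VIRTIO_CAP_SIZE &&
		    cfg[pos + 3] == cfg_type) {
			out->bar = cfg[pos + 4];
			out->offset = read_le32(&cfg[pos + 8]);
			out->length = read_le32(&cfg[pos + 12]);
			return 0;
		}
		pos = cfg[pos + 1] & 0xFC; /* follow the "next" pointer */
	}
	return -1;
}
```

A caller would then map the reported BAR and add `offset` to its base address to reach the structure.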

To let the handling of virtio v1.0 and v0.95 co-exist, this patch set
introduces a virtio_pci_ops structure, adding another layer so that
we can keep those vtpci_foo_bar "APIs". With that, only minimal
changes are needed to add virtio 1.0 support.


Rough test guide


First, you need a QEMU that supports virtio 1.0 (say, v2.5). Then add the
option "disable-modern=false" to the qemu virtio-net-pci device to enable
virtio 1.0 (which is disabled by default).

If you see something like the following from 'lspci -v', it means virtio
1.0 is indeed enabled:

00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
Subsystem: Red Hat, Inc Device 0001
Physical Slot: 4
Flags: bus master, fast devsel, latency 0, IRQ 11
I/O ports at c040 [size=64]
Memory at febf1000 (32-bit, non-prefetchable) [size=4K]
Memory at fe00 (64-bit, prefetchable) [size=8M]
Expansion ROM at feb8 [disabled] [size=256K]
Capabilities: [98] MSI-X: Enable+ Count=6 Masked-
==> Capabilities: [84] Vendor Specific Information: Len=14 
==> Capabilities: [70] Vendor Specific Information: Len=14 
==> Capabilities: [60] Vendor Specific Information: Len=10 
==> Capabilities: [50] Vendor Specific Information: Len=10 
==> Capabilities: [40] Vendor Specific Information: Len=10 
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

After that, there is nothing special compared to the old virtio
0.95 pmd driver.


---
Yuanhan Liu (8):
  virtio: don't set vring address again at queue startup
  virtio: introduce struct virtio_pci_ops
  virtio: move left pci stuff to virtio_pci.c
  viritio: switch to 64 bit features
  virtio: retrieve hdr_size from hw->vtnet_hdr_size
  eal: pci: export pci_[un]map_device
  virtio: add 1.0 support
  virtio: move VIRTIO_READ/WRITE_REG_X into virtio_pci.c

 doc/guides/rel_notes/release_2_3.rst|   3 +
 drivers/net/virtio/virtio_ethdev.c  | 302 +
 drivers/net/virtio/virtio_ethdev.h  |   3 +-
 drivers/net/virtio/virtio_pci.c | 793 +++-
 drivers/net/virtio/virtio_pci.h | 120 +++-
 drivers/net/virtio/virtio_rxtx.c|  21 +-
 drivers/net/virtio/virtio_rxtx_simple.c |  12 +-
 drivers/net/virtio/virtqueue.h  |   4 +-
 lib/librte_eal/bsdapp/eal/eal_pci.c |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |   7 +
 lib/librte_eal/common/eal_common_pci.c  |   4 +-
 lib/librte_eal/common/eal_private.h |  18 -
 lib/librte_eal/common/include/rte_pci.h |  27 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |   4 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   7 +
 15 files changed, 949 insertions(+), 380 deletions(-)

-- 
1.9.0



[dpdk-dev] [PATCH v4 1/8] virtio: don't set vring address again at queue startup

2016-01-15 Thread Yuanhan Liu
The vring address is already set up in virtio_dev_queue_setup(), and a vq
restart does not reset that setting, so there is no need to set it again.

Signed-off-by: Yuanhan Liu 
---
 drivers/net/virtio/virtio_rxtx.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 74b39ef..b7267c0 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -339,11 +339,6 @@ virtio_dev_vring_start(struct virtqueue *vq, int 
queue_type)
vq_update_avail_idx(vq);

PMD_INIT_LOG(DEBUG, "Allocated %d bufs", nbufs);
-
-   VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
-   vq->vq_queue_index);
-   VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
-   vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
} else if (queue_type == VTNET_TQ) {
if (use_simple_rxtx) {
int mid_idx  = vq->vq_nentries >> 1;
@@ -362,16 +357,6 @@ virtio_dev_vring_start(struct virtqueue *vq, int 
queue_type)
for (i = mid_idx; i < vq->vq_nentries; i++)
vq->vq_ring.avail->ring[i] = i;
}
-
-   VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
-   vq->vq_queue_index);
-   VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
-   vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
-   } else {
-   VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
-   vq->vq_queue_index);
-   VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
-   vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
}
 }

-- 
1.9.0



[dpdk-dev] [PATCH v4 2/8] virtio: introduce struct virtio_pci_ops

2016-01-15 Thread Yuanhan Liu
Introduce struct virtio_pci_ops, to let legacy virtio (v0.95) and
modern virtio (1.0) have different implementations of a
specific pci action, such as reading the host status.

With that, this patch reimplements all exported pci functions
like this:

vtpci_foo_bar(struct virtio_hw *hw)
{
hw->vtpci_ops->foo_bar(hw);
}

That way, we only need to pay attention to those pci related functions
when adding virtio 1.0 support.

This patch also introduces a new vtpci function, vtpci_init(), to do
proper virtio pci settings. It's pretty simple so far: it just sets
hw->vtpci_ops to legacy_ops, as we don't support 1.0 yet.
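
The dispatch scheme described above can be sketched as a small, self-contained example. This is only an illustration under stated assumptions: `struct sketch_hw`, `struct sketch_pci_ops` and the `sketch_vtpci_*()` wrappers are hypothetical, only two operations are modeled, and a plain struct field stands in for the legacy device register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_hw;

/* Table of PCI actions; each virtio flavour supplies its own instance. */
struct sketch_pci_ops {
	uint8_t (*get_status)(struct sketch_hw *hw);
	void (*set_status)(struct sketch_hw *hw, uint8_t status);
};

struct sketch_hw {
	const struct sketch_pci_ops *ops; /* chosen once at init time */
	uint8_t legacy_status_reg;        /* stands in for a device register */
};

/* Legacy (v0.95) implementations of the two actions. */
static uint8_t legacy_get_status(struct sketch_hw *hw)
{
	return hw->legacy_status_reg;
}

static void legacy_set_status(struct sketch_hw *hw, uint8_t status)
{
	hw->legacy_status_reg = status;
}

static const struct sketch_pci_ops legacy_ops = {
	.get_status = legacy_get_status,
	.set_status = legacy_set_status,
};

/* Only the legacy table exists so far, so init always selects it. */
void sketch_vtpci_init(struct sketch_hw *hw)
{
	assert(hw != NULL);
	hw->ops = &legacy_ops;
}

/* Exported wrappers: callers never touch the ops table directly. */
uint8_t sketch_vtpci_get_status(struct sketch_hw *hw)
{
	assert(hw != NULL && hw->ops != NULL);
	return hw->ops->get_status(hw);
}

void sketch_vtpci_set_status(struct sketch_hw *hw, uint8_t status)
{
	assert(hw != NULL && hw->ops != NULL);
	hw->ops->set_status(hw, status);
}
```

Adding a second flavour then means writing another ops table and choosing between the two in the init function; the wrappers and their callers stay unchanged.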

Signed-off-by: Yuanhan Liu 
---

v2: remove an extra whitespace line, and add a comment on "reading status
after reset".

rename the badly taken op name "set_irq" to "set_config_irq".

v4: define "src" arg of write_dev_cfg operation as const
---
 drivers/net/virtio/virtio_ethdev.c |  22 ++---
 drivers/net/virtio/virtio_pci.c| 164 ++---
 drivers/net/virtio/virtio_pci.h|  27 ++
 drivers/net/virtio/virtqueue.h |   2 +-
 4 files changed, 169 insertions(+), 46 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index d928339..6c1d3a0 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -272,9 +272,7 @@ virtio_dev_queue_release(struct virtqueue *vq) {

if (vq) {
hw = vq->hw;
-   /* Select and deactivate the queue */
-   VIRTIO_WRITE_REG_2(hw, VIRTIO_PCI_QUEUE_SEL, vq->vq_queue_index);
-   VIRTIO_WRITE_REG_4(hw, VIRTIO_PCI_QUEUE_PFN, 0);
+   hw->vtpci_ops->del_queue(hw, vq);

rte_free(vq->sw_ring);
rte_free(vq);
@@ -295,15 +293,13 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
struct virtio_hw *hw = dev->data->dev_private;
struct virtqueue *vq = NULL;

-   /* Write the virtqueue index to the Queue Select Field */
-   VIRTIO_WRITE_REG_2(hw, VIRTIO_PCI_QUEUE_SEL, vtpci_queue_idx);
-   PMD_INIT_LOG(DEBUG, "selecting queue: %u", vtpci_queue_idx);
+   PMD_INIT_LOG(DEBUG, "setting up queue: %u", vtpci_queue_idx);

/*
 * Read the virtqueue size from the Queue Size field
 * Always power of 2 and if 0 virtqueue does not exist
 */
-   vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
+   vq_size = hw->vtpci_ops->get_queue_num(hw, vtpci_queue_idx);
PMD_INIT_LOG(DEBUG, "vq_size: %u nb_desc:%u", vq_size, nb_desc);
if (vq_size == 0) {
PMD_INIT_LOG(ERR, "%s: virtqueue does not exist", __func__);
@@ -436,12 +432,8 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
memset(vq->virtio_net_hdr_mz->addr, 0, PAGE_SIZE);
}

-   /*
-* Set guest physical address of the virtqueue
-* in VIRTIO_PCI_QUEUE_PFN config register of device
-*/
-   VIRTIO_WRITE_REG_4(hw, VIRTIO_PCI_QUEUE_PFN,
-   mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
+   hw->vtpci_ops->setup_queue(hw, vq);
+
*pvq = vq;
return 0;
 }
@@ -950,7 +942,7 @@ virtio_negotiate_features(struct virtio_hw *hw)
hw->guest_features);

/* Read device(host) feature bits */
-   host_features = VIRTIO_READ_REG_4(hw, VIRTIO_PCI_HOST_FEATURES);
+   host_features = hw->vtpci_ops->get_features(hw);
PMD_INIT_LOG(DEBUG, "host_features before negotiate = %x",
host_features);

@@ -1287,6 +1279,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

pci_dev = eth_dev->pci_dev;

+   vtpci_init(pci_dev, hw);
+
if (virtio_resource_init(pci_dev) < 0)
return -1;

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 2245bec..9907fd0 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -34,12 +34,11 @@

 #include "virtio_pci.h"
 #include "virtio_logs.h"
+#include "virtqueue.h"

-static uint8_t vtpci_get_status(struct virtio_hw *);
-
-void
-vtpci_read_dev_config(struct virtio_hw *hw, uint64_t offset,
-   void *dst, int length)
+static void
+legacy_read_dev_config(struct virtio_hw *hw, uint64_t offset,
+  void *dst, int length)
 {
uint64_t off;
uint8_t *d;
@@ -60,22 +59,22 @@ vtpci_read_dev_config(struct virtio_hw *hw, uint64_t offset,
}
 }

-void
-vtpci_write_dev_config(struct virtio_hw *hw, uint64_t offset,
-   void *src, int length)
+static void
+legacy_write_dev_config(struct virtio_hw *hw, uint64_t offset,
+   const void *src, int length)
 {
uint64_t off;
-   uint8_t *s;
+   const uint8_t *s;
int size;

off = VIRTIO_PCI_CONFIG(hw) + offset;
for (s = src; length > 0; s += size, off += size, length -= size) {
if (length >

[dpdk-dev] [PATCH v4 3/8] virtio: move left pci stuff to virtio_pci.c

2016-01-15 Thread Yuanhan Liu
virtio_pci.c is a more proper place for pci stuff; virtio_ethdev is not.

Signed-off-by: Yuanhan Liu 
---
 drivers/net/virtio/virtio_ethdev.c | 265 +---
 drivers/net/virtio/virtio_pci.c| 270 -
 2 files changed, 270 insertions(+), 265 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 6c1d3a0..b57224d 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -36,10 +36,6 @@
 #include 
 #include 
 #include 
-#ifdef RTE_EXEC_ENV_LINUXAPP
-#include 
-#include 
-#endif

 #include 
 #include 
@@ -955,260 +951,6 @@ virtio_negotiate_features(struct virtio_hw *hw)
hw->guest_features);
 }

-#ifdef RTE_EXEC_ENV_LINUXAPP
-static int
-parse_sysfs_value(const char *filename, unsigned long *val)
-{
-   FILE *f;
-   char buf[BUFSIZ];
-   char *end = NULL;
-
-   f = fopen(filename, "r");
-   if (f == NULL) {
-   PMD_INIT_LOG(ERR, "%s(): cannot open sysfs value %s",
-__func__, filename);
-   return -1;
-   }
-
-   if (fgets(buf, sizeof(buf), f) == NULL) {
-   PMD_INIT_LOG(ERR, "%s(): cannot read sysfs value %s",
-__func__, filename);
-   fclose(f);
-   return -1;
-   }
-   *val = strtoul(buf, &end, 0);
-   if ((buf[0] == '\0') || (end == NULL) || (*end != '\n')) {
-   PMD_INIT_LOG(ERR, "%s(): cannot parse sysfs value %s",
-__func__, filename);
-   fclose(f);
-   return -1;
-   }
-   fclose(f);
-   return 0;
-}
-
-static int get_uio_dev(struct rte_pci_addr *loc, char *buf, unsigned int buflen,
-   unsigned int *uio_num)
-{
-   struct dirent *e;
-   DIR *dir;
-   char dirname[PATH_MAX];
-
-   /* depending on kernel version, uio can be located in uio/uioX
-* or uio:uioX */
-   snprintf(dirname, sizeof(dirname),
-SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/uio",
-loc->domain, loc->bus, loc->devid, loc->function);
-   dir = opendir(dirname);
-   if (dir == NULL) {
-   /* retry with the parent directory */
-   snprintf(dirname, sizeof(dirname),
-SYSFS_PCI_DEVICES "/" PCI_PRI_FMT,
-loc->domain, loc->bus, loc->devid, loc->function);
-   dir = opendir(dirname);
-
-   if (dir == NULL) {
-   PMD_INIT_LOG(ERR, "Cannot opendir %s", dirname);
-   return -1;
-   }
-   }
-
-   /* take the first file starting with "uio" */
-   while ((e = readdir(dir)) != NULL) {
-   /* format could be uio%d ...*/
-   int shortprefix_len = sizeof("uio") - 1;
-   /* ... or uio:uio%d */
-   int longprefix_len = sizeof("uio:uio") - 1;
-   char *endptr;
-
-   if (strncmp(e->d_name, "uio", 3) != 0)
-   continue;
-
-   /* first try uio%d */
-   errno = 0;
-   *uio_num = strtoull(e->d_name + shortprefix_len, &endptr, 10);
-   if (errno == 0 && endptr != (e->d_name + shortprefix_len)) {
-   snprintf(buf, buflen, "%s/uio%u", dirname, *uio_num);
-   break;
-   }
-
-   /* then try uio:uio%d */
-   errno = 0;
-   *uio_num = strtoull(e->d_name + longprefix_len, &endptr, 10);
-   if (errno == 0 && endptr != (e->d_name + longprefix_len)) {
-   snprintf(buf, buflen, "%s/uio:uio%u", dirname,
-*uio_num);
-   break;
-   }
-   }
-   closedir(dir);
-
-   /* No uio resource found */
-   if (e == NULL) {
-   PMD_INIT_LOG(ERR, "Could not find uio resource");
-   return -1;
-   }
-
-   return 0;
-}
-
-static int
-virtio_has_msix(const struct rte_pci_addr *loc)
-{
-   DIR *d;
-   char dirname[PATH_MAX];
-
-   snprintf(dirname, sizeof(dirname),
-SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/msi_irqs",
-loc->domain, loc->bus, loc->devid, loc->function);
-
-   d = opendir(dirname);
-   if (d)
-   closedir(d);
-
-   return (d != NULL);
-}
-
-/* Extract I/O port numbers from sysfs */
-static int virtio_resource_init_by_uio(struct rte_pci_device *pci_dev)
-{
-   char dirname[PATH_MAX];
-   char filename[PATH_MAX];
-   unsigned long start, size;
-   unsigned int uio_num;
-
-   if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname), &uio_num) < 0)
-   return -1;
-
-   /* get portio size */
-   snprintf(filename, sizeof(filename),
-"%s/portio/p

[dpdk-dev] [PATCH v4 4/8] viritio: switch to 64 bit features

2016-01-15 Thread Yuanhan Liu
Switch to 64 bit features, which virtio 1.0 supports.

Legacy virtio only supports 32 bit features, so it complains aloud
and quits when asked to set features beyond 32 bits.
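
The two points above (a 64 bit wide bit test, and the legacy path
refusing anything above bit 31) can be sketched in isolation as below.
This is only an illustration under stated assumptions: the demo_* helpers
are hypothetical and take the feature word directly instead of a
struct virtio_hw.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: test one feature bit in a 64 bit feature word. */
static int
demo_with_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	/* 1ULL keeps the shift 64 bits wide; "1u << 32" would be undefined. */
	return (features & (1ULL << bit)) != 0;
}

/*
 * Hypothetical legacy-side check: returns 0 if the feature word fits
 * in 32 bits, -1 if any of the upper 32 bits is set.
 */
static int
demo_legacy_check_features(uint64_t features)
{
	return (features >> 32) != 0 ? -1 : 0;
}
```

A caller would run demo_legacy_check_features() before writing the low
32 bits to the device, and use demo_with_feature() for bits such as
bit 32, which a 32 bit mask cannot represent.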

Signed-off-by: Yuanhan Liu 
---
 drivers/net/virtio/virtio_ethdev.c |  8 
 drivers/net/virtio/virtio_pci.c| 15 ++-
 drivers/net/virtio/virtio_pci.h| 12 ++--
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index b57224d..94e0c4a 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -930,16 +930,16 @@ virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 static void
 virtio_negotiate_features(struct virtio_hw *hw)
 {
-   uint32_t host_features;
+   uint64_t host_features;

/* Prepare guest_features: feature that driver wants to support */
hw->guest_features = VIRTIO_PMD_GUEST_FEATURES;
-   PMD_INIT_LOG(DEBUG, "guest_features before negotiate = %x",
+   PMD_INIT_LOG(DEBUG, "guest_features before negotiate = %"PRIx64,
hw->guest_features);

/* Read device(host) feature bits */
host_features = hw->vtpci_ops->get_features(hw);
-   PMD_INIT_LOG(DEBUG, "host_features before negotiate = %x",
+   PMD_INIT_LOG(DEBUG, "host_features before negotiate = %"PRIx64,
host_features);

/*
@@ -947,7 +947,7 @@ virtio_negotiate_features(struct virtio_hw *hw)
 * guest feature bits.
 */
hw->guest_features = vtpci_negotiate_features(hw, host_features);
-   PMD_INIT_LOG(DEBUG, "features after negotiate = %x",
+   PMD_INIT_LOG(DEBUG, "features after negotiate = %"PRIx64,
hw->guest_features);
 }

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 95aa47d..a903316 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -87,15 +87,20 @@ legacy_write_dev_config(struct virtio_hw *hw, uint64_t offset,
}
 }

-static uint32_t
+static uint64_t
 legacy_get_features(struct virtio_hw *hw)
 {
return VIRTIO_READ_REG_4(hw, VIRTIO_PCI_HOST_FEATURES);
 }

 static void
-legacy_set_features(struct virtio_hw *hw, uint32_t features)
+legacy_set_features(struct virtio_hw *hw, uint64_t features)
 {
+   if ((features >> 32) != 0) {
+   PMD_DRV_LOG(ERR,
+   "only 32 bit features are allowed for legacy virtio!");
+   return;
+   }
VIRTIO_WRITE_REG_4(hw, VIRTIO_PCI_GUEST_FEATURES, features);
 }

@@ -451,10 +456,10 @@ vtpci_write_dev_config(struct virtio_hw *hw, uint64_t offset,
hw->vtpci_ops->write_dev_cfg(hw, offset, src, length);
 }

-uint32_t
-vtpci_negotiate_features(struct virtio_hw *hw, uint32_t host_features)
+uint64_t
+vtpci_negotiate_features(struct virtio_hw *hw, uint64_t host_features)
 {
-   uint32_t features;
+   uint64_t features;

/*
 * Limit negotiated features to what the driver, virtqueue, and
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index 2064af0..17312f6 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -175,8 +175,8 @@ struct virtio_pci_ops {
uint8_t (*get_status)(struct virtio_hw *hw);
void(*set_status)(struct virtio_hw *hw, uint8_t status);

-   uint32_t (*get_features)(struct virtio_hw *hw);
-   void (*set_features)(struct virtio_hw *hw, uint32_t features);
+   uint64_t (*get_features)(struct virtio_hw *hw);
+   void (*set_features)(struct virtio_hw *hw, uint64_t features);

uint8_t (*get_isr)(struct virtio_hw *hw);

@@ -191,7 +191,7 @@ struct virtio_pci_ops {
 struct virtio_hw {
struct virtqueue *cvq;
uint32_tio_base;
-   uint32_tguest_features;
+   uint64_tguest_features;
uint32_tmax_tx_queues;
uint32_tmax_rx_queues;
uint16_tvtnet_hdr_size;
@@ -271,9 +271,9 @@ outl_p(unsigned int data, unsigned int port)
outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))

 static inline int
-vtpci_with_feature(struct virtio_hw *hw, uint32_t bit)
+vtpci_with_feature(struct virtio_hw *hw, uint64_t bit)
 {
-   return (hw->guest_features & (1u << bit)) != 0;
+   return (hw->guest_features & (1ULL << bit)) != 0;
 }

 /*
@@ -286,7 +286,7 @@ void vtpci_reinit_complete(struct virtio_hw *);

 void vtpci_set_status(struct virtio_hw *, uint8_t);

-uint32_t vtpci_negotiate_features(struct virtio_hw *, uint32_t);
+uint64_t vtpci_negotiate_features(struct virtio_hw *, uint64_t);

 void vtpci_write_dev_config(struct virtio_hw *, uint64_t, void *, int);

-- 
1.9.0



[dpdk-dev] [PATCH v4 5/8] virtio: retrieve hdr_size from hw->vtnet_hdr_size

2016-01-15 Thread Yuanhan Liu
The mergeable virtio net hdr format has been the standard and the
only virtio net hdr format since virtio 1.0. Therefore, we can
no longer hardcode hdr_size to "sizeof(struct virtio_net_hdr)"
in virtio_recv_pkts(); otherwise, there would be a mismatch of
hdr size between rte_vhost_enqueue_burst() and virtio_recv_pkts(),
leading to packet corruption.

Instead, we should retrieve it from hw->vtnet_hdr_size; we will
do proper settings at eth_virtio_dev_init() in later patches.
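
To make the size mismatch concrete, here is a small sketch. It is not
taken from the patch: the demo_* structs are local stand-ins whose field
layout is assumed to follow the virtio net header, and demo_hdr_size()
is a hypothetical selector for the value that would be stored in
hw->vtnet_hdr_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the basic virtio net header (assumed layout). */
struct demo_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Stand-in for the mergeable variant: one extra 16 bit field. */
struct demo_net_hdr_mrg_rxbuf {
	struct demo_net_hdr hdr;
	uint16_t num_buffers;
};

/*
 * Hypothetical selector: use the mergeable header size when either
 * mergeable rx buffers or virtio 1.0 has been negotiated.
 */
static size_t
demo_hdr_size(int mergeable, int version_1)
{
	/* The two headers differ in size, hence the mismatch risk. */
	assert(sizeof(struct demo_net_hdr_mrg_rxbuf) >
	       sizeof(struct demo_net_hdr));

	if (mergeable || version_1)
		return sizeof(struct demo_net_hdr_mrg_rxbuf);
	return sizeof(struct demo_net_hdr);
}
```

If the receive path stripped the shorter header while the other side
had written the longer one, the leftover bytes of the header would be
treated as packet data, which is the corruption the commit message
refers to.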

Signed-off-by: Yuanhan Liu 
---

v3: retrieve hdr_size from hw->vtnet_hdr_size for simple rxtx
code path as well: it should not break anything, as simple
rx and mergeable rx still will not co-exist.
---
 drivers/net/virtio/virtio_rxtx.c|  6 --
 drivers/net/virtio/virtio_rxtx_simple.c | 12 ++--
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index b7267c0..41a1366 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -560,7 +560,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
struct rte_mbuf *rcv_pkts[VIRTIO_MBUF_BURST_SZ];
int error;
uint32_t i, nb_enqueued;
-   const uint32_t hdr_size = sizeof(struct virtio_net_hdr);
+   uint32_t hdr_size;

nb_used = VIRTQUEUE_NUSED(rxvq);

@@ -580,6 +580,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
hw = rxvq->hw;
nb_rx = 0;
nb_enqueued = 0;
+   hdr_size = hw->vtnet_hdr_size;

for (i = 0; i < num ; i++) {
rxm = rcv_pkts[i];
@@ -664,7 +665,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
uint32_t seg_num;
uint16_t extra_idx;
uint32_t seg_res;
-   const uint32_t hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+   uint32_t hdr_size;

nb_used = VIRTQUEUE_NUSED(rxvq);

@@ -682,6 +683,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
seg_num = 0;
extra_idx = 0;
seg_res = 0;
+   hdr_size = hw->vtnet_hdr_size;

while (i < nb_used) {
struct virtio_net_hdr_mrg_rxbuf *header;
diff --git a/drivers/net/virtio/virtio_rxtx_simple.c b/drivers/net/virtio/virtio_rxtx_simple.c
index ff3c11a..3e66e8b 100644
--- a/drivers/net/virtio/virtio_rxtx_simple.c
+++ b/drivers/net/virtio/virtio_rxtx_simple.c
@@ -81,9 +81,9 @@ virtqueue_enqueue_recv_refill_simple(struct virtqueue *vq,

start_dp = vq->vq_ring.desc;
start_dp[desc_idx].addr = (uint64_t)((uintptr_t)cookie->buf_physaddr +
-   RTE_PKTMBUF_HEADROOM - sizeof(struct virtio_net_hdr));
+   RTE_PKTMBUF_HEADROOM - vq->hw->vtnet_hdr_size);
start_dp[desc_idx].len = cookie->buf_len -
-   RTE_PKTMBUF_HEADROOM + sizeof(struct virtio_net_hdr);
+   RTE_PKTMBUF_HEADROOM + vq->hw->vtnet_hdr_size;

vq->vq_free_cnt--;
vq->vq_avail_idx++;
@@ -120,9 +120,9 @@ virtio_rxq_rearm_vec(struct virtqueue *rxvq)

start_dp[i].addr =
(uint64_t)((uintptr_t)sw_ring[i]->buf_physaddr +
-   RTE_PKTMBUF_HEADROOM - sizeof(struct virtio_net_hdr));
+   RTE_PKTMBUF_HEADROOM - rxvq->hw->vtnet_hdr_size);
start_dp[i].len = sw_ring[i]->buf_len -
-   RTE_PKTMBUF_HEADROOM + sizeof(struct virtio_net_hdr);
+   RTE_PKTMBUF_HEADROOM + rxvq->hw->vtnet_hdr_size;
}

rxvq->vq_avail_idx += RTE_VIRTIO_VPMD_RX_REARM_THRESH;
@@ -175,8 +175,8 @@ virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
len_adjust = _mm_set_epi16(
0, 0,
0,
-   (uint16_t) -sizeof(struct virtio_net_hdr),
-   0, (uint16_t) -sizeof(struct virtio_net_hdr),
+   (uint16_t) -rxvq->hw->vtnet_hdr_size,
+   0, (uint16_t) -rxvq->hw->vtnet_hdr_size,
0, 0);

if (unlikely(nb_pkts < RTE_VIRTIO_DESC_PER_LOOP))
-- 
1.9.0



[dpdk-dev] [PATCH v4 6/8] eal: pci: export pci_[un]map_device

2016-01-15 Thread Yuanhan Liu
Normally we could set the RTE_PCI_DRV_NEED_MAPPING flag so that eal will
invoke pci_map_device internally for us. From that point of view, there
is no need to export pci_map_device.

However, for the virtio pmd driver, which is designed to work without
binding UIO (or something similar) first, pci_map_device() will fail,
which ends up with the virtio pmd driver being skipped. Therefore, we can
not set RTE_PCI_DRV_NEED_MAPPING blindly in the virtio pmd driver.

Therefore, this patch exports pci_map_device, and lets the virtio pmd
call it when necessary.

Cc: David Marchand 
Signed-off-by: Yuanhan Liu 
Tested-By: Santosh Shukla 
---
v3: - export pci_unmap_device as well

- Add a few more comments about rte_eal_pci_map_device().
---
 lib/librte_eal/bsdapp/eal/eal_pci.c |  4 ++--
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  7 +++
 lib/librte_eal/common/eal_common_pci.c  |  4 ++--
 lib/librte_eal/common/eal_private.h | 18 -
 lib/librte_eal/common/include/rte_pci.h | 27 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |  4 ++--
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  7 +++
 7 files changed, 47 insertions(+), 24 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 6c21fbd..95c32c1 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -93,7 +93,7 @@ pci_unbind_kernel_driver(struct rte_pci_device *dev __rte_unused)

 /* Map pci device */
 int
-pci_map_device(struct rte_pci_device *dev)
+rte_eal_pci_map_device(struct rte_pci_device *dev)
 {
int ret = -1;

@@ -115,7 +115,7 @@ pci_map_device(struct rte_pci_device *dev)

 /* Unmap pci device */
 void
-pci_unmap_device(struct rte_pci_device *dev)
+rte_eal_pci_unmap_device(struct rte_pci_device *dev)
 {
/* try unmapping the NIC resources */
switch (dev->kdrv) {
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 9d7adf1..1b28170 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -135,3 +135,10 @@ DPDK_2.2 {
rte_xen_dom0_supported;

 } DPDK_2.1;
+
+DPDK_2.3 {
+   global:
+
+   rte_eal_pci_map_device;
+   rte_eal_pci_unmap_device;
+} DPDK_2.2;
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index dcfe947..96d5113 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -188,7 +188,7 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *d
pci_config_space_set(dev);
 #endif
/* map resources for devices that use igb_uio */
-   ret = pci_map_device(dev);
+   ret = rte_eal_pci_map_device(dev);
if (ret != 0)
return ret;
} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
@@ -254,7 +254,7 @@ rte_eal_pci_detach_dev(struct rte_pci_driver *dr,

if (dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING)
/* unmap resources for devices that use igb_uio */
-   pci_unmap_device(dev);
+   rte_eal_pci_unmap_device(dev);

return 0;
}
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 072e672..2342fa1 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -165,24 +165,6 @@ struct rte_pci_device;
 int pci_unbind_kernel_driver(struct rte_pci_device *dev);

 /**
- * Map this device
- *
- * This function is private to EAL.
- *
- * @return
- *   0 on success, negative on error and positive if no driver
- *   is found for the device.
- */
-int pci_map_device(struct rte_pci_device *dev);
-
-/**
- * Unmap this device
- *
- * This function is private to EAL.
- */
-void pci_unmap_device(struct rte_pci_device *dev);
-
-/**
  * Map the PCI resource of a PCI device in virtual memory
  *
  * This function is private to EAL.
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 334c12e..2224109 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -485,6 +485,33 @@ int rte_eal_pci_read_config(const struct rte_pci_device *device,
  */
 int rte_eal_pci_write_config(const struct rte_pci_device *device,
 const void *buf, size_t len, off_t offset);
+/**
+ * Map the PCI device resources in user space virtual memory address
+ *
+ * Note that driver should not call this function when flag
+ * RTE_PCI_DRV_NEED_MAPPING is set, as EAL will do that for
+ * you when it's on.
+ *
+ * @param dev
+ *   A pointer to a rte_pci_device structure describing the device
+ *   to use
+ *
+ * @return
+ *   0 on succes

[dpdk-dev] [PATCH v4 7/8] virtio: add 1.0 support

2016-01-15 Thread Yuanhan Liu
A modern (v1.0) virtio pci device defines several pci capabilities.
Each cap has a configuration structure corresponding to it, and the
cap.bar and cap.offset fields tell us where to find it.

First, we map the pci resources with rte_eal_pci_map_device().
We can then easily locate a cfg structure by:

cfg_addr = dev->mem_resources[cap.bar].addr + cap.offset;

Therefore, the entry point for enabling modern (v1.0) pci device support
is to iterate over the pci capability list and locate the configs
we care about; they are:

- common cfg

  For generic virtio and virtqueue configuration, such as setting/getting
  features, enabling a specific queue, and so on.

- notify cfg

  Combined with `queue_notify_off' from common cfg, we can use it to
  notify a specific virt queue.

- device cfg

  Where virtio_net_config structure is located.

- isr cfg

  Where to read isr (interrupt status).

If any of the above caps is not found, we fall back to the legacy virtio
handling.

On success, hw->vtpci_ops is assigned to modern_ops, where all
operations are implemented by reading/writing one (or a few) specific
fields of the above 4 cfg structures. And that's basically
how this patch works.
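
The address computation above can be sketched with a few sanity checks
added. This is a hedged illustration rather than the code in this patch:
the demo_* types are simplified stand-ins for the mapped BAR table and
the capability fields, and the bounds checks are an assumption about
what a careful lookup might verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_BARS 6

/* Stand-in for one mapped BAR. */
struct demo_mem_resource {
	void    *addr; /* where the BAR is mapped, NULL if unmapped */
	uint64_t len;  /* size of the mapping in bytes */
};

/* Stand-in for the fields of a virtio pci capability used here. */
struct demo_cap {
	uint8_t  bar;
	uint32_t offset;
	uint32_t length;
};

/*
 * Return the address of the cfg structure described by cap, or NULL if
 * the BAR index is out of range, the BAR is not mapped, or the
 * structure would not fit inside the mapping.
 */
static void *
demo_get_cfg_addr(const struct demo_mem_resource *res,
		  const struct demo_cap *cap)
{
	assert(res != NULL && cap != NULL);

	if (cap->bar >= DEMO_MAX_BARS)
		return NULL;
	if (res[cap->bar].addr == NULL)
		return NULL;
	if ((uint64_t)cap->offset + cap->length > res[cap->bar].len)
		return NULL;

	return (uint8_t *)res[cap->bar].addr + cap->offset;
}
```

A NULL result from such a helper for any of the four cfg structures
would correspond to the "fall back to legacy" case described above.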

Besides those changes, virtio 1.0 introduces a new status field:
FEATURES_OK, which is set after features negotiation is done.

Last, set the VIRTIO_F_VERSION_1 feature flag.

Signed-off-by: Yuanhan Liu 
---

v2: - re-read status after setting FEATURES_OK to make sure status is
  set correctly.

- Add isr reading and config irq setting support.

- Define some pci macros on our own to avoid depending on
  linux/pci_regs.h, as there should be no such file on non-Linux
  platforms

v3: - invoke rte_eal_pci_unmap_device() at uninit stage

v4: - remove unnecessary inline, and likely/unlikely

- mark "src" arg as const for write_dev_cfg operation
---
 doc/guides/rel_notes/release_2_3.rst |   3 +
 drivers/net/virtio/virtio_ethdev.c   |  25 ++-
 drivers/net/virtio/virtio_ethdev.h   |   3 +-
 drivers/net/virtio/virtio_pci.c  | 335 ++-
 drivers/net/virtio/virtio_pci.h  |  67 +++
 drivers/net/virtio/virtqueue.h   |   2 +
 6 files changed, 430 insertions(+), 5 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst
index 99de186..c390d97 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -4,6 +4,9 @@ DPDK Release 2.3
 New Features
 

+* **Virtio 1.0 support.**
+
+  Enabled virtio 1.0 support for virtio pmd driver.

 Resolved Issues
 ---
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 94e0c4a..deb0382 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -927,7 +927,7 @@ virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
return virtio_send_command(hw->cvq, &ctrl, &len, 1);
 }

-static void
+static int
 virtio_negotiate_features(struct virtio_hw *hw)
 {
uint64_t host_features;
@@ -949,6 +949,22 @@ virtio_negotiate_features(struct virtio_hw *hw)
hw->guest_features = vtpci_negotiate_features(hw, host_features);
PMD_INIT_LOG(DEBUG, "features after negotiate = %"PRIx64,
hw->guest_features);
+
+   if (hw->modern) {
+   if (!vtpci_with_feature(hw, VIRTIO_F_VERSION_1)) {
+   PMD_INIT_LOG(ERR,
+   "VIRTIO_F_VERSION_1 features is not enabled.");
+   return -1;
+   }
+   vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_FEATURES_OK);
+   if (!(vtpci_get_status(hw) & VIRTIO_CONFIG_STATUS_FEATURES_OK)) {
+   PMD_INIT_LOG(ERR,
+   "failed to set FEATURES_OK status!");
+   return -1;
+   }
+   }
+
+   return 0;
 }

 /*
@@ -1032,7 +1048,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

/* Tell the host we've known how to drive the device. */
vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER);
-   virtio_negotiate_features(hw);
+   if (virtio_negotiate_features(hw) < 0)
+   return -1;

/* If host does not support status then disable LSC */
if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS))
@@ -1043,7 +1060,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
rx_func_get(eth_dev);

/* Setting up rx_header size for the device */
-   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF))
+   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF) ||
+   vtpci_with_feature(hw, VIRTIO_F_VERSION_1))
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
else
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
@@ -1159,6 +1177,7 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
rte_intr_cal

[dpdk-dev] [PATCH v4 8/8] virtio: move VIRTIO_READ/WRITE_REG_X into virtio_pci.c

2016-01-15 Thread Yuanhan Liu
virtio_pci.c is the only file that references those macros; move them there.

Signed-off-by: Yuanhan Liu 
---
 drivers/net/virtio/virtio_pci.c | 19 +++
 drivers/net/virtio/virtio_pci.h | 18 --
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index fe8d6a2..a9f179f 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -49,6 +49,25 @@
 #define PCI_CAPABILITY_LIST0x34
 #define PCI_CAP_ID_VNDR0x09

+
+#define VIRTIO_PCI_REG_ADDR(hw, reg) \
+   (unsigned short)((hw)->io_base + (reg))
+
+#define VIRTIO_READ_REG_1(hw, reg) \
+   inb((VIRTIO_PCI_REG_ADDR((hw), (reg))))
+#define VIRTIO_WRITE_REG_1(hw, reg, value) \
+   outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+
+#define VIRTIO_READ_REG_2(hw, reg) \
+   inw((VIRTIO_PCI_REG_ADDR((hw), (reg))))
+#define VIRTIO_WRITE_REG_2(hw, reg, value) \
+   outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+
+#define VIRTIO_READ_REG_4(hw, reg) \
+   inl((VIRTIO_PCI_REG_ADDR((hw), (reg))))
+#define VIRTIO_WRITE_REG_4(hw, reg, value) \
+   outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
+
 static void
 legacy_read_dev_config(struct virtio_hw *hw, uint64_t offset,
   void *dst, int length)
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index 8174245..99572a0 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -318,24 +318,6 @@ outl_p(unsigned int data, unsigned int port)
 }
 #endif

-#define VIRTIO_PCI_REG_ADDR(hw, reg) \
-   (unsigned short)((hw)->io_base + (reg))
-
-#define VIRTIO_READ_REG_1(hw, reg) \
-   inb((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_1(hw, reg, value) \
-   outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
-
-#define VIRTIO_READ_REG_2(hw, reg) \
-   inw((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_2(hw, reg, value) \
-   outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
-
-#define VIRTIO_READ_REG_4(hw, reg) \
-   inl((VIRTIO_PCI_REG_ADDR((hw), (reg))))
-#define VIRTIO_WRITE_REG_4(hw, reg, value) \
-   outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
-
 static inline int
 vtpci_with_feature(struct virtio_hw *hw, uint64_t bit)
 {
-- 
1.9.0



[dpdk-dev] [PATCH v4 06/14] eal: pci: vfio: add rd/wr func for pci bar space

2016-01-15 Thread Yuanhan Liu
On Thu, Jan 14, 2016 at 06:58:29PM +0530, Santosh Shukla wrote:
...
> +int rte_eal_pci_read_bar(const struct rte_pci_device *device,
> +  void *buf, size_t len, off_t offset,
> +  int bar_idx)
> +
> +{
> +#ifdef VFIO_PRESENT
> + const struct rte_intr_handle *intr_handle = &device->intr_handle;
> + return pci_vfio_read_bar(intr_handle, buf, len, offset, bar_idx);
> +#else
> + /* UIO's not applicable */
> + RTE_SET_USED(device);
> + RTE_SET_USED(buf);
> + RTE_SET_USED(len);
> + RTE_SET_USED(offset);
> + RTE_SET_USED(bar_idx);
> + return 0;
> +#endif

Maybe you could do that like what pci_map_device() does:

switch(dev->kdrv) {
case RTE_KDRV_NIC_VFIO:
return pci_vfio_read_bar(..);
break;
default:
RTE_LOG(.., "not supported by driver: %d\n", ..);
break;
}

With that, you could get rid of those ugly RTE_SET_USED

--yliu
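
The suggestion above could look roughly like the sketch below. It is a
stand-alone illustration under stated assumptions, not DPDK code: the
demo_* enum, struct and stub only mimic dev->kdrv and
pci_vfio_read_bar(), and the stub fills the buffer with a dummy byte
instead of reading a real BAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-ins for the kernel driver kinds and the device. */
enum demo_kdrv { DEMO_KDRV_UNKNOWN, DEMO_KDRV_UIO, DEMO_KDRV_VFIO };

struct demo_pci_device {
	enum demo_kdrv kdrv;
};

/* Stub backend: fills buf with a byte derived from offset and bar_idx. */
static int
demo_vfio_read_bar(void *buf, size_t len, off_t offset, int bar_idx)
{
	memset(buf, (int)((offset + bar_idx) & 0xff), len);
	return 0;
}

/*
 * Dispatch on the bound kernel driver at run time, so that no
 * "unused parameter" annotations are needed for unsupported drivers.
 */
static int
demo_pci_read_bar(const struct demo_pci_device *dev, void *buf,
		  size_t len, off_t offset, int bar_idx)
{
	assert(dev != NULL);

	switch (dev->kdrv) {
	case DEMO_KDRV_VFIO:
		return demo_vfio_read_bar(buf, len, offset, bar_idx);
	default:
		fprintf(stderr, "read_bar not supported by driver: %d\n",
			(int)dev->kdrv);
		return -1;
	}
}
```

Compared with an #ifdef, the run-time switch also reports an error for
unsupported drivers instead of silently returning 0.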


[dpdk-dev] [PATCH v4 07/14] virtio: vfio: add api support to rd/wr ioport bar

2016-01-15 Thread Yuanhan Liu
On Thu, Jan 14, 2016 at 06:58:30PM +0530, Santosh Shukla wrote:
> For vfio case - Use pread/pwrite api to access virtio
> ioport space.
> 
> Signed-off-by: Santosh Shukla 
> Signed-off-by: Rizwan Ansari 
> Signed-off-by: Rakesh Krishnamurthy 
> ---
...
> +/* vfio rd/rw virtio apis */
> +static inline void ioport_inb(const struct rte_pci_device *pci_dev,
> +   uint8_t reg, uint8_t *val)

Minor nit: dpdk prefers to put the return type and the function name on
separate lines:

  static inline void
  ioport_inb()
  {

> +{
> + if (rte_eal_pci_read_bar(pci_dev, (uint8_t *)val, sizeof(uint8_t), reg,
  ^^^

Unnecessary cast; and a few more below.

--yliu


[dpdk-dev] [PATCH v4 08/14] virtio: pci: extend virtio pci rw api for vfio interface

2016-01-15 Thread Yuanhan Liu
On Thu, Jan 14, 2016 at 06:58:31PM +0530, Santosh Shukla wrote:
> So far virtio handle rw access for uio / ioport interface, This patch to extend
> the support for vfio interface. For that introducing private struct
> virtio_vfio_dev{
>   - is_vfio
>   - pci_dev
>   };
> Signed-off-by: Santosh Shukla 
...
> +/* For vfio only */
> +struct virtio_vfio_dev {
> + boolis_vfio;/* True: vfio i/f,
> +  * False: not a vfio i/f

Well, this is weird; you are adding a flag to tell whether it's a
vfio device __inside__ a vfio struct.

Back to the topic, this flag does not seem necessary to me: you can
check the pci_dev->kdrv flag.

> +  */
> + struct rte_pci_device *pci_dev; /* vfio dev */

Note that I have already added this field into the virtio_hw struct
in my latest virtio 1.0 pmd patchset.

While I told you before that you should not develop patches based
on my patchset, I guess you can do that now, since it should be
in good shape and close to being merged.

> +};
> +
>  struct virtio_hw {
>   struct virtqueue *cvq;
>   uint32_tio_base;
> @@ -176,6 +186,7 @@ struct virtio_hw {
>   uint8_t use_msix;
>   uint8_t started;
>   uint8_t mac_addr[ETHER_ADDR_LEN];
> + struct virtio_vfio_dev dev;
>  };
>  
>  /*
> @@ -231,20 +242,65 @@ outl_p(unsigned int data, unsigned int port)
>  #define VIRTIO_PCI_REG_ADDR(hw, reg) \
>   (unsigned short)((hw)->io_base + (reg))
>  
> -#define VIRTIO_READ_REG_1(hw, reg) \
> - inb((VIRTIO_PCI_REG_ADDR((hw), (reg))))
> -#define VIRTIO_WRITE_REG_1(hw, reg, value) \
> - outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
> -
> -#define VIRTIO_READ_REG_2(hw, reg) \
> - inw((VIRTIO_PCI_REG_ADDR((hw), (reg))))
> -#define VIRTIO_WRITE_REG_2(hw, reg, value) \
> - outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
> -
> -#define VIRTIO_READ_REG_4(hw, reg) \
> - inl((VIRTIO_PCI_REG_ADDR((hw), (reg))))
> -#define VIRTIO_WRITE_REG_4(hw, reg, value) \
> - outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
> +#define VIRTIO_READ_REG_1(hw, reg)   \
> +({   \
> + uint8_t ret;\
> + struct virtio_vfio_dev *vdev;   \
> + (vdev) = (&(hw)->dev);  \
> + (((vdev)->is_vfio) ?\
> + (ioport_inb(((vdev)->pci_dev), reg, &ret)) :\
> + ((ret) = (inb((VIRTIO_PCI_REG_ADDR((hw), (reg)));   \
> + ret;\
> +})

It becomes unreadable. I'd suggest defining them as inline
functions, and using "if .. else .." instead of "?:".

--yliu
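
As a rough idea of the inline-function form suggested above, here is a
stand-alone sketch. The names are hypothetical and the two backends are
simulated with small arrays; in the real driver they would be the
pread-based vfio accessor and inb() on the I/O port respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NREGS 16

/* Simplified stand-in for the per-device state. */
struct demo_hw {
	int     is_vfio;               /* non-zero: use the vfio path */
	uint8_t vfio_regs[DEMO_NREGS]; /* simulates the vfio BAR */
	uint8_t pio_regs[DEMO_NREGS];  /* simulates the I/O port space */
};

/* Simulated vfio accessor. */
static uint8_t
demo_vfio_inb(const struct demo_hw *hw, uint8_t reg)
{
	return hw->vfio_regs[reg % DEMO_NREGS];
}

/* Simulated port I/O accessor. */
static uint8_t
demo_port_inb(const struct demo_hw *hw, uint8_t reg)
{
	return hw->pio_regs[reg % DEMO_NREGS];
}

/* One-byte register read written as a function with plain if/else. */
static inline uint8_t
demo_read_reg_1(const struct demo_hw *hw, uint8_t reg)
{
	assert(hw != NULL);

	if (hw->is_vfio)
		return demo_vfio_inb(hw, reg);
	else
		return demo_port_inb(hw, reg);
}
```

Besides readability, a function gives the compiler real argument types
to check, which the statement-expression macro does not.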


[dpdk-dev] [PATCH v4 09/14] virtio: ethdev: check for vfio interface

2016-01-15 Thread Yuanhan Liu
On Thu, Jan 14, 2016 at 06:58:32PM +0530, Santosh Shukla wrote:
> Introducing api to check interface type is vfio or not, if interface is vfio
> then update struct virtio_vfio_dev {}.
> 
> Those two apis are:
> - virtio_chk_for_vfio
> - virtio_hw_init_by_vfio
> 
> Signed-off-by: Santosh Shukla 
..
> +/* Init virtio by vfio-way */
> +static int virtio_hw_init_by_vfio(struct virtio_hw *hw,
> +   struct rte_pci_device *pci_dev)
> +{
> + struct virtio_vfio_dev *vdev;
> +
> + vdev = &hw->dev;
> + if (virtio_chk_for_vfio(pci_dev) < 0) {
> + vdev->is_vfio = false;
> + vdev->pci_dev = NULL;
> + return -1;
> + }
> +
> + /* .. So attached interface is vfio */
> + vdev->is_vfio = true;
> + vdev->pci_dev = pci_dev;

In general, I don't like adding yet another "virtio_hw_init_by_xxx".

As suggested in another reply, would checking pci_dev->kdrv be enough?
If so, do it the simple way.

--yliu


[dpdk-dev] [PATCH v4 11/14] config: armv7/v8: Enable RTE_LIBRTE_VIRTIO_PMD

2016-01-15 Thread Yuanhan Liu
On Thu, Jan 14, 2016 at 06:58:34PM +0530, Santosh Shukla wrote:
> Enable RTE_LIBRTE_VIRTIO_PMD for armv7/v8 and setting RTE_VIRTIO_INC_VEC=n.
> Builds successfully for armv7/v8.

I guess you could squash this patch to 2nd patch.

--yliu


[dpdk-dev] [PATCH 0/4] Optimize memcpy for AVX512 platforms

2016-01-15 Thread Wang, Zhihong


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Friday, January 15, 2016 12:49 AM
> To: Wang, Zhihong 
> Cc: dev at dpdk.org; Ananyev, Konstantin ;
> Richardson, Bruce ; Xie, Huawei
> 
> Subject: Re: [PATCH 0/4] Optimize memcpy for AVX512 platforms
> 
> On Thu, 14 Jan 2016 01:13:18 -0500
> Zhihong Wang  wrote:
> 
> > This patch set optimizes DPDK memcpy for AVX512 platforms, to make full
> > utilization of hardware resources and deliver high performance.
> >
> > In current DPDK, memcpy holds a large proportion of execution time in
> > libs like Vhost, especially for large packets, and this patch can bring
> > considerable benefits.
> >
> > The implementation is based on the current DPDK memcpy framework, some
> > background introduction can be found in these threads:
> > http://dpdk.org/ml/archives/dev/2014-November/008158.html
> > http://dpdk.org/ml/archives/dev/2015-January/011800.html
> >
> > Code changes are:
> >
> >   1. Read CPUID to check if AVX512 is supported by CPU
> >
> >   2. Predefine AVX512 macro if AVX512 is enabled by compiler
> >
> >   3. Implement AVX512 memcpy and choose the right implementation based on
> >  predefined macros
> >
> >   4. Decide alignment unit for memcpy perf test based on predefined macros
> >
> > Zhihong Wang (4):
> >   lib/librte_eal: Identify AVX512 CPU flag
> >   mk: Predefine AVX512 macro for compiler
> >   lib/librte_eal: Optimize memcpy for AVX512 platforms
> >   app/test: Adjust alignment unit for memcpy perf test
> >
> >  app/test/test_memcpy_perf.c|   6 +
> >  .../common/include/arch/x86/rte_cpuflags.h |   2 +
> >  .../common/include/arch/x86/rte_memcpy.h   | 247
> -
> >  mk/rte.cpuflags.mk |   4 +
> >  4 files changed, 255 insertions(+), 4 deletions(-)
> >
> 
> This really looks like code that could benefit from Gcc
> function multiversioning. The current cpuflags model is useless/flawed
> in real product deployment


I've tried gcc function multiversioning with a simple add() function
that returns a + b, and a loop calling it millions of times. It turned
out this mechanism adds 17% extra execution time; overall that's a lot
of extra overhead.

Quote the gcc wiki: "GCC takes care of doing the dispatching to call
the right version at runtime". So it loses inlining and adds extra
dispatching overhead.

Also this mechanism works only for C++, right?

I think using predefined macros at compile time is more efficient and
suits DPDK more.
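
To illustrate the compile-time approach, here is a minimal sketch. The names
SKETCH_COPY_UNIT and sketch_copy are made up for this mail and are not the
code in the patch set; __AVX512F__ and __AVX2__ are the macros the compiler
predefines when those instruction sets are enabled for the build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick the copy unit from macros the compiler predefines for the target. */
#if defined(__AVX512F__)
#define SKETCH_COPY_UNIT 64 /* 512-bit registers */
#elif defined(__AVX2__)
#define SKETCH_COPY_UNIT 32 /* 256-bit registers */
#else
#define SKETCH_COPY_UNIT 16 /* 128-bit registers */
#endif

/*
 * Copy n bytes in fixed-size units. The constant-size memcpy() calls give
 * the compiler the chance to inline them as moves of the selected width.
 */
static inline void *
sketch_copy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	assert(dst != NULL && src != NULL);

	while (n >= SKETCH_COPY_UNIT) {
		memcpy(d, s, SKETCH_COPY_UNIT);
		d += SKETCH_COPY_UNIT;
		s += SKETCH_COPY_UNIT;
		n -= SKETCH_COPY_UNIT;
	}
	if (n > 0)
		memcpy(d, s, n); /* tail shorter than one unit */

	return dst;
}
```

The selection happens entirely in the preprocessor, so the function stays
inlinable and there is no dispatch at run time.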

Could you please give an example of when the current CPU flags model
stops working, so I can fix it?



[dpdk-dev] [PATCH 0/4] virtio support for container

2016-01-15 Thread Tan, Jianfeng
Hi Amit,

On 1/14/2016 8:03 PM, Amit Tomer wrote:
> Hello,
>
>> Not necessary. But if you want to use hugepages inside Docker, use -v option
>> to map a hugetlbfs into containers.
> I modified Docker command line in order to make use of Hugetlbfs:
>
> CMD ["/usr/src/dpdk/examples/l2fwd/build/l2fwd", "-c", "0x3", "-n",
> "4","--no-pci", "--socket-mem","512",
> "--vdev=eth_cvio0,queue_num=256,rx=1,tx=1,cq=0,path=/var/run/usvhost",
> "--", "-p", "0x1"]



For this case, please use the --single-file option; without it, many more
than 8 fds are created, which is more than vhost-user sendmsg() can handle.


>
> Then, I run docker :
>
>   docker run -i -t --privileged  -v /dev/hugepages:/dev/hugepages  -v
> /home/ubuntu/backup/usvhost:/var/run/usvhost  l6
>
> But this is what I see:
>
> EAL: Support maximum 128 logical core(s) by configuration.
> EAL: Detected 48 lcore(s)
> EAL: Setting up physically contiguous memory...
> EAL: Failed to find phys addr for 2 MB pages
> PANIC in rte_eal_init():
> Cannot init memory
> 1: [/usr/src/dpdk/examples/l2fwd/build/l2fwd(rte_dump_stack+0x20) [0x48ea78]]

From the log, the failure is because it still cannot open
/proc/self/pagemap, which is strange given that you already specify --privileged.

Thanks,
Jianfeng

>
> This is from Host:
>
> # mount | grep hugetlbfs
> hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
> none on /dev/hugepages type hugetlbfs (rw,relatime)
>
>   #cat /proc/meminfo | grep Huge
> AnonHugePages:548864 kB
> HugePages_Total:4096
> HugePages_Free: 1024
> HugePages_Rsvd:0
> HugePages_Surp:0
> Hugepagesize:   2048 kB
>
> What is it, I'm doing wrong here?
>
> Thanks,
> Amit



[dpdk-dev] [PATCH v4 01/14] virtio: Introduce config RTE_VIRTIO_INC_VECTOR

2016-01-15 Thread Yuanhan Liu
On Thu, Jan 14, 2016 at 06:58:24PM +0530, Santosh Shukla wrote:
> virtio_recv_pkts_vec and other virtio vector friend apis are written for sse/avx
> instructions. For arm64 in particular, virtio vector implementation does not
> exist(todo).
> 
> So virtio pmd driver wont build for targets like i686, arm64.  By making
> RTE_VIRTIO_INC_VECTOR=n, Driver can build for non-sse/avx targets and will work
> in non-vectored virtio mode.

While revisiting this patch, I'm thinking you may squash both patch 2
and patch 11 into this one.

> 
> Signed-off-by: Santosh Shukla 
> ---
>  config/common_linuxapp   |1 +
>  drivers/net/virtio/Makefile  |2 +-
>  drivers/net/virtio/virtio_rxtx.c |7 +++
>  3 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 74bc515..8677697 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -274,6 +274,7 @@ CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DRIVER=n
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
> +CONFIG_RTE_VIRTIO_INC_VECTOR=y
>  
>  #
>  # Compile burst-oriented VMXNET3 PMD driver
> diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
> index 43835ba..25a842d 100644
> --- a/drivers/net/virtio/Makefile
> +++ b/drivers/net/virtio/Makefile
> @@ -50,7 +50,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtqueue.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_pci.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c
> -SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
> +SRCS-$(CONFIG_RTE_VIRTIO_INC_VECTOR) += virtio_rxtx_simple.c
>  
>  # this lib depends upon:
>  DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal lib/librte_ether
> diff --git a/drivers/net/virtio/virtio_rxtx.c 
> b/drivers/net/virtio/virtio_rxtx.c
> index 74b39ef..23be1ff 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -438,7 +438,9 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev,
>  
>   dev->data->rx_queues[queue_idx] = vq;
>  
> +#ifdef RTE_VIRTIO_INC_VECTOR
>   virtio_rxq_vec_setup(vq);
> +#endif

You should put such macros for the declaration in virtio_rxtx.h as well.


And note that you may miss one:


325 /**
326 * Enqueue allocated buffers*
327 ***/
328 if (use_simple_rxtx)
==> 329 error = virtqueue_enqueue_recv_refill_simple(vq, m);
330 else
331 error = virtqueue_enqueue_recv_refill(vq, m);
332 if (error) {
333 rte_pktmbuf_free(m);
334 break;
335 }


virtqueue_enqueue_recv_refill_simple() is defined inside virtio_rxtx_simple.c,
which is built only when CONFIG_RTE_VIRTIO_INC_VECTOR is set. But I see no
such check here.

Note that this will not break the build, as gcc just ignores it: use_simple_rxtx
is 0 by default, so the "if" branch is not compiled at all. Even so,
I think it's better to put an explicit macro check here.
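
Roughly like this, as a sketch only: the sketch_* functions and the int slot
are hypothetical stand-ins for the real refill functions, virtqueue and mbuf.
RTE_VIRTIO_INC_VECTOR is the macro introduced by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real functions take a virtqueue and an mbuf. */
int use_simple_rxtx;

static int
sketch_refill(int *slot)
{
	*slot = 1; /* normal path */
	return 0;
}

#ifdef RTE_VIRTIO_INC_VECTOR
/* Only exists when the vector code is built. */
static int
sketch_refill_simple(int *slot)
{
	*slot = 2; /* vector path */
	return 0;
}
#endif

static int
sketch_enqueue(int *slot)
{
	assert(slot != NULL);

#ifdef RTE_VIRTIO_INC_VECTOR
	/* The call to the vector function is only compiled with the macro. */
	if (use_simple_rxtx)
		return sketch_refill_simple(slot);
#endif
	return sketch_refill(slot);
}
```

With the guard in place, the reference to the vector-only function simply
does not exist in a build without the macro.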

--yliu


[dpdk-dev] [RFC v2 1/2] ethdev: add packet filter flow and new behavior switch to fdir

2016-01-15 Thread Rahul Lakkireddy
Hi Jingjing,

On Thursday, January 01/14/16, 2016 at 17:30:53 -0800, Wu, Jingjing wrote:
> Hi, Rahul
> 
> > This approach seems generic enough to allow any vendor specific data
> > to be passed in filter as well.  However, 80 seems to be too low for
> > multiple flow types that can be combined in the same filter rule.
> > I think size of 256 seems reasonable.
> >
> Yes, 80 is just an example. 
> > Could the same thing be done for action arguments as well? Can we add
> > the same generic info to rte_eth_fdir_action too?
> > 
> > struct rte_eth_fdir_action {
> > uint16_t rx_queue;
> > enum rte_eth_fdir_behavior behavior;
> > enum rte_eth_fdir_status report_status;
> > uint8_t flex_off;
> > +   uint8_t behavior_arg[256];
> > };
> > 
> > This way, we can pass vendor specific action arguments too. What do
> > you think?
> Yes, it also makes sense.
> > Also, now if we take this approach then, I am wondering, that all
> > vendors would need to document their own vendor-specific format of
> > taking filter match and filter action arguments, right?
> > 
> > And probably, even come up with their own example application showing
> > how to apply filters via dpdk on their card?
> Yes, I guess it will be better to doc it or example it. Even currently, 
> different kinds of NIC may need different configuration.
> Or you can add description (how to configure) in your driver's comment log?
> Not sure about the others' opinion?
> 
> Thanks
> Jingjing

Ok.  We will wait for a couple of days to get more opinions from others.
If the above generic approach is agreeable to everyone, then I will post
the patch series using this new approach.

Thanks,
Rahul


[dpdk-dev] [RESEND PATCH] vhost_user: Make sure that memory map is set before attempting address translation

2016-01-15 Thread Pavel Fedin
 Hello!

> If this is the case, i am wondering whether we should include
> "malfunctioning clients" in commit message. It triggers me to think if
> there are existing buggy implementations.

 Well... Can you suggest how to rephrase it? Maybe "if a client is
malfunctioning it can..."? I lack imagination, really, and to tell
the truth I don't care that much about the exact phrasing; I'm OK with
anything.

> Anyway, check is OK.

Kind regards,
Pavel Fedin
Senior Engineer
Samsung Electronics Research center Russia




[dpdk-dev] [PATCH v2 1/3] cmdline: increase command line buffer

2016-01-15 Thread Nélio Laranjeiro
On Tue, Jan 12, 2016 at 02:46:07PM +0200, Panu Matilainen wrote:
> On 01/12/2016 12:49 PM, Nelio Laranjeiro wrote:
> >Allow long command lines in testpmd (like flow director with IPv6, ...).
> >
> >Signed-off-by: John McNamara 
> >Signed-off-by: Nelio Laranjeiro 
> >---
> >  doc/guides/rel_notes/deprecation.rst | 5 -
> >  lib/librte_cmdline/cmdline_rdline.h  | 2 +-
> >  2 files changed, 1 insertion(+), 6 deletions(-)
> >
> >diff --git a/doc/guides/rel_notes/deprecation.rst 
> >b/doc/guides/rel_notes/deprecation.rst
> >index e94d4a2..9cb288c 100644
> >--- a/doc/guides/rel_notes/deprecation.rst
> >+++ b/doc/guides/rel_notes/deprecation.rst
> >@@ -44,8 +44,3 @@ Deprecation Notices
> >and table action handlers will be updated:
> >the pipeline parameter will be added, the packets mask parameter will be
> >either removed (for input port action handler) or made input-only.
> >-
> >-* ABI changes are planned in cmdline buffer size to allow the use of long
> >-  commands (such as RETA update in testpmd).  This should impact
> >-  CMDLINE_PARSE_RESULT_BUFSIZE, STR_TOKEN_SIZE and RDLINE_BUF_SIZE.
> >-  It should be integrated in release 2.3.
> >diff --git a/lib/librte_cmdline/cmdline_rdline.h 
> >b/lib/librte_cmdline/cmdline_rdline.h
> >index b9aad9b..72e2dad 100644
> >--- a/lib/librte_cmdline/cmdline_rdline.h
> >+++ b/lib/librte_cmdline/cmdline_rdline.h
> >@@ -93,7 +93,7 @@ extern "C" {
> >  #endif
> >
> >  /* configuration */
> >-#define RDLINE_BUF_SIZE 256
> >+#define RDLINE_BUF_SIZE 512
> >  #define RDLINE_PROMPT_SIZE  32
> >  #define RDLINE_VT100_BUF_SIZE  8
> >  #define RDLINE_HISTORY_BUF_SIZE BUFSIZ
> 
> Having to break a library ABI for a change like this is a bit ridiculous.

Sure, but John McNamara needed it to handle flow director with IPv6[1].

For my part, I needed it to manipulate the RETA table, but as I
wrote in the cover letter, it ends up breaking other commands.
Olivier Matz has proposed another way to handle long command lines[2];
it could be a good idea to go in this direction.

For the RETA situation, we already discussed a new API, but for now I
do not have time for it (and as it is another ABI breakage it could only
be done for 16.07 or 2.4)[3].

If this patch is no longer needed we can just drop it; for that I would
like to have John's point of view.

> 
> I didn't try it so could be wrong, but based on a quick look, struct rdline
> could easily be made opaque to consumers by just adding functions for
> allocating and freeing it.
> 
>   - Panu -
> 

[1] http://dpdk.org/ml/archives/dev/2015-November/027643.html
[2] http://dpdk.org/ml/archives/dev/2015-November/028557.html
[3] http://dpdk.org/ml/archives/dev/2015-October/025294.html

-- 
Nélio Laranjeiro
6WIND


[dpdk-dev] [PATCH v4 8/8] virtio: move VIRTIO_READ/WRITE_REG_X into virtio_pci.c

2016-01-15 Thread Xu, Qian Q
Tested-by: Qian Xu 

- Test Commit: ad66a85dce9a707748cb4d9592c13022ae77c067
- OS/Kernel: Fedora 21/4.1.13
- GCC: gcc (GCC) 4.9.2 20141101 (Red Hat 4.9.2-1)
- CPU: Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10
- NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
- Target: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
- Total 2 cases, 2 passed, 0 failed. 

Test Case1: test_func_vhost_user_virtio1.0-pmd with different txqflags 
==

Note: For virtio 1.0 usage, we need a qemu version >2.4, such as 2.4.1 or
2.5.0.

1. Launch the vhost sample with the command below. --socket-mem sets the
memory for the vhost sample; make sure the socket where the PCI port is
located has memory. In our case, the PCI BDF is 81:00.0, so we need to assign
memory to socket 1::

taskset -c 18-20 ./examples/vhost/build/vhost-switch -c 0x1c -n 4 \
 --huge-dir /mnt/huge --socket-mem 0,2048 -- -p 1 --mergeable 0 --zero-copy 0 \
 --vm2vm 0

2. Start the VM with 1 virtio device. Note: we need to add
"disable-modern=false" to enable virtio 1.0.

taskset -c 22-23 \
/root/qemu-versions/qemu-2.5.0/x86_64-softmmu/qemu-system-x86_64 -name us-vhost-vm1 \
 -cpu host -enable-kvm -m 2048 -object memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on -numa node,memdev=mem -mem-prealloc \
 -smp cores=2,sockets=1 -drive file=/home/img/vm1.img  \
 -chardev socket,id=char0,path=/home/qxu10/virtio-1.0/dpdk/vhost-net -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
 -device virtio-net-pci,mac=52:54:00:00:00:01,netdev=mynet1,disable-modern=false \
 -netdev tap,id=ipvm1,ifname=tap3,script=/etc/qemu-ifup -device rtl8139,netdev=ipvm1,id=net0,mac=00:00:00:00:10:01 -nographic


3. In the VM, set "CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_INIT=y" in the config file
common_linuxapp, then run dpdk testpmd in the VM::

 .//tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 

 .//x86_64-native-linuxapp-gcc/app/test-pmd/testpmd -c 0x3 -n 4 -- -i --txqflags 0x0f00 --disable-hw-vlan 

 $ >set fwd mac

 $ >start tx_first

We expect output similar to the below, showing that a modern virtio pci
device is detected.

PMD: virtio_read_caps(): [98] skipping non VNDR cap id: 11
PMD: virtio_read_caps(): [84] cfg type: 5, bar: 0, offset: , len: 0
PMD: virtio_read_caps(): [70] cfg type: 2, bar: 4, offset: 3000, len: 4194304
PMD: virtio_read_caps(): [60] cfg type: 4, bar: 4, offset: 2000, len: 4096
PMD: virtio_read_caps(): [50] cfg type: 3, bar: 4, offset: 1000, len: 4096
PMD: virtio_read_caps(): [40] cfg type: 1, bar: 4, offset: , len: 4096
PMD: virtio_read_caps(): found modern virtio pci device.
PMD: virtio_read_caps(): common cfg mapped at: 0x7f2c61a83000
PMD: virtio_read_caps(): device cfg mapped at: 0x7f2c61a85000
PMD: virtio_read_caps(): isr cfg mapped at: 0x7f2c61a84000
PMD: virtio_read_caps(): notify base: 0x7f2c61a86000, notify off multiplier: 4096
PMD: vtpci_init(): modern virtio pci detected.


4. Send traffic to virtio1 (MAC1=52:54:00:00:00:01) with VLAN ID=1000. Check
that virtio packets can be received and transmitted, and that the TX packet
size is the same as the RX packet size.

5. Also run dpdk testpmd in the VM with txqflags=0xf01 to exercise the virtio
pmd optimization::

 .//tools/dpdk_nic_bind.py --bind igb_uio 00:03.0

 .//x86_64-native-linuxapp-gcc/app/test-pmd/testpmd -c 0x3 -n 4 -- -i --txqflags=0x0f01 --disable-hw-vlan 

 $ >set fwd mac

 $ >start tx_first

6. Send traffic to virtio1 (MAC1=52:54:00:00:00:01) with VLAN ID=1000. Check
that virtio packets can be received and transmitted, that the TX packet size
is the same as the RX packet size, and that the packet content is correct.


Test Case2: test_func_vhost_user_one-vm-virtio1.0-one-vm-virtio0.95
===

1. Launch the vhost sample with the command below. --socket-mem sets the
memory for the vhost sample; make sure the socket where the PCI port is
located has memory. In our case, the PCI BDF is 81:00.0, so we need to assign
memory to socket 1::

taskset -c 18-20 ./examples/vhost/build/vhost-switch -c 0x1c -n 4 \
 --huge-dir /mnt/huge --socket-mem 0,2048 -- -p 1 --mergeable 0 --zero-copy 0 \
 --vm2vm 1

2. Start VM1 with 1 virtio device. Note: we need to add
"disable-modern=false" to enable virtio 1.0.

taskset -c 22-23 \
/root/qemu-versions/qemu-2.5.0/x86_64-softmmu/qemu-system-x86_64 -name us-vhost-vm1 \
 -cpu host -enable-kvm -m 2048 -object memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on -numa node,memdev=mem -mem-prealloc \
 -smp cores=2,sockets=1 -drive file=/home/img/vm1.img  \
 -chardev socket,id=char0,path=/home/qxu10/virtio-1.0/dpdk/vhost-net -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
 -device 
virtio

[dpdk-dev] [PATCH v2 1/3] cmdline: increase command line buffer

2016-01-15 Thread Panu Matilainen
On 01/15/2016 10:44 AM, Nélio Laranjeiro wrote:
> On Tue, Jan 12, 2016 at 02:46:07PM +0200, Panu Matilainen wrote:
>> On 01/12/2016 12:49 PM, Nelio Laranjeiro wrote:
>>> Allow long command lines in testpmd (like flow director with IPv6, ...).
>>>
>>> Signed-off-by: John McNamara 
>>> Signed-off-by: Nelio Laranjeiro 
>>> ---
>>>   doc/guides/rel_notes/deprecation.rst | 5 -
>>>   lib/librte_cmdline/cmdline_rdline.h  | 2 +-
>>>   2 files changed, 1 insertion(+), 6 deletions(-)
>>>
>>> diff --git a/doc/guides/rel_notes/deprecation.rst 
>>> b/doc/guides/rel_notes/deprecation.rst
>>> index e94d4a2..9cb288c 100644
>>> --- a/doc/guides/rel_notes/deprecation.rst
>>> +++ b/doc/guides/rel_notes/deprecation.rst
>>> @@ -44,8 +44,3 @@ Deprecation Notices
>>> and table action handlers will be updated:
>>> the pipeline parameter will be added, the packets mask parameter will be
>>> either removed (for input port action handler) or made input-only.
>>> -
>>> -* ABI changes are planned in cmdline buffer size to allow the use of long
>>> -  commands (such as RETA update in testpmd).  This should impact
>>> -  CMDLINE_PARSE_RESULT_BUFSIZE, STR_TOKEN_SIZE and RDLINE_BUF_SIZE.
>>> -  It should be integrated in release 2.3.
>>> diff --git a/lib/librte_cmdline/cmdline_rdline.h 
>>> b/lib/librte_cmdline/cmdline_rdline.h
>>> index b9aad9b..72e2dad 100644
>>> --- a/lib/librte_cmdline/cmdline_rdline.h
>>> +++ b/lib/librte_cmdline/cmdline_rdline.h
>>> @@ -93,7 +93,7 @@ extern "C" {
>>>   #endif
>>>
>>>   /* configuration */
>>> -#define RDLINE_BUF_SIZE 256
>>> +#define RDLINE_BUF_SIZE 512
>>>   #define RDLINE_PROMPT_SIZE  32
>>>   #define RDLINE_VT100_BUF_SIZE  8
>>>   #define RDLINE_HISTORY_BUF_SIZE BUFSIZ
>>
>> Having to break a library ABI for a change like this is a bit ridiculous.
>
> Sure, but John McNamara needed it to handle flow director with IPv6[1].
>
> For my part, I was needing it to manipulate the RETA table, but as I
> wrote in the cover letter, it ends by breaking other commands.
> Olivier Matz, has proposed another way to handle long commands lines[2],
> it could be a good idea to go on this direction.
>
> For RETA situation, we already discussed on a new API, but for now, I
> do not have time for it (and as it is another ABI breakage it could only
> be done for 16.07 or 2.4)[3].
>
> If this patch is no more needed we can just drop it, for that I would
> like to have the point of view from John.

Note that I was not objecting to the patch as such, I can easily see 256 
characters not being enough for commandline buffer.

I was merely noting that having to break an ABI to increase an 
effectively internal buffer size is a sign of a, um, less-than-optimal 
library design.
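
As a sketch of the kind of design I mean, with the struct kept opaque and
only allocate/free/accessor functions exposed. All sketch_* names are made up
for illustration and this is not a proposed librte_cmdline API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* --- what a public header would expose (hypothetical names) --- */
struct sketch_rdline; /* opaque: layout and size are private */

struct sketch_rdline *sketch_rdline_new(void);
void sketch_rdline_free(struct sketch_rdline *rdl);
int sketch_rdline_set(struct sketch_rdline *rdl, const char *line);
const char *sketch_rdline_get(const struct sketch_rdline *rdl);

/* --- private to the library: can change without touching callers --- */
#define SKETCH_BUF_SIZE 512

struct sketch_rdline {
	char buf[SKETCH_BUF_SIZE];
};

struct sketch_rdline *
sketch_rdline_new(void)
{
	return calloc(1, sizeof(struct sketch_rdline));
}

void
sketch_rdline_free(struct sketch_rdline *rdl)
{
	free(rdl);
}

/* Store a line; returns 0 on success, -1 if it does not fit. */
int
sketch_rdline_set(struct sketch_rdline *rdl, const char *line)
{
	size_t len;

	assert(rdl != NULL && line != NULL);
	len = strlen(line);
	if (len >= sizeof(rdl->buf))
		return -1;
	memcpy(rdl->buf, line, len + 1);
	return 0;
}

const char *
sketch_rdline_get(const struct sketch_rdline *rdl)
{
	return rdl->buf;
}
```

Since callers never see sizeof(struct sketch_rdline), the buffer size could
go from 256 to 512 (or anything else) without changing the ABI.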

Apologies if I wasn't clear about that,

- Panu -




[dpdk-dev] [PATCH v2] app/testpmd Fix max_socket detection

2016-01-15 Thread Bruce Richardson
On Fri, Jan 15, 2016 at 01:01:04AM +, Stephen Hurd wrote:
> Found it...
> http://www.dpdk.org/ml/archives/dev/2015-December/030564.html
> 
> 
> -- Stephen Hurd
> 

Thanks for that. For some reason I missed the rest of the email thread from 
earlier.

/Bruce

> 
> -Original Message-
> From: Bruce Richardson [mailto:bruce.richardson at intel.com] 
> Sent: Thursday, January 14, 2016 5:44 AM
> To: Stephen Hurd
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] app/testpmd Fix max_socket detection
> 
> On Wed, Jan 13, 2016 at 02:23:36PM -0800, Stephen Hurd wrote:
> > Previously, max_socket was set to the highest numbered socket with
> > an enabled lcore.  The intent is to set it to the highest socket
> > regardless of it being enabled.
> > 
> 
> Can you clarify why this changes is necessary? Is it causing a bug somewhere?
> 
> thanks,
> /Bruce
> 
> 


[dpdk-dev] [PATCH 00/29] i40e base driver update

2016-01-15 Thread Bruce Richardson
On Fri, Jan 15, 2016 at 10:40:24AM +0800, Helin Zhang wrote:
> i40e base driver is updated, to support new X722 device IDs, and
> use rx control AQ commands to read/write rx control registers.
> Of course, fixes and enhancements are added as listed below.
> 
> Helin Zhang (29):
>   i40e/base: use explicit cast from u16 to u8
>   i40e/base: Acquire NVM, before issuing an AQ read nvm command
>   i40e/base: add hw flag for doing the SRCTL access using AQ for X722
>   i40e/base: add changes in nvm read to support X722
>   i40e/base: Limit DCB FW version checks to XL710/X710 devices
>   i40e/base: check for stopped admin queue
>   i40e/base: set aq count after memory allocation
>   i40e/base: clean event descriptor before use
>   i40e/base: add new device IDs and delete deprecated one
>   i40e/base: fix up recent proxy and wol bits for X722_SUPPORT
>   i40e/base: define function capabilities in only one place
>   i40e/base: Fix for PHY NVM interaction problem
>   i40e/base: set shared bit for multicast filters
>   i40e/base: add APIs to Add/remove port mirroring rules
>   i40e/base: add VEB stat control and remove L2 cloud filter
>   i40e/base: implement the API function for aq_set_switch_config
>   i40e/base: Add functions to blink led on Coppervale PHY
>   i40e/base: When in promisc mode apply promisc mode to Tx Traffic as
> well
>   i40e/base: Increase timeout when checking GLGEN_RSTAT_DEVSTATE bit
>   i40e/base: Save off VSI resource count when updating VSI
>   i40e/base: coding style fixes
>   i40e/base: use FW to read/write rx control registers
>   i40e/base: expose some registers to program parser, FD and RSS logic
>   i40e/base: Add a Virtchnl offload for RSS PCTYPE V2
>   i40e/base: add AQ thermal sensor control struct
>   i40e/base: add/update structure and macro definitions
>   i40e: add base driver release info
>   i40e: add/remove new device IDs
>   i40e: use rx control function for rx control registers

Couple of minor nits looking through the subject list above. 
* the promiscuous mode fix has too long a title, so please shorten (maybe drop
the "when in promisc mode" bit)
* some messages start with a capital letter, others not. They should be
consistent, and the standard is to not capitalize.

/Bruce


[dpdk-dev] [PATCH v4 09/14] virtio: ethdev: check for vfio interface

2016-01-15 Thread Santosh Shukla
On Fri, Jan 15, 2016 at 12:05 PM, Yuanhan Liu
 wrote:
> On Thu, Jan 14, 2016 at 06:58:32PM +0530, Santosh Shukla wrote:
>> Introducing api to check interface type is vfio or not, if interface is vfio
>> then update struct virtio_vfio_dev {}.
>>
>> Those two apis are:
>> - virtio_chk_for_vfio
>> - virtio_hw_init_by_vfio
>>
>> Signed-off-by: Santosh Shukla 
> ..
>> +/* Init virtio by vfio-way */
>> +static int virtio_hw_init_by_vfio(struct virtio_hw *hw,
>> +   struct rte_pci_device *pci_dev)
>> +{
>> + struct virtio_vfio_dev *vdev;
>> +
>> + vdev = &hw->dev;
>> + if (virtio_chk_for_vfio(pci_dev) < 0) {
>> + vdev->is_vfio = false;
>> + vdev->pci_dev = NULL;
>> + return -1;
>> + }
>> +
>> + /* .. So attached interface is vfio */
>> + vdev->is_vfio = true;
>> + vdev->pci_dev = pci_dev;
>
> Normally, I don't like the way of adding yet another "virtio_hw_init_by_xxx".
>
> As suggested in another reply, would pci_dev->kdrv checking be enough?
> If so, do it in simple way.
>

No, it won't be enough. Virtio can only work with vfio in _noiommu_
mode, and for that the user needs to preset the -noiommu parameter, so
virtio needs to check that parameter. A pci_dev->kdrv check alone is not
enough, hence it is not used.

> --ylu


[dpdk-dev] [PATCH v4 08/14] virtio: pci: extend virtio pci rw api for vfio interface

2016-01-15 Thread Santosh Shukla
On Fri, Jan 15, 2016 at 11:57 AM, Yuanhan Liu
 wrote:
> On Thu, Jan 14, 2016 at 06:58:31PM +0530, Santosh Shukla wrote:
>> So far virtio handle rw access for uio / ioport interface, This patch to 
>> extend
>> the support for vfio interface. For that introducing private struct
>> virtio_vfio_dev{
>>   - is_vfio
>>   - pci_dev
>>   };
>> Signed-off-by: Santosh Shukla 
> ...
>> +/* For vfio only */
>> +struct virtio_vfio_dev {
>> + boolis_vfio;/* True: vfio i/f,
>> +  * False: not a vfio i/f
>
> Well, this is weird; you are adding a flag to tell whether it's a
> vfio device __inside__ a vfio struct.
>
> Back to the topic, this flag is not necessary to me: you can
> check the pci_dev->kdrv flag.
>

yes, I'll replace is_vfio with pci_dev->kdrv.

>> +  */
>> + struct rte_pci_device *pci_dev; /* vfio dev */
>
> Note that I have already added this field into virtio_hw struct
> at my latest virtio 1.0 pmd patchset.
>
> While I told you before that you should not develop patches based
> on my patcheset, I guess you can do that now. Since it should be
> in good shape and close to be merged.

Okay. Before rebasing my v5 patch on your virtio 1.0 patch, I'd like to
understand which qemu version supports the virtio 1.0 spec.
>
>> +};
>> +
>>  struct virtio_hw {
>>   struct virtqueue *cvq;
>>   uint32_tio_base;
>> @@ -176,6 +186,7 @@ struct virtio_hw {
>>   uint8_t use_msix;
>>   uint8_t started;
>>   uint8_t mac_addr[ETHER_ADDR_LEN];
>> + struct virtio_vfio_dev dev;
>>  };
>>
>>  /*
>> @@ -231,20 +242,65 @@ outl_p(unsigned int data, unsigned int port)
>>  #define VIRTIO_PCI_REG_ADDR(hw, reg) \
>>   (unsigned short)((hw)->io_base + (reg))
>>
>> -#define VIRTIO_READ_REG_1(hw, reg) \
>> - inb((VIRTIO_PCI_REG_ADDR((hw), (reg
>> -#define VIRTIO_WRITE_REG_1(hw, reg, value) \
>> - outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg
>> -
>> -#define VIRTIO_READ_REG_2(hw, reg) \
>> - inw((VIRTIO_PCI_REG_ADDR((hw), (reg
>> -#define VIRTIO_WRITE_REG_2(hw, reg, value) \
>> - outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg
>> -
>> -#define VIRTIO_READ_REG_4(hw, reg) \
>> - inl((VIRTIO_PCI_REG_ADDR((hw), (reg
>> -#define VIRTIO_WRITE_REG_4(hw, reg, value) \
>> - outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg
>> +#define VIRTIO_READ_REG_1(hw, reg)   \
>> +({   \
>> + uint8_t ret;\
>> + struct virtio_vfio_dev *vdev;   \
>> + (vdev) = (&(hw)->dev);  \
>> + (((vdev)->is_vfio) ?\
>> + (ioport_inb(((vdev)->pci_dev), reg, &ret)) :\
>> + ((ret) = (inb((VIRTIO_PCI_REG_ADDR((hw), (reg)));   \
>> + ret;\
>> +})
>
> It becomes unreadable. I'd suggest to define them as iniline
> functions, and use "if .. else .." instead of "?:".
>
> --yliu


[dpdk-dev] [PATCH v4 11/14] config: armv7/v8: Enable RTE_LIBRTE_VIRTIO_PMD

2016-01-15 Thread Santosh Shukla
On Fri, Jan 15, 2016 at 12:07 PM, Yuanhan Liu
 wrote:
> On Thu, Jan 14, 2016 at 06:58:34PM +0530, Santosh Shukla wrote:
>> Enable RTE_LIBRTE_VIRTIO_PMD for armv7/v8 and setting RTE_VIRTIO_INC_VEC=n.
>> Builds successfully for armv7/v8.
>
> I guess you could squash this patch to 2nd patch.
>

Yes.
> --yliu


[dpdk-dev] [PATCH v2 00/12] Add API to get packet type info

2016-01-15 Thread Jianfeng Tan
A new ether API rte_eth_dev_get_ptype_info() is added to query what
packet type information will be provided by the current pmd driver of the
specified port.

To achieve this, a new function pointer, dev_ptype_info_get, is added
into struct eth_dev_ops. A device that does not implement it will not
provide any ptype info.

v2:
  - Move ptype_mask filter function from each PMD into ether layer.
  - Add ixgbe vPMD's ptype info.
  - Fix code style issues.
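
For reference, the mask filtering done in the ether layer can be sketched as
below. This is a standalone illustration, not the patch code: the SKETCH_*
values and sketch_filter_ptypes() are made up here (the real RTE_PTYPE_*
definitions live in rte_mbuf.h), and returning the number of matching entries
is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; see rte_mbuf.h for the real RTE_PTYPE_* ones. */
#define SKETCH_PTYPE_L2_ETHER 0x00000001u
#define SKETCH_PTYPE_L3_IPV4  0x00000010u
#define SKETCH_PTYPE_L3_IPV6  0x00000040u
#define SKETCH_PTYPE_L4_TCP   0x00000100u
#define SKETCH_PTYPE_L3_MASK  0x000000f0u

/*
 * Copy into ptypes[] the entries of all_ptypes[] selected by ptype_mask,
 * writing at most num entries. Returns how many entries matched in total,
 * which may be larger than num.
 */
static int
sketch_filter_ptypes(const uint32_t all_ptypes[], int total,
		     uint32_t ptype_mask, uint32_t ptypes[], int num)
{
	int i, matched = 0;

	assert(total >= 0 && num >= 0);

	for (i = 0; i < total; i++) {
		if ((all_ptypes[i] & ptype_mask) == 0)
			continue; /* caller is not interested in this one */
		if (matched < num)
			ptypes[matched] = all_ptypes[i];
		matched++;
	}
	return matched;
}
```

A caller interested only in L3 types would pass the L3 mask and compare the
return value with the size of its array to see whether anything was cut off.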

Jianfeng Tan (12):
  ethdev: add API to query packet type filling info
  pmd/cxgbe: add dev_ptype_info_get implementation
  pmd/e1000: add dev_ptype_info_get implementation
  pmd/enic: add dev_ptype_info_get implementation
  pmd/fm10k: add dev_ptype_info_get implementation
  pmd/i40e: add dev_ptype_info_get implementation
  pmd/ixgbe: add dev_ptype_info_get implementation
  pmd/mlx4: add dev_ptype_info_get implementation
  pmd/mlx5: add dev_ptype_info_get implementation
  pmd/nfp: add dev_ptype_info_get implementation
  pmd/vmxnet3: add dev_ptype_info_get implementation
  examples/l3fwd: add option to parse ptype

 drivers/net/cxgbe/cxgbe_ethdev.c | 14 ++
 drivers/net/e1000/igb_ethdev.c   | 30 
 drivers/net/enic/enic_ethdev.c   | 17 +++
 drivers/net/fm10k/fm10k_ethdev.c | 43 +
 drivers/net/fm10k/fm10k_rxtx.c   |  5 ++
 drivers/net/fm10k/fm10k_rxtx_vec.c   |  5 ++
 drivers/net/i40e/i40e_ethdev.c   |  1 +
 drivers/net/i40e/i40e_ethdev_vf.c|  1 +
 drivers/net/i40e/i40e_rxtx.c | 46 +-
 drivers/net/i40e/i40e_rxtx.h |  1 +
 drivers/net/ixgbe/ixgbe_ethdev.c | 38 +++
 drivers/net/ixgbe/ixgbe_ethdev.h |  2 +
 drivers/net/ixgbe/ixgbe_rxtx.c   |  5 +-
 drivers/net/mlx4/mlx4.c  | 20 
 drivers/net/mlx5/mlx5.c  |  1 +
 drivers/net/mlx5/mlx5.h  |  1 +
 drivers/net/mlx5/mlx5_ethdev.c   | 18 +++
 drivers/net/mlx5/mlx5_rxtx.c |  2 +
 drivers/net/nfp/nfp_net.c| 17 +++
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 16 +++
 examples/l3fwd/main.c| 91 
 lib/librte_ether/rte_ethdev.c| 20 
 lib/librte_ether/rte_ethdev.h| 27 +++
 lib/librte_mbuf/rte_mbuf.h   |  6 +++
 24 files changed, 425 insertions(+), 2 deletions(-)

-- 
2.1.4



[dpdk-dev] [PATCH v2 01/12] ethdev: add API to query packet type filling info

2016-01-15 Thread Jianfeng Tan
Add a new API rte_eth_dev_get_ptype_info to query whether/what packet types will
be filled by the given pmd rx burst function.

Signed-off-by: Jianfeng Tan 
---
 lib/librte_ether/rte_ethdev.c | 20 
 lib/librte_ether/rte_ethdev.h | 27 +++
 lib/librte_mbuf/rte_mbuf.h|  6 ++
 3 files changed, 53 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ed971b4..cd34f46 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1614,6 +1614,26 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
rte_eth_dev_info *dev_info)
dev_info->driver_name = dev->data->drv_name;
 }

+int
+rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
+  uint32_t ptypes[], int num)
+{
+   int ret, i, j;
+   struct rte_eth_dev *dev;
+   uint32_t all_ptypes[RTE_PTYPE_MAX_NUM];
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+   dev = &rte_eth_devices[port_id];
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, -ENOTSUP);
+   ret = (*dev->dev_ops->dev_ptype_info_get)(dev, all_ptypes);
+
+   for (i = 0, j = 0; i < ret && j < num; ++i)
+   if (all_ptypes[i] & ptype_mask)
+   ptypes[j++] = all_ptypes[i];
+
+   return ret;
+}
+
 void
 rte_eth_macaddr_get(uint8_t port_id, struct ether_addr *mac_addr)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index bada8ad..42f8188 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1021,6 +1021,10 @@ typedef void (*eth_dev_infos_get_t)(struct rte_eth_dev 
*dev,
struct rte_eth_dev_info *dev_info);
 /**< @internal Get specific informations of an Ethernet device. */

+typedef int (*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev,
+   uint32_t ptypes[]);
+/**< @internal Get ptype info of eth_rx_burst_t. */
+
 typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev,
uint16_t queue_id);
 /**< @internal Start rx and tx of a queue of an Ethernet device. */
@@ -1347,6 +1351,7 @@ struct eth_dev_ops {
eth_queue_stats_mapping_set_t queue_stats_mapping_set;
/**< Configure per queue stat counter mapping. */
eth_dev_infos_get_t        dev_infos_get; /**< Get device info. */
+   eth_dev_ptype_info_get_t   dev_ptype_info_get; /**< Get ptype info. */
mtu_set_t  mtu_set; /**< Set MTU. */
vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN Setup. */
vlan_tpid_set_t            vlan_tpid_set;  /**< Outer VLAN TPID Setup. */
@@ -2273,6 +2278,28 @@ extern void rte_eth_dev_info_get(uint8_t port_id,
 struct rte_eth_dev_info *dev_info);

 /**
+ * Retrieve the packet types filled by the Rx burst function of a device.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param ptype_mask
+ *   A mask hinting at which kinds of packet types the caller is interested in.
+ * @param ptypes
+ *   An array to be filled with the supported packet types.
+ * @param num
+ *   The size of the ptypes array.
+ * @return
+ *   - (>0) Number of ptypes supported. This may be greater than num, so the
+ * caller needs to check for that.
+ *   - (0 or -ENOTSUP) if the PMD does not fill the specified ptypes.
+ *   - (-ENODEV) if *port_id* is invalid.
+ */
+extern int rte_eth_dev_get_ptype_info(uint8_t port_id,
+ uint32_t ptype_mask,
+ uint32_t ptypes[],
+ int num);
+
+/**
  * Retrieve the MTU of an Ethernet device.
  *
  * @param port_id
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index f234ac9..d116711 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -667,6 +667,12 @@ extern "C" {
 #define RTE_PTYPE_INNER_L4_MASK 0x0f00

 /**
+  * The total number of RTE_PTYPE_* values, excluding the *_MASK values, is 37
+  * for now; some space is reserved for new ptypes.
+  */
+#define RTE_PTYPE_MAX_NUM  64
+
+/**
  * Check if the (outer) L3 header is IPv4. To avoid comparing IPv4 types one by
  * one, bit 4 is selected to be used for IPv4 only. Then checking bit 4 can
  * determine if it is an IPV4 packet.
-- 
2.1.4
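
The mask filtering done by rte_eth_dev_get_ptype_info() in this patch can be sketched in isolation. This is a minimal stand-alone sketch, not the patch's code: filter_ptypes() and the DEMO_* constants are hypothetical names used only for illustration. It also differs from the patch in one deliberate way: the patch returns the driver's unfiltered total, while this version returns the number of entries that matched the mask, so a caller that sizes a buffer from a count-only call (num = 0) never reads slots that were not filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for RTE_PTYPE_* values; for illustration only. */
#define DEMO_PTYPE_L3_IPV4 0x00000010u
#define DEMO_PTYPE_L3_IPV6 0x00000040u
#define DEMO_PTYPE_L4_TCP  0x00000100u
#define DEMO_PTYPE_L3_MASK 0x000000f0u

/*
 * Copy the entries of all[] that intersect mask into out[], writing at
 * most out_cap entries. Returns the number of matching entries, which may
 * exceed out_cap; out may be NULL when out_cap is 0 (count-only call).
 */
static int
filter_ptypes(const uint32_t *all, int n_all, uint32_t mask,
	      uint32_t *out, int out_cap)
{
	int i, matched = 0;

	assert(out != NULL || out_cap == 0);

	for (i = 0; i < n_all; ++i) {
		if ((all[i] & mask) == 0)
			continue;
		if (matched < out_cap)
			out[matched] = all[i];
		++matched;
	}
	return matched;
}
```

A caller would typically call it once with out = NULL and out_cap = 0 to learn the count, allocate that many entries, and call it again to fill them.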



[dpdk-dev] [PATCH v2 02/12] pmd/cxgbe: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/cxgbe/cxgbe_ethdev.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 97ef152..1699d8e 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -767,6 +767,19 @@ static int cxgbe_flow_ctrl_set(struct rte_eth_dev *eth_dev,
 &pi->link_cfg);
 }

+static int cxgbe_dev_ptype_info_get(struct rte_eth_dev *eth_dev,
+   uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (eth_dev->rx_pkt_burst == cxgbe_recv_pkts) {
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
+   } 
+
+   return num;
+}
+
 static struct eth_dev_ops cxgbe_eth_dev_ops = {
.dev_start  = cxgbe_dev_start,
.dev_stop   = cxgbe_dev_stop,
@@ -777,6 +790,7 @@ static struct eth_dev_ops cxgbe_eth_dev_ops = {
.allmulticast_disable   = cxgbe_dev_allmulticast_disable,
.dev_configure  = cxgbe_dev_configure,
.dev_infos_get  = cxgbe_dev_info_get,
+   .dev_ptype_info_get = cxgbe_dev_ptype_info_get,
.link_update= cxgbe_dev_link_update,
.mtu_set= cxgbe_dev_mtu_set,
.tx_queue_setup = cxgbe_dev_tx_queue_setup,
-- 
2.1.4
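
The per-driver callbacks in this series all follow one pattern: compare the device's currently selected Rx burst function against the functions the driver knows about, and list the packet types that function fills. The following is a toy sketch of that pattern with no DPDK dependency; struct demo_dev, the demo_* functions and the DEMO_* constants are hypothetical stand-ins, not names from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for RTE_PTYPE_* values; for illustration only. */
#define DEMO_PTYPE_L3_IPV4 0x00000010u
#define DEMO_PTYPE_L3_IPV6 0x00000040u

typedef uint16_t (*demo_rx_burst_t)(void *rxq, void **pkts, uint16_t n);

struct demo_dev {
	demo_rx_burst_t rx_pkt_burst; /* currently selected Rx path */
};

/* An Rx path assumed to classify IPv4/IPv6 (body omitted in this sketch). */
static uint16_t
demo_recv_pkts(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;
}

/* An Rx path assumed to perform no classification at all. */
static uint16_t
demo_recv_pkts_raw(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;
}

/* Report which packet types the selected Rx path fills; 0 means none. */
static int
demo_ptype_info_get(const struct demo_dev *dev, uint32_t ptypes[])
{
	int num = 0;

	assert(dev != NULL && ptypes != NULL);

	if (dev->rx_pkt_burst == demo_recv_pkts) {
		ptypes[num++] = DEMO_PTYPE_L3_IPV4;
		ptypes[num++] = DEMO_PTYPE_L3_IPV6;
	}

	return num;
}
```

Because the answer depends on the function pointer, it is only meaningful after the driver has chosen its Rx path.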



[dpdk-dev] [PATCH v2 03/12] pmd/e1000: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/e1000/igb_ethdev.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index d1bbcda..1eb1091 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -103,6 +103,8 @@ static void eth_igb_stats_reset(struct rte_eth_dev *dev);
 static void eth_igb_xstats_reset(struct rte_eth_dev *dev);
 static void eth_igb_infos_get(struct rte_eth_dev *dev,
  struct rte_eth_dev_info *dev_info);
+static int eth_igb_ptype_info_get(struct rte_eth_dev *dev,
+ uint32_t ptypes[]);
 static void eth_igbvf_infos_get(struct rte_eth_dev *dev,
struct rte_eth_dev_info *dev_info);
 static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
@@ -319,6 +321,7 @@ static const struct eth_dev_ops eth_igb_ops = {
.stats_reset  = eth_igb_stats_reset,
.xstats_reset = eth_igb_xstats_reset,
.dev_infos_get= eth_igb_infos_get,
+   .dev_ptype_info_get   = eth_igb_ptype_info_get,
.mtu_set  = eth_igb_mtu_set,
.vlan_filter_set  = eth_igb_vlan_filter_set,
.vlan_tpid_set= eth_igb_vlan_tpid_set,
@@ -376,6 +379,7 @@ static const struct eth_dev_ops igbvf_eth_dev_ops = {
.xstats_reset = eth_igbvf_stats_reset,
.vlan_filter_set  = igbvf_vlan_filter_set,
.dev_infos_get= eth_igbvf_infos_get,
+   .dev_ptype_info_get   = eth_igb_ptype_info_get,
.rx_queue_setup   = eth_igb_rx_queue_setup,
.rx_queue_release = eth_igb_rx_queue_release,
.tx_queue_setup   = eth_igb_tx_queue_setup,
@@ -1910,6 +1914,32 @@ eth_igb_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->tx_desc_lim = tx_desc_lim;
 }

+static int
+eth_igb_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == eth_igb_recv_pkts ||
+   dev->rx_pkt_burst == eth_igb_recv_scattered_pkts) {
+   /* refers to igb_rxd_pkt_info_to_pkt_type() */
+   ptypes[num++] = RTE_PTYPE_L2_ETHER;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6_EXT;
+   ptypes[num++] = RTE_PTYPE_L4_TCP;
+   ptypes[num++] = RTE_PTYPE_L4_UDP;
+   ptypes[num++] = RTE_PTYPE_L4_SCTP;
+   ptypes[num++] = RTE_PTYPE_TUNNEL_IP;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6_EXT;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_TCP;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_UDP;
+   }
+
+   return num;
+}
+
 static void
 eth_igbvf_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 {
-- 
2.1.4



[dpdk-dev] [PATCH v2 04/12] pmd/enic: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/enic/enic_ethdev.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 2a88043..9d3659d 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -54,6 +54,9 @@
 #define ENICPMD_FUNC_TRACE() (void)0
 #endif

+static uint16_t enicpmd_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts);
+
 /*
  * The set of PCI devices this driver supports
  */
@@ -431,6 +434,19 @@ static void enicpmd_dev_info_get(struct rte_eth_dev 
*eth_dev,
DEV_TX_OFFLOAD_TCP_CKSUM;
 }

+static int enicpmd_dev_ptype_info_get(struct rte_eth_dev *dev,
+ uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == enicpmd_recv_pkts) {
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
+   }
+
+   return num;
+}
+
 static void enicpmd_dev_promiscuous_enable(struct rte_eth_dev *eth_dev)
 {
struct enic *enic = pmd_priv(eth_dev);
@@ -566,6 +582,7 @@ static const struct eth_dev_ops enicpmd_eth_dev_ops = {
.stats_reset  = enicpmd_dev_stats_reset,
.queue_stats_mapping_set = NULL,
.dev_infos_get= enicpmd_dev_info_get,
+   .dev_ptype_info_get   = enicpmd_dev_ptype_info_get,
.mtu_set  = NULL,
.vlan_filter_set  = enicpmd_vlan_filter_set,
.vlan_tpid_set= NULL,
-- 
2.1.4



[dpdk-dev] [PATCH v2 05/12] pmd/fm10k: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/fm10k/fm10k_ethdev.c   | 43 ++
 drivers/net/fm10k/fm10k_rxtx.c |  5 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |  5 +
 3 files changed, 53 insertions(+)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index e4aed94..e155cb5 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -1335,6 +1335,48 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
};
 }

+#ifdef RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE
+static int
+fm10k_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == fm10k_recv_pkts ||
+   dev->rx_pkt_burst == fm10k_recv_scattered_pkts) {
+   /* refers to rx_desc_to_ol_flags() */
+   ptypes[num++] = RTE_PTYPE_L2_ETHER;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6_EXT;
+   ptypes[num++] = RTE_PTYPE_L4_TCP;
+   ptypes[num++] = RTE_PTYPE_L4_UDP;
+   } else if (dev->rx_pkt_burst == fm10k_recv_pkts_vec ||
+  dev->rx_pkt_burst == fm10k_recv_scattered_pkts_vec) {
+   /* refers to fm10k_desc_to_pktype_v() */
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6_EXT;
+   ptypes[num++] = RTE_PTYPE_L4_TCP;
+   ptypes[num++] = RTE_PTYPE_L4_UDP;
+   ptypes[num++] = RTE_PTYPE_TUNNEL_GENEVE;
+   ptypes[num++] = RTE_PTYPE_TUNNEL_NVGRE;
+   ptypes[num++] = RTE_PTYPE_TUNNEL_VXLAN;
+   ptypes[num++] = RTE_PTYPE_TUNNEL_GRE;
+   }
+
+   return num;
+}
+#else
+static int
+fm10k_dev_ptype_info_get(struct rte_eth_dev *dev __rte_unused,
+uint32_t ptypes[] __rte_unused)
+{
+   return -ENOTSUP;
+}
+#endif
+
 static int
 fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 {
@@ -2423,6 +2465,7 @@ static const struct eth_dev_ops fm10k_eth_dev_ops = {
.xstats_reset   = fm10k_stats_reset,
.link_update= fm10k_link_update,
.dev_infos_get  = fm10k_dev_infos_get,
+   .dev_ptype_info_get = fm10k_dev_ptype_info_get,
.vlan_filter_set= fm10k_vlan_filter_set,
.vlan_offload_set   = fm10k_vlan_offload_set,
.mac_addr_add   = fm10k_macaddr_add,
diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index e958865..9b2d6f2 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -65,6 +65,11 @@ static inline void dump_rxd(union fm10k_rx_desc *rxd)
 }
 #endif

+/*
+ * @note
+ * When this function is changed, make corresponding change to
+ * fm10k_dev_ptype_info_get()
+ */
 static inline void
 rx_desc_to_ol_flags(struct rte_mbuf *m, const union fm10k_rx_desc *d)
 {
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 2a57eef..6fc22fc 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -109,6 +109,11 @@ fm10k_desc_to_olflags_v(__m128i descs[4], struct rte_mbuf 
**rx_pkts)
rx_pkts[3]->ol_flags = vol.e[3];
 }

+/*
+ * @note
+ * When this function is changed, make corresponding change to
+ * fm10k_dev_ptype_info_get()
+ */
 static inline void
 fm10k_desc_to_pktype_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
 {
-- 
2.1.4
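
Since fm10k returns -ENOTSUP when RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE is not set, a caller has to tell three outcomes apart, following the return values documented in patch 01. A minimal sketch of that decision, assuming the caller falls back to software parsing whenever the device reports nothing; the enum and demo_classify_ptype_ret() are hypothetical names, not part of the series.

```c
#include <assert.h>
#include <errno.h>

enum demo_ptype_support {
	DEMO_PTYPE_FILLED_BY_PMD, /* PMD reports at least one ptype */
	DEMO_PTYPE_NEEDS_SW,      /* nothing reported; parse in software */
	DEMO_PTYPE_BAD_PORT       /* port_id was invalid */
};

/* Map the return value of a ptype query onto what the caller should do. */
static enum demo_ptype_support
demo_classify_ptype_ret(int ret)
{
	if (ret > 0)
		return DEMO_PTYPE_FILLED_BY_PMD;
	if (ret == -ENODEV)
		return DEMO_PTYPE_BAD_PORT;
	/* 0, -ENOTSUP, or any other negative value: treat as unsupported. */
	return DEMO_PTYPE_NEEDS_SW;
}
```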



[dpdk-dev] [PATCH v2 06/12] pmd/i40e: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/i40e/i40e_ethdev.c|  1 +
 drivers/net/i40e/i40e_ethdev_vf.c |  1 +
 drivers/net/i40e/i40e_rxtx.c  | 46 ++-
 drivers/net/i40e/i40e_rxtx.h  |  1 +
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index bf6220d..1f5251b 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -439,6 +439,7 @@ static const struct eth_dev_ops i40e_eth_dev_ops = {
.xstats_reset = i40e_dev_stats_reset,
.queue_stats_mapping_set  = i40e_dev_queue_stats_mapping_set,
.dev_infos_get= i40e_dev_info_get,
+   .dev_ptype_info_get   = i40e_dev_ptype_info_get,
.vlan_filter_set  = i40e_vlan_filter_set,
.vlan_tpid_set= i40e_vlan_tpid_set,
.vlan_offload_set = i40e_vlan_offload_set,
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index 14d2a50..5d924de 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -196,6 +196,7 @@ static const struct eth_dev_ops i40evf_eth_dev_ops = {
.xstats_reset = i40evf_dev_xstats_reset,
.dev_close= i40evf_dev_close,
.dev_infos_get= i40evf_dev_info_get,
+   .dev_ptype_info_get   = i40e_dev_ptype_info_get,
.vlan_filter_set  = i40evf_vlan_filter_set,
.vlan_offload_set = i40evf_vlan_offload_set,
.vlan_pvid_set= i40evf_vlan_pvid_set,
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 39d94ec..15916e2 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -194,7 +194,11 @@ i40e_get_iee15888_flags(struct rte_mbuf *mb, uint64_t 
qword)
 }
 #endif

-/* For each value it means, datasheet of hardware can tell more details */
+/*
+ * For each value it means, datasheet of hardware can tell more details
+ *
+ * @note: fix i40e_dev_ptype_info_get() if any change here.
+ */
 static inline uint32_t
 i40e_rxd_pkt_type_mapping(uint8_t ptype)
 {
@@ -2094,6 +2098,46 @@ i40e_dev_tx_queue_stop(struct rte_eth_dev *dev, uint16_t 
tx_queue_id)
 }

 int
+i40e_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == i40e_recv_pkts ||
+#ifdef RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC
+   dev->rx_pkt_burst == i40e_recv_pkts_bulk_alloc ||
+#endif
+   dev->rx_pkt_burst == i40e_recv_scattered_pkts) {
+   /* refers to i40e_rxd_pkt_type_mapping() */
+   ptypes[num++] = RTE_PTYPE_L2_ETHER;
+   ptypes[num++] = RTE_PTYPE_L2_ETHER_TIMESYNC;
+   ptypes[num++] = RTE_PTYPE_L2_ETHER_LLDP;
+   ptypes[num++] = RTE_PTYPE_L2_ETHER_ARP;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
+   ptypes[num++] = RTE_PTYPE_L4_FRAG;
+   ptypes[num++] = RTE_PTYPE_L4_ICMP;
+   ptypes[num++] = RTE_PTYPE_L4_NONFRAG;
+   ptypes[num++] = RTE_PTYPE_L4_SCTP;
+   ptypes[num++] = RTE_PTYPE_L4_TCP;
+   ptypes[num++] = RTE_PTYPE_L4_UDP;
+   ptypes[num++] = RTE_PTYPE_TUNNEL_GRENAT;
+   ptypes[num++] = RTE_PTYPE_TUNNEL_IP;
+   ptypes[num++] = RTE_PTYPE_INNER_L2_ETHER;
+   ptypes[num++] = RTE_PTYPE_INNER_L2_ETHER_VLAN;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_FRAG;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_ICMP;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_NONFRAG;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_SCTP;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_TCP;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_UDP;
+   }
+
+   return num;
+}
+
+int
 i40e_dev_rx_queue_setup(struct rte_eth_dev *dev,
uint16_t queue_idx,
uint16_t nb_desc,
diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h
index 5c2f5c2..5bb34a5 100644
--- a/drivers/net/i40e/i40e_rxtx.h
+++ b/drivers/net/i40e/i40e_rxtx.h
@@ -200,6 +200,7 @@ int i40e_dev_rx_queue_start(struct rte_eth_dev *dev, 
uint16_t rx_queue_id);
 int i40e_dev_rx_queue_stop(struct rte_eth_dev *dev, uint16_t rx_queue_id);
 int i40e_dev_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id);
 int i40e_dev_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id);
+int i40e_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[]);
 int i40e_dev_rx_queue_setup(struct rte_eth_dev *dev,
uint16_t queue_idx,
uint16_t nb_desc,
-- 
2.1.4



[dpdk-dev] [PATCH v2 07/12] pmd/ixgbe: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 38 ++
 drivers/net/ixgbe/ixgbe_ethdev.h |  2 ++
 drivers/net/ixgbe/ixgbe_rxtx.c   |  5 -
 3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 4c4c6df..b3ae7b2 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -166,6 +166,7 @@ static int ixgbe_dev_queue_stats_mapping_set(struct 
rte_eth_dev *eth_dev,
 uint8_t is_rx);
 static void ixgbe_dev_info_get(struct rte_eth_dev *dev,
   struct rte_eth_dev_info *dev_info);
+static int ixgbe_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t 
ptypes[]);
 static void ixgbevf_dev_info_get(struct rte_eth_dev *dev,
 struct rte_eth_dev_info *dev_info);
 static int ixgbe_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
@@ -428,6 +429,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
.xstats_reset = ixgbe_dev_xstats_reset,
.queue_stats_mapping_set = ixgbe_dev_queue_stats_mapping_set,
.dev_infos_get= ixgbe_dev_info_get,
+   .dev_ptype_info_get   = ixgbe_dev_ptype_info_get,
.mtu_set  = ixgbe_dev_mtu_set,
.vlan_filter_set  = ixgbe_vlan_filter_set,
.vlan_tpid_set= ixgbe_vlan_tpid_set,
@@ -512,6 +514,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
.xstats_reset = ixgbevf_dev_stats_reset,
.dev_close= ixgbevf_dev_close,
.dev_infos_get= ixgbevf_dev_info_get,
+   .dev_ptype_info_get   = ixgbe_dev_ptype_info_get,
.mtu_set  = ixgbevf_dev_set_mtu,
.vlan_filter_set  = ixgbevf_vlan_filter_set,
.vlan_strip_queue_set = ixgbevf_vlan_strip_queue_set,
@@ -2829,6 +2832,41 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL;
 }

+static int
+ixgbe_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == ixgbe_recv_pkts ||
+   dev->rx_pkt_burst == ixgbe_recv_pkts_lro_single_alloc ||
+   dev->rx_pkt_burst == ixgbe_recv_pkts_lro_bulk_alloc ||
+   dev->rx_pkt_burst == ixgbe_recv_pkts_bulk_alloc ||
+   dev->rx_pkt_burst == ixgbe_recv_pkts_vec ||
+   dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec) {
+   /*
+* for non-vec functions,
+* refers to ixgbe_rxd_pkt_info_to_pkt_type();
+* for vec functions,
+* refers to _recv_raw_pkts_vec().
+*/
+   ptypes[num++] = RTE_PTYPE_L2_ETHER;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6_EXT;
+   ptypes[num++] = RTE_PTYPE_L4_SCTP;
+   ptypes[num++] = RTE_PTYPE_L4_TCP;
+   ptypes[num++] = RTE_PTYPE_L4_UDP;
+   ptypes[num++] = RTE_PTYPE_TUNNEL_IP;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6_EXT;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_TCP;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_UDP;
+   }
+
+   return num;
+}
+
 static void
 ixgbevf_dev_info_get(struct rte_eth_dev *dev,
 struct rte_eth_dev_info *dev_info)
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index d26771a..2479830 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -379,6 +379,8 @@ void ixgbevf_dev_rxtx_start(struct rte_eth_dev *dev);
 uint16_t ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);

+uint16_t ixgbe_recv_pkts_bulk_alloc(void *rx_queue, struct rte_mbuf **rx_pkts,
+  uint16_t nb_pkts);
 uint16_t ixgbe_recv_pkts_lro_single_alloc(void *rx_queue,
struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
 uint16_t ixgbe_recv_pkts_lro_bulk_alloc(void *rx_queue,
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 52a263c..d324099 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -899,6 +899,9 @@ end_of_tx:
 #define IXGBE_PACKET_TYPE_MAX   0X80
 #define IXGBE_PACKET_TYPE_MASK  0X7F
 #define IXGBE_PACKET_TYPE_SHIFT 0X04
+/*
+ * @note: fix ixgbe_dev_ptype_info_get() if any change here.
+ */
 static inline uint32_t
 ixgbe_rxd_pkt_info_to_pkt_type(uint16_t pkt_info)
 {
@@ -1247,7 +1250,7 @@ rx_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 }

 /* split requests into chunks of size RTE_PMD_IXGBE_RX_MAX_

[dpdk-dev] [PATCH v2 08/12] pmd/mlx4: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/mlx4/mlx4.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 207bfe2..b906a14 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -2836,6 +2836,8 @@ rxq_cleanup(struct rxq *rxq)
  * @param flags
  *   RX completion flags returned by poll_length_flags().
  *
+ * @note: fix mlx4_dev_ptype_info_get() if any change here.
+ *
  * @return
  *   Packet type for struct rte_mbuf.
  */
@@ -4268,6 +4270,23 @@ mlx4_dev_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *info)
priv_unlock(priv);
 }

+static int
+mlx4_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == mlx4_rx_burst ||
+   dev->rx_pkt_burst == mlx4_rx_burst_sp) {
+   /* refers to rxq_cq_to_pkt_type() */
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
+   }
+
+   return num;
+}
+
 /**
  * DPDK callback to get device statistics.
  *
@@ -4989,6 +5008,7 @@ static const struct eth_dev_ops mlx4_dev_ops = {
.stats_reset = mlx4_stats_reset,
.queue_stats_mapping_set = NULL,
.dev_infos_get = mlx4_dev_infos_get,
+   .dev_ptype_info_get = mlx4_dev_ptype_info_get,
.vlan_filter_set = mlx4_vlan_filter_set,
.vlan_tpid_set = NULL,
.vlan_strip_queue_set = NULL,
-- 
2.1.4



[dpdk-dev] [PATCH v2 09/12] pmd/mlx5: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/mlx5/mlx5.c|  1 +
 drivers/net/mlx5/mlx5.h|  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 18 ++
 drivers/net/mlx5/mlx5_rxtx.c   |  2 ++
 4 files changed, 22 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 821ee0f..e18b1e9 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -153,6 +153,7 @@ static const struct eth_dev_ops mlx5_dev_ops = {
.stats_get = mlx5_stats_get,
.stats_reset = mlx5_stats_reset,
.dev_infos_get = mlx5_dev_infos_get,
+   .dev_ptype_info_get = mlx5_dev_ptype_info_get,
.vlan_filter_set = mlx5_vlan_filter_set,
.rx_queue_setup = mlx5_rx_queue_setup,
.tx_queue_setup = mlx5_tx_queue_setup,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b84d31d..761ee94 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -156,6 +156,7 @@ int priv_get_mtu(struct priv *, uint16_t *);
 int priv_set_flags(struct priv *, unsigned int, unsigned int);
 int mlx5_dev_configure(struct rte_eth_dev *);
 void mlx5_dev_infos_get(struct rte_eth_dev *, struct rte_eth_dev_info *);
+int mlx5_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[]);
 int mlx5_link_update(struct rte_eth_dev *, int);
 int mlx5_dev_set_mtu(struct rte_eth_dev *, uint16_t);
 int mlx5_dev_get_flow_ctrl(struct rte_eth_dev *, struct rte_eth_fc_conf *);
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 1159fa3..b4423b0 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -526,6 +526,24 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *info)
priv_unlock(priv);
 }

+int
+mlx5_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == mlx5_rx_burst ||
+   dev->rx_pkt_burst == mlx5_rx_burst_sp) {
+   /* refers to rxq_cq_to_pkt_type() */
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
+   }
+
+   return num;
+
+}
+
 /**
  * DPDK callback to retrieve physical link information (unlocked version).
  *
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fa5e648..79bdf8d 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -603,6 +603,8 @@ stop:
  * @param flags
  *   RX completion flags returned by poll_length_flags().
  *
+ * @note: fix mlx5_dev_ptype_info_get() if any change here.
+ *
  * @return
  *   Packet type for struct rte_mbuf.
  */
-- 
2.1.4



[dpdk-dev] [PATCH v2 10/12] pmd/nfp: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/nfp/nfp_net.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index bc2089f..b8da71e 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1075,6 +1075,22 @@ nfp_net_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
 #endif
 }

+static int
+nfp_net_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == nfp_net_recv_pkts) {
+   /* refers to nfp_net_set_hash() */
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV4;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6_EXT;
+   ptypes[num++] = RTE_PTYPE_INNER_L4_MASK;
+   }
+
+   return num;
+}
+
 static uint32_t
 nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
 {
@@ -2294,6 +2310,7 @@ static struct eth_dev_ops nfp_net_eth_dev_ops = {
.stats_get  = nfp_net_stats_get,
.stats_reset= nfp_net_stats_reset,
.dev_infos_get  = nfp_net_infos_get,
+   .dev_ptype_info_get = nfp_net_ptype_info_get,
.mtu_set= nfp_net_dev_mtu_set,
.vlan_offload_set   = nfp_net_vlan_offload_set,
.reta_update= nfp_net_reta_update,
-- 
2.1.4



[dpdk-dev] [PATCH v2 11/12] pmd/vmxnet3: add dev_ptype_info_get implementation

2016-01-15 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c 
b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index c363bf6..ed9cd14 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -86,6 +86,8 @@ static void vmxnet3_dev_stats_get(struct rte_eth_dev *dev,
struct rte_eth_stats *stats);
 static void vmxnet3_dev_info_get(struct rte_eth_dev *dev,
struct rte_eth_dev_info *dev_info);
+static int vmxnet3_dev_ptype_info_get(struct rte_eth_dev *dev,
+ uint32_t ptypes[]);
 static int vmxnet3_dev_vlan_filter_set(struct rte_eth_dev *dev,
   uint16_t vid, int on);
 static void vmxnet3_dev_vlan_offload_set(struct rte_eth_dev *dev, int mask);
@@ -118,6 +120,7 @@ static const struct eth_dev_ops vmxnet3_eth_dev_ops = {
.link_update  = vmxnet3_dev_link_update,
.stats_get= vmxnet3_dev_stats_get,
.dev_infos_get= vmxnet3_dev_info_get,
+   .dev_ptype_info_get   = vmxnet3_dev_ptype_info_get,
.vlan_filter_set  = vmxnet3_dev_vlan_filter_set,
.vlan_offload_set = vmxnet3_dev_vlan_offload_set,
.rx_queue_setup   = vmxnet3_dev_rx_queue_setup,
@@ -718,6 +721,19 @@ vmxnet3_dev_info_get(__attribute__((unused))struct 
rte_eth_dev *dev, struct rte_
};
 }

+static int
+vmxnet3_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
+{
+   int num = 0;
+
+   if (dev->rx_pkt_burst == vmxnet3_recv_pkts) {
+   ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT;
+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
+   }
+
+   return num;
+}
+
 /* return 0 means link status changed, -1 means not changed */
 static int
 vmxnet3_dev_link_update(struct rte_eth_dev *dev, __attribute__((unused)) int 
wait_to_complete)
-- 
2.1.4



[dpdk-dev] [PATCH v2 12/12] examples/l3fwd: add option to parse ptype

2016-01-15 Thread Jianfeng Tan
As an example of using ptype info, l3fwd first calls the
rte_eth_dev_get_ptype_info() API to check whether the device and/or PMD will
parse and fill the needed packet types. If not, the newly added option,
--parse-ptype, can be used to parse the packet type in software in an Rx callback.

Signed-off-by: Jianfeng Tan 
---
 examples/l3fwd/main.c | 91 +++
 1 file changed, 91 insertions(+)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 5b0c2dd..3c6e1b7 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -174,6 +174,7 @@ static __m128i val_eth[RTE_MAX_ETHPORTS];
 static uint32_t enabled_port_mask = 0;
 static int promiscuous_on = 0; /**< Ports set in promiscuous mode off by 
default. */
 static int numa_on = 1; /**< NUMA is enabled by default. */
+static int parse_ptype = 0; /**< parse packet type using rx callback */

 #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
 static int ipv6 = 0; /**< ipv6 is false by default. */
@@ -2022,6 +2023,7 @@ parse_eth_dest(const char *optarg)
 #define CMD_LINE_OPT_IPV6 "ipv6"
 #define CMD_LINE_OPT_ENABLE_JUMBO "enable-jumbo"
 #define CMD_LINE_OPT_HASH_ENTRY_NUM "hash-entry-num"
+#define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"

 /* Parse the argument given in the command line of the application */
 static int
@@ -2038,6 +2040,7 @@ parse_args(int argc, char **argv)
{CMD_LINE_OPT_IPV6, 0, 0, 0},
{CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, 0},
{CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, 0},
+   {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
{NULL, 0, 0, 0}
};

@@ -2125,6 +2128,13 @@ parse_args(int argc, char **argv)
}
}
 #endif
+   if (!strncmp(lgopts[option_index].name,
+CMD_LINE_OPT_PARSE_PTYPE,
+sizeof(CMD_LINE_OPT_PARSE_PTYPE))) {
+   printf("soft parse-ptype is enabled \n");
+   parse_ptype = 1;
+   }
+
break;

default:
@@ -2559,6 +2569,74 @@ check_all_ports_link_status(uint8_t port_num, uint32_t 
port_mask)
}
 }

+static int
+check_packet_type_ok(int portid)
+{
+   int i, ret;
+   uint32_t *ptypes;
+   int ptype_l3_ipv4 = 0, ptype_l3_ipv6 = 0;
+
+   ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK, NULL, 0);
+   if (ret <= 0)
+   return 0;
+   ptypes = malloc(sizeof(uint32_t) * ret);
+   rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK,
+ptypes, ret);
+   for (i = 0; i < ret; ++i) {
+   if (ptypes[i] & RTE_PTYPE_L3_IPV4)
+   ptype_l3_ipv4 = 1;
+   if (ptypes[i] & RTE_PTYPE_L3_IPV6)
+   ptype_l3_ipv6 = 1;
+   }
+
+   if (ptype_l3_ipv4 == 0)
+   printf("port %d cannot parse RTE_PTYPE_L3_IPV4\n", portid);
+
+   if (ptype_l3_ipv6 == 0)
+   printf("port %d cannot parse RTE_PTYPE_L3_IPV6\n", portid);
+
+   if (ptype_l3_ipv4 && ptype_l3_ipv6)
+   return 1;
+
+   return 0;
+}
+static inline void
+parse_packet_type(struct rte_mbuf *m)
+{
+   struct ether_hdr *eth_hdr;
+   uint32_t packet_type = 0;
+   uint16_t ethertype;
+
+   eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
+   ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
+   switch (ethertype) {
+   case ETHER_TYPE_IPv4:
+   packet_type |= RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
+   break;
+   case ETHER_TYPE_IPv6:
+   packet_type |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
+   break;
+   default:
+   break;
+   }
+
+   m->packet_type |= packet_type;
+}
+
+static uint16_t
+cb_parse_packet_type(uint8_t port __rte_unused,
+uint16_t queue __rte_unused,
+struct rte_mbuf *pkts[],
+uint16_t nb_pkts,
+uint16_t max_pkts __rte_unused,
+void *user_param __rte_unused)
+{
+   unsigned i;
+
+   for (i = 0; i < nb_pkts; ++i)
+   parse_packet_type(pkts[i]);
+}
+
 int
 main(int argc, char **argv)
 {
@@ -2672,6 +2750,11 @@ main(int argc, char **argv)
rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup: 
err=%d, "
"port=%d\n", ret, portid);

+   if (!check_packet_type_ok(portid) && !parse_ptype)
+   rte_exit(EXIT_FAILURE,
+"port %d cannot parse packet type, 
please add --%s\n",
+portid, CMD_LINE_OPT_PARSE_PTYPE);
+
qconf = &lcore_conf[lcore_id];
qconf->tx_queue_id[portid] = queueid;
queue
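
The software fallback behind --parse-ptype reads the EtherType and sets the L3 packet type. Below is a stand-alone sketch of the same idea over raw frame bytes, with no DPDK dependency: struct demo_pkt, the demo_* functions and the DEMO_* constants are hypothetical stand-ins. Note that cb_parse_packet_type() in the patch is declared to return uint16_t but has no return statement; this sketch assumes an Rx callback should hand back the number of packets it was given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETHER_HDR_LEN   14u
#define DEMO_ETHER_TYPE_IPV4 0x0800u
#define DEMO_ETHER_TYPE_IPV6 0x86DDu
/* Hypothetical stand-ins for RTE_PTYPE_L3_IPV{4,6}_EXT_UNKNOWN. */
#define DEMO_PTYPE_L3_IPV4_EXT_UNKNOWN 0x00000090u
#define DEMO_PTYPE_L3_IPV6_EXT_UNKNOWN 0x000000e0u

struct demo_pkt {
	const uint8_t *data;  /* start of the Ethernet header */
	size_t len;           /* number of valid bytes in data */
	uint32_t packet_type; /* filled in by the parser */
};

/* Set the L3 packet type from the EtherType; leave other frames as-is. */
static void
demo_parse_packet_type(struct demo_pkt *p)
{
	uint16_t ethertype;

	if (p->len < DEMO_ETHER_HDR_LEN)
		return; /* too short to hold an Ethernet header */

	/* EtherType is big-endian and sits at bytes 12..13 of the header. */
	ethertype = (uint16_t)((p->data[12] << 8) | p->data[13]);

	switch (ethertype) {
	case DEMO_ETHER_TYPE_IPV4:
		p->packet_type |= DEMO_PTYPE_L3_IPV4_EXT_UNKNOWN;
		break;
	case DEMO_ETHER_TYPE_IPV6:
		p->packet_type |= DEMO_PTYPE_L3_IPV6_EXT_UNKNOWN;
		break;
	default:
		break;
	}
}

/* Rx-callback-style wrapper: parse every packet, return the burst size. */
static uint16_t
demo_cb_parse_packet_type(struct demo_pkt *pkts[], uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; ++i)
		demo_parse_packet_type(pkts[i]);

	return nb_pkts;
}
```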

[dpdk-dev] [PATCH v4 08/14] virtio: pci: extend virtio pci rw api for vfio interface

2016-01-15 Thread Santosh Shukla
On Fri, Jan 15, 2016 at 6:13 PM, Santosh Shukla  wrote:
> On Fri, Jan 15, 2016 at 11:57 AM, Yuanhan Liu
>  wrote:
>> On Thu, Jan 14, 2016 at 06:58:31PM +0530, Santosh Shukla wrote:
>>> So far virtio handle rw access for uio / ioport interface, This patch to 
>>> extend
>>> the support for vfio interface. For that introducing private struct
>>> virtio_vfio_dev{
>>>   - is_vfio
>>>   - pci_dev
>>>   };
>>> Signed-off-by: Santosh Shukla 
>> ...
>>> +/* For vfio only */
>>> +struct virtio_vfio_dev {
>>> + bool is_vfio; /* True: vfio i/f,
>>> +  * False: not a vfio i/f
>>
>> Well, this is weird; you are adding a flag to tell whether it's a
>> vfio device __inside__ a vfio struct.
>>
>> Back to the topic, this flag is not necessary to me: you can
>> check the pci_dev->kdrv flag.
>>
>
> yes, I'll replace is_vfio with pci_dev->kdrv.
>
>>> +  */
>>> + struct rte_pci_device *pci_dev; /* vfio dev */
>>
>> Note that I have already added this field into virtio_hw struct
>> at my latest virtio 1.0 pmd patchset.
>>
>> While I told you before that you should not develop patches based
>> on my patchset, I guess you can do that now, since it should be
>> in good shape and close to being merged.
>
> Okay. Before rebasing my v5 patch on your virtio 1.0 patch, I'd like to
> understand which qemu versions support the virtio 1.0 spec.

Ignore this; I found the answer in another thread:
qemu version >2.4, such as 2.4.1 or 2.5.0.

>>> +};
>>> +
>>>  struct virtio_hw {
>>>   struct virtqueue *cvq;
>>>   uint32_t io_base;
>>> @@ -176,6 +186,7 @@ struct virtio_hw {
>>>   uint8_t use_msix;
>>>   uint8_t started;
>>>   uint8_t mac_addr[ETHER_ADDR_LEN];
>>> + struct virtio_vfio_dev dev;
>>>  };
>>>
>>>  /*
>>> @@ -231,20 +242,65 @@ outl_p(unsigned int data, unsigned int port)
>>>  #define VIRTIO_PCI_REG_ADDR(hw, reg) \
>>>   (unsigned short)((hw)->io_base + (reg))
>>>
>>> -#define VIRTIO_READ_REG_1(hw, reg) \
>>> - inb((VIRTIO_PCI_REG_ADDR((hw), (reg))))
>>> -#define VIRTIO_WRITE_REG_1(hw, reg, value) \
>>> - outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
>>> -
>>> -#define VIRTIO_READ_REG_2(hw, reg) \
>>> - inw((VIRTIO_PCI_REG_ADDR((hw), (reg))))
>>> -#define VIRTIO_WRITE_REG_2(hw, reg, value) \
>>> - outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
>>> -
>>> -#define VIRTIO_READ_REG_4(hw, reg) \
>>> - inl((VIRTIO_PCI_REG_ADDR((hw), (reg))))
>>> -#define VIRTIO_WRITE_REG_4(hw, reg, value) \
>>> - outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
>>> +#define VIRTIO_READ_REG_1(hw, reg)   \
>>> +({   \
>>> + uint8_t ret;\
>>> + struct virtio_vfio_dev *vdev;   \
>>> + (vdev) = (&(hw)->dev);  \
>>> + (((vdev)->is_vfio) ?\
>>> + (ioport_inb(((vdev)->pci_dev), reg, &ret)) :\
>>> + ((ret) = (inb((VIRTIO_PCI_REG_ADDR((hw), (reg)));   \
>>> + ret;\
>>> +})
>>
>> It becomes unreadable. I'd suggest defining them as inline
>> functions, and using "if .. else .." instead of "?:".
>>

Ok.
>> --yliu
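
For readers following the review comment above, here is a minimal sketch of
what an inline-function version of the 1-byte register read could look like.
Everything prefixed demo_ is a hypothetical stand-in (the two I/O paths are
plain array lookups, and the use_vfio flag stands in for a check of
pci_dev->kdrv); it is not the code from the patch or from the virtio PMD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two I/O paths; NOT the DPDK API. */
struct demo_pci_dev {
	uint8_t regs[256];           /* pretend vfio-accessible registers */
};

struct demo_hw {
	bool use_vfio;               /* stands in for a pci_dev->kdrv check */
	struct demo_pci_dev *pci_dev;
	uint8_t legacy_regs[256];    /* pretend inb() on io_base + reg */
};

/* vfio path: stores the value in *val and returns 0 on success. */
static int
demo_vfio_inb(const struct demo_pci_dev *dev, uint8_t reg, uint8_t *val)
{
	*val = dev->regs[reg];
	return 0;
}

/* Legacy ioport path: returns the value directly. */
static uint8_t
demo_legacy_inb(const struct demo_hw *hw, uint8_t reg)
{
	return hw->legacy_regs[reg];
}

/* Inline function with if/else, replacing the statement-expression macro. */
static inline uint8_t
demo_read_reg_1(const struct demo_hw *hw, uint8_t reg)
{
	uint8_t ret = 0;

	if (hw->use_vfio)
		(void)demo_vfio_inb(hw->pci_dev, reg, &ret);
	else
		ret = demo_legacy_inb(hw, reg);

	return ret;
}
```

Compared with the macro, the function form gives typed arguments and lets the
two branches have different shapes (out-parameter versus return value) without
nesting everything in one conditional expression.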


[dpdk-dev] [PATCH v2 01/12] ethdev: add API to query packet type filling info

2016-01-15 Thread Adrien Mazarguil
Hi Jianfeng, a few comments below.

On Fri, Jan 15, 2016 at 01:45:48PM +0800, Jianfeng Tan wrote:
> Add a new API rte_eth_dev_get_ptype_info to query whether/what packet type will
> be filled by given pmd rx burst function.
> 
> Signed-off-by: Jianfeng Tan 
> ---
>  lib/librte_ether/rte_ethdev.c | 20 
>  lib/librte_ether/rte_ethdev.h | 27 +++
>  lib/librte_mbuf/rte_mbuf.h|  6 ++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index ed971b4..cd34f46 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1614,6 +1614,26 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
>   dev_info->driver_name = dev->data->drv_name;
>  }
>  
> +int
> +rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
> +uint32_t ptypes[], int num)
> +{
> + int ret, i, j;
> + struct rte_eth_dev *dev;
> + uint32_t all_ptypes[RTE_PTYPE_MAX_NUM];
> +
> + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> + dev = &rte_eth_devices[port_id];
> + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, -ENOTSUP);
> + ret = (*dev->dev_ops->dev_ptype_info_get)(dev, all_ptypes);
> +
> + for (i = 0, j = 0; i < ret && j < num; ++i)
> + if (all_ptypes[i] & ptype_mask)
> + ptypes[j++] = all_ptypes[i];
> +
> + return ret;
> +}

It's a good thing that the size of ptypes[] can be provided, but I think num
should be passed to the dev_ptype_info_get() callback as well.

If you really do not want to pass the size, you have to force the array type
size onto callbacks using a pointer to the array otherwise they look unsafe
(and are actually unsafe when not called from the rte_eth_dev wrapper),
something like this:

 int (*dev_ptype_info_get)(uint8_t port_id, uint32_t (*ptypes)[RTE_PTYPE_MAX_NUM]);

In which case you might as well drop the num argument from
rte_eth_dev_get_ptype_info() to use the same syntax. That way there is no
need to allocate a ptypes array on the stack twice.

But since people usually do not like this syntax, I think passing num and
letting callbacks check for overflow themselves on the user-provided ptypes
array directly is better. They have to return the total number of packet
types supported even when num is 0 (ptypes may be NULL in that case).

I understand the result needs to be temporarily stored somewhere for
filtering and for that purpose the entire size must be known in advance,
hence my previous suggestion for rte_eth_dev_get_ptype_info() to return
the total number of ptypes and providing a separate function for filtering
a ptypes array for applications that need it:

 /* Return the number of entries remaining in ptypes[] after filtering it
  * according to ptype_mask. */
 int rte_eth_dev_ptypes_filter(uint32_t ptype_mask, uint32_t ptypes[], int num);

Usage would be like:

 int ptypes_num = rte_eth_dev_get_ptype_info(42, NULL, 0);

 if (ptypes_num <= 0)
 goto err;

 uint32_t ptypes[ptypes_num];

 rte_eth_dev_get_ptype_info(42, ptypes, ptypes_num);
 ptypes_num = rte_eth_dev_ptypes_filter(RTE_PTYPE_INNER_L4_MASK, ptypes, ptypes_num);

 if (ptypes_num <= 0)
goto err;

 /* process ptypes... */

What about this?
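
To make the proposal concrete, here is a minimal sketch of how such a filter
helper could work: it compacts the matching entries to the front of the
caller's array and returns how many remain. The name demo_ptypes_filter and
the -1 error convention are assumptions for illustration only; no such
function exists in the patch under review.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Keep only the entries of ptypes[] that share at least one bit with
 * ptype_mask, moving them to the front of the array in their original
 * order. Returns the number of entries kept, or -1 on invalid arguments.
 */
static int
demo_ptypes_filter(uint32_t ptype_mask, uint32_t ptypes[], int num)
{
	int i, kept = 0;

	if (num < 0 || (ptypes == NULL && num > 0))
		return -1;

	for (i = 0; i < num; i++)
		if (ptypes[i] & ptype_mask)
			ptypes[kept++] = ptypes[i];

	return kept;
}
```

Because the filtering happens in place, the application needs only the one
array it sized from the first (count-only) query, which is the point of
splitting the query and the filter into two calls.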

> +
>  void
>  rte_eth_macaddr_get(uint8_t port_id, struct ether_addr *mac_addr)
>  {
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index bada8ad..42f8188 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1021,6 +1021,10 @@ typedef void (*eth_dev_infos_get_t)(struct rte_eth_dev *dev,
>   struct rte_eth_dev_info *dev_info);
>  /**< @internal Get specific informations of an Ethernet device. */
>  
> +typedef int (*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev,
> + uint32_t ptypes[]);
> +/**< @internal Get ptype info of eth_rx_burst_t. */
> +
>  typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev,
>   uint16_t queue_id);
>  /**< @internal Start rx and tx of a queue of an Ethernet device. */
> @@ -1347,6 +1351,7 @@ struct eth_dev_ops {
>   eth_queue_stats_mapping_set_t queue_stats_mapping_set;
>   /**< Configure per queue stat counter mapping. */
>   eth_dev_infos_get_t dev_infos_get; /**< Get device info. */
> + eth_dev_ptype_info_get_t   dev_ptype_info_get; /** Get ptype info */
>   mtu_set_t  mtu_set; /**< Set MTU. */
>   vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN Setup. */
>   vlan_tpid_set_t vlan_tpid_set;  /**< Outer VLAN TPID Setup. */
> @@ -2273,6 +2278,28 @@ extern void rte_eth_dev_info_get(uint8_t port_id,
>struct rte_eth_dev_info *dev_info);
>  
>  /**
> + * Retrieve the contextual information of an Ethernet device.
> + *
> +

[dpdk-dev] [PATCH 0/2] add support for buffered tx to ethdev

2016-01-15 Thread Tomasz Kulasek
Many sample apps include internal buffering for single-packet-at-a-time
operation. Since this is such a common paradigm, this functionality is
better suited to being inside the core ethdev API.
The new APIs in the ethdev library are:
* rte_eth_tx_buffer - buffer up a single packet for future transmission
* rte_eth_tx_buffer_flush - flush any unsent buffered packets
* rte_eth_tx_buffer_set_err_callback - set up a callback to be called in
  case transmitting a buffered burst fails. By default, we just free the
  unsent packets.

As well as these, an additional reference callback is provided, which
frees the packets (as the default callback does), as well as updating a
user-provided counter, so that the number of dropped packets can be
tracked.
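
As a rough illustration of the behaviour described above, the following
self-contained model buffers single packets, sends a burst once the buffer is
full, and hands any unsent tail to an error callback that counts drops. All
demo_ names, the buffer size of 4, and the stub "driver" are assumptions made
for the sketch; the real structures and signatures are the ones in the patch
itself.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_TX_BUFSIZE 4 /* small for the demo; the patch config uses 32 */

struct demo_pkt { int id; };

typedef uint16_t (*demo_tx_burst_fn)(struct demo_pkt **pkts, uint16_t n);
typedef void (*demo_tx_error_fn)(struct demo_pkt **pkts, uint16_t unsent,
				 void *userdata);

struct demo_tx_buffer {
	demo_tx_burst_fn burst;   /* stands in for the PMD tx burst call */
	demo_tx_error_fn err_cb;  /* called with the packets that were not sent */
	void *userdata;
	uint16_t nb_pkts;
	struct demo_pkt *pkts[DEMO_TX_BUFSIZE];
};

/* Send everything currently buffered; returns the number actually sent. */
static uint16_t
demo_tx_buffer_flush(struct demo_tx_buffer *b)
{
	uint16_t to_send = b->nb_pkts;
	uint16_t sent;

	if (to_send == 0)
		return 0;

	b->nb_pkts = 0;
	sent = b->burst(b->pkts, to_send);
	if (sent < to_send && b->err_cb != NULL)
		b->err_cb(&b->pkts[sent], (uint16_t)(to_send - sent),
			  b->userdata);
	return sent;
}

/* Queue one packet; flushes (and returns the sent count) when full. */
static uint16_t
demo_tx_buffer(struct demo_tx_buffer *b, struct demo_pkt *pkt)
{
	b->pkts[b->nb_pkts++] = pkt;
	if (b->nb_pkts < DEMO_TX_BUFSIZE)
		return 0;
	return demo_tx_buffer_flush(b);
}

/* Counting callback: adds the unsent count to a user-provided counter.
 * A real callback would also free the packets. */
static void
demo_count_unsent(struct demo_pkt **pkts, uint16_t unsent, void *userdata)
{
	unsigned long *count = userdata;

	(void)pkts;
	*count += unsent;
}

/* Stub "driver" that accepts at most 3 packets per burst. */
static uint16_t
demo_burst_max3(struct demo_pkt **pkts, uint16_t n)
{
	(void)pkts;
	return n < 3 ? n : 3;
}
```

With this model, an application calls demo_tx_buffer() per packet and
demo_tx_buffer_flush() from its periodic drain path, which mirrors how the
sample apps in patch 2/2 use the new calls.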

The internal buffering of packets for TX in sample apps is no longer
needed, so this patchset also replaces this code with calls to the new
rte_eth_tx_buffer* APIs in:

* l2fwd-jobstats
* l2fwd-keepalive
* l2fwd
* l3fwd-acl
* l3fwd-power
* link_status_interrupt
* client_server_mp
* l2fwd_fork
* packet_ordering
* qos_meter

Tomasz Kulasek (2):
  ethdev: add buffered tx api
  examples: sample apps rework to use buffered tx api

 config/common_bsdapp   |1 +
 config/common_linuxapp |1 +
 examples/l2fwd-jobstats/main.c |   73 ++-
 examples/l2fwd-keepalive/main.c|   79 ++--
 examples/l2fwd/main.c  |   80 ++--
 examples/l3fwd-acl/main.c  |   64 +-
 examples/l3fwd-power/main.c|   63 +-
 examples/link_status_interrupt/main.c  |   83 ++--
 .../client_server_mp/mp_client/client.c|   77 +++
 examples/multi_process/l2fwd_fork/main.c   |   81 ++--
 examples/packet_ordering/main.c|   62 +++---
 examples/qos_meter/main.c  |   46 +
 lib/librte_ether/rte_ethdev.c  |   63 +-
 lib/librte_ether/rte_ethdev.h  |  211 +++-
 lib/librte_ether/rte_ether_version.map |8 +
 15 files changed, 445 insertions(+), 547 deletions(-)

-- 
1.7.9.5



[dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api

2016-01-15 Thread Tomasz Kulasek
Many sample apps include internal buffering for single-packet-at-a-time
operation. Since this is such a common paradigm, this functionality is
better suited to being inside the core ethdev API.
The new APIs in the ethdev library are:
* rte_eth_tx_buffer - buffer up a single packet for future transmission
* rte_eth_tx_buffer_flush - flush any unsent buffered packets
* rte_eth_tx_buffer_set_err_callback - set up a callback to be called in
  case transmitting a buffered burst fails. By default, we just free the
  unsent packets.

As well as these, an additional reference callback is provided, which
frees the packets (as the default callback does), as well as updating a
user-provided counter, so that the number of dropped packets can be
tracked.

Signed-off-by: Bruce Richardson 
Signed-off-by: Tomasz Kulasek 
---
 config/common_bsdapp   |1 +
 config/common_linuxapp |1 +
 lib/librte_ether/rte_ethdev.c  |   63 +-
 lib/librte_ether/rte_ethdev.h  |  211 +++-
 lib/librte_ether/rte_ether_version.map |8 ++
 5 files changed, 279 insertions(+), 5 deletions(-)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index ed7c31c..8a2e4c5 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -148,6 +148,7 @@ CONFIG_RTE_MAX_QUEUES_PER_PORT=1024
 CONFIG_RTE_LIBRTE_IEEE1588=n
 CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y
+CONFIG_RTE_ETHDEV_TX_BUFSIZE=32

 #
 # Support NIC bypass logic
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..6229cab 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -146,6 +146,7 @@ CONFIG_RTE_MAX_QUEUES_PER_PORT=1024
 CONFIG_RTE_LIBRTE_IEEE1588=n
 CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y
+CONFIG_RTE_ETHDEV_TX_BUFSIZE=32

 #
 # Support NIC bypass logic
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ed971b4..27dac1b 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -826,11 +826,42 @@ rte_eth_dev_tx_queue_stop(uint8_t port_id, uint16_t tx_queue_id)

 }

+void
+rte_eth_count_unsent_packet_callback(struct rte_mbuf **pkts, uint16_t unsent,
+   void *userdata)
+{
+   unsigned long *count = userdata;
+   unsigned i;
+
+   for (i = 0; i < unsent; i++)
+   rte_pktmbuf_free(pkts[i]);
+
+   *count += unsent;
+}
+
+int
+rte_eth_tx_buffer_set_err_callback(uint8_t port_id, uint16_t queue_id,
+   buffer_tx_error_fn cbfn, void *userdata)
+{
+   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+   if (!rte_eth_dev_is_valid_port(port_id) ||
+   queue_id >= dev->data->nb_tx_queues) {
+   rte_errno = EINVAL;
+   return -1;
+   }
+
+   dev->tx_buf_err_cb[queue_id].userdata = userdata;
+   dev->tx_buf_err_cb[queue_id].flush_cb = cbfn;
+   return 0;
+}
+
 static int
 rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
 {
uint16_t old_nb_queues = dev->data->nb_tx_queues;
void **txq;
+   struct rte_eth_dev_tx_buffer *new_bufs;
unsigned i;

if (dev->data->tx_queues == NULL) { /* first time configuration */
@@ -841,17 +872,40 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
dev->data->nb_tx_queues = 0;
return -(ENOMEM);
}
+
+   dev->data->txq_bufs = rte_zmalloc("ethdev->txq_bufs",
+   sizeof(*dev->data->txq_bufs) * nb_queues, 0);
+   if (dev->data->txq_bufs == NULL) {
+   dev->data->nb_tx_queues = 0;
+   rte_free(dev->data->tx_queues);
+   return -(ENOMEM);
+   }
+
} else { /* re-configure */
+
+   /* flush the packets queued for all queues*/
+   for (i = 0; i < old_nb_queues; i++)
+   rte_eth_tx_buffer_flush(dev->data->port_id, i);
+
RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release, -ENOTSUP);

+   /* get new buffer space first, but keep old space around */
+   new_bufs = rte_zmalloc("ethdev->txq_bufs",
+   sizeof(*dev->data->txq_bufs) * nb_queues, 0);
+   if (new_bufs == NULL)
+   return -(ENOMEM);
+
txq = dev->data->tx_queues;

for (i = nb_queues; i < old_nb_queues; i++)
(*dev->dev_ops->tx_queue_release)(txq[i]);
txq = rte_realloc(txq, sizeof(txq[0]

[dpdk-dev] [PATCH 2/2] examples: sample apps rework to use buffered tx api

2016-01-15 Thread Tomasz Kulasek
The internal buffering of packets for TX in sample apps is no longer
needed, so this patchset replaces this code with calls to the new
rte_eth_tx_buffer* APIs in:

* l2fwd-jobstats
* l2fwd-keepalive
* l2fwd
* l3fwd-acl
* l3fwd-power
* link_status_interrupt
* client_server_mp
* l2fwd_fork
* packet_ordering
* qos_meter

Signed-off-by: Bruce Richardson 
Signed-off-by: Tomasz Kulasek 
---
 examples/l2fwd-jobstats/main.c |   73 +
 examples/l2fwd-keepalive/main.c|   79 ---
 examples/l2fwd/main.c  |   80 ---
 examples/l3fwd-acl/main.c  |   64 ++-
 examples/l3fwd-power/main.c|   63 ++-
 examples/link_status_interrupt/main.c  |   83 
 .../client_server_mp/mp_client/client.c|   77 --
 examples/multi_process/l2fwd_fork/main.c   |   81 ---
 examples/packet_ordering/main.c|   62 +++
 examples/qos_meter/main.c  |   46 ++-
 10 files changed, 166 insertions(+), 542 deletions(-)

diff --git a/examples/l2fwd-jobstats/main.c b/examples/l2fwd-jobstats/main.c
index 7b59f4e..9a6e6ea 100644
--- a/examples/l2fwd-jobstats/main.c
+++ b/examples/l2fwd-jobstats/main.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -99,8 +99,6 @@ static unsigned int l2fwd_rx_queue_per_lcore = 1;

 struct mbuf_table {
uint64_t next_flush_time;
-   unsigned len;
-   struct rte_mbuf *mbufs[MAX_PKT_BURST];
 };

 #define MAX_RX_QUEUE_PER_LCORE 16
@@ -373,58 +371,12 @@ show_stats_cb(__rte_unused void *param)
rte_eal_alarm_set(timer_period * US_PER_S, show_stats_cb, NULL);
 }

-/* Send the burst of packets on an output interface */
-static void
-l2fwd_send_burst(struct lcore_queue_conf *qconf, uint8_t port)
-{
-   struct mbuf_table *m_table;
-   uint16_t ret;
-   uint16_t queueid = 0;
-   uint16_t n;
-
-   m_table = &qconf->tx_mbufs[port];
-   n = m_table->len;
-
-   m_table->next_flush_time = rte_get_timer_cycles() + drain_tsc;
-   m_table->len = 0;
-
-   ret = rte_eth_tx_burst(port, queueid, m_table->mbufs, n);
-
-   port_statistics[port].tx += ret;
-   if (unlikely(ret < n)) {
-   port_statistics[port].dropped += (n - ret);
-   do {
-   rte_pktmbuf_free(m_table->mbufs[ret]);
-   } while (++ret < n);
-   }
-}
-
-/* Enqueue packets for TX and prepare them to be sent */
-static int
-l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
-{
-   const unsigned lcore_id = rte_lcore_id();
-   struct lcore_queue_conf *qconf = &lcore_queue_conf[lcore_id];
-   struct mbuf_table *m_table = &qconf->tx_mbufs[port];
-   uint16_t len = qconf->tx_mbufs[port].len;
-
-   m_table->mbufs[len] = m;
-
-   len++;
-   m_table->len = len;
-
-   /* Enough pkts to be sent. */
-   if (unlikely(len == MAX_PKT_BURST))
-   l2fwd_send_burst(qconf, port);
-
-   return 0;
-}
-
 static void
 l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
 {
struct ether_hdr *eth;
void *tmp;
+   int sent;
unsigned dst_port;

dst_port = l2fwd_dst_ports[portid];
@@ -437,7 +389,9 @@ l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
/* src addr */
ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);

-   l2fwd_send_packet(m, (uint8_t) dst_port);
+   sent = rte_eth_tx_buffer(dst_port, 0, m);
+   if (sent)
+   port_statistics[dst_port].tx += sent;
 }

 static void
@@ -513,6 +467,8 @@ l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
struct lcore_queue_conf *qconf;
struct mbuf_table *m_table;
uint8_t portid;
+   unsigned i;
+   uint32_t sent;

lcore_id = rte_lcore_id();
qconf = &lcore_queue_conf[lcore_id];
@@ -522,12 +478,19 @@ l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
now = rte_get_timer_cycles();
lcore_id = rte_lcore_id();
qconf = &lcore_queue_conf[lcore_id];
-   for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
-   m_table = &qconf->tx_mbufs[portid];
-   if (m_table->len == 0 || m_table->next_flush_time <= now)
+
+   for (i = 0; i < qconf->n_rx_port; i++) {
+   m_table = &qconf->tx_mbufs[i];
+
+   if (m_table->next_flush_time <= now)
continue;
+   m_table->next_flush_time = rte_get_timer_cycles() + drain_tsc;

-   l2fwd_send_bu

[dpdk-dev] [PATCH v2 12/12] examples/l3fwd: add option to parse ptype

2016-01-15 Thread Ananyev, Konstantin
Hi Jianfeng,

> -Original Message-
> From: Tan, Jianfeng
> Sent: Friday, January 15, 2016 5:46 AM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; Tan, Jianfeng
> Subject: [PATCH v2 12/12] examples/l3fwd: add option to parse ptype
> 
> As an example of using ptype info, l3fwd first needs to call the
> rte_eth_dev_get_ptype_info() API to check whether the device and/or PMD driver
> will parse and fill the needed packet types. If not, use the newly added option,
> --parse-ptype, to parse them in software in an rx callback.
> 
> Signed-off-by: Jianfeng Tan 
> ---
>  examples/l3fwd/main.c | 91 +++
>  1 file changed, 91 insertions(+)
> 
> diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> index 5b0c2dd..3c6e1b7 100644
> --- a/examples/l3fwd/main.c
> +++ b/examples/l3fwd/main.c
> @@ -174,6 +174,7 @@ static __m128i val_eth[RTE_MAX_ETHPORTS];
>  static uint32_t enabled_port_mask = 0;
>  static int promiscuous_on = 0; /**< Ports set in promiscuous mode off by default. */
>  static int numa_on = 1; /**< NUMA is enabled by default. */
> +static int parse_ptype = 0; /**< parse packet type using rx callback */
> 
>  #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
>  static int ipv6 = 0; /**< ipv6 is false by default. */
> @@ -2022,6 +2023,7 @@ parse_eth_dest(const char *optarg)
>  #define CMD_LINE_OPT_IPV6 "ipv6"
>  #define CMD_LINE_OPT_ENABLE_JUMBO "enable-jumbo"
>  #define CMD_LINE_OPT_HASH_ENTRY_NUM "hash-entry-num"
> +#define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"
> 
>  /* Parse the argument given in the command line of the application */
>  static int
> @@ -2038,6 +2040,7 @@ parse_args(int argc, char **argv)
>   {CMD_LINE_OPT_IPV6, 0, 0, 0},
>   {CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, 0},
>   {CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, 0},
> + {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
>   {NULL, 0, 0, 0}
>   };
> 
> @@ -2125,6 +2128,13 @@ parse_args(int argc, char **argv)
>   }
>   }
>  #endif
> + if (!strncmp(lgopts[option_index].name,
> +  CMD_LINE_OPT_PARSE_PTYPE,
> +  sizeof(CMD_LINE_OPT_PARSE_PTYPE))) {
> + printf("soft parse-ptype is enabled \n");
> + parse_ptype = 1;
> + }
> +
>   break;
> 
>   default:
> @@ -2559,6 +2569,74 @@ check_all_ports_link_status(uint8_t port_num, uint32_t port_mask)
>   }
>  }
> 
> +static int
> +check_packet_type_ok(int portid)
> +{
> + int i, ret;
> + uint32_t *ptypes;
> + int ptype_l3_ipv4 = 0, ptype_l3_ipv6 = 0;
> +
> + ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK, NULL, 0);
> + if (ret <= 0)
> + return 0;
> + ptypes = malloc(sizeof(uint32_t) * ret);
> + rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK,
> +  ptypes, ret);
> + for (i = 0; i < ret; ++i) {
> + if (ptypes[i] & RTE_PTYPE_L3_IPV4)
> + ptype_l3_ipv4 = 1;
> + if (ptypes[i] & RTE_PTYPE_L3_IPV6)
> + ptype_l3_ipv6 = 1;
> + }


I think you forgot to call free(ptypes) here.
Also, strictly speaking, malloc() can fail, so its return value should be
checked. Alternatively, just allocate ptypes[] on the stack, which would be
easier.
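
A minimal sketch of the stack-allocated variant suggested above. The constants and stub_get_ptype_info() are hypothetical stand-ins so the snippet compiles on its own; they are not the DPDK definitions or the proposed API itself.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the DPDK definitions discussed in the thread. */
#define PTYPE_L3_MASK 0x000000f0u
#define PTYPE_L3_IPV4 0x00000010u
#define PTYPE_L3_IPV6 0x00000040u
#define PTYPE_MAX_NUM 64

/*
 * Stub for the query call: copies up to num ptypes matching mask into
 * ptypes[] and returns how many entries were written.
 */
static int
stub_get_ptype_info(uint32_t mask, uint32_t ptypes[], int num)
{
    static const uint32_t supported[] = { 0x1u, PTYPE_L3_IPV4, PTYPE_L3_IPV6 };
    int i, j = 0;

    for (i = 0; i < 3; ++i)
        if ((supported[i] & mask) && j < num)
            ptypes[j++] = supported[i];
    return j;
}

/* Returns 1 when both IPv4 and IPv6 L3 types are reported, 0 otherwise. */
static int
check_packet_type_ok(void)
{
    uint32_t ptypes[PTYPE_MAX_NUM]; /* on the stack: no malloc, no free */
    int i, ret;
    int ptype_l3_ipv4 = 0, ptype_l3_ipv6 = 0;

    ret = stub_get_ptype_info(PTYPE_L3_MASK, ptypes, PTYPE_MAX_NUM);
    for (i = 0; i < ret; ++i) {
        if (ptypes[i] & PTYPE_L3_IPV4)
            ptype_l3_ipv4 = 1;
        if (ptypes[i] & PTYPE_L3_IPV6)
            ptype_l3_ipv6 = 1;
    }
    return ptype_l3_ipv4 && ptype_l3_ipv6;
}
```

With a fixed upper bound on the number of ptypes, a single call is enough and there is no allocation failure path to handle.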

> +
> + if (ptype_l3_ipv4 == 0)
> + printf("port %d cannot parse RTE_PTYPE_L3_IPV4\n", portid);
> +
> + if (ptype_l3_ipv6 == 0)
> + printf("port %d cannot parse RTE_PTYPE_L3_IPV6\n", portid);
> +
> + if (ptype_l3_ipv4 && ptype_l3_ipv6)
> + return 1;
> +
> + return 0;
> +}
> +static inline void
> +parse_packet_type(struct rte_mbuf *m)
> +{
> + struct ether_hdr *eth_hdr;
> + uint32_t packet_type = 0;
> + uint16_t ethertype;
> +
> + eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> + ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> + switch (ethertype) {
> + case ETHER_TYPE_IPv4:
> + packet_type |= RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
> + break;
> + case ETHER_TYPE_IPv6:
> + packet_type |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
> + break;

That's enough for LPM. For EM, don't we need to make sure there are no
extensions in the IP header?

Konstantin
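
For illustration only, one way a software parser could tell an IPv4 header without options from one with options is to look at the IHL field. This is a sketch under the assumption that the EM path wants the plain type only for option-free headers; the constants are hypothetical stand-ins, and the real values live in rte_mbuf.h.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the packet type values. */
#define PTYPE_L3_IPV4     0x10u
#define PTYPE_L3_IPV4_EXT 0x30u

/*
 * Classify an IPv4 header from its first byte: version is in the high
 * nibble, IHL (header length in 32-bit words) in the low nibble.
 * IHL == 5 means a 20-byte header with no options.
 * Returns 0 when the byte cannot start a valid IPv4 header.
 */
static uint32_t
classify_ipv4(uint8_t version_ihl)
{
    uint8_t version = version_ihl >> 4;
    uint8_t ihl = version_ihl & 0x0f;

    if (version != 4 || ihl < 5)
        return 0;
    return (ihl == 5) ? PTYPE_L3_IPV4 : PTYPE_L3_IPV4_EXT;
}
```

IPv6 would need a different check (walking the next-header chain), which is not shown here.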

> + default:
> + break;
> + }
> +
> + m->packet_type |= packet_type;
> +}
> +
> +static uint16_t
> +cb_parse_packet_type(uint8_t port __rte_unused,
> +  uint16_t queue __rte_unused,
> +  struct rte_mbuf *pkts[],
> +  uint16_t nb_pkts,
> +  uint16_t max_pkts __rte_unused,
> +  void *user_param __rte_unused)
> +{
> + unsigned i;
> +
> + for (i = 0; i < nb_pkts; ++i)
> + parse_packet_type(pkts[i]);
> +}
> +
>  

[dpdk-dev] [PATCH v2 07/12] pmd/ixgbe: add dev_ptype_info_get implementation

2016-01-15 Thread Ananyev, Konstantin


> -Original Message-
> From: Tan, Jianfeng
> Sent: Friday, January 15, 2016 5:46 AM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; Tan, Jianfeng
> Subject: [PATCH v2 07/12] pmd/ixgbe: add dev_ptype_info_get implementation
> 
> Signed-off-by: Jianfeng Tan 
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 38 ++
>  drivers/net/ixgbe/ixgbe_ethdev.h |  2 ++
>  drivers/net/ixgbe/ixgbe_rxtx.c   |  5 -
>  3 files changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 4c4c6df..b3ae7b2 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -166,6 +166,7 @@ static int ixgbe_dev_queue_stats_mapping_set(struct 
> rte_eth_dev *eth_dev,
>uint8_t is_rx);
>  static void ixgbe_dev_info_get(struct rte_eth_dev *dev,
>  struct rte_eth_dev_info *dev_info);
> +static int ixgbe_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t 
> ptypes[]);
>  static void ixgbevf_dev_info_get(struct rte_eth_dev *dev,
>struct rte_eth_dev_info *dev_info);
>  static int ixgbe_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
> @@ -428,6 +429,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
>   .xstats_reset = ixgbe_dev_xstats_reset,
>   .queue_stats_mapping_set = ixgbe_dev_queue_stats_mapping_set,
>   .dev_infos_get= ixgbe_dev_info_get,
> + .dev_ptype_info_get   = ixgbe_dev_ptype_info_get,
>   .mtu_set  = ixgbe_dev_mtu_set,
>   .vlan_filter_set  = ixgbe_vlan_filter_set,
>   .vlan_tpid_set= ixgbe_vlan_tpid_set,
> @@ -512,6 +514,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
>   .xstats_reset = ixgbevf_dev_stats_reset,
>   .dev_close= ixgbevf_dev_close,
>   .dev_infos_get= ixgbevf_dev_info_get,
> + .dev_ptype_info_get   = ixgbe_dev_ptype_info_get,
>   .mtu_set  = ixgbevf_dev_set_mtu,
>   .vlan_filter_set  = ixgbevf_vlan_filter_set,
>   .vlan_strip_queue_set = ixgbevf_vlan_strip_queue_set,
> @@ -2829,6 +2832,41 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct 
> rte_eth_dev_info *dev_info)
>   dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL;
>  }
> 
> +static int
> +ixgbe_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptypes[])
> +{
> + int num = 0;
> +
> + if (dev->rx_pkt_burst == ixgbe_recv_pkts ||
> + dev->rx_pkt_burst == ixgbe_recv_pkts_lro_single_alloc ||
> + dev->rx_pkt_burst == ixgbe_recv_pkts_lro_bulk_alloc ||
> + dev->rx_pkt_burst == ixgbe_recv_pkts_bulk_alloc ||
> + dev->rx_pkt_burst == ixgbe_recv_pkts_vec ||
> + dev->rx_pkt_burst == ixgbe_recv_scattered_pkts_vec) {

Is there any point in that big if above?
All ixgbe recv functions support ptype recognition, so why have it at all?
Same question for igb.
Konstantin

> + /*
> +  * for non-vec functions,
> +  * refers to ixgbe_rxd_pkt_info_to_pkt_type();
> +  * for vec functions,
> +  * refers to _recv_raw_pkts_vec().
> +  */
> + ptypes[num++] = RTE_PTYPE_L2_ETHER;
> + ptypes[num++] = RTE_PTYPE_L3_IPV4;
> + ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT;
> + ptypes[num++] = RTE_PTYPE_L3_IPV6;
> + ptypes[num++] = RTE_PTYPE_L3_IPV6_EXT;
> + ptypes[num++] = RTE_PTYPE_L4_SCTP;
> + ptypes[num++] = RTE_PTYPE_L4_TCP;
> + ptypes[num++] = RTE_PTYPE_L4_UDP;
> + ptypes[num++] = RTE_PTYPE_TUNNEL_IP;
> + ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
> + ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6_EXT;
> + ptypes[num++] = RTE_PTYPE_INNER_L4_TCP;
> + ptypes[num++] = RTE_PTYPE_INNER_L4_UDP;
> + }
> +
> + return num;
> +}
> +
>  static void
>  ixgbevf_dev_info_get(struct rte_eth_dev *dev,
>struct rte_eth_dev_info *dev_info)
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h 
> b/drivers/net/ixgbe/ixgbe_ethdev.h
> index d26771a..2479830 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.h
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.h
> @@ -379,6 +379,8 @@ void ixgbevf_dev_rxtx_start(struct rte_eth_dev *dev);
>  uint16_t ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>   uint16_t nb_pkts);
> 
> +uint16_t ixgbe_recv_pkts_bulk_alloc(void *rx_queue, struct rte_mbuf 
> **rx_pkts,
> +uint16_t nb_pkts);
>  uint16_t ixgbe_recv_pkts_lro_single_alloc(void *rx_queue,
>   struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
>  uint16_t ixgbe_recv_pkts_lro_bulk_alloc(void *rx_queue,
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index 52a263c..d324099 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> 

[dpdk-dev] [PATCH 0/2] add support for buffered tx to ethdev

2016-01-15 Thread Kulasek, TomaszX
Sorry, winzip changes eols.



[dpdk-dev] [PATCH v2 01/12] ethdev: add API to query packet type filling info

2016-01-15 Thread Ananyev, Konstantin


> -Original Message-
> From: Tan, Jianfeng
> Sent: Friday, January 15, 2016 5:46 AM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; Tan, Jianfeng
> Subject: [PATCH v2 01/12] ethdev: add API to query packet type filling info
> 
> Add a new API rte_eth_dev_get_ptype_info to query whether/what packet type will
> be filled by given pmd rx burst function.
> 
> Signed-off-by: Jianfeng Tan 
> ---
>  lib/librte_ether/rte_ethdev.c | 20 
>  lib/librte_ether/rte_ethdev.h | 27 +++
>  lib/librte_mbuf/rte_mbuf.h|  6 ++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index ed971b4..cd34f46 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1614,6 +1614,26 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
> rte_eth_dev_info *dev_info)
>   dev_info->driver_name = dev->data->drv_name;
>  }
> 
> +int
> +rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
> +uint32_t ptypes[], int num)
> +{
> + int ret, i, j;
> + struct rte_eth_dev *dev;
> + uint32_t all_ptypes[RTE_PTYPE_MAX_NUM];
> +
> + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> + dev = &rte_eth_devices[port_id];
> + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, -ENOTSUP);
> + ret = (*dev->dev_ops->dev_ptype_info_get)(dev, all_ptypes);
> +
> + for (i = 0, j = 0; i < ret && j < num; ++i)
> + if (all_ptypes[i] & ptype_mask)
> + ptypes[j++] = all_ptypes[i];
> +
> + return ret;

I think it needs to be something like:

for (i = 0, j = 0; i < ret; ++i) {
    if (all_ptypes[i] & ptype_mask) {
        if (j < num)
            ptypes[j] = all_ptypes[i];
        j++;
    }
}

return j;

Konstantin
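
To make the intended semantics concrete, here is a self-contained sketch of that loop as a helper. filter_ptypes() is a hypothetical name, not part of the patch. It returns the number of matching entries even when the output array is too small, so a caller can first ask for the count and then size its buffer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy entries of all[] that match mask into out[], writing at most num
 * entries, and return the total number of matches (which may exceed num).
 * out may be NULL when num is 0.
 */
static int
filter_ptypes(const uint32_t all[], int all_num, uint32_t mask,
              uint32_t out[], int num)
{
    int i, j = 0;

    for (i = 0; i < all_num; ++i) {
        if (all[i] & mask) {
            if (j < num)
                out[j] = all[i];
            j++;
        }
    }
    return j;
}
```

A return value greater than num tells the caller that the output was truncated.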

> +}
> +
>  void
>  rte_eth_macaddr_get(uint8_t port_id, struct ether_addr *mac_addr)
>  {
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index bada8ad..42f8188 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1021,6 +1021,10 @@ typedef void (*eth_dev_infos_get_t)(struct rte_eth_dev 
> *dev,
>   struct rte_eth_dev_info *dev_info);
>  /**< @internal Get specific informations of an Ethernet device. */
> 
> +typedef int (*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev,
> + uint32_t ptypes[]);
> +/**< @internal Get ptype info of eth_rx_burst_t. */
> +
>  typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev,
>   uint16_t queue_id);
>  /**< @internal Start rx and tx of a queue of an Ethernet device. */
> @@ -1347,6 +1351,7 @@ struct eth_dev_ops {
>   eth_queue_stats_mapping_set_t queue_stats_mapping_set;
>   /**< Configure per queue stat counter mapping. */
>   eth_dev_infos_get_tdev_infos_get; /**< Get device info. */
> + eth_dev_ptype_info_get_t   dev_ptype_info_get; /** Get ptype info */
>   mtu_set_t  mtu_set; /**< Set MTU. */
>   vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN Setup. */
>   vlan_tpid_set_tvlan_tpid_set;  /**< Outer VLAN TPID 
> Setup. */
> @@ -2273,6 +2278,28 @@ extern void rte_eth_dev_info_get(uint8_t port_id,
>struct rte_eth_dev_info *dev_info);
> 
>  /**
> + * Retrieve the contextual information of an Ethernet device.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param ptype_mask
> + *   A hint of what kind of packet type which the caller is interested in.
> + * @param ptypes
> + *   An array of packet types to be filled with.
> + * @param num
> + *   The size of ptypes array.
> + * @return
> + *   - (>0) Number of ptypes supported. May be greater than param num and
> + *   caller needs to check that.
> + *   - (0 or -ENOTSUP) if PMD does not fill the specified ptype.
> + *   - (-ENODEV) if *port_id* invalid.
> + */
> +extern int rte_eth_dev_get_ptype_info(uint8_t port_id,
> +   uint32_t ptype_mask,
> +   uint32_t ptypes[],
> +   int num);
> +
> +/**
>   * Retrieve the MTU of an Ethernet device.
>   *
>   * @param port_id
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index f234ac9..d116711 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -667,6 +667,12 @@ extern "C" {
>  #define RTE_PTYPE_INNER_L4_MASK 0x0f00
> 
>  /**
> +  * Total number of all kinds of RTE_PTYPE_*, except from *_MASK, is 37 for 
> now
> +  * and reserve some space for new ptypes
> +  */
> +#define RTE_PTYPE_MAX_NUM64
> +
> +/**
>   * Check if the (outer) L3 header is IPv4. To avoid comparing IPv4 types one 
> by
>   * one, bit 4 is selected to be used for IPv4 onl

[dpdk-dev] [PATCH v2 01/12] ethdev: add API to query packet type filling info

2016-01-15 Thread Ananyev, Konstantin
Hi lads,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Friday, January 15, 2016 1:59 PM
> To: Tan, Jianfeng
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 01/12] ethdev: add API to query packet type 
> filling info
> 
> Hi Jianfeng, a few comments below.
> 
> On Fri, Jan 15, 2016 at 01:45:48PM +0800, Jianfeng Tan wrote:
> > Add a new API rte_eth_dev_get_ptype_info to query whether/what packet type
> > will be filled by given pmd rx burst function.
> >
> > Signed-off-by: Jianfeng Tan 
> > ---
> >  lib/librte_ether/rte_ethdev.c | 20 
> >  lib/librte_ether/rte_ethdev.h | 27 +++
> >  lib/librte_mbuf/rte_mbuf.h|  6 ++
> >  3 files changed, 53 insertions(+)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > index ed971b4..cd34f46 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -1614,6 +1614,26 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
> > rte_eth_dev_info *dev_info)
> > dev_info->driver_name = dev->data->drv_name;
> >  }
> >
> > +int
> > +rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
> > +  uint32_t ptypes[], int num)
> > +{
> > +   int ret, i, j;
> > +   struct rte_eth_dev *dev;
> > +   uint32_t all_ptypes[RTE_PTYPE_MAX_NUM];
> > +
> > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +   dev = &rte_eth_devices[port_id];
> > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, -ENOTSUP);
> > +   ret = (*dev->dev_ops->dev_ptype_info_get)(dev, all_ptypes);
> > +
> > +   for (i = 0, j = 0; i < ret && j < num; ++i)
> > +   if (all_ptypes[i] & ptype_mask)
> > +   ptypes[j++] = all_ptypes[i];
> > +
> > +   return ret;
> > +}
> 
> It's a good thing that the size of ptypes[] can be provided, but I think num
> should be passed to the dev_ptype_info_get() callback as well.
> 
> If you really do not want to pass the size, you have to force the array type
> size onto callbacks using a pointer to the array otherwise they look unsafe
> (and are actually unsafe when not called from the rte_eth_dev wrapper),
> something like this:
> 
>  int (*dev_ptype_info_get)(uint8_t port_id, uint32_t 
> (*ptypes)[RTE_PTYPE_MAX_NUM]);
> 
> In which case you might as well drop the num argument from
> rte_eth_dev_get_ptype_info() to use the same syntax. That way there is no
> need to allocate a ptypes array on the stack twice.
> 
> But since people usually do not like this syntax, I think passing num and
> letting callbacks check for overflow themselves on the user-provided ptypes
> array directly is better. They have to return the total number of packet
> types supported even when num is 0 (ptypes may be NULL in that case).
> 
> I understand the result needs to be temporarily stored somewhere for
> filtering and for that purpose the entire size must be known in advance,
> hence my previous suggestion for rte_eth_dev_get_ptype_info() to return
> the total number of ptypes and providing a separate function for filtering
> a ptypes array for applications that need it:
> 
>  /* Return remaining number entries in ptypes[] after filtering it
>   * according to ptype_mask. */
>  int rte_eth_dev_ptypes_filter(uint32_t ptype_mask, uint32_t ptypes[], int 
> num);
> 
> Usage would be like:
> 
>  int ptypes_num = rte_eth_dev_get_ptype_info(42, NULL, 0);
> 
>  if (ptypes_num <= 0)
>  goto err;
> 
>  uint32_t ptypes[ptypes_num];
> 
>  rte_eth_dev_get_ptype_info(42, ptypes, ptypes_num);
>  ptypes_num = rte_eth_dev_ptypes_filter(RTE_PTYPE_INNER_L4_MASK, ptypes, 
> ptypes_num);
> 
>  if (ptypes_num <= 0)
> goto err;
> 
>  /* process ptypes... */
> 
> What about this?

Actually, while thinking about it, we could probably make it:
const uint32_t * (*dev_ptype_info_get)(uint8_t port_id);
So the PMD returns to the ethdev layer a pointer to a constant array of
supported ptypes, terminated by RTE_PTYPE_UNKNOWN.
Then rte_eth_dev_get_ptype_info() will iterate over it and fill the array
provided by the user.

all_ptypes = (*dev->dev_ops->dev_ptype_info_get)(dev);
for (i = 0, j = 0; all_ptypes[i] != RTE_PTYPE_UNKNOWN; i++) {
    if (all_ptypes[i] & ptype_mask) {
        if (j < num)
            ptypes[j] = all_ptypes[i];
        j++;
    }
}
return j;

Konstantin
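
A compilable sketch of the sentinel-terminated variant, for illustration. drv_ptype_info_get() and get_ptype_info() are hypothetical stand-ins for the driver callback and the ethdev wrapper, and PTYPE_UNKNOWN stands in for RTE_PTYPE_UNKNOWN (which is 0).

```c
#include <stdint.h>

#define PTYPE_UNKNOWN 0u /* stand-in for RTE_PTYPE_UNKNOWN */

/* Hypothetical driver callback: a constant table ending with the sentinel. */
static const uint32_t *
drv_ptype_info_get(void)
{
    static const uint32_t ptypes[] = {
        0x01, 0x10, 0x40, 0x100, PTYPE_UNKNOWN
    };
    return ptypes;
}

/*
 * Hypothetical wrapper: walk the driver table up to the sentinel, copy
 * matching entries into out[] (at most num), return the number of matches.
 */
static int
get_ptype_info(uint32_t mask, uint32_t out[], int num)
{
    const uint32_t *all = drv_ptype_info_get();
    int i, j = 0;

    for (i = 0; all[i] != PTYPE_UNKNOWN; ++i) {
        if (all[i] & mask) {
            if (j < num)
                out[j] = all[i];
            j++;
        }
    }
    return j;
}
```

The driver side then needs no size argument and no temporary array in the wrapper, since the table is constant and owned by the PMD.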

> 
> > +
> >  void
> >  rte_eth_macaddr_get(uint8_t port_id, struct ether_addr *mac_addr)
> >  {
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index bada8ad..42f8188 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1021,6 +1021,10 @@ typedef void (*eth_dev_infos_get_t)(struct 
> > rte_eth_dev *dev,
> > struct rte_eth_dev_info *dev_info);
> >  /**< @internal Get specific informations of an Ethernet device. */
> >
> > +typedef int (*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev,
> > +

[dpdk-dev] [PATCH v2 01/12] ethdev: add API to query packet type filling info

2016-01-15 Thread Adrien Mazarguil
On Fri, Jan 15, 2016 at 03:11:18PM +, Ananyev, Konstantin wrote:
> Hi lads,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Friday, January 15, 2016 1:59 PM
> > To: Tan, Jianfeng
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 01/12] ethdev: add API to query packet 
> > type filling info
> > 
> > Hi Jianfeng, a few comments below.
> > 
> > On Fri, Jan 15, 2016 at 01:45:48PM +0800, Jianfeng Tan wrote:
> > > Add a new API rte_eth_dev_get_ptype_info to query whether/what packet
> > > type will be filled by given pmd rx burst function.
> > >
> > > Signed-off-by: Jianfeng Tan 
> > > ---
> > >  lib/librte_ether/rte_ethdev.c | 20 
> > >  lib/librte_ether/rte_ethdev.h | 27 +++
> > >  lib/librte_mbuf/rte_mbuf.h|  6 ++
> > >  3 files changed, 53 insertions(+)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > > index ed971b4..cd34f46 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -1614,6 +1614,26 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
> > > rte_eth_dev_info *dev_info)
> > >   dev_info->driver_name = dev->data->drv_name;
> > >  }
> > >
> > > +int
> > > +rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
> > > +uint32_t ptypes[], int num)
> > > +{
> > > + int ret, i, j;
> > > + struct rte_eth_dev *dev;
> > > + uint32_t all_ptypes[RTE_PTYPE_MAX_NUM];
> > > +
> > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > + dev = &rte_eth_devices[port_id];
> > > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, -ENOTSUP);
> > > + ret = (*dev->dev_ops->dev_ptype_info_get)(dev, all_ptypes);
> > > +
> > > + for (i = 0, j = 0; i < ret && j < num; ++i)
> > > + if (all_ptypes[i] & ptype_mask)
> > > + ptypes[j++] = all_ptypes[i];
> > > +
> > > + return ret;
> > > +}
> > 
> > It's a good thing that the size of ptypes[] can be provided, but I think num
> > should be passed to the dev_ptype_info_get() callback as well.
> > 
> > If you really do not want to pass the size, you have to force the array type
> > size onto callbacks using a pointer to the array otherwise they look unsafe
> > (and are actually unsafe when not called from the rte_eth_dev wrapper),
> > something like this:
> > 
> >  int (*dev_ptype_info_get)(uint8_t port_id, uint32_t 
> > (*ptypes)[RTE_PTYPE_MAX_NUM]);
> > 
> > In which case you might as well drop the num argument from
> > rte_eth_dev_get_ptype_info() to use the same syntax. That way there is no
> > need to allocate a ptypes array on the stack twice.
> > 
> > But since people usually do not like this syntax, I think passing num and
> > letting callbacks check for overflow themselves on the user-provided ptypes
> > array directly is better. They have to return the total number of packet
> > types supported even when num is 0 (ptypes may be NULL in that case).
> > 
> > I understand the result needs to be temporarily stored somewhere for
> > filtering and for that purpose the entire size must be known in advance,
> > hence my previous suggestion for rte_eth_dev_get_ptype_info() to return
> > the total number of ptypes and providing a separate function for filtering
> > a ptypes array for applications that need it:
> > 
> >  /* Return remaining number entries in ptypes[] after filtering it
> >   * according to ptype_mask. */
> >  int rte_eth_dev_ptypes_filter(uint32_t ptype_mask, uint32_t ptypes[], int 
> > num);
> > 
> > Usage would be like:
> > 
> >  int ptypes_num = rte_eth_dev_get_ptype_info(42, NULL, 0);
> > 
> >  if (ptypes_num <= 0)
> >  goto err;
> > 
> >  uint32_t ptypes[ptypes_num];
> > 
> >  rte_eth_dev_get_ptype_info(42, ptypes, ptypes_num);
> >  ptypes_num = rte_eth_dev_ptypes_filter(RTE_PTYPE_INNER_L4_MASK, ptypes, 
> > ptypes_num);
> > 
> >  if (ptypes_num <= 0)
> > goto err;
> > 
> >  /* process ptypes... */
> > 
> > What about this?
> 
> Actually, while thinking about it, we could probably make it:
> const uint32_t * (*dev_ptype_info_get)(uint8_t port_id);
> So the PMD returns to the ethdev layer a pointer to a constant array of
> supported ptypes, terminated by RTE_PTYPE_UNKNOWN.
>
> Then rte_eth_dev_get_ptype_info() will iterate over it and fill the array
> provided by the user.
>
> all_ptypes = (*dev->dev_ops->dev_ptype_info_get)(dev);
> for (i = 0, j = 0; all_ptypes[i] != RTE_PTYPE_UNKNOWN; i++) {
>     if (all_ptypes[i] & ptype_mask) {
>         if (j < num)
>             ptypes[j] = all_ptypes[i];
>         j++;
>     }
> }
> return j;

Looks like a much simpler and better approach, that's the implementation we
need! (ignore my overengineered blob above)

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports

2016-01-15 Thread Ferruh Yigit
This work makes DPDK ports more visible and enables the use of common
Linux tools to configure them.

The patch is based on KNI but contains only its control functionality;
it does not include any Linux kernel network driver.

Basically, with the help of a kernel module (KCP), a virtual Linux network
interface named "dpdk$" is created per DPDK port. Control messages
sent to these virtual interfaces are forwarded to DPDK, and the response
is sent back to the Linux application.

The virtual interfaces are created when the DPDK application starts and
are destroyed automatically when it terminates.

Communication between kernel space and DPDK is done over a netlink socket.

The implementation is currently incomplete: sample support has been added
for the RFC, and more functionality can be added based on community response.

This RFC patch supports getting/setting the MAC address and MTU of DPDK
devices, getting stats from DPDK devices, and a subset of ethtool commands.

Samples:

$ ifconfig
dpdk0: flags=4099  mtu 1500
ether 90:e2:ba:0e:49:b8  txqueuelen 1000  (Ethernet)
RX packets 33  bytes 2058 (2.0 KiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 33  bytes 2058 (2.0 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

dpdk1: flags=4099  mtu 1500
ether 00:1b:21:76:fa:21  txqueuelen 1000  (Ethernet)
RX packets 0  bytes 0 (0.0 B)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 0  bytes 0 (0.0 B)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

After some traffic on port 0:

$ ifconfig
dpdk0: flags=4099  mtu 1500
ether 90:e2:ba:0e:49:77  txqueuelen 1000  (Ethernet)
RX packets 962  bytes 57798 (56.4 KiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 962  bytes 57798 (56.4 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


$ ethtool -i dpdk0
driver: rte_ixgbe_pmd
version: RTE 2.3.0-rc0
firmware-version: 
expansion-rom-version: 
bus-info: :08:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no


$ ip l show dpdk0
25: dpdk0:  mtu 1500 qdisc noop state DOWN 
mode DEFAULT group default qlen 1000
link/ether 90:e2:ba:0e:49:b8 brd ff:ff:ff:ff:ff:ff

$ ip l set dpdk0 addr 90:e2:ba:0e:49:77

$ ip l show dpdk0
25: dpdk0:  mtu 1500 qdisc noop state DOWN 
mode DEFAULT group default qlen 1000
link/ether 90:e2:ba:0e:49:77 brd ff:ff:ff:ff:ff:ff


Ferruh Yigit (3):
  rte_ctrl_if: add control interface library
  kcp: add kernel control path kernel module
  examples/ethtool: add control interface support to the application

 config/common_linuxapp |   9 +-
 examples/ethtool/ethtool-app/main.c|   8 +-
 lib/Makefile   |   3 +-
 lib/librte_ctrl_if/Makefile|  58 +
 lib/librte_ctrl_if/rte_ctrl_if.c   | 166 ++
 lib/librte_ctrl_if/rte_ctrl_if.h   |  54 +
 lib/librte_ctrl_if/rte_ctrl_if_version.map |   9 +
 lib/librte_ctrl_if/rte_ethtool.c   | 354 +
 lib/librte_ctrl_if/rte_ethtool.h   |  64 ++
 lib/librte_ctrl_if/rte_nl.c| 263 +
 lib/librte_ctrl_if/rte_nl.h|  60 +
 lib/librte_eal/common/include/rte_log.h|   3 +-
 lib/librte_eal/linuxapp/Makefile   |   5 +-
 lib/librte_eal/linuxapp/kcp/Makefile   |  58 +
 lib/librte_eal/linuxapp/kcp/kcp_dev.h  |  81 +++
 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c  | 261 +
 lib/librte_eal/linuxapp/kcp/kcp_misc.c | 282 +++
 lib/librte_eal/linuxapp/kcp/kcp_net.c  | 209 +
 lib/librte_eal/linuxapp/kcp/kcp_nl.c   | 194 
 mk/rte.app.mk  |   3 +-
 20 files changed, 2138 insertions(+), 6 deletions(-)
 create mode 100644 lib/librte_ctrl_if/Makefile
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if.c
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if.h
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if_version.map
 create mode 100644 lib/librte_ctrl_if/rte_ethtool.c
 create mode 100644 lib/librte_ctrl_if/rte_ethtool.h
 create mode 100644 lib/librte_ctrl_if/rte_nl.c
 create mode 100644 lib/librte_ctrl_if/rte_nl.h
 create mode 100644 lib/librte_eal/linuxapp/kcp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_misc.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_net.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_nl.c

-- 
2.5.0



[dpdk-dev] [RFC 3/3] examples/ethtool: add control interface support to the application

2016-01-15 Thread Ferruh Yigit
Control interface APIs are added to the sample application.

For this support, the corresponding kernel module (KCP) needs to be inserted.
If the kernel module is not loaded, the application runs as before, without
kernel control path support.

When the KCP module is inserted, the running application creates a virtual
Linux network interface (dpdk$) per DPDK port. This interface can be used by
traditional Linux tools.

Signed-off-by: Ferruh Yigit 
---
 examples/ethtool/ethtool-app/main.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/examples/ethtool/ethtool-app/main.c 
b/examples/ethtool/ethtool-app/main.c
index e21abcd..bfa2128 100644
--- a/examples/ethtool/ethtool-app/main.c
+++ b/examples/ethtool/ethtool-app/main.c
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "ethapp.h"

@@ -54,7 +55,6 @@
 #define PKTPOOL_EXTRA_SIZE 512
 #define PKTPOOL_CACHE 32

-
 struct txq_port {
uint16_t cnt_unsent;
struct rte_mbuf *buf_frames[MAX_BURST_LENGTH];
@@ -254,6 +254,8 @@ static int slave_main(__attribute__((unused)) void 
*ptr_data)
}
rte_spinlock_unlock(&ptr_port->lock);
} /* end for( idx_port ) */
+   rte_eth_control_interface_process_msg(
+   RTE_ETHTOOL_CTRL_IF_PROCESS_MSG, 0);
} /* end for(;;) */

return 0;
@@ -293,6 +295,8 @@ int main(int argc, char **argv)
id_core = rte_get_next_lcore(id_core, 1, 1);
rte_eal_remote_launch(slave_main, NULL, id_core);

+   rte_eth_control_interface_create();
+
ethapp_main();

app_cfg.exit_now = 1;
@@ -301,5 +305,7 @@ int main(int argc, char **argv)
return -1;
}

+   rte_eth_control_interface_destroy();
+
return 0;
 }
-- 
2.5.0



[dpdk-dev] [RFC 2/3] kcp: add kernel control path kernel module

2016-01-15 Thread Ferruh Yigit
This kernel module is based on the KNI module, but it is a stripped-down
version that handles only control messages; no data transfer
functionality is provided.

This Linux kernel module helps a userspace application create virtual
interfaces. When a control command is issued on such a virtual
interface, the module pushes the command to userspace and returns the
response to the calling application.

Linux tools such as ethtool/ifconfig/ip can be used on the virtual
interfaces, but data-path tools such as tcpdump cannot.

Signed-off-by: Ferruh Yigit 
---
 config/common_linuxapp|   2 +
 lib/librte_eal/linuxapp/Makefile  |   5 +-
 lib/librte_eal/linuxapp/kcp/Makefile  |  58 ++
 lib/librte_eal/linuxapp/kcp/kcp_dev.h |  81 +
 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c | 261 +++
 lib/librte_eal/linuxapp/kcp/kcp_misc.c| 282 ++
 lib/librte_eal/linuxapp/kcp/kcp_net.c | 209 ++
 lib/librte_eal/linuxapp/kcp/kcp_nl.c  | 194 
 8 files changed, 1091 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/linuxapp/kcp/Makefile
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_dev.h
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_ethtool.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_misc.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_net.c
 create mode 100644 lib/librte_eal/linuxapp/kcp/kcp_nl.c

diff --git a/config/common_linuxapp b/config/common_linuxapp
index de705d0..ed32ca8 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -506,6 +506,8 @@ CONFIG_RTE_KNI_VHOST_DEBUG_TX=n
 # Compile librte_ctrl_if
 #
 CONFIG_RTE_LIBRTE_CTRL_IF=y
+CONFIG_RTE_KCP_KMOD=y
+CONFIG_RTE_KCP_KO_DEBUG=n

 #
 # Compile vhost library
diff --git a/lib/librte_eal/linuxapp/Makefile b/lib/librte_eal/linuxapp/Makefile
index d9c5233..d1fa3a3 100644
--- a/lib/librte_eal/linuxapp/Makefile
+++ b/lib/librte_eal/linuxapp/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -38,6 +38,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal
 ifeq ($(CONFIG_RTE_KNI_KMOD),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kni
 endif
+ifeq ($(CONFIG_RTE_KCP_KMOD),y)
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += kcp
+endif
 ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
 DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += xen_dom0
 endif
diff --git a/lib/librte_eal/linuxapp/kcp/Makefile 
b/lib/librte_eal/linuxapp/kcp/Makefile
new file mode 100644
index 000..e7472f3
--- /dev/null
+++ b/lib/librte_eal/linuxapp/kcp/Makefile
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# module name and path
+#
+MODULE = rte_kcp
+
+#
+# CFLAGS
+#
+MODULE_CFLAGS += -I$(SRCDIR)
+MODULE_CFLAGS += -I$(RTE_OUTPUT)/include
+MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
+MODULE_CFLAGS += -Wall -Werror
+
+# this lib needs main eal
+DEPDIRS-y += lib/librte_eal/linuxapp/eal
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y += kcp_misc.c
+SRCS-y += kcp_net.c
+SRCS-y += kcp_ethtool.c
+SRCS-y += kcp_nl.c
+
+include $(RTE_SDK)/mk/

[dpdk-dev] [RFC 1/3] rte_ctrl_if: add control interface library

2016-01-15 Thread Ferruh Yigit
This library gets control messages from kernel space, forwards them to
librte_ether, and returns the response to kernel space.

The library:
1) Triggers Linux virtual interface creation
2) Initializes the netlink socket communication
3) Provides a process() API to the application for processing the
received messages

This library requires the corresponding kernel module to be inserted.
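
As a rough illustration of the request/response flow described above, the sketch below dispatches one control message to a handler and fills in the reply in place. Every name in it (ctrl_msg, ctrl_process_msg, the command IDs, the in-memory MTU table) is hypothetical and is not taken from the RFC code; the netlink transport is left out entirely.

```c
#include <stdint.h>

/* All names below are hypothetical; they are not the RFC's definitions. */
enum ctrl_cmd { CTRL_GET_MTU, CTRL_SET_MTU };

struct ctrl_msg {
    enum ctrl_cmd cmd;
    uint8_t port_id;
    uint32_t value; /* input for SET, output for GET */
    int err;        /* 0 on success, negative on failure */
};

#define NB_PORTS 2
static uint32_t port_mtu[NB_PORTS] = { 1500, 1500 };

/* Handle one request in place so the same buffer can be sent back. */
static void
ctrl_process_msg(struct ctrl_msg *msg)
{
    msg->err = 0;
    if (msg->port_id >= NB_PORTS) {
        msg->err = -1;
        return;
    }
    switch (msg->cmd) {
    case CTRL_GET_MTU:
        msg->value = port_mtu[msg->port_id];
        break;
    case CTRL_SET_MTU:
        port_mtu[msg->port_id] = msg->value;
        break;
    default:
        msg->err = -1;
        break;
    }
}
```

In a real application the handlers would call into the ethdev layer instead of a local table, and the application would invoke the processing call periodically from one of its loops.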

Signed-off-by: Ferruh Yigit 
---
 config/common_linuxapp |   7 +-
 lib/Makefile   |   3 +-
 lib/librte_ctrl_if/Makefile|  58 +
 lib/librte_ctrl_if/rte_ctrl_if.c   | 166 ++
 lib/librte_ctrl_if/rte_ctrl_if.h   |  54 +
 lib/librte_ctrl_if/rte_ctrl_if_version.map |   9 +
 lib/librte_ctrl_if/rte_ethtool.c   | 354 +
 lib/librte_ctrl_if/rte_ethtool.h   |  64 ++
 lib/librte_ctrl_if/rte_nl.c| 274 ++
 lib/librte_ctrl_if/rte_nl.h|  60 +
 lib/librte_eal/common/include/rte_log.h|   3 +-
 mk/rte.app.mk  |   3 +-
 12 files changed, 1051 insertions(+), 4 deletions(-)
 create mode 100644 lib/librte_ctrl_if/Makefile
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if.c
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if.h
 create mode 100644 lib/librte_ctrl_if/rte_ctrl_if_version.map
 create mode 100644 lib/librte_ctrl_if/rte_ethtool.c
 create mode 100644 lib/librte_ctrl_if/rte_ethtool.h
 create mode 100644 lib/librte_ctrl_if/rte_nl.c
 create mode 100644 lib/librte_ctrl_if/rte_nl.h

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..de705d0 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -503,6 +503,11 @@ CONFIG_RTE_KNI_VHOST_DEBUG_RX=n
 CONFIG_RTE_KNI_VHOST_DEBUG_TX=n

 #
+# Compile librte_ctrl_if
+#
+CONFIG_RTE_LIBRTE_CTRL_IF=y
+
+#
 # Compile vhost library
 # fuse-devel is needed to run vhost-cuse.
 # fuse-devel enables user space char driver development
diff --git a/lib/Makefile b/lib/Makefile
index ef172ea..a50bc1e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -58,6 +58,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
 DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
+DIRS-$(CONFIG_RTE_LIBRTE_CTRL_IF) += librte_ctrl_if

 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_ctrl_if/Makefile b/lib/librte_ctrl_if/Makefile
new file mode 100644
index 000..9e1ed0d
--- /dev/null
+++ b/lib/librte_ctrl_if/Makefile
@@ -0,0 +1,58 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_ctrl_if

[dpdk-dev] UX Bug in Sphinx HTML Layout for Programming Guide (and maybe other guides?)

2016-01-15 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Matthew Hall
> Sent: Wednesday, January 13, 2016 5:26 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] UX Bug in Sphinx HTML Layout for Programming Guide
> (and maybe other guides?)
> 
> When you go to this link:
> 
> http://dpdk.org/doc/guides/prog_guide/perf_opt_guidelines.html
> 
> There is a bug in the Sphinx layout, where the subchapters of a chapter
> are invisible even after the chapter is clicked.

Hi Thomas,

I found the issue causing this and I'll submit a patch shortly.

If possible, could you apply it to a copy of the 2.2 code and rebuild the online
docs?

John.
-- 



[dpdk-dev] [PATCH v1] doc: fix navigation levels in html sidebar

2016-01-15 Thread John McNamara
Fix issue where the navigation levels weren't displayed in the
html sidebar with the new read_the_docs theme.

This was due to the :titlesonly: directive in the high level
index.rst and also due to a stray newline in the parsed version
number.

Signed-off-by: John McNamara 
---
 doc/guides/conf.py   | 2 +-
 doc/guides/index.rst | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index 1861443..2c5610f 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -45,7 +45,7 @@ html_add_permalinks = ""
 html_show_copyright = False
 highlight_language = 'none'

-version = subprocess.check_output(['make', '-sRrC', '../../', 'showversion']).decode('utf-8')
+version = subprocess.check_output(['make', '-sRrC', '../../', 'showversion']).decode('utf-8').rstrip()
 release = version

 master_doc = 'index'
diff --git a/doc/guides/index.rst b/doc/guides/index.rst
index 638eab9..7aef7a3 100644
--- a/doc/guides/index.rst
+++ b/doc/guides/index.rst
@@ -33,7 +33,6 @@ DPDK documentation

 .. toctree::
    :maxdepth: 1
-   :titlesonly:

    linux_gsg/index
    freebsd_gsg/index
-- 
2.5.0



[dpdk-dev] Sending and receiving on the same port at the same time, from different threads

2016-01-15 Thread Zoltan Kiss
Hi,

I've been asked this question, and I realized I'm not sure about the 
answer. In other words: can you call rte_eth_rx_burst() and 
rte_eth_tx_burst() on the same port at the same time from different 
threads? In theory it seems possible, as you still access different 
queues (an RX and a TX one), and at least taking a glance at ixgbe 
vector functions, they don't seem to use common resources while doing RX 
or TX. But I'm not sure that it's generally true, although I always 
assumed that it should be true. Has anyone seen a device where it 
wasn't true?

Regards,

Zoltan


[dpdk-dev] Sending and receiving on the same port at the same time, from different threads

2016-01-15 Thread Matthew Hall
On Fri, Jan 15, 2016 at 04:54:11PM +, Zoltan Kiss wrote:
> Can you call rte_eth_rx_burst() and rte_eth_tx_burst() on the same port at 
> the same time from different threads?

In general, yes you can. I did this before in an L4-L7 performance tester, so 
cores could concentrate on RX or TX to keep the speeds high.

> Has anyone seen a device where it wasn't true?

Not specifically. But I didn't go looking for one either.

Matthew.


[dpdk-dev] [PATCH 0/2] add support for buffered tx to ethdev

2016-01-15 Thread Stephen Hemminger
On Fri, 15 Jan 2016 15:43:56 +0100
Tomasz Kulasek  wrote:

> Many sample apps include internal buffering for single-packet-at-a-time
> operation. Since this is such a common paradigm, this functionality is
> better suited to being inside the core ethdev API.
> The new APIs in the ethdev library are:
> * rte_eth_tx_buffer - buffer up a single packet for future transmission
> * rte_eth_tx_buffer_flush - flush any unsent buffered packets
> * rte_eth_tx_buffer_set_err_callback - set up a callback to be called in
>   case transmitting a buffered burst fails. By default, we just free the
>   unsent packets.
> 
> As well as these, an additional reference callback is provided, which
> frees the packets (as the default callback does), as well as updating a
> user-provided counter, so that the number of dropped packets can be
> tracked.
> 
> The internal buffering of packets for TX in sample apps is no longer
> needed, so this patchset also replaces this code with calls to the new
> rte_eth_tx_buffer* APIs in:

The pipeline code also has its own implementation of this.
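
To make the buffering pattern described above concrete, here is a minimal sketch in plain C. It is not the code from the patch: the names (struct txbuf, txbuf_add, txbuf_flush, txbuf_count_drops, fake_burst) and the tiny TXBUF_SIZE are assumptions made up for illustration, and struct pkt stands in for struct rte_mbuf. Unlike the default behaviour described in the cover letter, this model does not free unsent packets; it only hands them to the error callback.

```c
#include <stddef.h>
#include <stdint.h>

#define TXBUF_SIZE 4	/* hypothetical capacity, kept small for illustration */

struct pkt;		/* opaque stand-in for struct rte_mbuf */

/* Stand-in for a burst transmit call: returns how many packets were sent. */
typedef uint16_t (*txbuf_burst_fn)(struct pkt **pkts, uint16_t n, void *ctx);
/* Called with the packets that a flush could not send. */
typedef void (*txbuf_error_fn)(struct pkt **unsent, uint16_t count,
		void *userdata);

struct txbuf {
	struct pkt *pkts[TXBUF_SIZE];
	uint16_t nb_pkts;
	txbuf_burst_fn burst;
	void *burst_ctx;
	txbuf_error_fn on_error;
	void *userdata;
};

/* Send everything that is buffered; report unsent packets to the callback. */
uint16_t
txbuf_flush(struct txbuf *b)
{
	uint16_t to_send = b->nb_pkts;
	uint16_t sent;

	if (to_send == 0)
		return 0;
	b->nb_pkts = 0;
	sent = b->burst(b->pkts, to_send, b->burst_ctx);
	if (sent < to_send && b->on_error != NULL)
		b->on_error(&b->pkts[sent], (uint16_t)(to_send - sent),
				b->userdata);
	return sent;
}

/* Buffer one packet; flush only when the buffer becomes full. */
uint16_t
txbuf_add(struct txbuf *b, struct pkt *p)
{
	b->pkts[b->nb_pkts++] = p;
	if (b->nb_pkts < TXBUF_SIZE)
		return 0;
	return txbuf_flush(b);
}

/* Error callback that only counts drops in a user-provided uint64_t. */
void
txbuf_count_drops(struct pkt **unsent, uint16_t count, void *userdata)
{
	(void)unsent;
	*(uint64_t *)userdata += count;
}

/* Fake device for experiments: sends at most *(uint16_t *)ctx packets. */
uint16_t
fake_burst(struct pkt **pkts, uint16_t n, void *ctx)
{
	uint16_t limit = *(uint16_t *)ctx;

	(void)pkts;
	return n < limit ? n : limit;
}
```

The caller adds packets one at a time and calls txbuf_flush() periodically so that a partially filled buffer does not sit unsent.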


[dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api

2016-01-15 Thread Stephen Hemminger
On Fri, 15 Jan 2016 15:43:57 +0100
Tomasz Kulasek  wrote:

>  static int
>  rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
>  {
>   uint16_t old_nb_queues = dev->data->nb_tx_queues;
>   void **txq;
> + struct rte_eth_dev_tx_buffer *new_bufs;
>   unsigned i;
>  
>   if (dev->data->tx_queues == NULL) { /* first time configuration */
> @@ -841,17 +872,40 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
> uint16_t nb_queues)
>   dev->data->nb_tx_queues = 0;
>   return -(ENOMEM);
>   }
> +
> + dev->data->txq_bufs = rte_zmalloc("ethdev->txq_bufs",
> + sizeof(*dev->data->txq_bufs) * nb_queues, 0);

Shouldn't this use rte_zmalloc_socket() and put the buffering on the same NUMA
node as the device?


[dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api

2016-01-15 Thread Stephen Hemminger
On Fri, 15 Jan 2016 15:43:57 +0100
Tomasz Kulasek  wrote:

> + return -(ENOMEM);

Please don't put () around the argument to return; it is a BSD stylism.


[dpdk-dev] Sending and receiving on the same port at the same time, from different threads

2016-01-15 Thread Stephen Hemminger
On Fri, 15 Jan 2016 12:33:14 -0500
Matthew Hall  wrote:

> On Fri, Jan 15, 2016 at 04:54:11PM +, Zoltan Kiss wrote:
> > Can you call rte_eth_rx_burst() and rte_eth_tx_burst() on the same port at 
> > the same time from different threads?
> 
> In general, yes you can. I did this before in an L4-L7 performance tester, so 
> cores could concentrate on RX or TX to keep the speeds high.
> 
> > Has anyone seen a device where it wasn't true?
> 
> Not specifically. But I didn't go looking for one either.
> 
> Matthew.

Using the same port is OK, as long as each thread uses different queues.
The device queues are not thread safe; i.e., two threads can't be pulling/pushing
to the same queue.
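
As a rough illustration of that rule, the sketch below checks a thread-to-queue assignment table and rejects any plan where two different lcores share the same queue in the same direction. The struct and function names are hypothetical and not part of any DPDK API; the check only models the "one thread per queue" constraint.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of which lcore polls or fills which queue. */
struct queue_owner {
	unsigned lcore_id;	/* worker thread (lcore) */
	uint16_t port_id;
	uint16_t queue_id;
	bool is_tx;		/* false = RX queue, true = TX queue */
};

/*
 * Return true if no (port, direction, queue) triple is used by two
 * different lcores. RX queue N and TX queue N are distinct queues, so
 * one thread doing RX and another doing TX on the same port is fine.
 */
bool
queue_plan_is_safe(const struct queue_owner *plan, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			if (plan[i].port_id == plan[j].port_id &&
			    plan[i].is_tx == plan[j].is_tx &&
			    plan[i].queue_id == plan[j].queue_id &&
			    plan[i].lcore_id != plan[j].lcore_id)
				return false;
		}
	}
	return true;
}
```

A plan with lcore 1 on RX queue 0 and lcore 2 on TX queue 0 of the same port passes; a plan with both lcores on TX queue 0 does not.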


[dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api

2016-01-15 Thread Ananyev, Konstantin
Hi Tomasz,

>  static int
>  rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
>  {
>   uint16_t old_nb_queues = dev->data->nb_tx_queues;
>   void **txq;
> + struct rte_eth_dev_tx_buffer *new_bufs;
>   unsigned i;
> 
>   if (dev->data->tx_queues == NULL) { /* first time configuration */
> @@ -841,17 +872,40 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
> uint16_t nb_queues)
>   dev->data->nb_tx_queues = 0;
>   return -(ENOMEM);
>   }
> +
> + dev->data->txq_bufs = rte_zmalloc("ethdev->txq_bufs",
> + sizeof(*dev->data->txq_bufs) * nb_queues, 0);
> + if (dev->data->txq_bufs == NULL) {
> + dev->data->nb_tx_queues = 0;
> + rte_free(dev->data->tx_queues);
> + return -(ENOMEM);
> + }
> +
>   } else { /* re-configure */
> +
> + /* flush the packets queued for all queues*/
> + for (i = 0; i < old_nb_queues; i++)
> + rte_eth_tx_buffer_flush(dev->data->port_id, i);
> +

I don't think it is safe to call tx_burst() at the queue config stage.
Instead, you need to flush (or just empty) your txq_bufs at the tx_queue_stop stage.

>   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release, 
> -ENOTSUP);
> 
> + /* get new buffer space first, but keep old space around */
> + new_bufs = rte_zmalloc("ethdev->txq_bufs",
> + sizeof(*dev->data->txq_bufs) * nb_queues, 0);
> + if (new_bufs == NULL)
> + return -(ENOMEM);
> +


Why not allocate space for txq_bufs together with tx_queues (as one chunk 
for both)?
As I understand it, there is always a one-to-one mapping between them anyway.
That would simplify things a bit.
Or even introduce a new struct to group all related TX queue info together:
struct rte_eth_txq_data {
	void *queue; /* actual pmd queue */
	struct rte_eth_dev_tx_buffer buf;
	uint8_t state;
};
And use it inside struct rte_eth_dev_data?
That would probably give better data locality.
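
A rough sketch of what the single-allocation idea could look like, using plain libc instead of rte_realloc() so that it stands alone. The names struct tx_buffer, struct txq_data, TX_BUFSIZE and txq_data_resize() are invented for this illustration and are not taken from the patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TX_BUFSIZE 32	/* hypothetical fixed buffer size */

struct tx_buffer {
	void *pkts[TX_BUFSIZE];
	unsigned nb_pkts;
	uint64_t errors;
};

/* All per-queue TX state in one element, so one array covers everything. */
struct txq_data {
	void *queue;		/* actual PMD queue */
	struct tx_buffer buf;
	uint8_t state;
};

/*
 * Grow or shrink the per-queue array with a single realloc.
 * new_nb must be non-zero. New entries are zeroed. On failure NULL is
 * returned and the old array is left untouched, so there is only one
 * error path instead of one per parallel array.
 */
struct txq_data *
txq_data_resize(struct txq_data *old, uint16_t old_nb, uint16_t new_nb)
{
	struct txq_data *q = realloc(old, sizeof(*q) * new_nb);

	if (q == NULL)
		return NULL;
	if (new_nb > old_nb)
		memset(&q[old_nb], 0,
		       sizeof(*q) * (size_t)(new_nb - old_nb));
	return q;
}
```

With this layout the queue pointer, its buffer and its state sit next to each other in memory, which is the locality argument made above.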

>   txq = dev->data->tx_queues;
> 
>   for (i = nb_queues; i < old_nb_queues; i++)
>   (*dev->dev_ops->tx_queue_release)(txq[i]);
>   txq = rte_realloc(txq, sizeof(txq[0]) * nb_queues,
> RTE_CACHE_LINE_SIZE);
> - if (txq == NULL)
> - return -ENOMEM;
> + if (txq == NULL) {
> + rte_free(new_bufs);
> + return -(ENOMEM);
> + }
> +
>   if (nb_queues > old_nb_queues) {
>   uint16_t new_qs = nb_queues - old_nb_queues;
> 
> @@ -861,6 +915,9 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
> uint16_t nb_queues)
> 
>   dev->data->tx_queues = txq;
> 
> + /* now replace old buffers with new */
> + rte_free(dev->data->txq_bufs);
> + dev->data->txq_bufs = new_bufs;
>   }
>   dev->data->nb_tx_queues = nb_queues;
>   return 0;
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index bada8ad..23faa6a 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1,7 +1,7 @@
>  /*-
>   *   BSD LICENSE
>   *
> - *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> + *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
>   *   All rights reserved.
>   *
>   *   Redistribution and use in source and binary forms, with or without
> @@ -182,6 +182,7 @@ extern "C" {
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "rte_ether.h"
>  #include "rte_eth_ctrl.h"
>  #include "rte_dev_info.h"
> @@ -1519,6 +1520,34 @@ enum rte_eth_dev_type {
>   RTE_ETH_DEV_MAX /**< max value of this enum */
>  };
> 
> +typedef void (*buffer_tx_error_fn)(struct rte_mbuf **unsent, uint16_t count,
> + void *userdata);
> +
> +/**
> + * @internal
> + * Structure used to buffer packets for future TX
> + * Used by APIs rte_eth_tx_buffer and rte_eth_tx_buffer_flush
> + */
> +struct rte_eth_dev_tx_buffer {
> + struct rte_mbuf *pkts[RTE_ETHDEV_TX_BUFSIZE];

I think it is better to make the size of pkts[] configurable at runtime.
There are a lot of different usage scenarios, so it is hard to predict what would be an
optimal buffer size for all cases.
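
One possible way to do that, sketched with a C99 flexible array member and plain calloc(). The names struct dyn_tx_buffer, dyn_tx_buffer_create() and dyn_tx_buffer_put() are hypothetical, and a real version would presumably use the rte_malloc family rather than libc.

```c
#include <stdint.h>
#include <stdlib.h>

/* Buffer whose capacity is chosen when it is created. */
struct dyn_tx_buffer {
	uint16_t size;		/* capacity, set at run time */
	uint16_t nb_pkts;	/* packets currently buffered */
	uint64_t errors;
	void *pkts[];		/* flexible array member, 'size' entries */
};

/* Allocate a zeroed buffer with room for 'size' packets. */
struct dyn_tx_buffer *
dyn_tx_buffer_create(uint16_t size)
{
	struct dyn_tx_buffer *b;

	if (size == 0)
		return NULL;
	b = calloc(1, sizeof(*b) + (size_t)size * sizeof(b->pkts[0]));
	if (b != NULL)
		b->size = size;
	return b;
}

/* Store one packet; returns 0 on success, -1 if the buffer is full. */
int
dyn_tx_buffer_put(struct dyn_tx_buffer *b, void *pkt)
{
	if (b->nb_pkts == b->size)
		return -1;
	b->pkts[b->nb_pkts++] = pkt;
	return 0;
}
```

Each queue can then get a buffer sized for its own traffic pattern, at the cost of one extra pointer dereference compared with a fixed-size array embedded in the device data.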

> + unsigned nb_pkts;
> + uint64_t errors;
> + /**< Total number of queue packets to sent that are dropped. */
> +};
> +
> +/**
> + * @internal
> + * Structure to hold a callback to be used on error when a tx_buffer_flush
> + * call fails to send all packets.
> + * This needs to be a separate structure, as it must go in the ethdev 
> structure
> + * rather than ethdev_data, due to the use of a function pointer, which is 
> not
> + * multi-process safe.
> + */
> +struct rte_et
