date:20141021

[dpdk-dev] [PATCH v2 0/2] new filter APIs definition

2014-10-21 Thread Thomas Monjalon

2014-10-20 13:40, Jingjing Wu:
> new filter APIs definition in ethdev
> define filter_ctrl ops in i40e driver
> 
> v2 changes:
>   remove OP from the name of filter operations
>   add API implementation in i40e.
>   correct comments
> 
> Jingjing Wu (2):
>   librte_ether: new filter APIs definition
>   i40e: define filter_ctrl ops in i40e driver

As Bruce suggested, RTE_ETH_FILTER_NOP is simpler than RTE_ETH_FILTER_OP_NONE.
And I think RTE_ETH_FILTER_GET_INFO could be RTE_ETH_FILTER_INFO.
Last comment: filtering features (fdir, hash) should not be defined at this 
stage.

I know that we want this patch integrated with high priority, so I made the
above changes myself. Hope you'll agree.

Acked-by: Thomas Monjalon 

Applied

Thank you for your patience
-- 
Thomas

[dpdk-dev] nic loopback

2014-10-21 Thread Liang, Cunming



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Markuze
> Sent: Tuesday, October 21, 2014 12:24 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] nic loopback
> 
> Hi,
> I'm trying to send packets from an application to itself, meaning smac ==
> dmac.
> I'm working with intel 82599 virtual function. But it seems that these
> packets are lost.
> 
> Is there a software/hw limitation I'm missing here (some additional
> anti-spoofing)? AFAIK modern NICs with sriov are mini switches so the hw
> loopback should work, at least that's the theory.
> 
[Liang, Cunming] You could check the LLE field of register PFVMTXSW[n], which
allows an individual pool to send traffic and have it looped back to itself.
> 
> Thanks.
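
The register check suggested above can be sketched in C. This is only an illustration under a stated assumption: that LLE is a per-pool bitmap spread over 32-bit PFVMTXSW[n] registers, with pool p held in register p / 32 at bit p % 32 (please verify against the 82599 datasheet). The helper names and the plain array standing in for the register values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one LLE bit per pool, 32 pools per PFVMTXSW[n] register. */
#define POOLS_PER_REG 32u

/* Index n of the PFVMTXSW[n] register that holds the bit for this pool. */
static unsigned lle_reg_index(unsigned pool)
{
	return pool / POOLS_PER_REG;
}

/* Mask selecting this pool's bit inside that register. */
static uint32_t lle_bit_mask(unsigned pool)
{
	return UINT32_C(1) << (pool % POOLS_PER_REG);
}

/*
 * Return 1 if local loopback is enabled for the pool, given a snapshot
 * of the register values (regs[n] stands in for PFVMTXSW[n]).
 */
static int lle_is_enabled(const uint32_t *regs, unsigned nb_regs, unsigned pool)
{
	assert(lle_reg_index(pool) < nb_regs);
	return (regs[lle_reg_index(pool)] & lle_bit_mask(pool)) != 0;
}
```

On real hardware the snapshot would come from a register read by the PF driver; a VF normally cannot read PF registers itself.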

[dpdk-dev] [PATCH v3 0/2] app/test: unit test to measure cycles per packet

2014-10-21 Thread Liu, Yong

Patch name: PMD performance unit test
Brief description: unit test to measure cycles per packet
Test Flag: Tested-by
Tester name: yong.liu at intel.com
Test environment:
OS: Fedora20 3.11.10-301.fc20.x86_64
GCC: gcc version 4.8.3 20140911
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb]


Commit ID:  455d09e54b92a4626e178b020fe9c23e43ede3f7

Detailed Testing information
DPDK SW Configuration:
Default x86_64-native-linuxapp-gcc configuration
Test Result Summary: Total 2 cases, 2 passed, 0 failed

Test Case - name:
Continuous Mode Performance
Test Case - Description:
Measure continuous mode cycles/packet in NIC loopback mode
Test Case -command / instruction:

Set stream control mode to continuous
RTE>>set_rxtx_sc continuous

Choose rx/tx pair between vector|scalar|full|hybrid
RTE>>set_rxtx_mode vector

Start pmd performance measurement
RTE>>pmd_perf_autotest

Test Case - expected test result:
Test result is OK and output cycle number for each packet.

Test Case - name:
Burst Mode Performance
Test Case - Description:
Measure burst mode cycles/packet in NIC loopback mode
Test Case -command / instruction:
Start unit test sample

Set stream control mode to poll_before_xmit or poll_after_xmit
RTE>>set_rxtx_sc poll_before_xmit

Start pmd performance measurement
RTE>>pmd_perf_autotest

Start pmd performance measurement
RTE>>pmd_perf_autotest  


Test Case - expected test result:
Test result is OK and output cycle number for each packet.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cunming Liang
> Sent: Monday, October 20, 2014 4:14 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/2] app/test: unit test to measure cycles per
> packet
> 
> v3 update:
> # Code refined according to the feedback.
>   1. add ether_format_addr to rte_ether.h
>   2. fix typo in code comments.
>   3. %lu to %PRIu64, fixing 32-bit targets compilation err
> # merge 2 small incremental patches to the first one.
>   The whole unit test as a single patch in [PATCH v3 2/2]
> # rebase code to the latest master
> 
> v2 update:
> Rebase code to the latest master branch.
> 
> Tested-by: Zhaochen Zhan 
> This patch has been verified on ixgbe and it is ready to be integrated to
> dpdk.org.
> e1000 and i40e lack loopback support, so this patch can't support
> them for now.
> 
> --
> 
> It provides unit test to measure cycles/packet in NIC loopback mode.
> It simply gives the average cycles of IO used per packet without test
> equipment.
> When doing the test, make sure the link is UP.
> 
> There are two stream control modes: continuous and burst.
> The former keeps forwarding the injected packets until a certain number
> of packets has been reached.
> The latter stops when all the injected packets are received.
> In burst mode, two situations are measured: with and without descriptor
> cache conflict.
> By default, it runs in continuous stream mode to measure the whole rxtx.
> 
> Usage Example:
> 1. Run unit test app in interactive mode
> app/test -c f -n 4 -- -i
> 2. Set stream control mode, by default is continuous
> set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
> 3. If continuous stream is chosen, two more options can be configured
> 3.1 choose rx/tx pair, default is vector
> set_rxtx_mode [vector|scalar|full|hybrid]
> Note: To get accurate scalar fast, please choose 'vector' or 'hybrid'
> without INC_VEC=y in config
> 3.2 choose the area of measurement, default is rxtx
> set_rxtx_anchor [rxtx|rxonly|txonly]
> 4. Run and wait for the result
> pmd_perf_autotest
> 
> For those who simply want to see how many cycles each packet costs:
> compile DPDK, run 'app/test', and type 'pmd_perf_autotest', that's it.
> Nothing else needs to be configured.
> Use the other options when you understand them and want to measure more.
> 
> Cunming Liang (2):
>   app/test: allow to create packets in different sizes
>   app/test: measure the cost of rx/tx routines by cycle number
> 
>  app/test/Makefile   |1 +
>  app/test/commands.c |  111 +
>  a
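
The figure this unit test reports, average cycles per packet, comes down to a division of an elapsed cycle count by a packet count. A minimal sketch follows, assuming the two timestamps are timestamp-counter readings (in DPDK they would typically come from rte_rdtsc()); the function name is hypothetical and this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Average cycles spent per packet over one measurement window.
 * start and end are cycle-counter readings taken before and after the
 * rx/tx loop; plain integers are used so the sketch builds without DPDK.
 */
static double cycles_per_packet(uint64_t start, uint64_t end, uint64_t nb_pkts)
{
	assert(end >= start);
	if (nb_pkts == 0)
		return 0.0;	/* nothing was forwarded, avoid dividing by zero */
	return (double)(end - start) / (double)nb_pkts;
}
```

For example, a window of 2000 cycles in which 100 packets were forwarded gives 20 cycles per packet.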

[dpdk-dev] [PATCH v5 0/5] Support configuring hash functions

2014-10-21 Thread Helin Zhang

These patches mainly support configuring hash functions.
In detail,
 - It can get or set hash functions.
 - It can configure symmetric hash functions.
   * Get/set symmetric hash enable per port.
   * Get/set symmetric hash enable per 'PCTYPE'.
   * Get/set filter swap configurations.
 - Eight commands have been implemented in testpmd to support
   testing above.
   * get_sym_hash_ena_per_port
   * set_sym_hash_ena_per_port
   * get_sym_hash_ena_per_pctype
   * set_sym_hash_ena_per_pctype
   * get_filter_swap
   * set_filter_swap
   * get_hash_function
   * set_hash_function
Note that 'PCTYPE' means 'Packet Classification Type'.

It also uses prepared constant hash keys instead of generating
hash keys at runtime. Global initialization is added to put
global registers into an initial state, as global registers
can be reset by global reset only.

v3 changes:
* Removed renamings in rte_ethdev.h.
* Redesigned filter control API and its relevant structures/enums.
* Renamed header file from rte_eth_features.h to rte_eth_ctrl.h.
* Remove public header file of rte_i40e.h specific for i40e.
* Added hardware initialization function during port init.
* Used constant random hash keys in i40e PF.
* renamed the commands in testpmd based on the redesigned filter
  control API.

v4 changes:
* Fixed a bug in testpmd to support 'set_sym_hash_ena_per_port'.

v5 changes:
* Integrated with filter API defined recently.
* Removed the filter API definition, as it has already been defined
  and merged recently.

Helin Zhang (5):
  i40e: Use constant random hash keys
  ethdev: add enum type and relevant structures for hash filter control
  i40e: add hash filter control implementation
  i40e: add hardware initialization
  app/testpmd: add commands to support hash filter

 app/test-pmd/cmdline.c| 566 ++
 lib/librte_ether/rte_eth_ctrl.h   |  75 +
 lib/librte_pmd_i40e/i40e_ethdev.c | 467 ++-
 3 files changed, 1100 insertions(+), 8 deletions(-)

-- 
1.8.1.4
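
To illustrate what "symmetric hash" means in this series: both directions of a flow should produce the same hash value, so that they land on the same queue. The toy function below shows only that property, using XOR because it is commutative. It is not the algorithm the i40e hardware implements, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: addresses and ports of one direction of a flow. */
struct flow {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

/*
 * Toy symmetric hash: XOR is commutative, so swapping source and
 * destination fields does not change the result.
 */
static uint32_t toy_symmetric_hash(const struct flow *f)
{
	uint32_t ip, port;

	assert(f != NULL);
	ip = f->src_ip ^ f->dst_ip;
	port = (uint32_t)(f->src_port ^ f->dst_port);
	return ip ^ (port << 16) ^ port;
}
```

Hashing a flow and its reverse (source and destination swapped) with this function returns the same value.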

[dpdk-dev] [PATCH v5 1/5] i40e: Use constant random hash keys

2014-10-21 Thread Helin Zhang

To simplify the code and remove the race condition, prepared
constant random hash keys are used instead of generating the hash
keys at runtime.

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 3b75f0f..5e5cfbe 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -191,9 +191,6 @@ static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
enum rte_filter_op filter_op,
void *arg);

-/* Default hash key buffer for RSS */
-static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
-
 static struct rte_pci_id pci_id_i40e_map[] = {
 #define RTE_PCI_DEV_ID_DECL_I40E(vend, dev) {RTE_PCI_DEVICE(vend, dev)},
 #include "rte_pci_dev_ids.h"
@@ -4117,9 +4114,12 @@ i40e_pf_config_rss(struct i40e_pf *pf)
}
if (rss_conf.rss_key == NULL || rss_conf.rss_key_len <
(I40E_PFQF_HKEY_MAX_INDEX + 1) * sizeof(uint32_t)) {
-   /* Calculate the default hash key */
-   for (i = 0; i <= I40E_PFQF_HKEY_MAX_INDEX; i++)
-   rss_key_default[i] = (uint32_t)rte_rand();
+   /* Random default keys */
+   static uint32_t rss_key_default[] = {0x6b793944,
+   0x23504cb5, 0x5bea75b6, 0x309f4f12, 0x3dc0a2b8,
+   0x024ddcdf, 0x339b8ca0, 0x4c4af64a, 0x34fac605,
+   0x55d85839, 0x3a58997d, 0x2ec938e1, 0x66031581};
+
rss_conf.rss_key = (uint8_t *)rss_key_default;
rss_conf.rss_key_len = (I40E_PFQF_HKEY_MAX_INDEX + 1) *
sizeof(uint32_t);
-- 
1.8.1.4
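
For readers wondering how such a key is consumed: RSS commonly uses the Toeplitz hash, which the rest of this series also names. Below is a minimal software sketch of the textbook Toeplitz computation over a byte-array key. It is an illustration, not the hardware's implementation; the function name is hypothetical, and note that the patch stores the key as uint32_t words, so the byte order seen through the uint8_t cast depends on the host endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Textbook Toeplitz hash: for every set bit of the input (most
 * significant bit first), XOR in the 32-bit window of the key that
 * starts at the same bit offset. Needs key_len >= data_len + 4.
 */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
			      const uint8_t *data, size_t data_len)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t i;
	int bit;

	assert(key_len >= data_len + 4);

	/* Leftmost 32 bits of the key form the initial window. */
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < data_len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window left by one bit, pulling in the next key bit. */
			window = (window << 1) |
				 (uint32_t)((key[i + 4] >> bit) & 1u);
		}
	}
	return hash;
}
```

Because the hash depends on the key, a fixed key gives reproducible queue assignments across runs, which is a side effect of replacing rte_rand() with constants.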

[dpdk-dev] [PATCH v5 5/5] app/testpmd: add commands to support hash filter

2014-10-21 Thread Helin Zhang

To demonstrate the hash filter control, commands are added. They are
- get_sym_hash_ena_per_port
- set_sym_hash_ena_per_port
- get_sym_hash_ena_per_pctype
- set_sym_hash_ena_per_pctype
- get_filter_swap
- set_filter_swap
- get_hash_function
- set_hash_function

Signed-off-by: Helin Zhang 
---
 app/test-pmd/cmdline.c | 566 +
 1 file changed, 566 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0b972f9..23c669a 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -74,6 +74,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -660,6 +661,35 @@ static void cmd_help_long_parsed(void *parsed_result,

"get_flex_filter (port_id) index (idx)\n"
"get info of a flex filter.\n\n"
+
+   "get_sym_hash_ena_per_port (port_id)\n"
+   "get symmetric hash enable configuration per port.\n\n"
+
+   "set_sym_hash_ena_per_port (port_id)"
+   " (enable|disable)\n"
+   "set symmetric hash enable configuration per port"
+   " to enable or disable.\n\n"
+
+   "get_sym_hash_ena_per_pctype (port_id) (pctype)\n"
+   "get symmetric hash enable configuration per port\n\n"
+
+   "set_sym_hash_ena_per_pctype (port_id) (pctype)"
+   " (enable|disable)\n"
+   "set symmetric hash enable configuration per"
+   " pctype to enable or disable.\n\n"
+
+   "get_filter_swap (port_id) (pctype)\n"
+   "get filter swap configurations.\n\n"
+
+   "set_filter_swap (port_id) (pctype) (off0_src0) (off0_src1)"
+   " (len0) (off1_src0) (off1_src1) (len1)\n"
+   "set filter swap configurations.\n\n"
+
+   "get_hash_function (port_id)\n"
+   "get hash function of Toeplitz or Simple XOR.\n\n"
+
+   "set_hash_function (port_id) (toeplitz|simple_xor)\n"
+   "set the hash function to Toeplitz or Simple XOR.\n\n"
);
}
 }
@@ -7415,6 +7445,534 @@ cmdline_parse_inst_t cmd_get_flex_filter = {
},
 };

+/* *** Classification Filters Control *** */
+
+/* *** Get symmetric hash enable per port *** */
+struct cmd_get_sym_hash_ena_per_port_result {
+   cmdline_fixed_string_t get_sym_hash_ena_per_port;
+   uint8_t port_id;
+};
+
+static void
+cmd_get_sym_hash_per_port_parsed(void *parsed_result,
+__rte_unused struct cmdline *cl,
+__rte_unused void *data)
+{
+   struct cmd_get_sym_hash_ena_per_port_result *res = parsed_result;
+   struct rte_eth_hash_filter_info info;
+   int ret;
+
+   if (rte_eth_dev_filter_supported(res->port_id,
+   RTE_ETH_FILTER_HASH) < 0) {
+   printf("RTE_ETH_FILTER_HASH not supported on port: %d\n",
+   res->port_id);
+   return;
+   }
+
+   memset(&info, 0, sizeof(info));
+   info.info_type = RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT;
+   ret = rte_eth_dev_filter_ctrl(res->port_id, RTE_ETH_FILTER_HASH,
+   RTE_ETH_FILTER_GET, &info);
+   if (ret < 0) {
+   printf("Cannot get symmetric hash enable per port "
+   "on port %u\n", res->port_id);
+   return;
+   }
+
+   printf("Symmetric hash is %s on port %u\n", info.info.enable ?
+   "enabled" : "disabled", res->port_id);
+}
+
+cmdline_parse_token_string_t cmd_get_sym_hash_ena_per_port_all =
+   TOKEN_STRING_INITIALIZER(struct cmd_get_sym_hash_ena_per_port_result,
+   get_sym_hash_ena_per_port, "get_sym_hash_ena_per_port");
+cmdline_parse_token_num_t cmd_get_sym_hash_ena_per_port_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_get_sym_hash_ena_per_port_result,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_get_sym_hash_ena_per_port = {
+   .f = cmd_get_sym_hash_per_port_parsed,
+   .data = NULL,
+   .help_str = "get_sym_hash_ena_per_port port_id",
+   .tokens = {
+   (void *)&cmd_get_sym_hash_ena_per_port_all,
+   (void *)&cmd_get_sym_hash_ena_per_port_port_id,
+   NULL,
+   },
+};
+
+/* *** Set symmetric hash enable per port *** */
+struct cmd_set_sym_hash_ena_per_port_result {
+   cmdline_fixed_string_t set_sym_hash_ena_per_port;
+   cmdline_fixed_string_t enable;
+   uint8_t port_id;
+};
+
+static void
+cmd_set_sym_hash_per_port_parsed(void *parsed_result,
+__rte_unused

[dpdk-dev] [PATCH v5 3/5] i40e: add hash filter control implementation

2014-10-21 Thread Helin Zhang

Hash filter control has been implemented for i40e. It includes
getting/setting,
- hash function type
- symmetric hash enable per pctype (packet classification type)
- symmetric hash enable per port
- filter swap configuration

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 377 +-
 1 file changed, 375 insertions(+), 2 deletions(-)

v5 changes:
* Integrated with filter API defined recently.

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 5e5cfbe..7531e3c 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -4145,6 +4145,378 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf)
return 0;
 }

+/* Get the symmetric hash enable configurations per PCTYPE */
+static int
+i40e_get_symmetric_hash_enable_per_pctype(struct i40e_hw *hw,
+   struct rte_eth_sym_hash_ena_info *info)
+{
+   uint32_t reg;
+
+   switch (info->pctype) {
+   case ETH_RSS_NONF_IPV4_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV4_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV4_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV4_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV4_SHIFT:
+   case ETH_RSS_NONF_IPV6_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV6_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV6_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV6_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV6_SHIFT:
+   case ETH_RSS_L2_PAYLOAD_SHIFT:
+   reg = I40E_READ_REG(hw, I40E_GLQF_HSYM(info->pctype));
+   info->enable = reg & I40E_GLQF_HSYM_SYMH_ENA_MASK ? 1 : 0;
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "PCTYPE[%u] not supported", info->pctype);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+/* Set the symmetric hash enable configurations per PCTYPE */
+static int
+i40e_set_symmetric_hash_enable_per_pctype(struct i40e_hw *hw,
+   const struct rte_eth_sym_hash_ena_info *info)
+{
+   uint32_t reg;
+
+   switch (info->pctype) {
+   case ETH_RSS_NONF_IPV4_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV4_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV4_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV4_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV4_SHIFT:
+   case ETH_RSS_NONF_IPV6_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV6_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV6_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV6_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV6_SHIFT:
+   case ETH_RSS_L2_PAYLOAD_SHIFT:
+   reg = info->enable ? I40E_GLQF_HSYM_SYMH_ENA_MASK : 0;
+   I40E_WRITE_REG(hw, I40E_GLQF_HSYM(info->pctype), reg);
+   I40E_WRITE_FLUSH(hw);
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "PCTYPE[%u] not supported", info->pctype);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+/* Get the symmetric hash enable configurations per port */
+static void
+i40e_get_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t *enable)
+{
+   uint32_t reg = I40E_READ_REG(hw, I40E_PRTQF_CTL_0);
+
+   *enable = reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK ? 1 : 0;
+}
+
+/* Set the symmetric hash enable configurations per port */
+static void
+i40e_set_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t enable)
+{
+   uint32_t reg = I40E_READ_REG(hw, I40E_PRTQF_CTL_0);
+
+   if (enable > 0) {
+   if (reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK) {
+   PMD_DRV_LOG(INFO, "Symmetric hash has already "
+   "been enabled");
+   return;
+   }
+   reg |= I40E_PRTQF_CTL_0_HSYM_ENA_MASK;
+   } else {
+   if (!(reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK)) {
+   PMD_DRV_LOG(INFO, "Symmetric hash has already "
+   "been disabled");
+   return;
+   }
+   reg &= ~I40E_PRTQF_CTL_0_HSYM_ENA_MASK;
+   }
+   I40E_WRITE_REG(hw, I40E_PRTQF_CTL_0, reg);
+   I40E_WRITE_FLUSH(hw);
+}
+
+/* Get filter swap configurations */
+static int
+i40e_get_filter_swap(struct i40e_hw *hw, struct rte_eth_filter_swap_info *info)
+{
+   uint32_t reg;
+
+   switch (info->pctype) {
+   case ETH_RSS_NONF_IPV4_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV4_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV4_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV4_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV4_SHIFT:
+   case ETH_RSS_NONF_IPV6_UDP_SHIFT:
+   case ETH_RSS_NONF_IPV6_TCP_SHIFT:
+   case ETH_RSS_NONF_IPV6_SCTP_SHIFT:
+   case ETH_RSS_NONF_IPV6_OTHER_SHIFT:
+   case ETH_RSS_FRAG_IPV6_SHIFT:
+   case ETH_RSS_L2_PAYLOAD_SHIFT:
+   reg = I40E_READ_REG(hw, I40E_GLQF_SWAP(0, info->pctype));
+   PMD_DRV_LOG(DEBUG, "Value read from I40E_GLQF_SWAP[0,%d]: "
+   "0x%x", info->pctype, reg);
+

[dpdk-dev] [PATCH v5 2/5] ethdev: add enum type and relevant structures for hash filter control

2014-10-21 Thread Helin Zhang

enum type and relevant structures are added in rte_eth_ctrl.h to
support hash filter control.

Signed-off-by: Helin Zhang 
---
 lib/librte_ether/rte_eth_ctrl.h | 75 +
 1 file changed, 75 insertions(+)

v5 changes:
* Integrated with filter API defined recently.

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index df21ac6..c469b57 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -51,6 +51,7 @@ extern "C" {
  */
 enum rte_filter_type {
RTE_ETH_FILTER_NONE = 0,
+   RTE_ETH_FILTER_HASH,
RTE_ETH_FILTER_MAX
 };

@@ -71,6 +72,80 @@ enum rte_filter_op {
RTE_ETH_FILTER_OP_MAX
 };

+/**
+ * Hash filter information types.
+ */
+enum rte_eth_hash_filter_info_type {
+   RTE_ETH_HASH_FILTER_INFO_TYPE_UNKNOWN = 0,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_HASH_FUNCTION,
+   RTE_ETH_HASH_FILTER_INFO_TYPE_MAX,
+};
+
+/**
+ * Hash function types.
+ */
+enum rte_eth_hash_function {
+   RTE_ETH_HASH_FUNCTION_UNKNOWN = 0,
+   RTE_ETH_HASH_FUNCTION_TOEPLITZ,
+   RTE_ETH_HASH_FUNCTION_SIMPLE_XOR,
+   RTE_ETH_HASH_FUNCTION_MAX,
+};
+
+/**
+ * A structure used to set or get symmetric hash enable information, to support
+ * 'RTE_ETH_FILTER_HASH', 'RTE_ETH_FILTER_GET/RTE_ETH_FILTER_SET', with
+ * information type 'RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE'.
+ */
+struct rte_eth_sym_hash_ena_info {
+   /**< packet classification type, defined in rte_ethdev.h */
+   uint8_t pctype;
+   uint8_t enable; /**< enable or disable flag */
+};
+
+/**
+ * A structure used to set or get filter swap information, to support
+ * 'RTE_ETH_FILTER_HASH', 'RTE_ETH_FILTER_GET/RTE_ETH_FILTER_SET',
+ * with information type 'RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP'.
+ */
+struct rte_eth_filter_swap_info {
+   /**< Packet classification type, defined in rte_ethdev.h */
+   uint8_t pctype;
+   /**< Offset of the 1st field of the 1st couple to be swapped. */
+   uint8_t off0_src0;
+   /**< Offset of the 2nd field of the 1st couple to be swapped. */
+   uint8_t off0_src1;
+   /**< Field length of the first couple. */
+   uint8_t len0;
+   /**< Offset of the 1st field of the 2nd couple to be swapped. */
+   uint8_t off1_src0;
+   /**< Offset of the 2nd field of the 2nd couple to be swapped. */
+   uint8_t off1_src1;
+   /**< Field length of the second couple. */
+   uint8_t len1;
+};
+
+/**
+ * A structure used to set or get hash filter information, to support filter
+ * type of 'RTE_ETH_FILTER_HASH' and its operations.
+ */
+struct rte_eth_hash_filter_info {
+   enum rte_eth_hash_filter_info_type info_type; /**< Information type. */
+   /**< Details of hash filter information */
+   union {
+   /* For RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE */
+   struct rte_eth_sym_hash_ena_info sym_hash_ena;
+   /* For RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP */
+   struct rte_eth_filter_swap_info filter_swap;
+   /* For RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT */
+   uint8_t enable;
+   /* For RTE_ETH_HASH_FILTER_INFO_TYPE_HASH_FUNCTION */
+   enum rte_eth_hash_function hash_function;
+   } info;
+};
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.8.1.4
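
The rte_eth_hash_filter_info structure above is a tagged union: info_type says which member of info is meaningful. A hedged sketch of how a caller might fill and interpret it follows. To stay buildable without DPDK headers it uses trimmed local copies of the definitions (only two info types), and all names in it are hypothetical stand-ins rather than the API from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Trimmed local stand-ins for the enums and struct defined in the patch. */
enum hash_info_type {
	INFO_TYPE_UNKNOWN = 0,
	INFO_TYPE_SYM_HASH_ENA_PER_PORT,
	INFO_TYPE_HASH_FUNCTION,
};

enum hash_function {
	HASH_FUNCTION_UNKNOWN = 0,
	HASH_FUNCTION_TOEPLITZ,
	HASH_FUNCTION_SIMPLE_XOR,
};

struct hash_filter_info {
	enum hash_info_type info_type;		/* selects the union member */
	union {
		uint8_t enable;			/* SYM_HASH_ENA_PER_PORT */
		enum hash_function hash_function; /* HASH_FUNCTION */
	} info;
};

/* Zero the whole structure, then set the tag, as testpmd does before a GET. */
static void hash_info_init(struct hash_filter_info *hi, enum hash_info_type type)
{
	memset(hi, 0, sizeof(*hi));
	hi->info_type = type;
}

static const char *hash_function_name(enum hash_function f)
{
	switch (f) {
	case HASH_FUNCTION_TOEPLITZ:
		return "toeplitz";
	case HASH_FUNCTION_SIMPLE_XOR:
		return "simple_xor";
	default:
		return "unknown";
	}
}

/*
 * Format the union member selected by info_type into buf.
 * Returns 0 on success, -1 if the info type is not handled.
 */
static int describe_hash_info(const struct hash_filter_info *hi,
			      char *buf, size_t len)
{
	assert(hi != NULL && buf != NULL);
	switch (hi->info_type) {
	case INFO_TYPE_SYM_HASH_ENA_PER_PORT:
		snprintf(buf, len, "symmetric hash %s",
			 hi->info.enable ? "enabled" : "disabled");
		return 0;
	case INFO_TYPE_HASH_FUNCTION:
		snprintf(buf, len, "hash function %s",
			 hash_function_name(hi->info.hash_function));
		return 0;
	default:
		return -1;
	}
}
```

Reading a union member other than the one named by info_type would return meaningless data, which is why every consumer switches on the tag first.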

[dpdk-dev] [PATCH v5 4/5] i40e: add hardware initialization

2014-10-21 Thread Helin Zhang

As global registers are reset only by a whole chip reset,
those registers might not be in an initial state each time
a physical port is launched. The hardware initialization is added to
put specific global registers into an initial state.

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 7531e3c..cedd09a 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -190,6 +190,7 @@ static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
enum rte_filter_type filter_type,
enum rte_filter_op filter_op,
void *arg);
+static void i40e_hw_init(struct i40e_hw *hw);

 static struct rte_pci_id pci_id_i40e_map[] = {
 #define RTE_PCI_DEV_ID_DECL_I40E(vend, dev) {RTE_PCI_DEVICE(vend, dev)},
@@ -368,6 +369,9 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,
/* Make sure all is clean before doing PF reset */
i40e_clear_hw(hw);

+   /* Initialize the hardware */
+   i40e_hw_init(hw);
+
/* Reset here to make sure all is clean for each PF */
ret = i40e_pf_reset(hw);
if (ret) {
@@ -4541,3 +4545,77 @@ i40e_dev_filter_ctrl(struct rte_eth_dev *dev,

return ret;
 }
+
+/* Initialization for hash function */
+static void
+i40e_hash_function_hw_init(struct i40e_hw *hw)
+{
+   uint32_t i;
+   const struct rte_eth_sym_hash_ena_info sym_hash_ena_info[] = {
+   {ETH_RSS_NONF_IPV4_UDP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV4_TCP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV4_SCTP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV4_OTHER_SHIFT, 0},
+   {ETH_RSS_FRAG_IPV4_SHIFT, 0},
+   {ETH_RSS_NONF_IPV6_UDP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV6_TCP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV6_SCTP_SHIFT, 0},
+   {ETH_RSS_NONF_IPV6_OTHER_SHIFT, 0},
+   {ETH_RSS_FRAG_IPV6_SHIFT, 0},
+   {ETH_RSS_L2_PAYLOAD_SHIFT, 0},
+   };
+   const struct rte_eth_filter_swap_info swap_info[] = {
+   {ETH_RSS_NONF_IPV4_UDP_SHIFT,
+   0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02},
+   {ETH_RSS_NONF_IPV4_TCP_SHIFT,
+   0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02},
+   {ETH_RSS_NONF_IPV4_SCTP_SHIFT,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_RSS_NONF_IPV4_OTHER_SHIFT,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_RSS_FRAG_IPV4_SHIFT,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_RSS_NONF_IPV6_UDP_SHIFT,
+   0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02},
+   {ETH_RSS_NONF_IPV6_TCP_SHIFT,
+   0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02},
+   {ETH_RSS_NONF_IPV6_SCTP_SHIFT,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_RSS_NONF_IPV6_OTHER_SHIFT,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_RSS_FRAG_IPV6_SHIFT,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_RSS_L2_PAYLOAD_SHIFT,
+   0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+   };
+
+   /* Disable symmetric hash per PCTYPE */
+   for (i = 0; i < RTE_DIM(sym_hash_ena_info); i++)
+   i40e_set_symmetric_hash_enable_per_pctype(hw,
+   &sym_hash_ena_info[i]);
+
+   /* Disable symmetric hash per port */
+   i40e_set_symmetric_hash_enable_per_port(hw, 0);
+
+   /* Initialize filter swap */
+   for (i = 0; i < RTE_DIM(swap_info); i++)
+   i40e_set_filter_swap(hw, &swap_info[i]);
+
+   /* Set hash function to Toeplitz by default */
+   i40e_set_hash_function(hw, RTE_ETH_HASH_FUNCTION_TOEPLITZ);
+}
+
+/*
+ * As global registers wouldn't be reset unless a global hardware reset,
+ * hardware initialization is needed to put those registers into an
+ * expected initial state.
+ */
+static void
+i40e_hw_init(struct i40e_hw *hw)
+{
+   /* clear the PF Queue Filter control register */
+   I40E_WRITE_REG(hw, I40E_PFQF_CTL_0, 0);
+
+   /* Initialize hardware for hash function */
+   i40e_hash_function_hw_init(hw);
+}
-- 
1.8.1.4
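
The initialization above is table-driven: a constant array of defaults is walked with RTE_DIM and each entry is written to the hardware. The pattern can be sketched on its own with a plain array standing in for the registers. Everything here is hypothetical (names, register indexes, and the default values, which are arbitrary); it only shows why stale state left by a previous run is overwritten no matter what it was.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same idea as DPDK's RTE_DIM: number of elements in an array. */
#define ARRAY_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* One default value for one (simulated) register. */
struct reg_default {
	unsigned index;
	uint32_t value;
};

/* Write every default from the table into the simulated register block. */
static void apply_defaults(uint32_t *regs, size_t nb_regs,
			   const struct reg_default *tbl, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		assert(tbl[i].index < nb_regs);
		regs[tbl[i].index] = tbl[i].value;
	}
}

/* Put the registers covered by the table into a known initial state. */
static void fake_hw_init(uint32_t *regs, size_t nb_regs)
{
	/* Arbitrary example defaults, not real i40e values. */
	static const struct reg_default defaults[] = {
		{0, 0x00000000u},
		{1, 0x00000000u},
		{2, 0x00000004u},
	};

	apply_defaults(regs, nb_regs, defaults, ARRAY_DIM(defaults));
}
```

Registers not listed in the table keep their previous contents, so the table has to cover every global register the driver relies on.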

[dpdk-dev] [PATCH] examples/vmdq: support i40e in vmdq example

2014-10-21 Thread Cao, Min

Tested-by: Min Cao 

This patch has been verified on fortville and it is ready to be integrated to 
dpdk.org.

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Huawei Xie
Sent: Wednesday, September 24, 2014 6:54 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH] examples/vmdq: support i40e in vmdq example

With i40e, the queue index of VMDQ pools doesn't always start from zero, and
the queues aren't all occupied by VMDQ. This information is retrieved through
rte_eth_dev_info_get and used to initialise VMDQ.

Huawei Xie (1):
  support i40e in vmdq example

 examples/vmdq/main.c | 162 ++-
 1 file changed, 97 insertions(+), 65 deletions(-)

-- 
1.8.1.4
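
The point about queue indexes not starting from zero can be made concrete with a small calculation. The sketch below assumes the device reports a base queue index and a total queue count for VMDQ, and that the queues are divided evenly among the pools; the struct and its field names are hypothetical stand-ins for whatever rte_eth_dev_info_get returns, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the VMDQ-related device information. */
struct vmdq_layout {
	uint16_t queue_base;	/* first queue index owned by VMDQ */
	uint16_t queue_num;	/* number of queues owned by VMDQ */
	uint16_t nb_pools;	/* number of VMDQ pools in use */
};

/*
 * First hardware queue index of a pool, assuming queues are split
 * evenly among pools. Returns -1 for an invalid pool or layout.
 */
static int vmdq_pool_first_queue(const struct vmdq_layout *l, uint16_t pool)
{
	uint16_t per_pool;

	assert(l != NULL);
	if (l->nb_pools == 0 || pool >= l->nb_pools)
		return -1;
	per_pool = l->queue_num / l->nb_pools;
	return l->queue_base + pool * per_pool;
}
```

With a base of 64, 32 VMDQ queues, and 8 pools, pool 3 starts at queue 76, whereas code that assumes a base of zero would wrongly use queue 12.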

[dpdk-dev] [PATCH v2 0/6] i40e VMDQ support

2014-10-21 Thread Cao, Min

Tested-by: Min Cao 

This patch has been verified on fortville and it is ready to be integrated to 
dpdk.org.

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Chen Jing D(Mark)
Sent: Thursday, October 16, 2014 6:07 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH v2 0/6] i40e VMDQ support

From: "Chen Jing D(Mark)" 

v2:
- Fix a few typos.
- Add comments for RX mq mode flags.
- Remove '\n' from some log messages.
- Remove 'Acked-by' in commit log.

v1:
Define extra VMDQ arguments to expand VMDQ configuration. This also
includes change in igb and ixgbe PMD driver. In the meanwhile, fix 2
defects in rte_ether library.

Add full VMDQ support in i40e PMD driver. Rename some functions and set up
the VMDQ VSI after it's enabled in the application. It also makes some
improvements to macaddr add/delete to support setting multiple macaddr for
single or multiple pools.

Finally, change i40e rx/tx_queue_setup and dev_start/stop functions to
configure/switch queues belonging to VMDQ pools.

Chen Jing D(Mark) (6):
  ether: enhancement for VMDQ support
  igb: change for VMDQ arguments expansion
  ixgbe: change for VMDQ arguments expansion
  i40e: add VMDQ support
  i40e: macaddr add/del enhancement
  i40e: Add full VMDQ pools support

 config/common_linuxapp  |1 +
 lib/librte_ether/rte_ethdev.c   |   12 +-
 lib/librte_ether/rte_ethdev.h   |   43 +++-
 lib/librte_pmd_e1000/igb_ethdev.c   |3 +
 lib/librte_pmd_i40e/i40e_ethdev.c   |  499 ++-
 lib/librte_pmd_i40e/i40e_ethdev.h   |   21 ++-
 lib/librte_pmd_i40e/i40e_rxtx.c |  125 +++--
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 +
 8 files changed, 536 insertions(+), 169 deletions(-)

-- 
1.7.7.6

[dpdk-dev] [PATCH 0/6] i40e VMDQ support

2014-10-21 Thread Cao, Min

Tested-by: Min Cao 

This patch has been verified on fortville and it is ready to be integrated to 
dpdk.org.

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Chen Jing D(Mark)
Sent: Tuesday, September 23, 2014 9:14 PM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH 0/6] i40e VMDQ support

From: "Chen Jing D(Mark)" 

Define extra VMDQ arguments to expand VMDQ configuration. This also
includes change in igb and ixgbe PMD driver. In the meanwhile, fix 2
defects in rte_ether library.

Add full VMDQ support in i40e PMD driver. Rename some functions and set up
the VMDQ VSI after it's enabled in the application. It also makes some
improvements to macaddr add/delete to support setting multiple macaddr for
single or multiple pools.

Finally, change i40e rx/tx_queue_setup and dev_start/stop functions to
configure/switch queues belonging to VMDQ pools.

Chen Jing D(Mark) (6):
  ether: enhancement for VMDQ support
  igb: change for VMDQ arguments expansion
  ixgbe: change for VMDQ arguments expansion
  i40e: add VMDQ support
  i40e: macaddr add/del enhancement
  i40e: Add full VMDQ pools support

 config/common_linuxapp  |1 +
 lib/librte_ether/rte_ethdev.c   |   12 +-
 lib/librte_ether/rte_ethdev.h   |   39 ++-
 lib/librte_pmd_e1000/igb_ethdev.c   |3 +
 lib/librte_pmd_i40e/i40e_ethdev.c   |  509 ++-
 lib/librte_pmd_i40e/i40e_ethdev.h   |   21 ++-
 lib/librte_pmd_i40e/i40e_rxtx.c |  125 +++--
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |1 +
 8 files changed, 537 insertions(+), 174 deletions(-)

-- 
1.7.7.6

[dpdk-dev] [PATCH v4] KNI: use a memzone pool for KNI alloc/release

2014-10-21 Thread Zhang, Helin

> This patch implements the KNI memzone pool in order to prevent memzone
> exhaustion when allocating/deallocating KNI interfaces.
> 
> It adds a new API call, rte_kni_init(max_kni_ifaces) that shall be called
> before any call to rte_kni_alloc() if KNI is used.
> 
> v2: Moved KNI fd opening to rte_kni_init(). Revised style.
> v3: Adapted kni examples/tests to rte_kni_init().
> v4: Improved example integration. Fixed kni_memzone_pool_alloc/release()
> bug.
> 
> Signed-off-by: Marc Sune 

Acked-by: Helin Zhang 

> ---
>  app/test/test_kni.c  |5 +-
>  examples/kni/main.c  |   22 
>  lib/librte_kni/rte_kni.c |  317 +-
>  lib/librte_kni/rte_kni.h |   18 +++
>  4 files changed, 302 insertions(+), 60 deletions(-)
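
The idea behind the memzone pool, reserving a fixed number of slots once and then recycling them instead of reserving new memory on every allocation, can be sketched with a simple free list. This is a generic illustration under that assumption, not the code from the patch: the names are hypothetical and calloc() stands in for memzone reservation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One reusable slot; in the real patch a slot would own a set of memzones. */
struct slot {
	struct slot *next;	/* next free slot, NULL when in use or last */
	int in_use;
	unsigned id;
};

struct slot_pool {
	struct slot *slots;	/* backing array, reserved once at init */
	struct slot *free_head;	/* head of the free list */
	unsigned max;
};

/* Reserve all slots up front and chain them into the free list. */
static int pool_init(struct slot_pool *p, unsigned max)
{
	unsigned i;

	if (max == 0)
		return -1;
	p->slots = calloc(max, sizeof(*p->slots));
	if (p->slots == NULL)
		return -1;
	p->max = max;
	p->free_head = NULL;
	for (i = max; i-- > 0; ) {
		p->slots[i].id = i;
		p->slots[i].next = p->free_head;
		p->free_head = &p->slots[i];
	}
	return 0;
}

/* Take a slot from the free list; NULL when the pool is exhausted. */
static struct slot *pool_alloc(struct slot_pool *p)
{
	struct slot *s = p->free_head;

	if (s == NULL)
		return NULL;
	p->free_head = s->next;
	s->next = NULL;
	s->in_use = 1;
	return s;
}

/* Return a slot to the free list so a later alloc can reuse it. */
static void pool_release(struct slot_pool *p, struct slot *s)
{
	assert(s != NULL && s->in_use);
	s->in_use = 0;
	s->next = p->free_head;
	p->free_head = s;
}
```

Since nothing is reserved after pool_init(), repeated alloc/release cycles cannot exhaust the underlying resource, at the cost of fixing the maximum number of interfaces up front.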

[dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd

2014-10-21 Thread Ouyang Changchun

Fix the compilation issue in vector PMD when macro RTE_MBUF_REFCNT is disabled.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
index 2236250..a0d3d78 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -541,7 +541,7 @@ ixgbe_tx_free_bufs(struct igb_tx_queue *txq)
 #ifdef RTE_MBUF_REFCNT
m = __rte_pktmbuf_prefree_seg(txep[i].mbuf);
 #else
-   m = txep[i]->mbuf;
+   m = txep[i].mbuf;
 #endif
if (likely(m != NULL)) {
if (likely(m->pool == free[0]->pool))
-- 
1.8.4.2

[dpdk-dev] Why do we need iommu=pt?

2014-10-21 Thread Alex Markuze

DPDK uses a 1:1 mapping and doesn't support IOMMU.  IOMMU allows for
simpler VM physical address translation.
The second role of IOMMU is to allow protection from unwanted memory access
by an unsafe device that has DMA privileges. Unfortunately this protection
comes with an extremely high performance cost for high speed NICs.

To your question: iommu=pt disables IOMMU support for the hypervisor.

On Tue, Oct 21, 2014 at 1:39 AM, Xie, Huawei  wrote:

>
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shivapriya Hiremath
> > Sent: Monday, October 20, 2014 2:59 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] Why do we need iommu=pt?
> >
> > Hi,
> >
> > My question is that if the Poll mode driver used the DMA kernel interface
> > to set up its mappings appropriately, would it still require that iommu=pt
> > be set?
> > What is the purpose of setting iommu=pt ?
> PMD allocates memory through hugetlb file system, and fills the physical
> address into the descriptor.
> pt is used to pass through iotlb translation. Refer to the below link.
> http://lkml.iu.edu/hypermail/linux/kernel/0906.2/02129.html
> >
> > Thank you.
>
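
The reply above says the PMD writes physical addresses into the descriptors. On Linux, one way to obtain such an address from user space is to read the 64-bit entry for the page from /proc/self/pagemap. The sketch below only decodes an entry that has already been read; the function name is hypothetical, and the bit layout (PFN in bits 0-54, "present" in bit 63) is taken from the kernel's pagemap documentation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGEMAP_PFN_MASK ((UINT64_C(1) << 55) - 1) /* bits 0-54: page frame number */
#define PAGEMAP_PRESENT  (UINT64_C(1) << 63)       /* bit 63: page is in RAM */

/*
 * Combine one 64-bit /proc/self/pagemap entry with the virtual address it
 * describes. Returns 0 and stores the physical address in *phys, or -1 if
 * the page is not present in RAM.
 */
static int
pagemap_entry_to_phys(uint64_t entry, uint64_t virt, uint64_t page_size,
		      uint64_t *phys)
{
	assert(phys != NULL && page_size != 0);
	if (!(entry & PAGEMAP_PRESENT))
		return -1;
	*phys = (entry & PAGEMAP_PFN_MASK) * page_size + (virt % page_size);
	return 0;
}
```

An address computed this way is only usable for DMA when the device sees physical addresses untranslated, which is the situation iommu=pt is meant to provide for host devices.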

[dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd

2014-10-21 Thread Thomas Monjalon

2014-10-21 14:59, Ouyang Changchun:
> Fix the compilation issue in vector PMD when macro RTE_MBUF_REFCNT is 
> disabled.
> 
> Signed-off-by: Changchun Ouyang 

Acked-by: Thomas Monjalon 

Applied

Thanks
-- 
Thomas

[dpdk-dev] Memory corruption in librte_ether?

2014-10-21 Thread Marc Sune

Pablo,

I've only tried with the kni-autotest but it seems to work fine. Thanks!

Btw, at least in my development VM the kni-autotest in the current head 
(455d09e i40e: generic filter control), but also in v1.7.1, fails:

RTE>>kni_autotest
master lcore: 0
count: 2
PMD: eth_em_rx_queue_setup(): sw_ring=0x7f27ab4e8100 
hw_ring=0x7f27aa60 dma_addr=0x36e0
PMD: eth_em_tx_queue_setup(): sw_ring=0x7f27ab4e6000 
hw_ring=0x7f27aa61 dma_addr=0x36e1
PMD: eth_em_start(): <<
KNI: pci: 00:06:00  8086:100e
KNI: Invalid KNI request operation.
KNI: Invalid kni info.
KNI: The KNI request operationhas already registered.
Change MTU of port 0 to 1450
Change MTU of port 0 to 1450 successfully.
KNI: Invalid kni info.
The ingress/egress number should not be less than 100
Test Failed
RTE>>

Maybe it is simply a lack of resources in the qemu VM.

Saludos
marc

On 20/10/14 19:31, De Lara Guarch, Pablo wrote:
> Hi Marc,
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Marc Sune
>> Sent: Friday, October 17, 2014 10:17 PM
>> To: 
>> Subject: [dpdk-dev] Memory corruption in librte_ether?
>>
>> Hi all,
>>
>> I was rebasing the KNI mempool v4 patch(I have it finalised, but wanted
>> to check) to the latest master HEAD
>> (075e064089e1c2b6899db58c69be1a387eb5ffa7) when I ran into problems
>> with
>> the current KNI example with em interfaces in a VM. I then switched to
>> master's head and retried (so without the KNI mempool patch!) with the
>> *same behaviour*. Behaviour here listed is with master head, so nothing
>> to do with the patch I am working on.
>>
>> The *VM*, emulated with qemu has 4 e1000 interfaces attached to several
>> bridges. qemu version 1.1.2 running in debian 7 64bit. With this setup I
>> get the error:
>>
> [...]
>> Which seems to indicate rte_eth_dev_info_get() is somehow corrupting
>> memory(??). But I haven't figured out the problem (yet). I suspect:
>>
>> commit fbde27f19ab8f1d386868275bd8c016e693cf073
>> Author: Pablo de Lara 
>> Date:   Wed Oct 1 10:49:04 2014 +0100
>>
>>   ethdev: get default Rx/Tx configuration from dev info
>>
>>   Many sample apps use duplicated code to set rte_eth_txconf and
>> rte_eth_rxconf
>>   structures. This patch allows the user to get a default optimal
>> RX/TX configuration
>>   through rte_eth_dev_info get, and still any parameters may be
>> tweaked as wished,
>>   before setting up queues.
>>
>>   Besides, if a NULL pointer is passed to rte_eth_rx_queue_setup or
>>   rte_eth_tx_queue_setup, these functions get internally the default
>> RX/TX
>>   configuration for the user.
>>
>>   Signed-off-by: Pablo de Lara 
>>   Reviewed-by: Bruce Richardson 
>>   Acked-by: David Marchand 
>>   [Thomas: split patch]
>>
>> commit a30268e9a2d0618902e8cf96b90b27db4fb02d54
>> Author: Pablo de Lara 
>> Date:   Wed Oct 1 10:49:03 2014 +0100
>>
>>   ethdev: reset whole dev info structure before filling
>>
>>   To guarantee that RX/TX configuration structures are reseted
>>   before modifying them, plus the other dev info fields,
>>   dev info structure is zeroed beforehand.
>>
>>   Signed-off-by: Pablo de Lara 
>>   Acked-by: David Marchand 
>>
>>
>> Can anyone confirm it?
> I just pushed a fix for that problem. Indeed, the dev_info structure was
> polluted, because I was calling the specific dev_info_get function in the
> PMDs, and not calling rte_eth_dev_info_get in rte_ethdev.c, which means
> that the dev_info structure was not being reset.
> In your case, em PMD does not overwrite the rte_eth_rxconf and
> rte_eth_txconf structures, and then you find random data in those structures.
> Well spotted and thanks very much for all the details.
> I would appreciate if you could verify that this patch works for you.
>
> Thanks,
> Pablo
>> Marc
>>
>> p.s. Has someone managed to run a dpdk app with valgrind?
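
The cause described above (a driver callback that fills only part of the info structure, called without the generic wrapper that zeroes it) can be sketched as follows. All types and names here are hypothetical stand-ins, not the real ethdev definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the ethdev structures. */
struct conf_sketch {
	unsigned int free_thresh;
	unsigned int drop_en;
};

struct dev_info_sketch {
	unsigned int max_rx_queues;
	struct conf_sketch default_rxconf;
	struct conf_sketch default_txconf;
};

typedef void (*infos_get_sketch_t)(struct dev_info_sketch *info);

/* A driver callback that fills only part of the structure. */
static void
partial_infos_get(struct dev_info_sketch *info)
{
	info->max_rx_queues = 1;
}

/*
 * Generic wrapper: zero the whole structure first, then let the driver
 * fill in what it knows. Fields the driver skips are then well defined.
 */
static void
dev_info_get_sketch(infos_get_sketch_t cb, struct dev_info_sketch *info)
{
	assert(cb != NULL && info != NULL);
	memset(info, 0, sizeof(*info));
	cb(info);
}
```

Calling partial_infos_get() directly on an uninitialised stack variable would leave default_rxconf and default_txconf holding whatever was on the stack, which matches the random data reported in the thread.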

[dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd

2014-10-21 Thread De Lara Guarch, Pablo

Hi Thomas,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, October 21, 2014 9:04 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd
> 
> 2014-10-21 14:59, Ouyang Changchun:
> > Fix the compilation issue in vector PMD when macro RTE_MBUF_REFCNT is
> disabled.
> >
> > Signed-off-by: Changchun Ouyang 
> 
> Acked-by: Thomas Monjalon 
> 
> Applied

I was checking this patch just now, and I came across a second compilation
issue: rte_mbuf_refcnt_update and rte_pktmbuf_attach are not declared when
RTE_MBUF_REFCNT is disabled, but the Bond PMD and IP fragmentation libraries
use those functions.

I guess that it is too late to NACK this :P, but we need a second patch
to fix this issue completely.

> 
> Thanks
> --
> Thomas

[dpdk-dev] development/integration branch?

2014-10-21 Thread Marc Sune

Good morning,

Some DPDK users, including myself, use a clone of the git repository to 
compile DPDK for their applications, instead of downloading the tarball 
of each release.

In my opinion, it would be useful for such users if the master branch
contained only stable releases. That would prevent them from mistakenly
using a WIP DPDK version, and let them jump quickly to the latest stable
release with a simple git pull, without having to check the tags. Also, new
users would clone the repo and get only the stable release.

So I would propose using an integration/development branch, where the
patches are integrated, and pushing to master only once a stable release is
tagged in this integration branch.

Thoughts?

best
marc

[dpdk-dev] [PATCH v4] KNI: use a memzone pool for KNI alloc/release

2014-10-21 Thread Thomas Monjalon

Hi Marc,

2014-10-18 00:51, Marc Sune:
> This patch implements the KNI memzone pool in order to prevent
> memzone exhaustion when allocating/deallocating KNI interfaces.
> 
> It adds a new API call, rte_kni_init(max_kni_ifaces) that shall
> be called before any call to rte_kni_alloc() if KNI is used.
> 
> v2: Moved KNI fd opening to rte_kni_init(). Revised style.
> v3: Adapted kni examples/tests to rte_kni_init().
> v4: Improved example integration. Fixed kni_memzone_pool_alloc/release() bug.
> 
> Signed-off-by: Marc Sune 

Thanks for the good work with Helin.
Before applying this patch, I'd like another version explaining in the
commit log why this change is needed.
And please use checkpatch.pl to check and remove whitespace errors.

Thanks
-- 
Thomas

[dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd

2014-10-21 Thread Ouyang, Changchun

Hi Pablo

> -Original Message-
> From: De Lara Guarch, Pablo
> Sent: Tuesday, October 21, 2014 4:19 PM
> To: Thomas Monjalon; Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd
> 
> Hi Thomas,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > Sent: Tuesday, October 21, 2014 9:04 AM
> > To: Ouyang, Changchun
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd
> >
> > 2014-10-21 14:59, Ouyang Changchun:
> > > Fix the compilation issue in vector PMD when macro RTE_MBUF_REFCNT
> > > is
> > disabled.
> > >
> > > Signed-off-by: Changchun Ouyang 
> >
> > Acked-by: Thomas Monjalon 
> >
> > Applied
> 
> I was checking this patch right now, and I come across a second compilation
> issue,  because rte_mbuf_refcnt_update and rte_pktmbuf_attach are not
> declared, and Bond PMD and IP fragmentation libraries use those functions.
> 
> I guess that it is late to NACK this :P, but we require a second patch to fix
> completely this issue.
As it fixes the compilation issue in vpmd, there is no reason to NACK it :-)
In my config, both BOND and IP fragmentation are disabled, so I didn't come
across your issues.
Yes, I agree with you: we need another patch to fix the compilation issue in
both of the other places.

[dpdk-dev] VMDQ DCB

2014-10-21 Thread Sunil Bojanapally

Hi,

Would like to know whether VMDQ_DCB is well tested and supported in 
release v1.3.1 ?

Thanks,
Sunil

[dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd

2014-10-21 Thread Thomas Monjalon

2014-10-21 08:28, Ouyang, Changchun:
> From: De Lara Guarch, Pablo
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > > 2014-10-21 14:59, Ouyang Changchun:
> > > > Fix the compilation issue in vector PMD when macro RTE_MBUF_REFCNT
> > > > is disabled.
> > > >
> > > > Signed-off-by: Changchun Ouyang 
> > >
> > > Acked-by: Thomas Monjalon 
> > >
> > > Applied
> > 
> > I was checking this patch right now, and I come across a second compilation
> > issue,  because rte_mbuf_refcnt_update and rte_pktmbuf_attach are not
> > declared, and Bond PMD and IP fragmentation libraries use those functions.
> > 
> > I guess that it is late to NACK this :P, but we require a second patch to
> > fix completely this issue.
> 
> As it fixes the compilation issue in vpmd, so no reason to NACK it,  :-)

Exactly

> In my config, both BOND and IP fragment is disabled. So I don't come across 
> your issues.
> Yes, agree with you, we need another patch to fix compilation issue in other 
> both places.

Yes, I'm aware of these limitations.
Please, first explain why mbuf refcnt is needed for these features.
Then we have 2 options: remove the dependency or add more ifdefs.

Thanks
-- 
Thomas

[dpdk-dev] development/integration branch?

2014-10-21 Thread Richardson, Bruce

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Marc Sune
> Sent: Tuesday, October 21, 2014 9:23 AM
> To: 
> Subject: [dpdk-dev] development/integration branch?
> 
> Good morning,
> 
> Some DPDK users, including myself, use a clone of the git repository to
> compile DPDK for their applications, instead of downloading the tarball
> of each release.
> 
> In my opinion, it would be useful for such users that the master branch
> contains only stable releases, to prevent (mistakenly) to use a wip DPDK
> version, and jump quickly to the latest stable with a simple git pull
> without having to check the tags. Also new users would clone the repo
> and get only the stable release.
> 
> So I would propose to use an integration/development branch, where the
> patches are integrated and only push to master once a stable release is
> tagged in this integration branch.
> 
> Thoughts?
> 
> best
> marc

Ideally, our master branch should always be good and stable, but given reality 
often interferes with such good intentions I think that having dev branches is 
not a bad idea. However, what we may lose by doing so is having a larger group 
of people constantly using the master branch and reporting problems to us. 

On balance, I'd be slightly in favour of this suggestion.

/Bruce

[dpdk-dev] [PATCH] Fix warnings if DPDK is used with C++1x: Format macro constants for fixed width integer types need a space after the preceding string literal.

2014-10-21 Thread Matthias Bartelt

From: Matthias Bartelt 

---
 lib/librte_eal/common/include/rte_pci.h |4 ++--
 lib/librte_mempool/rte_mempool.h|8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..549005f 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -93,10 +93,10 @@ extern struct pci_device_list pci_device_list; /**< Global 
list of PCI devices.
 #define SYSFS_PCI_DEVICES "/sys/bus/pci/devices"

 /** Formatting string for PCI device identifier: Ex: :00:01.0 */
-#define PCI_PRI_FMT "%.4"PRIx16":%.2"PRIx8":%.2"PRIx8".%"PRIx8
+#define PCI_PRI_FMT "%.4"PRIx16":%.2"PRIx8":%.2"PRIx8".%" PRIx8

 /** Short formatting string, without domain, for PCI device: Ex: 00:01.0 */
-#define PCI_SHORT_PRI_FMT "%.2"PRIx8":%.2"PRIx8".%"PRIx8
+#define PCI_SHORT_PRI_FMT "%.2"PRIx8":%.2"PRIx8".%" PRIx8

 /** Nb. of values in PCI device identifier format string. */
 #define PCI_FMT_NVAL 4
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 7b641b0..bcdb86a 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -342,7 +342,7 @@ static inline void __mempool_check_cookies(const struct 
rte_mempool *mp,
if (cookie != RTE_MEMPOOL_HEADER_COOKIE1) {
rte_log_set_history(0);
RTE_LOG(CRIT, MEMPOOL,
-   "obj=%p, mempool=%p, 
cookie=%"PRIx64"\n",
+   "obj=%p, mempool=%p, cookie=%" 
PRIx64"\n",
obj, mp, cookie);
rte_panic("MEMPOOL: bad header cookie (put)\n");
}
@@ -352,7 +352,7 @@ static inline void __mempool_check_cookies(const struct 
rte_mempool *mp,
if (cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
rte_log_set_history(0);
RTE_LOG(CRIT, MEMPOOL,
-   "obj=%p, mempool=%p, 
cookie=%"PRIx64"\n",
+   "obj=%p, mempool=%p, cookie=%" 
PRIx64"\n",
obj, mp, cookie);
rte_panic("MEMPOOL: bad header cookie (get)\n");
}
@@ -363,7 +363,7 @@ static inline void __mempool_check_cookies(const struct 
rte_mempool *mp,
cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
rte_log_set_history(0);
RTE_LOG(CRIT, MEMPOOL,
-   "obj=%p, mempool=%p, 
cookie=%"PRIx64"\n",
+   "obj=%p, mempool=%p, cookie=%" 
PRIx64"\n",
obj, mp, cookie);
rte_panic("MEMPOOL: bad header cookie 
(audit)\n");
}
@@ -372,7 +372,7 @@ static inline void __mempool_check_cookies(const struct 
rte_mempool *mp,
if (cookie != RTE_MEMPOOL_TRAILER_COOKIE) {
rte_log_set_history(0);
RTE_LOG(CRIT, MEMPOOL,
-   "obj=%p, mempool=%p, cookie=%"PRIx64"\n",
+   "obj=%p, mempool=%p, cookie=%" PRIx64"\n",
obj, mp, cookie);
rte_panic("MEMPOOL: bad trailer cookie\n");
}
-- 
1.7.9.5
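
For context on the warning this patch addresses: C++11 added user-defined literals, so a C++11 compiler reads an identifier glued to a closing quote, as in "..."PRIx64, as a literal suffix instead of expanding the macro. A space between the string literal and the macro avoids this and is harmless in C. A small sketch of the spaced form (the helper name `format_cookie` is hypothetical):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a cookie value the way the patched log lines do. Each PRIx64 is
 * separated from the neighbouring string literals by a space, so the
 * preprocessor sees the macro as its own token in both C and C++11.
 */
static int
format_cookie(char *buf, size_t len, uint64_t cookie)
{
	assert(buf != NULL);
	return snprintf(buf, len, "cookie=%" PRIx64 "\n", cookie);
}
```

Adjacent string literals are still concatenated by the compiler, so the resulting format string is unchanged.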

[dpdk-dev] [PATCH v6 0/9] Support VxLAN on Fortville

2014-10-21 Thread Jijiang Liu

The patch set supports VxLAN on Fortville based on latest rte_mbuf structure.

It includes:
 - Support VxLAN packet identification by configuring UDP tunneling port.
 - Support VxLAN packet filters. It uses MAC and VLAN to point
   to a queue. The filter types supported are listed below:
   1. Inner MAC and Inner VLAN ID
   2. Inner MAC address, inner VLAN ID and tenant ID.
   3. Inner MAC and tenant ID
   4. Inner MAC address
   5. Outer MAC address, tenant ID and inner MAC
 - Support VxLAN TX checksum offload, which includes outer L3 (IP), inner L3 (IP)
   and inner L4 (UDP, TCP and SCTP).

Change notes:

 v6)  * Split the rte_mbuf structure changes into a separate patch.
      * Remove the initialization configuration of VXLAN UDP port.
      * Change the filter_type field in rte_eth_tunnel_filter_conf to
        "uint16_t" type.
      * Add more descriptions to some API comments and commit logs.


Jijiang Liu (9):
  rte_mbuf structure changes
  add VxLAN packet identification API in librte_ether
  support VxLAN packet identification in librte_pmd_i40e
  test VxLAN packet identification in testpmd.
  add data structures of tunneling filter in rte_eth_ctrl.h
  implement the API of VxLAN packet filter in librte_pmd_i40e
  test VxLAN packet filter
  support VxLAN Tx checksum offload
  test VxLAN Tx checksum offload

 app/test-pmd/cmdline.c|  230 -
 app/test-pmd/config.c |6 +-
 app/test-pmd/csumonly.c   |  195 +++--
 app/test-pmd/parameters.c |   13 ++
 app/test-pmd/rxonly.c |   49 ++
 app/test-pmd/testpmd.c|8 +
 app/test-pmd/testpmd.h|4 +
 lib/librte_ether/rte_eth_ctrl.h   |   64 +++
 lib/librte_ether/rte_ethdev.c |   63 +++
 lib/librte_ether/rte_ethdev.h |   63 +++
 lib/librte_ether/rte_ether.h  |8 +
 lib/librte_mbuf/rte_mbuf.h|   26 +++-
 lib/librte_pmd_i40e/i40e_ethdev.c |  341 -
 lib/librte_pmd_i40e/i40e_ethdev.h |8 +-
 lib/librte_pmd_i40e/i40e_rxtx.c   |   55 ++-
 15 files changed, 1101 insertions(+), 32 deletions(-)

-- 
1.7.7.6
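
Several of the filter types listed in this cover letter match on the tenant ID, which is the 24-bit VNI carried in the VXLAN header. As a software-side illustration only (the patch set does this in hardware), the sketch below extracts the VNI from the 8-byte header defined in RFC 7348; the function and macro names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VXLAN_HDR_LEN        8    /* flags(1) + reserved(3) + VNI(3) + reserved(1) */
#define VXLAN_FLAG_VNI_VALID 0x08 /* the "I" flag in the first byte */

/*
 * Parse the VXLAN header that follows the outer UDP header.
 * Returns 0 and stores the 24-bit VNI (tenant ID) in *vni, or -1 if the
 * buffer is too short or the VNI-valid flag is not set.
 */
static int
vxlan_parse_vni(const uint8_t *hdr, size_t len, uint32_t *vni)
{
	assert(hdr != NULL && vni != NULL);
	if (len < VXLAN_HDR_LEN || !(hdr[0] & VXLAN_FLAG_VNI_VALID))
		return -1;
	*vni = ((uint32_t)hdr[4] << 16) | ((uint32_t)hdr[5] << 8) | hdr[6];
	return 0;
}
```

A receiver would normally only attempt this parse on packets whose outer UDP destination port is one of the configured VxLAN ports.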

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-21 Thread Jijiang Liu

Some destination UDP port numbers have a unique meaning.
For VxLAN, "IANA has assigned the value 4789 for the VXLAN UDP port, and this
value SHOULD be used by default as the destination UDP port. Some early
implementations of VXLAN have used other values for the destination port. To
enable interoperability with these implementations, the destination port
SHOULD be configurable."

Add two APIs in librte_ether to support UDP tunneling port configuration on
i40e. Currently, only VxLAN is implemented in this patch set.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 
---
 lib/librte_ether/rte_ethdev.c |   63 ++
 lib/librte_ether/rte_ethdev.h |   75 +
 lib/librte_ether/rte_ether.h  |8 
 3 files changed, 146 insertions(+), 0 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 50f10d9..9e111b6 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2038,6 +2038,69 @@ rte_eth_dev_rss_hash_conf_get(uint8_t port_id,
 }

 int
+rte_eth_dev_udp_tunnel_add(uint8_t port_id,
+  struct rte_eth_udp_tunnel *udp_tunnel,
+  uint8_t count)
+{
+   uint8_t i;
+   struct rte_eth_dev *dev;
+   struct rte_eth_udp_tunnel *tunnel;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+
+   if (udp_tunnel == NULL) {
+   PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
+   return -EINVAL;
+   }
+   tunnel = udp_tunnel;
+
+   for (i = 0; i < count; i++, tunnel++) {
+   if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
+   PMD_DEBUG_TRACE("Invalid tunnel type\n");
+   return -EINVAL;
+   }
+   }
+
+   dev = &rte_eth_devices[port_id];
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_add, -ENOTSUP);
+   return (*dev->dev_ops->udp_tunnel_add)(dev, udp_tunnel, count);
+}
+
+int
+rte_eth_dev_udp_tunnel_delete(uint8_t port_id,
+ struct rte_eth_udp_tunnel *udp_tunnel,
+ uint8_t count)
+{
+   uint8_t i;
+   struct rte_eth_dev *dev;
+   struct rte_eth_udp_tunnel *tunnel;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+   dev = &rte_eth_devices[port_id];
+
+   if (udp_tunnel == NULL) {
+   PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
+   return -EINVAL;
+   }
+   tunnel = udp_tunnel;
+   for (i = 0; i < count; i++, tunnel++) {
+   if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
+   PMD_DEBUG_TRACE("Invalid tunnel type\n");
+   return -EINVAL;
+   }
+   }
+
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_del, -ENOTSUP);
+   return (*dev->dev_ops->udp_tunnel_del)(dev, udp_tunnel, count);
+}
+
+int
 rte_eth_led_on(uint8_t port_id)
 {
struct rte_eth_dev *dev;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index b69a6af..9ad11ec 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -710,6 +710,26 @@ struct rte_fdir_conf {
 };

 /**
+ * UDP tunneling configuration.
+ */
+struct rte_eth_udp_tunnel {
+   uint16_t udp_port;
+   uint8_t prot_type;
+};
+
+/**
+ * Tunneled type.
+ */
+enum rte_eth_tunnel_type {
+   RTE_TUNNEL_TYPE_NONE = 0,
+   RTE_TUNNEL_TYPE_VXLAN,
+   RTE_TUNNEL_TYPE_GENEVE,
+   RTE_TUNNEL_TYPE_TEREDO,
+   RTE_TUNNEL_TYPE_NVGRE,
+   RTE_TUNNEL_TYPE_MAX,
+};
+
+/**
  *  Possible l4type of FDIR filters.
  */
 enum rte_l4type {
@@ -831,6 +851,7 @@ struct rte_intr_conf {
  * configuration settings may be needed.
  */
 struct rte_eth_conf {
+   enum rte_eth_tunnel_type tunnel_type;
uint16_t link_speed;
/**< ETH_LINK_SPEED_10[0|00|000], or 0 for autonegotation */
uint16_t link_duplex;
@@ -1266,6 +1287,17 @@ typedef int (*eth_mirror_rule_reset_t)(struct 
rte_eth_dev *dev,
  uint8_t rule_id);
 /**< @internal Remove a traffic mirroring rule on an Ethernet device */

+typedef int (*eth_udp_tunnel_add_t)(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *tunnel_udp,
+   uint8_t count);
+/**< @internal Add tunneling UDP info */
+
+typedef int (*eth_udp_tunnel_del_t)(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *tunnel_udp,
+   uint8_t count);
+/**< @internal Delete tunneling UDP info */
+
+
 #ifdef RTE_NIC_BYPASS

 enum {
@@ -1446,6 +1478,8 @@ struct eth_dev_ops {
eth_set_v

[dpdk-dev] [PATCH v6 1/9] librte_mbuf:the rte_mbuf structure changes

2014-10-21 Thread Jijiang Liu

Remove the "reserved2" field and add the "packet_type" and the 
"inner_l2_l3_len" fields in the rte_mbuf structure.

The packet type field is used to indicate the ordinary L2 packet format and
also tunneling packet formats such as IP in IP, IP in GRE, MAC in GRE and
MAC in UDP.

The inner L2 length and the inner L3 length are used for TX offloading of
tunneling packets.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
---
 lib/librte_mbuf/rte_mbuf.h |   25 -
 1 files changed, 24 insertions(+), 1 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..98951a6 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -163,7 +163,14 @@ struct rte_mbuf {

/* remaining bytes are set on RX when pulling packet from descriptor */
MARKER rx_descriptor_fields1;
-   uint16_t reserved2;   /**< Unused field. Required for padding */
+
+   /**
+* Packet type, which is used to indicate ordinary L2 packet format and
+* also tunneled packet format such as IP in IP, IP in GRE, MAC in GRE
+* and MAC in UDP.
+*/
+   uint16_t packet_type;
+
uint16_t data_len;/**< Amount of data in segment buffer. */
uint32_t pkt_len; /**< Total pkt len: sum of all segments. */
uint16_t vlan_tci;/**< VLAN Tag Control Identifier (CPU order) 
*/
@@ -196,6 +203,18 @@ struct rte_mbuf {
uint16_t l2_len:7;  /**< L2 (MAC) Header Length. */
};
};
+
+   /* fields for TX offloading of tunnels */
+   union {
+   uint16_t inner_l2_l3_len;
+   /**< combined inner l2/l3 lengths as single var */
+   struct {
+   uint16_t inner_l3_len:9;
+   /**< inner L3 (IP) Header Length. */
+   uint16_t inner_l2_len:7;
+   /**< inner L2 (MAC) Header Length. */
+   };
+   };
 } __rte_cache_aligned;

 /**
@@ -546,11 +565,13 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
m->next = NULL;
m->pkt_len = 0;
m->l2_l3_len = 0;
+   m->inner_l2_l3_len = 0;
m->vlan_tci = 0;
m->nb_segs = 1;
m->port = 0xff;

m->ol_flags = 0;
+   m->packet_type = 0;
m->data_off = (RTE_PKTMBUF_HEADROOM <= m->buf_len) ?
RTE_PKTMBUF_HEADROOM : m->buf_len;

@@ -614,12 +635,14 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf 
*mi, struct rte_mbuf *md)
mi->port = md->port;
mi->vlan_tci = md->vlan_tci;
mi->l2_l3_len = md->l2_l3_len;
+   mi->inner_l2_l3_len = md->inner_l2_l3_len;
mi->hash = md->hash;

mi->next = NULL;
mi->pkt_len = mi->data_len;
mi->nb_segs = 1;
mi->ol_flags = md->ol_flags;
+   mi->packet_type = md->packet_type;

__rte_mbuf_sanity_check(mi, 1);
__rte_mbuf_sanity_check(md, 0);
-- 
1.7.7.6
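
The new inner_l2_l3_len member overlays a single 16-bit value on two bit-fields, which is why rte_pktmbuf_reset() and rte_pktmbuf_attach() in this patch can clear or copy both lengths with one assignment. A reduced sketch of the same layout, using a hypothetical struct name and assuming a compiler that accepts anonymous structs/unions and uint16_t bit-fields (C11, or GCC/Clang extensions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced copy of the tunnel-length fields added to rte_mbuf. */
struct tunnel_len_sketch {
	union {
		uint16_t inner_l2_l3_len;        /* both lengths as one value */
		struct {
			uint16_t inner_l3_len:9; /* inner L3 header length, 0..511 */
			uint16_t inner_l2_len:7; /* inner L2 header length, 0..127 */
		};
	};
};

/* Clear both lengths with a single store through the combined member. */
static void
tunnel_len_reset(struct tunnel_len_sketch *t)
{
	assert(t != NULL);
	t->inner_l2_l3_len = 0;
}
```

The exact bit positions of the two fields inside the 16-bit word are implementation-defined, so code should only rely on the combined member for whole-value operations such as reset and copy.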

[dpdk-dev] [PATCH v6 3/9] i40e:support VxLAN packet identification in librte_pmd_i40e

2014-10-21 Thread Jijiang Liu

Implement configuration of VxLAN destination UDP port number in librte_pmd_i40e.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |  164 +
 lib/librte_pmd_i40e/i40e_ethdev.h |8 ++-
 lib/librte_pmd_i40e/i40e_rxtx.c   |9 ++
 3 files changed, 180 insertions(+), 1 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 3b75f0f..8a84e30 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -186,6 +186,12 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev 
*dev,
struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
+static int i40e_dev_udp_tunnel_add(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *udp_tunnel,
+   uint8_t count);
+static int i40e_dev_udp_tunnel_del(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *udp_tunnel,
+   uint8_t count);
 static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
enum rte_filter_type filter_type,
enum rte_filter_op filter_op,
@@ -241,6 +247,8 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
.reta_query   = i40e_dev_rss_reta_query,
.rss_hash_update  = i40e_dev_rss_hash_update,
.rss_hash_conf_get= i40e_dev_rss_hash_conf_get,
+   .udp_tunnel_add   = i40e_dev_udp_tunnel_add,
+   .udp_tunnel_del   = i40e_dev_udp_tunnel_del,
.filter_ctrl  = i40e_dev_filter_ctrl,
 };

@@ -3178,6 +3186,10 @@ i40e_vsi_rx_init(struct i40e_vsi *vsi)
uint16_t i;

i40e_pf_config_mq_rx(pf);
+
+   if (data->dev_conf.tunnel_type == RTE_TUNNEL_TYPE_VXLAN)
+   pf->flags |= I40E_FLAG_VXLAN;
+
for (i = 0; i < data->nb_rx_queues; i++) {
ret = i40e_rx_queue_init(data->rx_queues[i]);
if (ret != I40E_SUCCESS) {
@@ -4092,6 +4104,158 @@ i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
return 0;
 }

+static int
+i40e_get_vxlan_port_idx(struct i40e_pf *pf, uint16_t port)
+{
+   uint8_t i;
+
+   for (i = 0; i < I40E_MAX_PF_UDP_OFFLOAD_PORTS; i++) {
+   if (pf->vxlan_ports[i] == port)
+   return i;
+   }
+
+   return -1;
+}
+
+static int
+i40e_add_vxlan_port(struct i40e_pf *pf, uint16_t port)
+{
+   int  idx, ret;
+   uint8_t filter_idx;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+
+   if (!(pf->flags & I40E_FLAG_VXLAN)) {
+   PMD_DRV_LOG(ERR, "VxLAN tunneling mode is not configured\n");
+   return -EINVAL;
+   }
+
+   idx = i40e_get_vxlan_port_idx(pf, port);
+
+   /* Check if port already exists */
+   if (idx >= 0) {
+   PMD_DRV_LOG(ERR, "Port %d already offloaded\n", port);
+   return -EINVAL;
+   }
+
+   /* Now check if there is space to add the new port */
+   idx = i40e_get_vxlan_port_idx(pf, 0);
+   if (idx < 0) {
+   PMD_DRV_LOG(ERR, "Maximum number of UDP ports reached, "
+   "not adding port %d\n", port);
+   return -ENOSPC;
+   }
+
+   ret =  i40e_aq_add_udp_tunnel(hw, port, I40E_AQC_TUNNEL_TYPE_VXLAN,
+   &filter_idx, NULL);
+   if (ret < 0) {
+   PMD_DRV_LOG(ERR, "Failed to add VxLAN UDP port %d\n", port);
+   return -1;
+   }
+
+   PMD_DRV_LOG(INFO, "Added port %d with AQ command with index %d\n",
+   port, filter_idx);
+
+   /* New port: add it and mark its index in the bitmap */
+   pf->vxlan_ports[idx] = port;
+   pf->vxlan_bitmap |= (1 << idx);
+
+   return 0;
+}
+
+static int
+i40e_del_vxlan_port(struct i40e_pf *pf, uint16_t port)
+{
+   int idx;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+
+   if (!(pf->flags & I40E_FLAG_VXLAN)) {
+   PMD_DRV_LOG(ERR, "VxLAN tunneling mode is not configured\n");
+   return -EINVAL;
+   }
+
+   idx = i40e_get_vxlan_port_idx(pf, port);
+
+   if (idx < 0) {
+   PMD_DRV_LOG(ERR, "Port %d doesn't exist\n", port);
+   return -EINVAL;
+   }
+
+   if (i40e_aq_del_udp_tunnel(hw, idx, NULL) < 0) {
+   PMD_DRV_LOG(ERR, "Failed to delete VxLAN UDP port %d\n", port);
+   return -1;
+   }
+
+   PMD_DRV_LOG(INFO, "Deleted port %d with AQ command with index %d\n",
+   port, idx);
+
+   pf->vxlan_ports[idx] = 0;
+   pf->vxlan_bitmap &= ~(1 << idx);
+
+   return 0;
+}
+
+/* Add UDP
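
The add/delete helpers in this patch keep a small table of offloaded UDP ports in which the value 0 marks a free slot, plus a bitmap of used slots. The bookkeeping can be sketched on its own, without the admin-queue calls. Names and the table size are hypothetical, and the explicit rejection of port 0 is an addition made here because 0 doubles as the free-slot marker:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_UDP_PORTS_SKETCH 16 /* hypothetical table size */

struct vxlan_table_sketch {
	uint16_t ports[MAX_UDP_PORTS_SKETCH]; /* 0 marks a free slot */
	uint32_t bitmap;                      /* bit i set when slot i is used */
};

/* Return the slot holding 'port', or -1 if it is not in the table. */
static int
find_port_idx(const struct vxlan_table_sketch *t, uint16_t port)
{
	int i;

	for (i = 0; i < MAX_UDP_PORTS_SKETCH; i++)
		if (t->ports[i] == port)
			return i;
	return -1;
}

static int
table_add_port(struct vxlan_table_sketch *t, uint16_t port)
{
	int idx;

	assert(t != NULL);
	if (port == 0 || find_port_idx(t, port) >= 0)
		return -EINVAL;          /* invalid or already present */
	idx = find_port_idx(t, 0);       /* look for a free slot */
	if (idx < 0)
		return -ENOSPC;
	t->ports[idx] = port;
	t->bitmap |= UINT32_C(1) << idx;
	return 0;
}

static int
table_del_port(struct vxlan_table_sketch *t, uint16_t port)
{
	int idx;

	assert(t != NULL);
	if (port == 0)
		return -EINVAL;
	idx = find_port_idx(t, port);
	if (idx < 0)
		return -EINVAL;          /* port was never added */
	t->ports[idx] = 0;
	t->bitmap &= ~(UINT32_C(1) << idx);
	return 0;
}
```

In the driver the table update would only happen after the corresponding hardware command succeeds, as the patch does by returning early on an AQ failure.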

[dpdk-dev] [PATCH v6 4/9] app/test-pmd:test VxLAN packet identification

2014-10-21 Thread Jijiang Liu

Add two commands to test VxLAN packet identification, which include
 - use commands to add/delete VxLAN UDP port.
 - use rxonly mode to receive VxLAN packet.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 
---
 app/test-pmd/cmdline.c|   65 +
 app/test-pmd/parameters.c |   13 +
 app/test-pmd/rxonly.c |   49 ++
 app/test-pmd/testpmd.c|8 +
 app/test-pmd/testpmd.h|4 +++
 5 files changed, 139 insertions(+), 0 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0b972f9..7160e38 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -285,6 +285,12 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the outer VLAN TPID for Packet Filtering on"
" a port\n\n"

+   "rx_vxlan_port add (udp_port) (port_id)\n"
+   "Add an UDP port for VxLAN packet filter on a 
port\n\n"
+
+   "rx_vxlan_port rm (udp_port) (port_id)\n"
+   "Remove an UDP port for VxLAN packet filter on a 
port\n\n"
+
"tx_vlan set vlan_id (port_id)\n"
"Set hardware insertion of VLAN ID in packets sent"
" on a port.\n\n"
@@ -6225,6 +6231,64 @@ cmdline_parse_inst_t cmd_vf_rate_limit = {
},
 };

+/* *** CONFIGURE TUNNEL UDP PORT *** */
+struct cmd_tunnel_udp_config {
+   cmdline_fixed_string_t cmd;
+   cmdline_fixed_string_t what;
+   uint16_t udp_port;
+   uint8_t port_id;
+};
+
+static void
+cmd_tunnel_udp_config_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_tunnel_udp_config *res = parsed_result;
+   struct rte_eth_udp_tunnel tunnel_udp;
+   int ret;
+
+   tunnel_udp.udp_port = res->udp_port;
+
+   if (!strcmp(res->cmd, "rx_vxlan_port"))
+   tunnel_udp.prot_type = RTE_TUNNEL_TYPE_VXLAN;
+
+   if (!strcmp(res->what, "add"))
+   ret = rte_eth_dev_udp_tunnel_add(res->port_id, &tunnel_udp, 1);
+   else
+   ret = rte_eth_dev_udp_tunnel_delete(res->port_id, &tunnel_udp, 
1);
+
+   if (ret < 0)
+   printf("udp tunneling add error: (%s)\n", strerror(-ret));
+}
+
+cmdline_parse_token_string_t cmd_tunnel_udp_config_cmd =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_udp_config,
+   cmd, "rx_vxlan_port");
+cmdline_parse_token_string_t cmd_tunnel_udp_config_what =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_udp_config,
+   what, "add#rm");
+cmdline_parse_token_num_t cmd_tunnel_udp_config_udp_port =
+   TOKEN_NUM_INITIALIZER(struct cmd_tunnel_udp_config,
+   udp_port, UINT16);
+cmdline_parse_token_num_t cmd_tunnel_udp_config_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_tunnel_udp_config,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_tunnel_udp_config = {
+   .f = cmd_tunnel_udp_config_parsed,
+   .data = (void *)0,
+   .help_str = "add/rm an tunneling UDP port filter: "
+   "rx_vxlan_port add udp_port port_id",
+   .tokens = {
+   (void *)&cmd_tunnel_udp_config_cmd,
+   (void *)&cmd_tunnel_udp_config_what,
+   (void *)&cmd_tunnel_udp_config_udp_port,
+   (void *)&cmd_tunnel_udp_config_port_id,
+   NULL,
+   },
+};
+
 /* *** CONFIGURE VM MIRROR VLAN/POOL RULE *** */
 struct cmd_set_mirror_mask_result {
cmdline_fixed_string_t set;
@@ -7518,6 +7582,7 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_vf_rxvlan_filter,
(cmdline_parse_inst_t *)&cmd_queue_rate_limit,
(cmdline_parse_inst_t *)&cmd_vf_rate_limit,
+   (cmdline_parse_inst_t *)&cmd_tunnel_udp_config,
(cmdline_parse_inst_t *)&cmd_set_mirror_mask,
(cmdline_parse_inst_t *)&cmd_set_mirror_link,
(cmdline_parse_inst_t *)&cmd_reset_mirror_rule,
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 9573a43..fda8c1d 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -202,6 +202,10 @@ usage(char* progname)
printf("  --txpkts=X[,Y]*: set TX segment sizes.\n");
printf("  --disable-link-check: disable check on link status when "
   "starting/stopping ports.\n");
+   printf("  --tunnel-type=N: set tunneling packet type "
+  "(0 <= N <= 4).(0:non-tunneling packet;1:VxLAN; "
+  "2:GENEVE;3: TEREDO;4: NVGRE)\n");
+
 }

 #ifdef RTE_LIBRTE_CMDLINE
@@ -600,6 +604,7 @@ launch_args_parse(int argc, char** argv)
{ "no-flush-rx",0, 0, 0 },
{ "tx

[dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of VxLAN filter

2014-10-21 Thread Jijiang Liu

Add definitions of the data structures for the tunneling packet filter in the
rte_eth_ctrl.h file.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 
---
 lib/librte_ether/rte_eth_ctrl.h |   64 +++
 lib/librte_ether/rte_ethdev.h   |   12 ---
 2 files changed, 64 insertions(+), 12 deletions(-)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index df21ac6..a04333c 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -51,6 +51,7 @@ extern "C" {
  */
 enum rte_filter_type {
RTE_ETH_FILTER_NONE = 0,
+   RTE_ETH_FILTER_TUNNEL,
RTE_ETH_FILTER_MAX
 };

@@ -71,6 +72,69 @@ enum rte_filter_op {
RTE_ETH_FILTER_OP_MAX
 };

+/**
+ * filter type of tunneling packet
+ */
+#define ETH_TUNNEL_FILTER_OMAC  0x01 /**< filter by outer MAC addr */
+#define ETH_TUNNEL_FILTER_OIP   0x02 /**< filter by outer IP Addr */
+#define ETH_TUNNEL_FILTER_TENID 0x04 /**< filter by tenant ID */
+#define ETH_TUNNEL_FILTER_IMAC  0x08 /**< filter by inner MAC addr */
+#define ETH_TUNNEL_FILTER_IVLAN 0x10 /**< filter by inner VLAN ID */
+#define ETH_TUNNEL_FILTER_IIP   0x20 /**< filter by inner IP addr */
+
+#define RTE_TUNNEL_FILTER_TO_QUEUE 1 /**< point to a queue by filter type */
+
+#define RTE_TUNNEL_FILTER_IMAC_IVLAN (ETH_TUNNEL_FILTER_IMAC | \
+   ETH_TUNNEL_FILTER_IVLAN)
+#define RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID (ETH_TUNNEL_FILTER_IMAC | \
+   ETH_TUNNEL_FILTER_IVLAN | \
+   ETH_TUNNEL_FILTER_TENID)
+#define RTE_TUNNEL_FILTER_IMAC_TENID (ETH_TUNNEL_FILTER_IMAC | \
+   ETH_TUNNEL_FILTER_TENID)
+#define RTE_TUNNEL_FILTER_OMAC_TENID_IMAC (ETH_TUNNEL_FILTER_OMAC | \
+   ETH_TUNNEL_FILTER_TENID | \
+   ETH_TUNNEL_FILTER_IMAC)
+
+/**
+ *  Select IPv4 or IPv6 for tunnel filters.
+ */
+enum rte_tunnel_iptype {
+   RTE_TUNNEL_IPTYPE_IPV4 = 0, /**< IPv4. */
+   RTE_TUNNEL_IPTYPE_IPV6, /**< IPv6. */
+};
+
+/**
+ * Tunneled type.
+ */
+enum rte_eth_tunnel_type {
+   RTE_TUNNEL_TYPE_NONE = 0,
+   RTE_TUNNEL_TYPE_VXLAN,
+   RTE_TUNNEL_TYPE_GENEVE,
+   RTE_TUNNEL_TYPE_TEREDO,
+   RTE_TUNNEL_TYPE_NVGRE,
+   RTE_TUNNEL_TYPE_MAX,
+};
+
+/**
+ * Tunneling Packet filter configuration.
+ */
+struct rte_eth_tunnel_filter_conf {
+   struct ether_addr *outer_mac;  /**< Outer MAC address filter. */
+   struct ether_addr *inner_mac;  /**< Inner MAC address filter. */
+   uint16_t inner_vlan;   /**< Inner VLAN filter. */
+   enum rte_tunnel_iptype ip_type; /**< IP address type. */
+   union {
+   uint32_t ipv4_addr;/**< IPv4 source address to match. */
+   uint32_t ipv6_addr[4]; /**< IPv6 source address to match. */
+   } ip_addr; /**< IPv4/IPv6 source address to match (union of above). */
+
+   uint16_t filter_type;   /**< Filter type. */
+   uint8_t to_queue;   /**< Use MAC and VLAN to point to a queue. */
+   enum rte_eth_tunnel_type tunnel_type; /**< Tunnel Type. */
+   uint32_t tenant_id; /**< Tenant number. */
+   uint16_t queue_id;  /**< Queue number. */
+};
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 9ad11ec..c8fb89a 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -718,18 +718,6 @@ struct rte_eth_udp_tunnel {
 };

 /**
- * Tunneled type.
- */
-enum rte_eth_tunnel_type {
-   RTE_TUNNEL_TYPE_NONE = 0,
-   RTE_TUNNEL_TYPE_VXLAN,
-   RTE_TUNNEL_TYPE_GENEVE,
-   RTE_TUNNEL_TYPE_TEREDO,
-   RTE_TUNNEL_TYPE_NVGRE,
-   RTE_TUNNEL_TYPE_MAX,
-};
-
-/**
  *  Possible l4type of FDIR filters.
  */
 enum rte_l4type {
-- 
1.7.7.6
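
The filter_type field in rte_eth_tunnel_filter_conf is a bitmask built from the
ETH_TUNNEL_FILTER_* flags defined in this patch. A minimal sketch of how the
combined values compose, assuming only the bit values shown above; the helper
tunnel_filter_uses() is hypothetical and not part of the patch:

```c
#include <stdint.h>

/* Bit values copied from the rte_eth_ctrl.h hunk in this patch. */
#define ETH_TUNNEL_FILTER_OMAC  0x01 /* outer MAC addr */
#define ETH_TUNNEL_FILTER_OIP   0x02 /* outer IP addr */
#define ETH_TUNNEL_FILTER_TENID 0x04 /* tenant ID */
#define ETH_TUNNEL_FILTER_IMAC  0x08 /* inner MAC addr */
#define ETH_TUNNEL_FILTER_IVLAN 0x10 /* inner VLAN ID */
#define ETH_TUNNEL_FILTER_IIP   0x20 /* inner IP addr */

#define RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID (ETH_TUNNEL_FILTER_IMAC | \
		ETH_TUNNEL_FILTER_IVLAN | \
		ETH_TUNNEL_FILTER_TENID)

/*
 * Hypothetical helper (an assumption for illustration only):
 * returns nonzero when filter_type matches on the given field.
 */
static int
tunnel_filter_uses(uint16_t filter_type, uint16_t field)
{
	return (filter_type & field) == field;
}
```

With these values, RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID evaluates to
0x08 | 0x10 | 0x04 = 0x1C, so tunnel_filter_uses() reports the tenant ID as
matched and the outer MAC as not matched.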

[dpdk-dev] [PATCH v6 6/9] i40e:implement API of VxLAN packet filter in librte_pmd_i40e

2014-10-21 Thread Jijiang Liu

Implement the VxLAN tunnel filter in librte_pmd_i40e, which includes:
 - add the i40e_tunnel_filter_handle() function.
 - add the i40e_dev_tunnel_filter_set() function.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |  177 -
 1 files changed, 175 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 8a84e30..726a972 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "i40e_logs.h"
 #include "i40e/i40e_register_x710_int.h"
@@ -192,6 +193,9 @@ static int i40e_dev_udp_tunnel_add(struct rte_eth_dev *dev,
 static int i40e_dev_udp_tunnel_del(struct rte_eth_dev *dev,
struct rte_eth_udp_tunnel *udp_tunnel,
uint8_t count);
+static int i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
+   struct rte_eth_tunnel_filter_conf *tunnel_filter,
+   uint8_t add);
 static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
enum rte_filter_type filter_type,
enum rte_filter_op filter_op,
@@ -4105,6 +4109,108 @@ i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
 }

 static int
+i40e_dev_get_filter_type(uint16_t filter_type, uint16_t *flag)
+{
+   switch (filter_type) {
+   case RTE_TUNNEL_FILTER_IMAC_IVLAN:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN;
+   break;
+   case RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN_TEN_ID;
+   break;
+   case RTE_TUNNEL_FILTER_IMAC_TENID:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_TEN_ID;
+   break;
+   case RTE_TUNNEL_FILTER_OMAC_TENID_IMAC:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_OMAC_TEN_ID_IMAC;
+   break;
+   case ETH_TUNNEL_FILTER_IMAC:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC;
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "invalid tunnel filter type\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int
+i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
+   struct rte_eth_tunnel_filter_conf *tunnel_filter,
+   uint8_t add)
+{
+   uint16_t ip_type;
+   uint8_t tun_type = 0;
+   int val, ret = 0;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   struct i40e_vsi *vsi = pf->main_vsi;
+   struct i40e_aqc_add_remove_cloud_filters_element_data  *cld_filter;
+   struct i40e_aqc_add_remove_cloud_filters_element_data  *pfilter;
+
+   cld_filter = rte_zmalloc("tunnel_filter",
+   sizeof(struct i40e_aqc_add_remove_cloud_filters_element_data),
+   0);
+
+   if (NULL == cld_filter) {
+   PMD_DRV_LOG(ERR, "Failed to alloc memory.\n");
+   return -ENOMEM;
+   }
+   pfilter = cld_filter;
+
+   (void)rte_memcpy(&pfilter->outer_mac, tunnel_filter->outer_mac,
+   sizeof(struct ether_addr));
+   (void)rte_memcpy(&pfilter->inner_mac, tunnel_filter->inner_mac,
+   sizeof(struct ether_addr));
+
+   pfilter->inner_vlan = tunnel_filter->inner_vlan;
+   if (tunnel_filter->ip_type == RTE_TUNNEL_IPTYPE_IPV4) {
+   ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV4;
+   (void)rte_memcpy(&pfilter->ipaddr.v4.data,
+   &tunnel_filter->ip_addr,
+   sizeof(pfilter->ipaddr.v4.data));
+   } else {
+   ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV6;
+   (void)rte_memcpy(&pfilter->ipaddr.v6.data,
+   &tunnel_filter->ip_addr,
+   sizeof(pfilter->ipaddr.v6.data));
+   }
+
+   /* check tunneled type */
+   switch (tunnel_filter->tunnel_type) {
+   case RTE_TUNNEL_TYPE_VXLAN:
+   tun_type = I40E_AQC_ADD_CLOUD_TNL_TYPE_XVLAN;
+   break;
+   default:
+   /* Other tunnel types are not supported. */
+   PMD_DRV_LOG(ERR, "tunnel type is not supported.\n");
+   rte_free(cld_filter);
+   return -EINVAL;
+   }
+
+   val = i40e_dev_get_filter_type(tunnel_filter->filter_type,
+   &pfilter->flags);
+   if (val < 0) {
+   rte_free(cld_filter);
+   return -EINVAL;
+   }
+
+   pfilter->flags |= I40E_AQC_ADD_CLOUD_FLAGS_TO_QUEUE | ip_type |
+   (tun_type << I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT);
+   pfilter->tenant_id = tunnel_filter->tenant_id;
+   pfilter->queue_number = tunnel_filter->queue_id;
+
+   if (add)
+

[dpdk-dev] [PATCH v6 7/9] app/testpmd:test VxLAN packet filter

2014-10-21 Thread Jijiang Liu

Add the tunnel_filter command in testpmd to test the API of VxLAN packet filter.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 

---
 app/test-pmd/cmdline.c |  152 
 1 files changed, 152 insertions(+), 0 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 7160e38..bc9a30c 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -285,6 +285,14 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the outer VLAN TPID for Packet Filtering on"
" a port\n\n"

+   "tunnel_filter add (port_id) (outer_mac) (inner_mac) (ip_addr) "
+   "(inner_vlan) (tunnel_type) (filter_type) (tenant_id) (queue_id)\n"
+   "   add a tunnel filter of a port.\n\n"
+
+   "tunnel_filter rm (port_id) (outer_mac) (inner_mac) (ip_addr) "
+   "(inner_vlan) (tunnel_type) (filter_type) (tenant_id) (queue_id)\n"
+   "   remove a tunnel filter of a port.\n\n"
+
"rx_vxlan_port add (udp_port) (port_id)\n"
"Add an UDP port for VxLAN packet filter on a port\n\n"

@@ -6231,6 +6239,149 @@ cmdline_parse_inst_t cmd_vf_rate_limit = {
},
 };

+/* *** ADD TUNNEL FILTER OF A PORT *** */
+struct cmd_tunnel_filter_result {
+   cmdline_fixed_string_t cmd;
+   cmdline_fixed_string_t what;
+   uint8_t port_id;
+   struct ether_addr outer_mac;
+   struct ether_addr inner_mac;
+   cmdline_ipaddr_t ip_value;
+   uint16_t inner_vlan;
+   cmdline_fixed_string_t tunnel_type;
+   cmdline_fixed_string_t filter_type;
+   uint32_t tenant_id;
+   uint16_t queue_num;
+};
+
+static void
+cmd_tunnel_filter_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_tunnel_filter_result *res = parsed_result;
+   struct rte_eth_tunnel_filter_conf tunnel_filter_conf;
+   int ret = 0;
+
+   tunnel_filter_conf.outer_mac = &res->outer_mac;
+   tunnel_filter_conf.inner_mac = &res->inner_mac;
+   tunnel_filter_conf.inner_vlan = res->inner_vlan;
+
+   if (res->ip_value.family == AF_INET) {
+   tunnel_filter_conf.ip_addr.ipv4_addr =
+   res->ip_value.addr.ipv4.s_addr;
+   tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV4;
+   } else {
+   memcpy(&(tunnel_filter_conf.ip_addr.ipv6_addr),
+   &(res->ip_value.addr.ipv6),
+   sizeof(struct in6_addr));
+   tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV6;
+   }
+
+   if (!strcmp(res->filter_type, "imac-ivlan"))
+   tunnel_filter_conf.filter_type = RTE_TUNNEL_FILTER_IMAC_IVLAN;
+   else if (!strcmp(res->filter_type, "imac-ivlan-tenid"))
+   tunnel_filter_conf.filter_type =
+   RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID;
+   else if (!strcmp(res->filter_type, "imac-tenid"))
+   tunnel_filter_conf.filter_type = RTE_TUNNEL_FILTER_IMAC_TENID;
+   else if (!strcmp(res->filter_type, "imac"))
+   tunnel_filter_conf.filter_type = ETH_TUNNEL_FILTER_IMAC;
+   else if (!strcmp(res->filter_type, "omac-imac-tenid"))
+   tunnel_filter_conf.filter_type =
+   RTE_TUNNEL_FILTER_OMAC_TENID_IMAC;
+   else {
+   printf("The filter type is not supported.\n");
+   return;
+   }
+
+   tunnel_filter_conf.to_queue = RTE_TUNNEL_FILTER_TO_QUEUE;
+
+   if (!strcmp(res->tunnel_type, "vxlan"))
+   tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_VXLAN;
+   else {
+   printf("Only VxLAN is supported now.\n");
+   return;
+   }
+
+   tunnel_filter_conf.tenant_id = res->tenant_id;
+   tunnel_filter_conf.queue_id = res->queue_num;
+   if (!strcmp(res->what, "add"))
+   ret = rte_eth_dev_filter_ctrl(res->port_id,
+   RTE_ETH_FILTER_TUNNEL,
+   RTE_ETH_FILTER_ADD,
+   &tunnel_filter_conf);
+   else
+   ret = rte_eth_dev_filter_ctrl(res->port_id,
+   RTE_ETH_FILTER_TUNNEL,
+   RTE_ETH_FILTER_DELETE,
+   &tunnel_filter_conf);
+   if (ret < 0)
+   printf("cmd_tunnel_filter_parsed error: (%s)\n",
+   strerror(-ret));
+
+}
+cmdline_parse_token_string_t cmd_tunnel_filter_cmd =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_filter_result,
+   cmd, "tunnel_filter");
+cmdline_parse_token_string_t cmd_tunnel_filter_
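
The strcmp() chain in cmd_tunnel_filter_parsed() maps a filter-type name to a
bitmask. The same mapping can be sketched as a table lookup. This is only an
illustration, assuming the bit values from patch 5/9 of this series; the table
filter_names and the function parse_filter_type() are hypothetical names, not
part of testpmd:

```c
#include <stdint.h>
#include <string.h>

/* Bit values as defined in rte_eth_ctrl.h by patch 5/9 of this series. */
#define ETH_TUNNEL_FILTER_OMAC  0x01
#define ETH_TUNNEL_FILTER_TENID 0x04
#define ETH_TUNNEL_FILTER_IMAC  0x08
#define ETH_TUNNEL_FILTER_IVLAN 0x10

/* Hypothetical table: the names are the strings accepted by the command. */
static const struct {
	const char *name;
	uint16_t filter_type;
} filter_names[] = {
	{ "imac-ivlan", ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN },
	{ "imac-ivlan-tenid", ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN |
			ETH_TUNNEL_FILTER_TENID },
	{ "imac-tenid", ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_TENID },
	{ "imac", ETH_TUNNEL_FILTER_IMAC },
	{ "omac-imac-tenid", ETH_TUNNEL_FILTER_OMAC | ETH_TUNNEL_FILTER_TENID |
			ETH_TUNNEL_FILTER_IMAC },
};

/* Returns 0 and stores the bitmask in *out, or -1 if the name is unknown. */
static int
parse_filter_type(const char *name, uint16_t *out)
{
	size_t i;

	for (i = 0; i < sizeof(filter_names) / sizeof(filter_names[0]); i++) {
		if (strcmp(name, filter_names[i].name) == 0) {
			*out = filter_names[i].filter_type;
			return 0;
		}
	}
	return -1;
}
```

A table keeps the accepted names and their values in one place, which may make
it easier to keep the help text and the parser in sync when a filter type is
added.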

[dpdk-dev] [PATCH v6 8/9] i40e:support VxLAN Tx checksum offload

2014-10-21 Thread Jijiang Liu

Support VxLAN Tx checksum offload, which includes:
  - outer L3(IP) checksum offload
  - inner L3(IP) checksum offload
  - inner L4(UDP, TCP and SCTP) checksum offload

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 
---
 lib/librte_mbuf/rte_mbuf.h  |1 +
 lib/librte_pmd_i40e/i40e_rxtx.c |   46 +-
 2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 98951a6..4144c0d 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -94,6 +94,7 @@ extern "C" {

 #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
 #define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
+#define PKT_TX_VXLAN_CKSUM   (1ULL << 50) /**< TX checksum of VxLAN computed by NIC */
 #define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
 #define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum offload. */
 #define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index 7c3809f..592787f 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -411,11 +411,14 @@ i40e_rxd_ptype_to_pkt_flags(uint64_t qword)
 }

 static inline void
-i40e_txd_enable_checksum(uint32_t ol_flags,
+i40e_txd_enable_checksum(uint64_t ol_flags,
uint32_t *td_cmd,
uint32_t *td_offset,
uint8_t l2_len,
-   uint8_t l3_len)
+   uint16_t l3_len,
+   uint8_t inner_l2_len,
+   uint16_t inner_l3_len,
+   uint32_t *cd_tunneling)
 {
if (!l2_len) {
PMD_DRV_LOG(DEBUG, "L2 length set to 0");
@@ -428,6 +431,27 @@ i40e_txd_enable_checksum(uint32_t ol_flags,
return;
}

+   /* VxLAN packet TX checksum offload */
+   if (unlikely(ol_flags & PKT_TX_VXLAN_CKSUM)) {
+   uint8_t l4tun_len;
+
+   l4tun_len = ETHER_VXLAN_HLEN + inner_l2_len;
+
+   if (ol_flags & PKT_TX_IPV4_CSUM)
+   *cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV4;
+   else if (ol_flags & PKT_TX_IPV6)
+   *cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV6;
+
+   /* Now set the ctx descriptor fields */
+   *cd_tunneling |= (l3_len >> 2) <<
+   I40E_TXD_CTX_QW0_EXT_IPLEN_SHIFT |
+   I40E_TXD_CTX_UDP_TUNNELING |
+   (l4tun_len >> 1) <<
+   I40E_TXD_CTX_QW0_NATLEN_SHIFT;
+
+   l3_len = inner_l3_len;
+   }
+
/* Enable L3 checksum offloads */
if (ol_flags & PKT_TX_IPV4_CSUM) {
*td_cmd |= I40E_TX_DESC_CMD_IIPT_IPV4_CSUM;
@@ -1077,7 +1101,10 @@ i40e_recv_scattered_pkts(void *rx_queue,
 static inline uint16_t
 i40e_calc_context_desc(uint64_t flags)
 {
-   uint16_t mask = 0;
+   uint64_t mask = 0ULL;
+
+   if (flags & PKT_TX_VXLAN_CKSUM)
+   mask |= PKT_TX_VXLAN_CKSUM;

 #ifdef RTE_LIBRTE_IEEE1588
mask |= PKT_TX_IEEE1588_TMST;
@@ -1098,6 +1125,7 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
volatile struct i40e_tx_desc *txr;
struct rte_mbuf *tx_pkt;
struct rte_mbuf *m_seg;
+   uint32_t cd_tunneling_params;
uint16_t tx_id;
uint16_t nb_tx;
uint32_t td_cmd;
@@ -1106,7 +1134,9 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
uint32_t td_tag;
uint64_t ol_flags;
uint8_t l2_len;
-   uint8_t l3_len;
+   uint16_t l3_len;
+   uint8_t inner_l2_len;
+   uint16_t inner_l3_len;
uint16_t nb_used;
uint16_t nb_ctx;
uint16_t tx_last;
@@ -1134,7 +1164,9 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)

ol_flags = tx_pkt->ol_flags;
l2_len = tx_pkt->l2_len;
+   inner_l2_len = tx_pkt->inner_l2_len;
l3_len = tx_pkt->l3_len;
+   inner_l3_len = tx_pkt->inner_l3_len;

/* Calculate the number of context descriptors needed. */
nb_ctx = i40e_calc_context_desc(ol_flags);
@@ -1182,15 +1214,17 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
td_cmd |= I40E_TX_DESC_CMD_ICRC;

/* Enable checksum offloading */
+   cd_tunneling_params = 0;
i40e_txd_enable_checksum(ol_flags, &td_cmd, &td_offset,
-   l2_len, l3_len);
+   l2_len, l3_len, inner_l2_len,
+
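
Whether a context descriptor is needed for VxLAN offload depends on whether
PKT_TX_VXLAN_CKSUM is set in the mbuf offload flags. A minimal sketch of that
flag test, assuming only the bit position from the rte_mbuf.h hunk; both helper
names are hypothetical. Testing a flag needs bitwise AND, because OR with a
nonzero constant is nonzero for every input:

```c
#include <stdint.h>

#define PKT_TX_VXLAN_CKSUM (1ULL << 50) /* value from the rte_mbuf.h hunk */

/* Hypothetical helper: nonzero only when the VxLAN checksum flag is set. */
static inline int
vxlan_cksum_requested(uint64_t ol_flags)
{
	return (ol_flags & PKT_TX_VXLAN_CKSUM) != 0;
}

/* For contrast: the OR form is nonzero for any ol_flags, including 0. */
static inline int
or_form_always_true(uint64_t ol_flags)
{
	return (ol_flags | PKT_TX_VXLAN_CKSUM) != 0;
}
```

Calling vxlan_cksum_requested(0) yields 0, while or_form_always_true(0) yields
1, which is why the AND form is the one that distinguishes tunneled packets
from ordinary ones.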

[dpdk-dev] [PATCH v6 9/9] app/testpmd:test VxLAN Tx checksum offload

2014-10-21 Thread Jijiang Liu

Add test cases in testpmd to test VxLAN Tx checksum offload, which includes:
 - IPv4 and IPv6 packets
 - outer L3, inner L3 and L4 checksum offload for Tx side.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 
---
 app/test-pmd/cmdline.c  |   13 ++-
 app/test-pmd/config.c   |6 +-
 app/test-pmd/csumonly.c |  195 +++
 3 files changed, 192 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index bc9a30c..d738258 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -310,13 +310,17 @@ static void cmd_help_long_parsed(void *parsed_result,
"Disable hardware insertion of a VLAN header in"
" packets sent on a port.\n\n"

-   "tx_checksum set mask (port_id)\n"
+   "tx_checksum set (mask) (port_id)\n"
"Enable hardware insertion of checksum offload with"
-   " the 4-bit mask, 0~0xf, in packets sent on a port.\n"
+   " the 8-bit mask, 0~0xff, in packets sent on a port.\n"
"bit 0 - insert ip   checksum offload if set\n"
"bit 1 - insert udp  checksum offload if set\n"
"bit 2 - insert tcp  checksum offload if set\n"
"bit 3 - insert sctp checksum offload if set\n"
+   "bit 4 - insert inner ip  checksum offload if set\n"
+   "bit 5 - insert inner udp checksum offload if set\n"
+   "bit 6 - insert inner tcp checksum offload if set\n"
+   "bit 7 - insert inner sctp checksum offload if set\n"
"Please check the NIC datasheet for HW limits.\n\n"

"set fwd (%s)\n"
@@ -2763,8 +2767,9 @@ cmdline_parse_inst_t cmd_tx_cksum_set = {
.f = cmd_tx_cksum_set_parsed,
.data = NULL,
.help_str = "enable hardware insertion of L3/L4checksum with a given "
-   "mask in packets sent on a port, the bit mapping is given as, Bit 0 for ip"
-   "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP",
+   "mask in packets sent on a port, the bit mapping is given as, Bit 0 for ip, "
+   "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP, Bit 4 for inner ip, "
+   "Bit 5 for inner UDP, Bit 6 for inner TCP, Bit 7 for inner SCTP",
.tokens = {
(void *)&cmd_tx_cksum_set_tx_cksum,
(void *)&cmd_tx_cksum_set_set,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 2a1b93f..9bc08f4 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1753,9 +1753,9 @@ tx_cksum_set(portid_t port_id, uint64_t ol_flags)
uint64_t tx_ol_flags;
if (port_id_is_invalid(port_id))
return;
-   /* Clear last 4 bits and then set L3/4 checksum mask again */
-   tx_ol_flags = ports[port_id].tx_ol_flags & (~0x0Full);
-   ports[port_id].tx_ol_flags = ((ol_flags & 0xf) | tx_ol_flags);
+   /* Clear last 8 bits and then set L3/4 checksum mask again */
+   tx_ol_flags = ports[port_id].tx_ol_flags & (~0x0FFull);
+   ports[port_id].tx_ol_flags = ((ol_flags & 0xff) | tx_ol_flags);
 }

 void
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index fcc4876..e2ac129 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -196,7 +196,6 @@ get_ipv6_udptcp_checksum(struct ipv6_hdr *ipv6_hdr, uint16_t *l4_hdr)
return (uint16_t)cksum;
 }

-
 /*
  * Forwarding of packets. Change the checksum field with HW or SW methods
  * The HW/SW method selection depends on the ol_flags on every packet
@@ -209,10 +208,16 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
struct rte_mbuf  *mb;
struct ether_hdr *eth_hdr;
struct ipv4_hdr  *ipv4_hdr;
+   struct ether_hdr *inner_eth_hdr;
+   struct ipv4_hdr  *inner_ipv4_hdr = NULL;
struct ipv6_hdr  *ipv6_hdr;
+   struct ipv6_hdr  *inner_ipv6_hdr = NULL;
struct udp_hdr   *udp_hdr;
+   struct udp_hdr   *inner_udp_hdr;
struct tcp_hdr   *tcp_hdr;
+   struct tcp_hdr   *inner_tcp_hdr;
struct sctp_hdr  *sctp_hdr;
+   struct sctp_hdr  *inner_sctp_hdr;

uint16_t nb_rx;
uint16_t nb_tx;
@@ -221,12 +226,18 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
uint64_t pkt_ol_flags;
uint64_t tx_ol_flags;
uint16_t l4_proto;
+   uint16_t inner_l4_proto = 0;
uint16_t eth_type;
uint8_t  l2_len;
uint8_t  l3_len;
+   uint8_t  inner_l2_len = 0;
+   uint8_t  inner_l3_len = 0;

uint32_t rx_bad_ip_csum;
uint32_t rx_bad_l4_csum;
+   uint8_t  ipv4_tunnel;
+   uint8_t  ipv6_tunnel;
+   uint16_t ptype;

 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
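
The tx_cksum_set() change widens the checksum mask from 4 to 8 bits: the low
byte of the per-port flags is cleared and replaced with the new mask, and the
upper bits are kept. A minimal sketch of that update, assuming the bit layout
from the help text above; the enum names and update_cksum_bits() are
hypothetical and only illustrate the arithmetic:

```c
#include <stdint.h>

/* Hypothetical names for the eight mask bits listed in the help text. */
enum {
	CKSUM_IP         = 1 << 0,
	CKSUM_UDP        = 1 << 1,
	CKSUM_TCP        = 1 << 2,
	CKSUM_SCTP       = 1 << 3,
	CKSUM_INNER_IP   = 1 << 4,
	CKSUM_INNER_UDP  = 1 << 5,
	CKSUM_INNER_TCP  = 1 << 6,
	CKSUM_INNER_SCTP = 1 << 7,
};

/*
 * Hypothetical helper mirroring the two statements in tx_cksum_set():
 * clear the low 8 bits of the stored flags, then set the requested mask.
 */
static uint64_t
update_cksum_bits(uint64_t tx_ol_flags, uint64_t mask)
{
	return (tx_ol_flags & ~0xFFULL) | (mask & 0xFFULL);
}
```

For example, update_cksum_bits(0x1234, 0x1F1) keeps 0x1200 from the stored
flags and takes 0xF1 from the mask, giving 0x12F1; bits of the mask above bit 7
are discarded.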

[dpdk-dev] development/integration branch?

2014-10-21 Thread Thomas Monjalon

2014-10-21 08:36, Richardson, Bruce:
> From: Marc Sune
> > Some DPDK users, including myself, use a clone of the git repository to
> > compile DPDK for their applications, instead of downloading the tarball
> > of each release.
> > 
> > In my opinion, it would be useful for such users that the master branch
> > contains only stable releases, to prevent (mistakenly) using a WIP DPDK
> > version, and jump quickly to the latest stable with a simple git pull
> > without having to check the tags. Also new users would clone the repo
> > and get only the stable release.
> > 
> > So I would propose to use an integration/development branch, where the
> > patches are integrated and only pushed to master once a stable release is
> > tagged in this integration branch.
> > 
> > Thoughts?
> 
> Ideally, our master branch should always be good and stable, but given
> reality often interferes with such good intentions I think that having
> dev branches is not a bad idea. However, what we may lose by doing so
> is having a larger group of people constantly using the master branch
> and reporting problems to us. 
> 
> On balance, I'd be slightly in favour of this suggestion.

My balance is different because I have a simpler solution for Marc's problem:
git fetch && git merge $(git tag | grep -v -- -rc | tail -n1)

-- 
Thomas

[dpdk-dev] development/integration branch?

2014-10-21 Thread Marc Sune

On 21/10/14 10:46, Thomas Monjalon wrote:
> My balance is different because I have a simpler solution for Marc's problem:
>   git fetch && git merge $(git tag | grep -v -- -rc | tail -n1)
Thomas,

We all know we _can_ do this. But is it really necessary? We should be 
all as lazy as possible and make it easy for users IMHO. `git pull` is 
easier :)

I don't see any drawback of using a development branch, except if you 
consider the extra push to master per release a drawback.

Also think about new users downloading the repo for the first time. They 
are forced to do this right now if they want to checkout the latest stable.

marc

[dpdk-dev] [PATCH] Fix warnings if DPDK is used with C++1x: Format macro constants for fixed width integer types need a space after the preceding string literal.

2014-10-21 Thread Thomas Monjalon

Hi,

Thank you for the patch. Not a lot of people use DPDK with C++, so we probably
need some fixes.

Please, could you send a v2 of this patch with a shorter title, an explanation
in the commit log and a Signed-off-by line?
Guidelines are explained here:
http://dpdk.org/dev#send

2014-10-21 10:38, Matthias Bartelt:
> From: Matthias Bartelt 

You need to configure your email address in .gitconfig to remove this.

-- 
Thomas

[dpdk-dev] development/integration branch?

2014-10-21 Thread Thomas Monjalon

2014-10-21 11:14, Marc Sune:
> On 21/10/14 10:46, Thomas Monjalon wrote:
> > My balance is different because I have a simpler solution for Marc's 
> > problem:
> > git fetch && git merge $(git tag | grep -v -- -rc | tail -n1)
> 
> We all know we _can_ do this. But is it really necessary? We should be 
> all as lazy as possible and make it easy for users IMHO. `git pull` is 
> easier :)

Yes and lazy users download tarballs.

> I don't see any drawback of using a development branch, except if you 
> consider the extra push to master per release a drawback.

No, I don't mind pushing one more thing.
But I care about the message sent by such a change. It would mean that
we can break the development branch and that most developers don't test
it nor base their patches on the latest commit. It's all about simple rules
and messages.

> Also think about new users downloading the repo for the first time. They 
> are forced to do this right now if they want to checkout the latest stable.

New users will get the latest release and expect to see current work in
progress right after cloning the git tree (in master branch).
It's also more common to see work in progress in default branch in cgit.

-- 
Thomas

[dpdk-dev] development/integration branch?

2014-10-21 Thread Marc Sune

Thomas,

On 21/10/14 11:28, Thomas Monjalon wrote:
> 2014-10-21 11:14, Marc Sune:
>> On 21/10/14 10:46, Thomas Monjalon wrote:
>>> My balance is different because I have a simpler solution for Marc's 
>>> problem:
>>> git fetch && git merge $(git tag | grep -v -- -rc | tail -n1)
>> We all know we _can_ do this. But is it really necessary? We should be
>> all as lazy as possible and make it easy for users IMHO. `git pull` is
>> easier :)
> Yes and lazy users download tarballs.

At least for me, I stopped downloading DPDK tarballs after the third 
time I had to upgrade the release.
>> I don't see any drawback of using a development branch, except if you
>> consider the extra push to master per release a drawback.
> No I don't care to push one more thing.
> But I care about the message brought by such change. It would mean that
> we can break the development branch and that most of developers don't test
> it nor base their patches on the latest commit. It's all about simple rules
> and messages.

I understand your concern but, isn't peer reviewing meant to prevent this?

>> Also think about new users downloading the repo for the first time. They
>> are forced to do this right now if they want to checkout the latest stable.
> New users will get the latest release and expect to see current work in
> progress right after cloning the git tree (in master branch).
> It's also more common to see work in progress in default branch in cgit.
I know, but I also know other projects do it the way I proposed with 
success. In any case it was just a suggestion to try to improve things.

marc

[dpdk-dev] [PATCH v3 0/6] Update libs build process

2014-10-21 Thread Gonzalez Monroy, Sergio

Hi Thomas,

Given that most of the comments/discussion for this patch set revolved 
around the removal of COMBINE_LIBS and what libs to build by default, 
I am inclined to drop this patch set, submit a minimal patch to fix 
compiler errors (initial and main purpose of this patch set) and then 
submit an RFC regarding the use/removal of COMBINE_LIBS and other outstanding 
issues in the build system.

Does that sound like a better approach?

Thanks,
Sergio

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Gonzalez Monroy,
> Sergio
> Sent: Monday, October 13, 2014 5:02 PM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/6] Update libs build process
> 
> Are there any more comments on this patch set?
> 
> Thanks,
> Sergio
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez
> > Monroy
> > Sent: Thursday, October 9, 2014 2:05 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH v3 0/6] Update libs build process
> >
> > As per the proposal, this patch set does:
> >  - Remove CONFIG_RTE_BUILD_COMBINE_LIBS as a configuration option.
> >  - For static library, build a single/combined library.
> >  - For shared libraries, build both individual/separated and single/combined
> >libraries.
> >  - Link apps only against single/combined libs.
> >  - Include external shared libs dependencies when building shared libraries.
> >
> > v3:
> >  - Split some of the patches for easier review
> >  - Improve patches descriptions
> >
> > Sergio Gonzalez Monroy (6):
> >   Link combined shared library using CC
> >   Link apps only against single/combined library
> >   Remove CONFIG_RTE_BUILD_COMBINE_LIBS and related
> >   Update library build process
> >   Avoid duplicated code
> >   Link apps/DSOs against EXECENV_LDLIBS with --as-needed
> >
> >  config/common_bsdapp   |   3 +-
> >  config/common_linuxapp |   3 +-
> >  mk/rte.app.mk  | 164 
> > ++---
> >  mk/rte.lib.mk  |  81 ++--
> >  mk/rte.sdkbuild.mk |   2 +-
> >  mk/rte.sharelib.mk |  54 
> >  mk/rte.vars.mk |   4 --
> >  7 files changed, 54 insertions(+), 257 deletions(-)
> >
> > --
> > 1.9.3

[dpdk-dev] [PATCH v6 1/9] librte_mbuf:the rte_mbuf structure changes

2014-10-21 Thread Thomas Monjalon

Hi Jijiang,

2014-10-21 16:46, Jijiang Liu:
> Remove the "reserved2" field and add the "packet_type"

"Remove and add" can be said "Replace".

> and the "inner_l2_l3_len" fields in the rte_mbuf structure.

Please explain that you are using 2 bytes of the second cache line
for TX offloading of tunnels.

>   /* remaining bytes are set on RX when pulling packet from descriptor */
>   MARKER rx_descriptor_fields1;
> - uint16_t reserved2;   /**< Unused field. Required for padding */
> +
> + /**
> +  * Packet type, which is used to indicate ordinary L2 packet format and
> +  * also tunneled packet format such as IP in IP, IP in GRE, MAC in GRE
> +  * and MAC in UDP.
> +  */
> + uint16_t packet_type;

Why not name it "l2_type"?

-- 
Thomas

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-21 Thread Neil Horman

On Sun, Oct 12, 2014 at 11:10:39AM +, Liang, Cunming wrote:
> Hi Neil,
> 
> I very much appreciate your comments.
> I've added inline replies and will send v3 ASAP once we are aligned.
> 
> BRs,
> Liang Cunming
> 
> > -Original Message-
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Saturday, October 11, 2014 1:52 AM
> > To: Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx 
> > cycles/packet
> > 
> > On Fri, Oct 10, 2014 at 08:29:58PM +0800, Cunming Liang wrote:
> > > It provides unit test to measure cycles/packet in NIC loopback mode.
> > > It simply gives the average cycles of IO used per packet without test 
> > > equipment.
> > > When doing the test, make sure the link is UP.
> > >
> > > Usage Example:
> > > 1. Run unit test app in interactive mode
> > > app/test -c f -n 4 -- -i
> > > 2. Run and wait for the result
> > > pmd_perf_autotest
> > >
> > > There's an option to choose the rx/tx pair; the default is vector.
> > > set_rxtx_mode [vector|scalar|full|hybrid]
> > > Note: To get accurate scalar fast, please choose 'vector' or 'hybrid'
> > > without INC_VEC=y in config
> > >
> > > Signed-off-by: Cunming Liang 
> > > Acked-by: Bruce Richardson 
> > 
> > Notes inline
> > 
> > > ---
> > >  app/test/Makefile   |1 +
> > >  app/test/commands.c |   38 +++
> > >  app/test/packet_burst_generator.c   |4 +-
> > >  app/test/test.h |4 +
> > >  app/test/test_pmd_perf.c|  626 +++
> > >  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
> > >  6 files changed, 677 insertions(+), 2 deletions(-)
> > >  create mode 100644 app/test/test_pmd_perf.c
> > >
> > > diff --git a/app/test/Makefile b/app/test/Makefile
> > > index 6af6d76..ebfa0ba 100644
> > > --- a/app/test/Makefile
> > > +++ b/app/test/Makefile
> > > @@ -56,6 +56,7 @@ SRCS-y += test_memzone.c
> > >
> > >  SRCS-y += test_ring.c
> > >  SRCS-y += test_ring_perf.c
> > > +SRCS-y += test_pmd_perf.c
> > >
> > >  ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
> > >  SRCS-y += test_table.c
> > > diff --git a/app/test/commands.c b/app/test/commands.c
> > > index a9e36b1..f1e746e 100644
> > > --- a/app/test/commands.c
> > > +++ b/app/test/commands.c
> > > @@ -310,12 +310,50 @@ cmdline_parse_inst_t cmd_quit = {
> > >
> > > +#define NB_ETHPORTS_USED (1)
> > > +#define NB_SOCKETS  (2)
> > > +#define MEMPOOL_CACHE_SIZE 250
> > > +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
> > > Don't you want to size this in accordance with the amount of data you're
> > > sending (64 Bytes as noted above)?
> [Liang, Cunming] The case is designed to measure small packet IO cost with 
> normal mbuf size.
> Even if decreasing the size, it won't gain significant cycles.
> > 
That presumes a non-memory constrained system, doesn't it?  I suppose in the end
as long as you have consistency, it's not overly relevant, but it seems like
you'll want to add data sizing as a feature to this eventually (i.e. the ability
to test performance for larger frame sizes), at which point you'll need to make
this non-static anyway.

> > > +static void
> > > +print_ethaddr(const char *name, const struct ether_addr *eth_addr)
> > > +{
> > > + printf("%s%02X:%02X:%02X:%02X:%02X:%02X", name,
> > > + eth_addr->addr_bytes[0],
> > > + eth_addr->addr_bytes[1],
> > > + eth_addr->addr_bytes[2],
> > > + eth_addr->addr_bytes[3],
> > > + eth_addr->addr_bytes[4],
> > > + eth_addr->addr_bytes[5]);
> > > +}
> > > +
> > This was copied from print_ethaddr.  Seems like a good candidate for a common
> > function in rte_ether.h
> [Liang, Cunming] Agreed, some of the samples already carry the same copy.
> I'll rework it by adding 'ether_format_addr' to rte_ether.h, only to format the
> 48-bit address output, leaving the other prints to application customization.
> > 
Sounds good.
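
A helper of the kind agreed on above could look roughly like the sketch below.
It is only an illustration: struct mac_addr and mac_format() are hypothetical
stand-ins, not the rte_ether.h definitions, and the real 'ether_format_addr'
may end up with a different signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct ether_addr (6 address bytes). */
struct mac_addr {
	uint8_t addr_bytes[6];
};

/*
 * Format a 48-bit address as "XX:XX:XX:XX:XX:XX" into buf.
 * Returns the snprintf() result: the number of characters the full
 * string needs (17), excluding the terminating NUL.
 */
static int
mac_format(char *buf, size_t size, const struct mac_addr *addr)
{
	assert(buf != NULL && addr != NULL);
	return snprintf(buf, size, "%02X:%02X:%02X:%02X:%02X:%02X",
			addr->addr_bytes[0], addr->addr_bytes[1],
			addr->addr_bytes[2], addr->addr_bytes[3],
			addr->addr_bytes[4], addr->addr_bytes[5]);
}
```

Keeping the helper to pure formatting leaves the prefix and the output
channel to the caller, e.g. printf("Port MAC: %s\n", buf).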

> > 
> > > +}
> > > +
> > > +static void
> > > +signal_handler(int signum)
> > > +{
> > > + /* When we receive a USR1 signal, print stats */
> > I think you mean SIGUSR2, below, SIGUSR1 tears the test down and exits the
> > program
> [Liang, Cunming] Thanks, it's a typo.
> > 
> > > + if (signum == SIGUSR1) {
> > SIGINT instead.  That's the common practice.
> [Liang, Cunming] I understood your opinion. 
> The reasons I'm not using SIGINT instead are:
> 1. We unset ISIG in c_lflag of the terminal, so CTRL+C won't trigger SIGINT in
>   the interactive command line.
>   The signal always has to be sent explicitly, whether it is SIGUSR1 or SIGINT.
> 2. By its semantics, SIGINT is expected to terminate the process.
>   Here I want to force this test case to stop, but stay alive in the command line.
>   After it has stopped, it can run again or start running other test cases.
>   So I keep different behaviors for SIGINT and SIGUSR1.
> 3. It should be rarely used. 
>   Only when exception timeout, I leave thi
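
The behaviour described in point 2 (SIGUSR1 stops the running test case but
the process stays alive) can be sketched as below. This is a minimal,
standalone illustration, not the code from the patch: stop_requested,
install_stop_handler() and run_until_stopped() are hypothetical names.

```c
#include <assert.h>
#include <signal.h>

/* Set from the handler; polled by the (hypothetical) measurement loop. */
static volatile sig_atomic_t stop_requested;

static void
stop_handler(int signum)
{
	if (signum == SIGUSR1)
		stop_requested = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int
install_stop_handler(void)
{
	return signal(SIGUSR1, stop_handler) == SIG_ERR ? -1 : 0;
}

/* Run up to max_iters iterations, stopping early once the flag is set. */
static unsigned long
run_until_stopped(unsigned long max_iters)
{
	unsigned long n = 0;

	assert(max_iters > 0);
	while (n < max_iters && !stop_requested)
		n++;	/* one burst of RX/TX work would go here */
	return n;
}
```

Because the handler only sets a flag, the loop returns to its caller
normally, so the command line can start the same or another test case
afterwards.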

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-21 Thread Richardson, Bruce



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> Sent: Tuesday, October 21, 2014 11:33 AM
> To: Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> cycles/packet
> 
> > >
> > > > +   if (count == 0)
> > > > +   return -1;
> > > > +
> > > > +   printf("%lu packet, %lu drop, %lu idle\n", count, drop, idle);
> > > > +   printf("Result: %ld cycles per packet\n", (cur_tsc - prev_tsc) / count);
> > > > +
> > > Bad math here.  There's no guarantee that the tsc hasn't wrapped (potentially
> > > more than once) depending on your test length.  You need to check the tsc before
> > > and after each burst and record an average of deltas instead, accounting in each
> > > instance for the possibility of wrap.
> > [Liang, Cunming] I'm not sure I caught your point correctly.
> > I think both cur_tsc and prev_tsc are 64 bits wide.
> > At 3GHz, I don't think it will wrap so quickly.
> > As it's uint64_t, even if it wraps, the delta should still be correct.
> But there's no guarantee that the tsc starts at zero when you begin your test.
> The system may have been up for a long time and near wrapping already.
> Regardless, you need to account for the possibility that cur_tsc is smaller
> than prev_tsc, or this breaks.
> 

The TSC is 64-bit and so only wraps around every couple of hundred years or so
on a 2GHz machine, so I don't think it's necessary to handle that case.
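
For reference, 2^64 cycles at 2GHz is about 9.2e9 seconds, i.e. roughly 292
years. The sketch below also illustrates the earlier point that an unsigned
64-bit subtraction gives the right delta across a single wrap. The function
names are made up for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cycles elapsed between two reads of a free-running 64-bit counter.
 * Unsigned arithmetic is modulo 2^64, so the result is still correct
 * when the counter wrapped once between the two reads.
 */
static uint64_t
tsc_delta(uint64_t prev_tsc, uint64_t cur_tsc)
{
	return cur_tsc - prev_tsc;
}

/* Average cycles per packet; count must be non-zero. */
static uint64_t
cycles_per_packet(uint64_t prev_tsc, uint64_t cur_tsc, uint64_t count)
{
	assert(count != 0);
	return tsc_delta(prev_tsc, cur_tsc) / count;
}
```

With prev_tsc 100 cycles below the wrap point and cur_tsc at 400, the delta
is 500, the same as without a wrap.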

/Bruce

[dpdk-dev] [PATCH v5] KNI: use a memzone pool for KNI alloc/release

2014-10-21 Thread Marc Sune

The previous implementation of rte_kni_alloc() was allocating memzones with a
name composed of a fixed string and the interface name. When an application was
allocating and deallocating multiple interfaces with different names, memzones
were quickly exhausted, even though memzones from deallocated interfaces were
never used anymore (unless an interface with the same name was re-allocated).
As a result, the application was unable to allocate more KNI interfaces with
different names.

This patch implements the KNI memzone pool in order to prevent memzone
exhaustion when allocating/deallocating KNI interfaces. It adds a new API call,
rte_kni_init(max_kni_ifaces) that shall be called before any call to
rte_kni_alloc() if KNI is used. The memzones are pre-allocated with interface-
independent names so that they can be reused.
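
The pooling idea can be sketched in isolation as a pre-allocated array of
slots chained into a free list. This is a simplified, single-threaded
illustration only: the slot_pool_* names are hypothetical, the slots hold no
memzones here, and the actual code in rte_kni.c differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One pre-allocated slot; in the patch each slot owns a set of memzones. */
struct slot {
	uint32_t id;
	uint8_t in_use;
	struct slot *next;	/* next free slot */
};

struct slot_pool {
	struct slot *slots;	/* array of max_ifaces slots */
	struct slot *free_head;	/* allocate from here */
	struct slot *free_tail;	/* release to here */
};

/* Pre-allocate max_ifaces slots once and chain them all as free. */
static int
slot_pool_init(struct slot_pool *pool, uint32_t max_ifaces)
{
	uint32_t i;

	assert(pool != NULL && max_ifaces > 0);
	pool->slots = calloc(max_ifaces, sizeof(*pool->slots));
	if (pool->slots == NULL)
		return -1;
	for (i = 0; i < max_ifaces; i++) {
		pool->slots[i].id = i;
		pool->slots[i].next =
			(i + 1 < max_ifaces) ? &pool->slots[i + 1] : NULL;
	}
	pool->free_head = &pool->slots[0];
	pool->free_tail = &pool->slots[max_ifaces - 1];
	return 0;
}

/* Take a slot from the head of the free list, or NULL if exhausted. */
static struct slot *
slot_pool_alloc(struct slot_pool *pool)
{
	struct slot *s = pool->free_head;

	if (s == NULL)
		return NULL;
	pool->free_head = s->next;
	if (pool->free_head == NULL)
		pool->free_tail = NULL;
	s->next = NULL;
	s->in_use = 1;
	return s;
}

/* Return a slot to the tail of the free list so it can be reused. */
static void
slot_pool_release(struct slot_pool *pool, struct slot *s)
{
	assert(s != NULL && s->in_use);
	s->in_use = 0;
	s->next = NULL;
	if (pool->free_tail == NULL)
		pool->free_head = s;
	else
		pool->free_tail->next = s;
	pool->free_tail = s;
}
```

Since slots are identified by index rather than by interface name, releasing
one interface and allocating another with a different name reuses the same
slot instead of consuming a new one.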

Signed-off-by: Marc Sune 
---
 app/test/test_kni.c  |5 +-
 examples/kni/main.c  |   22 
 lib/librte_kni/rte_kni.c |  311 +-
 lib/librte_kni/rte_kni.h |   18 +++
 4 files changed, 297 insertions(+), 59 deletions(-)

diff --git a/app/test/test_kni.c b/app/test/test_kni.c
index 1081131..608901d 100644
--- a/app/test/test_kni.c
+++ b/app/test/test_kni.c
@@ -58,7 +58,7 @@

 #define IFCONFIG  "/sbin/ifconfig "
 #define TEST_KNI_PORT "test_kni_port"
-
+#define KNI_TEST_MAX_PORTS 4
 /* The threshold number of mbufs to be transmitted or received. */
 #define KNI_NUM_MBUF_THRESHOLD 100
 static int kni_pkt_mtu = 0;
@@ -498,6 +498,9 @@ test_kni(void)
struct rte_eth_dev_info info;
struct rte_kni_ops ops;

+   /* Initialize KNI subsystem */
+   rte_kni_init(KNI_TEST_MAX_PORTS);
+
if (test_kni_allocate_lcores() < 0) {
printf("No enough lcores for kni processing\n");
return -1;
diff --git a/examples/kni/main.c b/examples/kni/main.c
index cb17b43..1344a87 100644
--- a/examples/kni/main.c
+++ b/examples/kni/main.c
@@ -586,6 +586,25 @@ parse_args(int argc, char **argv)
return ret;
 }

+/* Initialize KNI subsystem */
+static void
+init_kni(void)
+{
+   unsigned int num_of_kni_ports = 0, i;
+   struct kni_port_params **params = kni_port_params_array;
+
+   /* Calculate the maximum number of KNI interfaces that will be used */
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if (kni_port_params_array[i]) {
+   num_of_kni_ports += (params[i]->nb_lcore_k ?
+   params[i]->nb_lcore_k : 1);
+   }
+   }
+
+   /* Invoke rte KNI init to preallocate the ports */
+   rte_kni_init(num_of_kni_ports);
+}
+
 /* Initialise a single port on an Ethernet device */
 static void
 init_port(uint8_t port)
@@ -872,6 +891,9 @@ main(int argc, char** argv)
rte_exit(EXIT_FAILURE, "Configured invalid "
"port ID %u\n", i);

+   /* Initialize KNI subsystem */
+   init_kni();
+
/* Initialise each port */
for (port = 0; port < nb_sys_ports; port++) {
/* Skip ports that are not enabled */
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 76feef4..f64a0a8 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -40,6 +40,7 @@
 #include 
 #include 

+#include 
 #include 
 #include 
 #include 
@@ -58,7 +59,7 @@

 #define KNI_REQUEST_MBUF_NUM_MAX  32

-#define KNI_MZ_CHECK(mz) do { if (mz) goto fail; } while (0)
+#define KNI_MEM_CHECK(cond) do { if (cond) goto kni_fail; } while (0)

 /**
  * KNI context
@@ -66,6 +67,7 @@
 struct rte_kni {
char name[RTE_KNI_NAMESIZE];/**< KNI interface name */
uint16_t group_id;  /**< Group ID of KNI devices */
+   uint32_t slot_id;   /**< KNI pool slot ID */
struct rte_mempool *pktmbuf_pool;   /**< pkt mbuf mempool */
unsigned mbuf_size; /**< mbuf size */

@@ -88,10 +90,48 @@ enum kni_ops_status {
KNI_REQ_REGISTERED,
 };

+/**
+ * KNI memzone pool slot
+ */
+struct rte_kni_memzone_slot {
+   uint32_t id;
+   uint8_t in_use : 1;/**< slot in use */
+
+   /* Memzones */
+   const struct rte_memzone *m_ctx;   /**< KNI ctx */
+   const struct rte_memzone *m_tx_q;  /**< TX queue */
+   const struct rte_memzone *m_rx_q;  /**< RX queue */
+   const struct rte_memzone *m_alloc_q;   /**< Allocated mbufs queue */
+   const struct rte_memzone *m_free_q;/**< To be freed mbufs queue */
+   const struct rte_memzone *m_req_q; /**< Request queue */
+   const struct rte_memzone *m_resp_q;/**< Response queue */
+   const struct rte_memzone *m_sync_addr;
+
+   /* Free linked list */
+   struct rte_kni_memzone_slot *next; /**< Next slot link.list */
+};
+
+/**
+ * KNI memzone pool
+ */
+struct rte_kni_memzone_pool {
+   uint8_t initialized : 1;

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-21 Thread Thomas Monjalon

2014-10-21 16:46, Jijiang Liu:
> There are "some" destination UDP port numbers that have unique meaning.
> In terms of VxLAN, "IANA has assigned the value 4789 for the VXLAN UDP port, 
> and this value
> SHOULD be used by default as the destination UDP port. Some early 
> implementations of VXLAN
> have used other values for the destination port. To enable interoperability 
> with these 
> implementations, the destination port SHOULD be configurable."
> 
> Add two APIs in librte_ether for supporting UDP tunneling port configuration 
> on i40e.
> Currently, only VxLAN is implemented in this patch set.

Actually, there are 2 different things in this patch
- new tunnelling API
- VXLAN macros
Please split in 2 patches.

>  int
> +rte_eth_dev_udp_tunnel_add(uint8_t port_id,
> +struct rte_eth_udp_tunnel *udp_tunnel,
> +uint8_t count)
> +{
> + uint8_t i;
> + struct rte_eth_dev *dev;
> + struct rte_eth_udp_tunnel *tunnel;
> +
> + if (port_id >= nb_ports) {
> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> + return -ENODEV;
> + }
> +
> + if (udp_tunnel == NULL) {
> + PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
> + return -EINVAL;
> + }
> + tunnel = udp_tunnel;
> +
> + for (i = 0; i < count; i++, tunnel++) {
> + if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
> + PMD_DEBUG_TRACE("Invalid tunnel type\n");
> + return -EINVAL;
> + }
> + }

I'm not sure it's a good idea to provide a count parameter to iterate in a loop.
It's probably something that the application should do by itself.

> +
> + dev = &rte_eth_devices[port_id];
> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_add, -ENOTSUP);
> + return (*dev->dev_ops->udp_tunnel_add)(dev, udp_tunnel, count);
> +}

[...]

> +/**
> + * Tunneled type.
> + */
> +enum rte_eth_tunnel_type {
> + RTE_TUNNEL_TYPE_NONE = 0,
> + RTE_TUNNEL_TYPE_VXLAN,
> + RTE_TUNNEL_TYPE_GENEVE,
> + RTE_TUNNEL_TYPE_TEREDO,
> + RTE_TUNNEL_TYPE_NVGRE,
> + RTE_TUNNEL_TYPE_MAX,
> +};

This is moved later from rte_ethdev.h to rte_eth_ctrl.h.
Please choose the right location in this patch.
By the way, I think ethdev is the right location because it's not directly
related to ctrl API, right?

>  struct rte_eth_conf {
> + enum rte_eth_tunnel_type tunnel_type;
>   uint16_t link_speed;
>   /**< ETH_LINK_SPEED_10[0|00|000], or 0 for autonegotation */
>   uint16_t link_duplex;

Please don't add this field as the first. It's more logical to start
port configuration with link speed, duplex, etc.
Then please add a comment to explain how it should be used.
But I doubt we should configure a tunnel type for a whole port.
There's something weird here.

> +/* VXLAN protocol header */

This comment should be doxygen'ed.

> +struct vxlan_hdr {
> + uint32_t vx_flags; /**< VxLAN flag. */
> + uint32_t vx_vni;   /**< VxLAN ID. */
> +} __attribute__((__packed__));
> +
[...]
>  #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
>  #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time Protocol. */
>  
> +#define ETHER_VXLAN_HLEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr))

Please add a doxygen comment.
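
As background for the header length macro, here is a small standalone sketch
of the VXLAN header layout: 8 bytes, with the 24-bit VNI in bytes 4..6,
following a fixed 8-byte UDP header. The names below are illustrative only
and are not the ones proposed in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UDP_HDR_LEN	8	/* fixed UDP header size */
#define VXLAN_HDR_LEN	8	/* flags word + VNI word */
#define VXLAN_HLEN	(UDP_HDR_LEN + VXLAN_HDR_LEN)
#define VXLAN_DEFAULT_PORT 4789	/* IANA-assigned destination UDP port */

/*
 * Extract the 24-bit VNI from the 8 raw bytes of a VXLAN header.
 * Bytes 4..6 carry the VNI in network byte order; byte 7 is reserved.
 */
static uint32_t
vxlan_vni(const uint8_t *hdr)
{
	assert(hdr != NULL);
	return ((uint32_t)hdr[4] << 16) | ((uint32_t)hdr[5] << 8) | hdr[6];
}
```

Reading the bytes one at a time avoids any dependency on host byte order or
on the alignment of the header inside the packet.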

-- 
Thomas

[dpdk-dev] [PATCH v4] KNI: use a memzone pool for KNI alloc/release

2014-10-21 Thread Marc Sune

Thomas,

v5: commit message arranged, all warnings from checkpatch.pl fixed except:

WARNING: Macros with flow control statements should be avoided
#104: FILE: lib/librte_kni/rte_kni.c:62:
+#define KNI_MEM_CHECK(cond) do { if (cond) goto kni_fail; } while (0)

a) This MACRO was there before, I just re-factored it to make it more 
readable.
b) There are 4 lines exceeding 80cols due to long quoted strings. I 
followed kernel convention not to split them in multiple lines.

Thanks and regards
Marc

On 21/10/14 10:29, Thomas Monjalon wrote:
> Hi Marc,
>
> 2014-10-18 00:51, Marc Sune:
>> This patch implements the KNI memzone pool in order to prevent
>> memzone exhaustion when allocating/deallocating KNI interfaces.
>>
>> It adds a new API call, rte_kni_init(max_kni_ifaces) that shall
>> be called before any call to rte_kni_alloc() if KNI is used.
>>
>> v2: Moved KNI fd opening to rte_kni_init(). Revised style.
>> v3: Adapted kni examples/tests to rte_kni_init().
>> v4: Improved example integration. Fixed kni_memzone_pool_alloc/release() bug.
>>
>> Signed-off-by: Marc Sune 
> Thanks for the good work with Helin.
> Before applying this patch, I'd like another version explaining in the
> commit log why this change is needed.
> And please use checkpatch.pl to check and remove whitespace errors.
>
> Thanks

[dpdk-dev] [PATCH v2] pmd: Add generic support for TCP TSO (Transmit Segmentation Offload)

2014-10-21 Thread miroslaw.walukiew...@intel.com

From: Miroslaw Walukiewicz 

The NICs supported by DPDK can accelerate TCP traffic by segmentation
offload. The application prepares a packet with a valid TCP header and
a size up to 64K, and the NIC segments the packet, generating valid
checksums and TCP segments.

The patch defines generic support for TSO.
- Add a new PKT_TX_TCP_SEG flag.
  Only packets with this flag set in ol_flags will be handled as
  TSO packets.

- Add new fields in the mbuf indicating the TCP TSO segment size and the
  TCP header length.
  TSO requires the application to set the following fields in the mbuf:
  1. L2 header len including MAC/VLANs/SNAP if present
  2. L3 header len including IP options
  3. L4 header len (new field) including TCP options
  4. tso_segsz (new field) the size of TCP segment

The application is obliged to compute the pseudo header checksum
instead of the full TCP checksum and put it in the TCP header csum field.
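
As an illustration of that requirement, the IPv4 pseudo header sum can be
computed as in the sketch below. This is a hedged example, not code from the
patch: the function name is made up, all values are in host byte order, and
whether the TCP length is included or left at 0 for TSO depends on the NIC
and should be checked against its datasheet.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One's-complement sum of the IPv4 pseudo header, folded to 16 bits and
 * NOT inverted. All arguments and the result are in host byte order.
 */
static uint16_t
ipv4_pseudo_hdr_sum(uint32_t src_ip, uint32_t dst_ip, uint8_t proto,
		    uint16_t l4_len)
{
	uint32_t sum;

	sum  = (src_ip >> 16) + (src_ip & 0xFFFF);
	sum += (dst_ip >> 16) + (dst_ip & 0xFFFF);
	sum += proto;		/* zero byte + protocol number */
	sum += l4_len;		/* TCP length; may have to be 0 for TSO */

	while (sum >> 16)	/* fold carries back into the low 16 bits */
		sum = (sum & 0xFFFF) + (sum >> 16);
	assert((sum >> 16) == 0);
	return (uint16_t)sum;
}
```

The result would still have to be converted to network byte order before
being written into the TCP checksum field.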

To hide the complexity of building the combined l2_l3_len field,
a new macro RTE_MBUF_TO_L2_L3_LEN() is defined to retrieve this
part of rte_mbuf.
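
For comparison, the packing below reproduces the layout of the bitfield that
this patch removes (l3_len:9, l2_len:7), assuming the usual little-endian
bitfield allocation where l3_len occupies the low bits. It is a sketch with a
hypothetical function name. Note that it shifts by 9, not 16: if the
destination is a 16-bit field, as the removed uint16_t l2_l3_len suggests, a
16-bit shift would not fit.

```c
#include <assert.h>
#include <stdint.h>

#define L3_LEN_BITS 9	/* width of the former l3_len bitfield */
#define L2_LEN_BITS 7	/* width of the former l2_len bitfield */

/*
 * Pack separate L2 and L3 header lengths into one 16-bit value using
 * the old layout: l3_len in bits 0..8, l2_len in bits 9..15.
 */
static uint16_t
pack_l2_l3_len(uint16_t l2_len, uint16_t l3_len)
{
	assert(l2_len < (1u << L2_LEN_BITS));
	assert(l3_len < (1u << L3_LEN_BITS));
	return (uint16_t)((l2_len << L3_LEN_BITS) | l3_len);
}
```

For a plain Ethernet header (14 bytes) and an IPv4 header without options
(20 bytes) this yields 0x1C14.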

Signed-off-by: Mirek Walukiewicz 
---
 app/test-pmd/testpmd.c|3 ++-
 lib/librte_mbuf/rte_mbuf.h|   27 +--
 lib/librte_pmd_e1000/igb_rxtx.c   |2 +-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c |2 +-
 4 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index f76406f..d8fd025 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -408,7 +408,8 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
mb->ol_flags = 0;
mb->data_off = RTE_PKTMBUF_HEADROOM;
mb->nb_segs  = 1;
-   mb->l2_l3_len   = 0;
+   mb->l2_len   = 0;
+   mb->l3_len   = 0;
mb->vlan_tci = 0;
mb->hash.rss = 0;
 }
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..2e2e315 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -114,6 +114,9 @@ extern "C" {
 /* Bit 51 - IEEE1588*/
 #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to timestamp. */

+/* Bit 49 - TCP transmit segmentation offload */
+#define PKT_TX_TCP_SEG (1ULL << 49) /**< TX TSO offload */
+ 
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */

@@ -189,16 +192,28 @@ struct rte_mbuf {
struct rte_mbuf *next;/**< Next segment of scattered packet. */

/* fields to support TX offloads */
-   union {
-   uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var */
+   /* two bytes - l2 len (including MAC/VLANs/SNAP if present)
+* two bytes - l3 len (including IP options)
+* two bytes - l4 len TCP/UDP header len - including TCP options
+* two bytes - TCP tso segment size
+*/
+   union{
+   uint64_t l2_l3_l4_tso_seg; /**< combined for easy fetch */
struct {
-   uint16_t l3_len:9;  /**< L3 (IP) Header Length. */
-   uint16_t l2_len:7;  /**< L2 (MAC) Header Length. */
+   uint16_t l3_len; /**< L3 (IP) Header */
+   uint16_t l2_len; /**< L2 (MAC) Header */
+   uint16_t l4_len; /**< TCP/UDP header len */
+   uint16_t tso_segsz; /**< TCP TSO segment size */
};
};
 } __rte_cache_aligned;

 /**
+ * Given the rte_mbuf returns the l2_l3_len combined
+ */
+#define RTE_MBUF_TO_L2_L3_LEN(mb) (uint32_t)(((mb)->l2_len << 16) | (mb)->l3_len)
+
+/**
  * Given the buf_addr returns the pointer to corresponding mbuf.
  */
 #define RTE_MBUF_FROM_BADDR(ba) (((struct rte_mbuf *)(ba)) - 1)
@@ -545,7 +560,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
 {
m->next = NULL;
m->pkt_len = 0;
-   m->l2_l3_len = 0;
+   m->l2_l3_l4_tso_seg = 0;
m->vlan_tci = 0;
m->nb_segs = 1;
m->port = 0xff;
@@ -613,7 +628,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
mi->data_len = md->data_len;
mi->port = md->port;
mi->vlan_tci = md->vlan_tci;
-   mi->l2_l3_len = md->l2_l3_len;
+   mi->l2_l3_l4_tso_seg = md->l2_l3_l4_tso_seg;
mi->hash = md->hash;

mi->next = NULL;
diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index f09c525..0f3248e 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -399,7 +399,7 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,

ol_flags = tx_pkt->ol_flags;
vlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;
-   vlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;
+   vlan_macip_lens.f.l2_l3_len = RTE_MBUF_TO_L2_L3_LEN(tx_pkt);
tx_ol_req = ol_flags & PKT_TX_OFFLOAD_MASK;

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-21 Thread Neil Horman

On Tue, Oct 21, 2014 at 10:43:03AM +, Richardson, Bruce wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > Sent: Tuesday, October 21, 2014 11:33 AM
> > To: Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > cycles/packet
> > 
> > > >
> > > > > + if (count == 0)
> > > > > + return -1;
> > > > > +
> > > > > + printf("%lu packet, %lu drop, %lu idle\n", count, drop, idle);
> > > > > + printf("Result: %ld cycles per packet\n", (cur_tsc - prev_tsc) / count);
> > > > > +
> > > > Bad math here.  There's no guarantee that the tsc hasn't wrapped (potentially
> > > > more than once) depending on your test length.  You need to check the tsc before
> > > > and after each burst and record an average of deltas instead, accounting in each
> > > > instance for the possibility of wrap.
> > > [Liang, Cunming] I'm not sure I caught your point correctly.
> > > I think both cur_tsc and prev_tsc are 64 bits wide.
> > > At 3GHz, I don't think it will wrap so quickly.
> > > As it's uint64_t, even if it wraps, the delta should still be correct.
> > But there's no guarantee that the tsc starts at zero when you begin your test.
> > The system may have been up for a long time and near wrapping already.
> > Regardless, you need to account for the possibility that cur_tsc is smaller
> > than prev_tsc, or this breaks.
> > 
> 
> The TSC is 64-bit and so only wraps around every couple of hundred years or
> so on a 2GHz machine, so I don't think it's necessary to handle that case.
> 
But that presumes that no one has written the TSC via IA32_TIME_STAMP_COUNTER.
Assuming that something will never wrap just seems like bad practice here.  We
should have a general purpose macro to handle wrapping counters like this, if
not for this case specifically, then in general.
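
One possible shape for such a general purpose helper is sketched below. It is
an assumption of what this could look like, not an existing DPDK API, and it
is written as a function rather than a macro so that its arguments are
evaluated only once. It handles counters of any width up to 64 bits, and is
correct as long as the counter wrapped at most once between the two samples.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Distance from prev to cur for a free-running counter that is 'bits'
 * wide (1..64). Masking the modular difference makes the result correct
 * even when cur is numerically smaller than prev after a wrap.
 */
static uint64_t
counter_delta(uint64_t cur, uint64_t prev, unsigned int bits)
{
	uint64_t mask;

	assert(bits >= 1 && bits <= 64);
	mask = (bits == 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
	return (cur - prev) & mask;
}
```

For example, a 32-bit counter sampled at 0xFFFFFFF0 and then at 0x10 has
advanced by 0x20.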

Neil

> /Bruce
>

[dpdk-dev] development/integration branch?

2014-10-21 Thread Neil Horman

On Tue, Oct 21, 2014 at 11:38:34AM +0200, Marc Sune wrote:
> Thomas,
> 
> On 21/10/14 11:28, Thomas Monjalon wrote:
> >2014-10-21 11:14, Marc Sune:
> >>On 21/10/14 10:46, Thomas Monjalon wrote:
> >>>My balance is different because I have a simpler solution for Marc's 
> >>>problem:
> >>>   git fetch && git merge $(git tag | grep -v -- -rc | tail -n1)
> >>We all know we _can_ do this. But is it really necessary? We should be
> >>all as lazy as possible and make it easy for users IMHO. `git pull` is
> >>easier :)
> >Yes and lazy users download tarballs.
> 
> At least for me, I stopped downloading DPDK tarballs after the third time I
> had to upgrade the release.
> >>I don't see any drawback of using a development branch, except if you
> >>consider the extra push to master per release a drawback.
> >No I don't care to push one more thing.
> >But I care about the message brought by such change. It would mean that
> >we can break the development branch and that most of developers don't test
> >it nor base their patches on the latest commit. It's all about simple rules
> >and messages.
> 
> I understand your concern but, isn't peer reviewing meant to prevent this?
> 
> >>Also think about new users downloading the repo for the first time. They
> >>are forced to do this right now if they want to checkout the latest stable.
> >New users will get the latest release and expect to see current work in
> >progress right after cloning the git tree (in master branch).
> >It's also more common to see work in progress in default branch in cgit.
> I know, but I also know other projects do the way I proposed with success.
> In any case it was just a suggestion to try to improve things.
> 
> marc
> 

Ideally, I think the best solution (and I've proposed this on the list several
times), is to create a release branch when you begin tagging -rc branches, and
use that branch for stabilization/testing prior to a release.  Only fixes are
allowed in such a branch, and can be merged with master post release-tagging.
That would allow master to continue patch integration undeterred.

Alternatively, doing like Linus does is also a fine idea, announce a merge
window during which features are integrated, and after which new features are
disallowed during the pre-release stabilization period.  Doing so however
requires a high degree of commitment to not make exceptions.  If that is a
concern, then a release branch is the safer approach, as it separates fixes from
other patches.

Neil

[dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of VxLAN filter

2014-10-21 Thread Thomas Monjalon

2014-10-21 16:46, Jijiang Liu:
> +#define RTE_TUNNEL_FILTER_TO_QUEUE 1 /**< point to a queue by filter type */

Sorry, I don't understand what this value is for.

> +#define RTE_TUNNEL_FILTER_IMAC_IVLAN (ETH_TUNNEL_FILTER_IMAC | \
> + ETH_TUNNEL_FILTER_IVLAN)
> +#define RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID (ETH_TUNNEL_FILTER_IMAC | \
> + ETH_TUNNEL_FILTER_IVLAN | \
> + ETH_TUNNEL_FILTER_TENID)
> +#define RTE_TUNNEL_FILTER_IMAC_TENID (ETH_TUNNEL_FILTER_IMAC | \
> + ETH_TUNNEL_FILTER_TENID)
> +#define RTE_TUNNEL_FILTER_OMAC_TENID_IMAC (ETH_TUNNEL_FILTER_OMAC | \
> + ETH_TUNNEL_FILTER_TENID | \
> + ETH_TUNNEL_FILTER_IMAC)

I thought you agreed that these definitions are useless?

-- 
Thomas

[dpdk-dev] nic loopback

2014-10-21 Thread Alex Markuze

How can I set/query this bit (LLE(PFVMTXSW[n]), intel 82599 ) on ESX, or
any other friendlier environment like Linux?

On Tue, Oct 21, 2014 at 4:18 AM, Liang, Cunming 
wrote:

>
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Markuze
> > Sent: Tuesday, October 21, 2014 12:24 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] nic loopback
> >
> > Hi,
> > I'm trying to send packets from an application to it self, meaning smac
> ==
> > dmac.
> > I'm working with intel 82599 virtual function. But it seems that these
> > packets are lost.
> >
> > Is there a software/hw limitation I'm missing here (some additional
> > anti-spoofing)? AFAIK modern NICs with sriov are mini switches so the hw
> > loopback should work, at least thats the theory.
> >
> [Liang, Cunming] You could have a check on register LLE(PFVMTXSW[n]).
> Which allow an individual pool to be able to send traffic and have it
> loopback to itself.
> >
> > Thanks.
>

[dpdk-dev] nic loopback

2014-10-21 Thread Thomas Monjalon

2014-10-20 19:24, Alex Markuze:
> I'm trying to send packets from an application to it self, meaning smac  ==
> dmac.
> I'm working with intel 82599 virtual function. But it seems that these
> packets are lost.
> 
> Is there a software/hw limitation I'm missing here (some additional
> anti-spoofing)? AFAIK modern NICs with sriov are mini switches so the hw
> loopback should work, at least thats the theory.

I think you should look at these commits:

ixgbe: add Tx->Rx loopback mode for 82599
http://dpdk.org/browse/dpdk/commit/?id=db035925617
app/testpmd: add loopback topology
http://dpdk.org/browse/dpdk/commit/?id=3e2006d6186

-- 
Thomas

[dpdk-dev] [PATCH v5] KNI: use a memzone pool for KNI alloc/release

2014-10-21 Thread Thomas Monjalon

2014-10-21 12:46, Marc Sune:
> The previous implementation of rte_kni_alloc() was allocating memzones with a
> name composed of a fixed string and the interface name. When an application 
> was
> allocating and deallocating multiple interfaces with different names, memzones
> were quickly exhausted, even though memzones from deallocated interfaces were
> never used anymore (unless an interface with the same name was re-allocated).
> As a result, the application was unable to allocate more KNI interfaces with
> different names.
> 
> This patch implements the KNI memzone pool in order to prevent memzone
> exhaustion when allocating/deallocating KNI interfaces. It adds a new API 
> call,
> rte_kni_init(max_kni_ifaces) that shall be called before any call to
> rte_kni_alloc() if KNI is used. The memzones are pre-allocated with interface-
> independent names so that they can be reused.
> 
> Signed-off-by: Marc Sune 

Previously acked by Helin.

Applied

Thanks
-- 
Thomas

[dpdk-dev] nic loopback

2014-10-21 Thread Alex Markuze

Thanks Thomas,
unfortunately these patches are only valid for a PF*. This is also evident from
the ixgbe pmd code which is the only one looking at this bit (lpbk_mode).
The ixgbevf functions are agnostic to this capability.

*
http://www.intel.com/content/dam/doc/design-guide/82599-sr-iov-driver-companion-guide.pdf

On Tue, Oct 21, 2014 at 6:32 PM, Thomas Monjalon 
wrote:

> 2014-10-20 19:24, Alex Markuze:
> > I'm trying to send packets from an application to it self, meaning smac
> ==
> > dmac.
> > I'm working with intel 82599 virtual function. But it seems that these
> > packets are lost.
> >
> > Is there a software/hw limitation I'm missing here (some additional
> > anti-spoofing)? AFAIK modern NICs with sriov are mini switches so the hw
> > loopback should work, at least thats the theory.
>
> I think you should look at these commits:
>
> ixgbe: add Tx->Rx loopback mode for 82599
> http://dpdk.org/browse/dpdk/commit/?id=db035925617
> app/testpmd: add loopback topology
> http://dpdk.org/browse/dpdk/commit/?id=3e2006d6186
>
> --
> Thomas
>

[dpdk-dev] Why do we need iommu=pt?

2014-10-21 Thread Shivapriya Hiremath

Hi,

Thank you for all the replies.
I am trying to understand the impact of this on DPDK. What will be the
repercussions of disabling "iommu=pt" on the DPDK performance?


On Tue, Oct 21, 2014 at 12:32 AM, Alex Markuze  wrote:

> DPDK uses a 1:1 mapping and doesn't support IOMMU.  IOMMU allows for
> simpler VM physical address translation.
> The second role of IOMMU is to allow protection from unwanted memory
> access by an unsafe device that has DMA privileges. Unfortunately this
> protection comes with an extremely high performance cost for high speed
> NICs.
>
> To your question, iommu=pt disables IOMMU support for the hypervisor.
>
> On Tue, Oct 21, 2014 at 1:39 AM, Xie, Huawei  wrote:
>
>>
>>
>> > -Original Message-
>> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shivapriya
>> Hiremath
>> > Sent: Monday, October 20, 2014 2:59 PM
>> > To: dev at dpdk.org
>> > Subject: [dpdk-dev] Why do we need iommu=pt?
>> >
>> > Hi,
>> >
>> > My question is that if the Poll mode  driver used the DMA kernel
>> interface
>> > to set up its mappings appropriately, would it still require that
>> iommu=pt
>> > be set?
>> > What is the purpose of setting iommu=pt ?
>> PMD allocates memory through the hugetlb file system, and fills the physical
>> address into the descriptor.
>> pt is used to pass through iotlb translation. Refer to the below link.
>> http://lkml.iu.edu/hypermail/linux/kernel/0906.2/02129.html
>> >
>> > Thank you.
>>
>
>

[dpdk-dev] [PATCH v6 1/9] librte_mbuf:the rte_mbuf structure changes

2014-10-21 Thread Liu, Jijiang



> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, October 21, 2014 6:26 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 1/9] librte_mbuf:the rte_mbuf structure
> changes
> 
> Hi Jijiang,
> 
> 2014-10-21 16:46, Jijiang Liu:
> > Remove the "reserved2" field and add the "packet_type"
> 
> "Remove and add" can be said "Replace".
> 
> > and the "inner_l2_l3_len" fields in the rte_mbuf structure.
> 
> Please explain that you are using 2 bytes of the second cache line for TX
> offloading of tunnels.
> 
> > /* remaining bytes are set on RX when pulling packet from descriptor
> */
> > MARKER rx_descriptor_fields1;
> > -   uint16_t reserved2;   /**< Unused field. Required for padding */
> > +
> > +   /**
> > +* Packet type, which is used to indicate ordinary L2 packet format
> and
> > +* also tunneled packet format such as IP in IP, IP in GRE, MAC in GRE
> > +* and MAC in UDP.
> > +*/
> > +   uint16_t packet_type;
> 
> Why not name it "l2_type"?

In the datasheet, this term is called packet type(s).
Personally, I think "packet type" makes the meaning of this field clearer. ^_^

> --
> Thomas

[dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud Images

2014-10-21 Thread Gonzalez Monroy, Sergio

Hi Matthew,

> -Original Message-
> From: Matthew Hall [mailto:mhall at mhcomputing.net]
> Sent: Friday, October 17, 2014 9:57 AM
> To: Gonzalez Monroy, Sergio
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud
> Images
> 
[...]
> The virtio non UIO PMD doesn't depend on DPDK even if everything's built as
> a .so . To me this seems buggy but maybe I missed something:
> 
> $ ldd librte_pmd_virtio.so
> linux-vdso.so.1 =>  (0x7fff7adfe000)
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7fa0d810c000)
> /lib64/ld-linux-x86-64.so.2 (0x7fa0d86ed000)
> 
You are right about the dependency (probably I did not explain the issue 
properly).

As you point out below, when building static DPDK we should not expect ldd to 
report any DPDK dependency.
When building shared DPDK libs, we should expect such a dependency, except for the
fact that we are not linking against DPDK libraries when building
librte_pmd_virtio.so, which as you mention is buggy.
The problem is not exclusive of librte_pmd_virtio.so module as you can easily 
check by doing ldd on DPDK shared libraries.
One way to fix the buggy behavior with librte_pmd_virtio would be to build 
against dpdk libs when dpdk is built as shared libs.

The issue is that we have two different ways of building DPDK, static or shared 
libs, and the building process differs.
Let's use librte_pmd_virtio shared object building process as an example:
- we do not want to build against static DPDK libraries as this would result in 
duplicated code in librte_pmd_virtio and other apps (ie. testpmd)
- we want to link against shared DPDK libs to add dependencies and provide 
reliable information (ie. ldd)

Currently librte_pmd_virtio is built without linking against DPDK libs,
therefore no dependencies are included even when DPDK is built as shared libs.
With this way of building the shared object, the module will be loaded 
successfully on apps built with static/shared DPDK libs as long as all undefined
symbols in librte_pmd_virtio are defined in either the app (for static) or any 
of the loaded shared libs (for shared).

> Now, about the problems...
> 
I have not been able to reproduce these problems.
My setup was QEMU, dpdk-1.7.1, virtio-net-pmd, fedora 20 (both host and guest), 
GCC/CLANG.
I have successfully loaded the module running testpmd with static/shared DPDK 
libs using GCC and CLANG.

[...]
> 2) The second problem concerns compiling against DPDK 1.7.1 (plus my minor
> clang compilation fixes) with these settings:
> 
> # Compile to share library
> CONFIG_RTE_BUILD_SHARED_LIB=n
> # Combine to one single library
> CONFIG_RTE_BUILD_COMBINE_LIBS=y
> CONFIG_RTE_LIBNAME="intel_dpdk"
> 
> Then virtio-net-pmd is compiled using make
> RTE_INCLUDE=../dpdk/build/include .
> Just as printed above, no runtime dependency on DPDK is there (as
> expected in the static case, but not what I believe should be expected in the
> dynamic case).
> 
> From there, the app is launched. Then the following error appears, and no
> Ethernet ports can be detected since the builtin UIO PMD driver can't see the
> ports either, so the app exits.
> 
> EAL: open shared lib /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so
> EAL: /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so: undefined
> symbol: per_lcore__lcore_id
> 
Are we talking about a DPDK or custom app?
Do you only see the issue when CONFIG_RTE_BUILD_COMBINE_LIBS=y?

> Running nm and nm -D shows this:
> 
> $ nm librte_pmd_virtio.so | fgrep -i per_lcore__lcore_id
>          U per_lcore__lcore_id
> 
This is expected behavior as the symbol is defined in librte_eal.
The dynamic linker will resolve the undefined reference when loading the module 
at run time.

Thanks,
Sergio

[dpdk-dev] nic loopback

2014-10-21 Thread Thomas Monjalon

21/10/2014 18:54, Alex Markuze :,
> unfortunately these patches are only valid a pf*. This is also evident from
> the ixgbe pmd code which is the only one looking at this bit (lpbk_mode).
> The ixgbevf functions are agnostic to this capability.

And did you try testpmd with PORT_TOPOLOGY_LOOP (2nd link)?
It could help to understand what's going wrong.

> > ixgbe: add Tx->Rx loopback mode for 82599
> > 
> > http://dpdk.org/browse/dpdk/commit/?id=db035925617
> > 
> > app/testpmd: add loopback topology
> > 
> > http://dpdk.org/browse/dpdk/commit/?id=3e2006d6186

Please Alex, do not top post.
-- 
Thomas

[dpdk-dev] [PATCH v2 03/13] ethdev: add more annotation

2014-10-21 Thread Thomas Monjalon

2014-09-25 16:40, Helin Zhang:
> Add more annotation, to clearly tell the 'rte_eth_dev_info_get()'
> users that the buffer should be cleared first.

Since commit http://dpdk.org/browse/dpdk/commit/?id=a30268e9a2
(ethdev: reset whole dev info structure before filling),
this patch is now useless.

-- 
Thomas

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-21 Thread Liang, Cunming



> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Tuesday, October 21, 2014 6:33 PM
> To: Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> cycles/packet
> 
> On Sun, Oct 12, 2014 at 11:10:39AM +, Liang, Cunming wrote:
> > Hi Neil,
> >
> > Very appreciate your comments.
> > I add inline reply, will send v3 asap when we get alignment.
> >
> > BRs,
> > Liang Cunming
> >
> > > -Original Message-
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > Sent: Saturday, October 11, 2014 1:52 AM
> > > To: Liang, Cunming
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> cycles/packet
> > >
> > > On Fri, Oct 10, 2014 at 08:29:58PM +0800, Cunming Liang wrote:
> > > > It provides unit test to measure cycles/packet in NIC loopback mode.
> > > > It simply gives the average cycles of IO used per packet without test
> equipment.
> > > > When doing the test, make sure the link is UP.
> > > >
> > > > Usage Example:
> > > > 1. Run unit test app in interactive mode
> > > > app/test -c f -n 4 -- -i
> > > > 2. Run and wait for the result
> > > > pmd_perf_autotest
> > > >
> > > > There's option to choose rx/tx pair, default is vector.
> > > > set_rxtx_mode [vector|scalar|full|hybrid]
> > > > Note: To get acurate scalar fast, please choose 'vector' or 'hybrid' 
> > > > without
> > > INC_VEC=y in config
> > > >
> > > > Signed-off-by: Cunming Liang 
> > > > Acked-by: Bruce Richardson 
> > >
> > > Notes inline
> > >
> > > > ---
> > > >  app/test/Makefile   |1 +
> > > >  app/test/commands.c |   38 +++
> > > >  app/test/packet_burst_generator.c   |4 +-
> > > >  app/test/test.h |4 +
> > > >  app/test/test_pmd_perf.c|  626
> > > +++
> > > >  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
> > > >  6 files changed, 677 insertions(+), 2 deletions(-)
> > > >  create mode 100644 app/test/test_pmd_perf.c
> > > >
> > > > diff --git a/app/test/Makefile b/app/test/Makefile
> > > > index 6af6d76..ebfa0ba 100644
> > > > --- a/app/test/Makefile
> > > > +++ b/app/test/Makefile
> > > > @@ -56,6 +56,7 @@ SRCS-y += test_memzone.c
> > > >
> > > >  SRCS-y += test_ring.c
> > > >  SRCS-y += test_ring_perf.c
> > > > +SRCS-y += test_pmd_perf.c
> > > >
> > > >  ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
> > > >  SRCS-y += test_table.c
> > > > diff --git a/app/test/commands.c b/app/test/commands.c
> > > > index a9e36b1..f1e746e 100644
> > > > --- a/app/test/commands.c
> > > > +++ b/app/test/commands.c
> > > > @@ -310,12 +310,50 @@ cmdline_parse_inst_t cmd_quit = {
> > > >
> > > > +#define NB_ETHPORTS_USED(1)
> > > > +#define NB_SOCKETS  (2)
> > > > +#define MEMPOOL_CACHE_SIZE 250
> > > > +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) +
> > > RTE_PKTMBUF_HEADROOM)
> > > Don't you want to size this in accordance with the amount of data your
> sending
> > > (64 Bytes as noted above)?
> > [Liang, Cunming] The case is designed to measure small packet IO cost with
> normal mbuf size.
> > Even if decreasing the size, it won't gain significant cycles.
> > >
> That presumes a non-memory constrained system, doesn't it?  I suppose in the
> end
> as long as you have consistency, its not overly relevant, but it seems like
> you'll want to add data sizing as a feature to this eventually (i.e. the 
> ability
> to test performance for larger frames sizes), at which point you'll need to 
> make
> this non-static anyway.
[Liang, Cunming] For a normal Ethernet packet (without jumbo frames), the maximum 
packet size is 1518B.
In real networks, there won't be a huge number of jumbo frames.
An mbuf size of 2048 is a reasonable value that covers most packet sizes.
It is also chosen by lots of NICs as the default receive buffer size in the DMA 
register.
Packets larger than that size need scatter-gather, which loses some 
performance.
The unit test won't measure sizes from 64 to 9600, and there is no plan to measure 
scatter-gather rx/tx.
It focuses on the 64B packet size and uses the most commonly used mbuf size.
> 
> > > > +static void
> > > > +print_ethaddr(const char *name, const struct ether_addr *eth_addr)
> > > > +{
> > > > +   printf("%s%02X:%02X:%02X:%02X:%02X:%02X", name,
> > > > +   eth_addr->addr_bytes[0],
> > > > +   eth_addr->addr_bytes[1],
> > > > +   eth_addr->addr_bytes[2],
> > > > +   eth_addr->addr_bytes[3],
> > > > +   eth_addr->addr_bytes[4],
> > > > +   eth_addr->addr_bytes[5]);
> > > > +}
> > > > +
> > > This was copied from print_ethaddr.  Seems like a good candidate for a
> common
> > > function in rte_ether.h
> > [Liang, Cunming] Agree with you, some of samples now use it with the same
> copy.
> > I'll rework it. Adding 'ether_format_addr' in rte_ether.h only for format
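
A minimal sketch of what such a shared formatting helper could look like, so
that callers print the result themselves instead of copying print_ethaddr().
The names `mac_addr_sketch` and `format_mac_addr_sketch` are hypothetical
stand-ins for illustration; they are not the definitions that ended up in
rte_ether.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct ether_addr: six address bytes. */
struct mac_addr_sketch {
	uint8_t addr_bytes[6];
};

/*
 * Format a MAC address as "XX:XX:XX:XX:XX:XX" into buf.
 * Returns the number of characters written (17), or -1 if buf is too small.
 */
static int
format_mac_addr_sketch(char *buf, size_t size,
		const struct mac_addr_sketch *addr)
{
	int n;

	assert(buf != NULL && addr != NULL);

	n = snprintf(buf, size, "%02X:%02X:%02X:%02X:%02X:%02X",
		addr->addr_bytes[0], addr->addr_bytes[1],
		addr->addr_bytes[2], addr->addr_bytes[3],
		addr->addr_bytes[4], addr->addr_bytes[5]);

	/* snprintf reports the length it wanted; treat truncation as an error. */
	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}
```

Keeping the helper to formatting only (no printf inside) lets each sample
choose its own prefix and output stream.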

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-21 Thread Liu, Jijiang



> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, October 21, 2014 6:51 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet
> identification API in librte_ether
> 
> 2014-10-21 16:46, Jijiang Liu:
> > There are "some" destination UDP port numbers that have unique meaning.
> > In terms of VxLAN, "IANA has assigned the value 4789 for the VXLAN UDP
> > port, and this value SHOULD be used by default as the destination UDP
> > port. Some early implementations of VXLAN have used other values for
> > the destination port. To enable interoperability with these
> implementations, the destination port SHOULD be configurable."
> >
> > Add two APIs in librte_ether for supporting UDP tunneling port
> configuration on i40e.
> > Currently, only VxLAN is implemented in this patch set.
> 
> Actually, there are 2 different things in this patch
> - new tunnelling API
> - VXLAN macros
> Please split in 2 patches.
Ok
> >  int
> > +rte_eth_dev_udp_tunnel_add(uint8_t port_id,
> > +  struct rte_eth_udp_tunnel *udp_tunnel,
> > +  uint8_t count)
> > +{
> > +   uint8_t i;
> > +   struct rte_eth_dev *dev;
> > +   struct rte_eth_udp_tunnel *tunnel;
> > +
> > +   if (port_id >= nb_ports) {
> > +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +   return -ENODEV;
> > +   }
> > +
> > +   if (udp_tunnel == NULL) {
> > +   PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
> > +   return -EINVAL;
> > +   }
> > +   tunnel = udp_tunnel;
> > +
> > +   for (i = 0; i < count; i++, tunnel++) {
> > +   if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
> > +   PMD_DEBUG_TRACE("Invalid tunnel type\n");
> > +   return -EINVAL;
> > +   }
> > +   }
> 
> I'm not sure it's a good idea to provide a count parameter to iterate in a
> loop.
> It's probably something that the application should do by itself.

It is necessary to check if this prot_type(tunnel type) is valid here in case 
applications don't do that. 

> > +
> > +   dev = &rte_eth_devices[port_id];
> > +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_add, -ENOTSUP);
> > +   return (*dev->dev_ops->udp_tunnel_add)(dev, udp_tunnel, count);
> > +}
> 
> [...]
> 
> > +/**
> > + * Tunneled type.
> > + */
> > +enum rte_eth_tunnel_type {
> > +   RTE_TUNNEL_TYPE_NONE = 0,
> > +   RTE_TUNNEL_TYPE_VXLAN,
> > +   RTE_TUNNEL_TYPE_GENEVE,
> > +   RTE_TUNNEL_TYPE_TEREDO,
> > +   RTE_TUNNEL_TYPE_NVGRE,
> > +   RTE_TUNNEL_TYPE_MAX,
> > +};
> 
> This is moved later from rte_ethdev.h to rte_eth_ctrl.h.
> Please choose where is the right location in this patch.
> By the way, I think ethdev is the right location because it's not directly
> related to ctrl API, right?

It is used in both the rte_ethdev.h and rte_eth_ctrl.h files.
In the rte_eth_ctrl.h file, it is used in the rte_eth_tunnel_filter_conf structure.
+struct rte_eth_tunnel_filter_conf {
+   ...
+   enum rte_eth_tunnel_type tunnel_type; /**< Tunnel Type. */
+   ...
+};

I will put the rte_eth_tunnel_type definition in the rte_eth_ctrl.h file at the 
beginning of definition.


> >  struct rte_eth_conf {
> > +   enum rte_eth_tunnel_type tunnel_type;
> > uint16_t link_speed;
> > /**< ETH_LINK_SPEED_10[0|00|000], or 0 for autonegotation */
> > uint16_t link_duplex;
> 
> Please don't add this field as the first. It's more logical to start port
> configuration with link speed, duplex, etc.

ok
> Then please add a comment to explain how it should be used.
I will add more comment for it.
> But I doubt we should configure a tunnel type for a whole port.
Yes, your understanding is correct. It is for a whole port/PF, that's why we 
should add 
tunnel_type in  rte_eth_conf structure.

> There's something weird here.


> > +/* VXLAN protocol header */
> 
> This comment should be doxygen'ed.
> 
> > +struct vxlan_hdr {
> > +   uint32_t vx_flags; /**< VxLAN flag. */
> > +   uint32_t vx_vni;   /**< VxLAN ID. */
> > +} __attribute__((__packed__));
> > +
> [...]
> >  #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
> >  #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time Protocol. */
> >
> > +#define ETHER_VXLAN_HLEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr))
> 
> Please add a doxygen comment.
ok
> --
> Thomas
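
For reference, a small sketch of what the ETHER_VXLAN_HLEN macro evaluates to
and how the 24-bit VNI sits inside vx_vni. The `_sketch` types are local
copies for illustration; the UDP header field names are assumptions, and the
VNI position (upper 24 bits of the second 32-bit word) follows the VXLAN
header layout as I understand it from RFC 7348.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local copy of a UDP header: four 16-bit fields, 8 bytes. */
struct udp_hdr_sketch {
	uint16_t src_port;
	uint16_t dst_port;
	uint16_t dgram_len;
	uint16_t dgram_cksum;
} __attribute__((__packed__));

/* Local copy of the VXLAN header quoted above, 8 bytes. */
struct vxlan_hdr_sketch {
	uint32_t vx_flags; /* flags (8 bits) + reserved (24 bits) */
	uint32_t vx_vni;   /* VNI (24 bits) + reserved (8 bits) */
} __attribute__((__packed__));

/* Same shape as ETHER_VXLAN_HLEN: outer UDP header plus VXLAN header. */
#define VXLAN_HLEN_SKETCH \
	(sizeof(struct udp_hdr_sketch) + sizeof(struct vxlan_hdr_sketch))

/* Sanity check that packing gives the on-wire sizes. */
static void
check_vxlan_layout_sketch(void)
{
	assert(sizeof(struct udp_hdr_sketch) == 8);
	assert(sizeof(struct vxlan_hdr_sketch) == 8);
}

/*
 * Extract the VNI from the second header word.
 * The argument must already be converted to host byte order.
 */
static uint32_t
vxlan_vni_from_host_word_sketch(uint32_t vni_word)
{
	return vni_word >> 8;
}
```

With these definitions the combined header length is 16 bytes, which is the
offset from the start of the outer UDP header to the inner Ethernet frame.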

[dpdk-dev] [PATCH] librte_ip_frag: Disable ipv4/v6 fragmentation if RTE_MBUF_REFCNT=n

2014-10-21 Thread Pablo de Lara

IPv4/v6 fragmentation libraries depend on refcnt.
There was a compilation error if RTE_MBUF_REFCNT was disabled,
so those libraries have been disabled in that situation.

Signed-off-by: Pablo de Lara 
---
 lib/librte_ip_frag/Makefile  |4 +++-
 lib/librte_ip_frag/rte_ip_frag.h |5 -
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ip_frag/Makefile b/lib/librte_ip_frag/Makefile
index 2265c93..9043543 100644
--- a/lib/librte_ip_frag/Makefile
+++ b/lib/librte_ip_frag/Makefile
@@ -38,9 +38,11 @@ CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)

 #source files
+ifeq ($(CONFIG_RTE_MBUF_REFCNT),y)
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
-SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_fragmentation.c
+endif
+SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_reassembly.c
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ip_frag_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += ip_frag_internal.c
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index e0936dc..230a903 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -175,6 +175,7 @@ rte_ip_frag_table_destroy( struct rte_ip_frag_tbl *tbl)
rte_free(tbl);
 }

+#ifdef RTE_MBUF_REFCNT
 /**
  * This function implements the fragmentation of IPv6 packets.
  *
@@ -203,7 +204,7 @@ rte_ipv6_fragment_packet(struct rte_mbuf *pkt_in,
uint16_t mtu_size,
struct rte_mempool *pool_direct,
struct rte_mempool *pool_indirect);
-
+#endif

 /*
  * This function implements reassembly of fragmented IPv6 packets.
@@ -252,6 +253,7 @@ rte_ipv6_frag_get_ipv6_fragment_header(struct ipv6_hdr *hdr)
return NULL;
 }

+#ifdef RTE_MBUF_REFCNT
 /**
  * IPv4 fragmentation.
  *
@@ -280,6 +282,7 @@ int32_t rte_ipv4_fragment_packet(struct rte_mbuf *pkt_in,
uint16_t nb_pkts_out, uint16_t mtu_size,
struct rte_mempool *pool_direct,
struct rte_mempool *pool_indirect);
+#endif

 /*
  * This function implements reassembly of fragmented IPv4 packets.
-- 
1.7.4.1

[dpdk-dev] [PATCH v2 04/13] ethdev: support of multiple sizes of redirection table

2014-10-21 Thread Thomas Monjalon

2014-09-25 16:40, Helin Zhang:
> To support possible different sizes of redirection table,
> structures and functions need to be redefined. In detail,
> * 'struct rte_eth_rss_reta' has been redefined.
> * 'uint16_t reta_size' has been added into
>   'struct rte_eth_dev_info'.
> * Updating/querying reta have been reimplemented with one
>   more parameter of redirection table size.
> 
> v2 changes:
> * Put changes for supporting multiple sizes of reta in
>   ethdev into a single patch.

In order to allow usage of git bisect, compilation must not be broken,
even inside a patchset.
So when refactoring an existing API, you must adapt the dependent code
in the same patch.
To make things easy to review, please try to change API incrementally
with good explanation of why each change is needed.

>  /* Definitions used for redirection table entry size */
> -#define ETH_RSS_RETA_NUM_ENTRIES 128
> -#define ETH_RSS_RETA_MAX_QUEUE   16
> +#define ETH_RSS_RETA_SIZE_64  64
> +#define ETH_RSS_RETA_SIZE_128 128
> +#define ETH_RSS_RETA_SIZE_512 512
> +
> +#define RTE_BIT_WIDTH_64 (CHAR_BIT * sizeof(uint64_t))

Are these constants really needed?

>  /**
> - * A structure used to configure Redirection Table of  the Receive Side
> - * Scaling (RSS) feature of an Ethernet port.
> + * A structure used to configure 64 entries of Redirection Table of the
> + * Receive Side Scaling (RSS) feature of an Ethernet port. To configure
> + * more than 64 entries supported by hardware, an array of this structure
> + * is needed.
>   */

Explaining the array of 64 entries could be useful in commit log.
Please don't forget to answer the "why" question in commit logs.

Thanks
-- 
Thomas
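
To illustrate the "array of 64-entry structures" idea discussed above, here is
a minimal sketch of how an application might address entry N of a larger
redirection table. The struct layout (a 64-bit update mask plus 64 queue
slots) and the names `reta_group_sketch` and `reta_set_sketch` are assumptions
made for illustration, since the patch body is not quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RETA_GROUP_SIZE_SKETCH 64

/* Hypothetical group of 64 redirection table entries. */
struct reta_group_sketch {
	uint64_t mask;                          /* bit i set: reta[i] is valid */
	uint8_t reta[RETA_GROUP_SIZE_SKETCH];   /* queue index per entry */
};

/*
 * Mark table entry 'entry' to be redirected to 'queue'.
 * 'conf' must hold reta_size / 64 groups (e.g. 2 for 128, 8 for 512).
 * Returns 0 on success, -1 if the entry is outside the table.
 */
static int
reta_set_sketch(struct reta_group_sketch *conf, uint16_t reta_size,
		uint16_t entry, uint8_t queue)
{
	uint16_t idx, shift;

	assert(conf != NULL);

	if (entry >= reta_size)
		return -1;

	idx = entry / RETA_GROUP_SIZE_SKETCH;   /* which group */
	shift = entry % RETA_GROUP_SIZE_SKETCH; /* which slot in the group */

	conf[idx].mask |= UINT64_C(1) << shift;
	conf[idx].reta[shift] = queue;
	return 0;
}
```

A table of 64, 128 or 512 entries then needs 1, 2 or 8 groups, and the table
size reported by the device tells the application how many to allocate.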

[dpdk-dev] [PATCH] librte_ip_frag: Disable ipv4/v6 fragmentation if RTE_MBUF_REFCNT=n

2014-10-21 Thread Thomas Monjalon

2014-10-21 15:15, Pablo de Lara:
> Ipv4/v6 fragmentation libraries depends on refcnt.
> There was a compilation error if RTE_MBUF_REFCNT was disabled,
> so those libraries have been disabled in that situation.

Please Pablo, could you add a short justification that it's not
possible to implement fragmentation without refcnt (at least with
the current design)?

What do you think of adding a warning as below?

> +ifeq ($(CONFIG_RTE_MBUF_REFCNT),y)
>  SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
> -SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
>  SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_fragmentation.c
+else
+$(info WARNING: Fragmentation feature is disabled because it needs MBUF_REFCNT.)
> +endif
> +SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c

-- 
Thomas

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-21 Thread Thomas Monjalon

2014-10-21 13:48, Liu, Jijiang:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2014-10-21 16:46, Jijiang Liu:
> > >  int
> > > +rte_eth_dev_udp_tunnel_add(uint8_t port_id,
> > > +struct rte_eth_udp_tunnel *udp_tunnel,
> > > +uint8_t count)
> > > +{
> > > + uint8_t i;
> > > + struct rte_eth_dev *dev;
> > > + struct rte_eth_udp_tunnel *tunnel;
> > > +
> > > + if (port_id >= nb_ports) {
> > > + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > + return -ENODEV;
> > > + }
> > > +
> > > + if (udp_tunnel == NULL) {
> > > + PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
> > > + return -EINVAL;
> > > + }
> > > + tunnel = udp_tunnel;
> > > +
> > > + for (i = 0; i < count; i++, tunnel++) {
> > > + if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
> > > + PMD_DEBUG_TRACE("Invalid tunnel type\n");
> > > + return -EINVAL;
> > > + }
> > > + }
> > 
> > I'm not sure it's a good idea to provide a count parameter to iterate in a
> > loop.
> > It's probably something that the application should do by itself.
> 
> It is necessary to check if this prot_type(tunnel type) is valid here in case
> applications don't do that. 

Yes, you have to check prot_type but looping for several tunnels is not needed
at this level.

> > But I doubt we should configure a tunnel type for a whole port.
> 
> Yes, your understanding is correct. It is for a whole port/PF, that's why we
> should add tunnel_type in rte_eth_conf structure.

Please explain to me why a tunnel type should be associated with a port.
This design looks really broken.

Thanks
-- 
Thomas
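
A sketch of the split suggested above: the library-level call validates one
tunnel entry, and any iteration over several tunnels is left to the
application. All names ending in `_sketch` are hypothetical, and the
`udp_port` field is an assumption; only `prot_type` and the enum values
appear in the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the enum quoted in the patch. */
enum tunnel_type_sketch {
	TUNNEL_TYPE_NONE_SKETCH = 0,
	TUNNEL_TYPE_VXLAN_SKETCH,
	TUNNEL_TYPE_GENEVE_SKETCH,
	TUNNEL_TYPE_TEREDO_SKETCH,
	TUNNEL_TYPE_NVGRE_SKETCH,
	TUNNEL_TYPE_MAX_SKETCH,
};

/* Hypothetical UDP tunnel description. */
struct udp_tunnel_sketch {
	uint16_t udp_port;
	uint8_t prot_type;
};

/* Library side: validate exactly one tunnel, no count parameter. */
static int
udp_tunnel_validate_sketch(const struct udp_tunnel_sketch *tunnel)
{
	if (tunnel == NULL)
		return -EINVAL;
	if (tunnel->prot_type >= TUNNEL_TYPE_MAX_SKETCH)
		return -EINVAL;
	return 0;
}

/* Application side: loop over its own array and stop at the first error. */
static int
app_add_tunnels_sketch(const struct udp_tunnel_sketch *tunnels, size_t count)
{
	size_t i;
	int ret;

	assert(tunnels != NULL || count == 0);

	for (i = 0; i < count; i++) {
		ret = udp_tunnel_validate_sketch(&tunnels[i]);
		if (ret != 0)
			return ret;
		/* The per-tunnel driver call would go here. */
	}
	return 0;
}
```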

[dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd

2014-10-21 Thread Doherty, Declan

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, October 21, 2014 9:36 AM
> To: Ouyang, Changchun; De Lara Guarch, Pablo
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] ixgbe: Fix compilation issue in vpmd
> 
> 2014-10-21 08:28, Ouyang, Changchun:
> > From: De Lara Guarch, Pablo
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > > > 2014-10-21 14:59, Ouyang Changchun:
> > > > > Fix the compilation issue in vector PMD when macro RTE_MBUF_REFCNT
> > > > > is disabled.
> > > > >
> > > > > Signed-off-by: Changchun Ouyang 
> > > >
> > > > Acked-by: Thomas Monjalon 
> > > >
> > > > Applied
> > >
> > > I was checking this patch right now, and I come across a second 
> > > compilation
> > > issue,  because rte_mbuf_refcnt_update and rte_pktmbuf_attach are not
> > > declared, and Bond PMD and IP fragmentation libraries use those functions.
> > >
> > > I guess that it is late to NACK this :P, but we require a second patch to 
> > > fix
> > > completely this issue.
> >
> > As it fixes the compilation issue in vpmd, so no reason to NACK it,  :-)
> 
> Exact
> 
> > In my config, both BOND and IP fragment is disabled. So I don't come across 
> > your
> issues.
> > Yes, agree with you, we need another patch to fix compilation issue in other
> both places.
> 
> Yes, I'm aware of these limitations.
> Please, first explain why mbuf refcnt is needed for these features.
> Then we have 2 options: remove the dependency or add more ifdefs.
> 
> Thanks
> --
> Thomas

Thomas, 
for link bonding the refcnt is used in broadcast mode to allow the same
mbuf to be transmitted on multiple slaves. By increasing the refcnt, we 
can safely transmit the mbuf on multiple slaves without the danger of an 
mbuf being freed by one slave while in use by another. I can provide a patch to
disable the broadcast mode of operation if the bonding pmd is built with the 
macro RTE_MBUF_REFCNT disabled. I don't think there is any other option, other 
than to completely disable the library, if the refcnt parameter isn't available,
as we don't have access to the transmit mempool to allocate new mbufs to
make copies of the original. 

Declan
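
The broadcast scheme described above can be sketched with a toy reference
counted buffer. This is not the bonding PMD code; `buf_sketch` and the helper
names are hypothetical, and the "slave transmit" is reduced to the free that
each slave performs once it is done with the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy buffer: a reference count and a flag set when it is really freed. */
struct buf_sketch {
	uint16_t refcnt;
	int freed;
};

/* Drop one reference; the buffer is released only when the last one goes. */
static void
buf_free_sketch(struct buf_sketch *b)
{
	assert(b != NULL && b->refcnt > 0);
	if (--b->refcnt == 0)
		b->freed = 1;
}

/*
 * Prepare a buffer for transmission on num_slaves slaves.
 * The caller already holds one reference, so add num_slaves - 1 more.
 */
static void
broadcast_prepare_sketch(struct buf_sketch *b, uint16_t num_slaves)
{
	assert(b != NULL && num_slaves > 0);
	b->refcnt = (uint16_t)(b->refcnt + num_slaves - 1);
}
```

Each slave's free then only decrements the count, so the buffer stays valid
until the slowest slave has finished with it. Without a reference count the
only alternative would be one copy per slave, which needs a mempool to
allocate from.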

[dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement

2014-10-21 Thread Yong Wang

Rashmin/Stephen,

Since you have worked on vmxnet3 pmd drivers, I wonder if you can help review 
this set of patches.  Any other reviews/test verifications are welcome of 
course.  We have reviewed/tested all patches internally.

Yong

From: dev  on behalf of Yong Wang 
Sent: Monday, October 13, 2014 2:00 PM
To: Thomas Monjalon
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement

Only the last one is performance related and it merely tries to give hints to 
the compiler to hopefully make branch prediction more efficient.  It also moves 
a constant assignment out of the pkt polling loop.

We did performance evaluation on a Nehalem box with 4 cores at 2.8GHz x 2 sockets.
On the DPDK side, it's running some l3 forwarding apps in a VM on ESXi with one 
core assigned for polling.  The client side is pktgen/dpdk, pumping 64B tcp 
packets at line rate.  Before the patch, we are seeing ~900K PPS with 65% cpu 
of a core used for DPDK.  After the patch, we are seeing the same pkt rate with 
only 45% of a core used.  CPU usage is collected factoring out the idle loop 
cost.  The packet rate is a result of the mode we used for vmxnet3 (pure 
emulation mode running default number of hypervisor contexts).  I can add this 
info in the review request.

Yong
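
For readers unfamiliar with the two techniques mentioned above (compiler
branch hints and hoisting a constant assignment out of the polling loop), a
minimal sketch follows. It is not the vmxnet3 code: the structures and
function are hypothetical, and the hint macros assume a compiler that
provides `__builtin_expect` (GCC and Clang do).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branch hints, assuming GCC/Clang __builtin_expect is available. */
#define likely_sketch(x)   __builtin_expect(!!(x), 1)
#define unlikely_sketch(x) __builtin_expect(!!(x), 0)

/* Hypothetical received packet descriptor. */
struct pkt_sketch {
	uint16_t len;
	uint8_t error;
	uint8_t port;
};

/* Hypothetical receive queue. */
struct rxq_sketch {
	uint8_t port_id;
};

/*
 * Walk n received packets, skip the ones flagged with an error and
 * stamp the rest with the queue's port id. Returns the number accepted.
 */
static unsigned
rx_poll_sketch(const struct rxq_sketch *q, struct pkt_sketch *pkts, unsigned n)
{
	unsigned i, good = 0;
	uint8_t port;

	assert(q != NULL && (pkts != NULL || n == 0));

	/* Loop-invariant value: read once instead of once per packet. */
	port = q->port_id;

	for (i = 0; i < n; i++) {
		/* Errors are rare, so tell the compiler to favor the good path. */
		if (unlikely_sketch(pkts[i].error != 0))
			continue;
		pkts[i].port = port;
		good++;
	}
	return good;
}
```

The hints do not change behavior; they only influence how the compiler lays
out the hot path, which is why the gain shows up as lower CPU usage at the
same packet rate.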

From: Thomas Monjalon 
Sent: Monday, October 13, 2014 1:29 PM
To: Yong Wang
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement

Hi,

2014-10-12 23:23, Yong Wang:
> This patch series include various fixes and improvement to the
> vmxnet3 pmd driver.
>
> Yong Wang (5):
>   vmxnet3: Fix VLAN Rx stripping
>   vmxnet3: Add VLAN Tx offload
>   vmxnet3: Fix dev stop/restart bug
>   vmxnet3: Add rx pkt check offloads
>   vmxnet3: Some perf improvement on the rx path

Please, could you describe what the performance gain is for these patches?
Benchmark numbers would be appreciated.

Thanks
--
Thomas

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-21 Thread Ananyev, Konstantin



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> Sent: Tuesday, October 21, 2014 2:38 PM
> To: Richardson, Bruce
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx 
> cycles/packet
> 
> On Tue, Oct 21, 2014 at 10:43:03AM +, Richardson, Bruce wrote:
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > > Sent: Tuesday, October 21, 2014 11:33 AM
> > > To: Liang, Cunming
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > > cycles/packet
> > >
> > > > >
> > > > > > +   if (count == 0)
> > > > > > +   return -1;
> > > > > > +
> > > > > > +   printf("%lu packet, %lu drop, %lu idle\n", count, drop, idle);
> > > > > > +   printf("Result: %ld cycles per packet\n", (cur_tsc - prev_tsc) 
> > > > > > / count);
> > > > > > +
> > > > > Bad math here.  Theres no guarantee that the tsc hasn't wrapped 
> > > > > (potentially
> > > > > more than once) depending on your test length.  you need to check the 
> > > > > tsc
> > > before
> > > > > and after each burst and record an average of deltas instead, 
> > > > > accounting in
> > > each
> > > > > instance for the possibility of wrap.
> > > > [Liang, Cunming] I'm not sure catch your point correctly.
> > > > I think both cur_tsc and prev_tsc are 64 bits width.
> > > > For 3GHz, I think it won't wrapped so quick.
> > > > As it's uint64_t, so even get wrapped, the delta should still be 
> > > > correct.
> > > But theres no guarantee that the tsc starts at zero when you begin your 
> > > test.
> > > The system may have been up for a long time and near wrapping already.
> > > Regardless, you need to account for the possibility that cur_tsc is 
> > > smaller
> > > than prev_tsc, or this breaks.
> > >
> >
> > The tsc. is 64-bit and so only wraps around every couple of hundred years 
> > or so on a 2GHz machine, so I don't think it's necessary to
> handle that case.
> >
> But that presumes that no one has written the TSC via IA32_TIME_STAMP_COUNTER.

Then the test app would just print the wrong number :)
I suppose the user will just repeat the test.
Again, imagine someone does wrmsr(TSC), but sets the value to something just much 
bigger than the current one.
Then your statistics will be incorrect, but you have no way to figure that 
out. 
I think that trying to handle all such hypothetical situations is pure 
overkill.

> Assuming that something will never wrap just seems like bad practice here.  We
> should have a general purpose macro to handle wrapping counters like this, if
> not for this case specficially, then in general.
> 
> Neil
> 
> > /Bruce
> >
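
On the wrap question debated in this thread: with unsigned 64-bit counters,
a plain subtraction already gives the right delta across a single wrap,
because unsigned arithmetic in C is modulo 2^64. A minimal sketch, with
hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Elapsed ticks between two readings of a free-running 64-bit counter.
 * Correct across one wrap; it cannot detect several wraps or a counter
 * that was rewritten between the two readings.
 */
static uint64_t
tsc_delta_sketch(uint64_t prev_tsc, uint64_t cur_tsc)
{
	return cur_tsc - prev_tsc;
}

/* Average cycles per packet over the measured interval. */
static uint64_t
cycles_per_packet_sketch(uint64_t prev_tsc, uint64_t cur_tsc, uint64_t count)
{
	assert(count != 0);
	return tsc_delta_sketch(prev_tsc, cur_tsc) / count;
}
```

So cur_tsc being numerically smaller than prev_tsc is not a problem by
itself. The case this does not cover is the one raised above, where the
counter is written to an arbitrary value during the test; no subtraction
scheme can recover from that.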

[dpdk-dev] [PATCH v6 0/9] Support VxLAN on Fortville

2014-10-21 Thread Liu, Yong

Tested-by: Yong Liu 

- Tested Commit: 455d09e54b92a4626e178b020fe9c23e43ede3f7
- OS: Fedora20 3.15.8-200.fc20.x86_64
- GCC: gcc version 4.8.3 20140624
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
- NIC: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ [8086:1583]
- Default x86_64-native-linuxapp-gcc configuration
- Total 6 cases, 6 passed, 0 failed

- Case: vxlan_ipv4_detect
  Description: Check testpmd can receive and detect vxlan packet
  Command / instruction:
    Start testpmd with vxlan enabled and rss disabled
      testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
    Enable VxLAN on both ports and UDP dport setting to 4789
      testpmd>rx_vxlan_port add 4789 0
      testpmd>rx_vxlan_port add 4789 1
    Set forward type to rxonly and enable detail log output
      testpmd>set fwd rxonly
      testpmd>set verbose 1
      testpmd>start
    Send packets with udp/tcp/sctp inner L4 data
  Expected test result:
    testpmd can receive the vxlan packet with different inner L4 data and
    detect whether the packet is vxlan packet.

- Case: vxlan_ipv6_detect
  Description: Check testpmd can receive and detect ipv6 vxlan packet
  Command / instruction:
    Start testpmd with vxlan enabled and rss disabled
      testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
    Enable VxLAN on both ports and UDP dport setting to 4789
      testpmd>rx_vxlan_port add 4789 0
      testpmd>rx_vxlan_port add 4789 1
    Set forward type to rxonly and enable detail log output
      testpmd>set fwd rxonly
      testpmd>set verbose 1
      testpmd>start
    Send vxlan packets with outer IPv6 header and inner IPv6 header.
  Expected test result:
    testpmd can receive the vxlan packet with different inner L4 data and
    detect whether the packet is IPv6 vxlan packet.

- Case: vxlan_ipv4_checksum_offload
  Description: Check testpmd can offload vxlan checksum and forward the packet
  Command / instruction:
    Start testpmd with vxlan enabled and rss disabled.
      testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
    Enable VxLAN on both ports and UDP dport setting to 4789
      testpmd>rx_vxlan_port add 4789 0
      testpmd>rx_vxlan_port add 4789 1
    Set csum packet forwarding mode and enable verbose log.
      testpmd>set fwd csum
      testpmd>set verbose 1
      testpmd>start
    Enable outer IP,UDP,TCP,SCTP and inner IP,UDP checksum offload when
    inner L4 protocol is UDP.
      testpmd>tx_checksum set 0 0xf3
    Enable outer IP,UDP,TCP,SCTP and inner IP,TCP,SCTP checksum offload
    when inner L4 protocol is TCP or SCTP.
      testpmd>tx_checksum set 0 0xfd
    Send ipv4 vxlan packet with invalid outer/inner l3 or l4 checksum.
  Expected test result:
    testpmd can forward the vxlan packet and the checksum is corrected. The
    chksum error counter is also increased.

- Case: vxlan_ipv6_checksum_offload
  Description: Check testpmd can offload ipv6 vxlan checksum and forward the
  packet
  Command / instruction:
    Start testpmd with vxlan enabled and rss disabled.
      testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
    Enable VxLAN on both ports and UDP dport setting to 4789
      testpmd>rx_vxlan_port add 4789 0
      testpmd>rx_vxlan_port add 4789 1
    Set csum packet forwarding mode and enable verbose log.
      testpmd>set fwd csum
      testpmd>set verbose 1
      testpmd>start
    Enable outer IP,UDP,TCP,SCTP and inner IP,UDP checksum offload when
    inner L4 protocol is UDP.
      testpmd>tx_checksum set 0 0xf3
    Enable outer IP,UDP,TCP,SCTP and inner IP,TCP,SCTP checksum offload
    when inner L4 protocol is TCP or SCTP.
      testpmd>tx_checksum set 0 0xfd
    Send ipv6 vxlan packet with invalid outer/inner l3 or l4 checksum.
  Expected test result:
    testpmd can forward the vxlan packet and the checksum is corrected. The
    chksum error counter is also increased.


- Case: tunnel_filter
  Description: Check FVL vxlan tunnel filter function work with testpmd.
  Command / instruction:
    Start testpmd with vxlan enabled and rss disabled.
      testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
    Enable VxLAN on both ports and UDP dport setting to 4789
      testpmd>rx_vxlan_port add 4789 0
      testpmd>rx_vxlan_port add 4
    Set rxonly forwarding mode and enable verbose log.
      testpmd>set fwd rxonly
      testpmd>set verbose

[dpdk-dev] Why do we need iommu=pt?

2014-10-21 Thread Zhou, Danny

IMHO, whether memory protection with IOMMU is needed really depends on how 
you use and deploy your DPDK-based applications. Telco network middle boxes 
adopt a "closed model" solution to achieve extremely high performance: the 
entire system, including HW and software in kernel and userspace, is 
controlled by Telco vendors and assumed trustworthy, so memory protection is 
not so important. Datacenters generally adopt an "open model" solution that 
allows running user space applications (e.g. tenant applications and VMs) 
which can directly access the NIC and the DMA engine inside the NIC using a 
modified DPDK PMD. These are not trustworthy, as they can potentially DMA 
to/from arbitrary memory regions using physical addresses, so IOMMU is needed 
to provide strict memory protection, at the cost of a negative performance 
impact.

So if you want high performance, disable IOMMU in BIOS or OS. If security is 
a major concern, turn it on and trade off between performance and security. 
But I do NOT think it comes with an extremely high performance cost according 
to our performance measurements, though that is probably true for 100G NICs.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shivapriya Hiremath
> Sent: Wednesday, October 22, 2014 12:54 AM
> To: Alex Markuze
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Why do we need iommu=pt?
> 
> Hi,
> 
> Thank you for all the replies.
> I am trying to understand the impact of this on DPDK. What will be the
> repercussions of disabling "iommu=pt" on the DPDK performance?
> 
> 
> On Tue, Oct 21, 2014 at 12:32 AM, Alex Markuze  wrote:
> 
> > DPDK uses a 1:1 mapping and doesn't support IOMMU.  IOMMU allows for
> > simpler VM physical address translation.
> > The second role of IOMMU is to allow protection from unwanted memory
> > access by an unsafe devise that has DMA privileges. Unfortunately this
> > protection comes with an extremely high performance costs for high speed
> > nics.
> >
> > To your question iommu=pt disables IOMMU support for the hypervisor.
> >
> > On Tue, Oct 21, 2014 at 1:39 AM, Xie, Huawei  
> > wrote:
> >
> >>
> >>
> >> > -Original Message-
> >> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shivapriya
> >> Hiremath
> >> > Sent: Monday, October 20, 2014 2:59 PM
> >> > To: dev at dpdk.org
> >> > Subject: [dpdk-dev] Why do we need iommu=pt?
> >> >
> >> > Hi,
> >> >
> >> > My question is that if the Poll mode  driver used the DMA kernel
> >> interface
> >> > to set up its mappings appropriately, would it still require that
> >> iommu=pt
> >> > be set?
> >> > What is the purpose of setting iommu=pt ?
> >> PMD allocates memory through the hugetlb file system, and fills the
> >> physical address into the descriptor.
> >> pt is used to pass through iotlb translation. Refer to the link below.
> >> http://lkml.iu.edu/hypermail/linux/kernel/0906.2/02129.html
> >> >
> >> > Thank you.
> >>
> >
> >
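
To illustrate what "fills the physical address into the descriptor" involves,
here is a minimal sketch of translating a virtual address to a physical one
through the Linux /proc/self/pagemap interface. It is an illustration in
Python under stated assumptions, not DPDK's code, and the function names are
hypothetical. Each pagemap entry is 64 bits, with bit 63 marking a present
page and bits 0-54 holding the page frame number; recent kernels report a zero
frame number to unprivileged readers.

```python
import os
import struct

PFN_MASK = (1 << 55) - 1   # bits 0-54: page frame number
PRESENT_BIT = 1 << 63      # bit 63: page is present in RAM


def entry_to_phys(entry: int, vaddr: int, page_size: int) -> int:
    """Combine a pagemap entry with the offset of vaddr inside its page."""
    if not entry & PRESENT_BIT:
        raise ValueError("page not present")
    pfn = entry & PFN_MASK
    return pfn * page_size + (vaddr % page_size)


def virt_to_phys(vaddr: int) -> int:
    """Look up a virtual address of the current process (hypothetical helper)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open("/proc/self/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * 8)   # one 8-byte entry per page
        (entry,) = struct.unpack("=Q", f.read(8))
    return entry_to_phys(entry, vaddr, page_size)
```

The address obtained this way is only usable by the device if DMA addresses
are not remapped, which is the case the pass-through discussion above is
about.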

[dpdk-dev] [PATCH v2 03/13] ethdev: add more annotation

2014-10-21 Thread Zhang, Helin

Hi Thomas

OK. Good to know that. I will rework my patch based on the latest master
branch. Thank you very much!

Regards,
Helin

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 22, 2014 4:39 AM
> To: Zhang, Helin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 03/13] ethdev: add more annotation
> 
> 2014-09-25 16:40, Helin Zhang:
> > Add more annotation, to clearly tell the 'rte_eth_dev_info_get()'
> > users that the buffer should be cleared first.
> 
> Since commit http://dpdk.org/browse/dpdk/commit/?id=a30268e9a2
> (ethdev: reset whole dev info structure before filling), this patch is now
> useless.
> 
> --
> Thomas

[dpdk-dev] development/integration branch?

2014-10-21 Thread Stephen Hemminger

On Tue, 21 Oct 2014 11:14:43 +0200
Marc Sune  wrote:

> On 21/10/14 10:46, Thomas Monjalon wrote:
> > My balance is different because I have a simpler solution for Marc's 
> > problem:
> > git fetch && git merge $(git tag | grep -v -- -rc | tail -n1)
> Thomas,
> 
> We all know we _can_ do this. But is it really necessary? We should all be 
> as lazy as possible and make it easy for users IMHO. `git pull` is 
> easier :)
> 
> I don't see any drawback of using a development branch, except if you 
> consider the extra push to master per release a drawback.
> 
> Also think about new users downloading the repo for the first time. They 
> are forced to do this right now if they want to checkout the latest stable.
> 
> marc

For most projects, master is the development branch and is where patches
should be targeted.

If you want a stable branch, then either use releases or volunteer to
maintain a "master-stable" branch.
