date:20141022

[dpdk-dev] [PATCH v3 8/8] i40evf: support of updating/querying redirection table

2014-10-22 Thread Helin Zhang

Support of updating/querying redirection table has been
added for VF.

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/i40e_ethdev_vf.c | 89 
 1 file changed, 89 insertions(+)

v2 changes:
* Add support of updating/querying i40e reta of VF.
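
For readers following the diff below, the byte-lane packing it relies on can be sketched in isolation. This is only an illustration under stated assumptions: each 32-bit lookup-table register is taken to hold four 8-bit entries (as the loops in the patch suggest), and `lut_merge`, `lut_entry` and `ENTRIES_PER_REG` are hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define ENTRIES_PER_REG 4 /* assumed: four 8-bit entries per 32-bit register */

/*
 * Merge up to four new entries into one register value. Bit j of 'mask'
 * selects entries[j]; unselected byte lanes keep their value from 'old'.
 */
static uint32_t
lut_merge(uint32_t old, const uint8_t entries[ENTRIES_PER_REG], uint8_t mask)
{
	uint32_t lut = 0;
	unsigned j;

	for (j = 0; j < ENTRIES_PER_REG; j++) {
		if (mask & (1u << j))
			lut |= (uint32_t)entries[j] << (CHAR_BIT * j);
		else
			lut |= old & (0xFFu << (CHAR_BIT * j));
	}
	return lut;
}

/* Query direction: extract entry j from a register value. */
static uint8_t
lut_entry(uint32_t lut, unsigned j)
{
	assert(j < ENTRIES_PER_REG);
	return (uint8_t)((lut >> (CHAR_BIT * j)) & 0xFFu);
}
```

When all four mask bits are set, the old register value is never consulted, which is presumably why the patch skips the register read in that case.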

diff --git a/lib/librte_pmd_i40e/i40e_ethdev_vf.c 
b/lib/librte_pmd_i40e/i40e_ethdev_vf.c
index a381521..0e8693d 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev_vf.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev_vf.c
@@ -134,6 +134,12 @@ static int i40evf_dev_tx_queue_start(struct rte_eth_dev 
*dev,
 uint16_t tx_queue_id);
 static int i40evf_dev_tx_queue_stop(struct rte_eth_dev *dev,
uint16_t tx_queue_id);
+static int i40evf_dev_rss_reta_update(struct rte_eth_dev *dev,
+   struct rte_eth_rss_reta_entry64 *reta_conf,
+   uint16_t reta_size);
+static int i40evf_dev_rss_reta_query(struct rte_eth_dev *dev,
+   struct rte_eth_rss_reta_entry64 *reta_conf,
+   uint16_t reta_size);
 static int i40evf_config_rss(struct i40e_vf *vf);
 static int i40evf_dev_rss_hash_update(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
@@ -166,6 +172,8 @@ static struct eth_dev_ops i40evf_eth_dev_ops = {
.rx_queue_release = i40e_dev_rx_queue_release,
.tx_queue_setup   = i40e_dev_tx_queue_setup,
.tx_queue_release = i40e_dev_tx_queue_release,
+   .reta_update  = i40evf_dev_rss_reta_update,
+   .reta_query   = i40evf_dev_rss_reta_query,
.rss_hash_update  = i40evf_dev_rss_hash_update,
.rss_hash_conf_get= i40evf_dev_rss_hash_conf_get,
 };
@@ -1611,6 +1619,87 @@ i40evf_dev_close(struct rte_eth_dev *dev)
 }

 static int
+i40evf_dev_rss_reta_update(struct rte_eth_dev *dev,
+  struct rte_eth_rss_reta_entry64 *reta_conf,
+  uint16_t reta_size)
+{
+   struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t lut, l;
+   uint16_t i, j;
+   uint16_t idx, shift;
+   uint8_t mask;
+
+   if (reta_size != ETH_RSS_RETA_SIZE_64) {
+   PMD_DRV_LOG(ERR, "The size of hash lookup table configured "
+   "(%d) doesn't match the number of hardware can "
+   "support (%d)\n", reta_size, ETH_RSS_RETA_SIZE_64);
+   return -EINVAL;
+   }
+
+   for (i = 0; i < reta_size; i += I40E_4_BIT_WIDTH) {
+   idx = i / RTE_BIT_WIDTH_64;
+   shift = i % RTE_BIT_WIDTH_64;
+   mask = (uint8_t)((reta_conf[idx].mask >> shift) &
+   I40E_4_BIT_MASK);
+   if (!mask)
+   continue;
+   if (mask == I40E_4_BIT_MASK)
+   l = 0;
+   else
+   l = I40E_READ_REG(hw, I40E_VFQF_HLUT(i >> 2));
+
+   for (j = 0, lut = 0; j < I40E_4_BIT_WIDTH; j++) {
+   if (mask & (0x1 << j))
+   lut |= reta_conf[idx].reta[shift + j] <<
+   (CHAR_BIT * j);
+   else
+   lut |= l & (I40E_8_BIT_MASK << (CHAR_BIT * j));
+   }
+   I40E_WRITE_REG(hw, I40E_VFQF_HLUT(i >> 2), lut);
+   }
+
+   return 0;
+}
+
+static int
+i40evf_dev_rss_reta_query(struct rte_eth_dev *dev,
+ struct rte_eth_rss_reta_entry64 *reta_conf,
+ uint16_t reta_size)
+{
+   struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t lut;
+   uint16_t i, j;
+   uint16_t idx, shift;
+   uint8_t mask;
+
+   if (reta_size != ETH_RSS_RETA_SIZE_64) {
+   PMD_DRV_LOG(ERR, "The size of hash lookup table configured "
+   "(%d) doesn't match the number of hardware can "
+   "support (%d)\n", reta_size, ETH_RSS_RETA_SIZE_64);
+   return -EINVAL;
+   }
+
+   for (i = 0; i < reta_size; i += I40E_4_BIT_WIDTH) {
+   idx = i / RTE_BIT_WIDTH_64;
+   shift = i % RTE_BIT_WIDTH_64;
+   mask = (uint8_t)((reta_conf[idx].mask >> shift) &
+   I40E_4_BIT_MASK);
+   if (!mask)
+   continue;
+
+   lut = I40E_READ_REG(hw, I40E_VFQF_HLUT(i >> 2));
+   for (j = 0; j < I40E_4_BIT_WIDTH; j++) {
+   if (mask & (0x1 << j))
+   reta_conf[idx].reta[shift + j] =
+   ((lut >> (CHAR_BIT * j)) &
+   I40E_8_BIT_MASK);
+   }
+   }
+
+   return 0;
+}
+
+static int

[dpdk-dev] [PATCH v3 7/8] ethdev: support of multiple sizes of redirection table

2014-10-22 Thread Helin Zhang

As the 40G NIC supports redirection table sizes (64/128/512 entries)
that differ from the 128 entries of 1G and 10G NICs, support for
multiple redirection table sizes is needed.
It includes,
* Redefine 'struct rte_eth_rss_reta' in ethdev.
  - To 'struct rte_eth_rss_reta_entry64' which contains 64
entries and 64 bits mask.
  - Array of above new structure can be used for any number
of redirection table entries, as long as the number is
multiple of 64. This is quite flexible for the future
expanding of redirection table.
* Redefinition of relevant interfaces in ethdev.
  - Interface of reta update has been redefined with new
parameters.
  - Interface of reta query has been redefined with new
parameters.
* Rework of 1G PMD in igb.
  - reta update has been reworked.
  - reta query has been reworked.
* Rework of 10G PMD in ixgbe.
  - reta update has been reworked.
  - reta query has been reworked.
* Rework of 40G PMD (PF only) in i40e.
  - reta update has been reworked.
  - reta query has been reworked.
* Implement relevant commands in testpmd.
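
The array-of-groups layout described above can be sketched as follows. This is a minimal illustration, not the ethdev definition: `struct reta_entry64`, `GROUP_SIZE` and `reta_set` are hypothetical stand-ins that mirror only what the description states (64 entries and a 64-bit mask per element, total size a multiple of 64).

```c
#include <assert.h>
#include <stdint.h>

#define GROUP_SIZE 64 /* entries per array element, per the description */

/* Illustrative stand-in for 'struct rte_eth_rss_reta_entry64'. */
struct reta_entry64 {
	uint64_t mask;            /* bit n set => reta[n] is to be used */
	uint8_t reta[GROUP_SIZE]; /* queue index for each table entry */
};

/* Mark table entry 'hash_index' in the array and record its queue. */
static int
reta_set(struct reta_entry64 *conf, uint16_t reta_size,
	 uint16_t hash_index, uint8_t queue)
{
	uint16_t idx, shift;

	assert(reta_size % GROUP_SIZE == 0);
	if (hash_index >= reta_size)
		return -1;

	idx = hash_index / GROUP_SIZE;   /* which array element */
	shift = hash_index % GROUP_SIZE; /* which slot inside it */
	conf[idx].mask |= (1ULL << shift);
	conf[idx].reta[shift] = queue;
	return 0;
}
```

With this layout a 512-entry table is simply an array of eight elements, and entry 130 lands in element 2, slot 2.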

Signed-off-by: Helin Zhang 
---
 app/test-pmd/cmdline.c  | 152 ++--
 app/test-pmd/config.c   |  37 +
 app/test-pmd/testpmd.h  |   4 +-
 lib/librte_ether/rte_ethdev.c   | 116 ---
 lib/librte_ether/rte_ethdev.h   |  38 +
 lib/librte_pmd_e1000/igb_ethdev.c   | 109 +-
 lib/librte_pmd_i40e/i40e_ethdev.c   |  93 --
 lib/librte_pmd_i40e/i40e_ethdev.h   |  12 +++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 108 +
 9 files changed, 405 insertions(+), 264 deletions(-)

v2 changes:
* Put rework of updating/querying igb reta to a single patch.
* Put rework of updating/querying ixgbe reta to a single patch.
* Put rework of updating/querying i40e reta to a single patch.

v3 changes:
* Put all redefinitions of structures and interfaces into a
  single patch.
* Put all reworks of igb/ixgbe/i40e of supporting multiple sizes
  of reta into the same patch.
* Put all relevant testpmd reworks of supporting multiple sizes
  of reta into the same patch.

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 9de574d..8a55fb5 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -186,6 +187,11 @@ static void cmd_help_long_parsed(void *parsed_result,
"show port (info|stats|xstats|fdir|stat_qmap) 
(port_id|all)\n"
"Display information for port_id, or all.\n\n"

+   "show port X rss reta (size) (mask0,mask1,...)\n"
+   "Display the rss redirection table entry indicated"
+   " by masks on port X. size is used to indicate the"
+   " hardware supported reta size\n\n"
+
"show port rss-hash [key]\n"
"Display the RSS hash functions and RSS hash key"
" of port X\n\n"
@@ -1539,11 +1545,13 @@ struct cmd_config_rss_reta {
 };

 static int
-parse_reta_config(const char *str, struct rte_eth_rss_reta *reta_conf)
+parse_reta_config(const char *str,
+ struct rte_eth_rss_reta_entry64 *reta_conf,
+ uint16_t nb_entries)
 {
int i;
unsigned size;
-   uint8_t hash_index;
+   uint16_t hash_index, idx, shift;
uint8_t nb_queue;
char s[256];
const char *p, *p0 = str;
@@ -1571,24 +1579,23 @@ parse_reta_config(const char *str, struct 
rte_eth_rss_reta *reta_conf)
for (i = 0; i < _NUM_FLD; i++) {
errno = 0;
int_fld[i] = strtoul(str_fld[i], &end, 0);
-   if (errno != 0 || end == str_fld[i] || int_fld[i] > 255)
+   if (errno != 0 || end == str_fld[i] ||
+   int_fld[i] > 65535)
return -1;
}

-   hash_index = (uint8_t)int_fld[FLD_HASH_INDEX];
+   hash_index = (uint16_t)int_fld[FLD_HASH_INDEX];
nb_queue = (uint8_t)int_fld[FLD_QUEUE];

-   if (hash_index >= ETH_RSS_RETA_NUM_ENTRIES) {
-   printf("Invalid RETA hash index=%d", hash_index);
+   if (hash_index >= nb_entries) {
+   printf("Invalid RETA hash index=%d\n", hash_index);
return -1;
}

-   if (hash_index < ETH_RSS_RETA_NUM_ENTRIES/2)
-   reta_conf->mask_lo |= (1ULL << hash_index);
-   else
-   reta_conf->mask_hi |= (1ULL << (hash_index - 
ETH_RSS_RETA_NUM_ENTRIES/2));
-
-   reta_conf->reta[hash_index] = nb_queue;
+   idx = hash_index / RTE_BIT_WIDTH_64;
+

[dpdk-dev] [PATCH v3 6/8] i40e: rework of ops of 'dev_infos_get' for both PF and VF

2014-10-22 Thread Helin Zhang

The ops of 'dev_infos_get' now return the redirection table size
for both PF and VF. They also return the default RX/TX configurations
for VF, which were missing before.

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/i40e_ethdev.c| 16 +++-
 lib/librte_pmd_i40e/i40e_ethdev.h| 11 +++
 lib/librte_pmd_i40e/i40e_ethdev_vf.c | 23 +++
 3 files changed, 37 insertions(+), 13 deletions(-)

v2 changes:
* Put getting reta size of both i40e PF and VF into a single patch.

v3 changes:
* Returning default RX/TX configurations has been added in ops of
  'dev_infos_get' for VF, as it was added recently in that for PF.

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index ef24175..d80004c 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -58,17 +58,6 @@
 #include "i40e_rxtx.h"
 #include "i40e_pf.h"

-#define I40E_DEFAULT_RX_FREE_THRESH  32
-#define I40E_DEFAULT_RX_PTHRESH  8
-#define I40E_DEFAULT_RX_HTHRESH  8
-#define I40E_DEFAULT_RX_WTHRESH  0
-
-#define I40E_DEFAULT_TX_FREE_THRESH  32
-#define I40E_DEFAULT_TX_PTHRESH  32
-#define I40E_DEFAULT_TX_HTHRESH  0
-#define I40E_DEFAULT_TX_WTHRESH  0
-#define I40E_DEFAULT_TX_RSBIT_THRESH 32
-
 /* Maximun number of MAC addresses */
 #define I40E_NUM_MACADDR_MAX   64
 #define I40E_CLEAR_PXE_WAIT_MS 200
@@ -1382,6 +1371,7 @@ i40e_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
DEV_TX_OFFLOAD_UDP_CKSUM |
DEV_TX_OFFLOAD_TCP_CKSUM |
DEV_TX_OFFLOAD_SCTP_CKSUM;
+   dev_info->reta_size = pf->hash_lut_size;

dev_info->default_rxconf = (struct rte_eth_rxconf) {
.rx_thresh = {
@@ -1401,9 +1391,9 @@ i40e_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
},
.tx_free_thresh = I40E_DEFAULT_TX_FREE_THRESH,
.tx_rs_thresh = I40E_DEFAULT_TX_RSBIT_THRESH,
-   .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS | 
ETH_TXQ_FLAGS_NOOFFLOADS,
+   .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
+   ETH_TXQ_FLAGS_NOOFFLOADS,
};
-
 }

 static int
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.h 
b/lib/librte_pmd_i40e/i40e_ethdev.h
index 22b693f..0b2f316 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.h
+++ b/lib/librte_pmd_i40e/i40e_ethdev.h
@@ -52,6 +52,17 @@
 /* Default TC traffic in case DCB is not enabled */
 #define I40E_DEFAULT_TCMAP 0x1

+#define I40E_DEFAULT_RX_FREE_THRESH  32
+#define I40E_DEFAULT_RX_PTHRESH  8
+#define I40E_DEFAULT_RX_HTHRESH  8
+#define I40E_DEFAULT_RX_WTHRESH  0
+
+#define I40E_DEFAULT_TX_FREE_THRESH  32
+#define I40E_DEFAULT_TX_PTHRESH  32
+#define I40E_DEFAULT_TX_HTHRESH  0
+#define I40E_DEFAULT_TX_WTHRESH  0
+#define I40E_DEFAULT_TX_RSBIT_THRESH 32
+
 /* i40e flags */
 #define I40E_FLAG_RSS   (1ULL << 0)
 #define I40E_FLAG_DCB   (1ULL << 1)
diff --git a/lib/librte_pmd_i40e/i40e_ethdev_vf.c 
b/lib/librte_pmd_i40e/i40e_ethdev_vf.c
index 3997ddb..a381521 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev_vf.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev_vf.c
@@ -1567,6 +1567,29 @@ i40evf_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->max_tx_queues = vf->vsi_res->num_queue_pairs;
dev_info->min_rx_bufsize = I40E_BUF_SIZE_MIN;
dev_info->max_rx_pktlen = I40E_FRAME_SIZE_MAX;
+   dev_info->reta_size = ETH_RSS_RETA_SIZE_64;
+
+   dev_info->default_rxconf = (struct rte_eth_rxconf) {
+   .rx_thresh = {
+   .pthresh = I40E_DEFAULT_RX_PTHRESH,
+   .hthresh = I40E_DEFAULT_RX_HTHRESH,
+   .wthresh = I40E_DEFAULT_RX_WTHRESH,
+   },
+   .rx_free_thresh = I40E_DEFAULT_RX_FREE_THRESH,
+   .rx_drop_en = 0,
+   };
+
+   dev_info->default_txconf = (struct rte_eth_txconf) {
+   .tx_thresh = {
+   .pthresh = I40E_DEFAULT_TX_PTHRESH,
+   .hthresh = I40E_DEFAULT_TX_HTHRESH,
+   .wthresh = I40E_DEFAULT_TX_WTHRESH,
+   },
+   .tx_free_thresh = I40E_DEFAULT_TX_FREE_THRESH,
+   .tx_rs_thresh = I40E_DEFAULT_TX_RSBIT_THRESH,
+   .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
+   ETH_TXQ_FLAGS_NOOFFLOADS,
+   };
 }

 static void
-- 
1.8.1.4

[dpdk-dev] [PATCH v3 5/8] ixgbe: implement ops of 'dev_infos_get' for PF and VF respectively

2014-10-22 Thread Helin Zhang

As more and more information differs between PF and VF, ops of
'dev_infos_get' has been implemented separately for each. In addition,
it now returns the redirection table size.

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 90 +
 1 file changed, 71 insertions(+), 19 deletions(-)

v2 changes:
* Added new function for ops of 'dev_infos_get' specifically for ixgbe VF.

v3 changes:
* Returning default RX/TX configurations has been added in ops of
  'dev_infos_get' for VF, as it was added recently in that for PF.

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index c5e4b71..da140c8 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -132,8 +132,9 @@ static int ixgbe_dev_queue_stats_mapping_set(struct 
rte_eth_dev *eth_dev,
 uint8_t stat_idx,
 uint8_t is_rx);
 static void ixgbe_dev_info_get(struct rte_eth_dev *dev,
-   struct rte_eth_dev_info *dev_info);
-
+  struct rte_eth_dev_info *dev_info);
+static void ixgbevf_dev_info_get(struct rte_eth_dev *dev,
+struct rte_eth_dev_info *dev_info);
 static int ixgbe_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);

 static int ixgbe_vlan_filter_set(struct rte_eth_dev *dev,
@@ -391,7 +392,7 @@ static struct eth_dev_ops ixgbevf_eth_dev_ops = {
.stats_get= ixgbevf_dev_stats_get,
.stats_reset  = ixgbevf_dev_stats_reset,
.dev_close= ixgbevf_dev_close,
-   .dev_infos_get= ixgbe_dev_info_get,
+   .dev_infos_get= ixgbevf_dev_info_get,
.mtu_set  = ixgbevf_dev_set_mtu,
.vlan_filter_set  = ixgbevf_vlan_filter_set,
.vlan_strip_queue_set = ixgbevf_vlan_strip_queue_set,
@@ -1963,25 +1964,76 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
DEV_TX_OFFLOAD_SCTP_CKSUM;

dev_info->default_rxconf = (struct rte_eth_rxconf) {
-   .rx_thresh = {
-   .pthresh = IXGBE_DEFAULT_RX_PTHRESH,
-   .hthresh = IXGBE_DEFAULT_RX_HTHRESH,
-   .wthresh = IXGBE_DEFAULT_RX_WTHRESH,
-   },
-   .rx_free_thresh = IXGBE_DEFAULT_RX_FREE_THRESH,
-   .rx_drop_en = 0,
+   .rx_thresh = {
+   .pthresh = IXGBE_DEFAULT_RX_PTHRESH,
+   .hthresh = IXGBE_DEFAULT_RX_HTHRESH,
+   .wthresh = IXGBE_DEFAULT_RX_WTHRESH,
+   },
+   .rx_free_thresh = IXGBE_DEFAULT_RX_FREE_THRESH,
+   .rx_drop_en = 0,
+   };
+
+   dev_info->default_txconf = (struct rte_eth_txconf) {
+   .tx_thresh = {
+   .pthresh = IXGBE_DEFAULT_TX_PTHRESH,
+   .hthresh = IXGBE_DEFAULT_TX_HTHRESH,
+   .wthresh = IXGBE_DEFAULT_TX_WTHRESH,
+   },
+   .tx_free_thresh = IXGBE_DEFAULT_TX_FREE_THRESH,
+   .tx_rs_thresh = IXGBE_DEFAULT_TX_RSBIT_THRESH,
+   .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
+   ETH_TXQ_FLAGS_NOOFFLOADS,
};
+   dev_info->reta_size = ETH_RSS_RETA_SIZE_128;
+}

+static void
+ixgbevf_dev_info_get(struct rte_eth_dev *dev,
+struct rte_eth_dev_info *dev_info)
+{
+   struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   dev_info->max_rx_queues = (uint16_t)hw->mac.max_rx_queues;
+   dev_info->max_tx_queues = (uint16_t)hw->mac.max_tx_queues;
+   dev_info->min_rx_bufsize = 1024; /* cf BSIZEPACKET in SRRCTL reg */
+   dev_info->max_rx_pktlen = 15872; /* includes CRC, cf MAXFRS reg */
+   dev_info->max_mac_addrs = hw->mac.num_rar_entries;
+   dev_info->max_hash_mac_addrs = IXGBE_VMDQ_NUM_UC_MAC;
+   dev_info->max_vfs = dev->pci_dev->max_vfs;
+   if (hw->mac.type == ixgbe_mac_82598EB)
+   dev_info->max_vmdq_pools = ETH_16_POOLS;
+   else
+   dev_info->max_vmdq_pools = ETH_64_POOLS;
+   dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP |
+   DEV_RX_OFFLOAD_IPV4_CKSUM |
+   DEV_RX_OFFLOAD_UDP_CKSUM  |
+   DEV_RX_OFFLOAD_TCP_CKSUM;
+   dev_info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT |
+   DEV_TX_OFFLOAD_IPV4_CKSUM  |
+   DEV_TX_OFFLOAD_UDP_CKSUM   |
+   DEV_TX_OFFLOAD_TCP_CKSUM   |
+   DEV_TX_OFFLOAD_SCTP_CKSUM;
+
+   dev_info->default_rxconf = (struct rte_eth_rxconf) {
+   .rx_thresh = {
+

[dpdk-dev] [PATCH v3 4/8] igb: implement ops of 'dev_infos_get' for PF and VF respectively

2014-10-22 Thread Helin Zhang

As more and more information differs between PF and VF, ops of
'dev_infos_get' has been implemented separately for each. In addition, a
new field 'reta_size' has been added to 'struct rte_eth_dev_info' for
returning the redirection table size.

Signed-off-by: Helin Zhang 
---
 lib/librte_ether/rte_ethdev.h |  2 ++
 lib/librte_pmd_e1000/igb_ethdev.c | 61 ---
 2 files changed, 52 insertions(+), 11 deletions(-)

v2 changes:
* Added new function for ops of 'dev_infos_get' specifically for igb VF.

v3 changes:
* Put the adding new element of 'reta_size' in ethdev into this patch,
  as it is needed.
* Returning default RX/TX configurations has been added in ops of
  'dev_infos_get', as it was accepted recently in another patch set.

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 7db08c2..ed2f15a 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -911,6 +911,8 @@ struct rte_eth_dev_info {
uint16_t max_vmdq_pools; /**< Maximum number of VMDq pools. */
uint32_t rx_offload_capa; /**< Device RX offload capabilities. */
uint32_t tx_offload_capa; /**< Device TX offload capabilities. */
+   uint16_t reta_size;
+   /**< Device redirection table size, the total number of entries. */
struct rte_eth_rxconf default_rxconf; /**< Default RX configuration */
struct rte_eth_txconf default_txconf; /**< Default TX configuration */
 };
diff --git a/lib/librte_pmd_e1000/igb_ethdev.c 
b/lib/librte_pmd_e1000/igb_ethdev.c
index 9e5665f..79998cf 100644
--- a/lib/librte_pmd_e1000/igb_ethdev.c
+++ b/lib/librte_pmd_e1000/igb_ethdev.c
@@ -83,6 +83,8 @@ static void eth_igb_stats_get(struct rte_eth_dev *dev,
struct rte_eth_stats *rte_stats);
 static void eth_igb_stats_reset(struct rte_eth_dev *dev);
 static void eth_igb_infos_get(struct rte_eth_dev *dev,
+ struct rte_eth_dev_info *dev_info);
+static void eth_igbvf_infos_get(struct rte_eth_dev *dev,
struct rte_eth_dev_info *dev_info);
 static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
struct rte_eth_fc_conf *fc_conf);
@@ -282,7 +284,7 @@ static struct eth_dev_ops igbvf_eth_dev_ops = {
.stats_get= eth_igbvf_stats_get,
.stats_reset  = eth_igbvf_stats_reset,
.vlan_filter_set  = igbvf_vlan_filter_set,
-   .dev_infos_get= eth_igb_infos_get,
+   .dev_infos_get= eth_igbvf_infos_get,
.rx_queue_setup   = eth_igb_rx_queue_setup,
.rx_queue_release = eth_igb_rx_queue_release,
.tx_queue_setup   = eth_igb_tx_queue_setup,
@@ -1268,8 +1270,7 @@ eth_igbvf_stats_reset(struct rte_eth_dev *dev)
 }

 static void
-eth_igb_infos_get(struct rte_eth_dev *dev,
-   struct rte_eth_dev_info *dev_info)
+eth_igb_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 {
struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);

@@ -1330,23 +1331,61 @@ eth_igb_infos_get(struct rte_eth_dev *dev,
dev_info->max_vmdq_pools = 0;
break;

+   default:
+   /* Should not happen */
+   break;
+   }
+   dev_info->reta_size = ETH_RSS_RETA_SIZE_128;
+
+   dev_info->default_rxconf = (struct rte_eth_rxconf) {
+   .rx_thresh = {
+   .pthresh = IGB_DEFAULT_RX_PTHRESH,
+   .hthresh = IGB_DEFAULT_RX_HTHRESH,
+   .wthresh = IGB_DEFAULT_RX_WTHRESH,
+   },
+   .rx_free_thresh = IGB_DEFAULT_RX_FREE_THRESH,
+   .rx_drop_en = 0,
+   };
+
+   dev_info->default_txconf = (struct rte_eth_txconf) {
+   .tx_thresh = {
+   .pthresh = IGB_DEFAULT_TX_PTHRESH,
+   .hthresh = IGB_DEFAULT_TX_HTHRESH,
+   .wthresh = IGB_DEFAULT_TX_WTHRESH,
+   },
+   .txq_flags = 0,
+   };
+}
+
+static void
+eth_igbvf_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+   struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   dev_info->min_rx_bufsize = 256; /* See BSIZE field of RCTL register. */
+   dev_info->max_rx_pktlen  = 0x3FFF; /* See RLPML register. */
+   dev_info->max_mac_addrs = hw->mac.rar_entry_count;
+   dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP |
+   DEV_RX_OFFLOAD_IPV4_CKSUM |
+   DEV_RX_OFFLOAD_UDP_CKSUM  |
+   DEV_RX_OFFLOAD_TCP_CKSUM;
+   dev_info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT |
+   DEV_TX_OFFLOAD_IPV4_CKSUM  |
+   DEV_TX_OFFLOAD_UDP_CKSUM   |
+   DEV_TX_OFFLOAD_TCP_CKSUM   |
+

[dpdk-dev] [PATCH v3 3/8] i40e: support of setting hash lookup table size

2014-10-22 Thread Helin Zhang

Add support of setting hash lookup table size according to
the hardware capability.

Signed-off-by: Helin Zhang 
---
 lib/librte_ether/rte_ethdev.h |  3 +++
 lib/librte_pmd_i40e/i40e_ethdev.c | 14 +-
 lib/librte_pmd_i40e/i40e_ethdev.h |  1 +
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index b69a6af..7db08c2 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -430,6 +430,9 @@ struct rte_eth_rss_conf {
 /* Definitions used for redirection table entry size */
 #define ETH_RSS_RETA_NUM_ENTRIES 128
 #define ETH_RSS_RETA_MAX_QUEUE   16
+#define ETH_RSS_RETA_SIZE_64  64
+#define ETH_RSS_RETA_SIZE_128 128
+#define ETH_RSS_RETA_SIZE_512 512

 /* Definitions used for VMDQ and DCB functionality */
 #define ETH_VMDQ_MAX_VLAN_FILTERS   64 /**< Maximum nb. of VMDQ vlan filters. 
*/
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 3b75f0f..ef24175 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -2943,7 +2943,19 @@ i40e_pf_setup(struct i40e_pf *pf)

/* Configure filter control */
memset(&settings, 0, sizeof(settings));
-   settings.hash_lut_size = I40E_HASH_LUT_SIZE_128;
+   if (hw->func_caps.rss_table_size == ETH_RSS_RETA_SIZE_128)
+   settings.hash_lut_size = I40E_HASH_LUT_SIZE_128;
+   else if (hw->func_caps.rss_table_size == ETH_RSS_RETA_SIZE_512)
+   settings.hash_lut_size = I40E_HASH_LUT_SIZE_512;
+   else {
+   PMD_DRV_LOG(ERR, "Hash lookup table size (%u) not supported\n",
+   hw->func_caps.rss_table_size);
+   return I40E_ERR_PARAM;
+   }
+   PMD_DRV_LOG(INFO, "Hardware capability of hash lookup table "
+   "size: %u\n", hw->func_caps.rss_table_size);
+   pf->hash_lut_size = hw->func_caps.rss_table_size;
+
/* Enable ethtype and macvlan filters */
settings.enable_ethtype = TRUE;
settings.enable_macvlan = TRUE;
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.h 
b/lib/librte_pmd_i40e/i40e_ethdev.h
index 1d42cd2..22b693f 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.h
+++ b/lib/librte_pmd_i40e/i40e_ethdev.h
@@ -246,6 +246,7 @@ struct i40e_pf {
uint16_t vmdq_nb_qps; /* The number of queue pairs of VMDq */
uint16_t vf_nb_qps; /* The number of queue pairs of VF */
uint16_t fdir_nb_qps; /* The number of queue pairs of Flow Director */
+   uint16_t hash_lut_size; /* The size of hash lookup table */
 };

 enum pending_msg {
-- 
1.8.1.4

[dpdk-dev] [PATCH v3 2/8] i40evf: code style fix

2014-10-22 Thread Helin Zhang

Fix several code style issues.

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/i40e_ethdev_vf.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev_vf.c 
b/lib/librte_pmd_i40e/i40e_ethdev_vf.c
index fa838e6..3997ddb 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev_vf.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev_vf.c
@@ -126,15 +126,19 @@ static void i40evf_dev_allmulticast_disable(struct 
rte_eth_dev *dev);
 static int i40evf_get_link_status(struct rte_eth_dev *dev,
  struct rte_eth_link *link);
 static int i40evf_init_vlan(struct rte_eth_dev *dev);
+static int i40evf_dev_rx_queue_start(struct rte_eth_dev *dev,
+uint16_t rx_queue_id);
+static int i40evf_dev_rx_queue_stop(struct rte_eth_dev *dev,
+   uint16_t rx_queue_id);
+static int i40evf_dev_tx_queue_start(struct rte_eth_dev *dev,
+uint16_t tx_queue_id);
+static int i40evf_dev_tx_queue_stop(struct rte_eth_dev *dev,
+   uint16_t tx_queue_id);
 static int i40evf_config_rss(struct i40e_vf *vf);
 static int i40evf_dev_rss_hash_update(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
 static int i40evf_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
struct rte_eth_rss_conf *rss_conf);
-static int i40evf_dev_rx_queue_start(struct rte_eth_dev *, uint16_t);
-static int i40evf_dev_rx_queue_stop(struct rte_eth_dev *, uint16_t);
-static int i40evf_dev_tx_queue_start(struct rte_eth_dev *, uint16_t);
-static int i40evf_dev_tx_queue_stop(struct rte_eth_dev *, uint16_t);

 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_VFQF_HKEY_MAX_INDEX + 1];
-- 
1.8.1.4

[dpdk-dev] [PATCH v3 1/8] app/testpmd: code style fix

2014-10-22 Thread Helin Zhang

Fix several code style issues.

Signed-off-by: Helin Zhang 
---
 app/test-pmd/cmdline.c | 28 +++-
 app/test-pmd/config.c  |  2 +-
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0b972f9..9de574d 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1579,7 +1579,7 @@ parse_reta_config(const char *str, struct 
rte_eth_rss_reta *reta_conf)
nb_queue = (uint8_t)int_fld[FLD_QUEUE];

if (hash_index >= ETH_RSS_RETA_NUM_ENTRIES) {
-   printf("Invalid RETA hash index=%d",hash_index);
+   printf("Invalid RETA hash index=%d", hash_index);
return -1;
}

@@ -1596,22 +1596,24 @@ parse_reta_config(const char *str, struct 
rte_eth_rss_reta *reta_conf)

 static void
 cmd_set_rss_reta_parsed(void *parsed_result,
-   __attribute__((unused)) struct cmdline *cl,
-   __attribute__((unused)) void *data)
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
 {
int ret;
struct rte_eth_rss_reta reta_conf;
struct cmd_config_rss_reta *res = parsed_result;

-   memset(&reta_conf,0,sizeof(struct rte_eth_rss_reta));
+   memset(&reta_conf, 0, sizeof(struct rte_eth_rss_reta));
if (!strcmp(res->list_name, "reta")) {
if (parse_reta_config(res->list_of_items, &reta_conf)) {
-   printf("Invalid RSS Redirection Table config 
entered\n");
+   printf("Invalid RSS Redirection Table config "
+   "entered\n");
return;
}
ret = rte_eth_dev_rss_reta_update(res->port_id, &reta_conf);
if (ret != 0)
-   printf("Bad redirection table parameter, return code = 
%d \n",ret);
+   printf("Bad redirection table parameter, "
+   "return code = %d\n", ret);
}
 }

@@ -1673,19 +1675,19 @@ static void cmd_showport_reta_parsed(void 
*parsed_result,
 }

 cmdline_parse_token_string_t cmd_showport_reta_show =
-TOKEN_STRING_INITIALIZER(struct  cmd_showport_reta, show, "show");
+   TOKEN_STRING_INITIALIZER(struct  cmd_showport_reta, show, "show");
 cmdline_parse_token_string_t cmd_showport_reta_port =
-TOKEN_STRING_INITIALIZER(struct  cmd_showport_reta, port, "port");
+   TOKEN_STRING_INITIALIZER(struct  cmd_showport_reta, port, "port");
 cmdline_parse_token_num_t cmd_showport_reta_port_id =
-TOKEN_NUM_INITIALIZER(struct cmd_showport_reta, port_id, UINT8);
+   TOKEN_NUM_INITIALIZER(struct cmd_showport_reta, port_id, UINT8);
 cmdline_parse_token_string_t cmd_showport_reta_rss =
-TOKEN_STRING_INITIALIZER(struct cmd_showport_reta, rss, "rss");
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_reta, rss, "rss");
 cmdline_parse_token_string_t cmd_showport_reta_reta =
-TOKEN_STRING_INITIALIZER(struct cmd_showport_reta, reta, "reta");
+   TOKEN_STRING_INITIALIZER(struct cmd_showport_reta, reta, "reta");
 cmdline_parse_token_num_t cmd_showport_reta_mask_lo =
-TOKEN_NUM_INITIALIZER(struct cmd_showport_reta,mask_lo,UINT64);
+   TOKEN_NUM_INITIALIZER(struct cmd_showport_reta, mask_lo, UINT64);
 cmdline_parse_token_num_t cmd_showport_reta_mask_hi =
-   TOKEN_NUM_INITIALIZER(struct cmd_showport_reta,mask_hi,UINT64);
+   TOKEN_NUM_INITIALIZER(struct cmd_showport_reta, mask_hi, UINT64);

 cmdline_parse_inst_t cmd_showport_reta = {
.f = cmd_showport_reta_parsed,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 2a1b93f..84c59b7 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -764,7 +764,7 @@ rxtx_config_display(void)
 void
 port_rss_reta_info(portid_t port_id,struct rte_eth_rss_reta *reta_conf)
 {
-   uint8_t i,j;
+   uint8_t i, j;
int ret;

if (port_id_is_invalid(port_id))
-- 
1.8.1.4

[dpdk-dev] [PATCH v3 0/8] support of multiple sizes of redirection table

2014-10-22 Thread Helin Zhang

As e1000, ixgbe and i40e hardware use different sizes of redirection
table in PF or VF, ethdev and PMDs need to be reworked to support
multiple sizes of that table. In addition, commands in testpmd also
need to be reworked to support these changes.

v2 changes:
* Reorganized the patches.
* Added code style fixes.
* Added support of reta updating/querying in i40e VF.

v3 changes:
* Reorganized the patch set.
* Added returning default RX/TX configurations in VF
  (igb/ixgbe/i40e), as the patch set of it for PF has been
  accepted recently.

Helin Zhang (8):
  app/testpmd: code style fix
  i40evf: code style fix
  i40e: support of setting hash lookup table size
  igb: implement ops of 'dev_infos_get' for PF and VF respectively
  ixgbe: implement ops of 'dev_infos_get' for PF and VF respectively
  i40e: rework of ops of 'dev_infos_get' for both PF and VF
  ethdev: support of multiple sizes of redirection table
  i40evf: support of updating/querying redirection table

 app/test-pmd/cmdline.c   | 166 +
 app/test-pmd/config.c|  37 ---
 app/test-pmd/testpmd.h   |   4 +-
 lib/librte_ether/rte_ethdev.c| 116 
 lib/librte_ether/rte_ethdev.h|  43 +---
 lib/librte_pmd_e1000/igb_ethdev.c| 170 +++---
 lib/librte_pmd_i40e/i40e_ethdev.c| 123 --
 lib/librte_pmd_i40e/i40e_ethdev.h|  24 +
 lib/librte_pmd_i40e/i40e_ethdev_vf.c | 124 +-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c  | 198 ++-
 10 files changed, 687 insertions(+), 318 deletions(-)

-- 
1.8.1.4

[dpdk-dev] development/integration branch?

2014-10-22 Thread Stephen Hemminger

On Wed, 22 Oct 2014 00:00:58 -0700
Matthew Hall  wrote:

> What I think git in general and DPDK in particular are missing is, they have
> a tradition tags for releases, however I think this is broken because you
> can't easily append more stuff to tags.

In git tags and branches are almost the same thing.
You can easily create a local branch off of a tag.

[dpdk-dev] [PATCH 1/5] vmxnet3: Fix VLAN Rx stripping

2014-10-22 Thread Stephen Hemminger

On Mon, 13 Oct 2014 18:42:18 +
Yong Wang  wrote:

> Are you referring to the patch as a whole or your comment is about the reset 
> of vlan_tci on the "else" (no vlan tags stripped) path?  I am not sure I get 
> your comments here.  This patch simply fixes a bug on the rx vlan stripping 
> path (where valid vlan_tci stripped is overwritten unconditionally later on 
> the rx path in the original vmxnet3 pmd driver). All the other pmd drivers 
> are doing the same thing in terms of translating descriptor status to 
> rte_mbuf flags for vlan stripping.

I was thinking that there are many fields in a pktmbuf, and rather than
setting them individually (like tci), the code should call the common
rte_pktmbuf_reset before setting the fields. That way, when someone adds a
field to mbuf, they don't have to go chasing through every driver that does
its own initialization.
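
A rough sketch of that reset-then-set pattern, using stand-in types rather than the real mbuf: `struct pkt_meta`, `pkt_meta_reset`, `rx_fill` and `FLAG_VLAN_STRIPPED` are all hypothetical names for illustration, and a real driver would call rte_pktmbuf_reset on an rte_mbuf instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down packet metadata; not the real rte_mbuf. */
struct pkt_meta {
	uint16_t data_len;
	uint16_t vlan_tci;
	uint32_t ol_flags;
};

#define FLAG_VLAN_STRIPPED 0x1u /* illustrative flag value */

/* The single place that knows the default for every field. */
static void
pkt_meta_reset(struct pkt_meta *m)
{
	memset(m, 0, sizeof(*m));
}

/* Driver RX path: reset first, then set only what the descriptor reports. */
static void
rx_fill(struct pkt_meta *m, uint16_t len, int vlan_stripped, uint16_t tci)
{
	assert(m != NULL);
	pkt_meta_reset(m);
	m->data_len = len;
	if (vlan_stripped) {
		m->ol_flags |= FLAG_VLAN_STRIPPED;
		m->vlan_tci = tci;
	}
}
```

A field added later only needs a default in the reset helper; the per-driver fill code stays unchanged unless the driver actually reports that field.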

[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015

2014-10-22 Thread Luke Gorrie

Hi Tim,

On 22 October 2014 15:48, O'driscoll, Tim  wrote:

> 2.0 (Q1 2015) DPDK Features:
> Bifurcated Driver: With the Bifurcated Driver, the kernel will retain
> direct control of the NIC, and will assign specific queue pairs to DPDK.
> Configuration of the NIC is controlled by the kernel via ethtool.
>

That sounds awesome and potentially really useful for other people writing
userspace data planes too. If I understand correctly, this way the messy
details can be contained in one place (kernel) and the application (DPDK
PMD or otherwise) will access the NIC TX/RX queue via the ABI defined in
the hardware data sheet.

> Single Virtio Driver: Merge existing Virtio drivers into a single
> implementation, incorporating the best features from each of the existing
> drivers.
>

Cool. Do you have a strategy in mind already for zero-copy optimisation
with VMDq? I have seen some patches floating around for this and it's an
area of active interest for myself and others. I see a lot of potential for
making this work more effectively with some modest extensions to Virtio and
guest behaviour, and would love to meet kindred spirits who are thinking
along these lines too.

[dpdk-dev] Possible bug in eal_pci pci_scan_one

2014-10-22 Thread Matthew Hall

Hi guys,

Could anybody comment on my observations about the PCI scan process and how to 
improve the NUMA compatibility, etc.?

The original bug contains a bogus commit history (fake source for commit, fake 
timestamp, no detail about what was done in the commit). I also don't have a 
list of subsystem maintainers for DPDK or subsystem mailing lists so no clue 
who might be able to assist with the issue I was seeing:

commit b6ea6408fbc784f174626aa6ea2fa49610b1f432
Author: Intel 
Date:   Mon Jun 3 00:00:00 2013 +

ethdev: store numa_node per device

Signed-off-by: Intel

I hope that future Intel commits don't come with a censored commit history... 
it isn't very friendly when you're trying to get help tracking down bugs and 
fixing stuff from the community side.

Thanks,
Matthew.

On Mon, Oct 13, 2014 at 10:47:12PM -0700, Matthew Hall wrote:
> Hi,
> 
> Did anybody get a chance to look what might be going on in this weird NUMA
> bug? I could use some help to understand how you're supposed to make code
> that will work right on both NUMA and non-NUMA. Otherwise it's hard to make
> a bulletproof DPDK based app that will be able to reliably init on single
> socket, dual socket non-NUMA, and dual socket NUMA boxes.
> 
> Thanks,
> Matthew.
> 
> On Mon, Oct 06, 2014 at 02:13:44AM -0700, Matthew Hall wrote:
> > Hi Guys,
> > 
> > I'm doing my development on kind of a cheap machine with no NUMA support...
> > but several years ago I used DPDK to build a NUMA box that could do 40
> > gbits bidirectional L4-L7 stateful traffic replay.
> >
> > So given the past experiences I had before, I wanted to clean the code up
> > so it'd work well if some crazy guy tried my code on one of these huge
> > boxes, too, but then I ran into some weird issues.
> >
> > 1) When I call rte_eth_dev_socket_id() I get back -1. But the call can
> > return -1 if the port_id is bogus or if pci_scan_one didn't get a numa_node
> > (because you're on a non-NUMA box for example).
> > 
> > int rte_eth_dev_socket_id(uint8_t port_id)
> > {
> > if (port_id >= nb_ports)
> > return -1;
> > return rte_eth_devices[port_id].pci_dev->numa_node;
> > }
> > 
> > So you couldn't tell the difference between non-NUMA or a bad port value,
> > etc.
> > 
> > 2) The code's behavior and comments disagree with one another. In the 
> > pci_scan_one function, there's this code:
> > 
> > /* get numa node */
> > snprintf(filename, sizeof(filename), "%s/numa_node",
> >  dirname);
> > if (access(filename, R_OK) != 0) {
> > /* if no NUMA support just set node to 0 */
> > dev->numa_node = -1;
> > } else {
> > if (eal_parse_sysfs_value(filename, &tmp) < 0) {
> > free(dev);
> > return -1;
> > }
> > dev->numa_node = tmp;
> > }
> > 
> > It says, just use NUMA node 0 if there is no NUMA support. But then it
> > proceeds to set the value to -1 in disagreement with the comment, also
> > stomping on the other meaning for -1 in the higher function
> > rte_eth_dev_socket_id.
> > 
> > 3) In conclusion, it seems like some stuff is missing... first there needs
> > to be a function that will tell you the number of NUMA nodes present on the
> > box so you can create the right number of mbuf_pools, but I couldn't find
> > that function.
> >
> > Then if you have the function, you can do some magic and shuffle the NICs
> > around to get them hooked to a core on the same NUMA, and the mbuf_pool on
> > the same NUMA.
> >
> > When NUMA is not present, can we return 0 instead of -1, or return a
> > specific error code that the client can use to know he should just use
> > Socket 0? Right now I can't tell apart any potential errors or weird values
> > from correct values.
> >
> > 4) I'm willing to help make and test some patches... but first I want to
> > understand what is happening with these funny functions before doing things
> > blindly.
> > 
> > Thanks,
> > Matthew.
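
One way to separate the two meanings of -1 discussed in this thread is to
return an error code and pass the socket back through a pointer. The sketch
below is only one possible design, not the DPDK API: demo_port_socket_id(),
demo_numa_node[] and DEMO_NUMA_UNKNOWN are hypothetical names, and the table
stands in for pci_dev->numa_node.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_PORTS 4
#define DEMO_NUMA_UNKNOWN (-1)   /* sysfs had no numa_node entry */

/* Hypothetical per-port NUMA table standing in for pci_dev->numa_node. */
static int demo_numa_node[DEMO_MAX_PORTS] = { 1, DEMO_NUMA_UNKNOWN, 0, 0 };
static unsigned demo_nb_ports = 3;

/*
 * Return 0 and store the socket in *socket, or -EINVAL for a bad port.
 * A port without NUMA information is reported as socket 0.
 */
static int demo_port_socket_id(unsigned port_id, int *socket)
{
	int node;

	if (port_id >= demo_nb_ports || socket == NULL)
		return -EINVAL;
	node = demo_numa_node[port_id];
	*socket = (node == DEMO_NUMA_UNKNOWN) ? 0 : node;
	return 0;
}
```

A caller can then treat a non-zero return as a real error and use the stored
socket directly for mbuf pool placement, including on non-NUMA boxes.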

[dpdk-dev] [PATCH] mk: fix build 32bits shared libs on 64bits system

2014-10-22 Thread Sergio Gonzalez Monroy

Building shared libraries for 32 bits on a 64-bit system fails with an
incompatible libraries error.
Fix the issue by passing CPU_CFLAGS to CC when LINK_USING_CC is enabled.

Signed-off-by: Sergio Gonzalez Monroy 
---
 mk/rte.lib.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index f458258..d83e808 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -61,7 +61,7 @@ exe2cmd = $(strip $(call dotfile,$(patsubst %,%.cmd,$(1

 ifeq ($(LINK_USING_CC),1)
 # Override the definition of LD here, since we're linking with CC
-LD := $(CC)
+LD := $(CC) $(CPU_CFLAGS)
 LD_MULDEFS := $(call linkerprefix,-z$(comma)muldefs)
 CPU_LDFLAGS := $(call linkerprefix,$(CPU_LDFLAGS))
 endif
-- 
1.9.3

[dpdk-dev] DPDK Features for Q1 2015

2014-10-22 Thread Thomas Monjalon

Thanks Tim for sharing your plan.
It's really helpful to improve community collaboration.

I'm sure it's going to generate some interesting discussions.
Please take care to discuss such announcements on the dev list only.
The announce at dpdk.org list is moderated to keep traffic low.

I would like to open discussion about a really important feature,
shown last week by John Fastabend and John Ronciak during LinuxCon:

> Bifurcated Driver: With the Bifurcated Driver, the kernel will retain
> direct control of the NIC, and will assign specific queue pairs to DPDK.
> Configuration of the NIC is controlled by the kernel via ethtool.

This design allows keeping the configuration code in one place: the kernel.
In the meantime, we are trying to add a lot of code to configure the NICs,
which looks to be a duplication of effort.
Why should we have two ways of configuring e.g. flow director?

Since you at Intel will be supporting your code, I am fine with duplication,
but I feel it's worth arguing why both should be available instead of one.

-- 
Thomas

[dpdk-dev] [PATCH v2 3/3] testpmd: Commands to test ctrl_pkt filter

2014-10-22 Thread Jingjing Wu

Add commands to test control packet filter
 - add/delete control packet filter

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c | 149 +
 1 file changed, 149 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0b972f9..78a73ac 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -660,6 +660,11 @@ static void cmd_help_long_parsed(void *parsed_result,

"get_flex_filter (port_id) index (idx)\n"
"get info of a flex filter.\n\n"
+
+   "ctrl_pkt_filter (port_id) (add|del)"
+   " mac_addr (mac_address) ethertype (ether_type)"
+   " (none|options) queue (queue_id)\n"
+   "Add/Del a control packet filter.\n\n"
);
}
 }
@@ -7414,6 +7419,149 @@ cmdline_parse_inst_t cmd_get_flex_filter = {
NULL,
},
 };
+/* *** Filters Control *** */
+
+/* *** deal with control packet filter *** */
+struct cmd_ctrl_pkt_filter_result {
+   cmdline_fixed_string_t filter;
+   uint8_t port_id;
+   cmdline_fixed_string_t ops;
+   cmdline_fixed_string_t mac_addr;
+   struct ether_addr mac_addr_value;
+   cmdline_fixed_string_t ethertype;
+   uint16_t ethertype_value;
+   cmdline_fixed_string_t options;
+   cmdline_fixed_string_t queue;
+   uint16_t  queue_id;
+};
+
+cmdline_parse_token_string_t cmd_ctrl_pkt_filter_filter =
+   TOKEN_STRING_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+filter, "ctrl_pkt_filter");
+cmdline_parse_token_num_t cmd_ctrl_pkt_filter_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+ port_id, UINT8);
+cmdline_parse_token_string_t cmd_ctrl_pkt_filter_ops =
+   TOKEN_STRING_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+ops, "add#del");
+cmdline_parse_token_string_t cmd_ctrl_pkt_filter_mac_addr =
+   TOKEN_STRING_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+mac_addr, "mac_addr");
+cmdline_parse_token_etheraddr_t cmd_ctrl_pkt_filter_mac_addr_value =
+   TOKEN_ETHERADDR_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+mac_addr_value);
+cmdline_parse_token_string_t cmd_ctrl_pkt_filter_ethertype =
+   TOKEN_STRING_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+ethertype, "ethertype");
+cmdline_parse_token_num_t cmd_ctrl_pkt_filter_ethertype_value =
+   TOKEN_NUM_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+ ethertype_value, UINT16);
+cmdline_parse_token_string_t cmd_ctrl_pkt_filter_options =
+   TOKEN_STRING_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+options, NULL);
+cmdline_parse_token_string_t cmd_ctrl_pkt_filter_queue =
+   TOKEN_STRING_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+queue, "queue");
+cmdline_parse_token_num_t cmd_ctrl_pkt_filter_queue_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_ctrl_pkt_filter_result,
+ queue_id, UINT16);
+
+static inline int
+parse_ctrl_pkt_filter_options(const char *q_arg,
+uint16_t *flags)
+{
+#define MAX_NUM_OPTIONS 3
+   char s[256];
+   char *str_fld[MAX_NUM_OPTIONS];
+   int i;
+   int num_options = -1;
+   unsigned size;
+
+   *flags = 0;
+   if (!strcmp(q_arg, "none"))
+   return 0;
+
+   size = strnlen(q_arg, sizeof(s));
+   snprintf(s, sizeof(s), "%.*s", size, q_arg);
+   num_options = rte_strsplit(s, sizeof(s), str_fld, MAX_NUM_OPTIONS, '-');
+   /* multi-options are combined by - */
+   if (num_options < 0 || num_options > MAX_NUM_OPTIONS)
+   return -1;
+   for (i = 0; i < num_options; i++) {
+   if (!strcmp(str_fld[i], "tx"))
+   *flags |= RTE_CONTROL_PACKET_FLAGS_TX;
+   if (!strcmp(str_fld[i], "mac_ignr"))
+   *flags |= RTE_CONTROL_PACKET_FLAGS_IGNORE_MAC;
+   if (!strcmp(str_fld[i], "drop"))
+   *flags |= RTE_CONTROL_PACKET_FLAGS_DROP;
+   }
+   return num_options;
+}
+
+static void
+cmd_ctrl_pkt_filter_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_ctrl_pkt_filter_result *res = parsed_result;
+   struct rte_ctrl_pkt_filter filter;
+   int ret = 0;
+
+   ret = rte_eth_dev_filter_supported(res->port_id, RTE_ETH_FILTER_CTRL_PKT);
+   if (ret < 0) {
+   printf("control packet filter is not supported on port %u.\n",
+   res->port_id);
+   return;
+   }
+
+   memset(, 0,
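
The option parsing in this patch (multiple options joined by '-') can be
sketched standalone as follows. This is a hedged illustration, not the testpmd
code: demo_parse_options() and demo_tok_eq() are hypothetical names, the flag
values are the ones defined in patch 1/3 of this series, and unlike the patch
this sketch rejects unknown option names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Flag values as defined in patch 1/3 of this series. */
#define RTE_CONTROL_PACKET_FLAGS_IGNORE_MAC 0x0001
#define RTE_CONTROL_PACKET_FLAGS_DROP       0x0002
#define RTE_CONTROL_PACKET_FLAGS_TX         0x0008

/* Return 1 if the token [tok, tok + len) equals the C string name. */
static int demo_tok_eq(const char *tok, size_t len, const char *name)
{
	return strlen(name) == len && memcmp(tok, name, len) == 0;
}

/*
 * Parse "none" or a '-'-separated list such as "tx-mac_ignr-drop".
 * Returns the number of options parsed, or -1 on an unknown option.
 */
static int demo_parse_options(const char *arg, uint16_t *flags)
{
	int n = 0;

	*flags = 0;
	if (strcmp(arg, "none") == 0)
		return 0;

	while (*arg != '\0') {
		const char *end = strchr(arg, '-');
		size_t len = end ? (size_t)(end - arg) : strlen(arg);

		if (demo_tok_eq(arg, len, "tx"))
			*flags |= RTE_CONTROL_PACKET_FLAGS_TX;
		else if (demo_tok_eq(arg, len, "mac_ignr"))
			*flags |= RTE_CONTROL_PACKET_FLAGS_IGNORE_MAC;
		else if (demo_tok_eq(arg, len, "drop"))
			*flags |= RTE_CONTROL_PACKET_FLAGS_DROP;
		else
			return -1;   /* assumption: unknown tokens are errors */
		n++;
		arg += len + (end ? 1 : 0);   /* skip the token and its '-' */
	}
	return n;
}
```

Scanning the argument in place avoids the temporary copy and the fixed limit
on the number of fields; whether unknown names should be an error is a design
choice left open by the patch.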

[dpdk-dev] [PATCH v2 2/3] i40e: ctrl_pkt filter implementation in i40e pmd driver

2014-10-22 Thread Jingjing Wu

Implement the control packet filter, supporting add and delete operations.
It can assign packets to a specific queue or VSI by filtering on MAC
address and ethertype, or on ethertype only, in both RX and TX directions.

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 138 +-
 1 file changed, 136 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 3b75f0f..943b01a 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -186,6 +186,12 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
+static int i40e_ctrl_pkt_filter_set(struct i40e_pf *pf,
+   struct rte_ctrl_pkt_filter *cp_filter,
+   bool add);
+static int i40e_ctrl_pkt_filter_handle(struct i40e_pf *pf,
+   enum rte_filter_op filter_op,
+   void *arg);
 static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
enum rte_filter_type filter_type,
enum rte_filter_op filter_op,
@@ -4145,20 +4151,148 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf)
return 0;
 }

+/* Look up vsi by vsi_id */
+static struct i40e_vsi *
+i40e_vsi_lookup_by_id(struct i40e_vsi *uplink_vsi, uint16_t id)
+{
+   struct i40e_vsi *vsi = NULL;
+   struct i40e_vsi_list *vsi_list;
+
+   if (uplink_vsi == NULL)
+   return NULL;
+
+   if (uplink_vsi->vsi_id == id)
+   return uplink_vsi;
+
+   /* if VSI has child to attach*/
+   if (uplink_vsi->veb) {
+   TAILQ_FOREACH(vsi_list, &uplink_vsi->veb->head, list) {
+   vsi = i40e_vsi_lookup_by_id(vsi_list->vsi, id);
+   if (vsi)
+   return vsi;
+   }
+   }
+   return NULL;
+}
+
+/*
+ * Configure control packet filter, which can direct packets by filtering
+ * with mac address and ether_type or only ether_type
+ */
+static int
+i40e_ctrl_pkt_filter_set(struct i40e_pf *pf,
+   struct rte_ctrl_pkt_filter *cp_filter,
+   bool add)
+{
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   struct i40e_control_filter_stats stats;
+   struct i40e_vsi *vsi = NULL;
+   uint16_t seid;
+   uint16_t flags = 0;
+   int ret;
+
+   if (cp_filter->ether_type == ETHER_TYPE_IPv4 ||
+   cp_filter->ether_type == ETHER_TYPE_IPv6) {
+   PMD_DRV_LOG(ERR, "unsupported ether_type(0x%04x) in"
+   " control packet filter.", cp_filter->ether_type);
+   return -EINVAL;
+   }
+   if (cp_filter->ether_type == ETHER_TYPE_VLAN)
+   PMD_DRV_LOG(WARNING, "filter vlan ether_type in first tag is"
+   " not supported.");
+
+   if (cp_filter->dest_id == 0)
+   /* Use LAN VSI Id if not programmed by user */
+   vsi = pf->main_vsi;
+   else {
+   vsi = i40e_vsi_lookup_by_id(pf->main_vsi, cp_filter->dest_id);
+   if (vsi == NULL || vsi->type == I40E_VSI_FDIR) {
+   PMD_DRV_LOG(ERR, "VSI arg is invalid\n");
+   return -EINVAL;
+   }
+   }
+
+   seid = vsi->seid;
+   memset(&stats, 0, sizeof(stats));
+
+   if (cp_filter->flags & RTE_CONTROL_PACKET_FLAGS_TX)
+   flags |= I40E_AQC_ADD_CONTROL_PACKET_FLAGS_TX;
+   if (cp_filter->flags & RTE_CONTROL_PACKET_FLAGS_IGNORE_MAC)
+   flags |= I40E_AQC_ADD_CONTROL_PACKET_FLAGS_IGNORE_MAC;
+   if (cp_filter->flags & RTE_CONTROL_PACKET_FLAGS_TO_QUEUE)
+   flags |= I40E_AQC_ADD_CONTROL_PACKET_FLAGS_TO_QUEUE;
+   if (cp_filter->flags & RTE_CONTROL_PACKET_FLAGS_DROP)
+   flags |= I40E_AQC_ADD_CONTROL_PACKET_FLAGS_DROP;
+
+   ret = i40e_aq_add_rem_control_packet_filter(hw,
+   cp_filter->mac_addr.addr_bytes,
+   cp_filter->ether_type, flags,
+   seid, cp_filter->queue, add, &stats, NULL);
+
+   PMD_DRV_LOG(INFO, "add/rem control packet filter, return %d,"
+" mac_etype_used = %u, etype_used = %u,"
+" mac_etype_free = %u, etype_free = %u\n",
+ret, stats.mac_etype_used, stats.etype_used,
+stats.mac_etype_free, stats.etype_free);
+   if (ret < 0)
+   return -ENOSYS;
+   return 0;
+}
+
+/*
+ * Handle operations for control packet filter type.
+ */
+static int
+i40e_ctrl_pkt_filter_handle(struct i40e_pf *pf,
+   enum rte_filter_op filter_op,
+   void

[dpdk-dev] [PATCH v2 1/3] ethdev: define ctrl_pkt filter type and its structure

2014-10-22 Thread Jingjing Wu

Define a new filter type and its structure
 - RTE_ETH_FILTER_CTRL_PKT
 - struct rte_ctrl_pkt_filter

Signed-off-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 24 
 1 file changed, 24 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index df21ac6..8d3dba9 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -51,6 +51,7 @@ extern "C" {
  */
 enum rte_filter_type {
RTE_ETH_FILTER_NONE = 0,
+   RTE_ETH_FILTER_CTRL_PKT,
RTE_ETH_FILTER_MAX
 };

@@ -71,6 +72,29 @@ enum rte_filter_op {
RTE_ETH_FILTER_OP_MAX
 };

+/**
+ * Define all structures for Control Packet Filter type corresponding with specific operations.
+ */
+
+#define RTE_CONTROL_PACKET_FLAGS_IGNORE_MAC  0x0001
+#define RTE_CONTROL_PACKET_FLAGS_DROP        0x0002
+#define RTE_CONTROL_PACKET_FLAGS_TO_QUEUE    0x0004
+#define RTE_CONTROL_PACKET_FLAGS_TX          0x0008
+#define RTE_CONTROL_PACKET_FLAGS_RX0x
+
+/**
+ * A structure used to define the control packet filter entry
+ * to support RTE_ETH_FILTER_CTRL_PKT with RTE_ETH_FILTER_ADD
+ * and RTE_ETH_FILTER_DELETE operations.
+ */
+struct rte_ctrl_pkt_filter {
+   struct ether_addr mac_addr;   /**< mac address to match. */
+   uint16_t ether_type;  /**< ether type to match */
+   uint16_t flags;   /**< options for filter's behavior*/
+   uint16_t dest_id; /**< destination vsi id or pool id*/
+   uint16_t queue;   /**< queue assign to if TO QUEUE flag is set */
+};
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.8.1.4

[dpdk-dev] [PATCH v2 0/3] support control packet filter on fortville

2014-10-22 Thread Jingjing Wu

The patch set enables control packet filter on Fortville.
Control packet filter can assign packet to specific destination
by filtering with mac address and ethertype or only ethertype.

v2 changes:
 - strip the filter APIs definitions from this patch set

Jingjing Wu (3):
  ethdev: define ctrl_pkt filter type and its structure
  i40e: ctrl_pkt filter implementation in i40e pmd driver
  testpmd: Commands to test ctrl_pkt filter

 app/test-pmd/cmdline.c| 149 ++
 lib/librte_ether/rte_eth_ctrl.h   |  24 ++
 lib/librte_pmd_i40e/i40e_ethdev.c | 138 ++-
 3 files changed, 309 insertions(+), 2 deletions(-)

-- 
1.8.1.4

[dpdk-dev] [PATCH v3 0/6] Update libs build process

2014-10-22 Thread Gonzalez Monroy, Sergio

Dropping patch set.

Thanks,
Sergio

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Gonzalez Monroy,
> Sergio
> Sent: Tuesday, October 21, 2014 10:44 AM
> To: Thomas Monjalon
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/6] Update libs build process
> 
> Hi Thomas,
> 
> Given that most of the comments/discussion for this patch set revolved
> around the removal of COMBINE_LIBS and what libs to build by default, I am
> inclined to drop this patch set, submit minimal patch to fix compiler errors
> (initial and main purpose of this patch set) and then submit an RFC regarding
> the use/removal of COMBINE_LIBS and other outstanding issues in the build
> system.
> 
> Does that sound like a better approach?
> 
> Thanks,
> Sergio
>

[dpdk-dev] [PATCH v2] bond: disabling broadcast mode when dpdk is built without RTE_MBUF_REFCNT

2014-10-22 Thread Thomas Monjalon

> --V2: Adds warning message to makefile, to notify user of disabling of
> broadcast mode
>
> Link bonding broadcast mode requires refcnt parameter in the mbuf struct to
> allow efficient transmission of duplicated mbufs on slave ports.
>
> This patch disables broadcast mode when the compilation option
> RTE_MBUF_REFCNT is disabled to allow clean building of the bonding library
> 
> 
> Signed-off-by: Declan Doherty 

Acked-by: Thomas Monjalon 

Applied

Thanks
-- 
Thomas

[dpdk-dev] DPDK Features for Q1 2015

2014-10-22 Thread Zhu, Heqing



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liang, Cunming
> Sent: Wednesday, October 22, 2014 8:06 AM
> To: Zhou, Danny; Thomas Monjalon; O'driscoll, Tim
> Cc: dev at dpdk.org; Fastabend, John R; Ronciak, John
> Subject: Re: [dpdk-dev] DPDK Features for Q1 2015
> 
> > >
> > > This design allows to keep the configuration code in one place: the
> kernel.
> > > In the meantime, we are trying to add a lot of code to configure the
> > > NICs, which looks to be a duplication of effort.
> > > Why should we have two ways of configuring e.g. flow director?

[heqing] There will be multiple choices for the DPDK usage model if/after this
feature is available; the customer can choose DPDK with or without the
bifurcated driver.

> [Liang, Cunming] The HW sometimes provides additional ability than existing
> abstraction API.
> On that time(HW ability is a superset to the abstraction wrapper, e.g. flow
> director), we need to provide another choice.
> Ethtools is good, but can't apply anything supported in NIC.
> Bifurcated driver considers a lot on reusing existing rx/tx routine.
> We'll send RFC patch soon if kernel patch moving fast.
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhou, Danny
> > Sent: Wednesday, October 22, 2014 10:44 PM
> > To: Thomas Monjalon; O'driscoll, Tim
> > Cc: dev at dpdk.org; Fastabend, John R; Ronciak, John
> > Subject: Re: [dpdk-dev] DPDK Features for Q1 2015
> >
> > Thomas,
> >
> > In terms of the bifurcated driver, it is actually the same thing.
> > Specifically, the bifurcated driver PMD in DPDK depends on kernel
> > code(af_packet and 10G/40G NIC) changes. Once the kernel patches are
> > upstreamed, the corresponding DPDK PMDs patches will be submitted to
> > dpdk.org. John Fastabend and John Ronciak are working with very
> > closely to achieve the same goal.
> >
> > -Danny
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas
> Monjalon
> > > Sent: Wednesday, October 22, 2014 10:21 PM
> > > To: O'driscoll, Tim
> > > Cc: dev at dpdk.org; Fastabend, John R; Ronciak, John
> > > Subject: Re: [dpdk-dev] DPDK Features for Q1 2015
> > >
> > > Thanks Tim for sharing your plan.
> > > It's really helpful to improve community collaboration.
> > >
> > > I'm sure it's going to generate some interesting discussions.
> > > Please take care to discuss such announce on dev list only.
> > > The announce at dpdk.org list is moderated to keep a low traffic.
> > >
> > > I would like to open discussion about a really important feature,
> > > showed last week by John Fastabend and John Ronciak during LinuxCon:
> > >
> > > > Bifurcated Driver: With the Bifurcated Driver, the kernel will
> > > > retain direct control of the NIC, and will assign specific queue pairs 
> > > > to
> DPDK.
> > > > Configuration of the NIC is controlled by the kernel via ethtool.
> > >
> > > This design allows to keep the configuration code in one place: the
> kernel.
> > > In the meantime, we are trying to add a lot of code to configure the
> > > NICs, which looks to be a duplication of effort.
> > > Why should we have two ways of configuring e.g. flow director?
> > >
> > > Since you at Intel, you'll be supporting your code, I am fine for
> > > duplication, but I feel it's worth arguing why both should be available
> instead of one.
> > >
> > > --
> > > Thomas

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-22 Thread Thomas Monjalon

2014-10-22 12:47, Liu, Jijiang:
> > > Currently, A PF  associated to a port, right? What tunnel type should
> > > be supported in a PF, which is required we configure it.
> > > Tunneling packet is encapsulation packet, in terms of VxLAN, packet
> > > format is outer L2 header+ outer L3 header +outer L4 header +
> > > tunneling header+ inner L2 header + inner L3 header + inner L4 header 
> > > +PAY4.
> > > For a VM/VF, the  real useful packet data is "inner L2 header + inner
> > > L3 header + inner L4 header +PAY4".
> > >
> > > In NIC, A port/PF receive this kind of tunneling packet(outer
> > > L2+...PAY4), software should be responsible for decapsulating the
> > > packet and deliver real data(innerL2 + PAY4) to VM/VF?
> > >
> > > DPDK just provide API/mechanism to guarantee a PF/port to receive the
> > > tunneling packet data, the encapsulation/ decapsulation work should be
> > > done by user application.
> > 
> > You mean that all packets received on the PF which doesn't match the
> > configured tunnel type, won't be received by the software?
> 
> No, it will be received by the software.
> In terms of VxLAN packet, if tunnel type is not configured VXLAN type,
> and software can't use API to configure the UDP destination number. 
> Even if the incoming packet format is VXLAN packet format, hardware and
> software don't think it is VXLAN packet because we didn't configure UDP
> Destination port. 
> 
> Now I want to remove this limitation,  even if the  tunnel type is not
> configured at PF initialization phase, user also can configure the VxLAN
> UDP destination number.
> It is more flexible and reasonable.
> 
> > Other question, why a port is associated with only one tunnel type?
> 
> Good question. Now I think we had better remove this limitation because it is 
> NIC related.
> 
> Two points are summarized here.
> 1. The tunnel types is for a whole port/PF, I have explained it a lots.
> 2. I will remove tunnel type configuration from rte_eth_conf structure.
> 
> Any comments?

Honestly, I haven't understood your explanation :)
I just understood that you want to remove tunnel type from rte_eth_conf
and I think it's a really good thing.

Thanks
-- 
Thomas
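
The VxLAN layout quoted above (outer L2 + outer L3 + outer L4 + tunneling
header, then the inner frame) can be turned into a small offset calculation.
This is a sketch under stated assumptions: the outer headers are untagged
Ethernet, IPv4 without options, UDP and the 8-byte VXLAN header, and the
demo_* names are hypothetical, not part of the patch set.

```c
#include <assert.h>
#include <stddef.h>

/* Standard header sizes in bytes (no VLAN tag, no IPv4 options). */
enum {
	DEMO_ETH_HDR_LEN   = 14,
	DEMO_IPV4_HDR_LEN  = 20,
	DEMO_UDP_HDR_LEN   = 8,
	DEMO_VXLAN_HDR_LEN = 8
};

/* Offset of the inner L2 header from the start of the outer frame. */
static size_t demo_vxlan_inner_offset(void)
{
	return DEMO_ETH_HDR_LEN + DEMO_IPV4_HDR_LEN +
	       DEMO_UDP_HDR_LEN + DEMO_VXLAN_HDR_LEN;
}

/*
 * Return a pointer to the inner Ethernet frame, or NULL if the packet is
 * too short to hold the outer headers plus an inner Ethernet header.
 */
static const unsigned char *
demo_vxlan_inner_frame(const unsigned char *pkt, size_t len, size_t *inner_len)
{
	size_t off = demo_vxlan_inner_offset();

	if (pkt == NULL || inner_len == NULL || len < off + DEMO_ETH_HDR_LEN)
		return NULL;
	*inner_len = len - off;
	return pkt + off;
}
```

Under these assumptions the inner frame starts 50 bytes into the packet, which
is the part an application would hand to the VM/VF after decapsulation.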

[dpdk-dev] Why do we need iommu=pt?

2014-10-22 Thread Zhou, Danny

I echo Cunming: we did not see an obvious performance impact when iommu=pt is
used, regardless of whether igb_uio or VFIO is used.

Alex,
The map and unmap operations for each egress/ingress packet are done by hw
rather than sw, so the performance impact on DPDK should be minimal in my mind.
If it actually impacts perf, say on a 100G NIC, I am sure it will be resolved
in next-generation Intel silicon. We will be performing some performance tests
with iommu=on to see whether there is any performance degradation. I cannot
share the detailed performance results here with the community, but I can tell
you whether it really has a negative performance impact on DPDK.
Please stay tuned.

> -Original Message-
> From: Liang, Cunming
> Sent: Wednesday, October 22, 2014 4:53 PM
> To: alex; Zhou, Danny
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] Why do we need iommu=pt?
> 
> I think it's a good point using dma_addr rather than phys_addr.
> Without iommu, the value of them are the same.
> With iommu, the dma_addr value equal to the iova.
> It's not all for DPDK working with iommu but not pass through.
> 
> We know each iova belongs to one iommu domain.
> And each device can attach to one domain.
> It means the iova will have coupling relationship with domain/device.
> 
> Looking back to DPDK descriptor ring, it's all right, already coupling with 
> device.
> But if for mbuf mempool, in most cases, it's shared by multiple ports.
> So if keeping the way, all those ports/device need to put into the same iommu 
> domain.
> And the mempool has attach to specific domain, but not just the device.
> On this time, iommu domain no longer be transparent in DPDK.
> Vfio provide the verbs to control domain, we still need library to manager 
> such domain with mempool.
> 
> All that overhead just make DPDK works with iommu in host, but remember pt 
> always works.
> The isolation of devices mainly for security concern.
> If it's not necessary, pt definitely is a good choice without performance 
> impact.
> 
> For those self-implemented PMD using the DMA kernel interface to set up its 
> mappings appropriately.
> It don't require "iommu=pt". The default option "iommu=on" also works.
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of alex
> > Sent: Wednesday, October 22, 2014 3:36 PM
> > To: Zhou, Danny
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] Why do we need iommu=pt?
> >
> > Shiva.
> > The cost of disabling iommu=pt when intel_iommu=on is dire. DPDK won't work
> > as the RX/TX descriptors will be useless.
> > Any DMA access by the device will be dropped as no DMA mapping will exist.
> >
> > Danny.
> > The IOMMU hurts performance in kernel drivers which perform a map and unmap
> > operation for each e/ingress packet.
> > The costs of unmapping when under strict protection limit a +10Gb to 3Gb
> > with cpu maxed out at 100%. DPDK apps shouldn't feel any difference IFF the
> > rx descriptors contain iova and not real physical addresses which are used
> > currently.
> >
> >
> > On Tue, Oct 21, 2014 at 10:10 PM, Zhou, Danny  
> > wrote:
> >
> > > IMHO, whether memory protection with IOMMU is needed really depends on
> > > how you use and deploy your DPDK based applications. For Telco network
> > > middle boxes, which adopt a "closed model" solution to achieve extremely
> > > high performance, the entire system including HW and software in kernel
> > > and userspace is controlled by Telco vendors and assumed trustworthy, so
> > > memory protection is not so important. Datacenters, by contrast,
> > > generally adopt an "open model" solution that allows running user space
> > > applications (e.g. tenant applications and VMs) which could directly
> > > access the NIC and the DMA engine inside the NIC using a modified DPDK
> > > PMD. These are not trustworthy, as they can potentially DMA to/from
> > > arbitrary memory regions using physical addresses, so IOMMU is needed to
> > > provide strict memory protection, at the cost of a negative performance
> > > impact.
> > >
> > > So if you want to seek high performance, disable IOMMU in BIOS or OS. And
> > > if security is a major concern, turn it on and trade off between
> > > performance and security. But I do NOT think it comes with an extremely
> > > high performance cost according to our performance measurement, though it
> > > is probably true for a 100G NIC.
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shivapriya 
> > > > Hiremath
> > > > Sent: Wednesday, October 22, 2014 12:54 AM
> > > > To: Alex Markuze
> > > > Cc: dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] Why do we need iommu=pt?
> > > >
> > > > Hi,
> > > >
> > > > Thank you for all the replies.
> > > > I am trying to understand the impact of this on DPDK. What will be the
> > > > repercussions of disabling "iommu=pt" on the DPDK performance?
> > > >
> > > >
> > > > On Tue, Oct 21, 2014 at 12:32 AM, Alex Markuze  wrote:
> > > >
> > > > >

[dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud Images

2014-10-22 Thread Gonzalez Monroy, Sergio

> -Original Message-
> From: Matthew Hall [mailto:mhall at mhcomputing.net]
> Subject: Re: [dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud
> Images
> 
> > - we do not want to build against static DPDK libraries as this would
> > result in duplicated code in librte_pmd_virtio and other apps (ie.
> > testpmd)
> >
> > - we want to link against shared DPDK libs to add dependencies and
> > provide reliable information (ie. ldd)
> 
> OK... but now it's impossible to use librte_pmd_virtio w/o mandatory shared
> library performance loss. I strongly dislike being forced to do this.
> 
You are not forced to use shared libraries. This module loads successfully with
an app (testpmd) built against static DPDK libs.

> OK... let me try to clarify this point again. In this official DPDK supported
> device document, http://www.dpdk.org/doc/nics , it says:
> 
> Paravirtualization
> virtio-net or virtio-net + uio (QEMU, VirtualBox)
> 
> As I've stated, when testing this on VirtualBox it does not work for me and
> gets into an infinite initialization loop which I documented in my last mail.
> But the same code works fine if it's using the VBox Intel 82545EM VNIC and
> appropriate driver. Also the VBox virtio-net device works completely fine
> using the kernel virtio-net driver. This making the virtio PMD's the most 
> likely
> suspect, especially since the UIO based one can't init itself, and the non UIO
> one gets stuck in the loop.
> 
I have reproduced this issue in VirtualBox:
- For the UIO Virtio PMD, there is an issue with the igb_uio module and the
  virtio vbox backend device: I fail to bind the igb_uio driver to the virtio
  device.
- For the non-UIO Virtio PMD, the module fails to initialize properly, as you
  indicated in your previous post (stuck in a loop).

I get this behavior with testpmd regardless of DPDK being built as static or
shared.

> > > EAL: open shared lib
> > > /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so
> > > EAL: /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so:
> > > undefined
> > > symbol: per_lcore__lcore_id
> > >
> > Are we talking about a DPDK or custom app?
> > Do you only see the issue when CONFIG_RTE_BUILD_COMBINE_LIBS=y?
> 
> Issue happens in my DPDK based app.
> 
> Can happen anytime you use static linked DPDK app w/ the
> librte_pmd_virtio.
> Because the link process of librte_pmd_virtio is broken.
> 
The linking is not broken if we are assuming apps built against static DPDK
libs.
I can't think of another way of linking this module for use in apps built
with static DPDK libs.

> > > Running nm and nm -D shows this:
> > >
> > > $ nm librte_pmd_virtio.so | fgrep -i per_lcore__lcore_id
> > >          U per_lcore__lcore_id
> > >
> > This is expected behavior as the symbol is defined in librte_eal.
> > The dynamic linker will resolve the undefined reference when loading the
> > module at run time.
> 
> I am aware it's "expected behavior". But having the undefined symbol, with no
> dependency upon the DPDK .so and no link against the DPDK .a, is NOT
> "expected behavior". It will break anytime you try to make a static app with
> this PMD available.
> 
Have you built static DPDK libs and run testpmd?

Your undefined symbol error is most likely because the symbol is not in the
dynamic symbol table of your app.
You need to pass -rdynamic to GCC or -export-dynamic to LD when building your
app.

Thanks,
Sergio

> Matthew.

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-22 Thread Richardson, Bruce



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Wednesday, October 22, 2014 3:53 PM
> To: Neil Horman; Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> cycles/packet
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > Sent: Wednesday, October 22, 2014 3:03 PM
> > To: Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> cycles/packet
> >
> > On Tue, Oct 21, 2014 at 01:17:01PM +, Liang, Cunming wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > Sent: Tuesday, October 21, 2014 6:33 PM
> > > > To: Liang, Cunming
> > > > Cc: dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > > > cycles/packet
> > > >
> > > > On Sun, Oct 12, 2014 at 11:10:39AM +, Liang, Cunming wrote:
> > > > > Hi Neil,
> > > > >
> > > > > Very appreciate your comments.
> > > > > I add inline reply, will send v3 asap when we get alignment.
> > > > >
> > > > > BRs,
> > > > > Liang Cunming
> > > > >
> > > > > > -Original Message-
> > > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > > > Sent: Saturday, October 11, 2014 1:52 AM
> > > > > > To: Liang, Cunming
> > > > > > Cc: dev at dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx 
> > > > > > and tx
> > > > cycles/packet
> > > > > >
<...snip...>
> > > > > >
> > > > > > > + printf("Force Stop!\n");
> > > > > > > + stop = 1;
> > > > > > > + }
> > > > > > > + if (signum == SIGUSR2)
> > > > > > > + stats_display(0);
> > > > > > > +}
> > > > > > > +/* main processing loop */
> > > > > > > +static int
> > > > > > > +main_loop(__rte_unused void *args)
> > > > > > > +{
> > > > > > > +#define PACKET_SIZE 64
> > > > > > > +#define FRAME_GAP 12
> > > > > > > +#define MAC_PREAMBLE 8
> > > > > > > + struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> > > > > > > + unsigned lcore_id;
> > > > > > > + unsigned i, portid, nb_rx = 0, nb_tx = 0;
> > > > > > > + struct lcore_conf *conf;
> > > > > > > + uint64_t prev_tsc, cur_tsc;
> > > > > > > + int pkt_per_port;
> > > > > > > + uint64_t packets_per_second, total_packets;
> > > > > > > +
> > > > > > > + lcore_id = rte_lcore_id();
> > > > > > > + conf = _conf[lcore_id];
> > > > > > > + if (conf->status != LCORE_USED)
> > > > > > > + return 0;
> > > > > > > +
> > > > > > > + pkt_per_port =  MAX_TRAFIC_BURST / conf->nb_ports;
> > > > > > > +
> > > > > > > + int idx = 0;
> > > > > > > + for (i = 0; i < conf->nb_ports; i++) {
> > > > > > > + int num = pkt_per_port;
> > > > > > > + portid = conf->portlist[i];
> > > > > > > + printf("inject %d packet to port %d\n", num, portid);
> > > > > > > + while (num) {
> > > > > > > + nb_tx = RTE_MIN(MAX_PKT_BURST, num);
> > > > > > > + nb_tx = rte_eth_tx_burst(portid, 0,
> > > > > > > + _burst[idx], nb_tx);
> > > > > > > + num -= nb_tx;
> > > > > > > + idx += nb_tx;
> > > > > > > + }
> > > > > > > + }
> > > > > > > + printf("Total packets inject to prime ports = %u\n", idx);
> > > > > > > +
> > > > > > > + packets_per_second = (link_mbps * 1000 * 1000) /
> > > > > > > + +((PACKET_SIZE + FRAME_GAP + MAC_PREAMBLE) *
> CHAR_BIT);
> > > > > > > + printf("Each port will do %"PRIu64" packets per second\n",
> > > > > > > + +packets_per_second);
> > > > > > > +
> > > > > > > + total_packets = RTE_TEST_DURATION * conf->nb_ports *
> > > > > > packets_per_second;
> > > > > > > + printf("Test will stop after at least %"PRIu64" packets
> received\n",
> > > > > > > + + total_packets);
> > > > > > > +
> > > > > > > + prev_tsc = rte_rdtsc();
> > > > > > > +
> > > > > > > + while (likely(!stop)) {
> > > > > > > + for (i = 0; i < conf->nb_ports; i++) {
> > > > > > > + portid = conf->portlist[i];
> > > > > > > + nb_rx = rte_eth_rx_burst((uint8_t) portid, 0,
> > > > > > > +  pkts_burst,
> MAX_PKT_BURST);
> > > > > > > + if (unlikely(nb_rx == 0)) {
> > > > > > > + idle++;
> > > > > > > + continue;
> > > > > > > + }
> > > > > > > +
> > > > > > > + count += nb_rx;
> > > > > > Doesn't take into consideration error conditions.  rte_eth_rx_burst
> > > > > > can return -ENOTSUP
> > > > > [Liang, Cunming] It returns -ENOTSUP only when the ETHDEV_DEBUG config
> > > > > option is turned on.
> > > > > The error guards against a missing function callback registration.
> > > > > When ETHDEV_DEBUG is turned off, calling the NULL function pointer
> > > > > causes a segfault directly.
> > > > > So
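
For reference, the line-rate arithmetic in the quoted main_loop() can be
reproduced in isolation. The sketch below is not part of the patch:
line_rate_pps is a hypothetical helper name, and the link speeds in the usage
note are assumed example values. It reuses the constants from the quoted code
(64-byte packet, 12-byte inter-frame gap, 8-byte preamble).

```c
#include <limits.h>
#include <stdint.h>

/* Constants copied from the quoted main_loop(). */
#define PACKET_SIZE 64
#define FRAME_GAP 12
#define MAC_PREAMBLE 8

/* Hypothetical helper: packets per second at line rate for one port. */
static uint64_t
line_rate_pps(uint64_t link_mbps)
{
	/* Bits on the wire per packet, including gap and preamble. */
	uint64_t bits_per_packet =
		(uint64_t)(PACKET_SIZE + FRAME_GAP + MAC_PREAMBLE) * CHAR_BIT;

	return (link_mbps * 1000 * 1000) / bits_per_packet;
}
```

Each 64-byte packet occupies 84 bytes (672 bits) on the wire, so for an assumed
10G link line_rate_pps(10000) is 10000000000 / 672 = 14880952 packets per
second (integer division).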

[dpdk-dev] DPDK Features for Q1 2015

2014-10-22 Thread Liang, Cunming

> >
> > This design allows to keep the configuration code in one place: the kernel.
> > In the meantime, we are trying to add a lot of code to configure the NICs,
> > which looks to be a duplication of effort.
> > Why should we have two ways of configuring e.g. flow director?
[Liang, Cunming] The HW sometimes provides more capability than the existing
abstraction API exposes.
In that case (the HW capability is a superset of the abstraction wrapper, e.g.
flow director), we need to provide another choice.
Ethtool is good, but it can't configure everything the NIC supports.
The bifurcated driver focuses largely on reusing the existing rx/tx routines.
We'll send an RFC patch soon if the kernel patch moves fast.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhou, Danny
> Sent: Wednesday, October 22, 2014 10:44 PM
> To: Thomas Monjalon; O'driscoll, Tim
> Cc: dev at dpdk.org; Fastabend, John R; Ronciak, John
> Subject: Re: [dpdk-dev] DPDK Features for Q1 2015
> 
> Thomas,
> 
> In terms of the bifurcated driver, it is actually the same thing.
> Specifically, the bifurcated driver PMD in DPDK depends on kernel code
> (af_packet and 10G/40G NIC) changes. Once the kernel patches are upstreamed,
> the corresponding DPDK PMD patches will be submitted to dpdk.org. John
> Fastabend and John Ronciak are working very closely to achieve the same goal.
> 
> -Danny
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > Sent: Wednesday, October 22, 2014 10:21 PM
> > To: O'driscoll, Tim
> > Cc: dev at dpdk.org; Fastabend, John R; Ronciak, John
> > Subject: Re: [dpdk-dev] DPDK Features for Q1 2015
> >
> > Thanks Tim for sharing your plan.
> > It's really helpful to improve community collaboration.
> >
> > I'm sure it's going to generate some interesting discussions.
> > Please take care to discuss such announce on dev list only.
> > The announce at dpdk.org list is moderated to keep a low traffic.
> >
> > I would like to open discussion about a really important feature,
> > showed last week by John Fastabend and John Ronciak during LinuxCon:
> >
> > > Bifurcated Driver: With the Bifurcated Driver, the kernel will retain
> > > direct control of the NIC, and will assign specific queue pairs to DPDK.
> > > Configuration of the NIC is controlled by the kernel via ethtool.
> >
> > This design allows to keep the configuration code in one place: the kernel.
> > In the meantime, we are trying to add a lot of code to configure the NICs,
> > which looks to be a duplication of effort.
> > Why should we have two ways of configuring e.g. flow director?
> >
> > Since you at Intel, you'll be supporting your code, I am fine for 
> > duplication,
> > but I feel it's worth arguing why both should be available instead of one.
> >
> > --
> > Thomas

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-22 Thread Ananyev, Konstantin



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> Sent: Wednesday, October 22, 2014 3:03 PM
> To: Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx 
> cycles/packet
> 
> On Tue, Oct 21, 2014 at 01:17:01PM +, Liang, Cunming wrote:
> >
> >
> > > -Original Message-
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > Sent: Tuesday, October 21, 2014 6:33 PM
> > > To: Liang, Cunming
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > > cycles/packet
> > >
> > > On Sun, Oct 12, 2014 at 11:10:39AM +, Liang, Cunming wrote:
> > > > Hi Neil,
> > > >
> > > > Very appreciate your comments.
> > > > I add inline reply, will send v3 asap when we get alignment.
> > > >
> > > > BRs,
> > > > Liang Cunming
> > > >
> > > > > -Original Message-
> > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > > Sent: Saturday, October 11, 2014 1:52 AM
> > > > > To: Liang, Cunming
> > > > > Cc: dev at dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and 
> > > > > tx
> > > cycles/packet
> > > > >
> > > > > On Fri, Oct 10, 2014 at 08:29:58PM +0800, Cunming Liang wrote:
> > > > > > It provides unit test to measure cycles/packet in NIC loopback mode.
> > > > > > It simply gives the average cycles of IO used per packet without 
> > > > > > test
> > > equipment.
> > > > > > When doing the test, make sure the link is UP.
> > > > > >
> > > > > > Usage Example:
> > > > > > 1. Run unit test app in interactive mode
> > > > > > app/test -c f -n 4 -- -i
> > > > > > 2. Run and wait for the result
> > > > > > pmd_perf_autotest
> > > > > >
> > > > > > There's option to choose rx/tx pair, default is vector.
> > > > > > set_rxtx_mode [vector|scalar|full|hybrid]
> > > > > > Note: To get accurate scalar fast, please choose 'vector' or
> > > > > > 'hybrid' without
> > > > > INC_VEC=y in config
> > > > > >
> > > > > > Signed-off-by: Cunming Liang 
> > > > > > Acked-by: Bruce Richardson 
> > > > >
> > > > > Notes inline
> > > > >
> > > > > > ---
> > > > > >  app/test/Makefile   |1 +
> > > > > >  app/test/commands.c |   38 +++
> > > > > >  app/test/packet_burst_generator.c   |4 +-
> > > > > >  app/test/test.h |4 +
> > > > > >  app/test/test_pmd_perf.c|  626
> > > > > +++
> > > > > >  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
> > > > > >  6 files changed, 677 insertions(+), 2 deletions(-)
> > > > > >  create mode 100644 app/test/test_pmd_perf.c
> > > > > >
> > > > > > diff --git a/app/test/Makefile b/app/test/Makefile
> > > > > > index 6af6d76..ebfa0ba 100644
> > > > > > --- a/app/test/Makefile
> > > > > > +++ b/app/test/Makefile
> > > > > > @@ -56,6 +56,7 @@ SRCS-y += test_memzone.c
> > > > > >
> > > > > >  SRCS-y += test_ring.c
> > > > > >  SRCS-y += test_ring_perf.c
> > > > > > +SRCS-y += test_pmd_perf.c
> > > > > >
> > > > > >  ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
> > > > > >  SRCS-y += test_table.c
> > > > > > diff --git a/app/test/commands.c b/app/test/commands.c
> > > > > > index a9e36b1..f1e746e 100644
> > > > > > --- a/app/test/commands.c
> > > > > > +++ b/app/test/commands.c
> > > > > > @@ -310,12 +310,50 @@ cmdline_parse_inst_t cmd_quit = {
> > > > > >
> > > > > > +#define NB_ETHPORTS_USED (1)
> > > > > > +#define NB_SOCKETS  (2)
> > > > > > +#define MEMPOOL_CACHE_SIZE 250
> > > > > > +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) +
> > > > > RTE_PKTMBUF_HEADROOM)
> > > > > Don't you want to size this in accordance with the amount of data
> > > > > you're sending (64 Bytes as noted above)?
> > > > [Liang, Cunming] The case is designed to measure small packet IO cost
> > > > with normal mbuf size.
> > > > Even if the size is decreased, it won't save significant cycles.
> > > > >
> > > That presumes a non-memory constrained system, doesn't it?  I suppose in
> > > the end, as long as you have consistency, it's not overly relevant, but it
> > > seems like you'll want to add data sizing as a feature to this eventually
> > > (i.e. the ability to test performance for larger frame sizes), at which
> > > point you'll need to make this non-static anyway.
> > [Liang, Cunming] For a normal Ethernet packet (w/o jumbo frames), the packet
> > size is at most 1518B.
> > In a real network, there won't be a huge number of jumbo frames.
> > An mbuf size of 2048 is a reasonable value that covers most packet sizes.
> > It is also chosen by lots of NICs as the default receive buffer size in the
> > DMA register.
> > For packets larger than that size, scatter-gather is needed, at some cost
> > in performance.
> > The unit test won't measure sizes from 64 to 9600, and there is no plan to
> > measure scatter-gather rx/tx.
> > It focuses on 64B
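
The sizing argument quoted above can be checked with a little arithmetic. This
is a sketch under stated assumptions, not code from the patch: segs_needed is a
hypothetical helper, and it assumes each mbuf offers exactly 2048 bytes of data
room as in the quoted MBUF_SIZE definition (the mbuf header and headroom come
on top of that).

```c
#include <stdint.h>

/* Per-mbuf data room in bytes, taken from the quoted MBUF_SIZE. */
#define DATA_ROOM 2048

/* Hypothetical helper: how many mbuf segments a frame of frame_len bytes
 * occupies (rounding up). More than one means scatter-gather is needed. */
static uint32_t
segs_needed(uint32_t frame_len)
{
	return (frame_len + DATA_ROOM - 1) / DATA_ROOM;
}
```

Both a 64-byte and a 1518-byte frame fit in a single segment, while a 9600-byte
jumbo frame gives segs_needed(9600) == 5, which is the scatter-gather case the
unit test deliberately leaves out.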

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-22 Thread Liang, Cunming



> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Wednesday, October 22, 2014 10:03 PM
> To: Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> cycles/packet
> 
> On Tue, Oct 21, 2014 at 01:17:01PM +, Liang, Cunming wrote:
> >
> >
> > > -Original Message-
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > Sent: Tuesday, October 21, 2014 6:33 PM
> > > To: Liang, Cunming
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > > cycles/packet
> > >
> > > On Sun, Oct 12, 2014 at 11:10:39AM +, Liang, Cunming wrote:
> > > > Hi Neil,
> > > >
> > > > Very appreciate your comments.
> > > > I add inline reply, will send v3 asap when we get alignment.
> > > >
> > > > BRs,
> > > > Liang Cunming
> > > >
> > > > > -Original Message-
> > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > > Sent: Saturday, October 11, 2014 1:52 AM
> > > > > To: Liang, Cunming
> > > > > Cc: dev at dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and 
> > > > > tx
> > > cycles/packet
> > > > >
> > > > > On Fri, Oct 10, 2014 at 08:29:58PM +0800, Cunming Liang wrote:
> > > > > > It provides unit test to measure cycles/packet in NIC loopback mode.
> > > > > > It simply gives the average cycles of IO used per packet without 
> > > > > > test
> > > equipment.
> > > > > > When doing the test, make sure the link is UP.
> > > > > >
> > > > > > Usage Example:
> > > > > > 1. Run unit test app in interactive mode
> > > > > > app/test -c f -n 4 -- -i
> > > > > > 2. Run and wait for the result
> > > > > > pmd_perf_autotest
> > > > > >
> > > > > > There's option to choose rx/tx pair, default is vector.
> > > > > > set_rxtx_mode [vector|scalar|full|hybrid]
> > > > > > Note: To get accurate scalar fast, please choose 'vector' or 'hybrid'
> without
> > > > > INC_VEC=y in config
> > > > > >
> > > > > > Signed-off-by: Cunming Liang 
> > > > > > Acked-by: Bruce Richardson 
> > > > >
> > > > > Notes inline
> > > > >
> > > > > > ---
> > > > > >  app/test/Makefile   |1 +
> > > > > >  app/test/commands.c |   38 +++
> > > > > >  app/test/packet_burst_generator.c   |4 +-
> > > > > >  app/test/test.h |4 +
> > > > > >  app/test/test_pmd_perf.c|  626
> > > > > +++
> > > > > >  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
> > > > > >  6 files changed, 677 insertions(+), 2 deletions(-)
> > > > > >  create mode 100644 app/test/test_pmd_perf.c
> > > > > >
> > > > > > diff --git a/app/test/Makefile b/app/test/Makefile
> > > > > > index 6af6d76..ebfa0ba 100644
> > > > > > --- a/app/test/Makefile
> > > > > > +++ b/app/test/Makefile
> > > > > > @@ -56,6 +56,7 @@ SRCS-y += test_memzone.c
> > > > > >
> > > > > >  SRCS-y += test_ring.c
> > > > > >  SRCS-y += test_ring_perf.c
> > > > > > +SRCS-y += test_pmd_perf.c
> > > > > >
> > > > > >  ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
> > > > > >  SRCS-y += test_table.c
> > > > > > diff --git a/app/test/commands.c b/app/test/commands.c
> > > > > > index a9e36b1..f1e746e 100644
> > > > > > --- a/app/test/commands.c
> > > > > > +++ b/app/test/commands.c
> > > > > > @@ -310,12 +310,50 @@ cmdline_parse_inst_t cmd_quit = {
> > > > > >
> > > > > > +#define NB_ETHPORTS_USED (1)
> > > > > > +#define NB_SOCKETS  (2)
> > > > > > +#define MEMPOOL_CACHE_SIZE 250
> > > > > > +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) +
> > > > > RTE_PKTMBUF_HEADROOM)
> > > > > Don't you want to size this in accordance with the amount of data
> > > > > you're sending (64 Bytes as noted above)?
> > > > [Liang, Cunming] The case is designed to measure small packet IO cost
> > > > with normal mbuf size.
> > > > Even if the size is decreased, it won't save significant cycles.
> > > > >
> > > That presumes a non-memory constrained system, doesn't it?  I suppose in
> > > the end, as long as you have consistency, it's not overly relevant, but it
> > > seems like you'll want to add data sizing as a feature to this eventually
> > > (i.e. the ability to test performance for larger frame sizes), at which
> > > point you'll need to make this non-static anyway.
> > [Liang, Cunming] For a normal Ethernet packet (w/o jumbo frames), the packet
> > size is at most 1518B.
> > In a real network, there won't be a huge number of jumbo frames.
> > An mbuf size of 2048 is a reasonable value that covers most packet sizes.
> > It is also chosen by lots of NICs as the default receive buffer size in the
> > DMA register.
> > For packets larger than that size, scatter-gather is needed, at some cost
> > in performance.
> > The unit test won't measure sizes from 64 to 9600, and there is no plan to
> > measure scatter-gather rx/tx.
> > It focuses on 64B packet size and taking the mbuf

[dpdk-dev] DPDK Features for Q1 2015

2014-10-22 Thread Zhou, Danny

Thomas,

In terms of the bifurcated driver, it is actually the same thing. Specifically,
the bifurcated driver PMD in DPDK depends on kernel code (af_packet and 10G/40G
NIC) changes. Once the kernel patches are upstreamed, the corresponding DPDK
PMD patches will be submitted to dpdk.org. John Fastabend and John Ronciak are
working very closely to achieve the same goal.

-Danny

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Wednesday, October 22, 2014 10:21 PM
> To: O'driscoll, Tim
> Cc: dev at dpdk.org; Fastabend, John R; Ronciak, John
> Subject: Re: [dpdk-dev] DPDK Features for Q1 2015
> 
> Thanks Tim for sharing your plan.
> It's really helpful to improve community collaboration.
> 
> I'm sure it's going to generate some interesting discussions.
> Please take care to discuss such announce on dev list only.
> The announce at dpdk.org list is moderated to keep a low traffic.
> 
> I would like to open discussion about a really important feature,
> showed last week by John Fastabend and John Ronciak during LinuxCon:
> 
> > Bifurcated Driver: With the Bifurcated Driver, the kernel will retain
> > direct control of the NIC, and will assign specific queue pairs to DPDK.
> > Configuration of the NIC is controlled by the kernel via ethtool.
> 
> This design allows to keep the configuration code in one place: the kernel.
> In the meantime, we are trying to add a lot of code to configure the NICs,
> which looks to be a duplication of effort.
> Why should we have two ways of configuring e.g. flow director?
> 
> Since you at Intel, you'll be supporting your code, I am fine for duplication,
> but I feel it's worth arguing why both should be available instead of one.
> 
> --
> Thomas

[dpdk-dev] [PATCH v2] bond: disabling broadcast mode when dpdk is built without RTE_MBUF_REFCNT

2014-10-22 Thread Declan Doherty

--V2: Adds warning message to makefile, to notify user of disabling of
broadcast mode

Link bonding broadcast mode requires the refcnt parameter in the mbuf struct to
allow efficient transmission of duplicated mbufs on slave ports.

This patch disables broadcast mode when the compilation option RTE_MBUF_REFCNT
is disabled, to allow clean building of the bonding library.
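
As a rough illustration (not part of this patch), the compile-time gating can
be sketched in isolation. All DEMO_* names and demo_check_mode are hypothetical
stand-ins: DEMO_MBUF_REFCNT plays the role of RTE_MBUF_REFCNT, the value 3 for
broadcast mode is taken from rte_eth_bond.h, and the values 0 to 2 for the
other modes are assumed here.

```c
/* Stand-ins for the bonding mode constants; 0..2 are assumed values. */
#define DEMO_MODE_ROUND_ROBIN   0
#define DEMO_MODE_ACTIVE_BACKUP 1
#define DEMO_MODE_BALANCE       2
#ifdef DEMO_MBUF_REFCNT
/* Only defined when the refcnt option is enabled at build time. */
#define DEMO_MODE_BROADCAST     3
#endif

/* Returns 0 if the mode is supported in this build, -1 otherwise. */
static int
demo_check_mode(int mode)
{
	switch (mode) {
	case DEMO_MODE_ROUND_ROBIN:
	case DEMO_MODE_ACTIVE_BACKUP:
	case DEMO_MODE_BALANCE:
#ifdef DEMO_MBUF_REFCNT
	case DEMO_MODE_BROADCAST:
#endif
		return 0;
	default:
		return -1;
	}
}
```

Built without -DDEMO_MBUF_REFCNT, mode 3 falls through to the default case and
is rejected, which mirrors how the patched argument parser refuses broadcast
mode when the option is off.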


Signed-off-by: Declan Doherty 
---
 app/test/test_link_bonding.c|  9 -
 lib/librte_pmd_bond/Makefile|  4 
 lib/librte_pmd_bond/rte_eth_bond.h  |  3 ++-
 lib/librte_pmd_bond/rte_eth_bond_args.c |  2 ++
 lib/librte_pmd_bond/rte_eth_bond_pmd.c  | 12 
 5 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index db5b180..214d2a2 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -788,7 +788,10 @@ test_set_bonding_mode(void)
int bonding_modes[] = { BONDING_MODE_ROUND_ROBIN,

BONDING_MODE_ACTIVE_BACKUP,
BONDING_MODE_BALANCE,
-   BONDING_MODE_BROADCAST };
+#ifdef RTE_MBUF_REFCNT
+   BONDING_MODE_BROADCAST
+#endif
+   };

/* Test supported link bonding modes */
for (i = 0; i < (int)RTE_DIM(bonding_modes);i++) {
@@ -3227,6 +3230,7 @@ test_balance_verify_slave_link_status_change_behaviour(void)
return remove_slaves_and_stop_bonded_device();
 }

+#ifdef RTE_MBUF_REFCNT
 /** Broadcast Mode Tests */

 static int
@@ -3704,6 +3708,7 @@ test_broadcast_verify_slave_link_status_change_behaviour(void)
/* Clean up and remove slaves from bonded device */
return remove_slaves_and_stop_bonded_device();
 }
+#endif

 static int
 test_reconfigure_bonded_device(void)
@@ -3797,11 +3802,13 @@ static struct unit_test_suite link_bonding_test_suite = {
TEST_CASE(test_balance_verify_promiscuous_enable_disable),
TEST_CASE(test_balance_verify_mac_assignment),

TEST_CASE(test_balance_verify_slave_link_status_change_behaviour),
+#ifdef RTE_MBUF_REFCNT
TEST_CASE(test_broadcast_tx_burst),
TEST_CASE(test_broadcast_rx_burst),
TEST_CASE(test_broadcast_verify_promiscuous_enable_disable),
TEST_CASE(test_broadcast_verify_mac_assignment),

TEST_CASE(test_broadcast_verify_slave_link_status_change_behaviour),
+#endif
TEST_CASE(test_reconfigure_bonded_device),
TEST_CASE(test_close_bonded_device),

diff --git a/lib/librte_pmd_bond/Makefile b/lib/librte_pmd_bond/Makefile
index 953d75e..d4e10bf 100644
--- a/lib/librte_pmd_bond/Makefile
+++ b/lib/librte_pmd_bond/Makefile
@@ -46,6 +46,10 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_api.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_pmd.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_args.c

+ifeq ($(CONFIG_RTE_MBUF_REFCNT),n)
+$(info WARNING: Link Bonding Broadcast mode is disabled because it needs MBUF_REFCNT.)
+endif
+
 #
 # Export include files
 #
diff --git a/lib/librte_pmd_bond/rte_eth_bond.h b/lib/librte_pmd_bond/rte_eth_bond.h
index bd59780..344ca1e 100644
--- a/lib/librte_pmd_bond/rte_eth_bond.h
+++ b/lib/librte_pmd_bond/rte_eth_bond.h
@@ -71,11 +71,12 @@ extern "C" {
  * slaves using one of three available transmit policies - l2, l2+3 or l3+4.
  * See BALANCE_XMIT_POLICY macros definitions for further details on transmit
  * policies. */
+#ifdef RTE_MBUF_REFCNT
 #define BONDING_MODE_BROADCAST (3)
 /**< Broadcast (Mode 3).
  * In this mode all transmitted packets will be transmitted on all available
  * active slaves of the bonded. */
-
+#endif
 /* Balance Mode Transmit Policies */
 #define BALANCE_XMIT_POLICY_LAYER2 (0)
 /**< Layer 2 (Ethernet MAC) */
diff --git a/lib/librte_pmd_bond/rte_eth_bond_args.c b/lib/librte_pmd_bond/rte_eth_bond_args.c
index 11d9816..05a2b07 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_args.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_args.c
@@ -169,7 +169,9 @@ bond_ethdev_parse_slave_mode_kvarg(const char *key __rte_unused,
case BONDING_MODE_ROUND_ROBIN:
case BONDING_MODE_ACTIVE_BACKUP:
case BONDING_MODE_BALANCE:
+#ifdef RTE_MBUF_REFCNT
case BONDING_MODE_BROADCAST:
+#endif
return 0;
default:
return -1;
diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
index 08d3b5f..147028b 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
@@ -61,7 +61,9 @@ bond_ethdev_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)

switch (internals->mode) {
case

[dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of VxLAN filter

2014-10-22 Thread Liu, Jijiang



> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 22, 2014 5:25 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of
> VxLAN filter
> 
> 2014-10-22 06:45, Liu, Jijiang:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2014-10-21 16:46, Jijiang Liu:
> > > > +#define RTE_TUNNEL_FILTER_TO_QUEUE 1 /**< point to an queue by
> > > > +filter
> > > type */
> > >
> > > Sorry, I don't understand what is this value for?
> >
> > This MACRO is used to indicate if user application hope to filter
> > incoming packet(s) to a specific queue(multi-queue configuration is
> > required) using filter type(such as inner MAC + tenant ID).
> > If the flag is not set, and all incoming packet will always go to
> > queue 0.
> 
> 1) It's a boolean, so a new constant is not required.
OK, I will define to_queue as a bool.
> 2) You set the variable "to_queue" with this value but the variable is not
> used
> 3) Is there a sense to add a filter with this variable to 0?
> 
> I think "to_queue" variable and this constant are useless.
No, it is useful. I will change to_queue to be configurable.
> PS: it seems I'm the only person worried by the filtering API.
> So maybe we shouldn't integrate it at all?

I know you are joking.
We should certainly integrate the feature into dpdk.org.
> --
> Thomas

[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015

2014-10-22 Thread O'driscoll, Tim

We're starting to plan our DPDK features for next year. We're planning to have 
a DPDK 2.0 release at the end of March, and we'd like to inform the community 
of the features that we hope to submit to that release. The current list of 
features, along with brief descriptions, is included below.

There will naturally be some changes to this list as work on these features 
progresses. Some will inevitably turn out to be bigger than anticipated and 
will need to be deferred to a later date, while other priorities will arise and 
need to be accommodated. So, this list will be subject to change, and should be 
taken as guidance on what we hope to submit, not a commitment.

Our aim in providing this information now is to solicit input from the 
community. We'd like to make sure we avoid duplication or conflicts with work 
that others are planning, so we'd be interested in hearing any plans that 
others in the community have for contributions to DPDK in this timeframe. This 
will allow us to build a complete picture and ensure we avoid duplication of 
effort. 

I'm sure people will have questions, and will be looking for more information 
on these features. Further details will be provided by the individual 
developers over the next few months. We aim to make better use of the RFC 
process in this release, and provide early outlines of the features so that we 
can obtain community feedback as soon as possible.

We also said at the recent DPDK Summit that we would investigate holding 
regular community conference calls. We'll be scheduling the first of these 
calls soon, and will use this to discuss the 2.0 (Q1 2015) features in a bit 
more detail.


2.0 (Q1 2015) DPDK Features:
Bifurcated Driver: With the Bifurcated Driver, the kernel will retain direct 
control of the NIC, and will assign specific queue pairs to DPDK. Configuration 
of the NIC is controlled by the kernel via ethtool.

Support the new Intel SoC platform, along with the embedded 10GbE NIC.

Packet Reordering: Assign a sequence number to packets on Rx, and then provide 
the ability to reorder on Tx to preserve the original order.

Packet Distributor (phase 2): Implement the following enhancements to the 
Packet Distributor that was originally delivered in the DPDK 1.7 release: 
performance improvements; the ability for packets from a flow to be processed 
by multiple worker cores in parallel and then reordered on Tx using the Packet 
Reordering feature; the ability to have multiple Distributors which share 
Worker cores.

Support Multiple Threads per Core: Use Linux cgroups to allow multiple threads 
to run on a single core. This would be useful in situations where a DPDK thread 
does not require the full resources of a core.

Support the Fedora 21 OS.

Support the host interface for Intel's next generation Ethernet switch. This 
only covers basic support for the host interface. Support for additional 
features will be added later.

Cuckoo Hash: A new hash algorithm was implemented as part of the Cuckoo Switch 
project (see http://www.cs.cmu.edu/~dongz/papers/cuckooswitch.pdf), and shows 
some promising performance results. This needs to be modified to make it more 
generic, and then incorporated into DPDK.

Provide DPDK support for uio_pci_generic.

Integrated Qemu Userspace vHost: Modify Userspace vHost to use Qemu version 
2.1, and remove the need for the kernel module (cuse.ko).

PCI Hot Plug: When you migrate a VM, you need hot plug support as the new VF on 
the new hardware you are running on post-migration needs to be initialized. 
With an emulated NIC migration is seamless as all configuration for the NIC is 
within the RAM of the VM and the hypervisor. With a VF you have actual hardware 
in the picture which needs to be set up properly.

Additional XL710/X710 40-Gigabit Ethernet Controller Features: Support for
additional XL710/X710 40-Gigabit Ethernet Controller features, including
bandwidth and QoS management, NVGRE and other network overlay support, TSO,
IEEE1588, DCB support, SR-IOV switching and port mirroring.

Single Virtio Driver: Merge existing Virtio drivers into a single 
implementation, incorporating the best features from each of the existing 
drivers.

X32 ABI: This is an application binary interface project for the Linux kernel 
that allows programs to take advantage of the benefits of x86-64 (larger number 
of CPU registers, better floating-point performance, faster 
position-independent code shared libraries, function parameters passed via 
registers, faster syscall instruction) while using 32-bit pointers and thus 
avoiding the overhead of 64-bit pointers.

AVX2 ACL: Modify ACL library to use AVX2 instructions to improve performance.

Interrupt mode for PMD: Allow DPDK process to transition to interrupt mode when 
load is low so that other processes can run, or else power can be saved. This 
will increase latency/jitter.

DPDK Headroom: Provide a mechanism to indicate how much headroom (spare 
capacity) exists in a

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-22 Thread Liu, Jijiang



> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 22, 2014 5:52 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet
> identification API in librte_ether
> 
> 2014-10-22 01:46, Liu, Jijiang:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2014-10-21 13:48, Liu, Jijiang:
> > > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > > But I doubt we should configure a tunnel type for a whole port.
> > > >
> > > > Yes, your understanding is correct. It is for a whole port/PF,
> > > > that's why we should add tunnel_type in rte_eth_conf structure.
> > >
> > > Please explain me why a tunnel type should be associated to a port.
> > > This design looks really broken.
> >
> > I don't think this design looks really broken.
> >
> > Currently, A PF  associated to a port, right? What tunnel type should
> > be supported in a PF, which is required we configure it.
> > Tunneling packet is encapsulation packet, in terms of VxLAN, packet
> > format is outer L2 header+ outer L3 header +outer L4 header +
> > tunneling header+ inner L2 header + inner L3 header + inner L4 header
> +PAY4.
> > For a VM/VF, the  real useful packet data is "inner L2 header + inner
> > L3 header + inner L4 header +PAY4".
> >
> > In NIC, A port/PF receive this kind of tunneling packet(outer
> > L2+...PAY4), software should be responsible for decapsulating the
> > packet and deliver real data(innerL2 + PAY4) to VM/VF?
> >
> > DPDK just provide API/mechanism to guarantee a PF/port to receive the
> > tunneling packet data, the encapsulation/ decapsulation work should be
> > done by user application.
> 
> You mean that all packets received on the PF which doesn't match the
> configured tunnel type, won't be received by the software?

No, they will still be received by the software.
For VxLAN, if the tunnel type is not configured as VXLAN, software currently 
cannot use the API to configure the UDP destination port number. 
So even if an incoming packet has the VXLAN format, neither hardware nor 
software treats it as a VXLAN packet, because the UDP destination port was 
never configured. 

Now I want to remove this limitation: even if the tunnel type is not 
configured at the PF initialization phase, the user can still configure the 
VxLAN UDP destination port number.
That is more flexible and reasonable.

> Other question, why a port is associated with only one tunnel type?

Good question. I now think we had better remove this limitation, because it is 
NIC related.

To summarize, two points:
1. The tunnel type applies to a whole port/PF, as I have explained above.
2. I will remove the tunnel type configuration from the rte_eth_conf structure.

Any comments?

> > Normally, the tunneling packet processing like below:
> > Tunneling packet -->PF processing/receive -> application
> > SW do decapsulation ---> VF/VM processing
> 
> I really try to understand what you have in mind. Thanks for explaining
> --
> Thomas
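
To make the packet layout discussed in this thread concrete, here is a minimal
sketch in C of how an application might locate the inner frame of a VxLAN
packet once the UDP destination port has been configured. It is only an
illustration under stated assumptions: the outer headers are taken to be
untagged Ethernet, IPv4 without options, and UDP, and the EtherType and IP
protocol checks are skipped. All DEMO_/demo_ names are hypothetical and are
not part of librte_ether.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed outer header sizes: untagged Ethernet, IPv4 without options. */
enum {
    DEMO_ETH_HDR_LEN   = 14, /* outer L2 header */
    DEMO_IPV4_HDR_LEN  = 20, /* outer L3 header */
    DEMO_UDP_HDR_LEN   = 8,  /* outer L4 header */
    DEMO_VXLAN_HDR_LEN = 8   /* tunneling header */
};

/*
 * Hypothetical helper: return a pointer to the inner L2 header, or NULL when
 * the frame is too short or its UDP destination port is not the configured
 * VxLAN port. On success, *inner_len receives the inner frame length.
 */
const uint8_t *demo_vxlan_inner(const uint8_t *frame, size_t len,
                                uint16_t vxlan_udp_port, size_t *inner_len)
{
    const size_t udp_off = DEMO_ETH_HDR_LEN + DEMO_IPV4_HDR_LEN;
    const size_t inner_off = udp_off + DEMO_UDP_HDR_LEN + DEMO_VXLAN_HDR_LEN;
    uint16_t dst_port;

    assert(frame != NULL && inner_len != NULL);

    if (len < inner_off)
        return NULL;

    /* The UDP destination port is the second 16-bit field, big-endian. */
    dst_port = (uint16_t)((frame[udp_off + 2] << 8) | frame[udp_off + 3]);
    if (dst_port != vxlan_udp_port)
        return NULL;

    *inner_len = len - inner_off;
    return frame + inner_off;
}
```

With these assumed sizes the inner L2 header starts 50 bytes into the frame.
A packet whose UDP destination port does not match the configured value is
simply not treated as VxLAN, which mirrors the behaviour described above.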

[dpdk-dev] [PATCH v2] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Marc Sune

Fix compilation warning 'missing-field-initializers' for some GCC and clang
versions introduced in commit 0c6bc8e due to the use of C89/C90 initializers.
Use C99-style designated initializers instead.

Signed-off-by: Marc Sune 
---
 lib/librte_kni/rte_kni.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index f64a0a8..ab5ca38 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -131,7 +131,9 @@ static void kni_free_mbufs(struct rte_kni *kni);
 static void kni_allocate_mbufs(struct rte_kni *kni);

 static volatile int kni_fd = -1;
-static struct rte_kni_memzone_pool kni_memzone_pool = {0};
+static struct rte_kni_memzone_pool kni_memzone_pool = {
+   .initialized = 0,
+};

 static const struct rte_memzone *
 kni_memzone_reserve(const char *name, size_t len, int socket_id,
@@ -224,6 +226,7 @@ rte_kni_init(unsigned int max_kni_ifaces)
kni_memzone_pool.initialized = 1;
kni_memzone_pool.max_ifaces = max_kni_ifaces;
kni_memzone_pool.free = &kni_memzone_pool.slots[0];
+   rte_spinlock_init(&kni_memzone_pool.mutex);

/* Pre-allocate all memzones of all the slots; panic on error */
for (i = 0; i < max_kni_ifaces; i++) {
-- 
1.7.10.4

[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015

2014-10-22 Thread Matthew Hall

On Wed, Oct 22, 2014 at 01:48:36PM +, O'driscoll, Tim wrote:
> Single Virtio Driver: Merge existing Virtio drivers into a single 
> implementation, incorporating the best features from each of the existing 
> drivers.

Tim,

There is a lot of good stuff in there.

Specifically, in the virtio-net case above, I have discovered, and Sergio at 
Intel just reproduced today, that neither virtio PMD works at all inside of 
VirtualBox. One can't init, and the other gets into an infinite loop. But yet 
it's claiming support for VBox on the DPDK Supported NICs page though it 
doesn't seem it ever could have worked.

So I'd like to request an initiative alongside any virtio-net and/or vmxnet3 
type of changes, to make some kind of a Virtualization Test Lab, where we 
support VMWare ESXi, QEMU, Xen, VBox, and the other popular VM systems.

Otherwise it's hard for us community / app developers to make the DPDK 
available to end users in simple, elegant ways, such as packaging it into 
Vagrant VM's, Amazon AMI's etc. which are prebaked and ready-to-run.

Note personally of course I prefer using things like the 82599... but these 
are only going to be present after the customers have begun to adopt and test 
in the virtual environment, then they decide they like it and want to scale up 
to bigger boxes.

Another thing which would help in this area would be additional improvements 
to the NUMA / socket / core / number of NICs / number of queues 
autodetections. To write a single app which can run on a virtual card, a 
hardware card without RSS available, and a hardware card with RSS available, 
in a thread-safe, flow-safe way, is somewhat complex at the present time.

I'm running into this in the VM based environments because most VNIC's don't 
have RSS and it complicates the process of keeping consistent state of the 
flows among the cores.

Thanks,
Matthew.

[dpdk-dev] [PATCH v2] librte_ip_frag: Disable ipv4/v6 fragmentation if RTE_MBUF_REFCNT=n

2014-10-22 Thread Pablo de Lara

rte_ipv4_fragment_packet() and rte_ipv6_fragment_packet()
call rte_pktmbuf_attach() to attach the segment of the original
packet to the segment of the new fragmented one. That function
is not declared if RTE_MBUF_REFCNT is disabled, because it needs to
call rte_mbuf_refcnt_update(), which is not declared either.

Therefore, the ipv4/v6 fragmentation libraries are disabled
in that situation.

Signed-off-by: Pablo de Lara 
---
 lib/librte_ip_frag/Makefile      |    6 +++++-
 lib/librte_ip_frag/rte_ip_frag.h |    5 ++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ip_frag/Makefile b/lib/librte_ip_frag/Makefile
index 2265c93..8c00d39 100644
--- a/lib/librte_ip_frag/Makefile
+++ b/lib/librte_ip_frag/Makefile
@@ -38,9 +38,13 @@ CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)

 #source files
+ifeq ($(CONFIG_RTE_MBUF_REFCNT),y)
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
-SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_fragmentation.c
+else
+$(info WARNING: Fragmentation feature is disabled because it needs MBUF_REFCNT.)
+endif
+SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_reassembly.c
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ip_frag_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += ip_frag_internal.c
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index e0936dc..230a903 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -175,6 +175,7 @@ rte_ip_frag_table_destroy( struct rte_ip_frag_tbl *tbl)
rte_free(tbl);
 }

+#ifdef RTE_MBUF_REFCNT
 /**
  * This function implements the fragmentation of IPv6 packets.
  *
@@ -203,7 +204,7 @@ rte_ipv6_fragment_packet(struct rte_mbuf *pkt_in,
uint16_t mtu_size,
struct rte_mempool *pool_direct,
struct rte_mempool *pool_indirect);
-
+#endif

 /*
  * This function implements reassembly of fragmented IPv6 packets.
@@ -252,6 +253,7 @@ rte_ipv6_frag_get_ipv6_fragment_header(struct ipv6_hdr *hdr)
return NULL;
 }

+#ifdef RTE_MBUF_REFCNT
 /**
  * IPv4 fragmentation.
  *
@@ -280,6 +282,7 @@ int32_t rte_ipv4_fragment_packet(struct rte_mbuf *pkt_in,
uint16_t nb_pkts_out, uint16_t mtu_size,
struct rte_mempool *pool_direct,
struct rte_mempool *pool_indirect);
+#endif

 /*
  * This function implements reassembly of fragmented IPv4 packets.
-- 
1.7.4.1

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Thomas Monjalon

2014-10-22 11:49, Marc Sune:
> On 22/10/14 10:50, Thomas Monjalon wrote:
> > 2014-10-22 10:42, Marc Sune:
> >> The mutex needs to be initialized to RTE_SPINLOCK_INITIALIZER(0) too, or
> >> move the initialization of the mutex to rte_kni_init().
> > RTE_SPINLOCK_INITIALIZER is { 0 }
> > By initializing one field, all other fields are set to 0, so spinlock also.
> > Just choose one field and it's OK.
> > It should be tested with ICC also but I think it's OK.
> 
> Seems that you are right, at least for C99:
> 
> C99 Standard 6.7.8.21
> 
>  If there are fewer initializers in a brace-enclosed list than
> there are elements or members of an aggregate, or fewer characters
> in a string literal used to initialize an array of known size than
> there are elements in the array, the remainder of the aggregate
> shall be initialized implicitly the same as objects that have static
> storage duration.
> 
> 
> I am not sure if there can be problems with other C dialects (e.g. C11), 
> I don't have the std here. So to prevent any problem with them (could 
> produce a dead-lock during first rte_kni_alloc() that could be difficult 
> to troubleshoot), I would still explicitly initialize the mutex, in one 
> or the other way.
> 
> Just tell me if you agree and which one you prefer.

No problem for initializing mutex.

> I don't have an ICC license. I am always trying it with GCC and clang.

That's why it's the Intel job to support ICC in DPDK :)

-- 
Thomas

[dpdk-dev] [PATCH 2/2] ixgbe: always perform vec RX setup if vpmd enabled

2014-10-22 Thread Bruce Richardson

If the vector pmd option is turned on in the compile time config file,
then always call the vector rxq setup, since we can now use the vector
PMD for receiving jumbo frames that need chained mbufs, a.k.a scattered
packets. Up till now, this function was not being called when receiving
scattered packets, potentially leading to problems with mbufs not being
properly initialized.

Signed-off-by: Bruce Richardson 
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 123b8b3..3a5a8ff 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -2191,6 +2191,9 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
 */
use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);

+#ifdef RTE_IXGBE_INC_VECTOR
+   ixgbe_rxq_vec_setup(rxq);
+#endif
/* Check if pre-conditions are satisfied, and no Scattered Rx */
if (!use_def_burst_func && !dev->data->scattered_rx) {
 #ifdef RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC
@@ -2203,7 +2206,6 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
if (!ixgbe_rx_vec_condition_check(dev)) {
PMD_INIT_LOG(INFO, "Vector rx enabled, please make "
 "sure RX burst size no less than 32.");
-   ixgbe_rxq_vec_setup(rxq);
dev->rx_pkt_burst = ixgbe_recv_pkts_vec;
}
 #endif
-- 
1.9.3

[dpdk-dev] [PATCH 1/2] ixgbe: remove static qualifier for thread safety

2014-10-22 Thread Bruce Richardson

Remove the "static" qualifier from the template mbuf variable in
ixgbe_rxq_vec_setup function. This will then allow different
threads to initialize different RX queues at the same time,
without one overwriting the other's data.

Signed-off-by: Bruce Richardson 
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
index a0d3d78..e813e43 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -730,7 +730,7 @@ static struct ixgbe_txq_ops vec_txq_ops = {
 int
 ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
 {
-   static struct rte_mbuf mb_def = {
+   struct rte_mbuf mb_def = {
.nb_segs = 1,
.data_off = RTE_PKTMBUF_HEADROOM,
 #ifdef RTE_MBUF_REFCNT
-- 
1.9.3
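
As a rough illustration of why the "static" qualifier matters here, the
following C sketch contrasts a function-local static template with an
automatic one. It is not the ixgbe code: struct demo_mbuf and both functions
are hypothetical stand-ins, and the sharing is shown from a single thread
for simplicity. With the static variant every caller writes the same object,
so two threads setting up different queues could overwrite each other's
values before they are copied out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much reduced stand-in for struct rte_mbuf. */
struct demo_mbuf {
    uint16_t nb_segs;
    uint16_t port;
};

/* Static template: a single object shared by every call (and thread). */
const struct demo_mbuf *demo_template_static(uint16_t port)
{
    static struct demo_mbuf mb_def;

    mb_def.nb_segs = 1;
    mb_def.port = port; /* overwrites whatever an earlier caller stored */
    return &mb_def;
}

/* Automatic template: each call builds its own copy on its own stack. */
void demo_template_auto(uint16_t port, struct demo_mbuf *out)
{
    struct demo_mbuf mb_def = {
        .nb_segs = 1,
        .port = port,
    };

    assert(out != NULL);
    *out = mb_def;
}
```

Calling demo_template_static() for ports 1 and 2 yields the same pointer, and
the first caller's port value is gone after the second call, whereas
demo_template_auto() leaves each caller with independent data.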

[dpdk-dev] [PATCH 0/2] ixgbe: vector pmd fixes

2014-10-22 Thread Bruce Richardson

This patch set contains small fixes for issues with the vector PMD.
The issues and the fixes for them are described in each patch individually.

Bruce Richardson (2):
  ixgbe: remove static qualifier for thread safety
  ixgbe: always perform vec RX setup if vpmd enabled

 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 4 +++-
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

-- 
1.9.3

[dpdk-dev] [PATCH] ixgbe: Fix clang compilation issue

2014-10-22 Thread Bruce Richardson

Issue reported by Keith Wiles.
Clang fails with an error about a variable being used uninitialized:

  CC ixgbe_rxtx_vec.o
/home/keithw/projects/dpdk-code/org-dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:67:30:
error: variable 'dma_addr0' is uninitialized
  when used here [-Werror,-Wuninitialized]
dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
  ^

This error can be fixed by replacing the call to xor which
takes two parameters, by a call to setzero, which does not take any.

Signed-off-by: Bruce Richardson 
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
index 457f267..2236250 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -64,7 +64,7 @@ ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
 RTE_IXGBE_RXQ_REARM_THRESH) < 0) {
if (rxq->rxrearm_nb + RTE_IXGBE_RXQ_REARM_THRESH >=
rxq->nb_rx_desc) {
-   dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
+   dma_addr0 = _mm_setzero_si128();
for (i = 0; i < RTE_IXGBE_DESCS_PER_LOOP; i++) {
rxep[i].mbuf = &rxq->fake_mbuf;
_mm_store_si128((__m128i *)&rxdp[i].read,
-- 
1.9.3

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-22 Thread Thomas Monjalon

2014-10-22 01:46, Liu, Jijiang:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2014-10-21 13:48, Liu, Jijiang:
> > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > But I doubt we should configure a tunnel type for a whole port.
> > >
> > > Yes, your understanding is correct. It is for a whole port/PF, that's
> > > why we should add tunnel_type in rte_eth_conf structure.
> > 
> > Please explain me why a tunnel type should be associated to a port.
> > This design looks really broken.
> 
> I don't think this design looks really broken.
> 
> Currently, A PF  associated to a port, right? What tunnel type should
> be supported in a PF, which is required we configure it.
> Tunneling packet is encapsulation packet, in terms of VxLAN, packet format
> is outer L2 header+ outer L3 header +outer L4 header + tunneling header+
> inner L2 header + inner L3 header + inner L4 header +PAY4.
> For a VM/VF, the  real useful packet data is "inner L2 header +
> inner L3 header + inner L4 header +PAY4".  
> 
> In NIC, A port/PF receive this kind of tunneling packet(outer L2+...PAY4),
> software should be responsible for decapsulating the packet and deliver
> real data(innerL2 + PAY4) to VM/VF?
> 
> DPDK just provide API/mechanism to guarantee a PF/port to receive the
> tunneling packet data, the encapsulation/ decapsulation work should be
> done by user application.

You mean that all packets received on the PF which don't match the configured
tunnel type won't be received by the software?

Another question: why is a port associated with only one tunnel type?

> Normally, the tunneling packet processing like below:
> Tunneling packet -->PF processing/receive -> application SW do 
> decapsulation ---> VF/VM processing

I am really trying to understand what you have in mind. Thanks for explaining.
-- 
Thomas

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Marc Sune

On 22/10/14 10:50, Thomas Monjalon wrote:
> 2014-10-22 10:42, Marc Sune:
>> The mutex needs to be initialized to RTE_SPINLOCK_INITIALIZER(0) too, or
>> move the initialization of the mutex to rte_kni_init().
> RTE_SPINLOCK_INITIALIZER is { 0 }
> By initializing one field, all other fields are set to 0, so spinlock also.
> Just choose one field and it's OK.
> It should be tested with ICC also but I think it's OK.

Seems that you are right, at least for C99:

C99 Standard 6.7.8.21

 If there are fewer initializers in a brace-enclosed list than
there are elements or members of an aggregate, or fewer characters
in a string literal used to initialize an array of known size than
there are elements in the array, the remainder of the aggregate
shall be initialized implicitly the same as objects that have static
storage duration.

I am not sure whether there can be problems with other C dialects (e.g. C11); 
I don't have the standard here. So to prevent any problem with them (it could 
produce a deadlock during the first rte_kni_alloc() that could be difficult 
to troubleshoot), I would still explicitly initialize the mutex, one way 
or the other.

Just tell me if you agree and which one you prefer.

I don't have an ICC license. I am always trying it with GCC and clang.

Marc
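
The rule quoted from C99 6.7.8 can be checked with a small C sketch. The
types below are hypothetical stand-ins, not the real rte_spinlock_t or
rte_kni_memzone_pool definitions. An automatic variable is used on purpose,
since an object with static storage duration would be zeroed regardless of
the initializer. As far as I know, C11 carries the same wording in its
section 6.7.9, but treat that as an assumption to verify against the
standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a spinlock whose unlocked state is zero. */
struct demo_spinlock {
    volatile int locked;
};

/* Hypothetical stand-in for the memzone pool structure. */
struct demo_pool {
    int initialized;
    unsigned int max_ifaces;
    struct demo_spinlock mutex;
    void *free_slot;
};

/* Build a pool on the stack, naming only one member in the initializer. */
struct demo_pool demo_pool_make(void)
{
    struct demo_pool pool = { .initialized = 0 };

    return pool;
}

/* Return 1 when every named member is zero or NULL, 0 otherwise. */
int demo_pool_is_zeroed(const struct demo_pool *p)
{
    assert(p != NULL);
    return p->initialized == 0 && p->max_ifaces == 0 &&
           p->mutex.locked == 0 && p->free_slot == NULL;
}
```

The members that the initializer does not name, including the embedded lock,
come out zeroed. Initializing the lock explicitly as well remains a
reasonable defensive choice, since it does not rely on the unlocked state
being all zeroes.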

[dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud Images

2014-10-22 Thread Matthew Hall

On Wed, Oct 22, 2014 at 03:20:40PM +, Gonzalez Monroy, Sergio wrote:
> You are not forced to use shared libraries. This module loads successfully 
> with an app (testpmd) built against static DPDK libs.

It sounds like it just requires additional options as mentioned later in your 
mail. We should document the recommended set of link flags for static and 
dynamic link w/ DPDK for apps, as it seems to cause issues for everybody 
making non-example DPDK apps. I've seen several threads about it since joining 
the mailing list a few months back. Maybe it is documented already but I 
didn't see it. I'm willing to write it up but somebody will have to help 
verify I got it right.

> I have reproduced this issue in VirtualBox:
> 
> - For UIO Virtio PMD, there is an issue with igb_uio module and virtio vbox 
> backend device, I fail to bind igb_uio driver to the virtio device.
> 
> - For non-UIO Virtio PMD, the module fails to initialize properly as you 
> have indicated in your previous post (stuck in a loop).
> 
> I get this behavior with testpmd regardless of DPDK being built as static or 
> shared.

THANK YOU!!! I am so glad to hear I'm not crazy and it really does not work.

So... back to the Supported NICs page. Right now it claims VirtualBox is 
supported but 2 people have seen that it doesn't work at all with either 
driver. Is it intended to be a supported configuration or not?

If it is intended to be supported can we find someone who can help us fix the 
bugs? It's not code I know much about. If it is not intended to work on VBox 
can we delete it from the documentation so nobody besides me and you and the 
other guy in 2013 waste more time trying to use it if it wasn't supposed to 
work in the first place?

> Your undefined symbol error is most likely because the symbol is not in the 
> dynamic symbol table of you app. You need to pass -rdynamic to GCC or 
> -export-dynamic to LD when building your app.

Good advice. I'll try it and see. Of course it won't fix the infinite loop 
though so the driver still won't help much even with the change present.

> Thanks,
> Sergio

Thanks to you for help verifying / reproducing / identifying the issues.

Matthew.

[dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of VxLAN filter

2014-10-22 Thread Liu, Jijiang



> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 22, 2014 5:31 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of
> VxLAN filter
> 
> 2014-10-22 02:25, Liu, Jijiang:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2014-10-21 16:46, Jijiang Liu:
> > > > +#define RTE_TUNNEL_FILTER_IMAC_IVLAN (ETH_TUNNEL_FILTER_IMAC
> | \
> > > > +   ETH_TUNNEL_FILTER_IVLAN)
> > > > +#define RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID
> (ETH_TUNNEL_FILTER_IMAC | \
> > > > +   ETH_TUNNEL_FILTER_IVLAN | \
> > > > +   ETH_TUNNEL_FILTER_TENID)
> > > > +#define RTE_TUNNEL_FILTER_IMAC_TENID (ETH_TUNNEL_FILTER_IMAC
> | \
> > > > +   ETH_TUNNEL_FILTER_TENID)
> > > > +#define RTE_TUNNEL_FILTER_OMAC_TENID_IMAC
> (ETH_TUNNEL_FILTER_OMAC | \
> > > > +   ETH_TUNNEL_FILTER_TENID | \
> > > > +   ETH_TUNNEL_FILTER_IMAC)
> > >
> > > I thought you agree that these definitions are useless?
> >
> > Sorry, this MAY be  some misunderstanding, I don't think these
> > definition are useless. I just thought change "uint16_t filter_type"
> > is better than define "enum filter_type".
> >
> > Let me explain here again.
> > The filter condition are:
> > 1.  inner MAC + inner VLAN
> > 2. inner MAC + IVLAN + tenant ID
> > ..
> > 5. outer MAC + tenant ID + inner MAC
> >
> > For each filter condition, we need to check if the mandatory
> > parameters are valid by checking corresponding bit MASK.
> 
> Checking bit mask doesn't imply to define all combinations of bit masks.
> There's probably something obvious that one of us is missing.

Anybody else have comments on this? 

> > A pseudo code example:
> >
> >    switch (filter_type)
> >    case 1:  // inner MAC + inner VLAN
> >        if (filter_type & ETH_TUNNEL_FILTER_IMAC)
> >            if (IMAC == NULL)
> >                return -1;
> >
> >    case 5:  // outer MAC + tenant ID + inner MAC
> >        if (filter_type & ETH_TUNNEL_FILTER_IMAC)
> >            if (IMAC == NULL)
> >                return -1;
> >
> >        if (filter_type & ETH_TUNNEL_FILTER_OMAC)
> >            if (OMAC == NULL)
> >                return -1;
> >    ..
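
For readers following this exchange, here is a minimal C sketch of per-bit
validation, i.e. checking each mandatory parameter against its own flag
without a separate case for every combination. It is an illustration only:
the DEMO_FILTER_* values and struct demo_tunnel_filter are hypothetical and
do not reflect the actual ETH_TUNNEL_FILTER_* values or the structure
proposed in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, one bit per filter component. */
#define DEMO_FILTER_OMAC  0x01 /* outer MAC */
#define DEMO_FILTER_IMAC  0x02 /* inner MAC */
#define DEMO_FILTER_IVLAN 0x04 /* inner VLAN */
#define DEMO_FILTER_TENID 0x08 /* tenant ID */

/* Hypothetical filter description. */
struct demo_tunnel_filter {
    const uint8_t *outer_mac;
    const uint8_t *inner_mac;
    uint16_t inner_vlan;
    uint32_t tenant_id;
    uint16_t filter_type; /* bitwise OR of DEMO_FILTER_* */
};

/* Return 0 when every pointer required by filter_type is set, else -1. */
int demo_filter_check(const struct demo_tunnel_filter *f)
{
    assert(f != NULL);

    if ((f->filter_type & DEMO_FILTER_IMAC) && f->inner_mac == NULL)
        return -1;
    if ((f->filter_type & DEMO_FILTER_OMAC) && f->outer_mac == NULL)
        return -1;
    return 0;
}
```

A caller that wants "inner MAC + inner VLAN" passes
DEMO_FILTER_IMAC | DEMO_FILTER_IVLAN directly. Whether named combinations
are still worth defining, for example to list what a given NIC supports, is
the open question in this thread, and the sketch does not settle it.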

[dpdk-dev] [PATCH] ixgbe: Fix clang compilation issue

2014-10-22 Thread Richardson, Bruce

Self-nak, resent old patch.

> -Original Message-
> From: Richardson, Bruce
> Sent: Wednesday, October 22, 2014 11:54 AM
> To: dev at dpdk.org
> Cc: Richardson, Bruce
> Subject: [PATCH] ixgbe: Fix clang compilation issue
> 
> Issue reported by Keith Wiles.
> Clang fails with an error about a variable being used uninitialized:
> 
>   CC ixgbe_rxtx_vec.o
> /home/keithw/projects/dpdk-code/org-
> dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c:67:30:
> error: variable 'dma_addr0' is uninitialized
>   when used here [-Werror,-Wuninitialized]
> dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
>   ^
> 
> This error can be fixed by replacing the call to xor which
> takes two parameters, by a call to setzero, which does not take any.
> 
> Signed-off-by: Bruce Richardson 
> ---
>  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> index 457f267..2236250 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> @@ -64,7 +64,7 @@ ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
>RTE_IXGBE_RXQ_REARM_THRESH) < 0) {
>   if (rxq->rxrearm_nb + RTE_IXGBE_RXQ_REARM_THRESH >=
>   rxq->nb_rx_desc) {
> - dma_addr0 = _mm_xor_si128(dma_addr0, dma_addr0);
> + dma_addr0 = _mm_setzero_si128();
>   for (i = 0; i < RTE_IXGBE_DESCS_PER_LOOP; i++) {
>   rxep[i].mbuf = &rxq->fake_mbuf;
>   _mm_store_si128((__m128i *)&rxdp[i].read,
> --
> 1.9.3

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Thomas Monjalon

2014-10-22 10:42, Marc Sune:
> The mutex needs to be initialized to RTE_SPINLOCK_INITIALIZER(0) too, or 
> move the initialization of the mutex to rte_kni_init().

RTE_SPINLOCK_INITIALIZER is { 0 }
By initializing one field, all other fields are set to 0, so spinlock also.
Just choose one field and it's OK.
It should be tested with ICC also but I think it's OK.

> I can prepare a second patch with one or the other option, if you want.

Yes please.

> On 22/10/14 10:37, Thomas Monjalon wrote:
> > 2014-10-22 09:10, Marc Sune:
> >> Fix for compilation warning 'missing-field-initializers' for some
> >> GCC and clang versions introduced in commit 0c6bc8e
> >>
> >> Signed-off-by: Marc Sune 
> > It's not needed to initialize all fields.
> > This should be sufficient:
> > +static struct rte_kni_memzone_pool kni_memzone_pool = {.initialized = 0};

Please Marc, don't top post.
Thanks
-- 
Thomas

[dpdk-dev] [PATCH] bond: disabling broadcast mode when dpdk is built without RTE_MBUF_REFCNT

2014-10-22 Thread De Lara Guarch, Pablo

Hi Declan,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Declan Doherty
> Sent: Wednesday, October 22, 2014 11:29 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] bond: disabling broadcast mode when dpdk is
> built without RTE_MBUF_REFCNT
> 
> Link bonding broadcast mode requires refcnt parameter in the mbuf struct to
> allow efficient transmission of duplicated mbufs on slave ports.
> 
> This patch disables broadcast mode when the complication option
> RTE_MBUF_REFCNT
> is disabled to allow clean building of the bonding library
> 
> 
> Signed-off-by: Declan Doherty 
> ---
>  app/test/test_link_bonding.c|9 -
>  lib/librte_pmd_bond/rte_eth_bond.h  |3 ++-
>  lib/librte_pmd_bond/rte_eth_bond_args.c |2 ++
>  lib/librte_pmd_bond/rte_eth_bond_pmd.c  |   12 
>  4 files changed, 24 insertions(+), 2 deletions(-)
> 

As Thomas suggested for my other patch (disable ipv4/v6 fragmentation if 
RTE_MBUF_REFCNT=n), it may be useful to include a warning in the link bonding 
Makefile, letting the user know that broadcast mode is disabled because that 
option is not set.

Thanks,
Pablo

[dpdk-dev] [PATCH v6 1/9] librte_mbuf:the rte_mbuf structure changes

2014-10-22 Thread Thomas Monjalon

2014-10-21 14:14, Liu, Jijiang:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2014-10-21 16:46, Jijiang Liu:
> > > - uint16_t reserved2;   /**< Unused field. Required for padding */
> > > +
> > > + /**
> > > +  * Packet type, which is used to indicate ordinary L2 packet format and
> > > +  * also tunneled packet format such as IP in IP, IP in GRE, MAC in GRE
> > > +  * and MAC in UDP.
> > > +  */
> > > + uint16_t packet_type;
> > 
> > Why not name it "l2_type"?
> 
> In datasheet, this term is called packet type(s).

That's exactly the point I really want you to understand!
This is a field in the generic mbuf structure, so your datasheet has no value here.

> Personally , I think packet type is  more clear what meaning of this field is 
> . ^_^

You cannot add an API field without knowing what its generic meaning will be.
Please think about it and describe its scope.

Thanks
-- 
Thomas

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Marc Sune

The mutex needs to be initialized to RTE_SPINLOCK_INITIALIZER(0) too, or 
move the initialization of the mutex to rte_kni_init().

I can prepare a second patch with one or the other option, if you want.

marc

On 22/10/14 10:37, Thomas Monjalon wrote:
> 2014-10-22 09:10, Marc Sune:
>> Fix for compilation warning 'missing-field-initializers' for some
>> GCC and clang versions introduced in commit 0c6bc8e
>>
>> Signed-off-by: Marc Sune 
> It's not needed to initialize all fields.
> This should be sufficient:
> +static struct rte_kni_memzone_pool kni_memzone_pool = {.initialized = 0};
>

[dpdk-dev] FW: nic loopback

2014-10-22 Thread alex

On Wed, Oct 22, 2014 at 7:37 AM, Zhu, Heqing  wrote:

> One line comment inline.
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liang, Cunming
> > Sent: Tuesday, October 21, 2014 8:33 PM
> > To: Alex Markuze
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] nic loopback
> >
> > It's a pain VF can't set the register directly.
> > As kernel ixgbe don't support to set the value, I'm afraid you have to
> modify
> > kernel ixgbe.
> > If your purpose is mainly for testing purpose.
> > One option is you can just set the register bit value to full 1 during
> device
> > initialization.
> > Another option is you can choose to use DPDK as host PF.
> > Running testpmd in host, and set such register by interactive command
> line.
> >
> > Ideally it's better to add a kind of VF to PF mailbox message.
> > Host PF delegate VF to enable the local pool loopback.
> > So during runtime, VF can proactive to enable/disable the ability.
>
> [heqing] Such a proposal has been discussed a few times, but the kernel
> driver does not accept this due to the security concern.


I will try a different approach. Is there a tool available from Intel for
82599 NICs that can access the NIC's configuration and modify these
registers manually, without modifying hypervisor drivers and/or using the PF?

>
> >
> >
> > From: Alex Markuze [mailto:alex at weka.io]
> > Sent: Tuesday, October 21, 2014 11:16 PM
> > To: Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] nic loopback
> >
> > How can I set/query this bit (LLE(PFVMTXSW[n]), intel 82599 ) on ESX, or
> any
> > other friendlier environment like Linux?
> >
> > On Tue, Oct 21, 2014 at 4:18 AM, Liang, Cunming
> > <cunming.liang at intel.com> wrote:
> >
> >
> > > -Original Message-
> > > From: dev
> > [mailto:dev-bounces at dpdk.org]
> > > On Behalf Of Alex Markuze
> > > Sent: Tuesday, October 21, 2014 12:24 AM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] nic loopback
> > >
> > > Hi,
> > > I'm trying to send packets from an application to it self, meaning
> > > smac  == dmac.
> > > I'm working with intel 82599 virtual function. But it seems that these
> > > packets are lost.
> > >
> > > Is there a software/hw limitation I'm missing here (some additional
> > > anti-spoofing)? AFAIK modern NICs with sriov are mini switches so the
> > > hw loopback should work, at least thats the theory.
> > >
> > [Liang, Cunming] You could have a check on register LLE(PFVMTXSW[n]).
> > Which allow an individual pool to be able to send traffic and have it
> loopback
> > to itself.
> > >
> > > Thanks.
>
>

[dpdk-dev] Why do we need iommu=pt?

2014-10-22 Thread alex

Shiva.
The cost of disabling iommu=pt when intel_iommu=on is dire. DPDK won't work,
as the RX/TX descriptors will be useless.
Any DMA access by the device will be dropped, as no DMA mapping will exist.

Danny.
The IOMMU hurts performance in kernel drivers, which perform a map and unmap
operation for each egress/ingress packet.
The cost of unmapping under strict protection limits a 10Gb+ link to 3Gb,
with the CPU maxed out at 100%. DPDK apps shouldn't feel any difference IFF the
RX descriptors contain IOVAs and not the real physical addresses that are used
currently.


On Tue, Oct 21, 2014 at 10:10 PM, Zhou, Danny  wrote:

> IMHO, if memory protection with IOMMU is needed or not really depends on
> how you use
> and deploy your DPDK based applications. For Telco network middle boxes,
> which adopts
> a "close model" solution to achieve extremely high performance, the entire
> system including
> HW, software in kernel and userspace are controlled by Telco vendors and
> assumed trustable, so
> memory protection is not so important. While for Datacenters, which
> generally adopts a "open model"
> solution allows running user space applications(e.g. tenant applications
> and VMs) which could
> direct access NIC and DMA engine inside the NIC using modified DPDK PMD
> are not trustable
> as they can potentially DAM to/from arbitrary memory regions using
> physical addresses, so IOMMU
> is needed to provide strict memory protection, at the cost of negative
> performance impact.
>
> So if you want to seek high performance, disable IOMMU in BIOS or OS. And
> if security is a major
> concern, tune it on and tradeoff between performance and security. But I
> do NOT think is comes with
> an extremely high performance costs according to our performance
> measurement, but it probably true
> for 100G NIC.
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shivapriya Hiremath
> > Sent: Wednesday, October 22, 2014 12:54 AM
> > To: Alex Markuze
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] Why do we need iommu=pt?
> >
> > Hi,
> >
> > Thank you for all the replies.
> > I am trying to understand the impact of this on DPDK. What will be the
> > repercussions of disabling "iommu=pt" on the DPDK performance?
> >
> >
> > On Tue, Oct 21, 2014 at 12:32 AM, Alex Markuze  wrote:
> >
> > > DPDK uses a 1:1 mapping and doesn't support IOMMU.  IOMMU allows for
> > > simpler VM physical address translation.
> > > The second role of IOMMU is to allow protection from unwanted memory
> > > access by an unsafe device that has DMA privileges. Unfortunately this
> > > protection comes with an extremely high performance cost for high
> speed
> > > nics.
> > >
> > > To your question iommu=pt disables IOMMU support for the hypervisor.
> > >
> > > On Tue, Oct 21, 2014 at 1:39 AM, Xie, Huawei 
> wrote:
> > >
> > >>
> > >>
> > >> > -Original Message-
> > >> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shivapriya
> > >> Hiremath
> > >> > Sent: Monday, October 20, 2014 2:59 PM
> > >> > To: dev at dpdk.org
> > >> > Subject: [dpdk-dev] Why do we need iommu=pt?
> > >> >
> > >> > Hi,
> > >> >
> > >> > My question is that if the Poll mode  driver used the DMA kernel
> > >> interface
> > >> > to set up its mappings appropriately, would it still require that
> > >> iommu=pt
> > >> > be set?
> > >> > What is the purpose of setting iommu=pt ?
> > >> PMD allocates memory though hugetlb file system, and fills the
> physical
> > >> address
> > >> into the descriptor.
> > >> pt is used to pass through iotlb translation. Refer to the below link.
> > >> http://lkml.iu.edu/hypermail/linux/kernel/0906.2/02129.html
> > >> >
> > >> > Thank you.
> > >>
> > >
> > >
>

[dpdk-dev] [PATCH] librte_ip_frag: Disable ipv4/v6 fragmentation if RTE_MBUF_REFCNT=n

2014-10-22 Thread De Lara Guarch, Pablo



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, October 21, 2014 10:12 PM
> To: De Lara Guarch, Pablo
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] librte_ip_frag: Disable ipv4/v6
> fragmentation if RTE_MBUF_REFCNT=n
> 
> 2014-10-21 15:15, Pablo de Lara:
> > Ipv4/v6 fragmentation libraries depends on refcnt.
> > There was a compilation error if RTE_MBUF_REFCNT was disabled,
> > so those libraries have been disabled in that situation.
> 
> Please Pablo, could you add a short justification that it's not
> possible to implement fragmentation without refcnt (at least with
> the current design)?

Will send a v2 with it.

> 
> What do you think of adding a warning as below?
> 
> > +ifeq ($(CONFIG_RTE_MBUF_REFCNT),y)
> >  SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
> > -SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_fragmentation.c
> +else
> +$(info WARNING: Fragmentation feature is disabled because it needs
> MBUF_REFCNT.)
> > +endif
> > +SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c

Good idea. Thanks Thomas!

> 
> --
> Thomas

[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-22 Thread Neil Horman

On Tue, Oct 21, 2014 at 01:17:01PM +, Liang, Cunming wrote:
> 
> 
> > -Original Message-
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Tuesday, October 21, 2014 6:33 PM
> > To: Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > cycles/packet
> > 
> > On Sun, Oct 12, 2014 at 11:10:39AM +, Liang, Cunming wrote:
> > > Hi Neil,
> > >
> > > Very appreciate your comments.
> > > I add inline reply, will send v3 asap when we get alignment.
> > >
> > > BRs,
> > > Liang Cunming
> > >
> > > > -Original Message-
> > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > Sent: Saturday, October 11, 2014 1:52 AM
> > > > To: Liang, Cunming
> > > > Cc: dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > cycles/packet
> > > >
> > > > On Fri, Oct 10, 2014 at 08:29:58PM +0800, Cunming Liang wrote:
> > > > > It provides unit test to measure cycles/packet in NIC loopback mode.
> > > > > It simply gives the average cycles of IO used per packet without test
> > equipment.
> > > > > When doing the test, make sure the link is UP.
> > > > >
> > > > > Usage Example:
> > > > > 1. Run unit test app in interactive mode
> > > > > app/test -c f -n 4 -- -i
> > > > > 2. Run and wait for the result
> > > > > pmd_perf_autotest
> > > > >
> > > > > There's option to choose rx/tx pair, default is vector.
> > > > > set_rxtx_mode [vector|scalar|full|hybrid]
> > > > > Note: To get acurate scalar fast, please choose 'vector' or 'hybrid' 
> > > > > without
> > > > INC_VEC=y in config
> > > > >
> > > > > Signed-off-by: Cunming Liang 
> > > > > Acked-by: Bruce Richardson 
> > > >
> > > > Notes inline
> > > >
> > > > > ---
> > > > >  app/test/Makefile   |1 +
> > > > >  app/test/commands.c |   38 +++
> > > > >  app/test/packet_burst_generator.c   |4 +-
> > > > >  app/test/test.h |4 +
> > > > >  app/test/test_pmd_perf.c|  626
> > > > +++
> > > > >  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
> > > > >  6 files changed, 677 insertions(+), 2 deletions(-)
> > > > >  create mode 100644 app/test/test_pmd_perf.c
> > > > >
> > > > > diff --git a/app/test/Makefile b/app/test/Makefile
> > > > > index 6af6d76..ebfa0ba 100644
> > > > > --- a/app/test/Makefile
> > > > > +++ b/app/test/Makefile
> > > > > @@ -56,6 +56,7 @@ SRCS-y += test_memzone.c
> > > > >
> > > > >  SRCS-y += test_ring.c
> > > > >  SRCS-y += test_ring_perf.c
> > > > > +SRCS-y += test_pmd_perf.c
> > > > >
> > > > >  ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
> > > > >  SRCS-y += test_table.c
> > > > > diff --git a/app/test/commands.c b/app/test/commands.c
> > > > > index a9e36b1..f1e746e 100644
> > > > > --- a/app/test/commands.c
> > > > > +++ b/app/test/commands.c
> > > > > @@ -310,12 +310,50 @@ cmdline_parse_inst_t cmd_quit = {
> > > > >
> > > > > +#define NB_ETHPORTS_USED(1)
> > > > > +#define NB_SOCKETS  (2)
> > > > > +#define MEMPOOL_CACHE_SIZE 250
> > > > > +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) +
> > > > RTE_PKTMBUF_HEADROOM)
> > > > Don't you want to size this in accordance with the amount of data your
> > sending
> > > > (64 Bytes as noted above)?
> > > [Liang, Cunming] The case is designed to measure small packet IO cost with
> > normal mbuf size.
> > > Even if decreasing the size, it won't gain significant cycles.
> > > >
> > That presumes a non-memory constrained system, doesn't it?  I suppose in the
> > end
> > as long as you have consistency, its not overly relevant, but it seems like
> > you'll want to add data sizing as a feature to this eventually (i.e. the 
> > ability
> > to test performance for larger frames sizes), at which point you'll need to 
> > make
> > this non-static anyway.
> [Liang, Cunming] For a normal Ethernet packet(w/o jumbo frame), packet size 
> is 1518B.
> As in really network, there won't have huge number of jumbo frames.
> The mbuf size 2048 is a reasonable value to cover most of the packet size.
> It's also be chosen by lots of NIC as the default receiving buffer size in 
> DMA register.
> In case larger than the size, it need do scatter and gather but lose some 
> performance.
> The unit test won't measure size from 64 to 9600, won't plan to measure 
> scatter-gather rx/tx.
> It focus on 64B packet size and taking the mbuf size being used the most 
> often.
Fine.

> > 
> > > > > +static void
> > > > > +print_ethaddr(const char *name, const struct ether_addr *eth_addr)
> > > > > +{
> > > > > + printf("%s%02X:%02X:%02X:%02X:%02X:%02X", name,
> > > > > + eth_addr->addr_bytes[0],
> > > > > + eth_addr->addr_bytes[1],
> > > > > + eth_addr->addr_bytes[2],
> > > > > + eth_addr->addr_bytes[3],
> > > > > + eth_addr->addr_bytes[4],
> > > > > +

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Richardson, Bruce



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Marc Sune
> Sent: Wednesday, October 22, 2014 10:50 AM
> To: Thomas Monjalon
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-
> initializers'
> 
> On 22/10/14 10:50, Thomas Monjalon wrote:
> > 2014-10-22 10:42, Marc Sune:
> >> The mutex needs to be initialized to RTE_SPINLOCK_INITIALIZER(0) too, or
> >> move the initialization of the mutex to rte_kni_init().
> > RTE_SPINLOCK_INITIALIZER is { 0 }
> > By initializing one field, all other fields are set to 0, so spinlock also.
> > Just choose one field and it's OK.
> > It should be tested with ICC also but I think it's OK.
> 
> Seems that you are right, at least for C99:
> 
> C99 Standard 6.7.8.21
> 
>  If there are fewer initializers in a brace-enclosed list than
> there are elements or members of an aggregate, or fewer characters
> in a string literal used to initialize an array of known size than
> there are elements in the array, the remainder of the aggregate
> shall be initialized implicitly the same as objects that have static
> storage duration.
> 
> 
> I am not sure if there can be problems with other C dialects (e.g. C11),
> I don't have the std here. So to prevent any problem with them (could
> produce a dead-lock during first rte_kni_alloc() that could be difficult
> to troubleshoot), I would still explicitly initialize the mutex, in one
> or the other way.
> 
> Just tell me if you agree and which one you prefer.
> 
> I don't have an ICC license. I am always trying it with GCC and clang.
> 
> Marc

ICC should be fine with this, it handles just initializing a single member of a 
structure as described by Thomas above.

/Bruce
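
For illustration, a minimal sketch of the initialization rule quoted above. The type and macro names (demo_spinlock_t, DEMO_SPINLOCK_INITIALIZER, struct demo_pool) are hypothetical stand-ins and not the real rte_kni definitions. The sketch assumes, as Thomas states, that the spinlock initializer is { 0 }. C11 carries the same wording in 6.7.9 paragraph 21, so the behaviour should not differ there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real DPDK definitions. */
typedef struct { volatile int locked; } demo_spinlock_t;
#define DEMO_SPINLOCK_INITIALIZER { 0 }

struct demo_pool {
	unsigned char initialized;
	unsigned int max_ifaces;
	void *slots;
	demo_spinlock_t mutex;
	void *free;
	void *free_tail;
};

/* Only one member is named; the remaining members are zero-initialized. */
static struct demo_pool pool_short = { .initialized = 0 };

/* Every member spelled out, as in the patch under discussion. */
static struct demo_pool pool_explicit = {
	.initialized = 0,
	.max_ifaces = 0,
	.slots = NULL,
	.mutex = DEMO_SPINLOCK_INITIALIZER,
	.free = NULL,
	.free_tail = NULL
};

/* Returns 1 when both forms give the same member values. */
static int demo_pools_match(void)
{
	/* The rule matters most for automatic storage, which is not zeroed by default. */
	struct demo_pool local = { .max_ifaces = 0 };

	assert(local.mutex.locked == 0);
	return pool_short.mutex.locked == pool_explicit.mutex.locked &&
	       pool_short.slots == pool_explicit.slots &&
	       pool_short.free == pool_explicit.free &&
	       pool_short.free_tail == pool_explicit.free_tail &&
	       local.slots == NULL && local.free == NULL;
}
```

The comparison is done member by member rather than with memcmp, since the standard makes no promise about padding bytes.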

[dpdk-dev] clang compilation error in librte_table

2014-10-22 Thread Marc Sune

Hello,

There seems to be compilation issues with clang(3.0-6.2) and current 
HEAD (0c6bc8e) within librte_table:

== Build lib/librte_table
   CC rte_table_lpm.o
   CC rte_table_lpm_ipv6.o
   CC rte_table_hash_key8.o
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:212:4: error: 
implicit declaration of function '_mm_minpos_epu16' is invalid in C99
   [-Werror,-Wimplicit-function-declaration]
 lru_update(bucket, i);
 ^
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:14: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);\
 ^
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:212:4: error: 
initializing '__m128i' with an expression of incompatible type 'int';
 lru_update(bucket, i);
 ^
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:10: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);\
 ^   ~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:229:4: error: 
initializing '__m128i' with an expression of incompatible type 'int';
 lru_update(bucket, i);
 ^
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:10: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);\
 ^   ~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:241:2: error: 
initializing '__m128i' with an expression of incompatible type 'int';
 lru_update(bucket, pos);
 ^~~
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:10: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);\
 ^   ~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:858:4: error: 
initializing '__m128i' with an expression of incompatible type 'int';
 lookup1_stage2_lru(pkt_index, mbuf, bucket,
 ^~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:624:2: note: 
expanded from:
 lru_update(bucket2, pos);   \
 ^
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:10: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);\
 ^   ~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:912:3: error: 
initializing '__m128i' with an expression of incompatible type 'int';
 lookup2_stage2_lru(pkt20_index, pkt21_index, mbuf20, 
mbuf21,
^~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:793:2: note: 
expanded from:
 lru_update(bucket20, pos20);\
 ^
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:10: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);\
 ^   ~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:912:3: error: 
initializing '__m128i' with an expression of incompatible type 'int';
 lookup2_stage2_lru(pkt20_index, pkt21_index, mbuf20, 
mbuf21,
^~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:794:2: note: 
expanded from:
 lru_update(bucket21, pos21);\
 ^
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:10: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);\
 ^   ~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:936:2: error: 
initializing '__m128i' with an expression of incompatible type 'int';
 lookup2_stage2_lru(pkt20_index, pkt21_index, mbuf20, mbuf21,
 ^~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:793:2: note: 
expanded from:
 lru_update(bucket20, pos20);\
 ^
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:10: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);\
 ^   ~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:936:2: error: 
initializing '__m128i' with an expression of incompatible type 'int';
 lookup2_stage2_lru(pkt20_index, pkt21_index, mbuf20, mbuf21,
 ^~~~
/home/marc/dpdk/lib/librte_table/rte_table_hash_key8.c:794:2: note: 
expanded from:
 lru_update(bucket21, pos21);\
 ^
/home/marc/dpdk/lib/librte_table/rte_lru.h:157:10: note: expanded from:
 __m128i d = _mm_minpos_epu16(c);

[dpdk-dev] clang compilation error ACL library

2014-10-22 Thread Marc Sune

Hi all,

The latest head produces this compilation error within librte_acl, with 
clang version 3.0-6.2:

   CC acl_gen.o
/home/marc/dpdk/lib/librte_acl/acl_gen.c:249:11: error: array index of 
'-128' indexes before the beginning of the array [-Werror,-Warray-bounds]
 index = dfa[QRANGE_MIN];
 ^   ~~
/home/marc/dpdk/lib/librte_acl/acl_gen.c:211:2: note: array 'dfa' 
declared here
 uint64_t *node_a, index, dfa[RTE_ACL_DFA_SIZE];
 ^
1 error generated.

Best regards
marc

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Marc Sune

Liu,

Can you confirm that this patch fixes the issue?

Thanks
marc

On 22/10/14 09:10, Marc Sune wrote:
> Fix for compilation warning 'missing-field-initializers' for some
> GCC and clang versions introduced in commit 0c6bc8e
>
> Signed-off-by: Marc Sune 
> ---
>   lib/librte_kni/rte_kni.c |9 -
>   1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> index f64a0a8..de29b99 100644
> --- a/lib/librte_kni/rte_kni.c
> +++ b/lib/librte_kni/rte_kni.c
> @@ -131,7 +131,14 @@ static void kni_free_mbufs(struct rte_kni *kni);
>   static void kni_allocate_mbufs(struct rte_kni *kni);
>   
>   static volatile int kni_fd = -1;
> -static struct rte_kni_memzone_pool kni_memzone_pool = {0};
> +static struct rte_kni_memzone_pool kni_memzone_pool = {
> + .initialized = 0,
> + .max_ifaces = 0,
> + .slots = 0,
> + .mutex =  RTE_SPINLOCK_INITIALIZER,
> + .free = NULL,
> + .free_tail = NULL
> +};
>   
>   static const struct rte_memzone *
>   kni_memzone_reserve(const char *name, size_t len, int socket_id,

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Marc Sune

Fix for compilation warning 'missing-field-initializers' for some
GCC and clang versions introduced in commit 0c6bc8e

Signed-off-by: Marc Sune 
---
 lib/librte_kni/rte_kni.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index f64a0a8..de29b99 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -131,7 +131,14 @@ static void kni_free_mbufs(struct rte_kni *kni);
 static void kni_allocate_mbufs(struct rte_kni *kni);

 static volatile int kni_fd = -1;
-static struct rte_kni_memzone_pool kni_memzone_pool = {0};
+static struct rte_kni_memzone_pool kni_memzone_pool = {
+   .initialized = 0,
+   .max_ifaces = 0,
+   .slots = 0,
+   .mutex =  RTE_SPINLOCK_INITIALIZER,
+   .free = NULL,
+   .free_tail = NULL
+};

 static const struct rte_memzone *
 kni_memzone_reserve(const char *name, size_t len, int socket_id,
-- 
1.7.10.4

[dpdk-dev] ixgbe_recv_pkts, ixgbe_recv_pkts_bulk_alloc. what is difference?

2014-10-22 Thread Jeff Shaw

On Wed, Oct 22, 2014 at 11:18:17PM +0900, GyuminHwang wrote:
> Hi all
> 
> I have several questions about ixgbe_rxtx.c especially Tx and Rx function.
> What is the difference between ixgbe_recv_pkts and
> ixgbe_recv_pkts_bulk_alloc? I already know the earlier function is
> non-bulk function and the later function is bulk function. But I want to
> know is the mechanism of these two functions, and the role of H/W ring
> and S/W ring in each function.
As you mentioned, the main difference is that the bulk_alloc version allocates 
packet buffers in bulk (using rte_mempool_get_bulk) while the ixgbe_recv_pkts 
function allocates a single buffer at a time to replace the one which was just 
used to receive a frame.  Another major difference with the bulk_alloc version 
is that the descriptor ring (aka H/W ring) is scanned in bulk to determine if 
multiple frames are available to be received.  The resulting performance is 
higher than if operations were done one at a time, as is the case with the 
ixgbe_recv_pkts function.  The drawback of using the bulk_alloc function is 
that it does not support more than one descriptor per frame, so you cannot use 
it if you are configured to receive packets greater than 2KB in size.

The H/W ring is the hardware descriptor ring on the NIC.  This is where 
descriptors are read/written.  There are plenty of details in section 7.1 of 
the Intel(R) 82599 10 Gigabit Ethernet Controller datasheet.  As for the 
software ring, this is where pointers to mbufs are stored.  You can think of 
the h/w ring as storing descriptors, and is used for controlling the NIC 
behavior, while the s/w ring is for storing buffer pointers.  The sw_ring[0] 
contains a pointer to the buffer to be used for hw_ring[0].

-Jeff
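
A toy model of the two rings described above may help. This is only a sketch under stated assumptions: the descriptor layout, the DEMO_DD_BIT flag and all demo_* names are hypothetical and much simpler than the real ixgbe structures. It shows sw_ring[i] holding the buffer pointer that belongs to hw_ring[i], a one-at-a-time receive that replaces the buffer immediately, and a scan that counts several completed descriptors in one pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8    /* ring length chosen for the sketch */
#define DEMO_DD_BIT    0x1u /* hypothetical "descriptor done" status flag */

struct demo_desc { uint64_t buf_addr; uint32_t status; };
struct demo_buf  { uint8_t data[64]; };

struct demo_rxq {
	struct demo_desc hw_ring[DEMO_RING_SIZE]; /* read and written by the NIC */
	struct demo_buf *sw_ring[DEMO_RING_SIZE]; /* buffer pointer for each slot */
	uint16_t tail;                            /* next slot software checks */
};

/* One at a time: take a single completed buffer and replace it right away. */
static struct demo_buf *demo_recv_one(struct demo_rxq *q, struct demo_buf *fresh)
{
	uint16_t i = q->tail;
	struct demo_buf *done;

	assert(fresh != NULL);
	if (!(q->hw_ring[i].status & DEMO_DD_BIT))
		return NULL;                      /* nothing received yet */
	done = q->sw_ring[i];
	q->sw_ring[i] = fresh;                    /* s/w ring tracks the new buffer */
	q->hw_ring[i].buf_addr = (uint64_t)(uintptr_t)fresh;
	q->hw_ring[i].status = 0;                 /* hand the slot back to the NIC */
	q->tail = (uint16_t)((i + 1) % DEMO_RING_SIZE);
	return done;
}

/* Bulk scan: count consecutive completed descriptors, up to max. */
static uint16_t demo_scan_done(const struct demo_rxq *q, uint16_t max)
{
	uint16_t n = 0;

	while (n < max && n < DEMO_RING_SIZE &&
	       (q->hw_ring[(q->tail + n) % DEMO_RING_SIZE].status & DEMO_DD_BIT))
		n++;
	return n;
}
```

A bulk receive path would call something like demo_scan_done first and then refill all consumed slots with one bulk allocation instead of one allocation per packet.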

[dpdk-dev] [PATCH v4 20/21] i40e: implement operations to configure flexible masks

2014-10-22 Thread Jingjing Wu

implement operation to configure flexible masks for each flow type in i40e pmd driver

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_fdir.c | 124 +++-
 1 file changed, 123 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_i40e/i40e_fdir.c b/lib/librte_pmd_i40e/i40e_fdir.c
index 67498ff..2184d93 100644
--- a/lib/librte_pmd_i40e/i40e_fdir.c
+++ b/lib/librte_pmd_i40e/i40e_fdir.c
@@ -85,6 +85,8 @@
 static int i40e_fdir_rx_queue_init(struct i40e_rx_queue *rxq);
 static int i40e_set_flx_pld_cfg(struct i40e_pf *pf,
 struct rte_eth_flex_payload_cfg *cfg);
+static int i40e_set_fdir_flx_mask(struct i40e_pf *pf,
+   struct rte_eth_fdir_flex_masks *flex_masks);
 static int i40e_fdir_construct_pkt(struct i40e_pf *pf,
 struct rte_eth_fdir_input *fdir_input,
 unsigned char *raw_pkt);
@@ -420,6 +422,123 @@ i40e_set_flx_pld_cfg(struct i40e_pf *pf,
return 0;
 }

+static inline void
+i40e_set_flex_mask_on_pctype(
+   struct i40e_hw *hw,
+   enum i40e_filter_pctype pctype,
+   struct rte_eth_fdir_flex_masks *flex_masks)
+{
+   uint32_t flxinset, mask;
+   int i;
+
+   flxinset = (flex_masks->words_mask <<
+   I40E_PRTQF_FD_FLXINSET_INSET_SHIFT) &
+   I40E_PRTQF_FD_FLXINSET_INSET_MASK;
+   I40E_WRITE_REG(hw, I40E_PRTQF_FD_FLXINSET(pctype), flxinset);
+
+   for (i = 0; i < flex_masks->nb_field; i++) {
+   mask = (flex_masks->field[i].bitmask <<
+   I40E_PRTQF_FD_MSK_MASK_SHIFT) &
+   I40E_PRTQF_FD_MSK_MASK_MASK;
+   mask |= ((flex_masks->field[i].offset +
+   I40E_FLX_OFFSET_IN_FIELD_VECTOR) <<
+   I40E_PRTQF_FD_MSK_OFFSET_SHIFT) &
+   I40E_PRTQF_FD_MSK_OFFSET_MASK;
+   I40E_WRITE_REG(hw, I40E_PRTQF_FD_MSK(pctype, i), mask);
+   }
+}
+
+/*
+ * i40e_set_fdir_flx_mask - configure the mask on flexible payload
+ * @pf: board private structure
+ * @flex_masks: mask for flexible payload
+ */
+static int
+i40e_set_fdir_flx_mask(struct i40e_pf *pf,
+   struct rte_eth_fdir_flex_masks *flex_masks)
+{
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   struct rte_eth_fdir_info fdir;
+   int ret = 0;
+
+   if (flex_masks == NULL)
+   return -EINVAL;
+
+   if (flex_masks->nb_field > 2) {
+   PMD_DRV_LOG(ERR, "bit masks cannot support more than 2 words.");
+   return -EINVAL;
+   }
+   /*
+* flexible payload masks need to be configured before
+* flow director filters are added
+* If filters exist, flush them.
+*/
+   memset(&fdir, 0, sizeof(fdir));
+   i40e_fdir_info_get(pf, &fdir);
+   if (fdir.info_ext.best_cnt + fdir.info_ext.guarant_cnt > 0) {
+   ret = i40e_fdir_flush(pf);
+   if (ret) {
+   PMD_DRV_LOG(ERR, "failed to flush fdir table.");
+   return ret;
+   }
+   }
+
+   switch (flex_masks->flow_type) {
+   case RTE_ETH_FLOW_TYPE_UDPV4:
+   i40e_set_flex_mask_on_pctype(hw,
+   I40E_FILTER_PCTYPE_NONF_IPV4_UDP,
+   flex_masks);
+   break;
+   case RTE_ETH_FLOW_TYPE_TCPV4:
+   i40e_set_flex_mask_on_pctype(hw,
+   I40E_FILTER_PCTYPE_NONF_IPV4_TCP,
+   flex_masks);
+   break;
+   case RTE_ETH_FLOW_TYPE_SCTPV4:
+   i40e_set_flex_mask_on_pctype(hw,
+   I40E_FILTER_PCTYPE_NONF_IPV4_SCTP,
+   flex_masks);
+   break;
+   case RTE_ETH_FLOW_TYPE_IPV4_OTHER:
+   /* set mask for both NONF_IPV4 and FRAG_IPV4 PCTYPE*/
+   i40e_set_flex_mask_on_pctype(hw,
+   I40E_FILTER_PCTYPE_NONF_IPV4_OTHER,
+   flex_masks);
+   i40e_set_flex_mask_on_pctype(hw,
+   I40E_FILTER_PCTYPE_FRAG_IPV4,
+   flex_masks);
+   break;
+   case RTE_ETH_FLOW_TYPE_UDPV6:
+   i40e_set_flex_mask_on_pctype(hw,
+   I40E_FILTER_PCTYPE_NONF_IPV6_UDP,
+   flex_masks);
+   break;
+   case RTE_ETH_FLOW_TYPE_TCPV6:
+   i40e_set_flex_mask_on_pctype(hw,
+   I40E_FILTER_PCTYPE_NONF_IPV6_TCP,
+   flex_masks);
+   break;
+   case RTE_ETH_FLOW_TYPE_SCTPV6:
+   i40e_set_flex_mask_on_pctype(hw,
+   I40E_FILTER_PCTYPE_NONF_IPV6_SCTP,
+   flex_masks);
+   break;
+   case RTE_ETH_FLOW_TYPE_IPV6_OTHER:
+   /* set mask for both NONF_IPV6 and FRAG_IPV6 PCTYPE*/
+   i40e_set_flex_mask_on_pctype(hw,
+
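
The shift-and-mask pattern used by i40e_set_flex_mask_on_pctype can be sketched in isolation as follows. The DEMO_* shift and mask values are assumptions made up for the example; the real I40E_PRTQF_FD_MSK_* values come from the i40e register header and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define DEMO_MSK_MASK_SHIFT   0
#define DEMO_MSK_MASK_MASK    (0xFFFFu << DEMO_MSK_MASK_SHIFT)
#define DEMO_MSK_OFFSET_SHIFT 16
#define DEMO_MSK_OFFSET_MASK  (0x3Fu << DEMO_MSK_OFFSET_SHIFT)

/* Pack a 16-bit word mask and a word offset into one 32-bit register value. */
static uint32_t demo_pack_mask(uint16_t bitmask, uint8_t offset)
{
	uint32_t reg;

	/* The two fields must not overlap, or one would corrupt the other. */
	assert((DEMO_MSK_MASK_MASK & DEMO_MSK_OFFSET_MASK) == 0);

	reg = ((uint32_t)bitmask << DEMO_MSK_MASK_SHIFT) & DEMO_MSK_MASK_MASK;
	reg |= ((uint32_t)offset << DEMO_MSK_OFFSET_SHIFT) & DEMO_MSK_OFFSET_MASK;
	return reg;
}

/* Recover the offset field from a packed register value. */
static uint8_t demo_unpack_offset(uint32_t reg)
{
	return (uint8_t)((reg & DEMO_MSK_OFFSET_MASK) >> DEMO_MSK_OFFSET_SHIFT);
}
```

Masking after the shift means an out-of-range offset is silently truncated to the field width rather than spilling into neighbouring bits, so callers that care should range-check beforehand.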

[dpdk-dev] [PATCH v4 19/21] ethdev: define structures for configuring flex masks

2014-10-22 Thread Jingjing Wu

define structures for configuring flexible masks

Signed-off-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index ca21313..3b336e4 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -120,7 +120,28 @@ struct rte_eth_flex_payload_cfg {
struct rte_eth_field_vector field[0];
 };

+/**
+ * A structure defined to specify each word's bit mask
+ */
+struct rte_eth_flex_mask {
+   uint8_t offset;  /**< word offset in flexible payload */
+   uint16_t bitmask;/**< bit mask for word defined by offset */
+};
+
+/**
+ * A structure used to configure FDIR masks for flexible payload
+ * for each flow type
+ */
+struct rte_eth_fdir_flex_masks {
+   enum rte_eth_flow_type flow_type;  /**< flow type */
+   uint8_t words_mask;  /**< bit i enables word i of 8 words flexible 
payload */
+   uint8_t nb_field;/**< the number of following fields */
+   struct rte_eth_flex_mask field[0];
+};
+
 #define RTE_ETH_FDIR_CFG_FLX  0x0001
+#define RTE_ETH_FDIR_CFG_MASK 0x0002
+#define RTE_ETH_FDIR_CFG_FLX_MASK 0x0003
 /**
  * A structure used to config FDIR filter global set
  * to support RTE_ETH_FILTER_FDIR with RTE_ETH_FILTER_SET operation.
@@ -130,6 +151,8 @@ struct rte_eth_fdir_cfg {
/**
 * A pointer to structure for the configuration e.g.
 * struct rte_eth_flex_payload_cfg for FDIR_CFG_FLX
+* struct rte_fdir_masks for FDIR_MASK
+* struct rte_eth_fdir_flex_masks for FDIR_FLX_MASK
*/
void *cfg;
 };
-- 
1.8.1.4
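
Since rte_eth_fdir_flex_masks ends in a zero-length array, a caller has to size the allocation by hand. A minimal sketch of that, using local copies of the structures with hypothetical demo_* names, a C99 flexible array member in place of the GNU field[0] form, and plain calloc rather than any DPDK allocator:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local copies for the example; not the real ethdev definitions. */
struct demo_flex_mask {
	uint8_t offset;   /* word offset in flexible payload */
	uint16_t bitmask; /* bit mask for the word at that offset */
};

struct demo_fdir_flex_masks {
	int flow_type;      /* stand-in for enum rte_eth_flow_type */
	uint8_t words_mask; /* bit i enables word i */
	uint8_t nb_field;   /* number of entries in field[] */
	struct demo_flex_mask field[];
};

/* Allocate the header plus nb_field trailing entries, all zeroed. */
static struct demo_fdir_flex_masks *demo_flex_masks_alloc(uint8_t nb_field)
{
	size_t len = offsetof(struct demo_fdir_flex_masks, field) +
		     (size_t)nb_field * sizeof(struct demo_flex_mask);
	struct demo_fdir_flex_masks *m = calloc(1, len);

	if (m != NULL)
		m->nb_field = nb_field;
	return m;
}

/* Fill one entry; the index must be within the allocated count. */
static void demo_flex_masks_set(struct demo_fdir_flex_masks *m, uint8_t i,
				uint8_t offset, uint16_t bitmask)
{
	assert(m != NULL && i < m->nb_field);
	m->field[i].offset = offset;
	m->field[i].bitmask = bitmask;
}
```

The caller owns the returned block and releases it with free().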

[dpdk-dev] [PATCH v4 18/21] testpmd: add test command to configure flexible payload

2014-10-22 Thread Jingjing Wu

add test command to configure flexible payload

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c | 143 +
 1 file changed, 143 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 7324783..1caca54 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -58,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -685,6 +686,10 @@ static void cmd_help_long_parsed(void *parsed_result,

"flush_flow_diretor (port_id)\n"
"Flush all flow director entries of a device.\n\n"
+
+   "flow_director_flex_payload (port_id)"
+   " (l2|l3|l4) (config)\n"
+   "Configure flex payload selection.\n\n"
);
}
 }
@@ -7907,6 +7912,143 @@ cmdline_parse_inst_t cmd_flush_flow_director = {
},
 };

+/* *** deal with flow director flexible payload configuration *** */
+struct cmd_flow_director_flexpayload_result {
+   cmdline_fixed_string_t flow_director_flexpayload;
+   uint8_t port_id;
+   cmdline_fixed_string_t payload_layer;
+   cmdline_fixed_string_t payload_cfg;
+};
+
+static inline int
+parse_flex_payload_cfg(const char *q_arg,
+struct rte_eth_flex_payload_cfg *cfg)
+{
+   char s[256];
+   const char *p, *p0 = q_arg;
+   char *end;
+   enum fieldnames {
+   FLD_OFFSET = 0,
+   FLD_SIZE,
+   _NUM_FLD
+   };
+   unsigned long int_fld[_NUM_FLD];
+   char *str_fld[_NUM_FLD];
+   int i;
+   unsigned size;
+
+   cfg->nb_field = 0;
+   p = strchr(p0, '(');
+   while (p != NULL) {
+   ++p;
+   p0 = strchr(p, ')');
+   if (p0 == NULL)
+   return -1;
+
+   size = p0 - p;
+   if (size >= sizeof(s))
+   return -1;
+
+   snprintf(s, sizeof(s), "%.*s", size, p);
+   if (rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',') != 
_NUM_FLD)
+   return -1;
+   for (i = 0; i < _NUM_FLD; i++) {
+   errno = 0;
+   int_fld[i] = strtoul(str_fld[i], &end, 0);
+   if (errno != 0 || end == str_fld[i] || int_fld[i] > 255)
+   return -1;
+   }
+   if (cfg->nb_field >= 3) {
+   printf("exceeded max number of fields\n");
+   return -1;
+   }
+   cfg->field[cfg->nb_field].offset = (uint8_t)int_fld[FLD_OFFSET];
+   cfg->field[cfg->nb_field].size = (uint8_t)int_fld[FLD_SIZE];
+   cfg->nb_field++;
+   p = strchr(p0, '(');
+   }
+   return 0;
+}
+
+static void
+cmd_flow_director_flxpld_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_flow_director_flexpayload_result *res = parsed_result;
+   struct rte_eth_fdir_cfg fdir_cfg;
+   struct rte_eth_flex_payload_cfg *flxpld_cfg;
+   int ret = 0;
+   int cfg_size = 3 * sizeof(struct rte_eth_field_vector) +
+ offsetof(struct rte_eth_flex_payload_cfg, field);
+
+   ret = rte_eth_dev_filter_supported(res->port_id, RTE_ETH_FILTER_FDIR);
+   if (ret < 0) {
+   printf("flow director is not supported on port %u.\n",
+   res->port_id);
+   return;
+   }
+
+   memset(&fdir_cfg, 0, sizeof(struct rte_eth_fdir_cfg));
+
+   flxpld_cfg = (struct rte_eth_flex_payload_cfg 
*)rte_zmalloc_socket("CLI",
+   cfg_size, CACHE_LINE_SIZE, rte_socket_id());
+
+   if (flxpld_cfg == NULL) {
+   printf("fail to malloc memory to configure flex payload\n");
+   return;
+   }
+
+   if (!strcmp(res->payload_layer, "l2"))
+   flxpld_cfg->type = RTE_ETH_L2_PAYLOAD;
+   else if (!strcmp(res->payload_layer, "l3"))
+   flxpld_cfg->type = RTE_ETH_L3_PAYLOAD;
+   else if (!strcmp(res->payload_layer, "l4"))
+   flxpld_cfg->type = RTE_ETH_L4_PAYLOAD;
+
+   ret = parse_flex_payload_cfg(res->payload_cfg, flxpld_cfg);
+   if (ret < 0) {
+   printf("error: Cannot parse flex payload input.\n");
+   rte_free(flxpld_cfg);
+   return;
+   }
+   fdir_cfg.cmd = RTE_ETH_FDIR_CFG_FLX;
+   fdir_cfg.cfg = flxpld_cfg;
+   ret = rte_eth_dev_filter_ctrl(res->port_id, RTE_ETH_FILTER_FDIR,
+RTE_ETH_FILTER_SET, &fdir_cfg);
+   if (ret < 0)
+   printf("fdir flex payload setting error: (%s)\n",
+   strerror(-ret));
+   rte_free(flxpld_cfg);
+}
+
+cmdline_parse_token_string_t
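
A standalone sketch of parsing a "(offset,size)(offset,size)" string is given below. It is not the testpmd code: the demo_* names are hypothetical, it uses only the C library instead of rte_strsplit, and it is stricter than the patch in that it rejects any text between or after the parenthesised pairs. The point it illustrates is checking the remaining capacity before storing each pair.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_FIELDS 3

struct demo_field { uint8_t offset; uint8_t size; };

/*
 * Parse one value in [0, 255] that must be followed by `term`.
 * Returns a pointer just past `term`, or NULL on any error.
 */
static const char *demo_parse_u8(const char *p, char term, uint8_t *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(p, &end, 0);
	if (errno != 0 || end == p || v > 255 || *end != term)
		return NULL;
	*out = (uint8_t)v;
	return end + 1;
}

/* Returns the number of fields parsed, or -1 on malformed or oversized input. */
static int demo_parse_fields(const char *s, struct demo_field *f, int max)
{
	int n = 0;

	assert(s != NULL && f != NULL);
	while (*s == '(') {
		if (n >= max) /* check capacity before writing f[n] */
			return -1;
		s = demo_parse_u8(s + 1, ',', &f[n].offset);
		if (s == NULL)
			return -1;
		s = demo_parse_u8(s, ')', &f[n].size);
		if (s == NULL)
			return -1;
		n++;
	}
	return (*s == '\0') ? n : -1;
}
```

On failure the output array may hold partially written entries, so a caller should only trust it when the return value is non-negative.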

[dpdk-dev] [PATCH v4 17/21] i40e: implement operations to configure flexible payload

2014-10-22 Thread Jingjing Wu

implement operation to configure flexible payload in i40e pmd driver

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_fdir.c | 106 
 1 file changed, 106 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_fdir.c b/lib/librte_pmd_i40e/i40e_fdir.c
index 00ee470..67498ff 100644
--- a/lib/librte_pmd_i40e/i40e_fdir.c
+++ b/lib/librte_pmd_i40e/i40e_fdir.c
@@ -83,6 +83,8 @@
 #define I40E_FLX_OFFSET_IN_FIELD_VECTOR   50

 static int i40e_fdir_rx_queue_init(struct i40e_rx_queue *rxq);
+static int i40e_set_flx_pld_cfg(struct i40e_pf *pf,
+struct rte_eth_flex_payload_cfg *cfg);
 static int i40e_fdir_construct_pkt(struct i40e_pf *pf,
 struct rte_eth_fdir_input *fdir_input,
 unsigned char *raw_pkt);
@@ -327,6 +329,98 @@ i40e_fdir_teardown(struct i40e_pf *pf)
 }

 /*
+ * i40e_set_flx_pld_cfg -configure the rule how bytes stream is extracted as 
flexible payload
+ * @pf: board private structure
+ * @cfg: the rule how bytes stream is extracted as flexible payload
+ */
+static int
+i40e_set_flx_pld_cfg(struct i40e_pf *pf,
+struct rte_eth_flex_payload_cfg *cfg)
+{
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   struct rte_eth_fdir_info fdir;
+   uint32_t flx_pit;
+   uint16_t min_next_off = 0;
+   uint8_t idx = 0;
+   int i = 0;
+   int num_word = 0;
+   int ret;
+
+   if (cfg == NULL || cfg->nb_field > 3)
+   return -EINVAL;
+
+   if (cfg->type == RTE_ETH_L2_PAYLOAD)
+   idx = 0;
+   else if (cfg->type == RTE_ETH_L3_PAYLOAD)
+   idx = 1;
+   else if (cfg->type == RTE_ETH_L4_PAYLOAD)
+   idx = 2;
+   else {
+   PMD_DRV_LOG(ERR, "unknown payload type.");
+   return -EINVAL;
+   }
+   /*
+* flexible payload need to be configured before
+* flow director filters are added
+* If filters exist, flush them.
+*/
+   memset(&fdir, 0, sizeof(fdir));
+   i40e_fdir_info_get(pf, &fdir);
+   if (fdir.info_ext.best_cnt + fdir.info_ext.guarant_cnt > 0) {
+   ret = i40e_fdir_flush(pf);
+   if (ret) {
+   PMD_DRV_LOG(ERR, " failed to flush fdir table.");
+   return ret;
+   }
+   }
+
+   for (i = 0; i < cfg->nb_field; i++) {
+   /*
+* check register's constrain
+* Current Offset >= previous offset + previous FSIZE.
+*/
+   if (cfg->field[i].offset < min_next_off) {
+   PMD_DRV_LOG(ERR, "Offset should be larger than "
+   "previous offset + previous FSIZE.");
+   return -EINVAL;
+   }
+   flx_pit = (cfg->field[i].offset <<
+   I40E_PRTQF_FLX_PIT_SOURCE_OFF_SHIFT) &
+   I40E_PRTQF_FLX_PIT_SOURCE_OFF_MASK;
+   flx_pit |= (cfg->field[i].size <<
+   I40E_PRTQF_FLX_PIT_FSIZE_SHIFT) &
+   I40E_PRTQF_FLX_PIT_FSIZE_MASK;
+   flx_pit |= ((num_word + I40E_FLX_OFFSET_IN_FIELD_VECTOR) <<
+   I40E_PRTQF_FLX_PIT_DEST_OFF_SHIFT) &
+   I40E_PRTQF_FLX_PIT_DEST_OFF_MASK;
+   /* support no more than 8 words flexible payload*/
+   num_word += cfg->field[i].size;
+   if (num_word > 8)
+   return -EINVAL;
+
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(idx * 3 + i), flx_pit);
+   /* record the info in fdir structure */
+   pf->fdir.flex_set[idx][i].offset = cfg->field[i].offset;
+   pf->fdir.flex_set[idx][i].size = cfg->field[i].size;
+   min_next_off = cfg->field[i].offset + cfg->field[i].size;
+   }
+
+   for (; i < 3; i++) {
+   /* set the non-used register obeying register's constrain */
+   flx_pit = (min_next_off << I40E_PRTQF_FLX_PIT_SOURCE_OFF_SHIFT) 
&
+   I40E_PRTQF_FLX_PIT_SOURCE_OFF_MASK;
+   flx_pit |= (1 << I40E_PRTQF_FLX_PIT_FSIZE_SHIFT) &
+   I40E_PRTQF_FLX_PIT_FSIZE_MASK;
+   flx_pit |= (63 << I40E_PRTQF_FLX_PIT_DEST_OFF_SHIFT) &
+   I40E_PRTQF_FLX_PIT_DEST_OFF_MASK;
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(idx * 3 + i), flx_pit);
+   min_next_off++;
+   }
+
+   return 0;
+}
+
+/*
  * i40e_fdir_construct_pkt - construct packet based on fields in input
  * @pf: board private structure
  * @fdir_input: input set of the flow director entry
@@ -958,6 +1052,7 @@ i40e_fdir_info_get(struct i40e_pf *pf, struct 
rte_eth_fdir_info *fdir)
 int
 i40e_fdir_ctrl_func(struct i40e_pf *pf, enum rte_filter_op filter_op, void 
*arg)
 {
+   struct rte_eth_fdir_cfg
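
The register constraints enforced in i40e_set_flx_pld_cfg (each offset at or beyond the previous offset plus the previous size, at most 3 fields, at most 8 words in total) can be checked on their own before any hardware state is touched. The following is a sketch of that idea with hypothetical demo_* names; validating up front is one possible design choice, not what the patch itself does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FLX_MAX_FIELDS 3 /* fields per payload layer, as in the patch */
#define DEMO_FLX_MAX_WORDS  8 /* total flexible payload words, as in the patch */

struct demo_flx_field {
	uint8_t offset; /* source word offset */
	uint8_t size;   /* field size in words */
};

/* Returns 0 when the layout satisfies all constraints, -1 otherwise. */
static int demo_check_layout(const struct demo_flx_field *f, int nb_field)
{
	unsigned int min_next_off = 0; /* lowest offset the next field may use */
	unsigned int num_word = 0;     /* words consumed so far */
	int i;

	assert(f != NULL || nb_field == 0);
	if (nb_field < 0 || nb_field > DEMO_FLX_MAX_FIELDS)
		return -1;

	for (i = 0; i < nb_field; i++) {
		/* current offset >= previous offset + previous size */
		if (f[i].offset < min_next_off)
			return -1;
		num_word += f[i].size;
		if (num_word > DEMO_FLX_MAX_WORDS)
			return -1;
		min_next_off = (unsigned int)f[i].offset + f[i].size;
	}
	return 0;
}
```

Running such a check first would mean a rejected configuration leaves both the registers and the existing filter table untouched.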

[dpdk-dev] [PATCH v4 16/21] ethdev: define structures for configuring flexible payload

2014-10-22 Thread Jingjing Wu

define structures for configuring flexible payload

Signed-off-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 43 +
 1 file changed, 43 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 7ca1d6b..ca21313 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -76,6 +76,25 @@ enum rte_filter_op {
  * Define all structures for Flow Director Filter type corresponding with specific operations.
  */

+
+/**
+ * A structure that defines one field vector entry (source offset and size).
+ */
+struct rte_eth_field_vector {
+   uint8_t offset;   /**< Source word offset */
+   uint8_t size; /**< Field Size defined in word units */
+};
+
+/**
+ * payload type
+ */
+enum rte_eth_payload_type {
+   RTE_ETH_PAYLOAD_UNKNOWN = 0,
+   RTE_ETH_L2_PAYLOAD,
+   RTE_ETH_L3_PAYLOAD,
+   RTE_ETH_L4_PAYLOAD,
+};
+
 /**
  * flow type
  */
@@ -92,6 +111,30 @@ enum rte_eth_flow_type {
 };

 /**
+ * A structure used to select fields extracted from the protocol layers to
+ * the Field Vector as flexible payload for filter
+ */
+struct rte_eth_flex_payload_cfg {
+   enum rte_eth_payload_type type;  /**< payload type */
+   uint8_t nb_field;/**< the number of following fields */
+   struct rte_eth_field_vector field[0];
+};
+
+#define RTE_ETH_FDIR_CFG_FLX  0x0001
+/**
+ * A structure used to config FDIR filter global set
+ * to support RTE_ETH_FILTER_FDIR with RTE_ETH_FILTER_SET operation.
+ */
+struct rte_eth_fdir_cfg {
+   uint16_t cmd;  /**< sub command  */
+   /**
+* A pointer to structure for the configuration e.g.
+* struct rte_eth_flex_payload_cfg for FDIR_CFG_FLX
+   */
+   void *cfg;
+};
+
+/**
  * A structure used to define the input for IPV4 UDP flow
  */
 struct rte_eth_udpv4_flow {
-- 
1.8.1.4
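
A note for readers of this patch: `struct rte_eth_flex_payload_cfg` ends in a variable-length `field[]` array, so callers have to size the allocation themselves. The following is a minimal sketch, not code from the series. The struct names are local stand-ins for the ones in the patch, `flex_cfg_alloc()` and `flex_cfg_check()` are hypothetical helpers, and the limit of three fields per payload type is an assumption read off the `idx * 3 + i` indexing in the i40e patch. The check mirrors the driver's rules: offsets must not overlap the previous field, and the total must not exceed 8 words.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins for the structures added by the patch (illustrative only). */
struct field_vector {
	uint8_t offset; /* source word offset */
	uint8_t size;   /* field size in words */
};

struct flex_payload_cfg {
	int type;                    /* payload type, e.g. L2/L3/L4 */
	uint8_t nb_field;            /* number of entries in field[] */
	struct field_vector field[]; /* C99 flexible array member */
};

#define MAX_FLEX_FIELDS 3 /* assumption: three fields per payload type */
#define MAX_FLEX_WORDS  8 /* limit enforced by the i40e patch */

/* Allocate a zeroed config with room for nb_field entries; caller frees it. */
static struct flex_payload_cfg *
flex_cfg_alloc(int type, uint8_t nb_field)
{
	struct flex_payload_cfg *cfg;

	assert(nb_field <= MAX_FLEX_FIELDS);
	cfg = calloc(1, sizeof(*cfg) + nb_field * sizeof(cfg->field[0]));
	if (cfg != NULL) {
		cfg->type = type;
		cfg->nb_field = nb_field;
	}
	return cfg;
}

/* Return 0 if the fields are ordered, non-overlapping and fit in 8 words. */
static int
flex_cfg_check(const struct flex_payload_cfg *cfg)
{
	unsigned int min_next_off = 0, num_word = 0, i;

	for (i = 0; i < cfg->nb_field; i++) {
		if (cfg->field[i].offset < min_next_off)
			return -1;
		num_word += cfg->field[i].size;
		if (num_word > MAX_FLEX_WORDS)
			return -1;
		min_next_off = cfg->field[i].offset + cfg->field[i].size;
	}
	return 0;
}
```

The patch itself declares the array as `field[0]` (a GNU zero-length array); the sketch uses the C99 flexible array member, which has the same layout.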

[dpdk-dev] [PATCH v4 15/21] testpmd: add test command to flush flow director table

2014-10-22 Thread Jingjing Wu

add test command to flush flow director table

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 5705b65..7324783 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -682,6 +682,9 @@ static void cmd_help_long_parsed(void *parsed_result,
" flexwords (flexwords_value) (drop|fwd)"
" queue (queue_id) fd_id (fd_id_value)\n"
"Add/Del a SCTP type flow director filter.\n\n"
+
+   "flush_flow_director (port_id)\n"
+   "Flush all flow director entries of a device.\n\n"
);
}
 }
@@ -7859,6 +7862,51 @@ cmdline_parse_inst_t cmd_add_del_sctp_flow_director = {
},
 };

+struct cmd_flush_flow_director_result {
+   cmdline_fixed_string_t flush_flow_director;
+   uint8_t port_id;
+};
+
+cmdline_parse_token_string_t cmd_flush_flow_director_flush =
+   TOKEN_STRING_INITIALIZER(struct cmd_flush_flow_director_result,
+flush_flow_director, "flush_flow_director");
+cmdline_parse_token_num_t cmd_flush_flow_director_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_flush_flow_director_result,
+ port_id, UINT8);
+
+static void
+cmd_flush_flow_director_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_flush_flow_director_result *res = parsed_result;
+   int ret = 0;
+
+   ret = rte_eth_dev_filter_supported(res->port_id, RTE_ETH_FILTER_FDIR);
+   if (ret < 0) {
+   printf("flow director is not supported on port %u.\n",
+   res->port_id);
+   return;
+   }
+
+   ret = rte_eth_dev_filter_ctrl(res->port_id, RTE_ETH_FILTER_FDIR,
+   RTE_ETH_FILTER_FLUSH, NULL);
+   if (ret < 0)
+   printf("flow director table flushing error: (%s)\n",
+   strerror(-ret));
+}
+
+cmdline_parse_inst_t cmd_flush_flow_director = {
+   .f = cmd_flush_flow_director_parsed,
+   .data = NULL,
+   .help_str = "flush all flow director entries of a device on NIC",
+   .tokens = {
+   (void *)&cmd_flush_flow_director_flush,
+   (void *)&cmd_flush_flow_director_port_id,
+   NULL,
+   },
+};
+
 /* 

 */

 /* list of instructions */
@@ -7988,6 +8036,7 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_add_del_ip_flow_director,
(cmdline_parse_inst_t *)&cmd_add_del_udp_flow_director,
(cmdline_parse_inst_t *)&cmd_add_del_sctp_flow_director,
+   (cmdline_parse_inst_t *)&cmd_flush_flow_director,
NULL,
 };

-- 
1.8.1.4

[dpdk-dev] [PATCH v4 14/21] i40e: implement operation to flush flow director table

2014-10-22 Thread Jingjing Wu

implement operation to flush flow director table

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_fdir.c | 47 +
 1 file changed, 47 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_fdir.c b/lib/librte_pmd_i40e/i40e_fdir.c
index d2c8304..00ee470 100644
--- a/lib/librte_pmd_i40e/i40e_fdir.c
+++ b/lib/librte_pmd_i40e/i40e_fdir.c
@@ -73,6 +73,10 @@
 #define I40E_FDIR_WAIT_COUNT   10
 #define I40E_FDIR_WAIT_INTERVAL_US 1000

+/* Wait count and interval for fdir filter flush */
+#define I40E_FDIR_FLUSH_RETRY   50
+#define I40E_FDIR_FLUSH_INTERVAL_MS 5
+
 #define I40E_COUNTER_PF   2
 /* Statistic counter index for one pf */
 #define I40E_COUNTER_INDEX_FDIR(pf_id)   (0 + (pf_id) * I40E_COUNTER_PF)
@@ -89,6 +93,7 @@ static int i40e_fdir_filter_programming(struct i40e_pf *pf,
enum i40e_filter_pctype pctype,
struct rte_eth_fdir_filter *filter,
bool add);
+static int i40e_fdir_flush(struct i40e_pf *pf);
 static void i40e_fdir_info_get(struct i40e_pf *pf,
   struct rte_eth_fdir_info *fdir);

@@ -876,6 +881,45 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
 }

 /*
+ * i40e_fdir_flush - clear all filters of Flow Director table
+ * @pf: board private structure
+ */
+static int
+i40e_fdir_flush(struct i40e_pf *pf)
+{
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   uint32_t reg;
+   uint16_t guarant_cnt, best_cnt;
+   int i;
+
+   I40E_WRITE_REG(hw, I40E_PFQF_CTL_1, I40E_PFQF_CTL_1_CLEARFDTABLE_MASK);
+   I40E_WRITE_FLUSH(hw);
+
+   for (i = 0; i < I40E_FDIR_FLUSH_RETRY; i++) {
+   rte_delay_ms(I40E_FDIR_FLUSH_INTERVAL_MS);
+   reg = I40E_READ_REG(hw, I40E_PFQF_CTL_1);
+   if (!(reg & I40E_PFQF_CTL_1_CLEARFDTABLE_MASK))
+   break;
+   }
+   if (i >= I40E_FDIR_FLUSH_RETRY) {
+   PMD_DRV_LOG(ERR, "FD table did not flush, may need more time.");
+   return -ETIMEDOUT;
+   }
+   guarant_cnt = (uint16_t)((I40E_READ_REG(hw, I40E_PFQF_FDSTAT) &
+   I40E_PFQF_FDSTAT_GUARANT_CNT_MASK) >>
+   I40E_PFQF_FDSTAT_GUARANT_CNT_SHIFT);
+   best_cnt = (uint16_t)((I40E_READ_REG(hw, I40E_PFQF_FDSTAT) &
+   I40E_PFQF_FDSTAT_BEST_CNT_MASK) >>
+   I40E_PFQF_FDSTAT_BEST_CNT_SHIFT);
+   if (guarant_cnt != 0 || best_cnt != 0) {
+   PMD_DRV_LOG(ERR, "Failed to flush FD table.");
+   return -ENOSYS;
+   } else
+   PMD_DRV_LOG(INFO, "FD table Flush success.");
+   return 0;
+}
+
+/*
  * i40e_fdir_info_get - get information of Flow Director
  * @pf: ethernet device to get info from
  * @fdir: a pointer to a structure of type *rte_eth_fdir_info* to be filled with
@@ -935,6 +979,9 @@ i40e_fdir_ctrl_func(struct i40e_pf *pf, enum rte_filter_op filter_op, void *arg)
(struct rte_eth_fdir_filter *)arg,
FALSE);
break;
+   case RTE_ETH_FILTER_FLUSH:
+   ret = i40e_fdir_flush(pf);
+   break;
case RTE_ETH_FILTER_INFO:
i40e_fdir_info_get(pf, (struct rte_eth_fdir_info *)arg);
break;
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 13/21] testpmd: display fdir statistics

2014-10-22 Thread Jingjing Wu

display flow director's statistics information

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/config.c | 38 ++
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 2a1b93f..8625251 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1815,26 +1815,48 @@ fdir_remove_signature_filter(portid_t port_id,
 void
 fdir_get_infos(portid_t port_id)
 {
-   struct rte_eth_fdir fdir_infos;
+   struct rte_eth_fdir_info fdir_infos;
+   int ret;

static const char *fdir_stats_border = "";

if (port_id_is_invalid(port_id))
return;

-   rte_eth_dev_fdir_get_infos(port_id, &fdir_infos);
-
+   memset(&fdir_infos, 0, sizeof(fdir_infos));
+   ret = rte_eth_dev_filter_ctrl(port_id, RTE_ETH_FILTER_FDIR,
+  RTE_ETH_FILTER_INFO, &fdir_infos);
+   if (ret < 0) {
+   ret = rte_eth_dev_fdir_get_infos(port_id, &fdir_infos.info);
+   if (ret < 0) {
+   printf("\n getting fdir info fails on port %-2d\n",
+   port_id);
+   return;
+   }
+   fdir_infos.mode = (fdir_conf.mode == RTE_FDIR_MODE_NONE) ? 0 : 1;
+   }
printf("\n  %s FDIR infos for port %-2d %s\n",
   fdir_stats_border, port_id, fdir_stats_border);
-
+   if (fdir_infos.mode)
+   printf("  FDIR is enabled\n");
+   else
+   printf("  FDIR is disabled\n");
printf("  collision: %-10"PRIu64"  free: %"PRIu64"\n"
   "  maxhash:   %-10"PRIu64"  maxlen:   %"PRIu64"\n"
   "  add:   %-10"PRIu64"  remove:   %"PRIu64"\n"
   "  f_add: %-10"PRIu64"  f_remove: %"PRIu64"\n",
-  (uint64_t)(fdir_infos.collision), (uint64_t)(fdir_infos.free),
-  (uint64_t)(fdir_infos.maxhash), (uint64_t)(fdir_infos.maxlen),
-  fdir_infos.add, fdir_infos.remove,
-  fdir_infos.f_add, fdir_infos.f_remove);
+  (uint64_t)(fdir_infos.info.collision), (uint64_t)(fdir_infos.info.free),
+  (uint64_t)(fdir_infos.info.maxhash), (uint64_t)(fdir_infos.info.maxlen),
+  fdir_infos.info.add, fdir_infos.info.remove,
+  fdir_infos.info.f_add, fdir_infos.info.f_remove);
+   printf("  guarant_space: %-10"PRIu16
+  "  best_space:%-10"PRIu16"\n",
+  fdir_infos.info_ext.guarant_spc,
+  fdir_infos.info_ext.best_spc);
+   printf("  guarant_count: %-10"PRIu16
+  "  best_count:%-10"PRIu16"\n",
+  fdir_infos.info_ext.guarant_cnt,
+  fdir_infos.info_ext.best_cnt);
printf("  %s%s\n",
   fdir_stats_border, fdir_stats_border);
 }
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 12/21] i40e: implement operations to get fdir info

2014-10-22 Thread Jingjing Wu

implement operation to get flow director information in i40e pmd driver

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_fdir.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_fdir.c b/lib/librte_pmd_i40e/i40e_fdir.c
index 8996a1c..d2c8304 100644
--- a/lib/librte_pmd_i40e/i40e_fdir.c
+++ b/lib/librte_pmd_i40e/i40e_fdir.c
@@ -89,6 +89,8 @@ static int i40e_fdir_filter_programming(struct i40e_pf *pf,
enum i40e_filter_pctype pctype,
struct rte_eth_fdir_filter *filter,
bool add);
+static void i40e_fdir_info_get(struct i40e_pf *pf,
+  struct rte_eth_fdir_info *fdir);

 static int
 i40e_fdir_rx_queue_init(struct i40e_rx_queue *rxq)
@@ -874,6 +876,36 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
 }

 /*
+ * i40e_fdir_info_get - get information of Flow Director
+ * @pf: ethernet device to get info from
+ * @fdir: a pointer to a structure of type *rte_eth_fdir_info* to be filled with
+ *the flow director information.
+ */
+static void
+i40e_fdir_info_get(struct i40e_pf *pf, struct rte_eth_fdir_info *fdir)
+{
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   uint32_t pfqf_ctl;
+
+   pfqf_ctl = I40E_READ_REG(hw, I40E_PFQF_CTL_0);
+   fdir->mode = pfqf_ctl & I40E_PFQF_CTL_0_FD_ENA_MASK ?
+RTE_FDIR_MODE_PERFECT : RTE_FDIR_MODE_NONE;
+   fdir->info_ext.guarant_spc =
+   (uint16_t)hw->func_caps.fd_filters_guaranteed;
+   fdir->info_ext.guarant_cnt =
+   (uint16_t)((I40E_READ_REG(hw, I40E_PFQF_FDSTAT) &
+   I40E_PFQF_FDSTAT_GUARANT_CNT_MASK) >>
+   I40E_PFQF_FDSTAT_GUARANT_CNT_SHIFT);
+   fdir->info_ext.best_spc =
+   (uint16_t)hw->func_caps.fd_filters_best_effort;
+   fdir->info_ext.best_cnt =
+   (uint16_t)((I40E_READ_REG(hw, I40E_PFQF_FDSTAT) &
+   I40E_PFQF_FDSTAT_BEST_CNT_MASK) >>
+   I40E_PFQF_FDSTAT_BEST_CNT_SHIFT);
+   return;
+}
+
+/*
  * i40e_fdir_ctrl_func - deal with all operations on flow director.
  * @pf: board private structure
  * @filter_op:operation will be taken.
@@ -903,6 +935,9 @@ i40e_fdir_ctrl_func(struct i40e_pf *pf, enum rte_filter_op filter_op, void *arg)
(struct rte_eth_fdir_filter *)arg,
FALSE);
break;
+   case RTE_ETH_FILTER_INFO:
+   i40e_fdir_info_get(pf, (struct rte_eth_fdir_info *)arg);
+   break;
default:
PMD_DRV_LOG(ERR, "unknown operation %u.", filter_op);
ret = -EINVAL;
-- 
1.8.1.4
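
As an aside on the counter extraction in `i40e_fdir_info_get()`: both counts come out of the same status register through a mask followed by a shift. A small standalone sketch of that decode is below. The shift and mask values are placeholders chosen for illustration (the authoritative ones are the `I40E_PFQF_FDSTAT_*` macros in the i40e register header), and `fdstat_decode()` is a hypothetical helper. It takes a single register snapshot so both counts describe the same instant, whereas the patch reads the register once per field.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder layout; the real values are in the i40e register header. */
#define GUARANT_CNT_SHIFT 0
#define GUARANT_CNT_MASK  (0x1FFFu << GUARANT_CNT_SHIFT)
#define BEST_CNT_SHIFT    16
#define BEST_CNT_MASK     (0x1FFFu << BEST_CNT_SHIFT)

struct fd_counts {
	uint16_t guarant_cnt; /* filters in the guaranteed space */
	uint16_t best_cnt;    /* filters in the best-effort space */
};

/* Split one snapshot of the status register into its two counters. */
static struct fd_counts
fdstat_decode(uint32_t fdstat)
{
	struct fd_counts c;

	/* the two fields must not share bits */
	assert((GUARANT_CNT_MASK & BEST_CNT_MASK) == 0);
	c.guarant_cnt = (uint16_t)((fdstat & GUARANT_CNT_MASK) >>
				   GUARANT_CNT_SHIFT);
	c.best_cnt = (uint16_t)((fdstat & BEST_CNT_MASK) >> BEST_CNT_SHIFT);
	return c;
}
```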

[dpdk-dev] [PATCH v4 11/21] ethdev: define structures for getting flow director information

2014-10-22 Thread Jingjing Wu

define structures for getting flow director information

Signed-off-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 40 
 lib/librte_ether/rte_ethdev.h   | 23 ---
 2 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 3efdaae..7ca1d6b 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -231,6 +231,46 @@ struct rte_eth_fdir_filter {
struct rte_eth_fdir_action action;  /**< action taken when match */
 };

+/**
+ * A structure used to report the status of the flow director filters in use.
+ */
+struct rte_eth_fdir {
+   /** Number of filters with collision indication. */
+   uint16_t collision;
+   /** Number of free (non programmed) filters. */
+   uint16_t free;
+   /** The Lookup hash value of the added filter that updated the value
+  of the MAXLEN field */
+   uint16_t maxhash;
+   /** Longest linked list of filters in the table. */
+   uint8_t maxlen;
+   /** Number of added filters. */
+   uint64_t add;
+   /** Number of removed filters. */
+   uint64_t remove;
+   /** Number of failed added filters (no more space in device). */
+   uint64_t f_add;
+   /** Number of failed removed filters. */
+   uint64_t f_remove;
+};
+
+struct rte_eth_fdir_ext {
+   uint16_t guarant_spc;  /**< guaranteed spaces.*/
+   uint16_t guarant_cnt;  /**< Number of filters in guaranteed spaces. */
+   uint16_t best_spc; /**< best effort spaces.*/
+   uint16_t best_cnt; /**< Number of filters in best effort spaces. */
+};
+
+/**
+ * A structure used to get the status information of flow director filter.
+ * to support RTE_ETH_FILTER_FDIR with RTE_ETH_FILTER_INFO operation.
+ */
+struct rte_eth_fdir_info {
+   int mode; /**< 0 if disabled, non-zero if enabled */
+   struct rte_eth_fdir info;
+   struct rte_eth_fdir_ext info_ext;
+};
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index b69a6af..0dc399d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -795,29 +795,6 @@ struct rte_fdir_masks {
 };

 /**
- *  A structure used to report the status of the flow director filters in use.
- */
-struct rte_eth_fdir {
-   /** Number of filters with collision indication. */
-   uint16_t collision;
-   /** Number of free (non programmed) filters. */
-   uint16_t free;
-   /** The Lookup hash value of the added filter that updated the value
-  of the MAXLEN field */
-   uint16_t maxhash;
-   /** Longest linked list of filters in the table. */
-   uint8_t maxlen;
-   /** Number of added filters. */
-   uint64_t add;
-   /** Number of removed filters. */
-   uint64_t remove;
-   /** Number of failed added filters (no more space in device). */
-   uint64_t f_add;
-   /** Number of failed removed filters. */
-   uint64_t f_remove;
-};
-
-/**
  * A structure used to enable/disable specific device interrupts.
  */
 struct rte_intr_conf {
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 10/21] testpmd: print extended fdir info in mbuf

2014-10-22 Thread Jingjing Wu

print extended fdir info in rxonly fwd engine when fdir matches.

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/rxonly.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index 98c788b..7f5099c 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -160,10 +160,18 @@ pkt_burst_receive(struct fwd_stream *fs)
if (ol_flags & PKT_RX_RSS_HASH) {
printf(" - RSS hash=0x%x", (unsigned) mb->hash.rss);
printf(" - RSS queue=0x%x",(unsigned) fs->rx_queue);
+   } else if (ol_flags & PKT_RX_FDIR) {
+   printf(" - FDIR matched ");
+   if (ol_flags & PKT_RX_FDIR_ID)
+   printf("ID=0x%x",
+  mb->hash.fdir.hi);
+   else if (ol_flags & PKT_RX_FDIR_FLX)
+   printf("flex bytes=0x%08x %08x",
+  mb->hash.fdir.hi, mb->hash.fdir.lo);
+   else
+   printf("hash=0x%x ID=0x%x ",
+  mb->hash.fdir.hash, mb->hash.fdir.id);
}
-   else if (ol_flags & PKT_RX_FDIR)
-   printf(" - FDIR hash=0x%x - FDIR id=0x%x ",
-  mb->hash.fdir.hash, mb->hash.fdir.id);
if (ol_flags & PKT_RX_VLAN_PKT)
printf(" - VLAN tci=0x%x", mb->vlan_tci);
printf("\n");
-- 
1.8.1.4
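
The selection order in the rxonly change (FD ID first, then flexible bytes, then the legacy hash/id pair) can be checked in isolation. The sketch below is an illustration only: `fdir_describe()` and `struct fdir_view` are hypothetical, the value used for `RX_FDIR` is an assumption (the real `PKT_RX_FDIR` is defined in rte_mbuf.h), and the two new flag bits are the ones introduced in patch 08/21 of this series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define RX_FDIR     (1ULL << 2)  /* assumed bit; see rte_mbuf.h for the real one */
#define RX_FDIR_ID  (1ULL << 11) /* as defined in patch 08/21 */
#define RX_FDIR_FLX (1ULL << 12) /* as defined in patch 08/21 */

/* Flat copy of the mbuf fdir fields, to keep the sketch standalone. */
struct fdir_view {
	uint16_t hash;
	uint16_t id;
	uint32_t hi;
	uint32_t lo;
};

/* Format the FDIR match info into buf; returns the snprintf() result. */
static int
fdir_describe(uint64_t ol_flags, const struct fdir_view *f,
	      char *buf, size_t len)
{
	assert(f != NULL && buf != NULL && len > 0);
	if (!(ol_flags & RX_FDIR)) {
		buf[0] = '\0';
		return 0;
	}
	if (ol_flags & RX_FDIR_ID)
		return snprintf(buf, len, "ID=0x%x", (unsigned)f->hi);
	if (ol_flags & RX_FDIR_FLX)
		return snprintf(buf, len, "flex bytes=0x%08x %08x",
				(unsigned)f->hi, (unsigned)f->lo);
	return snprintf(buf, len, "hash=0x%x ID=0x%x",
			(unsigned)f->hash, (unsigned)f->id);
}
```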

[dpdk-dev] [PATCH v4 09/21] i40e: report flow director match info to mbuf

2014-10-22 Thread Jingjing Wu

support setting the FDIR information in the mbuf when a filter matches

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_rxtx.c | 98 +++--
 1 file changed, 95 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index 1017e3f..5a361ca 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -105,6 +105,10 @@ i40e_rxd_status_to_pkt_flags(uint64_t qword)
I40E_RX_DESC_FLTSTAT_RSS_HASH) ==
I40E_RX_DESC_FLTSTAT_RSS_HASH) ? PKT_RX_RSS_HASH : 0;

+   /* Check if FDIR Match */
+   flags |= (uint16_t)(qword & (1 << I40E_RX_DESC_STATUS_FLM_SHIFT) ?
+   PKT_RX_FDIR : 0);
+
return flags;
 }

@@ -637,10 +641,38 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
-   mb->ol_flags = pkt_flags;
if (pkt_flags & PKT_RX_RSS_HASH)
mb->hash.rss = rte_le_to_cpu_32(\
rxdp->wb.qword0.hi_dword.rss);
+
+   if (pkt_flags & PKT_RX_FDIR) {
+#ifdef RTE_LIBRTE_I40E_16BYTE_RX_DESC
+   mb->hash.fdir.hi =
+   rte_le_to_cpu_32(\
+   rxdp[j].wb.qword0.hi_dword.fd);
+   pkt_flags |= PKT_RX_FDIR_ID;
+#else
+   if (((rxdp[j].wb.qword2.ext_status >>
+   I40E_RX_DESC_EXT_STATUS_FLEXBH_SHIFT) &
+   0x03) == 0x01) {
+   mb->hash.fdir.hi =
+   rte_le_to_cpu_32(\
+   rxdp[j].wb.qword3.hi_dword.fd_id);
+   pkt_flags |= PKT_RX_FDIR_ID;
+   } else if (((rxdp[j].wb.qword2.ext_status >>
+   I40E_RX_DESC_EXT_STATUS_FLEXBH_SHIFT) &
+   0x03) == 0x02) {
+   mb->hash.fdir.hi =
+   rte_le_to_cpu_32(\
+   rxdp[j].wb.qword3.hi_dword.flex_bytes_hi);
+   pkt_flags |= PKT_RX_FDIR_FLX;
+   }
+   mb->hash.fdir.lo =
+   rte_le_to_cpu_32(\
+   rxdp[j].wb.qword3.lo_dword.flex_bytes_lo);
+#endif
+   }
+   mb->ol_flags = pkt_flags;
}

for (j = 0; j < I40E_LOOK_AHEAD; j++)
@@ -873,10 +905,40 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
-   rxm->ol_flags = pkt_flags;
if (pkt_flags & PKT_RX_RSS_HASH)
rxm->hash.rss =
rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
+   if (pkt_flags & PKT_RX_FDIR) {
+#ifdef RTE_LIBRTE_I40E_16BYTE_RX_DESC
+   rxm->hash.fdir.hi =
+   rte_le_to_cpu_32(\
+   rxd.wb.qword0.hi_dword.fd);
+   pkt_flags |= PKT_RX_FDIR_ID;
+#else
+   if (((rxd.wb.qword2.ext_status >>
+   I40E_RX_DESC_EXT_STATUS_FLEXBH_SHIFT) &
+   0x03) == 0x01) {
+   rxm->hash.fdir.hi =
+   rte_le_to_cpu_32(\
+   rxd.wb.qword3.hi_dword.fd_id);
+   pkt_flags |= PKT_RX_FDIR_ID;
+   } else if (((rxd.wb.qword2.ext_status >>
+   I40E_RX_DESC_EXT_STATUS_FLEXBH_SHIFT) &
+   0x03) == 0x02) {
+   rxm->hash.fdir.hi =
+   rte_le_to_cpu_32(\
+   rxd.wb.qword3.hi_dword.flex_bytes_hi);
+   pkt_flags |= PKT_RX_FDIR_FLX;
+   }
+   if (((rxd.wb.qword2.ext_status >>
+   I40E_RX_DESC_EXT_STATUS_FLEXBL_SHIFT) &
+   0x03) == 0x01)
+

[dpdk-dev] [PATCH v4 08/21] mbuf: extend fdir field

2014-10-22 Thread Jingjing Wu

extend fdir field to support flex bytes reported when fdir match

Signed-off-by: Jingjing Wu 
---
 lib/librte_mbuf/rte_mbuf.h | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..d2fbf40 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -91,6 +91,8 @@ extern "C" {
 #define PKT_RX_IPV6_HDR_EXT  (1ULL << 8)  /**< RX packet with extended IPv6 header. */
 #define PKT_RX_IEEE1588_PTP  (1ULL << 9)  /**< RX IEEE1588 L2 Ethernet PT Packet. */
 #define PKT_RX_IEEE1588_TMST (1ULL << 10) /**< RX IEEE1588 L2/L4 timestamped packet.*/
+#define PKT_RX_FDIR_ID   (1ULL << 11) /**< FD id reported if FDIR match. */
+#define PKT_RX_FDIR_FLX  (1ULL << 12) /**< Flexible bytes reported if FDIR match. */

 #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
 #define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
@@ -171,8 +173,14 @@ struct rte_mbuf {
union {
uint32_t rss; /**< RSS hash result if RSS enabled */
struct {
-   uint16_t hash;
-   uint16_t id;
+   union {
+   struct {
+   uint16_t hash;
+   uint16_t id;
+   };
+   uint32_t lo; /**< flexible bytes low*/
+   };
+   uint32_t hi; /**< flexible bytes high*/
} fdir;   /**< Filter identifier if FDIR enabled */
uint32_t sched;   /**< Hierarchical scheduler */
} hash;   /**< hash information */
-- 
1.8.1.4
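
One consequence of this layout change is worth spelling out: `lo` shares storage with the existing `hash`/`id` pair, while `hi` is new storage, so the `fdir` member (and with it the whole `hash` union) grows from 4 to 8 bytes. The sketch below reproduces only the union, under the name `union pkt_hash` (a stand-in, not the real `rte_mbuf` field), together with a hypothetical setter. It relies on C11 anonymous structs and unions, as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the hash union after the patch. */
union pkt_hash {
	uint32_t rss; /* RSS hash result */
	struct {
		union {
			struct {
				uint16_t hash;
				uint16_t id;
			};
			uint32_t lo; /* flexible bytes low, aliases hash/id */
		};
		uint32_t hi; /* flexible bytes high, new storage */
	} fdir;
	uint32_t sched; /* hierarchical scheduler */
};

/* Store the two flexible-payload words reported on an FDIR match. */
static void
pkt_hash_set_flex(union pkt_hash *h, uint32_t hi, uint32_t lo)
{
	assert(h != NULL);
	h->fdir.hi = hi;
	h->fdir.lo = lo; /* overwrites fdir.hash and fdir.id */
}
```

Because `lo` also overlays `rss`, a consumer has to look at the offload flags to know which interpretation of the first four bytes is valid.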

[dpdk-dev] [PATCH v4 07/21] i40e: match counter for flow director

2014-10-22 Thread Jingjing Wu

support getting the fdir_match counter

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index f56a4f6..3ff3965 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -1285,6 +1285,9 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
I40E_GLPRT_PTC9522L(hw->port),
pf->offset_loaded, &os->tx_size_big,
&ns->tx_size_big);
+   i40e_stat_update_32(hw, I40E_GLQF_PCNT(pf->fdir.match_counter_index),
+  pf->offset_loaded,
+  &os->fd_sb_match, &ns->fd_sb_match);
/* GLPRT_MSPDC not supported */
/* GLPRT_XEC not supported */

@@ -1301,6 +1304,7 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
stats->obytes   = ns->eth.tx_bytes;
stats->oerrors  = ns->eth.tx_errors;
stats->imcasts  = ns->eth.rx_multicast;
+   stats->fdirmatch = ns->fd_sb_match;

/* Rx Errors */
stats->ibadcrc  = ns->crc_errors;
@@ -1376,6 +1380,7 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
ns->mac_short_packet_dropped);
PMD_DRV_LOG(DEBUG, "checksum_error:   %lu",
ns->checksum_error);
+   PMD_DRV_LOG(DEBUG, "fdir_match:   %lu", ns->fd_sb_match);
PMD_DRV_LOG(DEBUG, "* PF stats end 
");
 }

-- 
1.8.1.4

[dpdk-dev] [PATCH v4 06/21] testpmd: add test commands to add/delete flow director filter

2014-10-22 Thread Jingjing Wu

add commands which can be used to test adding or deleting 8 flow
types of the flow director filters: ipv4, tcpv4, udpv4, sctpv4,
ipv6, tcpv6, udpv6, sctpv6

Signed-off-by: Jingjing Wu 
---
 app/test-pmd/cmdline.c | 447 +
 app/test-pmd/testpmd.h |   3 +
 2 files changed, 450 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0b972f9..5705b65 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -660,6 +660,28 @@ static void cmd_help_long_parsed(void *parsed_result,

"get_flex_filter (port_id) index (idx)\n"
"get info of a flex filter.\n\n"
+
+   "flow_director_filter (port_id) (add|del|update)"
+   " flow (ip4|ip6) src (src_ip_address) dst (dst_ip_address)"
+   " flexwords (flexwords_value)"
+   " (drop|fwd) queue (queue_id) fd_id (fd_id_value)\n"
+   "Add/Del an IP type flow director filter.\n\n"
+
+   "flow_director_filter (port_id) (add|del|update)"
+   " flow (udp4|tcp4|udp6|tcp6)"
+   " src (src_ip_address) (src_port)"
+   " dst (dst_ip_address) (dst_port)"
+   " flexwords (flexwords_value)"
+   " (drop|fwd) queue (queue_id) fd_id (fd_id_value)\n"
+   "Add/Del a UDP/TCP type flow director filter.\n\n"
+
+   "flow_director_filter (port_id) (add|del|update)"
+   " flow (sctp4|sctp6)"
+   " src (src_ip_address) dst (dst_ip_address)"
+   " tag (verification_tag)"
+   " flexwords (flexwords_value) (drop|fwd)"
+   " queue (queue_id) fd_id (fd_id_value)\n"
+   "Add/Del a SCTP type flow director filter.\n\n"
);
}
 }
@@ -7415,6 +7437,428 @@ cmdline_parse_inst_t cmd_get_flex_filter = {
},
 };

+/* *** Filters Control *** */
+
+/* *** deal with flow director filter *** */
+struct cmd_flow_director_result {
+   cmdline_fixed_string_t flow_director_filter;
+   uint8_t port_id;
+   cmdline_fixed_string_t ops;
+   cmdline_fixed_string_t flow;
+   cmdline_fixed_string_t flow_type;
+   cmdline_fixed_string_t src;
+   cmdline_ipaddr_t ip_src;
+   uint16_t port_src;
+   cmdline_fixed_string_t dst;
+   cmdline_ipaddr_t ip_dst;
+   uint16_t port_dst;
+   cmdline_fixed_string_t verify_tag;
+   uint32_t verify_tag_value;
+   cmdline_fixed_string_t flexwords;
+   cmdline_fixed_string_t flexwords_value;
+   cmdline_fixed_string_t drop;
+   cmdline_fixed_string_t queue;
+   uint16_t  queue_id;
+   cmdline_fixed_string_t fd_id;
+   uint32_t  fd_id_value;
+};
+
+static inline int
+parse_flexwords(const char *q_arg, uint16_t *flexwords)
+{
+#define MAX_NUM_WORD 8
+   char s[256];
+   const char *p, *p0 = q_arg;
+   char *end;
+   unsigned long int_fld[MAX_NUM_WORD];
+   char *str_fld[MAX_NUM_WORD];
+   int i;
+   unsigned size;
+   int num_words = -1;
+
+   p = strchr(p0, '(');
+   if (p == NULL)
+   return -1;
+   ++p;
+   p0 = strchr(p, ')');
+   if (p0 == NULL)
+   return -1;
+
+   size = p0 - p;
+   if (size >= sizeof(s))
+   return -1;
+
+   snprintf(s, sizeof(s), "%.*s", size, p);
+   num_words = rte_strsplit(s, sizeof(s), str_fld, MAX_NUM_WORD, ',');
+   if (num_words < 0 || num_words > MAX_NUM_WORD)
+   return -1;
+   for (i = 0; i < num_words; i++) {
+   errno = 0;
+   int_fld[i] = strtoul(str_fld[i], &end, 0);
+   if (errno != 0 || end == str_fld[i] || int_fld[i] > UINT16_MAX)
+   return -1;
+   flexwords[i] = rte_cpu_to_be_16((uint16_t)int_fld[i]);
+   }
+   return num_words;
+}
+
+static void
+cmd_flow_director_filter_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_flow_director_result *res = parsed_result;
+   struct rte_eth_fdir_filter entry;
+   uint16_t flexwords[8];
+   int num_flexwords;
+   int ret = 0;
+
+   ret = rte_eth_dev_filter_supported(res->port_id, RTE_ETH_FILTER_FDIR);
+   if (ret < 0) {
+   printf("flow director is not supported on port %u.\n",
+   res->port_id);
+   return;
+   }
+   memset(flexwords, 0, sizeof(flexwords));
+   memset(&entry, 0, sizeof(struct rte_eth_fdir_filter));
+   num_flexwords = parse_flexwords(res->flexwords_value, flexwords);
+   if (num_flexwords < 0) {
+   printf("error: Cannot parse flexwords input.\n");
+
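
(The diff is cut off at this point in the archive.) The `parse_flexwords()` helper above accepts a parenthesised, comma-separated list of up to 8 values, each of which must fit in 16 bits. A standalone approximation that does not depend on `rte_strsplit()` is sketched below. `parse_words()` is a hypothetical name, it walks the string with `strtoul()` directly, it rejects empty items such as `()` or a trailing comma (a choice made here, not necessarily the patch's behavior), and it leaves the words in host byte order, whereas the patch converts them with `rte_cpu_to_be_16()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NUM_WORD 8

/* Parse "(w0,w1,...)" into up to 8 16-bit words; return the count or -1. */
static int
parse_words(const char *arg, uint16_t *words)
{
	const char *p, *close;
	char *end;
	unsigned long v;
	int n = 0;

	assert(arg != NULL && words != NULL);
	p = strchr(arg, '(');
	if (p == NULL)
		return -1;
	p++;
	close = strchr(p, ')');
	if (close == NULL)
		return -1;

	for (;;) {
		if (n == MAX_NUM_WORD)
			return -1; /* more than 8 words */
		errno = 0;
		v = strtoul(p, &end, 0); /* base 0: decimal, hex or octal */
		if (errno != 0 || end == p || v > UINT16_MAX)
			return -1;
		words[n++] = (uint16_t)v;
		if (end == close)
			return n;
		if (*end != ',')
			return -1;
		p = end + 1;
	}
}
```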

[dpdk-dev] [PATCH v4 05/21] i40e: implement operations to add/delete flow director

2014-10-22 Thread Jingjing Wu

Handle two operations for flow director:
 - RTE_ETH_FILTER_ADD
 - RTE_ETH_FILTER_DELETE
Encode the flow inputs into a programming packet, send the packet to the
filter programming queue, and check the status on the status report queue.

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |   6 +-
 lib/librte_pmd_i40e/i40e_ethdev.h |   3 +
 lib/librte_pmd_i40e/i40e_fdir.c   | 622 ++
 3 files changed, 629 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 59a4b6a..f56a4f6 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -4241,14 +4241,16 @@ i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
 enum rte_filter_op filter_op,
 void *arg)
 {
+   struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
int ret = 0;
-   (void)filter_op;
-   (void)arg;

if (dev == NULL)
return -EINVAL;

switch (filter_type) {
+   case RTE_ETH_FILTER_FDIR:
+   ret = i40e_fdir_ctrl_func(pf, filter_op, arg);
+   break;
default:
PMD_DRV_LOG(WARNING, "Filter type (%d) not supported",
filter_type);
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.h b/lib/librte_pmd_i40e/i40e_ethdev.h
index d998980..a20c00e 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.h
+++ b/lib/librte_pmd_i40e/i40e_ethdev.h
@@ -375,6 +375,9 @@ enum i40e_status_code i40e_fdir_setup_rx_resources(struct i40e_pf *pf,
unsigned int socket_id);
 int i40e_fdir_setup(struct i40e_pf *pf);
 void i40e_fdir_teardown(struct i40e_pf *pf);
+int i40e_fdir_ctrl_func(struct i40e_pf *pf,
+ enum rte_filter_op filter_op,
+ void *arg);

 /* I40E_DEV_PRIVATE_TO */
 #define I40E_DEV_PRIVATE_TO_PF(adapter) \
diff --git a/lib/librte_pmd_i40e/i40e_fdir.c b/lib/librte_pmd_i40e/i40e_fdir.c
index 848fb92..8996a1c 100644
--- a/lib/librte_pmd_i40e/i40e_fdir.c
+++ b/lib/librte_pmd_i40e/i40e_fdir.c
@@ -44,6 +44,10 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 

 #include "i40e_logs.h"
 #include "i40e/i40e_type.h"
@@ -51,7 +55,23 @@
 #include "i40e_rxtx.h"

 #define I40E_FDIR_MZ_NAME  "FDIR_MEMZONE"
+#ifndef IPV6_ADDR_LEN
+#define IPV6_ADDR_LEN  16
+#endif
+
 #define I40E_FDIR_PKT_LEN   512
+#define I40E_FDIR_IP_DEFAULT_LEN420
+#define I40E_FDIR_IP_DEFAULT_TTL0x40
+#define I40E_FDIR_IP_DEFAULT_VERSION_IHL0x45
+#define I40E_FDIR_TCP_DEFAULT_DATAOFF   0x50
+#define I40E_FDIR_IPv6_DEFAULT_VTC_FLOW 0x6030
+#define I40E_FDIR_IPv6_DEFAULT_HOP_LIMITS   0xFF
+#define I40E_FDIR_IPv6_PAYLOAD_LEN  380
+#define I40E_FDIR_UDP_DEFAULT_LEN   400
+
+/* Wait count and interval for fdir filter programming */
+#define I40E_FDIR_WAIT_COUNT   10
+#define I40E_FDIR_WAIT_INTERVAL_US 1000

 #define I40E_COUNTER_PF   2
 /* Statistic counter index for one pf */
@@ -59,6 +79,16 @@
 #define I40E_FLX_OFFSET_IN_FIELD_VECTOR   50

 static int i40e_fdir_rx_queue_init(struct i40e_rx_queue *rxq);
+static int i40e_fdir_construct_pkt(struct i40e_pf *pf,
+struct rte_eth_fdir_input *fdir_input,
+unsigned char *raw_pkt);
+static int i40e_add_del_fdir_filter(struct i40e_pf *pf,
+   struct rte_eth_fdir_filter *filter,
+   bool add);
+static int i40e_fdir_filter_programming(struct i40e_pf *pf,
+   enum i40e_filter_pctype pctype,
+   struct rte_eth_fdir_filter *filter,
+   bool add);

 static int
 i40e_fdir_rx_queue_init(struct i40e_rx_queue *rxq)
@@ -288,3 +318,595 @@ i40e_fdir_teardown(struct i40e_pf *pf)
pf->fdir.fdir_vsi = NULL;
return;
 }
+
+/*
+ * i40e_fdir_construct_pkt - construct packet based on fields in input
+ * @pf: board private structure
+ * @fdir_input: input set of the flow director entry
+ * @raw_pkt: a packet to be constructed
+ */
+static int
+i40e_fdir_construct_pkt(struct i40e_pf *pf,
+struct rte_eth_fdir_input *fdir_input,
+unsigned char *raw_pkt)
+{
+   struct ether_hdr *ether;
+   unsigned char *payload;
+   struct ipv4_hdr *ip;
+   struct ipv6_hdr *ip6;
+   struct udp_hdr *udp;
+   struct tcp_hdr *tcp;
+   struct sctp_hdr *sctp;
+   uint8_t size = 0;
+   int i, set_idx = 2; /* set_idx = 2 means using l4 payload by default */
+
+   switch (fdir_input->flow_type) {
+   case RTE_ETH_FLOW_TYPE_UDPV4:
+   ether = (struct ether_hdr *)raw_pkt;
+   ip = (struct ipv4_hdr *)(raw_pkt + sizeof(struct ether_hdr));
+   udp = (struct udp_hdr

[dpdk-dev] [PATCH v4 04/21] ethdev: define structures for adding/deleting flow director

2014-10-22 Thread Jingjing Wu

define structures to add or delete flow director filter

Signed-off-by: Jingjing Wu 
---
 lib/librte_ether/rte_eth_ctrl.h | 160 
 1 file changed, 160 insertions(+)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index df21ac6..3efdaae 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -51,6 +51,7 @@ extern "C" {
  */
 enum rte_filter_type {
RTE_ETH_FILTER_NONE = 0,
+   RTE_ETH_FILTER_FDIR,
RTE_ETH_FILTER_MAX
 };

@@ -71,6 +72,165 @@ enum rte_filter_op {
RTE_ETH_FILTER_OP_MAX
 };

+/**
+ * Define all structures for Flow Director Filter type corresponding with specific operations.
+ */
+
+/**
+ * flow type
+ */
+enum rte_eth_flow_type {
+   RTE_ETH_FLOW_TYPE_NONE = 0x0,
+   RTE_ETH_FLOW_TYPE_UDPV4,
+   RTE_ETH_FLOW_TYPE_TCPV4,
+   RTE_ETH_FLOW_TYPE_SCTPV4,
+   RTE_ETH_FLOW_TYPE_IPV4_OTHER,
+   RTE_ETH_FLOW_TYPE_UDPV6,
+   RTE_ETH_FLOW_TYPE_TCPV6,
+   RTE_ETH_FLOW_TYPE_SCTPV6,
+   RTE_ETH_FLOW_TYPE_IPV6_OTHER,
+};
+
+/**
+ * A structure used to define the input for IPV4 UDP flow
+ */
+struct rte_eth_udpv4_flow {
+   uint32_t src_ip;  /**< IPv4 source address to match. */
+   uint32_t dst_ip;  /**< IPv4 destination address to match. */
+   uint16_t src_port;/**< UDP Source port to match. */
+   uint16_t dst_port;/**< UDP Destination port to match. */
+};
+
+/**
+ * A structure used to define the input for IPV4 TCP flow
+ */
+struct rte_eth_tcpv4_flow {
+   uint32_t src_ip;  /**< IPv4 source address to match. */
+   uint32_t dst_ip;  /**< IPv4 destination address to match. */
+   uint16_t src_port;/**< TCP Source port to match. */
+   uint16_t dst_port;/**< TCP Destination port to match. */
+};
+
+/**
+ * A structure used to define the input for IPV4 SCTP flow
+ */
+struct rte_eth_sctpv4_flow {
+   uint32_t src_ip;  /**< IPv4 source address to match. */
+   uint32_t dst_ip;  /**< IPv4 destination address to match. */
+   uint32_t verify_tag;  /**< verify tag to match */
+};
+
+/**
+ * A structure used to define the input for IPV4 flow
+ */
+struct rte_eth_ipv4_flow {
+   uint32_t src_ip;  /**< IPv4 source address to match. */
+   uint32_t dst_ip;  /**< IPv4 destination address to match. */
+};
+
+/**
+ * A structure used to define the input for IPV6 UDP flow
+ */
+struct rte_eth_udpv6_flow {
+   uint32_t src_ip[4];  /**< IPv6 source address to match. */
+   uint32_t dst_ip[4];  /**< IPv6 destination address to match. */
+   uint16_t src_port;   /**< UDP Source port to match. */
+   uint16_t dst_port;   /**< UDP Destination port to match. */
+};
+
+/**
+ * A structure used to define the input for IPV6 TCP flow
+ */
+struct rte_eth_tcpv6_flow {
+   uint32_t src_ip[4];  /**< IPv6 source address to match. */
+   uint32_t dst_ip[4];  /**< IPv6 destination address to match. */
+   uint16_t src_port;   /**< TCP Source port to match. */
+   uint16_t dst_port;   /**< TCP Destination port to match. */
+};
+
+/**
+ * A structure used to define the input for IPV6 SCTP flow
+ */
+struct rte_eth_sctpv6_flow {
+   uint32_t src_ip[4];  /**< IPv6 source address to match. */
+   uint32_t dst_ip[4];  /**< IPv6 destination address to match. */
+   uint32_t verify_tag; /**< verify tag to match */
+};
+
+/**
+ * A structure used to define the input for IPV6 flow
+ */
+struct rte_eth_ipv6_flow {
+   uint32_t src_ip[4];  /**< IPv6 source address to match. */
+   uint32_t dst_ip[4];  /**< IPv6 destination address to match. */
+};
+
+/**
+ * An union contains the inputs for all types of flow
+ */
+union rte_eth_fdir_flow {
+   struct rte_eth_udpv4_flow  udp4_flow;
+   struct rte_eth_tcpv4_flow  tcp4_flow;
+   struct rte_eth_sctpv4_flow sctp4_flow;
+   struct rte_eth_ipv4_flow   ip4_flow;
+   struct rte_eth_udpv6_flow  udp6_flow;
+   struct rte_eth_tcpv6_flow  tcp6_flow;
+   struct rte_eth_sctpv6_flow sctp6_flow;
+   struct rte_eth_ipv6_flow   ip6_flow;
+};
+
+#define RTE_ETH_FDIR_MAX_FLEXWORD_LEN  8
+/**
+ * A structure used to contain extend input of flow
+ */
+struct rte_eth_fdir_flow_ext {
+   uint16_t vlan_tci;
+   uint8_t num_flexwords; /**< number of flexwords */
+   uint16_t flexwords[RTE_ETH_FDIR_MAX_FLEXWORD_LEN];
+   uint16_t dest_id;  /**< destination vsi or pool id*/
+};
+
+/**
+ * A structure used to define the input for an flow director filter entry
+ */
+struct rte_eth_fdir_input {
+   enum rte_eth_flow_type flow_type;  /**< type of flow */
+   union rte_eth_fdir_flow flow;  /**< specific flow structure */
+   struct rte_eth_fdir_flow_ext flow_ext; /**< specific flow info */
+};
+
+/**
+ * Flow director report status
+ */
+enum rte_eth_fdir_status {

[dpdk-dev] [PATCH v4 03/21] i40e: initialize flexible payload setting

2014-10-22 Thread Jingjing Wu

set flexible payload related registers to default value at initialization time.

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 33 +
 lib/librte_pmd_i40e/i40e_fdir.c   | 51 ++-
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 0054a1e..59a4b6a 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -322,6 +322,32 @@ static struct rte_driver rte_i40e_driver = {

 PMD_REGISTER_DRIVER(rte_i40e_driver);

+/*
+ * Initialize registers for flexible payload, which should be set by NVM.
+ * This should be removed from code once is fixed in NVM.
+ */
+static inline void i40e_flex_payload_reg_init(struct i40e_hw *hw)
+{
+   /* GLQF_ORT Registers */
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(18), 0x0030);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(19), 0x0030);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(26), 0x002B);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(30), 0x002B);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(33), 0x00E0);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(34), 0x00E3);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(35), 0x00E6);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(20), 0x0031);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(23), 0x0031);
+   I40E_WRITE_REG(hw, I40E_GLQF_ORT(63), 0x002D);
+
+   /* GLQF_PIT Registers */
+   I40E_WRITE_REG(hw, I40E_GLQF_PIT(16), 0x7480);
+   I40E_WRITE_REG(hw, I40E_GLQF_PIT(17), 0x7440);
+
+   /* GL_PRS_FVBM Registers */
+   I40E_WRITE_REG(hw, I40E_GL_PRS_FVBM(1), 0x835B);
+}
+
 static int
 eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,
   struct rte_eth_dev *dev)
@@ -385,6 +411,13 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,
return ret;
}

+   /*
+* To work around the NVM issue,initialize registers
+* for flexible payload by software.
+* It should be removed once issues are fixed in NVM.
+*/
+   i40e_flex_payload_reg_init(hw);
+
/* Initialize the parameters for adminq */
i40e_init_adminq_parameter(hw);
ret = i40e_init_adminq(hw);
diff --git a/lib/librte_pmd_i40e/i40e_fdir.c b/lib/librte_pmd_i40e/i40e_fdir.c
index bb474d2..848fb92 100644
--- a/lib/librte_pmd_i40e/i40e_fdir.c
+++ b/lib/librte_pmd_i40e/i40e_fdir.c
@@ -111,6 +111,53 @@ i40e_fdir_rx_queue_init(struct i40e_rx_queue *rxq)
 }

 /*
+ * Initialize the configuration about bytes stream extracted as flexible payload
+ * and mask setting
+ */
+static inline void
+i40e_init_flx_pld(struct i40e_pf *pf)
+{
+   uint8_t pctype;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+
+   /*
+* Define the bytes stream extracted as flexible payload in
+* field vector. By default, select 8 words from the beginning
+* of payload as flexible payload.
+*/
+   memset(pf->fdir.flex_set, 0, sizeof(pf->fdir.flex_set));
+
+   /* initialize the flexible payload for L2 payload*/
+   pf->fdir.flex_set[0][0].offset = 0;
+   pf->fdir.flex_set[0][0].size = 8;
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(0), 0xC900);
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(1), 0xFC29);/*non-used*/
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(2), 0xFC2A);/*non-used*/
+
+   /* initialize the flexible payload for L3 payload*/
+   pf->fdir.flex_set[1][0].offset = 0;
+   pf->fdir.flex_set[1][0].size = 8;
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(3), 0xC900);
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(4), 0xFC29);/*non-used*/
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(5), 0xFC2A);/*non-used*/
+
+   /* initialize the flexible payload for L4 payload*/
+   pf->fdir.flex_set[2][0].offset = 0;
+   pf->fdir.flex_set[2][0].size = 8;
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(6), 0xC900);
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(7), 0xFC29);/*non-used*/
+   I40E_WRITE_REG(hw, I40E_PRTQF_FLX_PIT(8), 0xFC2A);/*non-used*/
+
+   /* initialize the masks */
+   for (pctype = I40E_FILTER_PCTYPE_NONF_IPV4_UDP;
+pctype <= I40E_FILTER_PCTYPE_FRAG_IPV6; pctype++) {
+   I40E_WRITE_REG(hw, I40E_PRTQF_FD_FLXINSET(pctype), 0);
+   I40E_WRITE_REG(hw, I40E_PRTQF_FD_MSK(pctype, 0), 0);
+   I40E_WRITE_REG(hw, I40E_PRTQF_FD_MSK(pctype, 1), 0);
+   }
+}
+
+/*
  * i40e_fdir_setup - reserve and initialize the Flow Director resources
  * @pf: board private structure
  */
@@ -184,6 +231,8 @@ i40e_fdir_setup(struct i40e_pf *pf)
goto fail_mem;
}

+   i40e_init_flx_pld(pf);
+
/* reserve memory for the fdir programming packet */
snprintf(z_name, sizeof(z_name), "%s_%s_%d",
eth_dev->driver->pci_drv.name,
@@ -238,4 +287,4 @@

[dpdk-dev] [PATCH v4 02/21] i40e: tear down flow director

2014-10-22 Thread Jingjing Wu

release fortville resources on flow director, includes
 - queue 0 pair release
 - release vsi

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |  4 +++-
 lib/librte_pmd_i40e/i40e_ethdev.h |  1 +
 lib/librte_pmd_i40e/i40e_fdir.c   | 19 +++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 838dd1e..0054a1e 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -508,7 +508,8 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,
return 0;

 err_setup_pf_switch:
-   rte_free(pf->main_vsi);
+   i40e_fdir_teardown(pf);
+   i40e_vsi_release(pf->main_vsi);
 err_get_mac_addr:
 err_configure_lan_hmc:
(void)i40e_shutdown_lan_hmc(hw);
@@ -836,6 +837,7 @@ i40e_dev_close(struct rte_eth_dev *dev)
i40e_shutdown_lan_hmc(hw);

/* release all the existing VSIs and VEBs */
+   i40e_fdir_teardown(pf);
i40e_vsi_release(pf->main_vsi);

/* shutdown the adminq */
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.h b/lib/librte_pmd_i40e/i40e_ethdev.h
index 327388b..d998980 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.h
+++ b/lib/librte_pmd_i40e/i40e_ethdev.h
@@ -374,6 +374,7 @@ enum i40e_status_code i40e_fdir_setup_tx_resources(struct i40e_pf *pf,
 enum i40e_status_code i40e_fdir_setup_rx_resources(struct i40e_pf *pf,
unsigned int socket_id);
 int i40e_fdir_setup(struct i40e_pf *pf);
+void i40e_fdir_teardown(struct i40e_pf *pf);

 /* I40E_DEV_PRIVATE_TO */
 #define I40E_DEV_PRIVATE_TO_PF(adapter) \
diff --git a/lib/librte_pmd_i40e/i40e_fdir.c b/lib/librte_pmd_i40e/i40e_fdir.c
index a44bb73..bb474d2 100644
--- a/lib/librte_pmd_i40e/i40e_fdir.c
+++ b/lib/librte_pmd_i40e/i40e_fdir.c
@@ -219,4 +219,23 @@ fail_setup_tx:
i40e_vsi_release(vsi);
pf->fdir.fdir_vsi = NULL;
return err;
+}
+
+/*
+ * i40e_fdir_teardown - release the Flow Director resources
+ * @pf: board private structure
+ */
+void
+i40e_fdir_teardown(struct i40e_pf *pf)
+{
+   struct i40e_vsi *vsi;
+
+   vsi = pf->fdir.fdir_vsi;
+   i40e_dev_rx_queue_release(pf->fdir.rxq);
+   pf->fdir.rxq = NULL;
+   i40e_dev_tx_queue_release(pf->fdir.txq);
+   pf->fdir.txq = NULL;
+   i40e_vsi_release(vsi);
+   pf->fdir.fdir_vsi = NULL;
+   return;
 }
\ No newline at end of file
-- 
1.8.1.4

[dpdk-dev] [PATCH v4 01/21] i40e: set up and initialize flow director

2014-10-22 Thread Jingjing Wu

set up fortville resources to support flow director, includes
 - queue 0 pair allocated and set up for flow director
 - create vsi
 - reserve memzone for flow director programming packet

Signed-off-by: Jingjing Wu 
---
 lib/librte_pmd_i40e/Makefile  |   2 +
 lib/librte_pmd_i40e/i40e_ethdev.c |  79 +++---
 lib/librte_pmd_i40e/i40e_ethdev.h |  30 +-
 lib/librte_pmd_i40e/i40e_fdir.c   | 222 ++
 lib/librte_pmd_i40e/i40e_rxtx.c   | 127 ++
 5 files changed, 447 insertions(+), 13 deletions(-)
 create mode 100644 lib/librte_pmd_i40e/i40e_fdir.c

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index bd3428f..98e4bdf 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -91,6 +91,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev_vf.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_pf.c
+SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_fdir.c
+
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_mempool lib/librte_mbuf
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 3b75f0f..838dd1e 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -765,6 +765,12 @@ i40e_dev_start(struct rte_eth_dev *dev)
i40e_vsi_queues_bind_intr(vsi);
i40e_vsi_enable_queues_intr(vsi);

+   /* enable FDIR MSIX interrupt */
+   if (pf->flags & I40E_FLAG_FDIR) {
+   i40e_vsi_queues_bind_intr(pf->fdir.fdir_vsi);
+   i40e_vsi_enable_queues_intr(pf->fdir.fdir_vsi);
+   }
+
/* Enable all queues which have been configured */
ret = i40e_vsi_switch_queues(vsi, TRUE);
if (ret != I40E_SUCCESS) {
@@ -2602,16 +2608,30 @@ i40e_vsi_setup(struct i40e_pf *pf,
case I40E_VSI_SRIOV :
vsi->nb_qps = pf->vf_nb_qps;
break;
+   case I40E_VSI_FDIR:
+   vsi->nb_qps = pf->fdir_nb_qps;
+   break;
default:
goto fail_mem;
}
-   ret = i40e_res_pool_alloc(&pf->qp_pool, vsi->nb_qps);
-   if (ret < 0) {
-   PMD_DRV_LOG(ERR, "VSI %d allocate queue failed %d",
-   vsi->seid, ret);
-   goto fail_mem;
-   }
-   vsi->base_queue = ret;
+   /*
+* The filter status descriptor is reported in rx queue 0,
+* while the tx queue for fdir filter programming has no
+* such constraints, can be non-zero queues.
+* To simplify it, choose FDIR vsi use queue 0 pair.
+* To make sure it will use queue 0 pair, queue allocation
+* need be done before this function is called
+*/
+   if (type != I40E_VSI_FDIR) {
+   ret = i40e_res_pool_alloc(&pf->qp_pool, vsi->nb_qps);
+   if (ret < 0) {
+   PMD_DRV_LOG(ERR, "VSI %d allocate queue failed %d",
+   vsi->seid, ret);
+   goto fail_mem;
+   }
+   vsi->base_queue = ret;
+   } else
+   vsi->base_queue = I40E_FDIR_QUEUE_ID;

/* VF has MSIX interrupt in VF range, don't allocate here */
if (type != I40E_VSI_SRIOV) {
@@ -2743,9 +2763,25 @@ i40e_vsi_setup(struct i40e_pf *pf,
 * Since VSI is not created yet, only configure parameter,
 * will add vsi below.
 */
-   }
-   else {
-   PMD_DRV_LOG(ERR, "VSI: Not support other type VSI yet");
+   } else if (type == I40E_VSI_FDIR) {
+   vsi->uplink_seid = uplink_vsi->uplink_seid;
+   ctxt.pf_num = hw->pf_id;
+   ctxt.vf_num = 0;
+   ctxt.uplink_seid = vsi->uplink_seid;
+   ctxt.connection_type = 0x1; /* regular data port */
+   ctxt.flags = I40E_AQ_VSI_TYPE_PF;
+   ret = i40e_vsi_config_tc_queue_mapping(vsi, &ctxt.info,
+   I40E_DEFAULT_TCMAP);
+   if (ret != I40E_SUCCESS) {
+   PMD_DRV_LOG(ERR, "Failed to configure "
+   "TC queue mapping\n");
+   goto fail_msix_alloc;
+   }
+   ctxt.info.up_enable_bits = I40E_DEFAULT_TCMAP;
+   ctxt.info.valid_sections |=
+   rte_cpu_to_le_16(I40E_AQ_VSI_PROP_SCHED_VALID);
+   } else {
+   PMD_DRV_LOG(ERR, "VSI: Not support other type VSI yet\n");
goto fail_msix_alloc;
}

@@ -2930,8 +2966,16 @@ i40e_pf_setup(struct i40e_pf *pf)
PMD_DRV_LOG(ERR, "Could not get switch config, err %d", ret);
return

[dpdk-dev] [PATCH v4 00/21] Support flow director programming on Fortville

2014-10-22 Thread Jingjing Wu

The patch set supports flow director on fortville.
It includes:
 - set up/tear down fortville resources to support flow director, such as queue and vsi.
 - support operation to add or delete 8 flow types of the flow director filters, they are ipv4, tcpv4, udpv4, sctpv4, ipv6, tcpv6, udpv6, sctpv6.
 - support flushing flow director table (all filters).
 - support operation to get flow director information.
 - match status statistics, FD_ID report.
 - support operation to configure flexible payload and its mask
 - support flexible payload involved in comparison and flex bytes report.

v2 changes:
 - create real fdir vsi and assign queue 0 pair to it.
 - check filter status report on the rx queue 0

v3 changes:
 - redefine filter APIs to support multi-kind filters
 - support sctpv4 and sctpv6 type flows
 - support flexible payload involved in comparison 

v4 changes:
 - strip the filter APIs definitions from this patch set
 - extend mbuf field to support flex bytes report
 - fix typos

Jingjing Wu (21):
  i40e: set up and initialize flow director
  i40e: tear down flow director
  i40e: initialize flexible payload setting
  ethdev: define structures for adding/deleting flow director
  i40e: implement operations to add/delete flow director
  testpmd: add test commands to add/delete flow director filter
  i40e: match counter for flow director
  mbuf: extend fdir field
  i40e: report flow director match info to mbuf
  testpmd: print extended fdir info in mbuf
  ethdev: define structures for getting flow director information
  i40e: implement operations to get fdir info
  testpmd: display fdir statistics
  i40e: implement operation to flush flow director table
  testpmd: add test command to flush flow director table
  ethdev: define structures for configuring flexible payload
  i40e: implement operations to configure flexible payload
  testpmd: add test command to configure flexible payload
  ethdev:  define structures for configuring flex masks
  i40e: implement operations to configure flexible masks
  testpmd: add test command to configure flexible masks

 app/test-pmd/cmdline.c|  812 
 app/test-pmd/config.c |   38 +-
 app/test-pmd/rxonly.c |   14 +-
 app/test-pmd/testpmd.h|3 +
 lib/librte_ether/rte_eth_ctrl.h   |  266 
 lib/librte_ether/rte_ethdev.h |   23 -
 lib/librte_mbuf/rte_mbuf.h|   12 +-
 lib/librte_pmd_i40e/Makefile  |2 +
 lib/librte_pmd_i40e/i40e_ethdev.c |  127 +++-
 lib/librte_pmd_i40e/i40e_ethdev.h |   34 +-
 lib/librte_pmd_i40e/i40e_fdir.c   | 1222 +
 lib/librte_pmd_i40e/i40e_rxtx.c   |  225 ++-
 12 files changed, 2723 insertions(+), 55 deletions(-)
 create mode 100644 lib/librte_pmd_i40e/i40e_fdir.c

-- 
1.8.1.4

[dpdk-dev] development/integration branch?

2014-10-22 Thread Matthew Hall

I am aware of that. But it's a pain to do it. And then your local branch 
doesn't move forward when new stable releases come out. So I was suggesting we 
have a stable branch always available and known-good pointing to latest 1.X.X 
or 2.X.X release of latest stable 1.X or 2.X. It would also be friendly to 
maintenance programmers who want to submit patches to stable versions and 
encourage them to contribute stability fixes to DPDK just like Greg KH and the 
stable kernel guys do for the Long Term kernels.

Matthew.
-- 
Sent from my mobile device.

On October 22, 2014 6:43:36 AM PDT, Stephen Hemminger  wrote:
>On Wed, 22 Oct 2014 00:00:58 -0700
>Matthew Hall  wrote:
>
>> What I think git in general and DPDK in particular are missing is, they have a
>> tradition of tags for releases, however I think this is broken because you can't
>> easily append more stuff to tags.
>
>In git tags and branches are almost the same thing.
>You can easily create a local branch off of a tag.

[dpdk-dev] [PATCH v6 1/9] librte_mbuf:the rte_mbuf structure changes

2014-10-22 Thread Liu, Jijiang



> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 22, 2014 4:46 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 1/9] librte_mbuf:the rte_mbuf structure
> changes
> 
> 2014-10-21 14:14, Liu, Jijiang:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2014-10-21 16:46, Jijiang Liu:
> > > > -   uint16_t reserved2;   /**< Unused field. Required for padding */
> > > > +
> > > > +   /**
> > > > +* Packet type, which is used to indicate ordinary L2 packet format and
> > > > +* also tunneled packet format such as IP in IP, IP in GRE, MAC in GRE
> > > > +* and MAC in UDP.
> > > > +*/
> > > > +   uint16_t packet_type;
> > >
> > > Why not name it "l2_type"?
> >
> > In datasheet, this term is called packet type(s).
> 
> That's exactly the point I want you really understand!
> This is a field in generic mbuf structure, so your datasheet has no value here.
> 
> > Personally, I think packet type makes it clearer what the meaning of this field is. ^_^
> 
> You cannot add an API field without knowing what will be its generic
> meaning.
> Please think about it and describe its scope.
Its generic meaning is that each UINT value stands for one kind of packet format?
I will add more description for this field.


> Thanks
> --
> Thomas

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Liu, Jijiang



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Marc Sune
> Sent: Wednesday, October 22, 2014 3:11 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-
> initializers'
> 
> Fix for compilation warning 'missing-field-initializers' for some GCC and clang
> versions introduced in commit 0c6bc8e
> 
> Signed-off-by: Marc Sune 
> ---
>  lib/librte_kni/rte_kni.c |9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> index f64a0a8..de29b99 100644
> --- a/lib/librte_kni/rte_kni.c
> +++ b/lib/librte_kni/rte_kni.c
> @@ -131,7 +131,14 @@ static void kni_free_mbufs(struct rte_kni *kni);
> static void kni_allocate_mbufs(struct rte_kni *kni);
> 
>  static volatile int kni_fd = -1;
> -static struct rte_kni_memzone_pool kni_memzone_pool = {0};
> +static struct rte_kni_memzone_pool kni_memzone_pool = {
> + .initialized = 0,
> + .max_ifaces = 0,
> + .slots = 0,
> + .mutex =  RTE_SPINLOCK_INITIALIZER,
> + .free = NULL,
> + .free_tail = NULL
> +};
> 
>  static const struct rte_memzone *
>  kni_memzone_reserve(const char *name, size_t len, int socket_id,
> --
> 1.7.10.4

Acked-by: Jijiang Liu

[dpdk-dev] FW: nic loopback

2014-10-22 Thread Liang, Cunming



From: alex [mailto:a...@weka.io]
Sent: Wednesday, October 22, 2014 3:42 PM
To: Zhu, Heqing
Cc: Liang, Cunming; dev at dpdk.org
Subject: Re: FW: [dpdk-dev] nic loopback



On Wed, Oct 22, 2014 at 7:37 AM, Zhu, Heqing <heqing.zhu at intel.com> wrote:
One line comment inline.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liang, Cunming
> Sent: Tuesday, October 21, 2014 8:33 PM
> To: Alex Markuze
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] nic loopback
>
> It's a pain that the VF can't set the register directly.
> As kernel ixgbe doesn't support setting the value, I'm afraid you have to modify
> kernel ixgbe.
> If your purpose is mainly testing,
> one option is to set the register bits to all 1s during device
> initialization.
> Another option is to use DPDK as the host PF:
> run testpmd in the host and set the register from the interactive command line.
>
> Ideally it would be better to add a VF-to-PF mailbox message,
> so that the host PF enables the local pool loopback on behalf of the VF.
> Then, during runtime, the VF can proactively enable/disable the ability.

[heqing] Such a proposal has been discussed a few times, but the kernel driver 
does not accept this due to the security concern.

I will try a different approach. Is there a tool available from Intel for 82599
NICs that can access the NIC's configuration and modify these registers
manually, w/o modifying hypervisor drivers and/or using the PF?
[Liang, Cunming] I don't know. I think it's not hard for you to make one, but
there is the security concern.

>
>
> From: Alex Markuze [mailto:alex at weka.io]
> Sent: Tuesday, October 21, 2014 11:16 PM
> To: Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] nic loopback
>
> How can I set/query this bit (LLE(PFVMTXSW[n]), intel 82599 ) on ESX, or any
> other friendlier environment like Linux?
>
> On Tue, Oct 21, 2014 at 4:18 AM, Liang, Cunming
> <cunming.liang at intel.com> wrote:
>
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org]
> > On Behalf Of Alex Markuze
> > Sent: Tuesday, October 21, 2014 12:24 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] nic loopback
> >
> > Hi,
> > I'm trying to send packets from an application to it self, meaning
> > smac  == dmac.
> > I'm working with intel 82599 virtual function. But it seems that these
> > packets are lost.
> >
> > Is there a software/hw limitation I'm missing here (some additional
> > anti-spoofing)? AFAIK modern NICs with sriov are mini switches so the
> > hw loopback should work, at least thats the theory.
> >
> [Liang, Cunming] You could check the register LLE(PFVMTXSW[n]),
> which allows an individual pool to send traffic and have it looped back
> to itself.
> >
> > Thanks.

[dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement

2014-10-22 Thread Cao, Waterman

Hi Yong,

We verified your patch with VMware ESXi 5.5 and found that the VMware L2fwd and
L3fwd commands can't run.
But when we use the DPDK1.7_rc1 package to validate the VMware regression, it works fine.
1.[Test Environment]:
 - VMware ESXi 5.5;
 - 2 VM
 - FC20 on Host / FC20-64 on VM
 - Crown Pass server (E2680 v2 ivy bridge )
 - Niantic 82599

2. [Test Topology]:
Create 2 VMs (Fedora 18, 64-bit).
We pass through one physical port (Niantic 82599) to each VM, and also
create one virtual device, vmxnet3, in each VM.
To connect the two VMs, we use one vswitch to connect the two vmxnet3
interfaces.
Then, PF1 and vmxnet3A are in VM1; PF2 and vmxnet3B are in VM2.
The traffic flow for l2fwd/l3fwd is as below:
Ixia (traffic generator) -> PF1 -> vmxnet3A -> vswitch -> vmxnet3B -> PF2 -> Ixia.
3.[ Test Step]:

Untar dpdk1.8.rc1, compile and run:

L2fwd:  ./build/l2fwd -c f -n 4 -- -p 0x3
L3fwd:  ./build/l3fwd-vf -c 0x6 -n 4 -- -p 0x3 -config "(0,0,1),(1,0,2)"

4.[Error log]:

---VMware L2fwd:---

EAL:   :0b:00.0 not managed by UIO driver, skipping
EAL: PCI device :13:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
EAL:   PCI memory mapped at 0x7f678ae6e000
EAL:   PCI memory mapped at 0x7f678af34000
PMD: eth_ixgbe_dev_init(): MAC: 2, PHY: 17, SFP+: 5
PMD: eth_ixgbe_dev_init(): port 0 vendorID=0x8086 deviceID=0x10fb
EAL: PCI device :1b:00.0 on NUMA socket -1
EAL:   probe driver: 15ad:7b0 rte_vmxnet3_pmd
EAL:   PCI memory mapped at 0x7f678af33000
EAL:   PCI memory mapped at 0x7f678af32000
EAL:   PCI memory mapped at 0x7f678af3
Lcore 0: RX port 0
Lcore 1: RX port 1
Initializing port 0... PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f670b0f5580 hw_ring=0x7f6789fe5280 dma_addr=0x373e5280
PMD: ixgbe_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are satisfied. Rx Burst Bulk Alloc function will be used on port=0, queue=0.
PMD: ixgbe_dev_rx_queue_setup(): Vector rx enabled, please make sure RX burst size no less than 32.
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f670b0f3480 hw_ring=0x7f671b820080 dma_addr=0x100020080
PMD: ixgbe_dev_tx_queue_setup(): Using simple tx code path
PMD: ixgbe_dev_tx_queue_setup(): Vector tx enabled.
done: 
Port 0, MAC address: 90:E2:BA:4A:33:78

Initializing port 1... EAL: Error - exiting with code: 1
  Cause: rte_eth_tx_queue_setup:err=-22, port=1

---VMware L3fwd:---

EAL: TSC frequency is ~2793265 KHz
EAL: Master core 1 is ready (tid=9f49a880)
EAL: Core 2 is ready (tid=1d7f2700)
EAL: PCI device :0b:00.0 on NUMA socket -1
EAL:   probe driver: 15ad:7b0 rte_vmxnet3_pmd
EAL:   :0b:00.0 not managed by UIO driver, skipping
EAL: PCI device :13:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
EAL:   PCI memory mapped at 0x7f079f3e4000
EAL:   PCI memory mapped at 0x7f079f4aa000
PMD: eth_ixgbe_dev_init(): MAC: 2, PHY: 17, SFP+: 5
PMD: eth_ixgbe_dev_init(): port 0 vendorID=0x8086 deviceID=0x10fb
EAL: PCI device :1b:00.0 on NUMA socket -1
EAL:   probe driver: 15ad:7b0 rte_vmxnet3_pmd
EAL:   PCI memory mapped at 0x7f079f4a9000
EAL:   PCI memory mapped at 0x7f079f4a8000
EAL:   PCI memory mapped at 0x7f079f4a6000
Initializing port 0 ... Creating queues: nb_rxq=1 nb_txq=1...  
Address:90:E2:BA:4A:33:78, Allocated mbuf pool on socket 0
LPM: Adding route 0x01010100 / 24 (0)
LPM: Adding route 0x02010100 / 24 (1)
LPM: Adding route 0x03010100 / 24 (2)
LPM: Adding route 0x04010100 / 24 (3)
LPM: Adding route 0x05010100 / 24 (4)
LPM: Adding route 0x06010100 / 24 (5)
LPM: Adding route 0x07010100 / 24 (6)
LPM: Adding route 0x08010100 / 24 (7)
txq=0,0,0 PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f071f6f3c80 hw_ring=0x7f079e5e5280 dma_addr=0x373e5280
PMD: ixgbe_dev_tx_queue_setup(): Using simple tx code path
PMD: ixgbe_dev_tx_queue_setup(): Vector tx enabled.

Initializing port 1 ... Creating queues: nb_rxq=1 nb_txq=1...  
Address:00:0C:29:F0:90:41, txq=1,0,0 EAL: Error - exiting with code: 1
  Cause: rte_eth_tx_queue_setup: err=-22, port=1


Can you help to recheck this patch with latest DPDK code?

Regards
Waterman 

-Original Message-
>From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yong Wang
>Sent: Wednesday, October 22, 2014 6:10 AM
>To: Patel, Rashmin N; Stephen Hemminger
>Cc: dev at dpdk.org
>Subject: Re: [dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement
>
>Rashmin/Stephen,
>
>Since you have worked on vmxnet3 pmd drivers, I wonder if you can help review 
>this set of patches.  Any other reviews/test verifications are welcome of 
>course.  We have reviewed/tested all patches internally.
>
>Yong
>
>From: dev  on behalf of Yong Wang vmware.com>
>Sent: Monday, October 13, 2014 2:00 PM
>To: Thomas Monjalon
>Cc: dev at dpdk.org
>Subject: Re: [dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement
>
>Only the last one is performance related and it merely tries to give hints to 
>the compiler to hopefully make branch prediction more efficient.  It also

[dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of VxLAN filter

2014-10-22 Thread Liu, Jijiang



> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, October 21, 2014 11:13 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of
> VxLAN filter
> 
> 2014-10-21 16:46, Jijiang Liu:
> > +#define RTE_TUNNEL_FILTER_TO_QUEUE 1 /**< point to an queue by filter type */
> 
> Sorry, I don't understand what is this value for?

This macro indicates whether the user application wants incoming packets
filtered to a specific queue (multi-queue configuration is required) using the
filter type (such as inner MAC + tenant ID).
If the flag is not set, all incoming packets always go to queue 0.


> --
> Thomas

[dpdk-dev] [PATCH v4] KNI: use a memzone pool for KNI alloc/release

2014-10-22 Thread Liu, Jijiang

There is a compilation error using gcc version 4.6.2.

./lib/librte_kni/rte_kni.c:134:15: error: missing initializer [-Werror=missing-field-initializers]
./lib/librte_kni/rte_kni.c:134:15: error: (near initialization for 'kni_memzone_pool.max_ifaces') [-Werror=missing-field-initializers]


cc1: all warnings being treated as errors
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Marc Sune
> Sent: Tuesday, October 21, 2014 6:52 PM
> To: Thomas Monjalon
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4] KNI: use a memzone pool for KNI
> alloc/release
> 
> Thomas,
> 
> v5: commit message arranged, all warnings from checkpatch.pl fixed except:
> 
> WARNING: Macros with flow control statements should be avoided
> #104: FILE: lib/librte_kni/rte_kni.c:62:
> +#define KNI_MEM_CHECK(cond) do { if (cond) goto kni_fail; } while (0)
> 
> a) This MACRO was there before, I just re-factored it to make it more
> readable.
> b) There are 4 lines exceeding 80cols due to long quoted strings. I followed
> kernel convention not to split them in multiple lines.
> 
> Thanks and regards
> Marc
> 
> On 21/10/14 10:29, Thomas Monjalon wrote:
> > Hi Marc,
> >
> > 2014-10-18 00:51, Marc Sune:
> >> This patch implements the KNI memzone pool in order to prevent
> >> memzone exhaustion when allocating/deallocating KNI interfaces.
> >>
> >> It adds a new API call, rte_kni_init(max_kni_ifaces) that shall be
> >> called before any call to rte_kni_alloc() if KNI is used.
> >>
> >> v2: Moved KNI fd opening to rte_kni_init(). Revised style.
> >> v3: Adapted kni examples/tests to rte_kni_init().
> >> v4: Improved example integration. Fixed kni_memzone_pool_alloc/release() bug.
> >>
> >> Signed-off-by: Marc Sune 
> > Thanks for the good work with Helin.
> > Before applying this patch, I'd like another version explaining in the
> > commit log why this change is needed.
> > And please use checkpatch.pl to check and remove whitespace errors.
> >
> > Thanks

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-22 Thread Liu, Jijiang



> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 22, 2014 5:19 AM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet
> identification API in librte_ether
> 
> 2014-10-21 13:48, Liu, Jijiang:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2014-10-21 16:46, Jijiang Liu:
> > > >  int
> > > > +rte_eth_dev_udp_tunnel_add(uint8_t port_id,
> > > > +  struct rte_eth_udp_tunnel *udp_tunnel,
> > > > +  uint8_t count)
> > > > +{
> > > > +   uint8_t i;
> > > > +   struct rte_eth_dev *dev;
> > > > +   struct rte_eth_udp_tunnel *tunnel;
> > > > +
> > > > +   if (port_id >= nb_ports) {
> > > > +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > > +   return -ENODEV;
> > > > +   }
> > > > +
> > > > +   if (udp_tunnel == NULL) {
> > > > +   PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
> > > > +   return -EINVAL;
> > > > +   }
> > > > +   tunnel = udp_tunnel;
> > > > +
> > > > +   for (i = 0; i < count; i++, tunnel++) {
> > > > +   if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
> > > > +   PMD_DEBUG_TRACE("Invalid tunnel type\n");
> > > > +   return -EINVAL;
> > > > +   }
> > > > +   }
> > >
> > > I'm not sure it's a good idea to provide a count parameter to
> > > iterate in a loop.
> > > It's probably something that the application should do by itself.
> >
> > It is necessary to check if this prot_type(tunnel type) is valid here
> > in case applications don't do that.
> 
> Yes, you have to check prot_type but looping for several tunnels is not
> needed at this level.

OK, I will remove the loop check from here.
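
With the loop gone, the check reduces to validating a single tunnel entry, and an application with several tunnels calls it once per entry. A minimal sketch, assuming simplified stand-in types: the sketch_* names and the enum values below are hypothetical and are not the definitions from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; not the definitions from the patch. */
enum sketch_tunnel_type {
	SKETCH_TUNNEL_TYPE_NONE = 0,
	SKETCH_TUNNEL_TYPE_VXLAN,
	SKETCH_TUNNEL_TYPE_MAX
};

struct sketch_udp_tunnel {
	uint16_t udp_port;
	uint8_t prot_type;
};

/* Validate one tunnel entry; the caller loops if it has several. */
static int
sketch_udp_tunnel_check(uint8_t port_id, uint8_t nb_ports,
			const struct sketch_udp_tunnel *tunnel)
{
	if (port_id >= nb_ports)
		return -ENODEV;
	if (tunnel == NULL)
		return -EINVAL;
	if (tunnel->prot_type >= SKETCH_TUNNEL_TYPE_MAX)
		return -EINVAL;
	return 0;
}
```

The prot_type check stays in the library, so an application that skips its own validation still gets -EINVAL for an unknown tunnel type.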

> Thanks
> --
> Thomas

[dpdk-dev] nic loopback

2014-10-22 Thread Liang, Cunming

It's a pain that the VF can't set the register directly.
As the kernel ixgbe driver doesn't support setting the value, I'm afraid you 
have to modify the kernel ixgbe driver.
If your purpose is mainly testing, one option is to set all bits of the 
register to 1 during device initialization.
Another option is to use DPDK as the host PF: run testpmd on the host and set 
the register from the interactive command line.

Ideally, it would be better to add a kind of VF-to-PF mailbox message, so that 
the host PF enables local pool loopback on behalf of the VF.
Then the VF could proactively enable/disable the ability at runtime.
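
To illustrate the two options (all bits set at init, or one pool toggled on request), here is a minimal sketch of the bit arithmetic only. It assumes one enable bit per pool, 64 pools, packed 32 per register; that layout and the sketch_* names are assumptions, and the array stands in for the real registers, so please check the 82599 datasheet before relying on it.

```c
#include <stdint.h>

/* Assumed layout: 64 pools, one enable bit each, 32 bits per register. */
#define SKETCH_NUM_POOLS 64u
#define SKETCH_NUM_REGS  (SKETCH_NUM_POOLS / 32u)

/* Enable or disable local loopback for one pool in the register image. */
static int
sketch_lle_set(uint32_t regs[SKETCH_NUM_REGS], unsigned int pool, int enable)
{
	uint32_t bit;

	if (pool >= SKETCH_NUM_POOLS)
		return -1;
	bit = UINT32_C(1) << (pool % 32u);
	if (enable)
		regs[pool / 32u] |= bit;
	else
		regs[pool / 32u] &= ~bit;
	return 0;
}

/* The "set everything to 1 at init" option. */
static void
sketch_lle_enable_all(uint32_t regs[SKETCH_NUM_REGS])
{
	unsigned int i;

	for (i = 0; i < SKETCH_NUM_REGS; i++)
		regs[i] = UINT32_MAX;
}
```

A mailbox handler in the PF would call something like sketch_lle_set() for the requesting VF's pool and then write the result back to hardware.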

From: Alex Markuze [mailto:a...@weka.io]
Sent: Tuesday, October 21, 2014 11:16 PM
To: Liang, Cunming
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] nic loopback

How can I set/query this bit (LLE(PFVMTXSW[n]), intel 82599 ) on ESX, or any 
other friendlier environment like Linux?

On Tue, Oct 21, 2014 at 4:18 AM, Liang, Cunming <cunming.liang at intel.com> wrote:

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On 
> Behalf Of Alex Markuze
> Sent: Tuesday, October 21, 2014 12:24 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] nic loopback
>
> Hi,
> I'm trying to send packets from an application to itself, meaning smac ==
> dmac.
> I'm working with intel 82599 virtual function. But it seems that these
> packets are lost.
>
> Is there a software/hw limitation I'm missing here (some additional
> anti-spoofing)? AFAIK modern NICs with sriov are mini switches so the hw
> loopback should work, at least that's the theory.
>
[Liang, Cunming] You could check the register LLE(PFVMTXSW[n]), which allows 
an individual pool to send traffic and have it looped back to itself.
>
> Thanks.

[dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of VxLAN filter

2014-10-22 Thread Liu, Jijiang



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, October 21, 2014 11:13 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 5/9] librte_ether:add data structures of
> VxLAN filter
> 
> 2014-10-21 16:46, Jijiang Liu:
> > +#define RTE_TUNNEL_FILTER_TO_QUEUE 1 /**< point to an queue by filter
> type */
> 
> Sorry, I don't understand what is this value for?
> 
> > +#define RTE_TUNNEL_FILTER_IMAC_IVLAN (ETH_TUNNEL_FILTER_IMAC | \
> > +   ETH_TUNNEL_FILTER_IVLAN)
> > +#define RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID
> (ETH_TUNNEL_FILTER_IMAC | \
> > +   ETH_TUNNEL_FILTER_IVLAN | \
> > +   ETH_TUNNEL_FILTER_TENID)
> > +#define RTE_TUNNEL_FILTER_IMAC_TENID (ETH_TUNNEL_FILTER_IMAC | \
> > +   ETH_TUNNEL_FILTER_TENID)
> > +#define RTE_TUNNEL_FILTER_OMAC_TENID_IMAC
> (ETH_TUNNEL_FILTER_OMAC | \
> > +   ETH_TUNNEL_FILTER_TENID | \
> > +   ETH_TUNNEL_FILTER_IMAC)
> 
> I thought you agree that these definitions are useless?
> 

Sorry, there may be some misunderstanding. I don't think these definitions are 
useless. I just thought that changing to "uint16_t filter_type" is better than 
defining "enum filter_type".

Let me explain again.
The filter conditions are:
1. inner MAC + inner VLAN
2. inner MAC + inner VLAN + tenant ID
..
5. outer MAC + tenant ID + inner MAC

For each filter condition, we need to check whether the mandatory parameters 
are valid by checking the corresponding bit mask.

A pseudo-code example:

   switch (filter_type) {
   case 1: /* inner MAC + inner VLAN */
       if (filter_type & ETH_TUNNEL_FILTER_IMAC)
           if (IMAC == NULL)
               return -1;
       break;
   ..
   case 5: /* outer MAC + tenant ID + inner MAC */
       if (filter_type & ETH_TUNNEL_FILTER_IMAC)
           if (IMAC == NULL)
               return -1;
       if (filter_type & ETH_TUNNEL_FILTER_OMAC)
           if (OMAC == NULL)
               return -1;
       break;
   ..
   }
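
The mandatory-parameter check described above can also be written without a switch, since each bit in filter_type names one required field. A minimal sketch: the flag names come from the patch, but the numeric values, the sketch_tunnel_filter structure and the function name are assumptions made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag names are from the patch; the values here are assumed. */
#define ETH_TUNNEL_FILTER_OMAC  0x01
#define ETH_TUNNEL_FILTER_TENID 0x02
#define ETH_TUNNEL_FILTER_IMAC  0x04
#define ETH_TUNNEL_FILTER_IVLAN 0x08

/* Hypothetical parameter container; NULL means "not supplied". */
struct sketch_tunnel_filter {
	const uint8_t *outer_mac;
	const uint8_t *inner_mac;
	const uint16_t *inner_vlan;
	const uint32_t *tenant_id;
};

/* Return 0 if every field required by filter_type is present, else -1. */
static int
sketch_tunnel_filter_check(uint16_t filter_type,
			   const struct sketch_tunnel_filter *f)
{
	if (f == NULL || filter_type == 0)
		return -1;
	if ((filter_type & ETH_TUNNEL_FILTER_OMAC) && f->outer_mac == NULL)
		return -1;
	if ((filter_type & ETH_TUNNEL_FILTER_IMAC) && f->inner_mac == NULL)
		return -1;
	if ((filter_type & ETH_TUNNEL_FILTER_IVLAN) && f->inner_vlan == NULL)
		return -1;
	if ((filter_type & ETH_TUNNEL_FILTER_TENID) && f->tenant_id == NULL)
		return -1;
	return 0;
}
```

A combined type such as inner MAC + inner VLAN is then just the OR of two flags, and the same function covers all five conditions.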

> Thomas

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-22 Thread Liu, Jijiang


> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, October 22, 2014 5:19 AM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet
> identification API in librte_ether
> 
> 2014-10-21 13:48, Liu, Jijiang:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2014-10-21 16:46, Jijiang Liu:
> > > >  int
> > > > +rte_eth_dev_udp_tunnel_add(uint8_t port_id,
> > > > +  struct rte_eth_udp_tunnel *udp_tunnel,
> > > > +  uint8_t count)
> > > > +{
> > > > +   uint8_t i;
> > > > +   struct rte_eth_dev *dev;
> > > > +   struct rte_eth_udp_tunnel *tunnel;
> > > > +
> > > > +   if (port_id >= nb_ports) {
> > > > +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > > +   return -ENODEV;
> > > > +   }
> > > > +
> > > > +   if (udp_tunnel == NULL) {
> > > > +   PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
> > > > +   return -EINVAL;
> > > > +   }
> > > > +   tunnel = udp_tunnel;
> > > > +
> > > > +   for (i = 0; i < count; i++, tunnel++) {
> > > > +   if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
> > > > +   PMD_DEBUG_TRACE("Invalid tunnel type\n");
> > > > +   return -EINVAL;
> > > > +   }
> > > > +   }
> > >
> > > I'm not sure it's a good idea to provide a count parameter to
> > > iterate in a loop.
> > > It's probably something that the application should do by itself.
> >
> > It is necessary to check if this prot_type(tunnel type) is valid here
> > in case applications don't do that.
> 
> Yes, you have to check prot_type but looping for several tunnels is not
> needed at this level.
> 
> > > But I doubt we should configure a tunnel type for a whole port.
> >
> > Yes, your understanding is correct. It is for a whole port/PF, that's
> > why we should add tunnel_type in rte_eth_conf structure.
> 
> Please explain to me why a tunnel type should be associated with a port.
> This design looks really broken.

I don't think this design is broken.

Currently, a PF is associated with a port, right? Which tunnel types a PF 
should support is something we are required to configure.
A tunneling packet is an encapsulated packet. In terms of VxLAN, the packet 
format is outer L2 header + outer L3 header + outer L4 header + tunneling 
header + inner L2 header + inner L3 header + inner L4 header + PAY4.
For a VM/VF, the really useful packet data is "inner L2 header + inner L3 
header + inner L4 header + PAY4".

In the NIC, a port/PF receives this kind of tunneling packet (outer L2 + ... + 
PAY4). Shouldn't software be responsible for decapsulating the packet and 
delivering the real data (inner L2 + ... + PAY4) to the VM/VF?

DPDK just provides the API/mechanism to guarantee that a PF/port can receive 
the tunneling packet data; the encapsulation/decapsulation work should be done 
by the user application.

Normally, tunneling packet processing looks like this:
Tunneling packet --> PF processing/receive --> application SW does 
decapsulation --> VF/VM processing
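
As an illustration of the decapsulation step in the application, the sketch below locates the inner L2 header of a VxLAN packet. It assumes an untagged outer Ethernet header and an outer IPv4 header, and it does not check the UDP destination port, since the port is meant to be configurable; the sketch_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_HDR_LEN   14u /* outer L2, no VLAN tag assumed */
#define SKETCH_UDP_HDR_LEN    8u /* outer L4 */
#define SKETCH_VXLAN_HDR_LEN  8u /* tunneling header */

/*
 * Return the byte offset of the inner L2 header, or 0 if the packet is
 * not outer Ethernet + IPv4 + UDP or is too short.
 */
static size_t
sketch_vxlan_inner_offset(const uint8_t *pkt, size_t len)
{
	size_t ihl, off;

	if (pkt == NULL || len < SKETCH_ETH_HDR_LEN + 20u)
		return 0;
	if (pkt[12] != 0x08 || pkt[13] != 0x00) /* EtherType: IPv4 */
		return 0;
	if ((pkt[SKETCH_ETH_HDR_LEN] >> 4) != 4) /* IP version */
		return 0;
	ihl = (size_t)(pkt[SKETCH_ETH_HDR_LEN] & 0x0f) * 4u;
	if (ihl < 20u)
		return 0;
	if (pkt[SKETCH_ETH_HDR_LEN + 9u] != 17) /* IP protocol: UDP */
		return 0;
	off = SKETCH_ETH_HDR_LEN + ihl + SKETCH_UDP_HDR_LEN +
	      SKETCH_VXLAN_HDR_LEN;
	if (len < off)
		return 0;
	return off;
}
```

For an outer IPv4 header without options the offset is 14 + 20 + 8 + 8 = 50 bytes; everything from there on is what the VM/VF should see.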

> Thanks
> --
> Thomas

[dpdk-dev] [PATCH] KNI: fix compilation warning 'missing-field-initializers'

2014-10-22 Thread Thomas Monjalon

2014-10-22 09:10, Marc Sune:
> Fix for compilation warning 'missing-field-initializers' for some
> GCC and clang versions introduced in commit 0c6bc8e
> 
> Signed-off-by: Marc Sune 

It's not necessary to initialize all fields.
This should be sufficient:
+static struct rte_kni_memzone_pool kni_memzone_pool = {.initialized = 0};
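
To show why this is enough: C guarantees that members not named in an initializer are zero-initialized, and as far as I know GCC documents that -Wmissing-field-initializers does not warn about designated initializers. A minimal sketch with a hypothetical stand-in structure, since the real struct rte_kni_memzone_pool has different fields:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; not the real struct rte_kni_memzone_pool. */
struct sketch_memzone_pool {
	uint8_t initialized;
	uint32_t max_ifaces;
	void *free_list;
};

/* Only one member is named; the others are implicitly zero or NULL. */
static struct sketch_memzone_pool sketch_pool = { .initialized = 0 };
```

Whether every clang version stays silent here is something I have not checked.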

-- 
Thomas

[dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud Images

2014-10-22 Thread Matthew Hall

On Tue, Oct 21, 2014 at 01:22:27PM +, Gonzalez Monroy, Sergio wrote:
> As you point out below, when building static DPDK we should not expect ldd 
> to report any DPDK dependency. When building shared DPDK libs, we should 
> expect such dependency except for the fact that we are not linking against 
> DPDK libraries when building librte_pmd_virtio.so, which as you mention is 
> buggy.

Yes. I agree. Can we see about fixing this bug?

> - we do not want to build against static DPDK libraries as this would result 
> in duplicated code in librte_pmd_virtio and other apps (ie. testpmd)
>
> - we want to link against shared DPDK libs to add dependencies and provide 
> reliable information (ie. ldd)

OK... but now it's impossible to use librte_pmd_virtio without the mandatory 
shared-library performance loss. I strongly dislike being forced to do this.

> > Now, about the problems...
>
> I have not been able to reproduce these problems. My setup was QEMU, 
> dpdk-1.7.1, virtio-net-pmd, fedora 20 (both host and guest), GCC/CLANG. I 
> have successfully loaded the module running testpmd with static/shared DPDK 
> libs using GCC and CLANG.

OK... let me try to clarify this point again. The official DPDK supported 
devices document, http://www.dpdk.org/doc/nics , says:

Paravirtualization
virtio-net or virtio-net + uio (QEMU, VirtualBox)

As I've stated, when testing this on VirtualBox it does not work for me and 
gets into an infinite initialization loop, which I documented in my last mail. 
But the same code works fine with the VBox Intel 82545EM VNIC and the 
appropriate driver. The VBox virtio-net device also works completely fine 
with the kernel virtio-net driver. This makes the virtio PMDs the most 
likely suspects, especially since the UIO-based one can't init itself and the 
non-UIO one gets stuck in the loop.

So my question is very simple.

1) Who tested this setup using *VirtualBox, NOT QEMU*? QEMU doesn't 
help at all for my app because I'm trying to prepackage it as a Vagrant VM, and 
Vagrant uses VirtualBox. It also doesn't help reproduce my bug because the 
virtio-net device is not 100% the same between QEMU and VBox, so you can't 
compare them one-to-one.

2) Who made the instructions to configure this with VirtualBox? I could not 
find any such thing.

3) Who ever got this to work right in the first place?

It's been multiple weeks of emailing and I still have no answer about who 
placed this inaccurate text on the website. Nobody answered the last guy who 
asked about it in 2013 either. So now it's IMPOSSIBLE for me to know whether it 
worked and I configured it wrong, or it never worked in the first place.

> > EAL: open shared lib /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so
> > EAL: /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so: undefined
> > symbol: per_lcore__lcore_id
> > 
> Are we talking about a DPDK or custom app?
> Do you only see the issue when CONFIG_RTE_BUILD_COMBINE_LIBS=y?

Issue happens in my DPDK based app.

It can happen any time you use a statically linked DPDK app with 
librte_pmd_virtio, because the link process of librte_pmd_virtio is broken.

> > Running nm and nm -D shows this:
> > 
> > $ nm librte_pmd_virtio.so | fgrep -i per_lcore__lcore_id
> >                  U per_lcore__lcore_id
> > 
> This is expected behavior as the symbol is defined in librte_eal.
> The dynamic linker will resolve the undefined reference when loading the 
> module in run-time.

I am aware it's "expected behavior". But having the undefined symbol with no 
dependency upon the DPDK .so and no link against the DPDK .a is NOT "expected 
behavior". It will break any time you try to make a static app with this PMD 
available.

Matthew.

[dpdk-dev] development/integration branch?

2014-10-22 Thread Matthew Hall

On Tue, Oct 21, 2014 at 11:28:47AM +0200, Thomas Monjalon wrote:
> But I care about the message brought by such change. It would mean that
> we can break the development branch and that most of developers don't test
> it nor base their patches on the latest commit. It's all about simple rules
> and messages.

I have seen two common ways to do this which I think are about equal.

1) master is latest release in production, develop branch is tip

2) master is tip, production releases live in branches / tags

A lot of non-free stuff uses (1) along with some open source.

So the DPDK is using model (2), which is pretty common for open source.

What I think git in general and DPDK in particular are missing is this: they 
have a tradition of using tags for releases, but I think this is broken because 
you can't easily append more stuff to tags.

I really prefer putting my releases on actual branches to make it as easy as 
possible for users / maintenance programmers to follow and/or add stuff to a 
codeline. For example I'd like a 1.7.X branch I could follow for my app until 
1.8.X is ready.

Having a stable branch would also make stuff easier for guys like Marc who 
want to follow the known-stable release in an easy way without horsing around 
with "the latest tag of the day" all the time.

Perhaps this is an OK option?

Matthew.

[dpdk-dev] [PATCH v6 2/9] librte_ether:add VxLAN packet identification API in librte_ether

2014-10-22 Thread Thomas Monjalon

2014-10-21 13:48, Liu, Jijiang:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2014-10-21 16:46, Jijiang Liu:
> > >  int
> > > +rte_eth_dev_udp_tunnel_add(uint8_t port_id,
> > > +struct rte_eth_udp_tunnel *udp_tunnel,
> > > +uint8_t count)
> > > +{
> > > + uint8_t i;
> > > + struct rte_eth_dev *dev;
> > > + struct rte_eth_udp_tunnel *tunnel;
> > > +
> > > + if (port_id >= nb_ports) {
> > > + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > + return -ENODEV;
> > > + }
> > > +
> > > + if (udp_tunnel == NULL) {
> > > + PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
> > > + return -EINVAL;
> > > + }
> > > + tunnel = udp_tunnel;
> > > +
> > > + for (i = 0; i < count; i++, tunnel++) {
> > > + if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
> > > + PMD_DEBUG_TRACE("Invalid tunnel type\n");
> > > + return -EINVAL;
> > > + }
> > > + }
> > 
> > I'm not sure it's a good idea to provide a count parameter to iterate in a
> > loop.
> > It's probably something that the application should do by itself.
> 
> It is necessary to check if this prot_type(tunnel type) is valid here in case
> applications don't do that. 

Yes, you have to check prot_type but looping for several tunnels is not needed
at this level.

> > But I doubt we should configure a tunnel type for a whole port.
> 
> Yes, your understanding is correct. It is for a whole port/PF, that's why we
> should add tunnel_type in rte_eth_conf structure.

Please explain to me why a tunnel type should be associated with a port.
This design looks really broken.

Thanks
-- 
Thomas

[dpdk-dev] [PATCH] librte_ip_frag: Disable ipv4/v6 fragmentation if RTE_MBUF_REFCNT=n

2014-10-22 Thread Thomas Monjalon

2014-10-21 15:15, Pablo de Lara:
> Ipv4/v6 fragmentation libraries depends on refcnt.
> There was a compilation error if RTE_MBUF_REFCNT was disabled,
> so those libraries have been disabled in that situation.

Please Pablo, could you add a short justification that it's not
possible to implement fragmentation without refcnt (at least with
the current design)?

What do you think of adding a warning as below?

> +ifeq ($(CONFIG_RTE_MBUF_REFCNT),y)
>  SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_fragmentation.c
> -SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c
>  SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv6_fragmentation.c
+else
+$(info WARNING: Fragmentation feature is disabled because it needs MBUF_REFCNT.)
> +endif
> +SRCS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += rte_ipv4_reassembly.c

-- 
Thomas
