[dpdk-dev] Wrong TCP Checksum computed by hardware

2015-10-27 Thread Matthew Hall
On Wed, Oct 28, 2015 at 12:20:22PM +0530, Padam Jeet Singh wrote:
> Any hint what could I be doing wrong here?

When this kind of thing doesn't work, it often depends on the exact version
of the card, chip, etc., and on whether there are any errata for it.

So you might want to collect the specifics of the board with lspci -v and
ethtool, and pull the card out to check the chip and board revisions.

In addition, check over the example apps and see how things work there compared
with your own code. The DPDK interfaces are often fairly complex, and small
pointer or mbuf manipulation mistakes can cause very odd results.

Matthew.


[dpdk-dev] [PATCH v2 7/7] l3fwd-power: disable interrupt when wake up from sleep

2015-10-27 Thread Yong Liu
The e1000 device interrupt cannot be auto-cleared, so disable the interrupt
when the thread wakes up.

Signed-off-by: Marvin Liu 

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 8bb88ce..9175989 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -798,6 +798,7 @@ sleep_until_rx_interrupt(int num)
port_id = ((uintptr_t)data) >> CHAR_BIT;
queue_id = ((uintptr_t)data) &
RTE_LEN2MASK(CHAR_BIT, uint8_t);
+   rte_eth_dev_rx_intr_disable(port_id, queue_id);
RTE_LOG(INFO, L3FWD_POWER,
"lcore %u is waked up from rx interrupt on"
" port %d queue %d\n",
-- 
1.9.3
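
The hunk above recovers port_id and queue_id from a single pointer-sized value
by shifting and masking with CHAR_BIT. A minimal sketch of that encoding
follows, assuming the queue sits in the low CHAR_BIT bits and the port directly
above it. The helper names pack_port_queue() and unpack_port_queue() are
hypothetical and do not come from l3fwd-power; only the shift and mask mirror
the patch.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: pack a port and a queue number into one
 * pointer-sized value, queue in the low CHAR_BIT bits, port above it. */
static inline uintptr_t
pack_port_queue(uint8_t port_id, uint8_t queue_id)
{
	return ((uintptr_t)port_id << CHAR_BIT) | queue_id;
}

/* Hypothetical helper: reverse of pack_port_queue(). The shift and the
 * mask correspond to the two assignments shown in the hunk. */
static inline void
unpack_port_queue(uintptr_t data, uint8_t *port_id, uint8_t *queue_id)
{
	*port_id = (uint8_t)(data >> CHAR_BIT);
	*queue_id = (uint8_t)(data & (((uintptr_t)1 << CHAR_BIT) - 1));
}
```

Packing port 3 and queue 5 and unpacking the result should give back 3 and 5,
which is the property the wake-up path relies on before it disables the
interrupt for that port and queue.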



[dpdk-dev] [PATCH v2 6/7] e1000: lsc interrupt setup function only enable itself

2015-10-27 Thread Yong Liu
Only enable the lsc interrupt bit when setting up the device interrupt.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index b1e0c3c..d2d017c 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -1343,11 +1343,14 @@ eth_em_vlan_offload_set(struct rte_eth_dev *dev, int mask)
 static int
 eth_em_interrupt_setup(struct rte_eth_dev *dev)
 {
+   uint32_t regval;
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);

-   E1000_WRITE_REG(hw, E1000_IMS, E1000_ICR_LSC);
-   rte_intr_enable(&(dev->pci_dev->intr_handle));
+   /* clear interrupt */
+   E1000_READ_REG(hw, E1000_ICR);
+   regval = E1000_READ_REG(hw, E1000_IMS);
+   E1000_WRITE_REG(hw, E1000_IMS, regval | E1000_ICR_LSC);
return (0);
 }

-- 
1.9.3



[dpdk-dev] [PATCH v2 5/7] e1000: check lsc and rxq not enable in the same time

2015-10-27 Thread Yong Liu
e1000 only supports one type of interrupt cause at a time, so remove the lsc
interrupt handler if rxq is enabled.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index fc3cc1e..b1e0c3c 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -630,14 +630,22 @@ eth_em_start(struct rte_eth_dev *dev)
}
e1000_setup_link(hw);

-   /* check if lsc interrupt feature is enabled */
-   if (dev->data->dev_conf.intr_conf.lsc != 0) {
-   ret = eth_em_interrupt_setup(dev);
-   if (ret) {
-   PMD_INIT_LOG(ERR, "Unable to setup interrupts");
-   em_dev_clear_queues(dev);
-   return ret;
-   }
+   if (rte_intr_allow_others(intr_handle)) {
+   /* check if lsc interrupt is enabled */
+   if (dev->data->dev_conf.intr_conf.lsc != 0)
+   ret = eth_em_interrupt_setup(dev);
+   if (ret) {
+   PMD_INIT_LOG(ERR, "Unable to setup interrupts");
+   em_dev_clear_queues(dev);
+   return ret;
+   }
+   } else {
+   rte_intr_callback_unregister(intr_handle,
+   eth_em_interrupt_handler,
+   (void *)dev);
+   if (dev->data->dev_conf.intr_conf.lsc != 0)
+   PMD_INIT_LOG(INFO, "lsc won't enable because of"
+" no intr multiplex\n");
}
/* check if rxq interrupt is enabled */
if (dev->data->dev_conf.intr_conf.rxq != 0)
@@ -688,6 +696,12 @@ eth_em_stop(struct rte_eth_dev *dev)
memset(&link, 0, sizeof(link));
rte_em_dev_atomic_write_link_status(dev, &link);

+   if (!rte_intr_allow_others(intr_handle))
+   /* resume to the default handler */
+   rte_intr_callback_register(intr_handle,
+  eth_em_interrupt_handler,
+  (void *)dev);
+
/* Clean datapath event and queue/vec mapping */
rte_intr_efd_disable(intr_handle);
if (intr_handle->intr_vec != NULL) {
-- 
1.9.3



[dpdk-dev] [PATCH v2 4/7] e1000: add rxq interrupt handler

2015-10-27 Thread Yong Liu
When the datapath rxq interrupt is enabled, enable the related device rxq.
Remove the interrupt handler after the device is stopped.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 6dc2534..fc3cc1e 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -501,7 +501,9 @@ eth_em_start(struct rte_eth_dev *dev)
E1000_DEV_PRIVATE(dev->data->dev_private);
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
int ret, mask;
+   uint32_t intr_vector = 0;

PMD_INIT_FUNC_TRACE();

@@ -537,6 +539,26 @@ eth_em_start(struct rte_eth_dev *dev)
/* Configure for OS presence */
em_init_manageability(hw);

+   if (dev->data->dev_conf.intr_conf.rxq != 0) {
+   intr_vector = dev->data->nb_rx_queues;
+   if (rte_intr_efd_enable(intr_handle, intr_vector))
+   return -1;
+   }
+
+   if (rte_intr_dp_is_en(intr_handle)) {
+   intr_handle->intr_vec =
+   rte_zmalloc("intr_vec",
+   dev->data->nb_rx_queues * sizeof(int), 0);
+   if (intr_handle->intr_vec == NULL) {
+   PMD_INIT_LOG(ERR, "Failed to allocate %d rx_queues"
+   " intr_vec\n", dev->data->nb_rx_queues);
+   return -ENOMEM;
+   }
+
+   /* enable rx interrupt */
+   em_rxq_intr_enable(hw);
+   }
+
eth_em_tx_init(dev);

ret = eth_em_rx_init(dev);
@@ -621,6 +643,8 @@ eth_em_start(struct rte_eth_dev *dev)
if (dev->data->dev_conf.intr_conf.rxq != 0)
eth_em_rxq_interrupt_setup(dev);

+   rte_intr_enable(intr_handle);
+
adapter->stopped = 0;

PMD_INIT_LOG(DEBUG, "<<");
@@ -646,6 +670,7 @@ eth_em_stop(struct rte_eth_dev *dev)
 {
struct rte_eth_link link;
struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;

em_rxq_intr_disable(hw);
em_lsc_intr_disable(hw);
@@ -662,6 +687,13 @@ eth_em_stop(struct rte_eth_dev *dev)
/* clear the recorded link status */
memset(&link, 0, sizeof(link));
rte_em_dev_atomic_write_link_status(dev, &link);
+
+   /* Clean datapath event and queue/vec mapping */
+   rte_intr_efd_disable(intr_handle);
+   if (intr_handle->intr_vec != NULL) {
+   rte_free(intr_handle->intr_vec);
+   intr_handle->intr_vec = NULL;
+   }
 }

 static void
-- 
1.9.3
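
The start and stop hunks above pair an allocation of intr_vec (one int per Rx
queue) with a free that also resets the pointer to NULL. A minimal sketch of
that pairing in plain C follows. It uses calloc() and free() in place of
rte_zmalloc() and rte_free(), and struct sk_intr_handle is a hypothetical
stand-in for the one field of struct rte_intr_handle that matters here.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the part of the interrupt handle used here. */
struct sk_intr_handle {
	int *intr_vec;   /* one slot per Rx queue, or NULL when unused */
};

/* Allocate one zeroed slot per Rx queue.
 * Returns 0 on success, -EINVAL for zero queues, -ENOMEM on failure. */
static int
sk_intr_vec_alloc(struct sk_intr_handle *h, unsigned nb_rx_queues)
{
	if (nb_rx_queues == 0)
		return -EINVAL;
	h->intr_vec = calloc(nb_rx_queues, sizeof(int));
	if (h->intr_vec == NULL)
		return -ENOMEM;
	return 0;
}

/* Release the mapping and clear the pointer, so a second call is harmless. */
static void
sk_intr_vec_free(struct sk_intr_handle *h)
{
	free(h->intr_vec);   /* free(NULL) is a no-op */
	h->intr_vec = NULL;
}
```

Resetting the pointer after the free is what lets a stop path run twice, or run
after a failed start, without a double free.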



[dpdk-dev] [PATCH v2 3/7] e1000: add ethdev rxq enable and disable function

2015-10-27 Thread Yong Liu
Implement rxq interrupt related functions in eth_dev_ops structure.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 39f330a..6dc2534 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -108,9 +108,13 @@ static void em_vlan_hw_strip_disable(struct rte_eth_dev *dev);
 static void eth_em_vlan_filter_set(struct rte_eth_dev *dev,
uint16_t vlan_id, int on);
 */
+
+static int eth_em_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id);
+static int eth_em_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id);
 static void em_lsc_intr_disable(struct e1000_hw *hw);
 static void em_rxq_intr_enable(struct e1000_hw *hw);
 static void em_rxq_intr_disable(struct e1000_hw *hw);
+
 static int eth_em_led_on(struct rte_eth_dev *dev);
 static int eth_em_led_off(struct rte_eth_dev *dev);

@@ -162,6 +166,8 @@ static const struct eth_dev_ops eth_em_ops = {
.rx_descriptor_done   = eth_em_rx_descriptor_done,
.tx_queue_setup   = eth_em_tx_queue_setup,
.tx_queue_release = eth_em_tx_queue_release,
+   .rx_queue_intr_enable = eth_em_rx_queue_intr_enable,
+   .rx_queue_intr_disable = eth_em_rx_queue_intr_disable,
.dev_led_on   = eth_em_led_on,
.dev_led_off  = eth_em_led_off,
.flow_ctrl_get= eth_em_flow_ctrl_get,
@@ -890,6 +896,27 @@ eth_em_stats_reset(struct rte_eth_dev *dev)
memset(hw_stats, 0, sizeof(*hw_stats));
 }

+static int
+eth_em_rx_queue_intr_enable(struct rte_eth_dev *dev, __rte_unused uint16_t queue_id)
+{
+   struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   em_rxq_intr_enable(hw);
+   rte_intr_enable(&(dev->pci_dev->intr_handle));
+
+   return 0;
+}
+
+static int
+eth_em_rx_queue_intr_disable(struct rte_eth_dev *dev, __rte_unused uint16_t queue_id)
+{
+   struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   em_rxq_intr_disable(hw);
+
+   return 0;
+}
+
 static uint32_t
 em_get_max_pktlen(const struct e1000_hw *hw)
 {
-- 
1.9.3
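
The patch above plugs two callbacks into the driver's ops table, and both
ignore queue_id. A minimal sketch of that dispatch pattern follows, assuming a
generic layer that returns -ENOTSUP when a driver leaves a callback unset. All
names ending in _sketch are hypothetical and are not the ethdev API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct em_dev_sketch;

/* Hypothetical ops table with only the two callbacks of interest. */
struct em_ops_sketch {
	int (*rx_queue_intr_enable)(struct em_dev_sketch *dev, uint16_t queue_id);
	int (*rx_queue_intr_disable)(struct em_dev_sketch *dev, uint16_t queue_id);
};

struct em_dev_sketch {
	const struct em_ops_sketch *ops;
	int rx_intr_on;   /* one flag for the whole device */
};

static int
em_enable_sketch(struct em_dev_sketch *dev, uint16_t queue_id)
{
	(void)queue_id;   /* single interrupt source, so the queue is ignored */
	dev->rx_intr_on = 1;
	return 0;
}

static int
em_disable_sketch(struct em_dev_sketch *dev, uint16_t queue_id)
{
	(void)queue_id;
	dev->rx_intr_on = 0;
	return 0;
}

static const struct em_ops_sketch em_ops_sketch = {
	.rx_queue_intr_enable = em_enable_sketch,
	.rx_queue_intr_disable = em_disable_sketch,
};

/* Generic entry points: forward to the driver if it provides the callback. */
static int
rx_intr_enable_sketch(struct em_dev_sketch *dev, uint16_t queue_id)
{
	if (dev->ops->rx_queue_intr_enable == NULL)
		return -ENOTSUP;
	return dev->ops->rx_queue_intr_enable(dev, queue_id);
}

static int
rx_intr_disable_sketch(struct em_dev_sketch *dev, uint16_t queue_id)
{
	if (dev->ops->rx_queue_intr_disable == NULL)
		return -ENOTSUP;
	return dev->ops->rx_queue_intr_disable(dev, queue_id);
}
```

Because the callbacks drop queue_id, enabling the interrupt for any queue turns
it on for the whole device, which matches the __rte_unused annotation in the
patch.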



[dpdk-dev] [PATCH v2 2/7] e1000: separate lsc and rxq interrupt disable function

2015-10-27 Thread Yong Liu
Separate the lsc and rxq interrupts because they have different interrupt handlers.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 3be8269..39f330a 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -108,11 +108,12 @@ static void em_vlan_hw_strip_disable(struct rte_eth_dev *dev);
 static void eth_em_vlan_filter_set(struct rte_eth_dev *dev,
uint16_t vlan_id, int on);
 */
+static void em_lsc_intr_disable(struct e1000_hw *hw);
 static void em_rxq_intr_enable(struct e1000_hw *hw);
+static void em_rxq_intr_disable(struct e1000_hw *hw);
 static int eth_em_led_on(struct rte_eth_dev *dev);
 static int eth_em_led_off(struct rte_eth_dev *dev);

-static void em_intr_disable(struct e1000_hw *hw);
 static int em_get_rx_buffer_size(struct e1000_hw *hw);
 static void eth_em_rar_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
uint32_t index, uint32_t pool);
@@ -640,7 +641,9 @@ eth_em_stop(struct rte_eth_dev *dev)
struct rte_eth_link link;
struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);

-   em_intr_disable(hw);
+   em_rxq_intr_disable(hw);
+   em_lsc_intr_disable(hw);
+
e1000_reset_hw(hw);
if (hw->mac.type >= e1000_82544)
E1000_WRITE_REG(hw, E1000_WUC, 0);
@@ -1254,13 +1257,7 @@ eth_em_vlan_offload_set(struct rte_eth_dev *dev, int mask)
}
 }

-static void
-em_intr_disable(struct e1000_hw *hw)
-{
-   E1000_WRITE_REG(hw, E1000_IMC, ~0);
-}
-
-/**
+/*
  * It enables the interrupt mask and then enable the interrupt.
  *
  * @param dev
@@ -1304,6 +1301,35 @@ eth_em_rxq_interrupt_setup(struct rte_eth_dev *dev)
 }

 /*
+ * It disabled lsc interrupt.
+ * @param hw
+ * Pointer to struct e1000_hw
+ *
+ * @return
+ */
+static void
+em_lsc_intr_disable(struct e1000_hw *hw)
+{
+   E1000_WRITE_REG(hw, E1000_IMC, E1000_IMS_LSC);
+   E1000_WRITE_FLUSH(hw);
+}
+
+/*
+ * It disabled receive packet interrupt.
+ * @param hw
+ * Pointer to struct e1000_hw
+ *
+ * @return
+ */
+static void
+em_rxq_intr_disable(struct e1000_hw *hw)
+{
+   E1000_READ_REG(hw, E1000_ICR);
+   E1000_WRITE_REG(hw, E1000_IMC, E1000_IMS_RXT0);
+   E1000_WRITE_FLUSH(hw);
+}
+
+/*
  * It enable receive packet interrupt.
  * @param hw
  * Pointer to struct e1000_hw
-- 
1.9.3
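
The patch above replaces one "clear everything" write to E1000_IMC with two
functions that each clear a single bit. A minimal software model of that
difference follows, assuming the mask is driven by a set register and a clear
register in which only the 1 bits of a write take effect. The sk_ names and the
bit values are illustrative only and are not taken from the e1000 headers.

```c
#include <stdint.h>

/* Illustrative bit positions for this sketch only. */
#define SK_INTR_LSC  0x04u   /* link status change */
#define SK_INTR_RXT0 0x80u   /* receive timer */

/* Software model of an interrupt mask driven by a set/clear register pair. */
struct sk_intr_mask {
	uint32_t enabled;
};

/* Set-register write: 1 bits enable a cause, 0 bits leave it alone. */
static void
sk_mask_set(struct sk_intr_mask *m, uint32_t bits)
{
	m->enabled |= bits;
}

/* Clear-register write: 1 bits disable a cause, 0 bits leave it alone. */
static void
sk_mask_clear(struct sk_intr_mask *m, uint32_t bits)
{
	m->enabled &= ~bits;
}
```

In this model, clearing SK_INTR_RXT0 leaves SK_INTR_LSC enabled, while clearing
with ~0 (as the removed em_intr_disable() did) turns off both.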



[dpdk-dev] [PATCH v2 1/7] e1000: add rx interrupt support

2015-10-27 Thread Yong Liu
Enable rx interrupt support on e1000 physical and emulated devices.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 912f5dd..3be8269 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -81,6 +81,7 @@ static int eth_em_flow_ctrl_get(struct rte_eth_dev *dev,
 static int eth_em_flow_ctrl_set(struct rte_eth_dev *dev,
struct rte_eth_fc_conf *fc_conf);
 static int eth_em_interrupt_setup(struct rte_eth_dev *dev);
+static int eth_em_rxq_interrupt_setup(struct rte_eth_dev *dev);
 static int eth_em_interrupt_get_status(struct rte_eth_dev *dev);
 static int eth_em_interrupt_action(struct rte_eth_dev *dev);
 static void eth_em_interrupt_handler(struct rte_intr_handle *handle,
@@ -107,6 +108,7 @@ static void em_vlan_hw_strip_disable(struct rte_eth_dev *dev);
 static void eth_em_vlan_filter_set(struct rte_eth_dev *dev,
uint16_t vlan_id, int on);
 */
+static void em_rxq_intr_enable(struct e1000_hw *hw);
 static int eth_em_led_on(struct rte_eth_dev *dev);
 static int eth_em_led_off(struct rte_eth_dev *dev);

@@ -608,6 +610,9 @@ eth_em_start(struct rte_eth_dev *dev)
return ret;
}
}
+   /* check if rxq interrupt is enabled */
+   if (dev->data->dev_conf.intr_conf.rxq != 0)
+   eth_em_rxq_interrupt_setup(dev);

adapter->stopped = 0;

@@ -1277,6 +1282,42 @@ eth_em_interrupt_setup(struct rte_eth_dev *dev)
 }

 /*
+ * It clears the interrupt causes and enables the interrupt.
+ * It will be called once only during nic initialized.
+ *
+ * @param dev
+ *  Pointer to struct rte_eth_dev.
+ *
+ * @return
+ *  - On success, zero.
+ *  - On failure, a negative value.
+ */
+static int
+eth_em_rxq_interrupt_setup(struct rte_eth_dev *dev)
+{
+   struct e1000_hw *hw =
+   E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   E1000_READ_REG(hw, E1000_ICR);
+   em_rxq_intr_enable(hw);
+   return 0;
+}
+
+/*
+ * It enable receive packet interrupt.
+ * @param hw
+ * Pointer to struct e1000_hw
+ *
+ * @return
+ */
+static void
+em_rxq_intr_enable(struct e1000_hw *hw)
+{
+   E1000_WRITE_REG(hw, E1000_IMS, E1000_IMS_RXT0);
+   E1000_WRITE_FLUSH(hw);
+}
+
+/*
  * It reads ICR and gets interrupt causes, check it and set a bit flag
  * to update link status.
  *
-- 
1.9.3



[dpdk-dev] [PATCH v2 0/7] interrupt mode for e1000

2015-10-27 Thread Yong Liu
v2 changes:
describe how interrupt mode works with uio and vfio+msi
flush after enabling and disabling the interrupt register
replace attribute __unused__ with __rte_unused

This patch set enables interrupts for physical and emulated e1000 devices.
Rx queue interrupts work with the uio driver, or with the vfio driver in msi mode.
l3fwd-power disables the interrupt immediately on wake-up because e1000 does not
support interrupt auto-clear.
The LSC and rxq interrupts are separated because e1000 can only support one
interrupt cause at a time.

The patch set is developed based on one previous patch set
"[PATCH v1 00/11] interrupt mode for i40e"
http://www.dpdk.org/ml/archives/dev/2015-September/023903.html

Marvin Liu (7):
  e1000: add rx interrupt support
  e1000: separate lsc and rxq interrupt disable function
  e1000: add ethdev rxq enable and disable function
  e1000: add rxq interrupt handler
  e1000: check lsc and rxq not enable in the same time
  e1000: lsc interrupt setup function only enable itself
  l3fwd-power: disable interrupt when wake up from sleep

 drivers/net/e1000/em_ethdev.c | 181 +-
 examples/l3fwd-power/main.c   |   1 +
 2 files changed, 163 insertions(+), 19 deletions(-)

-- 
1.9.3



[dpdk-dev] Mellanox PMD failure w/DPDK-2.1.0 and MLNX_OFED-3.1-1.0.3

2015-10-27 Thread Dave Engebretsen
Hi

I'm seeing a similar failure "CQ creation failure: Cannot allocate memory".   
My FW level and OFED version match Bill's and we are on DPDK 2.1. 

Has any solution other than changing the linkage been found yet?

Thanks 


[dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP descriptor for devices newer than 82598

2015-10-27 Thread Vlad Zolotarov


On 10/27/15 21:10, Ananyev, Konstantin wrote:
> Hi lads,
>
>> -----Original Message-----
>> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>> Sent: Tuesday, October 27, 2015 6:48 PM
>> To: Thomas Monjalon; Ananyev, Konstantin; Zhang, Helin
>> Cc: dev at dpdk.org; Kirsher, Jeffrey T; Brandeburg, Jesse
>> Subject: Re: [dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP 
>> descriptor for devices newer than 82598
>>
>>
>>
>> On 10/27/15 20:09, Thomas Monjalon wrote:
>>> Any Follow-up to this discussion?
>>> Should we mark this patch as rejected?
>> Hmmm... This patch fixes an obvious spec violation. Why would it be
>> rejected?
> No, I don't think we can reject the patch:
> There is a reproducible TX hang on the ixgbe PMD under the described conditions.
> Though, as I explained here:
> http://dpdk.org/ml/archives/dev/2015-September/023574.html
> Vlad's patch would cause quite a big slowdown.
> We are still waiting for an answer from the HW guys on whether there are any
> alternatives that would fix the problem and avoid the slowdown.

+1

> Konstantin
>
>>> 2015-08-24 11:11, Vlad Zolotarov:
>>>> On 08/20/15 18:37, Vlad Zolotarov wrote:
>>>>> According to 82599 and x540 HW specifications RS bit *must* be
>>>>> set in the last descriptor of *every* packet.
>>>>>
>>>>> Before this patch there were 3 types of Tx callbacks that were setting
>>>>> RS bit every tx_rs_thresh descriptors. This patch introduces a set of
>>>>> new callbacks, one for each type mentioned above, that will set the RS
>>>>> bit in every EOP descriptor.
>>>>>
>>>>> ixgbe_set_tx_function() will set the appropriate Tx callback according
>>>>> to the device family.
>>>> [+Jesse and Jeff]
>>>>
>>>> I've started to look at the i40e PMD and it has the same RS bit
>>>> deferring logic
>>>> as ixgbe PMD has (surprise, surprise!.. ;)). To recall, i40e PMD uses a
>>>> descriptor write-back
>>>> completion mode.
>>>>
>>>>   From the HW Spec it's unclear if RS bit should be set on *every* descriptor
>>>> with EOP bit. However I noticed that Linux driver, before it moved to
>>>> HEAD write-back mode, was setting RS
>>>> bit on every EOP descriptor.
>>>>
>>>> So, here is a question to Intel guys: could u, pls., clarify this point?
>>>>
>>>> Thanks in advance,
>>>> vlad
>>>
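
The thread above contrasts two policies: setting the RS bit roughly once every
tx_rs_thresh descriptors, and setting it on every EOP descriptor. A minimal
simulation of both follows, assuming the threshold policy marks the last
descriptor of the packet that pushes the running count to the threshold. The
flag values and function names are illustrative only and are not taken from the
ixgbe PMD.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_DESC_EOP 0x1u   /* last descriptor of a packet */
#define SK_DESC_RS  0x2u   /* report status requested */

/* Fill flags[] for packets with the given segment counts and return the
 * number of descriptors used. rs_thresh == 0 means RS on every EOP,
 * otherwise RS is set once the running descriptor count reaches rs_thresh. */
static size_t
fill_descs(const unsigned *segs, size_t nb_pkts, unsigned rs_thresh,
	   uint8_t *flags, size_t cap)
{
	size_t d = 0;
	unsigned used = 0;

	for (size_t p = 0; p < nb_pkts; p++) {
		if (segs[p] == 0 || d + segs[p] > cap)
			break;
		for (unsigned s = 0; s < segs[p]; s++)
			flags[d++] = 0;
		flags[d - 1] |= SK_DESC_EOP;
		used += segs[p];
		if (rs_thresh == 0 || used >= rs_thresh) {
			flags[d - 1] |= SK_DESC_RS;
			used = 0;
		}
	}
	return d;
}

/* Count descriptors that carry the given flag. */
static size_t
count_flag(const uint8_t *flags, size_t n, uint8_t flag)
{
	size_t c = 0;

	for (size_t i = 0; i < n; i++)
		if (flags[i] & flag)
			c++;
	return c;
}
```

For four packets of 1, 2, 1 and 3 segments, a threshold of 4 yields a single RS
bit across seven descriptors, while the per-EOP policy yields four. That gap in
status write-backs is the cost the slowdown concern in the thread refers to.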



[dpdk-dev] [Patch 2/2] i40e simple tx: Larger list size (33 to 128) throughput optimization

2015-10-27 Thread Polehn, Mike A
Added packet memory prefetch for faster access to the variables inside the
packet buffer that are needed for the free operation.

Signed-off-by: Mike A. Polehn 

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 177fb2e..d9bc30a 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1748,7 +1748,8 @@ static inline int __attribute__((always_inline))
 i40e_tx_free_bufs(struct i40e_tx_queue *txq)
 {
struct i40e_tx_entry *txep;
-   uint16_t i;
+   unsigned i, l, tx_rs_thresh;
+   struct rte_mbuf *pk, *pk_next;

if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz &
rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) !=
@@ -1757,18 +1758,46 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)

txep = &(txq->sw_ring[txq->tx_next_dd - (txq->tx_rs_thresh - 1)]);

-   for (i = 0; i < txq->tx_rs_thresh; i++)
-   rte_prefetch0((txep + i)->mbuf);
+   /* Prefetch first 2 packets */
+   pk = txep->mbuf;
+   rte_prefetch0(pk);
+   txep->mbuf = NULL;
+   txep++;
+   tx_rs_thresh = txq->tx_rs_thresh;
+   if (likely(txq->tx_rs_thresh > 1)) {
+   pk_next = txep->mbuf;
+   rte_prefetch0(pk_next);
+   txep->mbuf = NULL;
+   txep++;
+   l = tx_rs_thresh - 2;
+   } else {
+   pk_next = pk;
+   l = tx_rs_thresh - 1;
+   }

if (!(txq->txq_flags & (uint32_t)ETH_TXQ_FLAGS_NOREFCOUNT)) {
-   for (i = 0; i < txq->tx_rs_thresh; ++i, ++txep) {
-   rte_mempool_put(txep->mbuf->pool, txep->mbuf);
-   txep->mbuf = NULL;
+   for (i = 0; i < tx_rs_thresh; ++i) {
+   struct rte_mbuf *mbuf = pk;
+   pk = pk_next;
+   if (likely(i < l)) {
+   pk_next = txep->mbuf;
+   rte_prefetch0(pk_next);
+   txep->mbuf = NULL;
+   txep++;
+   }
+   rte_mempool_put(mbuf->pool, mbuf);
}
} else {
-   for (i = 0; i < txq->tx_rs_thresh; ++i, ++txep) {
-   rte_pktmbuf_free_seg(txep->mbuf);
-   txep->mbuf = NULL;
+   for (i = 0; i < tx_rs_thresh; ++i) {
+   struct rte_mbuf *mbuf = pk;
+   pk = pk_next;
+   if (likely(i < l)) {
+   pk_next = txep->mbuf;
+   rte_prefetch0(pk_next);
+   txep->mbuf = NULL;
+   txep++;
+   }
+   rte_pktmbuf_free_seg(mbuf);
}
}
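
The reworked free loop above keeps two buffers in flight: it prefetches the
first two up front, then fetches one more each time it releases one. A minimal
sketch of that two-deep lookahead follows, written with __builtin_prefetch (a
GCC/Clang builtin, used here as an assumption in place of rte_prefetch0) and a
counter in place of the patch's "i < l" test. struct sk_buf and sk_release()
are hypothetical stand-ins for the mbuf and the free call.

```c
#include <stddef.h>

/* Hypothetical buffer that only records how often it was released. */
struct sk_buf {
	int freed;
};

static void
sk_release(struct sk_buf *b)
{
	b->freed++;
}

/* Release n buffers from ring[], keeping up to two prefetched ahead.
 * Each slot is set to NULL as soon as its pointer has been picked up. */
static unsigned
sk_free_with_lookahead(struct sk_buf **ring, unsigned n)
{
	struct sk_buf *cur, *next;
	unsigned i, fetched;

	if (n == 0)
		return 0;

	/* Prime the pipeline with the first one or two buffers. */
	cur = ring[0];
	__builtin_prefetch(cur);
	ring[0] = NULL;
	fetched = 1;
	if (n > 1) {
		next = ring[1];
		__builtin_prefetch(next);
		ring[1] = NULL;
		fetched = 2;
	} else {
		next = cur;
	}

	for (i = 0; i < n; i++) {
		struct sk_buf *b = cur;

		cur = next;
		if (fetched < n) {
			/* Start the fetch for the buffer two steps ahead. */
			next = ring[fetched];
			__builtin_prefetch(next);
			ring[fetched] = NULL;
			fetched++;
		}
		sk_release(b);
	}
	return n;
}
```

The point of the shape is that the release call always works on a buffer whose
prefetch was issued at least one iteration earlier, instead of prefetching
every buffer in one pass and then walking the ring a second time.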



[dpdk-dev] [Patch 1/2] i40e simple tx: Larger list size (33 to 128) throughput optimization

2015-10-27 Thread Polehn, Mike A
Reduce the focus on a 32 packet list size for better handling of a range of
packet list sizes.

Changed the maximum new buffer loop process size to the NIC queue free buffer
count per loop.

Removed the redundant single call check, leaving just one call with a focused
loop.

Moved the NIC register update write from once per loop to once per driver write
call, to minimize CPU stalls waiting for multiple SMP synchronization points and
for earlier NIC register writes that often take large cycle counts to complete.
For example, with an output list size of 64 and the default loop size of 32,
when 33 packets are queued on the descriptor table, the second NIC register
write occurs just after TX processing for 1 packet, resulting in a large CPU
stall time.

Used some standard variables to help reduce the overhead of non-standard
variable sizes.

Reordered the variable structure to put the most active variables in the first
cache line, to better utilize memory bytes inside the cache line, and to reduce
the active cache line count during the call.

Signed-off-by: Mike A. Polehn 

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index ec62f75..2032e06 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -64,6 +64,7 @@
 #define DEFAULT_TX_FREE_THRESH 32
 #define I40E_MAX_PKT_TYPE  256
 #define I40E_RX_INPUT_BUF_MAX  256
+#define I40E_RX_FREE_THRESH_MIN  2

 #define I40E_TX_MAX_BURST  32

@@ -942,6 +943,12 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused struct i40e_rx_queue *rxq)
 "rxq->rx_free_thresh=%d",
 rxq->nb_rx_desc, rxq->rx_free_thresh);
ret = -EINVAL;
+   } else if (rxq->rx_free_thresh < I40E_RX_FREE_THRESH_MIN) {
+   PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
+   "rxq->rx_free_thresh=%d, "
+   "I40E_RX_FREE_THRESH_MIN=%d",
+   rxq->rx_free_thresh, I40E_RX_FREE_THRESH_MIN);
+   ret = -EINVAL;
} else if (!(rxq->nb_rx_desc < (I40E_MAX_RING_DESC -
RTE_PMD_I40E_RX_MAX_BURST))) {
PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
@@ -1058,9 +1065,8 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
 {
volatile union i40e_rx_desc *rxdp;
struct i40e_rx_entry *rxep;
-   struct rte_mbuf *mb;
-   unsigned alloc_idx, i;
-   uint64_t dma_addr;
+   struct rte_mbuf *pk, *npk;
+   unsigned alloc_idx, i, l;
int diag;

/* Allocate buffers in bulk */
@@ -1076,22 +1082,36 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
return -ENOMEM;
}

+   pk = rxep->mbuf;
+   rte_prefetch0(pk);
+   rxep++;
+   npk = rxep->mbuf;
+   rte_prefetch0(npk);
+   rxep++;
+   l = rxq->rx_free_thresh - 2;
+
rxdp = &rxq->rx_ring[alloc_idx];
for (i = 0; i < rxq->rx_free_thresh; i++) {
-   if (likely(i < (rxq->rx_free_thresh - 1)))
+   struct rte_mbuf *mb = pk;
+   pk = npk;
+   if (likely(i < l)) {
/* Prefetch next mbuf */
-   rte_prefetch0(rxep[i + 1].mbuf);
-
-   mb = rxep[i].mbuf;
-   rte_mbuf_refcnt_set(mb, 1);
-   mb->next = NULL;
+   npk = rxep->mbuf;
+   rte_prefetch0(npk);
+   rxep++;
+   }
mb->data_off = RTE_PKTMBUF_HEADROOM;
+   rte_mbuf_refcnt_set(mb, 1);
mb->nb_segs = 1;
mb->port = rxq->port_id;
-   dma_addr = rte_cpu_to_le_64(\
-   RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
-   rxdp[i].read.hdr_addr = 0;
-   rxdp[i].read.pkt_addr = dma_addr;
+   mb->next = NULL;
+   {
+   uint64_t dma_addr = rte_cpu_to_le_64(
+   RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
+   rxdp->read.hdr_addr = dma_addr;
+   rxdp->read.pkt_addr = dma_addr;
+   }
+   rxdp++;
}

rxq->rx_last_pos = alloc_idx + rxq->rx_free_thresh - 1;
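
The commit message of this patch describes moving the NIC register write from
once per inner loop to once per driver call. A minimal simulation of the two
policies follows, counting register writes for a given packet count and loop
size. struct sk_txq and all function names are hypothetical; a real driver
would write a tail register rather than increment a counter.

```c
/* Hypothetical queue state: ring position plus a count of register writes. */
struct sk_txq {
	unsigned tail;
	unsigned ring_size;
	unsigned reg_writes;
};

/* Stand-in for the (expensive) NIC tail register write. */
static void
sk_write_tail_reg(struct sk_txq *q)
{
	q->reg_writes++;
}

/* Place n packets on the ring and advance the software tail. */
static unsigned
sk_enqueue(struct sk_txq *q, unsigned n)
{
	q->tail = (q->tail + n) % q->ring_size;
	return n;
}

/* Policy 1: one register write after every chunk of at most `burst`. */
static unsigned
sk_xmit_write_per_chunk(struct sk_txq *q, unsigned nb_pkts, unsigned burst)
{
	unsigned sent = 0;

	if (burst == 0)
		return 0;
	while (sent < nb_pkts) {
		unsigned n = nb_pkts - sent;

		if (n > burst)
			n = burst;
		sent += sk_enqueue(q, n);
		sk_write_tail_reg(q);
	}
	return sent;
}

/* Policy 2: same chunking, but a single register write at the end. */
static unsigned
sk_xmit_write_once(struct sk_txq *q, unsigned nb_pkts, unsigned burst)
{
	unsigned sent = 0;

	if (burst == 0)
		return 0;
	while (sent < nb_pkts) {
		unsigned n = nb_pkts - sent;

		if (n > burst)
			n = burst;
		sent += sk_enqueue(q, n);
	}
	if (sent > 0)
		sk_write_tail_reg(q);
	return sent;
}
```

For the 33-packet case from the commit message with a loop size of 32, the
first policy issues two register writes and the second issues one, while both
leave the software tail in the same place.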


[dpdk-dev] [Patch 2/2] i40e rx Bulk Alloc: Larger list size (33 to 128) throughput optimization

2015-10-27 Thread Polehn, Mike A
Added a check for a minimum allocation count of 2 packets, to eliminate the
extra overhead of supporting prefetch for the case where only one packet is
allocated into the queue at a time.

Used some standard variables to help reduce the overhead of non-standard
variable sizes.

Added a second level of prefetch to get the packet address into cache 0 earlier,
and eliminated the calculation inside the loop that determined the end of the
prefetch loop.

Used old-time C instruction optimization methods: using pointers instead of
arrays, and reducing the scope of some variables to improve the chances of
using register variables instead of stack variables.

Signed-off-by: Mike A. Polehn 

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index ec62f75..2032e06 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -64,6 +64,7 @@
 #define DEFAULT_TX_FREE_THRESH 32
 #define I40E_MAX_PKT_TYPE  256
 #define I40E_RX_INPUT_BUF_MAX  256
+#define I40E_RX_FREE_THRESH_MIN  2

 #define I40E_TX_MAX_BURST  32

@@ -942,6 +943,12 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused struct i40e_rx_queue *rxq)
 "rxq->rx_free_thresh=%d",
 rxq->nb_rx_desc, rxq->rx_free_thresh);
ret = -EINVAL;
+   } else if (rxq->rx_free_thresh < I40E_RX_FREE_THRESH_MIN) {
+   PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
+   "rxq->rx_free_thresh=%d, "
+   "I40E_RX_FREE_THRESH_MIN=%d",
+   rxq->rx_free_thresh, I40E_RX_FREE_THRESH_MIN);
+   ret = -EINVAL;
} else if (!(rxq->nb_rx_desc < (I40E_MAX_RING_DESC -
RTE_PMD_I40E_RX_MAX_BURST))) {
PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions: "
@@ -1058,9 +1065,8 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
 {
volatile union i40e_rx_desc *rxdp;
struct i40e_rx_entry *rxep;
-   struct rte_mbuf *mb;
-   unsigned alloc_idx, i;
-   uint64_t dma_addr;
+   struct rte_mbuf *pk, *npk;
+   unsigned alloc_idx, i, l;
int diag;

/* Allocate buffers in bulk */
@@ -1076,22 +1082,36 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
return -ENOMEM;
}

+   pk = rxep->mbuf;
+   rte_prefetch0(pk);
+   rxep++;
+   npk = rxep->mbuf;
+   rte_prefetch0(npk);
+   rxep++;
+   l = rxq->rx_free_thresh - 2;
+
rxdp = &rxq->rx_ring[alloc_idx];
for (i = 0; i < rxq->rx_free_thresh; i++) {
-   if (likely(i < (rxq->rx_free_thresh - 1)))
+   struct rte_mbuf *mb = pk;
+   pk = npk;
+   if (likely(i < l)) {
/* Prefetch next mbuf */
-   rte_prefetch0(rxep[i + 1].mbuf);
-
-   mb = rxep[i].mbuf;
-   rte_mbuf_refcnt_set(mb, 1);
-   mb->next = NULL;
+   npk = rxep->mbuf;
+   rte_prefetch0(npk);
+   rxep++;
+   }
mb->data_off = RTE_PKTMBUF_HEADROOM;
+   rte_mbuf_refcnt_set(mb, 1);
mb->nb_segs = 1;
mb->port = rxq->port_id;
-   dma_addr = rte_cpu_to_le_64(\
-   RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
-   rxdp[i].read.hdr_addr = 0;
-   rxdp[i].read.pkt_addr = dma_addr;
+   mb->next = NULL;
+   {
+   uint64_t dma_addr = rte_cpu_to_le_64(
+   RTE_MBUF_DATA_DMA_ADDR_DEFAULT(mb));
+   rxdp->read.hdr_addr = dma_addr;
+   rxdp->read.pkt_addr = dma_addr;
+   }
+   rxdp++;
}

rxq->rx_last_pos = alloc_idx + rxq->rx_free_thresh - 1;
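
One reading of the new I40E_RX_FREE_THRESH_MIN precondition is that it protects
the unsigned "rx_free_thresh - 2" computed before the loop, since two buffers
are always prefetched up front. A minimal sketch of that relationship follows.
The constant and function names are hypothetical and only mirror the shape of
the check in the diff.

```c
#include <errno.h>

#define SK_RX_FREE_THRESH_MIN 2u   /* mirrors the new minimum in the diff */

/* Reject thresholds too small for the two up-front prefetches. */
static int
sk_check_free_thresh(unsigned rx_free_thresh)
{
	if (rx_free_thresh < SK_RX_FREE_THRESH_MIN)
		return -EINVAL;
	return 0;
}

/* Number of in-loop prefetches left after the two issued before the loop.
 * Only meaningful once sk_check_free_thresh() has returned 0; for smaller
 * values the unsigned subtraction wraps around to a very large number. */
static unsigned
sk_remaining_prefetches(unsigned rx_free_thresh)
{
	return rx_free_thresh - 2;
}
```

With a threshold of 1 the subtraction would wrap, and the loop would keep
reading entries past the allocated range, so rejecting such a configuration at
setup time removes the need for a per-iteration guard.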



[dpdk-dev] [Patch 1/2] i40e RX Bulk Alloc: Larger list size (33 to 128) throughput optimization

2015-10-27 Thread Polehn, Mike A
Combined 2 subroutines into one subroutine with one read operation followed by
a buffer allocate and load loop.

Eliminated the staging queue and subroutine, which removed extra pointer list
movements and reduced the number of active variable cache pages during the call.

Reduced the queue position variables to just 2, the next read point and the
last NIC RX descriptor position, and changed to allow the NIC descriptor table
to not always need to be filled.

Moved the NIC register update write from once per loop to once per driver write
call, to minimize CPU stalls waiting for multiple SMP synchronization points and
for earlier NIC register writes that often take large cycle counts to complete.
For example, with an input packet list of 33 and the default loop size of 32,
the second NIC register write occurs just after RX processing for just 1
packet, resulting in a large CPU stall time.

Eliminated the initial rx packet present test before the rx processing loop,
which also checks, since less free time is generally available when packets are
present than when not processing any input packets.

Used some standard variables to help reduce the overhead of non-standard
variable sizes.

Reduced the number of variables, reordered the variable structure to put the
most active variables in the first cache line, better utilized memory bytes
inside the cache line, and reduced the active cache line count to 1 cache line
during the processing call. Other RX subroutine sets might still use more than
1 variable cache line.

Signed-off-by: Mike A. Polehn 

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index fd656d5..ea63f2f 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -63,6 +63,7 @@
 #define DEFAULT_TX_RS_THRESH   32
 #define DEFAULT_TX_FREE_THRESH 32
 #define I40E_MAX_PKT_TYPE  256
+#define I40E_RX_INPUT_BUF_MAX  256

 #define I40E_TX_MAX_BURST  32

@@ -959,115 +960,97 @@ check_rx_burst_bulk_alloc_preconditions(__rte_unused struct i40e_rx_queue *rxq)
 }

 #ifdef RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC
-#define I40E_LOOK_AHEAD 8
-#if (I40E_LOOK_AHEAD != 8)
-#error "PMD I40E: I40E_LOOK_AHEAD must be 8\n"
-#endif
-static inline int
-i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
+
+static inline unsigned
+i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
+   unsigned nb_pkts)
 {
volatile union i40e_rx_desc *rxdp;
struct i40e_rx_entry *rxep;
-   struct rte_mbuf *mb;
-   uint16_t pkt_len;
-   uint64_t qword1;
-   uint32_t rx_status;
-   int32_t s[I40E_LOOK_AHEAD], nb_dd;
-   int32_t i, j, nb_rx = 0;
-   uint64_t pkt_flags;
+   unsigned i, n, tail;

-   rxdp = &rxq->rx_ring[rxq->rx_tail];
-   rxep = &rxq->sw_ring[rxq->rx_tail];
-
-   qword1 = rte_le_to_cpu_64(rxdp->wb.qword1.status_error_len);
-   rx_status = (qword1 & I40E_RXD_QW1_STATUS_MASK) >>
-   I40E_RXD_QW1_STATUS_SHIFT;
+   /* Wrap tail */
+   if (rxq->rx_tail >= rxq->nb_rx_desc)
+   tail = 0;
+   else
+   tail = rxq->rx_tail;
+
+   /* Stop at end of Q, for end, next read alligned at Q start */
+   n = rxq->nb_rx_desc - tail;
+   if (n < nb_pkts)
+   nb_pkts = n;
+
+   rxdp = &rxq->rx_ring[tail];
+   rte_prefetch0(rxdp);
+   rxep = &rxq->sw_ring[tail];
+   rte_prefetch0(rxep);
+
+   i = 0;
+   while (nb_pkts > 0) {
+   /* Prefetch NIC descriptors and packet list */
+   if (likely(nb_pkts > 4)) {
+   rte_prefetch0(&rxdp[4]);
+   if (likely(nb_pkts > 8)) {
+   rte_prefetch0(&rxdp[8]);
+   rte_prefetch0(&rxep[8]);
+   }
+   }

-   /* Make sure there is at least 1 packet to receive */
-   if (!(rx_status & (1 << I40E_RX_DESC_STATUS_DD_SHIFT)))
-   return 0;
+   for (n = 0; (nb_pkts > 0)&&(n < 8); n++, nb_pkts--, i++) {
+   uint64_t qword1;
+   uint64_t pkt_flags;
+   uint16_t pkt_len;
+   struct rte_mbuf *mb = rxep->mbuf;
+   rxep++;

-   /**
-* Scan LOOK_AHEAD descriptors at a time to determine which
-* descriptors reference packets that are ready to be received.
-*/
-   for (i = 0; i < RTE_PMD_I40E_RX_MAX_BURST; i+=I40E_LOOK_AHEAD,
-   rxdp += I40E_LOOK_AHEAD, rxep += I40E_LOOK_AHEAD) {
-   /* Read desc statuses backwards to avoid race condition */
-   for (j = I40E_LOOK_AHEAD - 1; j >= 0; j--) {
+   /* Translate descriptor info to mbuf parameters */
qword1 = rte_le_to_cpu_64(\
-   rxdp[j].wb.qword1.status_error_len);
-   s[j] = (qword1 & I40E_RXD_QW1_STATUS_MASK) >>
-   
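
The start of the new i40e_rx_scan_hw_ring() above wraps the tail to 0 when it
has reached the end of the ring, then limits the request so that one call never
reads past the last descriptor. A minimal sketch of that clamping follows, as a
standalone function with hypothetical names. It covers only the index
arithmetic, not the descriptor processing.

```c
/* Wrap the tail if it has run off the end of the ring, store the start
 * index in *start, and return how many of the nb_pkts requested entries
 * can be read before the end of the ring is reached. */
static unsigned
sk_clamp_to_ring_end(unsigned rx_tail, unsigned nb_rx_desc, unsigned nb_pkts,
		     unsigned *start)
{
	unsigned tail = (rx_tail >= nb_rx_desc) ? 0 : rx_tail;
	unsigned room = nb_rx_desc - tail;

	*start = tail;
	return (room < nb_pkts) ? room : nb_pkts;
}
```

With a ring of 128 descriptors and the tail at 120, a request for 32 packets is
cut to 8. The next call then starts at index 0, which matches the comment in
the hunk about the next read being aligned at the start of the queue.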

[dpdk-dev] [Patch] Eth Driver: Optimization for improved NIC processing rates

2015-10-27 Thread Polehn, Mike A
Prefetch the interface access variables while calling into the driver RX and TX
subroutines.

For converging zero loss packet task tests, a small drop in latency for the
zero loss measurements and a small drop in lost packet counts for the lossy
measurement points were observed, indicating some savings of execution clock
cycles.

Signed-off-by: Mike A. Polehn 

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 8a8c82b..09f1069 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2357,11 +2357,15 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
 {
struct rte_eth_dev *dev;
+   void *rxq;

dev = &rte_eth_devices[port_id];

-   int16_t nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
-   rx_pkts, nb_pkts);
+   /* rxq is going to be immediately used, prefetch it */
+   rxq = dev->data->rx_queues[queue_id];
+   rte_prefetch0(rxq);
+
+   int16_t nb_rx = (*dev->rx_pkt_burst)(rxq, rx_pkts, nb_pkts);

 #ifdef RTE_ETHDEV_RXTX_CALLBACKS
struct rte_eth_rxtx_callback *cb = dev->post_rx_burst_cbs[queue_id];
@@ -2499,6 +2503,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
 struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
struct rte_eth_dev *dev;
+   void *txq;

dev = &rte_eth_devices[port_id];

@@ -2514,7 +2519,11 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
}
 #endif

-   return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, 
nb_pkts);
+   /* txq is going to be immediately used, prefetch it */
+   txq = dev->data->tx_queues[queue_id];
+   rte_prefetch0(txq);
+
+   return (*dev->tx_pkt_burst)(txq, tx_pkts, nb_pkts);
 }
 #endif
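
The idea in the patch above can be sketched outside of DPDK as follows.
This is only an illustration under stated assumptions: the toy_* types
and functions are hypothetical stand-ins, not the rte_ethdev
definitions, and __builtin_prefetch is the GCC/Clang builtin used here
in place of rte_prefetch0. The wrapper loads the queue pointer once,
issues a prefetch hint for it, and then passes the same pointer to the
driver callback.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ethdev structures; not the DPDK types. */
struct toy_queue {
	uint16_t ready;		/* packets the queue can still hand out */
};

typedef uint16_t (*toy_rx_burst_t)(void *rxq, void **pkts, uint16_t nb_pkts);

struct toy_dev {
	toy_rx_burst_t rx_pkt_burst;
	void **rx_queues;
};

/* Prefetch hint; expands to nothing where the builtin is unavailable. */
static inline void toy_prefetch0(const void *p)
{
#if defined(__GNUC__) || defined(__clang__)
	__builtin_prefetch(p, 0, 3);
#else
	(void)p;
#endif
}

/* Toy driver callback: hands out up to nb_pkts of the ready packets. */
static uint16_t toy_driver_rx(void *rxq, void **pkts, uint16_t nb_pkts)
{
	struct toy_queue *q = rxq;
	uint16_t n = q->ready < nb_pkts ? q->ready : nb_pkts;
	uint16_t i;

	for (i = 0; i < n; i++)
		pkts[i] = NULL;	/* a real driver would store mbuf pointers */
	q->ready = (uint16_t)(q->ready - n);
	return n;
}

/* Wrapper: fetch the queue pointer once, hint the cache, call the driver. */
static inline uint16_t
toy_rx_burst(struct toy_dev *dev, uint16_t queue_id, void **pkts,
	     uint16_t nb_pkts)
{
	void *rxq = dev->rx_queues[queue_id];

	toy_prefetch0(rxq);
	return dev->rx_pkt_burst(rxq, pkts, nb_pkts);
}
```

Whether the hint pays off depends on how much work happens between the
prefetch and the first access inside the driver, so it should be
measured on the target platform.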


[dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP descriptor for devices newer than 82598

2015-10-27 Thread Vlad Zolotarov


On 10/27/15 20:09, Thomas Monjalon wrote:
> Any Follow-up to this discussion?
> Should we mark this patch as rejected?

Hmmm... This patch fixes an obvious spec violation. Why would it be 
rejected?

>
> 2015-08-24 11:11, Vlad Zolotarov:
>> On 08/20/15 18:37, Vlad Zolotarov wrote:
>>> According to 82599 and x540 HW specifications RS bit *must* be
>>> set in the last descriptor of *every* packet.
>>>
>>> Before this patch there were 3 types of Tx callbacks that were setting
>>> RS bit every tx_rs_thresh descriptors. This patch introduces a set of
>>> new callbacks, one for each type mentioned above, that will set the RS
>>> bit in every EOP descriptor.
>>>
>>> ixgbe_set_tx_function() will set the appropriate Tx callback according
>>> to the device family.
>> [+Jesse and Jeff]
>>
>> I've started to look at the i40e PMD and it has the same RS bit
>> deferring logic
>> as ixgbe PMD has (surprise, surprise!.. ;)). To recall, i40e PMD uses a
>> descriptor write-back
>> completion mode.
>>
>>   From the HW Spec it's unclear if RS bit should be set on *every* descriptor
>> with EOP bit. However I noticed that Linux driver, before it moved to
>> HEAD write-back mode, was setting RS
>> bit on every EOP descriptor.
>>
>> So, here is a question to Intel guys: could u, pls., clarify this point?
>>
>> Thanks in advance,
>> vlad
>
>
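
To make the two policies discussed in this thread concrete, here is a
minimal sketch. It is a simplification, not the ixgbe or i40e code: the
descriptor layout, the TOY_TXD_* flag values and the function names are
assumptions made for illustration. The deferred variant requests status
(RS) only on an EOP descriptor once rs_thresh descriptors have been
used, while the per-packet variant sets RS on every EOP descriptor.

```c
#include <stdint.h>

/* Hypothetical flag values; not the hardware descriptor layout. */
#define TOY_TXD_EOP 0x1u	/* last descriptor of a packet */
#define TOY_TXD_RS  0x2u	/* report status requested */

struct toy_tx_desc {
	uint32_t cmd;
};

/*
 * Deferred policy: set RS on an EOP descriptor only after at least
 * rs_thresh descriptors were used since the last RS.
 * Returns the number of descriptors marked.
 */
static unsigned
toy_mark_rs_deferred(struct toy_tx_desc *ring, unsigned n, unsigned rs_thresh)
{
	unsigned i, used = 0, marked = 0;

	for (i = 0; i < n; i++) {
		used++;
		if ((ring[i].cmd & TOY_TXD_EOP) && used >= rs_thresh) {
			ring[i].cmd |= TOY_TXD_RS;
			used = 0;
			marked++;
		}
	}
	return marked;
}

/* Per-packet policy: set RS on every EOP descriptor. */
static unsigned
toy_mark_rs_every_eop(struct toy_tx_desc *ring, unsigned n)
{
	unsigned i, marked = 0;

	for (i = 0; i < n; i++) {
		if (ring[i].cmd & TOY_TXD_EOP) {
			ring[i].cmd |= TOY_TXD_RS;
			marked++;
		}
	}
	return marked;
}
```

With three two-descriptor packets and a threshold of 4, the deferred
variant marks a single descriptor and the per-packet variant marks
three, which is the trade-off between write-back traffic and following
the rule quoted from the specification.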



[dpdk-dev] [PATCH v3 17/17] acl: handle when SSE 4.1 is unsupported

2015-10-27 Thread Jan Viktorin
The main goal of this check is to avoid passing the -msse4.1
option to a GCC that does not support it (such as ARM toolchains).

The ACL now builds for ARM.

Signed-off-by: Jan Viktorin 
---
v2 -> v3: handle missing SSE as suggested by K. Ananyev
---
 lib/librte_acl/Makefile  |  7 ++-
 lib/librte_acl/rte_acl.c | 19 +--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
index 7a1cf8a..ed95f03 100644
--- a/lib/librte_acl/Makefile
+++ b/lib/librte_acl/Makefile
@@ -48,9 +48,14 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
 SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
-SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c

+CC_SSE4_1_SUPPORT := $(shell $(CC) -msse4.1 -dM -E - < /dev/null >/dev/null 
2>&1 && echo 1)
+
+ifeq ($(CC_SSE4_1_SUPPORT),1)
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
+CFLAGS_rte_acl.o += -DCC_SSE41_SUPPORT
 CFLAGS_acl_run_sse.o += -msse4.1
+endif

 #
 # If the compiler supports AVX2 instructions,
diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c
index d60219f..e7822de 100644
--- a/lib/librte_acl/rte_acl.c
+++ b/lib/librte_acl/rte_acl.c
@@ -42,6 +42,20 @@ static struct rte_tailq_elem rte_acl_tailq = {
 EAL_REGISTER_TAILQ(rte_acl_tailq)

 /*
+ * If the compiler doesn't support SSE instructions,
+ * then the dummy one would be used instead for SSE classify method.
+ */
+int __attribute__ ((weak))
+rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
+   __rte_unused const uint8_t **data,
+   __rte_unused uint32_t *results,
+   __rte_unused uint32_t num,
+   __rte_unused uint32_t categories)
+{
+   return -ENOTSUP;
+}
+
+/*
  * If the compiler doesn't support AVX2 instructions,
  * then the dummy one would be used instead for AVX2 classify method.
  */
@@ -97,10 +111,11 @@ rte_acl_init(void)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
alg = RTE_ACL_CLASSIFY_AVX2;
else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
-#else
+   alg = RTE_ACL_CLASSIFY_SSE;
+#elif defined (CC_SSE41_SUPPORT)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
-#endif
alg = RTE_ACL_CLASSIFY_SSE;
+#endif

rte_acl_set_default_classify(alg);
 }
-- 
2.6.1
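
The weak-symbol fallback used by this patch can be sketched in
isolation. The names below (toy_classify_sse, TOY_CC_SSE41_SUPPORT and
so on) are hypothetical and only mirror the shape of the patch; the
sketch assumes a GCC or Clang toolchain that understands
__attribute__((weak)). When no object file with a strong definition is
linked in, the weak stub is the one that gets called and reports
-ENOTSUP, and the default-algorithm selection never picks the SSE
method unless the build defined the support macro.

```c
#include <errno.h>
#include <stdint.h>

enum toy_alg { TOY_CLASSIFY_SCALAR, TOY_CLASSIFY_SSE };

/*
 * Weak fallback: overridden at link time if another object file
 * provides a strong toy_classify_sse().
 */
int __attribute__((weak))
toy_classify_sse(const uint8_t **data, uint32_t *results, uint32_t num)
{
	(void)data;
	(void)results;
	(void)num;
	return -ENOTSUP;
}

/* Always-available scalar method (placeholder "classification"). */
static int
toy_classify_scalar(const uint8_t **data, uint32_t *results, uint32_t num)
{
	uint32_t i;

	for (i = 0; i < num; i++)
		results[i] = data[i][0];
	return 0;
}

/* Pick the SSE method only if it was compiled in and the CPU has it. */
static enum toy_alg
toy_default_alg(int cpu_has_sse41)
{
#ifdef TOY_CC_SSE41_SUPPORT
	if (cpu_has_sse41)
		return TOY_CLASSIFY_SSE;
#else
	(void)cpu_has_sse41;
#endif
	return TOY_CLASSIFY_SCALAR;
}
```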



[dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for non-x86

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
the function is reimplemented using non-vector operations for non-x86
architectures.

LPM now builds for ARM.

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
v2 -> v3: as SIMD operations have been moved to rte_vect.h,
  this patch is now quite clear and just defines the
  non-x86 version of rte_lpm_lookupx4
---
 config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
 lib/librte_lpm/rte_lpm.h  | 24 +---
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc 
b/config/defconfig_arm-armv7-a-linuxapp-gcc
index 5a778cf..a2c8b95 100644
--- a/config/defconfig_arm-armv7-a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -55,7 +55,6 @@ CONFIG_RTE_EAL_IGB_UIO=n

 # fails to compile on ARM
 CONFIG_RTE_LIBRTE_ACL=n
-CONFIG_RTE_LIBRTE_LPM=n

 # cannot use those on ARM
 CONFIG_RTE_KNI_KMOD=n
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index c299ce2..c02b355 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -358,9 +358,6 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const 
uint32_t * ips,
return 0;
 }

-/* Mask four results. */
-#define RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
-
 /**
  * Lookup four IP addresses in an LPM table.
  *
@@ -382,6 +379,14 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const 
uint32_t * ips,
  */
 static inline void
 rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+   uint16_t defv);
+
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
+/* Mask four results. */
+#define RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
+
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
uint16_t defv)
 {
__m128i i24;
@@ -472,6 +477,19 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, 
uint16_t hop[4],
hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
 }
+#else
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+   uint16_t defv)
+{
+   rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
+
+   hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
+   hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
+   hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
+   hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
+}
+#endif

 #ifdef __cplusplus
 }
-- 
2.6.1
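
The non-x86 branch above boils down to "do a bulk lookup of four
addresses, then replace every miss with the default". A self-contained
sketch of that pattern follows. The toy table (indexed by the top byte
of the address) and all toy_* names are assumptions for illustration
and have nothing to do with the real LPM data structure; only the
success-flag-plus-8-bit-next-hop encoding mirrors the patch.

```c
#include <stdint.h>

#define TOY_LOOKUP_SUCCESS 0x0100u	/* flag bit above the 8-bit next hop */

/* Hypothetical table: one entry per top address byte, 0 means "no route". */
struct toy_lpm {
	uint16_t by_top_byte[256];
};

/* Bulk lookup: writes flag | next_hop (or 0 on a miss) for each address. */
static void
toy_lookup_bulk(const struct toy_lpm *lpm, const uint32_t *ips,
		uint16_t *hops, unsigned n)
{
	unsigned i;

	for (i = 0; i < n; i++)
		hops[i] = lpm->by_top_byte[ips[i] >> 24];
}

/* Scalar x4 lookup: strip the flag on a hit, use defv on a miss. */
static void
toy_lookupx4(const struct toy_lpm *lpm, const uint32_t ip[4],
	     uint16_t hop[4], uint16_t defv)
{
	unsigned i;

	toy_lookup_bulk(lpm, ip, hop, 4);
	for (i = 0; i < 4; i++)
		hop[i] = (hop[i] & TOY_LOOKUP_SUCCESS) ?
			(uint8_t)hop[i] : defv;
}
```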



[dpdk-dev] [PATCH v3 15/17] eal/arm: add very incomplete rte_vect

2015-10-27 Thread Jan Viktorin
This patch does not map x86 SIMD operations to the ARM ones.
It just fills the necessary gap between the platforms to enable
compilation of libraries LPM (includes rte_vect.h, lpm_test needs
those SIMD functions) and ACL (includes rte_vect.h).

Signed-off-by: Jan Viktorin 
---
 lib/librte_eal/common/include/arch/arm/rte_vect.h | 81 +++
 1 file changed, 81 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
b/lib/librte_eal/common/include/arch/arm/rte_vect.h
new file mode 100644
index 000..b346c7d
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -0,0 +1,81 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_VECT_ARM_H_
+#define _RTE_VECT_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define XMM_SIZE 16
+#define XMM_MASK (XMM_SIZE - 1)
+
+typedef struct {
+   union uint128 {
+   uint8_t uint8[16];
+   uint32_t uint32[4];
+   } val;
+} __m128i;
+
+static inline __m128i
+_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
+{
+   __m128i res;
+   res.val.uint32[0] = v0;
+   res.val.uint32[1] = v1;
+   res.val.uint32[2] = v2;
+   res.val.uint32[3] = v3;
+   return res;
+}
+
+static inline __m128i
+_mm_loadu_si128(__m128i * v)
+{
+   __m128i res;
+   res = *v;
+   return res;
+}
+
+static inline __m128i
+_mm_load_si128(__m128i * v)
+{
+   __m128i res;
+   res = *v;
+   return res;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
2.6.1



[dpdk-dev] [PATCH v3 14/17] maintainers: claim responsibility for ARMv7

2015-10-27 Thread Jan Viktorin
Signed-off-by: Jan Viktorin 
---
 MAINTAINERS | 4 
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 080a8e8..a8933eb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -124,6 +124,10 @@ IBM POWER
 M: Chao Zhu 
 F: lib/librte_eal/common/include/arch/ppc_64/

+ARM v7
+M: Jan Viktorin 
+F: lib/librte_eal/common/include/arch/arm/
+
 Intel x86
 M: Bruce Richardson 
 M: Konstantin Ananyev 
-- 
2.6.1



[dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build

2015-10-27 Thread Jan Viktorin
There are several issues with alignment when compiling for ARMv7.
They are not considered to be fatal (ARMv7 supports unaligned
access of 32b words), so we just leave them as warnings. They
should be solved later, however.

Signed-off-by: Jan Viktorin 
Signed-off-by: Vlastimil Kosar 
---
 mk/toolchain/gcc/rte.vars.mk | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
index 0f51c66..8f9c396 100644
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
 WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
 WERROR_FLAGS += -Wundef -Wwrite-strings

+# There are many issues reported for ARMv7 architecture
+# which are not necessarily fatal. Report as warnings.
+ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
+WERROR_FLAGS += -Wno-error
+endif
+
 # process cpu flags
 include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk

-- 
2.6.1



[dpdk-dev] [PATCH v3 12/17] eal/arm: rwlock support for ARM

2015-10-27 Thread Jan Viktorin
Just a copy from PPC.

Signed-off-by: Jan Viktorin 
---
 .../common/include/arch/arm/rte_rwlock.h   | 40 ++
 1 file changed, 40 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_rwlock.h 
b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
new file mode 100644
index 000..664bec8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
@@ -0,0 +1,40 @@
+/* copied from ppc_64 */
+
+#ifndef _RTE_RWLOCK_ARM_H_
+#define _RTE_RWLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_rwlock.h"
+
+static inline void
+rte_rwlock_read_lock_tm(rte_rwlock_t *rwl)
+{
+   rte_rwlock_read_lock(rwl);
+}
+
+static inline void
+rte_rwlock_read_unlock_tm(rte_rwlock_t *rwl)
+{
+   rte_rwlock_read_unlock(rwl);
+}
+
+static inline void
+rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
+{
+   rte_rwlock_write_lock(rwl);
+}
+
+static inline void
+rte_rwlock_write_unlock_tm(rte_rwlock_t *rwl)
+{
+   rte_rwlock_write_unlock(rwl);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RWLOCK_ARM_H_ */
-- 
2.6.1



[dpdk-dev] [PATCH v3 11/17] eal/arm: detect arm architecture in cpu flags

2015-10-27 Thread Jan Viktorin
Based on the patch by David Hunt and Amruta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin 
Signed-off-by: Amruta Zende 
Signed-off-by: David Hunt 
---
v2 -> v3: fixed forgotten include of string.h
---
 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 16 
 1 file changed, 16 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h 
b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 1eadb33..7ce9d14 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -41,6 +41,7 @@ extern "C" {
 #include 
 #include 
 #include 
+#include 

 #include "generic/rte_cpuflags.h"

@@ -52,10 +53,15 @@ extern "C" {
 #define AT_HWCAP2 26
 #endif

+#ifndef AT_PLATFORM
+#define AT_PLATFORM 15
+#endif
+
 /* software based registers */
 enum cpu_register_t {
REG_HWCAP = 0,
REG_HWCAP2,
+   REG_PLATFORM,
 };

 /**
@@ -89,6 +95,8 @@ enum rte_cpu_flag_t {
RTE_CPUFLAG_SHA1,
RTE_CPUFLAG_SHA2,
RTE_CPUFLAG_CRC32,
+   RTE_CPUFLAG_AARCH32,
+   RTE_CPUFLAG_AARCH64,
/* The last item */
RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
 };
@@ -121,6 +129,8 @@ static const struct feature_entry cpu_feature_table[] = {
FEAT_DEF(SHA1,  0x0001, 0, REG_HWCAP2,  2)
FEAT_DEF(SHA2,  0x0001, 0, REG_HWCAP2,  3)
FEAT_DEF(CRC32, 0x0001, 0, REG_HWCAP2,  4)
+   FEAT_DEF(AARCH32,   0x0001, 0, REG_PLATFORM, 0)
+   FEAT_DEF(AARCH64,   0x0001, 0, REG_PLATFORM, 1)
 };

 /*
@@ -141,6 +151,12 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
out[REG_HWCAP] = auxv.a_un.a_val;
else if (auxv.a_type == AT_HWCAP2)
out[REG_HWCAP2] = auxv.a_un.a_val;
+   else if (auxv.a_type == AT_PLATFORM) {
+   if (!strcmp((const char *)auxv.a_un.a_val, "aarch32"))
+   out[REG_PLATFORM] = 0x0001;
+   else if (!strcmp((const char *)auxv.a_un.a_val, 
"aarch64"))
+   out[REG_PLATFORM] = 0x0002;
+   }
}
 }

-- 
2.6.1
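
The "software registers" filled from the auxiliary vector can be
sketched without touching /proc/self/auxv. In the sketch below the
toy_* names and the in-memory vector are assumptions for illustration;
the AT_* numbers are the usual Linux values that the patches also
define as fallbacks. Each entry is a type/value pair, the platform
entry carries a pointer to a string, and a feature is then tested as
one bit of one register.

```c
#include <stdint.h>
#include <string.h>

#define TOY_AT_NULL      0
#define TOY_AT_PLATFORM 15
#define TOY_AT_HWCAP    16
#define TOY_AT_HWCAP2   26

enum toy_reg {
	TOY_REG_HWCAP = 0,
	TOY_REG_HWCAP2,
	TOY_REG_PLATFORM,
	TOY_REG_COUNT
};

/* Hypothetical in-memory form of one auxiliary vector entry. */
struct toy_auxv {
	unsigned long type;
	uintptr_t val;		/* integer, or a pointer for AT_PLATFORM */
};

/* Walk the vector until AT_NULL and fill the software registers. */
static void
toy_get_features(const struct toy_auxv *v, uintptr_t out[TOY_REG_COUNT])
{
	memset(out, 0, sizeof(out[0]) * TOY_REG_COUNT);
	for (; v->type != TOY_AT_NULL; v++) {
		if (v->type == TOY_AT_HWCAP) {
			out[TOY_REG_HWCAP] = v->val;
		} else if (v->type == TOY_AT_HWCAP2) {
			out[TOY_REG_HWCAP2] = v->val;
		} else if (v->type == TOY_AT_PLATFORM) {
			const char *name = (const char *)v->val;

			if (!strcmp(name, "aarch32"))
				out[TOY_REG_PLATFORM] = 0x0001;
			else if (!strcmp(name, "aarch64"))
				out[TOY_REG_PLATFORM] = 0x0002;
		}
	}
}

/* A feature is one bit of one register, as in the feature table. */
static int
toy_flag_enabled(const uintptr_t *regs, enum toy_reg reg, unsigned bit)
{
	return (int)((regs[reg] >> bit) & 1u);
}
```

For example, a vector whose HWCAP entry has bit 12 set reports the
NEON position of the feature table as enabled, and a platform string
of "aarch64" sets bit 1 of the platform register.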



[dpdk-dev] [PATCH v3 10/17] eal/arm: cpu flag checks for ARM

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

This implementation is based on IBM POWER version of
rte_cpuflags. We use software emulation of HW capability
registers, because those are usually not directly accessible
from userspace on ARM.

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
 app/test/test_cpuflags.c   |   5 +
 .../common/include/arch/arm/rte_cpuflags.h | 177 +
 mk/rte.cpuflags.mk |   6 +
 3 files changed, 188 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 5b92061..557458f 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -115,6 +115,11 @@ test_cpuflags(void)
CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
 #endif

+#if defined(RTE_ARCH_ARM)
+   printf("Check for NEON:\t\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+#endif
+
 #if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
printf("Check for SSE:\t\t");
CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h 
b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
new file mode 100644
index 000..1eadb33
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -0,0 +1,177 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM_H_
+#define _RTE_CPUFLAGS_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include 
+#include 
+#include 
+#include 
+
+#include "generic/rte_cpuflags.h"
+
+#ifndef AT_HWCAP
+#define AT_HWCAP 16
+#endif
+
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
+
+/* software based registers */
+enum cpu_register_t {
+   REG_HWCAP = 0,
+   REG_HWCAP2,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+   RTE_CPUFLAG_SWP = 0,
+   RTE_CPUFLAG_HALF,
+   RTE_CPUFLAG_THUMB,
+   RTE_CPUFLAG_A26BIT,
+   RTE_CPUFLAG_FAST_MULT,
+   RTE_CPUFLAG_FPA,
+   RTE_CPUFLAG_VFP,
+   RTE_CPUFLAG_EDSP,
+   RTE_CPUFLAG_JAVA,
+   RTE_CPUFLAG_IWMMXT,
+   RTE_CPUFLAG_CRUNCH,
+   RTE_CPUFLAG_THUMBEE,
+   RTE_CPUFLAG_NEON,
+   RTE_CPUFLAG_VFPv3,
+   RTE_CPUFLAG_VFPv3D16,
+   RTE_CPUFLAG_TLS,
+   RTE_CPUFLAG_VFPv4,
+   RTE_CPUFLAG_IDIVA,
+   RTE_CPUFLAG_IDIVT,
+   RTE_CPUFLAG_VFPD32,
+   RTE_CPUFLAG_LPAE,
+   RTE_CPUFLAG_EVTSTRM,
+   RTE_CPUFLAG_AES,
+   RTE_CPUFLAG_PMULL,
+   RTE_CPUFLAG_SHA1,
+   RTE_CPUFLAG_SHA2,
+   RTE_CPUFLAG_CRC32,
+   /* The last item */
+   RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+   FEAT_DEF(SWP,   0x0001, 0, REG_HWCAP,  0)
+   FEAT_DEF(HALF,  0x0001, 0, REG_HWCAP,  1)
+   FEAT_DEF(THUMB, 0x0001, 0, REG_HWCAP,  2)
+   FEAT_DEF(A26BIT,0x0001, 0, REG_HWCAP,  3)
+   FEAT_DEF(FAST_MULT, 0x0001, 0, REG_HWCAP,  4)
+   FEAT_DEF(FPA,   0x0001, 0, REG_HWCAP,  5)
+   FEAT_DEF(VFP,   0x0001, 0, REG_HWCAP,  6)
+   FEAT_DEF(EDSP,  0x0001, 0, REG_HWCAP,  7)
+   FEAT_DEF(JAVA,  0x0001, 0, REG_HWCAP,  8)
+   FEAT_DEF(IWMMXT,0x0001, 0, REG_HWCAP,  9)
+   F

[dpdk-dev] [PATCH v3 09/17] eal/arm: use vector memcpy only when NEON is enabled

2015-10-27 Thread Jan Viktorin
GCC can be configured to avoid using NEON extensions.
For that case, we provide an implementation of rte_memcpy
that is based only on the plain libc memcpy.

Based on the patch by David Hunt and Amruta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin 
Signed-off-by: Amruta Zende 
Signed-off-by: David Hunt 
---
 .../common/include/arch/arm/rte_memcpy.h   | 59 +-
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h 
b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index ac885e9..75e8bda 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -35,8 +35,6 @@

 #include 
 #include 
-/* ARM NEON Intrinsics are used to copy data */
-#include 

 #ifdef __cplusplus
 extern "C" {
@@ -44,6 +42,11 @@ extern "C" {

 #include "generic/rte_memcpy.h"

+#ifdef __ARM_NEON_FP
+
+/* ARM NEON Intrinsics are used to copy data */
+#include 
+
 static inline void
 rte_mov16(uint8_t *dst, const uint8_t *src)
 {
@@ -263,6 +266,58 @@ rte_memcpy_func(void *dst, const void *src, size_t n)
return ret;
 }

+#else
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+   memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+   memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+   memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+   memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+   memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+   memcpy(dst, src, 256);
+}
+
+static inline void *
+rte_memcpy(void *dst, const void *src, size_t n)
+{
+   return memcpy(dst, src, n);
+}
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+   return memcpy(dst, src, n);
+}
+
+#endif /* __ARM_NEON_FP */
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.6.1
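
The compile-time selection introduced by this patch can be sketched as
below. This is an illustration, not the patch itself: the toy_* names
are hypothetical, and the guard uses the ACLE macro __ARM_NEON (plus
the older __ARM_NEON__ spelling) rather than the __ARM_NEON_FP macro
tested in the patch. With NEON available, 16 bytes are moved with one
vector load and one vector store; otherwise the same function falls
back to memcpy, so callers do not need to care which variant was built.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* NEON variant: one 128-bit load followed by one 128-bit store. */
static inline void
toy_mov16(uint8_t *dst, const uint8_t *src)
{
	vst1q_u8(dst, vld1q_u8(src));
}

#else

/* Fallback variant: let the compiler and libc handle the copy. */
static inline void
toy_mov16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

#endif

/* Larger fixed-size moves can be composed from the 16-byte primitive. */
static inline void
toy_mov32(uint8_t *dst, const uint8_t *src)
{
	toy_mov16(dst, src);
	toy_mov16(dst + 16, src + 16);
}
```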



[dpdk-dev] [PATCH v3 08/17] eal/arm: vector memcpy for ARM

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

The SSE based memory copy in DPDK only supports x86. This patch
adds ARM NEON based memory copy functions for ARM architecture.

The implementation improves memory copy of short or well aligned
data buffers. The following measurements show improvements over
the libc memcpy on Cortex CPUs.

                 by X % faster
Length (B)     a15      a7      a9
         1     4.9    15.2     3.2
         7    56.9    48.2    40.3
         8    37.3    39.8    29.6
         9    69.3    38.7    33.9
        15    60.8    35.3    23.7
        16    50.6    35.9    35.0
        17    57.7    35.7    31.1
        31    16.0    23.3     9.0
        32    65.9    13.5    21.4
        33     3.9    10.3    -3.7
        63     2.0    12.9    -2.0
        64    66.5     0.0    16.5
        65     2.7     7.6   -35.6
       127     0.1     4.5   -18.9
       128    66.2     1.5   -51.4
       129    -0.8     3.2   -35.8
       255    -3.1    -0.9   -69.1
       256    67.9     1.2     7.2
       257    -3.6    -1.9   -36.9
       320    67.7     1.4     0.0
       384    66.8     1.4   -14.2
       511   -44.9    -2.3   -41.9
       512    67.3     1.4    -6.8
       513   -41.7    -3.0   -36.2
      1023   -82.4    -2.8   -41.2
      1024    68.3     1.4   -11.6
      1025   -80.1    -3.3   -38.1
      1518   -47.3    -5.0   -38.3
      1522   -48.3    -6.0   -37.9
      1600    65.4     1.3   -27.3
      2048    59.5     1.5   -10.9
      3072    52.3     1.5   -12.2
      4096    45.3     1.4   -12.5
      5120    40.6     1.5   -14.5
      6144    35.4     1.4   -13.4
      7168    32.9     1.4   -13.9
      8192    28.2     1.4   -15.1

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
 .../common/include/arch/arm/rte_memcpy.h   | 270 +
 1 file changed, 270 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h 
b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
new file mode 100644
index 000..ac885e9
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -0,0 +1,270 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_ARM_H_
+#define _RTE_MEMCPY_ARM_H_
+
+#include 
+#include 
+/* ARM NEON Intrinsics are used to copy data */
+#include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+   vst1q_u8(dst, vld1q_u8(src));
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+   asm volatile ("vld1.8 {d0-d3}, [%[src]]\n\t"
+ "vst1.8 {d0-d3}, [%[dst]]\n\t"
+ : [src] "+r" (src), [dst] "+r" (dst)
+ : : "memory", "d0", "d1", "d2", "d3");
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+   asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+ "vld1.8 {d4-d5}, [%[src]]\n\t"
+ "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+ "vst1.8 {d4-d5}, [%[dst]]\n\t"
+ : [src] "+r" (src), [dst] "+r" (dst)
+ : : "memory", "d0", "d1", "d2", "d3", "d4", "d5");
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+   asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+ "vld1.8 {d4-d7}, [%[src]]\n\t"
+ "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+ "vst1.8 {d4-d7}, [%[dst]]\n

[dpdk-dev] [PATCH v3 07/17] eal/arm: spinlock operations for ARM (without HTM)

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

This patch adds spinlock operations for ARM architecture.
We do not support HTM in spinlocks on ARM.

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
 .../common/include/arch/arm/rte_spinlock.h | 114 +
 1 file changed, 114 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h 
b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
new file mode 100644
index 000..cd5ab8b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
@@ -0,0 +1,114 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_ARM_H_
+#define _RTE_SPINLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include 
+#include "generic/rte_spinlock.h"
+
+/* Intrinsics are used to implement the spinlock on ARM architecture */
+
+#ifndef RTE_FORCE_INTRINSICS
+
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+   while (__sync_lock_test_and_set(&sl->locked, 1))
+   while (sl->locked)
+   rte_pause();
+}
+
+static inline void
+rte_spinlock_unlock(rte_spinlock_t *sl)
+{
+   __sync_lock_release(&sl->locked);
+}
+
+static inline int
+rte_spinlock_trylock(rte_spinlock_t *sl)
+{
+   return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
+}
+
+#endif
+
+static inline int rte_tm_supported(void)
+{
+   return 0;
+}
+
+static inline void
+rte_spinlock_lock_tm(rte_spinlock_t *sl)
+{
+   rte_spinlock_lock(sl); /* fall-back */
+}
+
+static inline int
+rte_spinlock_trylock_tm(rte_spinlock_t *sl)
+{
+   return rte_spinlock_trylock(sl);
+}
+
+static inline void
+rte_spinlock_unlock_tm(rte_spinlock_t *sl)
+{
+   rte_spinlock_unlock(sl);
+}
+
+static inline void
+rte_spinlock_recursive_lock_tm(rte_spinlock_recursive_t *slr)
+{
+   rte_spinlock_recursive_lock(slr); /* fall-back */
+}
+
+static inline void
+rte_spinlock_recursive_unlock_tm(rte_spinlock_recursive_t *slr)
+{
+   rte_spinlock_recursive_unlock(slr);
+}
+
+static inline int
+rte_spinlock_recursive_trylock_tm(rte_spinlock_recursive_t *slr)
+{
+   return rte_spinlock_recursive_trylock(slr);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_ARM_H_ */
-- 
2.6.1
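
The locking scheme in this patch is a test-and-test-and-set spinlock
built on the GCC __sync builtins, with the *_tm variants simply falling
back to the plain lock. A stand-alone sketch is given below; the toy_*
names and the lock type are hypothetical, and the inner loop spins
without the rte_pause() hint used in the patch.

```c
/* Hypothetical lock type; mirrors a single "locked" word. */
typedef struct {
	volatile int locked;	/* 0 = free, 1 = taken */
} toy_spinlock_t;

static inline void
toy_spinlock_init(toy_spinlock_t *sl)
{
	sl->locked = 0;
}

static inline void
toy_spinlock_lock(toy_spinlock_t *sl)
{
	/* Atomic exchange; if it was already taken, wait on plain reads. */
	while (__sync_lock_test_and_set(&sl->locked, 1))
		while (sl->locked)
			;	/* spin until the lock looks free, then retry */
}

static inline int
toy_spinlock_trylock(toy_spinlock_t *sl)
{
	/* Returns 1 if the lock was acquired, 0 if it was already held. */
	return __sync_lock_test_and_set(&sl->locked, 1) == 0;
}

static inline void
toy_spinlock_unlock(toy_spinlock_t *sl)
{
	__sync_lock_release(&sl->locked);
}

/* No transactional memory: report that and fall back to the lock. */
static inline int
toy_tm_supported(void)
{
	return 0;
}

static inline void
toy_spinlock_lock_tm(toy_spinlock_t *sl)
{
	toy_spinlock_lock(sl);	/* fall-back */
}
```

Spinning on a plain read between exchange attempts keeps waiting cores
from repeatedly issuing atomic writes to the contended cache line.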



[dpdk-dev] [PATCH v3 06/17] eal/arm: prefetch operations for ARM

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

This patch adds architecture specific prefetch operations
for ARM architecture. It utilizes the pld instruction that
starts filling the appropriate cache line without blocking.

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
 .../common/include/arch/arm/rte_prefetch.h | 61 ++
 1 file changed, 61 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h 
b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
new file mode 100644
index 000..8d75fe6
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -0,0 +1,61 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_H_
+#define _RTE_PREFETCH_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+   asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+   asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+   asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_H_ */
-- 
2.6.1
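
For readers without an ARM toolchain at hand, the effect of the prefetch
helpers in the patch above can be approximated with the GCC/Clang builtin
__builtin_prefetch. This is only a sketch under that assumption;
prefetch_hint() and sum_with_prefetch() are hypothetical names and are not
part of the patch or of DPDK.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: hint that *p will be read soon (GCC/Clang builtin).
 * Second argument 0 = read access, third argument 3 = high temporal
 * locality (keep the line in all cache levels if possible).
 */
static inline void prefetch_hint(const volatile void *p)
{
	__builtin_prefetch((const void *)p, 0, 3);
}

/* Sum an array while prefetching `ahead` elements in front of the cursor. */
static uint64_t sum_with_prefetch(const uint64_t *data, size_t n, size_t ahead)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + ahead < n)
			prefetch_hint(&data[i + ahead]);
		sum += data[i];
	}
	return sum;
}
```

A prefetch is only a hint, so the result of the loop is the same with or
without it; only the timing may differ.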



[dpdk-dev] [PATCH v3 05/17] eal/arm: implement rdtsc by PMU or clock_gettime

2015-10-27 Thread Jan Viktorin
Make the way the timer is read selectable through the
configuration entry CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.
The PMU variant needs a kernel module, which is not included, to work.

Based on the patch by David Hunt and Amruta Zende:

  lib: added support for armv7 architecture

Signed-off-by: Jan Viktorin 
Signed-off-by: Amruta Zende 
Signed-off-by: David Hunt 
---
 .../common/include/arch/arm/rte_cycles.h   | 38 +-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index ff66ae2..5dcef25 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -54,8 +54,14 @@ extern "C" {
  * @return
  *   The time base for this lcore.
  */
+#ifndef CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
+
+/**
+ * This call is easily portable to any ARM architecture, however,
+ * it may be damn slow and imprecise for some tasks.
+ */
 static inline uint64_t
-rte_rdtsc(void)
+__rte_rdtsc_syscall(void)
 {
struct timespec val;
uint64_t v;
@@ -67,6 +73,36 @@ rte_rdtsc(void)
v += (uint64_t) val.tv_nsec;
return v;
 }
+#define rte_rdtsc __rte_rdtsc_syscall
+
+#else
+
+/**
+ * This function requires the PMCCNTR to be configured and userspace
+ * access to it to be enabled:
+ *
+ *  asm volatile("mcr p15, 0, %0, c9, c14, 0" : : "r"(1));
+ *  asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(29));
+ *  asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(0x800f));
+ *
+ * which is possible only from the privileged mode (kernel space).
+ */
+static inline uint64_t
+__rte_rdtsc_pmccntr(void)
+{
+   unsigned tsc;
+   uint64_t final_tsc;
+
+   /* Read PMCCNTR */
+   asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(tsc));
+   /* 1 tick = 64 clocks */
+   final_tsc = ((uint64_t)tsc) << 6;
+
+   return (uint64_t)final_tsc;
+}
+#define rte_rdtsc __rte_rdtsc_pmccntr
+
+#endif /* RTE_ARM_EAL_RDTSC_USE_PMU */

 static inline uint64_t
 rte_rdtsc_precise(void)
-- 
2.6.1
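
The PMCCNTR read in the patch above yields a 32-bit value that is then
scaled by 64, so the result still wraps around whenever the hardware
counter does. One possible way to obtain a monotonic 64-bit time line is
to track the wrap-arounds in software. The sketch below is not part of the
patch; struct pmu_clock and pmu_clock_update() are hypothetical names, and
it assumes the function is called at least once per counter wrap-around.

```c
#include <stdint.h>

/* Hypothetical software state that widens a 32-bit cycle counter. */
struct pmu_clock {
	uint32_t last;  /* raw value seen on the previous update */
	uint64_t high;  /* accumulated wrap-arounds, in units of 2^32 ticks */
};

/*
 * Fold a new raw 32-bit reading into the 64-bit time line.
 * Assumptions: called at least once per counter wrap-around, and the
 * counter runs with the 1/64 divider, hence the final shift by 6.
 */
static uint64_t pmu_clock_update(struct pmu_clock *c, uint32_t raw)
{
	if (raw < c->last)
		c->high += (uint64_t)1 << 32;  /* the counter wrapped */
	c->last = raw;
	return (c->high + raw) << 6;
}
```

The state would have to be kept per core, since each core has its own
cycle counter.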



[dpdk-dev] [PATCH v3 04/17] eal/arm: cpu cycle operations for ARM

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

ARM architecture doesn't have a suitable source of CPU cycles. This
patch uses clock_gettime instead. The implementation should be improved
in the future.

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
 .../common/include/arch/arm/rte_cycles.h   | 85 ++
 1 file changed, 85 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
new file mode 100644
index 0000000..ff66ae2
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -0,0 +1,85 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_ARM_H_
+#define _RTE_CYCLES_ARM_H_
+
+/* ARM v7 does not have suitable source of clock signals. The only clock counter
+   available in the core is 32 bit wide. Therefore it is unsuitable as the
+   counter overlaps every few seconds and probably is not accessible by
+   userspace programs. Therefore we use clock_gettime(CLOCK_MONOTONIC_RAW) to
+   simulate counter running at 1GHz.
+*/
+
+#include <time.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+   struct timespec val;
+   uint64_t v;
+
+   while (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0)
+   /* no body */;
+
+   v  = (uint64_t) val.tv_sec * 1000000000LL;
+   v += (uint64_t) val.tv_nsec;
+   return v;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+   rte_mb();
+   return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM_H_ */
-- 
2.6.1
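
The header comment in the patch above describes simulating a counter
running at 1 GHz, which means one tick per nanosecond: seconds are scaled
by 10^9 and the nanosecond part is added. A minimal sketch of that
conversion follows. The names timespec_to_ns() and read_virtual_tsc() are
hypothetical, and CLOCK_MONOTONIC is used here instead of the patch's
CLOCK_MONOTONIC_RAW purely for portability.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime() and CLOCK_MONOTONIC */
#endif
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds, i.e. ticks of a virtual 1 GHz counter. */
static uint64_t timespec_to_ns(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Hypothetical stand-in for rte_rdtsc() built on the conversion above. */
static uint64_t read_virtual_tsc(void)
{
	struct timespec ts;

	while (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		; /* retry until the call succeeds, as the patch does */
	return timespec_to_ns(&ts);
}
```

Since the clock is monotonic, two consecutive reads never go backwards,
which is the property most users of a cycle counter rely on.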



[dpdk-dev] [PATCH v3 03/17] eal/arm: byte order operations for ARM

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

This patch adds architecture specific byte order operations
for ARM. The architecture supports both big and little endian.

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
 .../common/include/arch/arm/rte_byteorder.h| 148 +
 1 file changed, 148 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
new file mode 100644
index 0000000..04e7b87
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
@@ -0,0 +1,148 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_ARM_H_
+#define _RTE_BYTEORDER_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+   register uint16_t x = _x;
+   asm volatile ("rev16 %[x1],%[x2]"
+ : [x1] "=r" (x)
+ : [x2] "r" (x)
+ );
+   return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+   register uint32_t x = _x;
+   asm volatile ("rev %[x1],%[x2]"
+ : [x1] "=r" (x)
+ : [x2] "r" (x)
+ );
+   return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+   return  __builtin_bswap64(_x);
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?   \
+  rte_constant_bswap16(x) :\
+  rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?   \
+  rte_constant_bswap32(x) :\
+  rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?   \
+  rte_constant_bswap64(x) :\
+  rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?   \
+  rte_constant_bswap16(x) :\
+  rte_arch_bswap16(x)))
+#endif
+#endif
+
+/* ARM architecture is bi-endian (both big and little). */
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#def
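
The byte order helpers in the patch above boil down to two things: a byte
swap, and a compile-time choice of whether a conversion is a swap or a
no-op. The sketch below shows the same idea in portable C, without inline
assembly. swap16(), swap32(), cpu_to_be16() and cpu_to_be32() are
hypothetical names chosen for illustration, and the byte-order test relies
on the __BYTE_ORDER__ macro predefined by GCC and Clang.

```c
#include <stdint.h>

/* Portable byte swap of a 16-bit value. */
static inline uint16_t swap16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Portable byte swap of a 32-bit value. */
static inline uint32_t swap32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8)  | ((x & 0xff000000u) >> 24);
}

/*
 * Hypothetical cpu-to-big-endian helpers. On a big-endian build they are
 * no-ops, otherwise they swap. Anything that is not known to be
 * big-endian is treated as little-endian here (an assumption).
 */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define cpu_to_be16(x) ((uint16_t)(x))
#define cpu_to_be32(x) ((uint32_t)(x))
#else
#define cpu_to_be16(x) swap16(x)
#define cpu_to_be32(x) swap32(x)
#endif
```

Whatever the host byte order, the first byte in memory of
cpu_to_be32(0x01020304) is 0x01.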

[dpdk-dev] [PATCH v3 02/17] eal/arm: atomic operations for ARM

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

This patch adds architecture specific atomic operation file
for ARM architecture. It utilizes compiler intrinsics only.

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
v1 -> v2:
* improve rte_wmb()
* use __atomic_* or __sync_*? (may affect the required GCC version)
---
 .../common/include/arch/arm/rte_atomic.h   | 256 +
 1 file changed, 256 insertions(+)
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h

diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
new file mode 100644
index 0000000..1815766
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -0,0 +1,256 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_ARM_H_
+#define _RTE_ATOMIC_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define rte_mb()  __sync_synchronize()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while(0)
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define rte_rmb() __sync_synchronize()
+
+/*- 16 bit atomic operations -*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+   return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+   __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+   return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+   __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+   __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+   return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+   return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*- 32 bit atomic operations -*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+   return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+   __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+   return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+   __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+   __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int r
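
The atomic helpers in the patch above are thin wrappers around the GCC
__atomic builtins. The sketch below shows the semantics of two of them on
a plain uint32_t. cmpset32() and inc_and_test32() are hypothetical names,
and sequentially consistent ordering is an assumption of this sketch (the
patch itself passes __ATOMIC_ACQUIRE).

```c
#include <stdint.h>

/*
 * Compare-and-set: store src into *dst only if *dst == exp.
 * Returns 1 on success and 0 otherwise.
 */
static inline int cmpset32(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
	return __atomic_compare_exchange_n(dst, &exp, src, 0,
			__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? 1 : 0;
}

/* Atomically increment *v and report whether the new value is zero. */
static inline int inc_and_test32(volatile uint32_t *v)
{
	return __atomic_add_fetch(v, 1, __ATOMIC_SEQ_CST) == 0;
}
```

A test-and-set in the style of the patch is then simply
cmpset32(&cnt, 0, 1): it succeeds only for the first caller that finds the
value still at zero.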

[dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture

2015-10-27 Thread Jan Viktorin
From: Vlastimil Kosar 

Make DPDK run on ARMv7-A architecture. This patch assumes
ARM Cortex-A9. However, it is known to be working on Cortex-A7
and Cortex-A15.

Signed-off-by: Vlastimil Kosar 
Signed-off-by: Jan Viktorin 
---
v1 -> v2:
* the -mtune parameter of GCC is configurable now
* the -mfpu=neon can be turned off

v2 -> v3: XMM_SIZE is defined in rte_vect.h in a following patch
---
 config/defconfig_arm-armv7-a-linuxapp-gcc | 75 +++
 mk/arch/arm/rte.vars.mk   | 39 
 mk/machine/armv7-a/rte.vars.mk| 67 +++
 3 files changed, 181 insertions(+)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
new file mode 100644
index 0000000..5a778cf
--- /dev/null
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -0,0 +1,75 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of RehiveTech nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv7-a"
+
+CONFIG_RTE_ARCH="arm"
+CONFIG_RTE_ARCH_ARM=y
+CONFIG_RTE_ARCH_ARMv7=y
+CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9"
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+# ARM doesn't have support for vmware TSC map
+CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
+
+# avoids using i686/x86_64 SIMD instructions, nothing for ARM
+CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
+
+# KNI is not supported on 32-bit
+CONFIG_RTE_LIBRTE_KNI=n
+
+# PCI is usually not used on ARM
+CONFIG_RTE_EAL_IGB_UIO=n
+
+# fails to compile on ARM
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_LPM=n
+
+# cannot use those on ARM
+CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_EM_PMD=n
+CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_CXGBE_PMD=n
+CONFIG_RTE_LIBRTE_E1000_PMD=n
+CONFIG_RTE_LIBRTE_ENIC_PMD=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+CONFIG_RTE_LIBRTE_IXGBE_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MPIPE_PMD=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
+CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
+CONFIG_RTE_LIBRTE_PMD_BNX2X=n
diff --git a/mk/arch/arm/rte.vars.mk b/mk/arch/arm/rte.vars.mk
new file mode 100644
index 0000000..df0c043
--- /dev/null
+++ b/mk/arch/arm/rte.vars.mk
@@ -0,0 +1,39 @@
+#   BSD LICENSE
+#
+#   Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of RehiveTech nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF ME

[dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture

2015-10-27 Thread Jan Viktorin
Hello DPDK community,

this is the third attempt to post support for ARMv7 into the DPDK.
There are changes related to the LPM and ACL libraries only:

* included rte_vect.h, however, it is more a placeholder
* rte_lpm.h was simplified due to the previous point
* ACL now compiles as we detect whether the compiler
  supports SSE 4.1

Regards
Jan

---

You can pull the changes from

  https://github.com/RehiveTech/dpdk.git arm-support-v3

since commit 67b38d93c960e3a650b49dec1cd8b53c12ee3e2d:

  e1000: fix total byte statistics (2015-10-27 18:39:44 +0100)

up to d84581e3681de076819d202b1f09f2751d28d5be:

acl: handle when SSE 4.1 is unsupported (2015-10-27 20:03:24 +0100)

---

Jan Viktorin (8):
  eal/arm: implement rdtsc by PMU or clock_gettime
  eal/arm: use vector memcpy only when NEON is enabled
  eal/arm: detect arm architecture in cpu flags
  eal/arm: rwlock support for ARM
  gcc/arm: avoid alignment errors to break build
  maintainers: claim responsibility for ARMv7
  eal/arm: add very incomplete rte_vect
  acl: handle when SSE 4.1 is unsupported

Vlastimil Kosar (9):
  mk: Introduce ARMv7 architecture
  eal/arm: atomic operations for ARM
  eal/arm: byte order operations for ARM
  eal/arm: cpu cycle operations for ARM
  eal/arm: prefetch operations for ARM
  eal/arm: spinlock operations for ARM (without HTM)
  eal/arm: vector memcpy for ARM
  eal/arm: cpu flag checks for ARM
  lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for
non-x86

 MAINTAINERS|   4 +
 app/test/test_cpuflags.c   |   5 +
 config/defconfig_arm-armv7-a-linuxapp-gcc  |  74 +
 lib/librte_acl/Makefile|   7 +-
 lib/librte_acl/rte_acl.c   |  19 +-
 .../common/include/arch/arm/rte_atomic.h   | 256 
 .../common/include/arch/arm/rte_byteorder.h| 148 ++
 .../common/include/arch/arm/rte_cpuflags.h | 193 
 .../common/include/arch/arm/rte_cycles.h   | 121 
 .../common/include/arch/arm/rte_memcpy.h   | 325 +
 .../common/include/arch/arm/rte_prefetch.h |  61 
 .../common/include/arch/arm/rte_rwlock.h   |  40 +++
 .../common/include/arch/arm/rte_spinlock.h | 114 
 lib/librte_eal/common/include/arch/arm/rte_vect.h  |  81 +
 lib/librte_lpm/rte_lpm.h   |  24 +-
 mk/arch/arm/rte.vars.mk|  39 +++
 mk/machine/armv7-a/rte.vars.mk |  67 +
 mk/rte.cpuflags.mk |   6 +
 mk/toolchain/gcc/rte.vars.mk   |   6 +
 19 files changed, 1584 insertions(+), 6 deletions(-)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

-- 
2.6.1



[dpdk-dev] [PATCH v4 0/2] e1000: enable igb TSO support

2015-10-27 Thread Thomas Monjalon
> > v4:
> > * Added ULL postfix to compare mask of igb_tx_offload.
> > 
> > v3:
> > * Removed the "unlikely" in check_tso_para function, for there was no
> > obvious performance
> >   difference, let the branch predictor do the job.
> > 
> > v2:
> > * Reworded the old comment about union igb_vlan_macip which was no
> > more used.
> > 
> > * Corrected typo in line "There're some limitaions in hardware for TCP
> > segmentaion offload".
> > 
> > * Added "unlikely" in check_tso_para function.
> > 
> > v1:
> > * Initial version for igb TSO feature.
> > 
> > Wang Xiao W (2):
> >   e1000: enable igb TSO support
> >   doc: update release note for igb TSO support
> Acked-by: Wenzhuo Lu 

Applied, thanks


[dpdk-dev] [PATCH 10/11] mk: add makefile and config changes for armv8 architecture

2015-10-27 Thread Jan Viktorin
On Mon, 26 Oct 2015 17:22:01 +0100
Jan Viktorin  wrote:

> On Fri, 23 Oct 2015 15:17:12 +0100
> David Hunt  wrote:
> 
> >  
> > +# ARMv8 CPU flags
> > +ifneq ($(filter $(AUTO_CPUFLAGS),__aarch64__),)

I do not believe that this works. The function filter accepts
arguments swapped. I.e. first a pattern and then the list of
filtered data. I suppose, __aarch64__ is the pattern...

Jan

> > +CPUFLAGS += AARCH64
> > +endif
> > +
> > +ifneq ($(filter $(AUTO_CPUFLAGS),__aarch32__),)
> > +CPUFLAGS += AARCH32
> > +endif
> > +  
> 
> I think, this should go with the ARMv7 series.

> 
> Jan
> 



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic


[dpdk-dev] [PATCH v3 1/2] e1000: enable igb TSO support

2015-10-27 Thread Thomas Monjalon
Please guys, allow us to read your discussion by removing the useless lines.
It is hard enough to follow without the added constraint of hunting for the
interesting lines.

Thanks

PS: do not hesitate to spread the word around you,
Patchwork and the MUAs will thank you.


[dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP descriptor for devices newer than 82598

2015-10-27 Thread Ananyev, Konstantin

Hi lads,

> -Original Message-
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Tuesday, October 27, 2015 6:48 PM
> To: Thomas Monjalon; Ananyev, Konstantin; Zhang, Helin
> Cc: dev at dpdk.org; Kirsher, Jeffrey T; Brandeburg, Jesse
> Subject: Re: [dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP 
> descriptor for devices newer than 82598
> 
> 
> 
> On 10/27/15 20:09, Thomas Monjalon wrote:
> > Any Follow-up to this discussion?
> > Should we mark this patch as rejected?
> 
> Hmmm... This patch fixes an obvious spec violation. Why would it be
> rejected?

No, I don't think we can reject the patch:
there is a reproducible TX hang in the ixgbe PMD under the described conditions.
However, as I explained here:
http://dpdk.org/ml/archives/dev/2015-September/023574.html
Vlad's patch would cause quite a big slowdown.
We are still waiting for an answer from the HW guys on whether there are any
alternatives that would fix the problem and avoid the slowdown.
Konstantin  

> 
> >
> > 2015-08-24 11:11, Vlad Zolotarov:
> >> On 08/20/15 18:37, Vlad Zolotarov wrote:
> >>> According to 82599 and x540 HW specifications RS bit *must* be
> >>> set in the last descriptor of *every* packet.
> >>>
> >>> Before this patch there were 3 types of Tx callbacks that were setting
> >>> RS bit every tx_rs_thresh descriptors. This patch introduces a set of
> >>> new callbacks, one for each type mentioned above, that will set the RS
> >>> bit in every EOP descriptor.
> >>>
> >>> ixgbe_set_tx_function() will set the appropriate Tx callback according
> >>> to the device family.
> >> [+Jesse and Jeff]
> >>
> >> I've started to look at the i40e PMD and it has the same RS bit
> >> deferring logic
> >> as ixgbe PMD has (surprise, surprise!.. ;)). To recall, i40e PMD uses a
> >> descriptor write-back
> >> completion mode.
> >>
> >>   From the HW Spec it's unclear if RS bit should be set on *every* 
> >> descriptor
> >> with EOP bit. However I noticed that Linux driver, before it moved to
> >> HEAD write-back mode, was setting RS
> >> bit on every EOP descriptor.
> >>
> >> So, here is a question to Intel guys: could u, pls., clarify this point?
> >>
> >> Thanks in advance,
> >> vlad
> >
> >
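
The trade-off discussed in this thread can be illustrated with a small
model: setting RS on every EOP descriptor requests one status write-back
per packet, while the threshold scheme requests one only after roughly
tx_rs_thresh descriptors. The sketch below is a simplified model, not the
ixgbe PMD code. The flag values DESC_EOP and DESC_RS, the enum and the
function mark_rs() are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor flag bits, for illustration only. */
#define DESC_EOP 0x1u /* last descriptor of a packet */
#define DESC_RS  0x2u /* request a status write-back */

enum rs_policy { RS_EVERY_THRESH, RS_EVERY_EOP };

/*
 * Walk n descriptors and set RS according to the policy.
 * RS is only ever placed on an EOP descriptor. With RS_EVERY_THRESH it is
 * set on the first EOP after at least `thresh` descriptors were used.
 * Returns the number of RS bits set, i.e. requested write-backs.
 */
static unsigned mark_rs(uint32_t *flags, size_t n, enum rs_policy policy,
			unsigned thresh)
{
	unsigned used = 0, rs_count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		used++;
		if (!(flags[i] & DESC_EOP))
			continue;
		if (policy == RS_EVERY_EOP || used >= thresh) {
			flags[i] |= DESC_RS;
			rs_count++;
			used = 0;
		}
	}
	return rs_count;
}
```

For four two-descriptor packets and a threshold of 4, this model requests
two write-backs under the threshold policy and four under the per-EOP
policy. The extra write-backs are one plausible source of the slowdown
mentioned above.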



[dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP descriptor for devices newer than 82598

2015-10-27 Thread Thomas Monjalon
Any Follow-up to this discussion?
Should we mark this patch as rejected?

2015-08-24 11:11, Vlad Zolotarov:
> On 08/20/15 18:37, Vlad Zolotarov wrote:
> > According to 82599 and x540 HW specifications RS bit *must* be
> > set in the last descriptor of *every* packet.
> >
> > Before this patch there were 3 types of Tx callbacks that were setting
> > RS bit every tx_rs_thresh descriptors. This patch introduces a set of
> > new callbacks, one for each type mentioned above, that will set the RS
> > bit in every EOP descriptor.
> >
> > ixgbe_set_tx_function() will set the appropriate Tx callback according
> > to the device family.
> 
> [+Jesse and Jeff]
> 
> I've started to look at the i40e PMD and it has the same RS bit 
> deferring logic
> as ixgbe PMD has (surprise, surprise!.. ;)). To recall, i40e PMD uses a 
> descriptor write-back
> completion mode.
> 
>  From the HW Spec it's unclear if RS bit should be set on *every* descriptor
> with EOP bit. However I noticed that Linux driver, before it moved to 
> HEAD write-back mode, was setting RS
> bit on every EOP descriptor.
> 
> So, here is a question to Intel guys: could u, pls., clarify this point?
> 
> Thanks in advance,
> vlad





[dpdk-dev] [PATCH v2 4/5] doc: add documentation for szedata2 PMD

2015-10-27 Thread Thomas Monjalon
2015-10-27 18:33, Matej Vido:
> 2015-10-26 16:09 GMT+01:00 Thomas Monjalon :
> > SZEDATA2 is not a vdev. Is it possible to probe it as a standard PCI
> > device?
> >
> It would be possible to probe it as a standard PCI device, but as this
> szedata2 driver uses libsze2 library it needs to pass some parameters to
> the library and we thought that using vdev would be the easiest solution.
> Is there a way how to provide parameters to pdev driver?

No, but it could be added by working on unifying pdev and vdev.

> We also work on a new PMD which will eliminate dependencies on kernel
> modules and libsze2, and using this PMD COMBO card will be probed as a
> standard PCI device.

I think it's better to wait for your new implementation.
It would be nice to have it in the release 2.2.
Do you think it could be submitted in the coming weeks?


[dpdk-dev] [PATCH] enic: improve Tx packet rate

2015-10-27 Thread Thomas Monjalon
2015-10-23 15:47, John Daley:
> For every packet sent, a completion was being requested and the
> posted_index register on the nic was being updated. Instead,
> request a completion and update the posted index once per burst
> after all packets have been sent by the burst function.
> 
> Signed-off-by: John Daley 
> Acked-by: Sujith Sankar 

Unfortunately, checkpatch.pl is against this patch,
mainly because of the indentation.
It should be easy to fix ;)
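
The idea behind the quoted commit message can be modelled in a few lines:
instead of one completion request and one posted_index register write per
packet, both are done once per burst. The sketch below only counts the
operations. struct txq_model and the functions around it are hypothetical
and do not reflect the enic driver's actual data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a Tx queue; not the enic driver's structures. */
struct txq_model {
	uint32_t posted_index;     /* value last written to the NIC register */
	uint32_t next;             /* software producer index */
	unsigned doorbell_writes;  /* number of (expensive) register writes */
	unsigned completions;      /* number of completions requested */
};

/* Publish the producer index to the (modelled) NIC register. */
static void ring_doorbell(struct txq_model *q)
{
	q->posted_index = q->next;
	q->doorbell_writes++;
}

/* Per-packet scheme: one completion request and one doorbell per packet. */
static void send_burst_per_packet(struct txq_model *q, size_t nb_pkts)
{
	size_t i;

	for (i = 0; i < nb_pkts; i++) {
		q->next++;
		q->completions++;
		ring_doorbell(q);
	}
}

/* Batched scheme: one completion request and one doorbell per burst. */
static void send_burst_batched(struct txq_model *q, size_t nb_pkts)
{
	if (nb_pkts == 0)
		return;
	q->next += (uint32_t)nb_pkts;
	q->completions++;
	ring_doorbell(q);
}
```

For a burst of 32 packets both schemes end with the same posted index, but
the batched one performs a single register write instead of 32.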



[dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP descriptor for devices newer than 82598

2015-10-27 Thread Brandeburg, Jesse
+ixgbe developers.

-Original Message-
From: Vlad Zolotarov [mailto:vl...@cloudius-systems.com] 
Sent: Tuesday, October 27, 2015 11:48 AM
To: Thomas Monjalon; Ananyev, Konstantin; Zhang, Helin
Cc: dev at dpdk.org; Kirsher, Jeffrey T; Brandeburg, Jesse
Subject: Re: [dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP 
descriptor for devices newer than 82598



On 10/27/15 20:09, Thomas Monjalon wrote:
> Any Follow-up to this discussion?
> Should we mark this patch as rejected?

Hmmm... This patch fixes an obvious spec violation. Why would it be 
rejected?

>
> 2015-08-24 11:11, Vlad Zolotarov:
>> On 08/20/15 18:37, Vlad Zolotarov wrote:
>>> According to 82599 and x540 HW specifications RS bit *must* be
>>> set in the last descriptor of *every* packet.
>>>
>>> Before this patch there were 3 types of Tx callbacks that were setting
>>> RS bit every tx_rs_thresh descriptors. This patch introduces a set of
>>> new callbacks, one for each type mentioned above, that will set the RS
>>> bit in every EOP descriptor.
>>>
>>> ixgbe_set_tx_function() will set the appropriate Tx callback according
>>> to the device family.
>> [+Jesse and Jeff]
>>
>> I've started to look at the i40e PMD and it has the same RS bit
>> deferring logic
>> as ixgbe PMD has (surprise, surprise!.. ;)). To recall, i40e PMD uses a
>> descriptor write-back
>> completion mode.
>>
>>   From the HW Spec it's unclear if RS bit should be set on *every* descriptor
>> with EOP bit. However I noticed that Linux driver, before it moved to
>> HEAD write-back mode, was setting RS
>> bit on every EOP descriptor.
>>
>> So, here is a question to Intel guys: could u, pls., clarify this point?
>>
>> Thanks in advance,
>> vlad
>
>
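
To make the two policies discussed in this thread concrete, here is a small model of how the RS bit ends up distributed over a run of descriptors. It is a sketch under assumptions, not ixgbe code: the flag values and toy_fill_flags() are hypothetical, and the "deferred" branch only approximates the tx_rs_thresh behaviour described above (RS on the first EOP descriptor reached once at least rs_thresh descriptors have been used since the previous RS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TXD_EOP 0x1u /* hypothetical: end of packet */
#define TOY_TXD_RS  0x2u /* hypothetical: report status */

/*
 * Fill flags[] for a sequence of packets, where segs[p] is the number of
 * descriptors used by packet p. Returns the number of descriptors written.
 * If rs_every_eop is non-zero, RS is set on each EOP descriptor; otherwise
 * RS is set on the first EOP descriptor reached after at least rs_thresh
 * descriptors have been used since the previous RS.
 */
static size_t
toy_fill_flags(const uint8_t *segs, size_t nb_pkts, uint8_t *flags,
	       unsigned int rs_thresh, int rs_every_eop)
{
	size_t p, d = 0;
	unsigned int used = 0;

	assert(rs_thresh > 0);
	for (p = 0; p < nb_pkts; p++) {
		uint8_t s;

		for (s = 0; s < segs[p]; s++) {
			flags[d] = 0;
			used++;
			if (s == segs[p] - 1) {
				/* last descriptor of this packet */
				flags[d] |= TOY_TXD_EOP;
				if (rs_every_eop || used >= rs_thresh) {
					flags[d] |= TOY_TXD_RS;
					used = 0;
				}
			}
			d++;
		}
	}
	return d;
}
```

For packets of 1, 2, 1 and 3 segments with rs_thresh = 4, the deferred policy marks only the fourth descriptor with RS, while the every-EOP policy marks all four EOP descriptors.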



[dpdk-dev] [PATCH] librte: Link status interrupt race condition, IGB E1000

2015-10-27 Thread Thomas Monjalon
2015-10-26 05:25, Lu, Wenzhuo:
> I think you're right. In my opinion, this "if" was added to avoid the race 
> condition, so it should be "dev->data->dev_conf.intr_conf.lsc == 0". That 
> means that if the interrupts are not enabled, we update the link when starting; 
> otherwise we can leave it to the interrupt handler.
> It seems this is not an igb-specific issue but a common one.

Tim, please could you send an appropriate patch?
The procedure is described in http://dpdk.org/dev#send

Could you also check if other PMDs have the same bug?
Thanks




[dpdk-dev] [PATCH] e1000: fix rx/tx total byte statistics

2015-10-27 Thread Thomas Monjalon
> > This patch fixes a bug in reading the 64-bit registers, which was
> > causing the total octets counters to show zero.
> > Now the code reads both the lower and higher 32 bits.
> > Tested in testpmd, byte values are correct.
> > 
> > Fixes: 805803445a02 ("e1000: support EM devices (also known as
> > e1000/e1000e)")
> > 
> > Signed-off-by: Harry van Haaren 
> Acked-by: Wenzhuo Lu 

It was an old bug :)

Applied, thanks
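
For readers who have not looked at the fix, the principle of reading a 64-bit counter that is exposed as two 32-bit registers can be sketched as follows. This is not the e1000 code: the register indices and toy_read_reg() are hypothetical stand-ins, and any hardware-specific read-ordering or clear-on-read rules are deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file: 32-bit words indexed by register number. */
#define TOY_REG_TORL 0 /* total octets received, low 32 bits */
#define TOY_REG_TORH 1 /* total octets received, high 32 bits */
#define TOY_NB_REGS  2

static uint32_t
toy_read_reg(const volatile uint32_t *regs, unsigned int idx)
{
	assert(idx < TOY_NB_REGS);
	return regs[idx];
}

/* Combine both halves; reading only one half loses part of the counter. */
static uint64_t
toy_read_octets(const volatile uint32_t *regs)
{
	uint64_t lo = toy_read_reg(regs, TOY_REG_TORL);
	uint64_t hi = toy_read_reg(regs, TOY_REG_TORH);

	/* widen before shifting so the high half is not truncated */
	return lo | (hi << 32);
}
```

The cast to a 64-bit type before the shift is the important detail: shifting a 32-bit value left by 32 would not give the intended result.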


[dpdk-dev] [PATCH v2 3/5] szedata2: add handling of scattered packets in TX

2015-10-27 Thread Matej Vido
Hi Thomas,

2015-10-26 15:55 GMT+01:00 Thomas Monjalon :

> Hi Matej,
>
> 2015-09-18 10:32, Matej Vido:
> > - rte_memcpy(tmp_dst,
> > - rte_pktmbuf_mtod(mbuf, const void *),
> > - pkt_len);
> > + if (likely(mbuf_segs == 1)) {
> > + /*
> > +  * non-scattered packet,
> > +  * transmit from one mbuf
> > +  */
> > + rte_memcpy(tmp_dst,
> > + rte_pktmbuf_mtod(mbuf, const void *),
> > + pkt_len);
>
> You could avoid this change by keeping "if (likely(mbuf_segs == 1))"
> in the first patch.
> By the way, it seems to be an abusive use of "likely".
>
>
I will edit it in v3. Thanks.

Best regards,
Matej Vido
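
As an illustration of what handling scattered packets on the TX side involves, the sketch below copies a chain of segments into one contiguous buffer. It is not the szedata2 code: struct toy_seg is a hypothetical stand-in for a chained mbuf, and plain memcpy() is used where the patch uses rte_memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one segment of a chained mbuf. */
struct toy_seg {
	const void *data;           /* start of this segment's payload */
	uint16_t data_len;          /* bytes in this segment */
	const struct toy_seg *next; /* next segment, NULL for the last one */
};

/*
 * Copy every segment of a packet into dst (capacity dst_len).
 * Returns the number of bytes copied, or 0 if the packet does not fit.
 */
static size_t
toy_copy_pkt(void *dst, size_t dst_len, const struct toy_seg *seg)
{
	uint8_t *out = dst;
	size_t total = 0;

	assert(dst != NULL);
	for (; seg != NULL; seg = seg->next) {
		if (seg->data_len > dst_len - total)
			return 0; /* not enough room left */
		memcpy(out + total, seg->data, seg->data_len);
		total += seg->data_len;
	}
	return total;
}
```

A single-segment packet goes through the same loop exactly once, so in this model no separate branch is needed for the non-scattered case.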


[dpdk-dev] [PATCH v6 6/9] test: dynamic rss configuration

2015-10-27 Thread Thomas Monjalon
This new test depends on null PMD.
The dependency should be checked gracefully, see below.

2015-10-16 12:00, Tomasz Kulasek:
> --- a/app/test/Makefile
> +++ b/app/test/Makefile
> @@ -138,6 +138,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += test_acl.c
>  ifeq ($(CONFIG_RTE_LIBRTE_PMD_RING),y)
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += test_link_bonding.c
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += test_link_bonding_mode4.c

Why not enclose in ifeq ($(CONFIG_RTE_LIBRTE_PMD_NULL),y)?

> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += test_link_bonding_rssconf.c
>  endif
>  
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_RING) += test_pmd_ring.c
> @@ -168,6 +169,13 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
>  LDLIBS += -lrte_pmd_ring
>  endif
>  endif
> +ifneq ($(CONFIG_RTE_LIBRTE_PMD_NULL),y)
> +$(error Link bonding rssconf tests require CONFIG_RTE_LIBRTE_PMD_NULL=y)

Not needed if handled as suggested above.
The build should not fail because a module is disabled.

> +else
> +ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
> +LDLIBS += -lrte_pmd_null
> +endif
> +endif



[dpdk-dev] [PATCH v2 4/5] doc: add documentation for szedata2 PMD

2015-10-27 Thread Matej Vido
Hi Thomas,

thank you for feedback.

2015-10-26 16:09 GMT+01:00 Thomas Monjalon :

> Hi Matej,
>
> Thanks for providing documentation.
> I'm sorry for the late feedback; I wish other contributors
> had reviewed it. There are a lot of PMD developers around. Please help.
>
>
> 2015-09-18 10:32, Matej Vido:
> > +- **libsze2**
> > +
> > +  This library provides API for initialization of sze2 transfers, receiving and
> > +  transmitting data segments.
>
> Please provide more information to help installing the dependencies.
>

Dependencies can be installed from RPM packages. I will add that
information in the new patch version.


>
> > +SZEDATA2 PMD can be created by passing --vdev= option to EAL in the following
> > +format:
> > +
> > +.. code-block:: console
> > +
> > +--vdev 'DEVICE_NAME,dev_path=PATH_TO_SZEDATA2_DEVICE,rx_ifaces=RX_MASK,tx_ifaces=TX_MASK'
>
> SZEDATA2 is not a vdev. Is it possible to probe it as a standard PCI
> device?
>
>
It would be possible to probe it as a standard PCI device, but as this
szedata2 driver uses the libsze2 library, it needs to pass some parameters to
the library, and we thought that using a vdev would be the easiest solution.
Is there a way to provide parameters to a pdev driver?
We are also working on a new PMD which will eliminate the dependencies on kernel
modules and libsze2; with this PMD, the COMBO card will be probed as a
standard PCI device.

Best regards,
Matej Vido
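
The --vdev argument format quoted above is a device name followed by comma-separated key=value pairs. A minimal way to pull one value out of such a string is sketched below. toy_get_arg() is a hypothetical helper written only for illustration; DPDK has its own kvargs library for this job, which the sketch avoids in order to stay self-contained, and nothing here is taken from the szedata2 patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up "key=value" in a comma-separated argument string and copy the
 * value into out. Returns 0 on success, -1 if the key is absent or the
 * value does not fit into out_len bytes (including the terminator).
 */
static int
toy_get_arg(const char *args, const char *key, char *out, size_t out_len)
{
	size_t key_len = strlen(key);
	const char *p = args;

	assert(out != NULL && out_len > 0);
	while (p != NULL && *p != '\0') {
		const char *end = strchr(p, ',');
		size_t item_len = end ? (size_t)(end - p) : strlen(p);

		/* an item matches if it starts with "key=" */
		if (item_len > key_len && p[key_len] == '=' &&
		    strncmp(p, key, key_len) == 0) {
			size_t val_len = item_len - key_len - 1;

			if (val_len >= out_len)
				return -1;
			memcpy(out, p + key_len + 1, val_len);
			out[val_len] = '\0';
			return 0;
		}
		p = end ? end + 1 : NULL; /* move to the next item */
	}
	return -1;
}
```

The leading device name has no "=" and is simply skipped by the matching rule, so the same function works on the whole --vdev string.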


[dpdk-dev] [PATCH v5] mem: command line option to delete hugepage backing files

2015-10-27 Thread shesha Sreenivasamurthy (shesha)
When an application using hugepages crashes or exits, the hugetlbfs
backing files are not cleaned up. This patch cleans up those files.
Some multi-process DPDK applications may benefit from those
backing files, so I have made the cleanup configurable: an
application that does not need the backing files can remove them, and
the current default behavior does not change. The application itself could
clean them up, but the rationale for DPDK doing it is that DPDK
created the files, so it is better that DPDK unlinks them.

Signed-off-by: Shesha Sreenivasamurthy 
Acked-by: Sergio Gonzalez Monroy 
---
 lib/librte_eal/common/eal_common_options.c | 12 
 lib/librte_eal/common/eal_internal_cfg.h   |  1 +
 lib/librte_eal/common/eal_options.h|  2 ++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 30
++
 4 files changed, 45 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_options.c
b/lib/librte_eal/common/eal_common_options.c
index 1f459ac..5fe6374 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -79,6 +79,7 @@ eal_long_options[] = {
{OPT_MASTER_LCORE,  1, NULL, OPT_MASTER_LCORE_NUM },
{OPT_NO_HPET,   0, NULL, OPT_NO_HPET_NUM  },
{OPT_NO_HUGE,   0, NULL, OPT_NO_HUGE_NUM  },
+   {OPT_HUGE_UNLINK,   0, NULL, OPT_HUGE_UNLINK_NUM  },
{OPT_NO_PCI,0, NULL, OPT_NO_PCI_NUM   },
{OPT_NO_SHCONF, 0, NULL, OPT_NO_SHCONF_NUM},
{OPT_PCI_BLACKLIST, 1, NULL, OPT_PCI_BLACKLIST_NUM},
@@ -722,6 +723,10 @@ eal_parse_common_option(int opt, const char *optarg,
conf->no_hugetlbfs = 1;
break;

+   case OPT_HUGE_UNLINK_NUM:
+   conf->hugepage_unlink = 1;
+   break;
+
case OPT_NO_PCI_NUM:
conf->no_pci = 1;
break;
@@ -856,6 +861,12 @@ eal_check_common_options(struct internal_config
*internal_cfg)
return -1;
}

+   if (internal_cfg->no_hugetlbfs && internal_cfg->hugepage_unlink) {
+   RTE_LOG(ERR, EAL, "Option --"OPT_HUGE_UNLINK" cannot "
+   "be specified together with --"OPT_NO_HUGE"\n");
+   return -1;
+   }
+
if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) != 0 &&
rte_eal_devargs_type_count(RTE_DEVTYPE_BLACKLISTED_PCI) != 0) {
RTE_LOG(ERR, EAL, "Options blacklist (-b) and whitelist (-w) "
@@ -906,6 +917,7 @@ eal_common_usage(void)
   "  -h, --help  This help\n"
   "\nEAL options for DEBUG use only:\n"
   "  --"OPT_NO_HUGE"   Use malloc instead of hugetlbfs\n"
+  "  --"OPT_HUGE_UNLINK"   Unlink hugepage backing file after
initialization\n"
   "  --"OPT_NO_PCI"Disable PCI\n"
   "  --"OPT_NO_HPET"   Disable HPET\n"
   "  --"OPT_NO_SHCONF" No shared config (mmap'd files)\n"
diff --git a/lib/librte_eal/common/eal_internal_cfg.h
b/lib/librte_eal/common/eal_internal_cfg.h
index e2ecb0d..84b075f 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -64,6 +64,7 @@ struct internal_config {
volatile unsigned force_nchannel; /**< force number of channels */
volatile unsigned force_nrank;/**< force number of ranks */
volatile unsigned no_hugetlbfs;   /**< true to disable hugetlbfs */
+   volatile unsigned hugepage_unlink; /** < true to unlink backing files */
volatile unsigned xen_dom0_support; /**< support app running on Xen
Dom0*/
volatile unsigned no_pci; /**< true to disable PCI */
volatile unsigned no_hpet;/**< true to disable HPET */
diff --git a/lib/librte_eal/common/eal_options.h
b/lib/librte_eal/common/eal_options.h
index f6714d9..745f38c 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -63,6 +63,8 @@ enum {
OPT_PROC_TYPE_NUM,
 #define OPT_NO_HPET   "no-hpet"
OPT_NO_HPET_NUM,
+#define OPT_HUGE_UNLINK"huge-unlink"
+   OPT_HUGE_UNLINK_NUM,
 #define OPT_NO_HUGE   "no-huge"
OPT_NO_HUGE_NUM,
 #define OPT_NO_PCI"no-pci"
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index ac2745e..c7e2485 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -786,6 +786,28 @@ copy_hugepages_to_shared_mem(struct hugepage_file *
dst, int dest_size,
return 0;
 }

+static int
+unlink_hugepage_files(struct hugepage_file *hugepg_tbl,
+   unsigned num_hp_info)
+{
+   unsigned socket, size;
+   int page, nrpages = 0;
+
+   /* get total number of hugepages */
+   for (size = 0; size < 
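
The mechanism the option relies on is ordinary POSIX behaviour: once a file has been mapped, its name can be unlinked and the mapping stays usable until it is unmapped. The sketch below shows this with a regular temporary file rather than a hugetlbfs file, which is an assumption made so that it runs anywhere; toy_map_and_unlink() is a hypothetical helper and not part of the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create a backing file from a mkstemp() template, map it, then unlink it.
 * Returns the mapping (still valid after the unlink) or NULL on error.
 * path_template is modified in place and names the removed file.
 */
static void *
toy_map_and_unlink(char *path_template, size_t len)
{
	void *va;
	int fd;

	assert(len > 0);
	fd = mkstemp(path_template);
	if (fd < 0)
		return NULL;
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		unlink(path_template);
		return NULL;
	}
	va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);             /* the mapping keeps the file contents alive */
	unlink(path_template); /* remove the name; no file is left behind */
	return va == MAP_FAILED ? NULL : va;
}
```

The trade-off matches the one stated in the commit message: once the name is gone, another process can no longer open the file to attach to the same memory, which is why the behaviour is opt-in.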

[dpdk-dev] [PATCH] vhost: Fix wrong handling of virtqueue array index

2015-10-27 Thread Tetsuya Mukawa
On 2015/10/27 17:39, Yuanhan Liu wrote:
> On Tue, Oct 27, 2015 at 08:24:00AM +, Xie, Huawei wrote:
>> On 10/27/2015 3:52 PM, Tetsuya Mukawa wrote:
>>> The patch fixes wrong handling of virtqueue array index when
>>> GET_VRING_BASE message comes.
>>> The vhost backend will receive the message per virtqueue.
>>> Also we should call a destroy callback handler when both RXQ
>>> and TXQ receives the message.
>>>
>>> Signed-off-by: Tetsuya Mukawa 
>>> ---
>>>  lib/librte_vhost/vhost_user/virtio-net-user.c | 20 ++--
>>>  1 file changed, 10 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
>>> b/lib/librte_vhost/vhost_user/virtio-net-user.c
>>> index a998ad8..99c075f 100644
>>> --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
>>> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
>>> @@ -283,12 +283,10 @@ user_get_vring_base(struct vhost_device_ctx ctx,
>>> struct vhost_vring_state *state)
>>>  {
>>> struct virtio_net *dev = get_device(ctx);
>>> +   uint16_t base_idx = state->index / VIRTIO_QNUM * VIRTIO_QNUM;
>>>  
>>> if (dev == NULL)
>>> return -1;
>>> -   /* We have to stop the queue (virtio) if it is running. */
>>> -   if (dev->flags & VIRTIO_DEV_RUNNING)
>>> -   notify_ops->destroy_device(dev);
>> Hi Tetsuya:
>> I don't understand why we move it to the end of the function.
>> If we don't tell the application to remove the virtio device from the
> As you stated, he just moved it to the end of the function: it
> still invokes notify_ops->destroy_device() in the end.
>
> And the reason he moved it to the end is that he wants to invoke the
> callback only when the second GET_VRING_BASE message is received
> for the queue pair. On second thought, that's not necessary:
> as we do the "flags & VIRTIO_DEV_RUNNING" check first, it
> doesn't matter on which virtqueue we invoke the callback.

I thought we had 2 choices.
1. Call the callback handler at the start of this function when the 1st
GET_VRING_BASE message comes.
2. Call the callback handler at the end of this function when the 2nd
GET_VRING_BASE message comes.

I chose the 2nd because, with the 1st choice, before the 2nd message
is sent QEMU assumes one of the queues is still alive, while in the DPDK
application it has already been closed.
I thought this inconsistency might cause an issue.
But yes, if we choose the 2nd, we may have the issue Xie described.

>
>   --yliu
>
>> data plane, then the vhost application is still operating on that
>> device, we shouldn't do anything to the virtio_net device.
>> For this case, as vhost doesn't use kickfd, it will not cause an issue, but
>> I think it is best practice to first remove it from the data plane through
>> destroy_device.
>>
>> I think we could call destroy_device the first time we receive this
>> message. Currently we don't have per queue granularity control to only
>> remove one queue from data plane.
>>
>> I am Okay to only close the kickfd for the specified queue index.
>>
>> Btw, do you meet issue with previous implementation?

Yes, I faced illegal memory access.
For example, if we have RX and TX queues, we will have 2 GET_VRING_BASE
messages when virtio-net device is finalized.
While handling these messages, 'dev->virtqueue[2]' will be accessed,
then will cause illegal access.
(We only have 2 queues, so above will be NULL)
So actually we need to change the function a bit.

Thanks,
Tetsuya
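
The index arithmetic in the patch ("state->index / VIRTIO_QNUM * VIRTIO_QNUM") rounds a vring index down to the first index of its queue pair. A small model of that calculation is sketched below. The TOY_* constants are assumptions chosen to mirror the usual two-queues-per-pair layout (RX at offset 0, TX at offset 1), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_QNUM 2u /* queues per pair (assumed to mirror VIRTIO_QNUM) */
#define TOY_RXQ  0u /* offset of the RX queue inside a pair (assumed) */
#define TOY_TXQ  1u /* offset of the TX queue inside a pair (assumed) */

/* Round a vring index down to the first index of its queue pair. */
static uint16_t
toy_pair_base(uint16_t vring_idx)
{
	return (uint16_t)(vring_idx / TOY_QNUM * TOY_QNUM);
}

/* Indices of both queues in the pair that vring_idx belongs to. */
static void
toy_pair_indices(uint16_t vring_idx, uint16_t nb_queues,
		 uint16_t *rxq, uint16_t *txq)
{
	uint16_t base = toy_pair_base(vring_idx);

	assert(vring_idx < nb_queues);
	assert(nb_queues % TOY_QNUM == 0);
	/* base + offset never exceeds nb_queues - 1 for a valid index */
	*rxq = (uint16_t)(base + TOY_RXQ);
	*txq = (uint16_t)(base + TOY_TXQ);
}
```

With only two queues, a message for index 1 resolves to queues 0 and 1 in this model, so nothing beyond the end of a two-entry virtqueue array is touched.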


[dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support

2015-10-27 Thread Jan Viktorin
Hello Konstantin,

thank you for those hints! I reworked the code a little bit:
I refactored the LPM vector operations into rte_vect.h, which helped
the ACL library as well. I will tidy it up a little and send v3 of
the series. LPM and ACL now seem to be much more sane on ARM.

Regards
Jan

On Tue, 27 Oct 2015 15:55:48 +
"Ananyev, Konstantin"  wrote:

> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Viktorin
> > Sent: Monday, October 26, 2015 4:38 PM
> > To: Thomas Monjalon; Hunt, David; dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support
> > 
> > The main goal of this check is to avoid passing the -msse4.1
> > option to the GCC that does not support it (like arm toolchains).
> > 
> > Anyway, the ACL library does not compile on ARM.
> > 
> > Signed-off-by: Jan Viktorin 
> > ---
> >  lib/librte_acl/Makefile | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
> > index 7a1cf8a..401fb8c 100644
> > --- a/lib/librte_acl/Makefile
> > +++ b/lib/librte_acl/Makefile
> > @@ -50,7 +50,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> > 
> > +CC_SSE4_1_SUPPORT := $(shell $(CC) -msse4.1 -dM -E - < /dev/null 
> > >/dev/null 2>&1 && echo 1)
> > +
> > +ifeq ($(CC_SSE4_1_SUPPORT),1)
> >  CFLAGS_acl_run_sse.o += -msse4.1
> > +endif  
> 
> I don't think acl_run_sse.c would compile if SSE4_1 is not supported.
> So, I think you need to do same thing, as is done for AVX2:
> Compile in acl_run_sse.c  only if SSE41 is supported by the compiler:  
> 
> - SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> -
> -CFLAGS_acl_run_sse.o += -msse4.1
> 
> +CC_SSE41_SUPPORT=$(shell $(CC) -msse4.1 -dM -E - </dev/null 2>&1 | \
> grep -q  && echo 1)
> +ifeq ($(CC_SSE41_SUPPORT), 1)
> +SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> +CFLAGS_rte_acl.o += -DCC_SSE41_SUPPORT
> +CFLAGS_acl_run_sse.o += -msse4.1 
> +endif
> 
> And then change rte_acl_init() accordingly.
> Something like:
> 
> int __attribute__ ((weak))
> rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
> __rte_unused const uint8_t **data,
> __rte_unused uint32_t *results,
> __rte_unused uint32_t num,
> __rte_unused uint32_t categories)
> {
> return -ENOTSUP;
> }
> 
> 
> 
> static void __attribute__((constructor))
> rte_acl_init(void)
> {
> enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT;
> 
> #if defined(CC_AVX2_SUPPORT)
> if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> alg = RTE_ACL_CLASSIFY_AVX2;
> else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
> #elif defined (CC_SSE41_SUPPORT)
> if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
> alg = RTE_ACL_CLASSIFY_SSE;
> #endif
> 
> rte_acl_set_default_classify(alg);
> }
> 
> After that, I suppose, you should be able to build and (probably use) 
> librte_acl on arm.
> Konstantin
> 
> 
> 



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic
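
The selection scheme suggested in the quoted reply (build the SSE code only when the compiler supports it, then pick the best available implementation at run time) can be modelled as below. This is a sketch under assumptions and not librte_acl code: the TOY_* names are hypothetical, the CPU flags are passed in as a struct instead of being queried from the hardware, and TOY_CC_SSE41_SUPPORT is defined by hand here to stand in for a -D flag coming from the Makefile probe. Each preprocessor branch holds a complete statement, so no "else" is left dangling when one of the macros is undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Pretend the Makefile probe found SSE4.1 support but not AVX2. */
#define TOY_CC_SSE41_SUPPORT 1

enum toy_alg { TOY_ALG_SCALAR, TOY_ALG_SSE, TOY_ALG_AVX2 };

/* Run-time CPU capabilities; a real library would query the CPU. */
struct toy_cpu {
	int sse41;
	int avx2;
};

static enum toy_alg
toy_pick_alg(const struct toy_cpu *cpu)
{
	enum toy_alg alg = TOY_ALG_SCALAR; /* always-available fallback */

	assert(cpu != NULL);
#if defined(TOY_CC_SSE41_SUPPORT)
	if (cpu->sse41)
		alg = TOY_ALG_SSE;
#endif
#if defined(TOY_CC_AVX2_SUPPORT)
	if (cpu->avx2)
		alg = TOY_ALG_AVX2; /* preferred over SSE when both are usable */
#endif
	return alg;
}
```

On a toolchain without SSE4.1 support neither macro would be defined and the function would always return the scalar algorithm, which is the behaviour wanted for an ARM build.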


[dpdk-dev] [PATCH v6 2/9] null: fix segfault when null_pmd added to bonding

2015-10-27 Thread Thomas Monjalon
Hi,
There is no change in v6 for this patch which was acked by Tetsuya.
So why not keep the Acked-by below your Signed-off-by?

It seems patches 2, 3, 4 and 5 were Acked by Tetsuya.
Other acks I'm missing?


2015-10-16 12:00, Tomasz Kulasek:
> This patch initializes eth_dev->link_intr_cbs queue used when null pmd is
> added to the bonding.
> 
> v5 changes:
>  - removed unnecessary malloc for eth_driver (rte_null_pmd)
> 
> Signed-off-by: Tomasz Kulasek 



[dpdk-dev] [PATCH v8 3/8] vhost: vring queue setup for multiple queue support

2015-10-27 Thread Yuanhan Liu
On Tue, Oct 27, 2015 at 11:42:24AM +0200, Michael S. Tsirkin wrote:
...
> > > Looking at that, at least when MQ is enabled, please don't key
> > > stopping queues off GET_VRING_BASE.
> > 
> > Yes, that's only a workaround. I guess it has been there for quite a
> > while, maybe at the time qemu doesn't send RESET_OWNER message.
> 
> RESET_OWNER was a bad idea since it basically closes
> everything.
> 
> > > There are ENABLE/DISABLE messages for that.
> > 
> > That's something new,
> 
> That's part of multiqueue support. If you ignore them,
> nothing works properly.

I will handle them shortly. (well, it may still need weeks :(

> > though I have plan to use them instead, we still
> > need to make sure our code work with old qemu, without ENABLE/DISABLE
> > messages.
> 
> OK but don't rely on this for new code.

Yes.

> 
> > And I will think more while enabling live migration: I should have
> > more time to address issues like this at that time.
> > 
> > > Generally guys, don't take whatever QEMU happens to do for
> > > granted! Look at the protocol spec under doc/specs directory,
> > > if you are making more assumptions you must document them!
> > 
> > Indeed. And we will try to address them bit by bit in future.
> > 
> > --yliu
> 
> But don't pile up these workarounds meanwhile.  I'm very worried.  The
> way you are carrying on, each new QEMU is likely to break your
> assumptions.

Good point. I'll have more discussion with Huawei, to see if we can
fix them sooner.

--yliu


[dpdk-dev] [PATCH v3 16/16] doc: release notes update for fm10k Vector PMD

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Update 2.2 release notes, add descriptions for Vector PMD implementation
in fm10k driver.

Signed-off-by: Chen Jing D(Mark) 
---
 doc/guides/rel_notes/release_2_2.rst |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index 9a70dae..44a3f74 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -39,6 +39,11 @@ Drivers

   Fixed issue with libvirt ``virsh destroy`` not killing the VM.

+* **fm10k:  Add Vector Rx/Tx implementation.**
+
+  This patch set includes Vector Rx/Tx functions to receive/transmit packets
+  for fm10k devices. It also contains logic to do sanity check for proper
+  RX/TX function selections.

 Libraries
 ~~~~~~~~~
-- 
1.7.7.6



[dpdk-dev] [PATCH v3 15/16] fm10k: fix a crash issue in vector RX func

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

The vector RX function processes 4 packets at a time. When the RX
ring wraps around at the tail and the number of remaining descriptors is not a
multiple of 4, SW overwrites memory that does not belong to it and causes a crash.
The fix allocates 4 additional HW/SW ring entries at the tail to avoid
the overwrite.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h|4 
 drivers/net/fm10k/fm10k_ethdev.c |   19 +--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 68ae1b8..82a548f 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -177,12 +177,16 @@ struct fm10k_rx_queue {
struct rte_mbuf *pkt_last_seg;  /* Last segment of current packet. */
uint64_t hw_ring_phys_addr;
uint64_t mbuf_initializer; /* value to init mbufs */
+   /* need to alloc dummy mbuf, for wraparound when scanning hw ring */
+   struct rte_mbuf fake_mbuf;
uint16_t next_dd;
uint16_t next_alloc;
uint16_t next_trigger;
uint16_t alloc_thresh;
volatile uint32_t *tail_ptr;
uint16_t nb_desc;
+   /* Number of faked desc added at the tail for Vector RX function */
+   uint16_t nb_fake_desc;
uint16_t queue_id;
/* Below 2 fields only valid in case vPMD is applied. */
uint16_t rxrearm_nb; /* number of remaining to be re-armed */
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 469bd85..fb8ec0d 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -102,6 +102,7 @@ fm10k_mbx_unlock(struct fm10k_hw *hw)
 static inline int
 rx_queue_reset(struct fm10k_rx_queue *q)
 {
+   static const union fm10k_rx_desc zero = {{0}};
uint64_t dma_addr;
int i, diag;
PMD_INIT_FUNC_TRACE();
@@ -122,6 +123,15 @@ rx_queue_reset(struct fm10k_rx_queue *q)
q->hw_ring[i].q.hdr_addr = dma_addr;
}

+   /* initialize extra software ring entries. Space for these extra
+* entries is always allocated.
+*/
+   memset(&q->fake_mbuf, 0x0, sizeof(q->fake_mbuf));
+   for (i = 0; i < q->nb_fake_desc; ++i) {
+   q->sw_ring[q->nb_desc + i] = &q->fake_mbuf;
+   q->hw_ring[q->nb_desc + i] = zero;
+   }
+
q->next_dd = 0;
q->next_alloc = 0;
q->next_trigger = q->alloc_thresh - 1;
@@ -147,6 +157,10 @@ rx_queue_clean(struct fm10k_rx_queue *q)
for (i = 0; i < q->nb_desc; ++i)
q->hw_ring[i] = zero;

+   /* zero faked descriptors */
+   for (i = 0; i < q->nb_fake_desc; ++i)
+   q->hw_ring[q->nb_desc + i] = zero;
+
/* vPMD driver has a different way of releasing mbufs. */
if (q->rx_using_sse) {
fm10k_rx_queue_release_mbufs_vec(q);
@@ -1323,6 +1337,7 @@ fm10k_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,
/* setup queue */
q->mp = mp;
q->nb_desc = nb_desc;
+   q->nb_fake_desc = FM10K_MULT_RX_DESC;
q->port_id = dev->data->port_id;
q->queue_id = queue_id;
q->tail_ptr = (volatile uint32_t *)
@@ -1332,8 +1347,8 @@ fm10k_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,

/* allocate memory for the software ring */
q->sw_ring = rte_zmalloc_socket("fm10k sw ring",
-   nb_desc * sizeof(struct rte_mbuf *),
-   RTE_CACHE_LINE_SIZE, socket_id);
+   (nb_desc + q->nb_fake_desc) * sizeof(struct rte_mbuf *),
+   RTE_CACHE_LINE_SIZE, socket_id);
if (q->sw_ring == NULL) {
PMD_INIT_LOG(ERR, "Cannot allocate software ring");
rte_free(q);
-- 
1.7.7.6
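
The reason for the extra ring entries can be shown with a small model: a scan that always touches four entries at once stays in bounds near the end of the ring only if four padding entries follow the real ones. The sketch below is an assumption-laden illustration, not fm10k code; struct toy_rxq and both functions are hypothetical, and a byte per entry stands in for a descriptor's "done" status.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_VEC_WIDTH 4u /* entries handled per vector step */

struct toy_rxq {
	uint8_t *dd;      /* one "descriptor done" byte per ring entry */
	uint16_t nb_desc; /* real descriptors */
	uint16_t nb_fake; /* zeroed padding entries after the real ones */
};

/* Allocate the ring with TOY_VEC_WIDTH zeroed entries of padding. */
static int
toy_rxq_init(struct toy_rxq *q, uint16_t nb_desc)
{
	q->nb_desc = nb_desc;
	q->nb_fake = TOY_VEC_WIDTH;
	q->dd = calloc((size_t)nb_desc + q->nb_fake, sizeof(*q->dd));
	return q->dd == NULL ? -1 : 0;
}

/* Read TOY_VEC_WIDTH entries starting at pos; count the leading done ones. */
static unsigned int
toy_rxq_scan(const struct toy_rxq *q, uint16_t pos)
{
	uint8_t lane[TOY_VEC_WIDTH];
	unsigned int i, done = 0;

	assert(pos < q->nb_desc); /* pos + 3 then stays inside the padding */
	/* unconditional 4-wide read, as a vector load would do */
	memcpy(lane, &q->dd[pos], sizeof(lane));
	for (i = 0; i < TOY_VEC_WIDTH && lane[i]; i++)
		done++;
	return done;
}
```

Because the padding entries are always zero, they never count as completed, so a scan that starts two entries before the end of the ring reports two packets and no more.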



[dpdk-dev] [PATCH v3 14/16] fm10k: Add function to decide best TX func

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add func fm10k_set_tx_function to decide the best TX func in
fm10k_dev_tx_init.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h|1 +
 drivers/net/fm10k/fm10k_ethdev.c |   38 --
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 2bead12..68ae1b8 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -222,6 +222,7 @@ struct fm10k_tx_queue {
uint16_t next_rs; /* Next pos to set RS flag */
uint16_t next_dd; /* Next pos to check DD flag */
volatile uint32_t *tail_ptr;
+   uint32_t txq_flags; /* Holds flags for this TXq */
uint16_t nb_desc;
uint8_t port_id;
uint8_t tx_deferred_start; /** < don't start this queue in dev start. */
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 88bd887..469bd85 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -53,6 +53,9 @@
 #define CHARS_PER_UINT32 (sizeof(uint32_t))
 #define BIT_MASK_PER_UINT32 ((1 << CHARS_PER_UINT32) - 1)

+#define FM10K_SIMPLE_TX_FLAG ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
+   ETH_TXQ_FLAGS_NOOFFLOADS)
+
 static void fm10k_close_mbx_service(struct fm10k_hw *hw);
 static void fm10k_dev_promiscuous_enable(struct rte_eth_dev *dev);
 static void fm10k_dev_promiscuous_disable(struct rte_eth_dev *dev);
@@ -68,6 +71,7 @@ fm10k_MACVLAN_remove_all(struct rte_eth_dev *dev);
 static void fm10k_tx_queue_release(void *queue);
 static void fm10k_rx_queue_release(void *queue);
 static void fm10k_set_rx_function(struct rte_eth_dev *dev);
+static void fm10k_set_tx_function(struct rte_eth_dev *dev);

 static void
 fm10k_mbx_initlock(struct fm10k_hw *hw)
@@ -414,6 +418,10 @@ fm10k_dev_tx_init(struct rte_eth_dev *dev)
base_addr >> (CHAR_BIT * sizeof(uint32_t)));
FM10K_WRITE_REG(hw, FM10K_TDLEN(i), size);
}
+
+   /* set up vector or scalar TX function as appropriate */
+   fm10k_set_tx_function(dev);
+
return 0;
 }

@@ -980,8 +988,7 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
},
.tx_free_thresh = FM10K_TX_FREE_THRESH_DEFAULT(0),
.tx_rs_thresh = FM10K_TX_RS_THRESH_DEFAULT(0),
-   .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
-   ETH_TXQ_FLAGS_NOOFFLOADS,
+   .txq_flags = FM10K_SIMPLE_TX_FLAG,
};

 }
@@ -1479,6 +1486,7 @@ fm10k_tx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,
q->nb_desc = nb_desc;
q->port_id = dev->data->port_id;
q->queue_id = queue_id;
+   q->txq_flags = conf->txq_flags;
q->ops = &def_txq_ops;
q->tail_ptr = (volatile uint32_t *)
&((uint32_t *)hw->hw_addr)[FM10K_TDT(queue_id)];
@@ -2090,6 +2098,32 @@ static const struct eth_dev_ops fm10k_eth_dev_ops = {
 };

 static void __attribute__((cold))
+fm10k_set_tx_function(struct rte_eth_dev *dev)
+{
+   struct fm10k_tx_queue *txq;
+   int i;
+   int use_sse = 1;
+
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   txq = dev->data->tx_queues[i];
+   if ((txq->txq_flags & FM10K_SIMPLE_TX_FLAG) != \
+   FM10K_SIMPLE_TX_FLAG) {
+   use_sse = 0;
+   break;
+   }
+   }
+
+   if (use_sse) {
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   txq = dev->data->tx_queues[i];
+   fm10k_txq_vec_setup(txq);
+   }
+   dev->tx_pkt_burst = fm10k_xmit_pkts_vec;
+   } else
+   dev->tx_pkt_burst = fm10k_xmit_pkts;
+}
+
+static void __attribute__((cold))
 fm10k_set_rx_function(struct rte_eth_dev *dev)
 {
struct fm10k_dev_info *dev_info = FM10K_DEV_PRIVATE_TO_INFO(dev);
-- 
1.7.7.6
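
The decision made by fm10k_set_tx_function() in the patch above is an all-or-nothing test over the configured queues: the vector path is chosen only if every queue has both "simple" flags set. The standalone sketch below restates that rule with hypothetical names and flag values (TOY_*), so it can be read and tried without the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TXQ_NOMULTSEGS 0x1u /* hypothetical flag values */
#define TOY_TXQ_NOOFFLOADS 0x2u
#define TOY_SIMPLE_TX_FLAG (TOY_TXQ_NOMULTSEGS | TOY_TXQ_NOOFFLOADS)

enum toy_tx_path { TOY_TX_SCALAR, TOY_TX_VECTOR };

/* Pick the vector path only if every queue has all simple flags set. */
static enum toy_tx_path
toy_pick_tx_path(const uint32_t *txq_flags, size_t nb_queues)
{
	size_t i;

	assert(txq_flags != NULL || nb_queues == 0);
	for (i = 0; i < nb_queues; i++) {
		/* compare against the full mask, not just "any bit set" */
		if ((txq_flags[i] & TOY_SIMPLE_TX_FLAG) != TOY_SIMPLE_TX_FLAG)
			return TOY_TX_SCALAR;
	}
	return TOY_TX_VECTOR;
}
```

Comparing the masked value with the whole mask matters: a queue with only one of the two flags set must still fall back to the scalar path.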



[dpdk-dev] [PATCH v3 13/16] fm10k: introduce 2 funcs to reset TX queue and mbuf release

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add 2 funcs to reset the TX queue and release mbufs when Vector TX
is applied.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k_rxtx_vec.c |   68 
 1 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index e802eec..7ef7910 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -44,6 +44,11 @@
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif

+static void
+fm10k_tx_queue_release_mbufs_vec(struct fm10k_tx_queue *txq);
+static void
+fm10k_reset_tx_queue(struct fm10k_tx_queue *txq);
+
 /* Handling the offload flags (olflags) field takes computation
  * time when receiving packets. Therefore we provide a flag to disable
  * the processing of the olflags field when they are not needed. This
@@ -620,6 +625,17 @@ fm10k_recv_scattered_pkts_vec(void *rx_queue,
&split_flags[i]);
 }

+static const struct fm10k_txq_ops vec_txq_ops = {
+   .release_mbufs = fm10k_tx_queue_release_mbufs_vec,
+   .reset = fm10k_reset_tx_queue,
+};
+
+void __attribute__((cold))
+fm10k_txq_vec_setup(struct fm10k_tx_queue *txq)
+{
+   txq->ops = &vec_txq_ops;
+}
+
 static inline void
 vtx1(volatile struct fm10k_tx_desc *txdp,
struct rte_mbuf *pkt, uint64_t flags)
@@ -769,3 +785,55 @@ fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf 
**tx_pkts,

return nb_pkts;
 }
+
+static void __attribute__((cold))
+fm10k_tx_queue_release_mbufs_vec(struct fm10k_tx_queue *txq)
+{
+   unsigned i;
+   const uint16_t max_desc = (uint16_t)(txq->nb_desc - 1);
+
+   if (txq->sw_ring == NULL || txq->nb_free == max_desc)
+   return;
+
+   /* release the used mbufs in sw_ring */
+   for (i = txq->next_dd - (txq->rs_thresh - 1);
+i != txq->next_free;
+i = (i + 1) & max_desc)
+   rte_pktmbuf_free_seg(txq->sw_ring[i]);
+
+   txq->nb_free = max_desc;
+
+   /* reset tx_entry */
+   for (i = 0; i < txq->nb_desc; i++)
+   txq->sw_ring[i] = NULL;
+
+   rte_free(txq->sw_ring);
+   txq->sw_ring = NULL;
+}
+
+static void __attribute__((cold))
+fm10k_reset_tx_queue(struct fm10k_tx_queue *txq)
+{
+   static const struct fm10k_tx_desc zeroed_desc = {0};
+   struct rte_mbuf **txe = txq->sw_ring;
+   uint16_t i;
+
+   /* Zero out HW ring memory */
+   for (i = 0; i < txq->nb_desc; i++)
+   txq->hw_ring[i] = zeroed_desc;
+
+   /* Initialize SW ring entries */
+   for (i = 0; i < txq->nb_desc; i++)
+   txe[i] = NULL;
+
+   txq->next_dd = (uint16_t)(txq->rs_thresh - 1);
+   txq->next_rs = (uint16_t)(txq->rs_thresh - 1);
+
+   txq->next_free = 0;
+   txq->nb_used = 0;
+   /* Always allow 1 descriptor to be un-allocated to avoid
+* a H/W race condition
+*/
+   txq->nb_free = (uint16_t)(txq->nb_desc - 1);
+   FM10K_PCI_REG_WRITE(txq->tail_ptr, 0);
+}
-- 
1.7.7.6
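
The index arithmetic in the release and reset functions above may be easier to follow as a standalone sketch. It assumes the ring size is a power of two, so that nb_desc - 1 can serve as a wrap-around mask, and that next_dd always sits at the end of a batch of rs_thresh descriptors. The sketch_* names are hypothetical; only the field names mirror the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced TX ring state; field names follow the patch. */
struct sketch_tx_ring {
	uint16_t nb_desc;   /* ring size, assumed to be a power of two */
	uint16_t rs_thresh; /* descriptors per completion batch */
	uint16_t next_dd;   /* next descriptor whose DD bit is checked */
	uint16_t next_free; /* next slot the transmit path will fill */
	uint16_t nb_free;   /* free descriptors, one always kept in reserve */
};

/* Same starting state as the reset function in the patch. */
static void
sketch_ring_reset(struct sketch_tx_ring *r)
{
	r->next_dd = (uint16_t)(r->rs_thresh - 1);
	r->next_free = 0;
	r->nb_free = (uint16_t)(r->nb_desc - 1);
}

/* Count the slots a release walk would visit: from the first slot of the
 * oldest unfreed batch up to, but not including, next_free.
 */
static unsigned
sketch_ring_slots_to_release(const struct sketch_tx_ring *r)
{
	const unsigned mask = (unsigned)r->nb_desc - 1u;
	unsigned i, count = 0;

	assert(r->nb_desc != 0 && (r->nb_desc & mask) == 0);
	for (i = (unsigned)(r->next_dd - (r->rs_thresh - 1));
	     i != r->next_free;
	     i = (i + 1) & mask)
		count++;
	return count;
}
```

The walk only terminates correctly when the mask really is of the form 2^n - 1, which is why the sketch asserts the power-of-two assumption up front.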



[dpdk-dev] [PATCH v3 12/16] fm10k: use func pointer to reset TX queue and mbuf release

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Vector TX manages the TX queue differently, so it needs different
functions to reset the TX queue and release the mbufs in it.
Introduce 2 function pointers for these operations.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h|9 +
 drivers/net/fm10k/fm10k_ethdev.c |   21 -
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 0a4c174..2bead12 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -204,11 +204,14 @@ struct fifo {
uint16_t *endp;
 };

+struct fm10k_txq_ops;
+
 struct fm10k_tx_queue {
struct rte_mbuf **sw_ring;
struct fm10k_tx_desc *hw_ring;
uint64_t hw_ring_phys_addr;
struct fifo rs_tracker;
+   const struct fm10k_txq_ops *ops; /* txq ops */
uint16_t last_free;
uint16_t next_free;
uint16_t nb_free;
@@ -225,6 +228,11 @@ struct fm10k_tx_queue {
uint16_t queue_id;
 };

+struct fm10k_txq_ops {
+   void (*release_mbufs)(struct fm10k_tx_queue *txq);
+   void (*reset)(struct fm10k_tx_queue *txq);
+};
+
 #define MBUF_DMA_ADDR(mb) \
((uint64_t) ((mb)->buf_physaddr + (mb)->data_off))

@@ -338,4 +346,5 @@ uint16_t fm10k_recv_scattered_pkts_vec(void *, struct 
rte_mbuf **,
uint16_t);
 uint16_t fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);
+void fm10k_txq_vec_setup(struct fm10k_tx_queue *txq);
 #endif
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index a46a349..88bd887 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -292,6 +292,11 @@ tx_queue_disable(struct fm10k_hw *hw, uint16_t qnum)
return 0;
 }

+static const struct fm10k_txq_ops def_txq_ops = {
+   .release_mbufs = tx_queue_free,
+   .reset = tx_queue_reset,
+};
+
 static int
 fm10k_dev_configure(struct rte_eth_dev *dev)
 {
@@ -571,7 +576,8 @@ fm10k_dev_tx_queue_start(struct rte_eth_dev *dev, uint16_t 
tx_queue_id)
PMD_INIT_FUNC_TRACE();

if (tx_queue_id < dev->data->nb_tx_queues) {
-   tx_queue_reset(dev->data->tx_queues[tx_queue_id]);
+   struct fm10k_tx_queue *q = dev->data->tx_queues[tx_queue_id];
+   q->ops->reset(q);

/* reset head and tail pointers */
FM10K_WRITE_REG(hw, FM10K_TDH(tx_queue_id), 0);
@@ -837,8 +843,10 @@ fm10k_dev_queue_release(struct rte_eth_dev *dev)
PMD_INIT_FUNC_TRACE();

if (dev->data->tx_queues) {
-   for (i = 0; i < dev->data->nb_tx_queues; i++)
-   fm10k_tx_queue_release(dev->data->tx_queues[i]);
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   struct fm10k_tx_queue *txq = dev->data->tx_queues[i];
+   txq->ops->release_mbufs(txq);
+   }
}

if (dev->data->rx_queues) {
@@ -1454,7 +1462,8 @@ fm10k_tx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,
 * different socket than was previously used.
 */
if (dev->data->tx_queues[queue_id] != NULL) {
-   tx_queue_free(dev->data->tx_queues[queue_id]);
+   struct fm10k_tx_queue *txq = dev->data->tx_queues[queue_id];
+   txq->ops->release_mbufs(txq);
dev->data->tx_queues[queue_id] = NULL;
}

@@ -1470,6 +1479,7 @@ fm10k_tx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,
q->nb_desc = nb_desc;
q->port_id = dev->data->port_id;
q->queue_id = queue_id;
+   q->ops = &def_txq_ops;
q->tail_ptr = (volatile uint32_t *)
&((uint32_t *)hw->hw_addr)[FM10K_TDT(queue_id)];
if (handle_txconf(q, conf))
@@ -1528,9 +1538,10 @@ fm10k_tx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,
 static void
 fm10k_tx_queue_release(void *queue)
 {
+   struct fm10k_tx_queue *q = queue;
PMD_INIT_FUNC_TRACE();

-   tx_queue_free(queue);
+   q->ops->release_mbufs(q);
 }

 static int
-- 
1.7.7.6
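
The pattern this patch introduces is a per-queue table of function pointers, so callers no longer need to know which TX path owns the queue. A minimal sketch of the idea follows. All sketch_* names are hypothetical, and the callbacks only record which implementation ran instead of touching any ring.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_txq;

/* Per-queue operations, mirroring the shape of struct fm10k_txq_ops. */
struct sketch_txq_ops {
	void (*release_mbufs)(struct sketch_txq *q);
	void (*reset)(struct sketch_txq *q);
};

enum { SKETCH_NONE = 0, SKETCH_SCALAR = 1, SKETCH_VECTOR = 2 };

struct sketch_txq {
	const struct sketch_txq_ops *ops;
	int reset_by;   /* which implementation last reset the queue */
	int release_by; /* which implementation last released mbufs */
};

static void scalar_release(struct sketch_txq *q) { q->release_by = SKETCH_SCALAR; }
static void scalar_reset(struct sketch_txq *q) { q->reset_by = SKETCH_SCALAR; }
static void vector_release(struct sketch_txq *q) { q->release_by = SKETCH_VECTOR; }
static void vector_reset(struct sketch_txq *q) { q->reset_by = SKETCH_VECTOR; }

static const struct sketch_txq_ops sketch_def_ops = {
	.release_mbufs = scalar_release,
	.reset = scalar_reset,
};

static const struct sketch_txq_ops sketch_vec_ops = {
	.release_mbufs = vector_release,
	.reset = vector_reset,
};

/* Callers dispatch through q->ops and stay agnostic of the TX path. */
static void
sketch_queue_start(struct sketch_txq *q)
{
	assert(q != NULL && q->ops != NULL);
	q->ops->reset(q);
}

static void
sketch_queue_release(struct sketch_txq *q)
{
	assert(q != NULL && q->ops != NULL);
	q->ops->release_mbufs(q);
}
```

Switching a queue to the vector path then amounts to one pointer assignment, which is what fm10k_txq_vec_setup does later in the series.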



[dpdk-dev] [PATCH v3 11/16] fm10k: add Vector TX function

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add Vector TX func fm10k_xmit_pkts_vec to transmit packets.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h  |5 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |  150 
 2 files changed, 155 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index c5e66e2..0a4c174 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -215,6 +215,9 @@ struct fm10k_tx_queue {
uint16_t nb_used;
uint16_t free_thresh;
uint16_t rs_thresh;
+   /* Below 2 fields only valid in case vPMD is applied. */
+   uint16_t next_rs; /* Next pos to set RS flag */
+   uint16_t next_dd; /* Next pos to check DD flag */
volatile uint32_t *tail_ptr;
uint16_t nb_desc;
uint8_t port_id;
@@ -333,4 +336,6 @@ void fm10k_rx_queue_release_mbufs_vec(struct fm10k_rx_queue 
*rxq);
 uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
 uint16_t fm10k_recv_scattered_pkts_vec(void *, struct rte_mbuf **,
uint16_t);
+uint16_t fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
 #endif
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index ea85996..e802eec 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -619,3 +619,153 @@ fm10k_recv_scattered_pkts_vec(void *rx_queue,
return i + fm10k_reassemble_packets(rxq, &rx_pkts[i], nb_bufs - i,
&split_flags[i]);
 }
+
+static inline void
+vtx1(volatile struct fm10k_tx_desc *txdp,
+   struct rte_mbuf *pkt, uint64_t flags)
+{
+   __m128i descriptor = _mm_set_epi64x(flags << 56 |
+   pkt->vlan_tci << 16 | pkt->data_len,
+   MBUF_DMA_ADDR(pkt));
+   _mm_store_si128((__m128i *)txdp, descriptor);
+}
+
+static inline void
+vtx(volatile struct fm10k_tx_desc *txdp,
+   struct rte_mbuf **pkt, uint16_t nb_pkts,  uint64_t flags)
+{
+   int i;
+
+   for (i = 0; i < nb_pkts; ++i, ++txdp, ++pkt)
+   vtx1(txdp, *pkt, flags);
+}
+
+static inline int __attribute__((always_inline))
+fm10k_tx_free_bufs(struct fm10k_tx_queue *txq)
+{
+   struct rte_mbuf **txep;
+   uint8_t flags;
+   uint32_t n;
+   uint32_t i;
+   int nb_free = 0;
+   struct rte_mbuf *m, *free[RTE_FM10K_TX_MAX_FREE_BUF_SZ];
+
+   /* check DD bit on threshold descriptor */
+   flags = txq->hw_ring[txq->next_dd].flags;
+   if (!(flags & FM10K_TXD_FLAG_DONE))
+   return 0;
+
+   n = txq->rs_thresh;
+
+   /* First buffer to free from S/W ring is at index
+* next_dd - (rs_thresh-1)
+*/
+   txep = &txq->sw_ring[txq->next_dd - (n - 1)];
+   m = __rte_pktmbuf_prefree_seg(txep[0]);
+   if (likely(m != NULL)) {
+   free[0] = m;
+   nb_free = 1;
+   for (i = 1; i < n; i++) {
+   m = __rte_pktmbuf_prefree_seg(txep[i]);
+   if (likely(m != NULL)) {
+   if (likely(m->pool == free[0]->pool))
+   free[nb_free++] = m;
+   else {
+   rte_mempool_put_bulk(free[0]->pool,
+   (void *)free, nb_free);
+   free[0] = m;
+   nb_free = 1;
+   }
+   }
+   }
+   rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free);
+   } else {
+   for (i = 1; i < n; i++) {
+   m = __rte_pktmbuf_prefree_seg(txep[i]);
+   if (m != NULL)
+   rte_mempool_put(m->pool, m);
+   }
+   }
+
+   /* buffers were freed, update counters */
+   txq->nb_free = (uint16_t)(txq->nb_free + txq->rs_thresh);
+   txq->next_dd = (uint16_t)(txq->next_dd + txq->rs_thresh);
+   if (txq->next_dd >= txq->nb_desc)
+   txq->next_dd = (uint16_t)(txq->rs_thresh - 1);
+
+   return txq->rs_thresh;
+}
+
+static inline void __attribute__((always_inline))
+tx_backlog_entry(struct rte_mbuf **txep,
+struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+   int i;
+
+   for (i = 0; i < (int)nb_pkts; ++i)
+   txep[i] = tx_pkts[i];
+}
+
+uint16_t
+fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts)
+{
+   struct fm10k_tx_queue *txq = (struct fm10k_tx_queue *)tx_queue;
+   volatile struct fm10k_tx_desc *txdp;
+   struct rte_mbuf **txep;
+   uint16_t n, nb_commit, tx_id;
+   uint64_t flags = FM10K_TXD_FLAG_LAST;
+   uint64_t rs = FM10K_TXD_FLAG_RS | FM10

[dpdk-dev] [PATCH v3 10/16] fm10k: add func to release mbuf in case Vector RX applied

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Vector RX uses different variables to track the RX HW ring, so it
needs a different function to release mbufs properly.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h  |1 +
 drivers/net/fm10k/fm10k_ethdev.c   |6 ++
 drivers/net/fm10k/fm10k_rxtx_vec.c |   18 ++
 3 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 8614e81..c5e66e2 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -329,6 +329,7 @@ uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts,

 int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
 int fm10k_rx_vec_condition_check(struct rte_eth_dev *);
+void fm10k_rx_queue_release_mbufs_vec(struct fm10k_rx_queue *rxq);
 uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
 uint16_t fm10k_recv_scattered_pkts_vec(void *, struct rte_mbuf **,
uint16_t);
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 4690a0c..a46a349 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -143,6 +143,12 @@ rx_queue_clean(struct fm10k_rx_queue *q)
for (i = 0; i < q->nb_desc; ++i)
q->hw_ring[i] = zero;

+   /* vPMD driver has a different way of releasing mbufs. */
+   if (q->rx_using_sse) {
+   fm10k_rx_queue_release_mbufs_vec(q);
+   return;
+   }
+
/* free software buffers */
for (i = 0; i < q->nb_desc; ++i) {
if (q->sw_ring[i]) {
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 3fd5d45..ea85996 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -313,6 +313,24 @@ fm10k_rxq_rearm(struct fm10k_rx_queue *rxq)
FM10K_PCI_REG_WRITE(rxq->tail_ptr, rx_id);
 }

+void __attribute__((cold))
+fm10k_rx_queue_release_mbufs_vec(struct fm10k_rx_queue *rxq)
+{
+   const unsigned mask = rxq->nb_desc - 1;
+   unsigned i;
+
+   if (rxq->sw_ring == NULL || rxq->rxrearm_nb >= rxq->nb_desc)
+   return;
+
+   /* free all mbufs that are valid in the ring */
+   for (i = rxq->next_dd; i != rxq->rxrearm_start; i = (i + 1) & mask)
+   rte_pktmbuf_free_seg(rxq->sw_ring[i]);
+   rxq->rxrearm_nb = rxq->nb_desc;
+
+   /* set all entries to NULL */
+   memset(rxq->sw_ring, 0, sizeof(rxq->sw_ring[0]) * rxq->nb_desc);
+}
+
 static inline uint16_t
 fm10k_recv_raw_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts, uint8_t *split_packet)
-- 
1.7.7.6



[dpdk-dev] [PATCH v3 09/16] fm10k: add function to decide best RX function

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add func fm10k_set_rx_function to decide the best RX func in
fm10k_dev_rx_init.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h|1 +
 drivers/net/fm10k/fm10k_ethdev.c |   36 
 2 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 06697fa..8614e81 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -187,6 +187,7 @@ struct fm10k_rx_queue {
/* Below 2 fields only valid in case vPMD is applied. */
uint16_t rxrearm_nb; /* number of remaining to be re-armed */
uint16_t rxrearm_start;  /* the idx we start the re-arming from */
+   uint16_t rx_using_sse; /* indicates that vector RX is in use */
uint8_t port_id;
uint8_t drop_en;
uint8_t rx_deferred_start; /* don't start this queue in dev start. */
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 44c3d34..4690a0c 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -67,6 +67,7 @@ static void
 fm10k_MACVLAN_remove_all(struct rte_eth_dev *dev);
 static void fm10k_tx_queue_release(void *queue);
 static void fm10k_rx_queue_release(void *queue);
+static void fm10k_set_rx_function(struct rte_eth_dev *dev);

 static void
 fm10k_mbx_initlock(struct fm10k_hw *hw)
@@ -462,7 +463,6 @@ fm10k_dev_rx_init(struct rte_eth_dev *dev)
dev->data->dev_conf.rxmode.enable_scatter) {
uint32_t reg;
dev->data->scattered_rx = 1;
-   dev->rx_pkt_burst = fm10k_recv_scattered_pkts;
reg = FM10K_READ_REG(hw, FM10K_SRRCTL(i));
reg |= FM10K_SRRCTL_BUFFER_CHAINING_EN;
FM10K_WRITE_REG(hw, FM10K_SRRCTL(i), reg);
@@ -478,6 +478,9 @@ fm10k_dev_rx_init(struct rte_eth_dev *dev)

/* Configure RSS if applicable */
fm10k_dev_mq_rx_configure(dev);
+
+   /* Decide the best RX function */
+   fm10k_set_rx_function(dev);
return 0;
 }

@@ -2069,6 +2072,34 @@ static const struct eth_dev_ops fm10k_eth_dev_ops = {
.rss_hash_conf_get  = fm10k_rss_hash_conf_get,
 };

+static void __attribute__((cold))
+fm10k_set_rx_function(struct rte_eth_dev *dev)
+{
+   struct fm10k_dev_info *dev_info = FM10K_DEV_PRIVATE_TO_INFO(dev);
+   uint16_t i, rx_using_sse;
+
+   /* In order to allow Vector Rx there are a few configuration
+* conditions to be met.
+*/
+   if (!fm10k_rx_vec_condition_check(dev) && dev_info->rx_vec_allowed) {
+   if (dev->data->scattered_rx)
+   dev->rx_pkt_burst = fm10k_recv_scattered_pkts_vec;
+   else
+   dev->rx_pkt_burst = fm10k_recv_pkts_vec;
+   } else if (dev->data->scattered_rx)
+   dev->rx_pkt_burst = fm10k_recv_scattered_pkts;
+
+   rx_using_sse =
+   (dev->rx_pkt_burst == fm10k_recv_scattered_pkts_vec ||
+   dev->rx_pkt_burst == fm10k_recv_pkts_vec);
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   struct fm10k_rx_queue *rxq = dev->data->rx_queues[i];
+   rxq->rx_using_sse = rx_using_sse;
+   }
+
+}
+
 static void
 fm10k_params_init(struct rte_eth_dev *dev)
 {
@@ -2103,9 +2134,6 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
dev->rx_pkt_burst = &fm10k_recv_pkts;
dev->tx_pkt_burst = &fm10k_xmit_pkts;

-   if (dev->data->scattered_rx)
-   dev->rx_pkt_burst = &fm10k_recv_scattered_pkts;
-
/* only initialize in the primary process */
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
return 0;
-- 
1.7.7.6
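
The decision made by fm10k_set_rx_function reduces to a small table over two inputs: whether Vector RX is usable, and whether scattered RX is enabled. The sketch below is an illustration under assumptions, with hypothetical sketch_* names. It takes the condition-check result as 0 (vector possible) or -1 (not possible), as fm10k_rx_vec_condition_check returns elsewhere in this series, and it treats the plain scalar function as the default that device init already installed.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_rx_path {
	SKETCH_RX_SCALAR,         /* default installed at device init */
	SKETCH_RX_SCALAR_SCATTER,
	SKETCH_RX_VECTOR,
	SKETCH_RX_VECTOR_SCATTER
};

/* cond_check is 0 when the configuration permits Vector RX, -1 otherwise. */
static enum sketch_rx_path
sketch_select_rx_path(int cond_check, bool vec_allowed, bool scattered)
{
	assert(cond_check == 0 || cond_check == -1);

	if (cond_check == 0 && vec_allowed)
		return scattered ? SKETCH_RX_VECTOR_SCATTER : SKETCH_RX_VECTOR;

	return scattered ? SKETCH_RX_SCALAR_SCATTER : SKETCH_RX_SCALAR;
}

/* The per-queue rx_using_sse flag is derived from the chosen path. */
static bool
sketch_rx_using_sse(enum sketch_rx_path path)
{
	return path == SKETCH_RX_VECTOR || path == SKETCH_RX_VECTOR_SCATTER;
}
```

Deriving the flag from the chosen path, rather than from the inputs, keeps the release code consistent with whichever burst function is actually installed.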



[dpdk-dev] [PATCH v3 08/16] fm10k: add Vector RX scatter function

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add func fm10k_recv_scattered_pkts_vec to receive chained packets
with SSE instructions.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h  |2 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |   88 
 2 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 1502ae3..06697fa 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -329,4 +329,6 @@ uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts,
 int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
 int fm10k_rx_vec_condition_check(struct rte_eth_dev *);
 uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
+uint16_t fm10k_recv_scattered_pkts_vec(void *, struct rte_mbuf **,
+   uint16_t);
 #endif
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 2e6f1a2..3fd5d45 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -513,3 +513,91 @@ fm10k_recv_pkts_vec(void *rx_queue, struct rte_mbuf 
**rx_pkts,
 {
return fm10k_recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
 }
+
+static inline uint16_t
+fm10k_reassemble_packets(struct fm10k_rx_queue *rxq,
+   struct rte_mbuf **rx_bufs,
+   uint16_t nb_bufs, uint8_t *split_flags)
+{
+   struct rte_mbuf *pkts[RTE_FM10K_MAX_RX_BURST]; /*finished pkts*/
+   struct rte_mbuf *start = rxq->pkt_first_seg;
+   struct rte_mbuf *end =  rxq->pkt_last_seg;
+   unsigned pkt_idx, buf_idx;
+
+
+   for (buf_idx = 0, pkt_idx = 0; buf_idx < nb_bufs; buf_idx++) {
+   if (end != NULL) {
+   /* processing a split packet */
+   end->next = rx_bufs[buf_idx];
+   start->nb_segs++;
+   start->pkt_len += rx_bufs[buf_idx]->data_len;
+   end = end->next;
+
+   if (!split_flags[buf_idx]) {
+   /* it's the last packet of the set */
+   start->hash = end->hash;
+   start->ol_flags = end->ol_flags;
+   pkts[pkt_idx++] = start;
+   start = end = NULL;
+   }
+   } else {
+   /* not processing a split packet */
+   if (!split_flags[buf_idx]) {
+   /* not a split packet, save and skip */
+   pkts[pkt_idx++] = rx_bufs[buf_idx];
+   continue;
+   }
+   end = start = rx_bufs[buf_idx];
+   }
+   }
+
+   /* save the partial packet for next time */
+   rxq->pkt_first_seg = start;
+   rxq->pkt_last_seg = end;
+   memcpy(rx_bufs, pkts, pkt_idx * (sizeof(*pkts)));
+   return pkt_idx;
+}
+
+/*
+ * vPMD receive routine that reassembles scattered packets
+ *
+ * Notice:
+ * - don't support ol_flags for rss and csum err
+ * - nb_pkts > RTE_FM10K_MAX_RX_BURST, only scan RTE_FM10K_MAX_RX_BURST
+ *   numbers of DD bit
+ */
+uint16_t
+fm10k_recv_scattered_pkts_vec(void *rx_queue,
+   struct rte_mbuf **rx_pkts,
+   uint16_t nb_pkts)
+{
+   struct fm10k_rx_queue *rxq = rx_queue;
+   uint8_t split_flags[RTE_FM10K_MAX_RX_BURST] = {0};
+   unsigned i = 0;
+
+   /* Split_flags only can support max of RTE_FM10K_MAX_RX_BURST */
+   nb_pkts = RTE_MIN(nb_pkts, RTE_FM10K_MAX_RX_BURST);
+   /* get some new buffers */
+   uint16_t nb_bufs = fm10k_recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
+   split_flags);
+   if (nb_bufs == 0)
+   return 0;
+
+   /* happy day case, full burst + no packets to be joined */
+   const uint64_t *split_fl64 = (uint64_t *)split_flags;
+   if (rxq->pkt_first_seg == NULL &&
+   split_fl64[0] == 0 && split_fl64[1] == 0 &&
+   split_fl64[2] == 0 && split_fl64[3] == 0)
+   return nb_bufs;
+
+   /* reassemble any packets that need reassembly*/
+   if (rxq->pkt_first_seg == NULL) {
+   /* find the first split flag, and only reassemble then*/
+   while (i < nb_bufs && !split_flags[i])
+   i++;
+   if (i == nb_bufs)
+   return nb_bufs;
+   }
+   return i + fm10k_reassemble_packets(rxq, &rx_pkts[i], nb_bufs - i,
+   &split_flags[i]);
+}
-- 
1.7.7.6
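
The reassembly loop in fm10k_reassemble_packets can be tried outside the driver with a reduced segment type. The sketch below keeps the same control flow: a set split flag means "more segments follow", and a partially assembled packet is carried over to the next burst. The sketch_* types are hypothetical stand-ins for rte_mbuf and the RX queue, and the hash and ol_flags copies are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_BURST 32

/* Hypothetical, reduced stand-in for struct rte_mbuf. */
struct sketch_seg {
	struct sketch_seg *next;
	uint16_t data_len; /* bytes in this segment */
	uint32_t pkt_len;  /* bytes in the whole packet (head segment only) */
	uint8_t nb_segs;   /* segments in the packet (head segment only) */
};

/* Partial packet carried over between bursts. */
struct sketch_rxq {
	struct sketch_seg *first;
	struct sketch_seg *last;
};

/* Chain split segments in place; return the number of complete packets,
 * which are compacted to the front of bufs.
 */
static uint16_t
sketch_reassemble(struct sketch_rxq *rxq, struct sketch_seg **bufs,
		  uint16_t nb_bufs, const uint8_t *split_flags)
{
	struct sketch_seg *pkts[SKETCH_MAX_BURST];
	struct sketch_seg *start = rxq->first;
	struct sketch_seg *end = rxq->last;
	uint16_t buf_idx, pkt_idx = 0;

	assert(nb_bufs <= SKETCH_MAX_BURST);
	for (buf_idx = 0; buf_idx < nb_bufs; buf_idx++) {
		if (end != NULL) {
			/* continuing a split packet: append this segment */
			end->next = bufs[buf_idx];
			start->nb_segs++;
			start->pkt_len += bufs[buf_idx]->data_len;
			end = end->next;
			if (!split_flags[buf_idx]) {
				/* last segment: the packet is complete */
				pkts[pkt_idx++] = start;
				start = end = NULL;
			}
		} else if (!split_flags[buf_idx]) {
			/* single-segment packet, pass through */
			pkts[pkt_idx++] = bufs[buf_idx];
		} else {
			/* first segment of a new split packet */
			start = end = bufs[buf_idx];
		}
	}

	/* save the partial packet for the next burst */
	rxq->first = start;
	rxq->last = end;
	memcpy(bufs, pkts, pkt_idx * sizeof(*pkts));
	return pkt_idx;
}
```

Because the completed packets are first collected in a local array and then copied back, the input array can be reused as the output array without overwriting segments that have not been visited yet.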



[dpdk-dev] [PATCH v3 07/16] fm10k: add func to do Vector RX condition check

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add func fm10k_rx_vec_condition_check to check if Vector RX
func can be applied.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h  |1 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |   31 +++
 2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index f04ba2c..1502ae3 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -327,5 +327,6 @@ uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts,
uint16_t nb_pkts);

 int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
+int fm10k_rx_vec_condition_check(struct rte_eth_dev *);
 uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
 #endif
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 6127cf9..2e6f1a2 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -172,6 +172,37 @@ fm10k_desc_to_pktype_v(__m128i descs[4], struct rte_mbuf 
**rx_pkts)
 #endif

 int __attribute__((cold))
+fm10k_rx_vec_condition_check(struct rte_eth_dev *dev)
+{
+#ifndef RTE_LIBRTE_IEEE1588
+   struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
+   struct rte_fdir_conf *fconf = &dev->data->dev_conf.fdir_conf;
+
+#ifndef RTE_FM10K_RX_OLFLAGS_ENABLE
+   /* without rx ol_flags, no VP flag report */
+   if (rxmode->hw_vlan_extend != 0)
+   return -1;
+#endif
+
+   /* no fdir support */
+   if (fconf->mode != RTE_FDIR_MODE_NONE)
+   return -1;
+
+   /* - no csum error report support
+* - no header split support
+*/
+   if (rxmode->hw_ip_checksum == 1 ||
+   rxmode->header_split == 1)
+   return -1;
+
+   return 0;
+#else
+   RTE_SET_USED(dev);
+   return -1;
+#endif
+}
+
+int __attribute__((cold))
 fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq)
 {
uintptr_t p;
-- 
1.7.7.6



[dpdk-dev] [PATCH v3 06/16] fm10k: add Vector RX function

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add func fm10k_recv_raw_pkts_vec to parse raw packets, which may
include chained packets.
Add func fm10k_recv_pkts_vec to receive single-mbuf packets.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h  |1 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |  201 
 2 files changed, 202 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 5df7960..f04ba2c 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -327,4 +327,5 @@ uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts,
uint16_t nb_pkts);

 int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
+uint16_t fm10k_recv_pkts_vec(void *, struct rte_mbuf **, uint16_t);
 #endif
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 581a309..6127cf9 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -281,3 +281,204 @@ fm10k_rxq_rearm(struct fm10k_rx_queue *rxq)
/* Update the tail pointer on the NIC */
FM10K_PCI_REG_WRITE(rxq->tail_ptr, rx_id);
 }
+
+static inline uint16_t
+fm10k_recv_raw_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
+   uint16_t nb_pkts, uint8_t *split_packet)
+{
+   volatile union fm10k_rx_desc *rxdp;
+   struct rte_mbuf **mbufp;
+   uint16_t nb_pkts_recd;
+   int pos;
+   struct fm10k_rx_queue *rxq = rx_queue;
+   uint64_t var;
+   __m128i shuf_msk;
+   __m128i dd_check, eop_check;
+   uint16_t next_dd;
+
+   next_dd = rxq->next_dd;
+
+   /* Just the act of getting into the function from the application is
+* going to cost about 7 cycles
+*/
+   rxdp = rxq->hw_ring + next_dd;
+
+   _mm_prefetch((const void *)rxdp, _MM_HINT_T0);
+
+   /* See if we need to rearm the RX queue - gives the prefetch a bit
+* of time to act
+*/
+   if (rxq->rxrearm_nb > RTE_FM10K_RXQ_REARM_THRESH)
+   fm10k_rxq_rearm(rxq);
+
+   /* Before we start moving massive data around, check to see if
+* there is actually a packet available
+*/
+   if (!(rxdp->d.staterr & FM10K_RXD_STATUS_DD))
+   return 0;
+
+   /* Vector RX will process 4 packets at a time, strip the unaligned
+* tails in case it's not multiple of 4.
+*/
+   nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_FM10K_DESCS_PER_LOOP);
+
+   /* 4 packets DD mask */
+   dd_check = _mm_set_epi64x(0x00010001LL, 0x00010001LL);
+
+   /* 4 packets EOP mask */
+   eop_check = _mm_set_epi64x(0x00020002LL, 0x00020002LL);
+
+   /* mask to shuffle from desc. to mbuf */
+   shuf_msk = _mm_set_epi8(
+   7, 6, 5, 4,  /* octet 4~7, 32bits rss */
+   15, 14,  /* octet 14~15, low 16 bits vlan_macip */
+   13, 12,  /* octet 12~13, 16 bits data_len */
+   0xFF, 0xFF,  /* skip high 16 bits pkt_len, zero out */
+   13, 12,  /* octet 12~13, low 16 bits pkt_len */
+   0xFF, 0xFF,  /* skip high 16 bits pkt_type */
+   0xFF, 0xFF   /* Skip pkt_type field in shuffle operation */
+   );
+
+   /* Cache is empty -> need to scan the buffer rings, but first move
+* the next 'n' mbufs into the cache
+*/
+   mbufp = &rxq->sw_ring[next_dd];
+
+   /* A. load 4 packet in one loop
+* [A*. mask out 4 unused dirty field in desc]
+* B. copy 4 mbuf point from swring to rx_pkts
+* C. calc the number of DD bits among the 4 packets
+* [C*. extract the end-of-packet bit, if requested]
+* D. fill info. from desc to mbuf
+*/
+   for (pos = 0, nb_pkts_recd = 0; pos < nb_pkts;
+   pos += RTE_FM10K_DESCS_PER_LOOP,
+   rxdp += RTE_FM10K_DESCS_PER_LOOP) {
+   __m128i descs0[RTE_FM10K_DESCS_PER_LOOP];
+   __m128i pkt_mb1, pkt_mb2, pkt_mb3, pkt_mb4;
+   __m128i zero, staterr, sterr_tmp1, sterr_tmp2;
+   __m128i mbp1, mbp2; /* two mbuf pointer in one XMM reg. */
+
+   /* B.1 load 1 mbuf point */
+   mbp1 = _mm_loadu_si128((__m128i *)&mbufp[pos]);
+
+   /* Read desc statuses backwards to avoid race condition */
+   /* A.1 load 4 pkts desc */
+   descs0[3] = _mm_loadu_si128((__m128i *)(rxdp + 3));
+
+   /* B.2 copy 2 mbuf point into rx_pkts  */
+   _mm_storeu_si128((__m128i *)&rx_pkts[pos], mbp1);
+
+   /* B.1 load 1 mbuf point */
+   mbp2 = _mm_loadu_si128((__m128i *)&mbufp[pos+2]);
+
+   descs0[2] = _mm_loadu_si128((__m128i *)(rxdp + 2));
+   /* B.1 load 2 mbuf point */
+   descs0[1] = _mm_loadu_si128((__m128i *)(rxdp + 1));
+   descs0[0] = _mm_loadu_s

[dpdk-dev] [PATCH v3 05/16] fm10k: add 2 functions to parse pkt_type and offload flag

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add 2 functions that use SSE instructions to parse the RX desc
and fill in pkt_type and ol_flags in the mbuf.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k_rxtx_vec.c |  127 
 1 files changed, 127 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 75533f9..581a309 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -44,6 +44,133 @@
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif

+/* Handling the offload flags (olflags) field takes computation
+ * time when receiving packets. Therefore we provide a flag to disable
+ * the processing of the olflags field when they are not needed. This
+ * gives improved performance, at the cost of losing the offload info
+ * in the received packet
+ */
+#ifdef RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE
+
+/* Vlan present flag shift */
+#define VP_SHIFT (2)
+/* L3 type shift */
+#define L3TYPE_SHIFT (4)
+/* L4 type shift */
+#define L4TYPE_SHIFT (7)
+
+static inline void
+fm10k_desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
+{
+   __m128i ptype0, ptype1, vtag0, vtag1;
+   union {
+   uint16_t e[4];
+   uint64_t dword;
+   } vol;
+
+   const __m128i pkttype_msk = _mm_set_epi16(
+   0x0000, 0x0000, 0x0000, 0x0000,
+   PKT_RX_VLAN_PKT, PKT_RX_VLAN_PKT,
+   PKT_RX_VLAN_PKT, PKT_RX_VLAN_PKT);
+
+   /* mask everything except rss type */
+   const __m128i rsstype_msk = _mm_set_epi16(
+   0x0000, 0x0000, 0x0000, 0x0000,
+   0x000F, 0x000F, 0x000F, 0x000F);
+
+   /* map rss type to rss hash flag */
+   const __m128i rss_flags = _mm_set_epi8(0, 0, 0, 0,
+   0, 0, 0, PKT_RX_RSS_HASH,
+   PKT_RX_RSS_HASH, 0, PKT_RX_RSS_HASH, 0,
+   PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, 0);
+
+   ptype0 = _mm_unpacklo_epi16(descs[0], descs[1]);
+   ptype1 = _mm_unpacklo_epi16(descs[2], descs[3]);
+   vtag0 = _mm_unpackhi_epi16(descs[0], descs[1]);
+   vtag1 = _mm_unpackhi_epi16(descs[2], descs[3]);
+
+   ptype0 = _mm_unpacklo_epi32(ptype0, ptype1);
+   ptype0 = _mm_and_si128(ptype0, rsstype_msk);
+   ptype0 = _mm_shuffle_epi8(rss_flags, ptype0);
+
+   vtag1 = _mm_unpacklo_epi32(vtag0, vtag1);
+   vtag1 = _mm_srli_epi16(vtag1, VP_SHIFT);
+   vtag1 = _mm_and_si128(vtag1, pkttype_msk);
+
+   vtag1 = _mm_or_si128(ptype0, vtag1);
+   vol.dword = _mm_cvtsi128_si64(vtag1);
+
+   rx_pkts[0]->ol_flags = vol.e[0];
+   rx_pkts[1]->ol_flags = vol.e[1];
+   rx_pkts[2]->ol_flags = vol.e[2];
+   rx_pkts[3]->ol_flags = vol.e[3];
+}
+
+static inline void
+fm10k_desc_to_pktype_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
+{
+   __m128i l3l4type0, l3l4type1, l3type, l4type;
+   union {
+   uint16_t e[4];
+   uint64_t dword;
+   } vol;
+
+   /* L3 pkt type mask  Bit4 to Bit6 */
+   const __m128i l3type_msk = _mm_set_epi16(
+   0x0000, 0x0000, 0x0000, 0x0000,
+   0x0070, 0x0070, 0x0070, 0x0070);
+
+   /* L4 pkt type mask  Bit7 to Bit9 */
+   const __m128i l4type_msk = _mm_set_epi16(
+   0x0000, 0x0000, 0x0000, 0x0000,
+   0x0380, 0x0380, 0x0380, 0x0380);
+
+   /* convert RRC l3 type to mbuf format */
+   const __m128i l3type_flags = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, RTE_PTYPE_L3_IPV6_EXT,
+   RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV4_EXT,
+   RTE_PTYPE_L3_IPV4, 0);
+
+   /* Convert RRC l4 type to mbuf format. The l4type_flags values are
+* shifted right by 8 bits so that each one fits into 8 bits.
+*/
+   const __m128i l4type_flags = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, 0,
+   RTE_PTYPE_TUNNEL_GENEVE >> 8,
+   RTE_PTYPE_TUNNEL_NVGRE >> 8,
+   RTE_PTYPE_TUNNEL_VXLAN >> 8,
+   RTE_PTYPE_TUNNEL_GRE >> 8,
+   RTE_PTYPE_L4_UDP >> 8,
+   RTE_PTYPE_L4_TCP >> 8,
+   0);
+
+   l3l4type0 = _mm_unpacklo_epi16(descs[0], descs[1]);
+   l3l4type1 = _mm_unpacklo_epi16(descs[2], descs[3]);
+   l3l4type0 = _mm_unpacklo_epi32(l3l4type0, l3l4type1);
+
+   l3type = _mm_and_si128(l3l4type0, l3type_msk);
+   l4type = _mm_and_si128(l3l4type0, l4type_msk);
+
+   l3type = _mm_srli_epi16(l3type, L3TYPE_SHIFT);
+   l4type = _mm_srli_epi16(l4type, L4TYPE_SHIFT);
+
+   l3type = _mm_shuffle_epi8(l3type_flags, l3type);
+   /* l4type_flags were shifted right by 8 bits, shift left to restore */
+   l4type = _mm_shuffle_epi8(l4type_flags, l4type);
+
+   l4type = _mm_slli_epi16(l4type,

[dpdk-dev] [PATCH v3 04/16] fm10k: add func to re-allocate mbuf for RX ring

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add function fm10k_rxq_rearm to re-allocate mbuf for used desc
in RX HW ring.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h  |9 
 drivers/net/fm10k/fm10k_ethdev.c   |3 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |   90 
 3 files changed, 102 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index 362a2d0..5df7960 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -123,6 +123,12 @@
 #define FM10K_VFTA_BIT(vlan_id)(1 << ((vlan_id) & 0x1F))
 #define FM10K_VFTA_IDX(vlan_id)((vlan_id) >> 5)

+#define RTE_FM10K_RXQ_REARM_THRESH  32
+#define RTE_FM10K_VPMD_TX_BURST 32
+#define RTE_FM10K_MAX_RX_BURST  RTE_FM10K_RXQ_REARM_THRESH
+#define RTE_FM10K_TX_MAX_FREE_BUF_SZ 64
+#define RTE_FM10K_DESCS_PER_LOOP 4
+
 struct fm10k_macvlan_filter_info {
uint16_t vlan_num;   /* Total VLAN number */
uint16_t mac_num;/* Total mac number */
@@ -178,6 +184,9 @@ struct fm10k_rx_queue {
volatile uint32_t *tail_ptr;
uint16_t nb_desc;
uint16_t queue_id;
+   /* Below 2 fields only valid in case vPMD is applied. */
+   uint16_t rxrearm_nb; /* number of remaining to be re-armed */
+   uint16_t rxrearm_start;  /* the idx we start the re-arming from */
uint8_t port_id;
uint8_t drop_en;
uint8_t rx_deferred_start; /* don't start this queue in dev start. */
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 363ef98..44c3d34 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -121,6 +121,9 @@ rx_queue_reset(struct fm10k_rx_queue *q)
q->next_alloc = 0;
q->next_trigger = q->alloc_thresh - 1;
FM10K_PCI_REG_WRITE(q->tail_ptr, q->nb_desc - 1);
+   q->rxrearm_start = 0;
+   q->rxrearm_nb = 0;
+
return 0;
 }

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 34b677b..75533f9 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -64,3 +64,93 @@ fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq)
rxq->mbuf_initializer = *(uint64_t *)p;
return 0;
 }
+
+static inline void
+fm10k_rxq_rearm(struct fm10k_rx_queue *rxq)
+{
+   int i;
+   uint16_t rx_id;
+   volatile union fm10k_rx_desc *rxdp;
+   struct rte_mbuf **mb_alloc = &rxq->sw_ring[rxq->rxrearm_start];
+   struct rte_mbuf *mb0, *mb1;
+   __m128i head_off = _mm_set_epi64x(
+   RTE_PKTMBUF_HEADROOM + FM10K_RX_DATABUF_ALIGN - 1,
+   RTE_PKTMBUF_HEADROOM + FM10K_RX_DATABUF_ALIGN - 1);
+   __m128i dma_addr0, dma_addr1;
+   /* Rx buffer need to be aligned with 512 byte */
+   const __m128i hba_msk = _mm_set_epi64x(0,
+   UINT64_MAX - FM10K_RX_DATABUF_ALIGN + 1);
+
+   rxdp = rxq->hw_ring + rxq->rxrearm_start;
+
+   /* Pull 'n' more MBUFs into the software ring */
+   if (rte_mempool_get_bulk(rxq->mp,
+(void *)mb_alloc,
+RTE_FM10K_RXQ_REARM_THRESH) < 0) {
+   rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed +=
+   RTE_FM10K_RXQ_REARM_THRESH;
+   return;
+   }
+
+   /* Initialize the mbufs in vector, process 2 mbufs in one loop */
+   for (i = 0; i < RTE_FM10K_RXQ_REARM_THRESH; i += 2, mb_alloc += 2) {
+   __m128i vaddr0, vaddr1;
+   uintptr_t p0, p1;
+
+   mb0 = mb_alloc[0];
+   mb1 = mb_alloc[1];
+
+   /* Flush mbuf with pkt template.
+* Data to be rearmed is 6 bytes long.
+* Though, RX will overwrite ol_flags that are coming next
+* anyway. So overwrite whole 8 bytes with one load:
+* 6 bytes of rearm_data plus first 2 bytes of ol_flags.
+*/
+   p0 = (uintptr_t)&mb0->rearm_data;
+   *(uint64_t *)p0 = rxq->mbuf_initializer;
+   p1 = (uintptr_t)&mb1->rearm_data;
+   *(uint64_t *)p1 = rxq->mbuf_initializer;
+
+   /* load buf_addr(lo 64bit) and buf_physaddr(hi 64bit) */
+   vaddr0 = _mm_loadu_si128((__m128i *)&(mb0->buf_addr));
+   vaddr1 = _mm_loadu_si128((__m128i *)&(mb1->buf_addr));
+
+   /* convert pa to dma_addr hdr/data */
+   dma_addr0 = _mm_unpackhi_epi64(vaddr0, vaddr0);
+   dma_addr1 = _mm_unpackhi_epi64(vaddr1, vaddr1);
+
+   /* add headroom to pa values */
+   dma_addr0 = _mm_add_epi64(dma_addr0, head_off);
+   dma_addr1 = _mm_add_epi64(dma_addr1, head_off);
+
+   /* Do 512 byte alignment to satisfy HW requirement, in the
+* meanwhile, set Header Buffer

[dpdk-dev] [PATCH v3 03/16] fm10k: Add a new func to initialize all parameters

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add new function fm10k_params_init to initialize all fm10k related
variables.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k_ethdev.c |   35 +++
 1 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 3c7784e..363ef98 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -2066,6 +2066,27 @@ static const struct eth_dev_ops fm10k_eth_dev_ops = {
.rss_hash_conf_get  = fm10k_rss_hash_conf_get,
 };

+static void
+fm10k_params_init(struct rte_eth_dev *dev)
+{
+   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct fm10k_dev_info *info = FM10K_DEV_PRIVATE_TO_INFO(dev);
+
+   /* Initialize bus info. Normally we would call fm10k_get_bus_info(), but
+* there is no way to get link status without reading BAR4.  Until this
+* works, assume we have maximum bandwidth.
+* @todo - fix bus info
+*/
+   hw->bus_caps.speed = fm10k_bus_speed_8000;
+   hw->bus_caps.width = fm10k_bus_width_pcie_x8;
+   hw->bus_caps.payload = fm10k_bus_payload_512;
+   hw->bus.speed = fm10k_bus_speed_8000;
+   hw->bus.width = fm10k_bus_width_pcie_x8;
+   hw->bus.payload = fm10k_bus_payload_256;
+
+   info->rx_vec_allowed = true;
+}
+
 static int
 eth_fm10k_dev_init(struct rte_eth_dev *dev)
 {
@@ -2112,18 +2133,8 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
return -EIO;
}

-   /*
-* Inialize bus info. Normally we would call fm10k_get_bus_info(), but
-* there is no way to get link status without reading BAR4.  Until this
-* works, assume we have maximum bandwidth.
-* @todo - fix bus info
-*/
-   hw->bus_caps.speed = fm10k_bus_speed_8000;
-   hw->bus_caps.width = fm10k_bus_width_pcie_x8;
-   hw->bus_caps.payload = fm10k_bus_payload_512;
-   hw->bus.speed = fm10k_bus_speed_8000;
-   hw->bus.width = fm10k_bus_width_pcie_x8;
-   hw->bus.payload = fm10k_bus_payload_256;
+   /* Initialize parameters */
+   fm10k_params_init(dev);

/* Initialize the hw */
diag = fm10k_init_hw(hw);
-- 
1.7.7.6



[dpdk-dev] [PATCH v3 02/16] fm10k: add vPMD pre-condition check for each RX queue

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add a condition check in the rx_queue_setup function. If the number of RX
descriptors can't satisfy the vPMD requirement, record that in a variable.
Otherwise, call fm10k_rxq_vec_setup to initialize Vector RX.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/fm10k.h  |   11 ---
 drivers/net/fm10k/fm10k_ethdev.c   |   11 +++
 drivers/net/fm10k/fm10k_rxtx_vec.c |   21 +
 3 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index c089882..362a2d0 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -135,6 +135,8 @@ struct fm10k_dev_info {
/* Protect the mailbox to avoid race condition */
rte_spinlock_t mbx_lock;
struct fm10k_macvlan_filter_info macvlan;
+   /* Flag to indicate if RX vector conditions satisfied */
+   bool rx_vec_allowed;
 };

 /*
@@ -165,9 +167,10 @@ struct fm10k_rx_queue {
struct rte_mempool *mp;
struct rte_mbuf **sw_ring;
volatile union fm10k_rx_desc *hw_ring;
-   struct rte_mbuf *pkt_first_seg; /**< First segment of current packet. */
-   struct rte_mbuf *pkt_last_seg;  /**< Last segment of current packet. */
+   struct rte_mbuf *pkt_first_seg; /* First segment of current packet. */
+   struct rte_mbuf *pkt_last_seg;  /* Last segment of current packet. */
uint64_t hw_ring_phys_addr;
+   uint64_t mbuf_initializer; /* value to init mbufs */
uint16_t next_dd;
uint16_t next_alloc;
uint16_t next_trigger;
@@ -177,7 +180,7 @@ struct fm10k_rx_queue {
uint16_t queue_id;
uint8_t port_id;
uint8_t drop_en;
-   uint8_t rx_deferred_start; /**< don't start this queue in dev start. */
+   uint8_t rx_deferred_start; /* don't start this queue in dev start. */
 };

 /*
@@ -313,4 +316,6 @@ uint16_t fm10k_recv_scattered_pkts(void *rx_queue,

 uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);
+
+int fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq);
 #endif
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index a69c990..3c7784e 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -1251,6 +1251,7 @@ fm10k_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,
const struct rte_eth_rxconf *conf, struct rte_mempool *mp)
 {
struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct fm10k_dev_info *dev_info = FM10K_DEV_PRIVATE_TO_INFO(dev);
struct fm10k_rx_queue *q;
const struct rte_memzone *mz;

@@ -1333,6 +1334,16 @@ fm10k_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,
q->hw_ring_phys_addr = mz->phys_addr;
 #endif

+   /* Check if number of descs satisfied Vector requirement */
+   if (!rte_is_power_of_2(nb_desc)) {
+   PMD_INIT_LOG(DEBUG, "queue[%d] doesn't meet Vector Rx "
+   "preconditions - canceling the feature for "
+   "the whole port[%d]",
+q->queue_id, q->port_id);
+   dev_info->rx_vec_allowed = false;
+   } else
+   fm10k_rxq_vec_setup(q);
+
dev->data->rx_queues[queue_id] = q;
return 0;
 }
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 69174d9..34b677b 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -43,3 +43,24 @@
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
+
+int __attribute__((cold))
+fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq)
+{
+   uintptr_t p;
+   struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
+
+   mb_def.nb_segs = 1;
+   /* data_off will be adjusted after new mbuf allocated for 512-byte
+* alignment.
+*/
+   mb_def.data_off = RTE_PKTMBUF_HEADROOM;
+   mb_def.port = rxq->port_id;
+   rte_mbuf_refcnt_set(&mb_def, 1);
+
+   /* prevent compiler reordering: rearm_data covers previous fields */
+   rte_compiler_barrier();
+   p = (uintptr_t)&mb_def.rearm_data;
+   rxq->mbuf_initializer = *(uint64_t *)p;
+   return 0;
+}
-- 
1.7.7.6
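
The precondition checked in this patch is that the descriptor count is a power of two. A minimal, self-contained sketch of such a test is below. Both `is_power_of_2` and `ring_wrap` are hypothetical helpers written for illustration, not the DPDK `rte_is_power_of_2` implementation, and rejecting zero is an assumption of this sketch. The wrap helper shows one common reason for wanting a power-of-two ring size; the patch itself does not state its reason.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when n has exactly one bit set. */
static bool
is_power_of_2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* With a power-of-two ring size, wrapping an index is a single AND. */
static uint16_t
ring_wrap(uint16_t idx, uint16_t nb_desc)
{
	return (uint16_t)(idx & (nb_desc - 1));
}
```

A queue configured with 384 descriptors would fail this test, while 512 would pass.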



[dpdk-dev] [PATCH v3 01/16] fm10k: add new vPMD file

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

Add new file fm10k_rxtx_vec.c and add it into compiling.

Signed-off-by: Chen Jing D(Mark) 
---
 drivers/net/fm10k/Makefile |1 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |   45 
 2 files changed, 46 insertions(+), 0 deletions(-)
 create mode 100644 drivers/net/fm10k/fm10k_rxtx_vec.c

diff --git a/drivers/net/fm10k/Makefile b/drivers/net/fm10k/Makefile
index a4a8f56..06ebf83 100644
--- a/drivers/net/fm10k/Makefile
+++ b/drivers/net/fm10k/Makefile
@@ -93,6 +93,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k_common.c
 SRCS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k_mbx.c
 SRCS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k_vf.c
 SRCS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k_api.c
+SRCS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += fm10k_rxtx_vec.c

 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += lib/librte_eal lib/librte_ether
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
new file mode 100644
index 000..69174d9
--- /dev/null
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -0,0 +1,45 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2013-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include 
+#include 
+#include "fm10k.h"
+#include "base/fm10k_type.h"
+
+#include 
+
+#ifndef __INTEL_COMPILER
+#pragma GCC diagnostic ignored "-Wcast-qual"
+#endif
-- 
1.7.7.6



[dpdk-dev] [PATCH v3 00/16] Vector Rx/Tx PMD implementation for fm10k

2015-10-27 Thread Chen Jing D(Mark)
From: "Chen Jing D(Mark)" 

v3:
 - Add a blank line after variable definition.
 - Floor-align the passed-in nb_pkts argument to avoid overwriting memory.
 - Only scan a maximum of 32 descriptors in the scatter Rx function to avoid
   overwriting memory.

v2:
 - Fix a typo issue.
 - Fix an improper prefetch in vector RX function, in which prefetches
   un-initialized mbuf.
 - Remove limitation on number of desc pointer in vector RX function.
 - Re-organize some comments.
 - Add a new patch to fix a crash issue in vector RX func.
 - Add a new patch to update release notes.

v1:
This patch set includes Vector Rx/Tx functions to receive/transmit packets
for fm10k devices. It also contains logic to do sanity check for proper
RX/TX function selections.

Chen Jing D(Mark) (16):
  fm10k: add new vPMD file
  fm10k: add vPMD pre-condition check for each RX queue
  fm10k: Add a new func to initialize all parameters
  fm10k: add func to re-allocate mbuf for RX ring
  fm10k: add 2 functions to parse pkt_type and offload flag
  fm10k: add Vector RX function
  fm10k: add func to do Vector RX condition check
  fm10k: add Vector RX scatter function
  fm10k: add function to decide best RX function
  fm10k: add func to release mbuf in case Vector RX applied
  fm10k: add Vector TX function
  fm10k: use func pointer to reset TX queue and mbuf release
  fm10k: introduce 2 funcs to reset TX queue and mbuf release
  fm10k: Add function to decide best TX func
  fm10k: fix a crash issue in vector RX func
  doc: release notes update for fm10k Vector PMD

 doc/guides/rel_notes/release_2_2.rst |5 +
 drivers/net/fm10k/Makefile   |1 +
 drivers/net/fm10k/fm10k.h|   45 ++-
 drivers/net/fm10k/fm10k_ethdev.c |  169 ++-
 drivers/net/fm10k/fm10k_rxtx_vec.c   |  839 ++
 5 files changed, 1031 insertions(+), 28 deletions(-)
 create mode 100644 drivers/net/fm10k/fm10k_rxtx_vec.c

-- 
1.7.7.6
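
The v3 note about floor-aligning nb_pkts can be illustrated with a small sketch. It assumes the vector loop handles four descriptors per iteration (RTE_FM10K_DESCS_PER_LOOP is defined as 4 in patch 04/16); `DESCS_PER_LOOP` and `floor_align_burst` are hypothetical names for this example, not code from the series.

```c
#include <stdint.h>

#define DESCS_PER_LOOP 4  /* assumed to mirror RTE_FM10K_DESCS_PER_LOOP */

/* Round nb_pkts down to a multiple of DESCS_PER_LOOP (a power of two),
 * so a loop that consumes DESCS_PER_LOOP entries per iteration never
 * writes past the caller's array.
 */
static uint16_t
floor_align_burst(uint16_t nb_pkts)
{
	return (uint16_t)(nb_pkts & ~(DESCS_PER_LOOP - 1));
}
```

With this rounding, a request for 7 packets is served as a burst of at most 4, and a request for fewer than 4 yields 0.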



[dpdk-dev] [PATCH] lib/librte_ether: Prevent link status race condition when LSI enabled

2015-10-27 Thread Tim Shearer
Calling the Ethernet driver's link_update function from rte_eth_dev_start can 
result in a race condition if the NIC raises the link interrupt at the same 
time. Depending on the interrupt handler implementation, the race can cause the 
handler to think that it received two consecutive link up interrupts, and it 
exits without calling the user callback. This appears to impact the E1000/IGB 
and virtio drivers only.

Signed-off-by: Tim Shearer 
---
 lib/librte_ether/rte_ethdev.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index f593f6e..a821a1f 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1300,7 +1300,7 @@ rte_eth_dev_start(uint8_t port_id)

rte_eth_dev_config_restore(port_id);

-   if (dev->data->dev_conf.intr_conf.lsc != 0) {
+   if (dev->data->dev_conf.intr_conf.lsc == 0) {
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->link_update, -ENOTSUP);
(*dev->dev_ops->link_update)(dev, 0);
}
-- 
1.7.2.3
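
The intent of the one-line change can be modelled in isolation. The sketch below is a simplified toy, not the DPDK structures: `struct toy_dev` and `toy_dev_start` are hypothetical names, and the counter merely stands in for the call to the driver's link_update function.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the device state. */
struct toy_dev {
	bool lsc_intr_enabled;   /* mirrors dev_conf.intr_conf.lsc != 0 */
	int link_update_calls;   /* counts synchronous link_update calls */
};

/* Poll the link at start only when no LSC interrupt will report it,
 * so the start path and the interrupt handler never both update the
 * link state.
 */
static void
toy_dev_start(struct toy_dev *dev)
{
	if (!dev->lsc_intr_enabled)
		dev->link_update_calls++;   /* stands in for link_update(dev, 0) */
}
```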



[dpdk-dev] [PATCH v8 8/8] doc: update release note for vhost-user mq support

2015-10-27 Thread Yuanhan Liu
On Mon, Oct 26, 2015 at 09:22:10PM +0100, Thomas Monjalon wrote:
> 2015-10-22 20:35, Yuanhan Liu:
> > +* **vhost: added vhost-user mulitple queue support.**
> > +
> > +  Added vhost-user multiple queue support.
> 
> Except for the typo, it is the same sentence twice, so it is not needed.

Good to know; I thought it was a must, like a required style.
> 
> General comment to all contributors: please avoid making a special commit
> to just update the release notes.
> There is no log message and it is understandable because it does not
> deserve a commit.

Got it.

> It will be merged with the previous one here.

And thank you!

--yliu


[dpdk-dev] [PATCH] ixgbe: change logging for ixgbe tx code path selection

2015-10-27 Thread Traynor, Kevin

> -Original Message-
> From: Ananyev, Konstantin
> Sent: Tuesday, October 27, 2015 12:13 PM
> To: Richardson, Bruce; Traynor, Kevin
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] ixgbe: change logging for ixgbe tx code path
> selection
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Tuesday, October 27, 2015 11:50 AM
> > To: Traynor, Kevin
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] ixgbe: change logging for ixgbe tx code
> path selection
> >
> > On Tue, Oct 27, 2015 at 11:41:08AM +, Kevin Traynor wrote:
> > > Simple and vector are different tx code paths. If vector
> > > is selected, change logging from:
> > > PMD: ixgbe_set_tx_function(): Using simple tx code path
> > > PMD: ixgbe_set_tx_function(): Vector tx enabled.
> > >
> > > to:
> > > PMD: ixgbe_set_tx_function(): Using vector tx code path
> > >
> > > or, if simple selected:
> > > PMD: ixgbe_set_tx_function(): Using simple tx code path
> > >
> > > The dangling else in the #ifdef makes readability difficult,
> > > so resolving in way that seems most readable.
> > >
> > > Signed-off-by: Kevin Traynor 
> > > ---
> > >  drivers/net/ixgbe/ixgbe_rxtx.c |8 +---
> > >  1 files changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > index a598a72..11d7feb 100644
> > > --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > @@ -1963,16 +1963,18 @@ ixgbe_set_tx_function(struct rte_eth_dev *dev,
> struct ixgbe_tx_queue *txq)
> > >   /* Use a simple Tx queue (no offloads, no multi segs) if possible */
> > >   if (((txq->txq_flags & IXGBE_SIMPLE_FLAGS) == IXGBE_SIMPLE_FLAGS)
> > >   && (txq->tx_rs_thresh >= RTE_PMD_IXGBE_TX_MAX_BURST)) {
> > > - PMD_INIT_LOG(DEBUG, "Using simple tx code path");
> > >  #ifdef RTE_IXGBE_INC_VECTOR
> > >   if (txq->tx_rs_thresh <= RTE_IXGBE_TX_MAX_FREE_BUF_SZ &&
> > >   (rte_eal_process_type() != RTE_PROC_PRIMARY ||
> > >   ixgbe_txq_vec_setup(txq) == 0)) {
> > > - PMD_INIT_LOG(DEBUG, "Vector tx enabled.");
> > > + PMD_INIT_LOG(DEBUG, "Using vector tx code path");
> > >   dev->tx_pkt_burst = ixgbe_xmit_pkts_vec;
> > >   } else
> > >  #endif
> > > - dev->tx_pkt_burst = ixgbe_xmit_pkts_simple;
> > > + {
> > > + PMD_INIT_LOG(DEBUG, "Using simple tx code path");
> > > + dev->tx_pkt_burst = ixgbe_xmit_pkts_simple;
> > > + }
> > >   } else {
> > >   PMD_INIT_LOG(DEBUG, "Using full-featured tx code path");
> > >   PMD_INIT_LOG(DEBUG,
> > > --
> > > 1.7.4.1
> > >
> > Hi Kevin,
> >
> > can I suggest a slight alternative here that might help make things easier.
> > Instead of printing the message as we pick the code path, why not have a
> "logmsg"
> > pointer variable that is assigned in the code, and then print out the log
> path
> > at the end.
> >
> > This would have a number of advantages:
> > 1. there are no issues with changing our mind, so we can assign one path
> type,
> > and then later change it to something different without cluttering up the
> debug
> > output with the history of our code's flow.
> > 2. it means that you don't have a problem with smaller else legs as you can
> > easily do multiple assignments in the one line using a comma as:
> > dev->tx_pkt_burst = ixgbe_xmit_pkts_simple, logmsg = "Using simple
> ...";
> 
> While I like the approach with logmsg, please avoid commas here.
> It will make this piece of code even harder to read (at least for me).
> Konstantin

Yeah, sure. I agree with the change for point 1. I also agree with Konstantin
regarding commas. The code under the dangling else is aligned correctly or
incorrectly depending on whether the #ifdef is true, so I think adding multiple
statements with {} now will make it obvious for the next person who modifies it.

> 
> >
> > Regards,
> > /Bruce
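
The "logmsg" idea discussed in this thread, with braces instead of comma expressions, could look roughly like the sketch below. It is a standalone illustration, not the ixgbe code: `burst_fn`, the `xmit_*` stubs and `select_tx_path` are hypothetical, and the two boolean parameters stand in for the real threshold and flag checks.

```c
#include <stdbool.h>

typedef int (*burst_fn)(void);  /* placeholder for a tx burst function type */

static int xmit_simple(void) { return 1; }
static int xmit_vec(void)    { return 2; }
static int xmit_full(void)   { return 3; }

/* Pick a tx path and return the one message to log after the choice
 * is final, so the debug output never shows an abandoned selection.
 */
static const char *
select_tx_path(bool simple_ok, bool vec_ok, burst_fn *out)
{
	const char *logmsg;

	if (simple_ok) {
		if (vec_ok) {
			*out = xmit_vec;
			logmsg = "Using vector tx code path";
		} else {
			*out = xmit_simple;
			logmsg = "Using simple tx code path";
		}
	} else {
		*out = xmit_full;
		logmsg = "Using full-featured tx code path";
	}
	return logmsg;
}
```

The caller would print the returned string once, for example through PMD_INIT_LOG, after the function pointer has been assigned.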


[dpdk-dev] [PATCH v8 3/8] vhost: vring queue setup for multiple queue support

2015-10-27 Thread Yuanhan Liu
On Tue, Oct 27, 2015 at 11:17:16AM +0200, Michael S. Tsirkin wrote:
> On Tue, Oct 27, 2015 at 03:20:40PM +0900, Tetsuya Mukawa wrote:
> > On 2015/10/26 14:42, Yuanhan Liu wrote:
> > > On Mon, Oct 26, 2015 at 02:24:08PM +0900, Tetsuya Mukawa wrote:
> > >> On 2015/10/22 21:35, Yuanhan Liu wrote:
> > > ...
> > >>> @@ -292,13 +300,13 @@ user_get_vring_base(struct vhost_device_ctx ctx,
> > >>>  * sent and only sent in vhost_vring_stop.
> > >>>  * TODO: cleanup the vring, it isn't usable since here.
> > >>>  */
> > >>> -   if ((dev->virtqueue[VIRTIO_RXQ]->kickfd) >= 0) {
> > >>> -   close(dev->virtqueue[VIRTIO_RXQ]->kickfd);
> > >>> -   dev->virtqueue[VIRTIO_RXQ]->kickfd = -1;
> > >>> +   if ((dev->virtqueue[state->index]->kickfd + VIRTIO_RXQ) >= 0) {
> > >>> +   close(dev->virtqueue[state->index + 
> > >>> VIRTIO_RXQ]->kickfd);
> > >>> +   dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
> > >>> }
> > >> Hi Yuanhan,
> > >>
> > >> Please let me make sure whether below is correct.
> > >> if ((dev->virtqueue[state->index]->kickfd + VIRTIO_RXQ) >= 0) {
> > >>
> > >>> -   if ((dev->virtqueue[VIRTIO_TXQ]->kickfd) >= 0) {
> > >>> -   close(dev->virtqueue[VIRTIO_TXQ]->kickfd);
> > >>> -   dev->virtqueue[VIRTIO_TXQ]->kickfd = -1;
> > >>> +   if ((dev->virtqueue[state->index]->kickfd + VIRTIO_TXQ) >= 0) {
> > >>> +   close(dev->virtqueue[state->index + 
> > >>> VIRTIO_TXQ]->kickfd);
> > >>> +   dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
> > >> Also, same question here.
> > > Oops, silly typos... Thanks for catching it out!
> > >
> > > Here is an update patch (Thomas, please let me know if you prefer me
> > > to send the whole patchset for you to apply):
> > 
> > Hi Yuanhan,
> > 
> > I may miss one more issue here.
> > Could you please see below patch I've submitted today?
> > (I may find a similar issue, so I've fixed it also in below patch.)
> >  
> > - http://dpdk.org/dev/patchwork/patch/8038/
> >  
> > Thanks,
> > Tetsuya
> 
> Looking at that, at least when MQ is enabled, please don't key
> stopping queues off GET_VRING_BASE.

Yes, that's only a workaround. I guess it has been there for quite a
while, maybe from the time when qemu didn't send the RESET_OWNER message.

> There are ENABLE/DISABLE messages for that.

That's something new. Though I plan to use them instead, we still
need to make sure our code works with old qemu versions that lack the
ENABLE/DISABLE messages.

And I will think more while enabling live migration: I should have
more time to address issues like this at that time.

> Generally guys, don't take whatever QEMU happens to do for
> granted! Look at the protocol spec under doc/specs directory,
> if you are making more assumptions you must document them!

Indeed. And we will try to address them bit by bit in future.

--yliu
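
The typo Tetsuya spotted comes down to where the queue offset is added: inside the array index, not to the fd value. A reduced sketch of the intended form follows. `struct toy_vq` and `close_kickfd` are hypothetical, and the values 0 and 1 for VIRTIO_RXQ and VIRTIO_TXQ are assumptions of this example. Under that assumption the mistyped test `kickfd + VIRTIO_TXQ >= 0` would be true even for an already closed fd of -1.

```c
#include <unistd.h>

#define VIRTIO_RXQ 0   /* assumed value for this sketch */
#define VIRTIO_TXQ 1   /* assumed value for this sketch */

struct toy_vq { int kickfd; };   /* hypothetical, reduced virtqueue */

/* Close the kick fd of one queue in the pair starting at base_idx.
 * Returns 1 if an fd was closed, 0 if there was nothing to close.
 */
static int
close_kickfd(struct toy_vq **vqs, unsigned int base_idx, unsigned int which)
{
	struct toy_vq *vq = vqs[base_idx + which];  /* offset goes inside [] */

	if (vq->kickfd < 0)
		return 0;

	close(vq->kickfd);
	vq->kickfd = -1;
	return 1;
}
```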


[dpdk-dev] [PATCH v3 4/4] doc: update release note for fm10k VMDQ support

2015-10-27 Thread Shaopeng He
Signed-off-by: Shaopeng He 
---
 doc/guides/rel_notes/release_2_2.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index 5687676..278149f 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -4,6 +4,11 @@ DPDK Release 2.2
 New Features
 

+* **fm10k: Added VMDQ support.**
+
+  Added functions to configure VMDQ mode and add MAC address for each VMDQ
+  queue pool. Also included logic to do sanity check for multi-queue settings.
+

 Resolved Issues
 ---
-- 
1.9.3



[dpdk-dev] [PATCH v3 3/4] fm10k: add VMDQ support in multi-queue configure

2015-10-27 Thread Shaopeng He
Add separate functions to configure VMDQ and RSS.
Update dglort map and logic ports accordingly.
Reset MAC/VLAN filter after VMDQ config was changed.

Signed-off-by: Shaopeng He 
---
 drivers/net/fm10k/fm10k_ethdev.c | 164 +--
 1 file changed, 141 insertions(+), 23 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index cf48cd5..8c31391 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -337,8 +337,41 @@ fm10k_dev_configure(struct rte_eth_dev *dev)
return 0;
 }

+/* fls = find last set bit = 32 minus the number of leading zeros */
+#ifndef fls
+#define fls(x) (((x) == 0) ? 0 : (32 - __builtin_clz((x))))
+#endif
+
 static void
-fm10k_dev_mq_rx_configure(struct rte_eth_dev *dev)
+fm10k_dev_vmdq_rx_configure(struct rte_eth_dev *dev)
+{
+   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct rte_eth_vmdq_rx_conf *vmdq_conf;
+   uint32_t i;
+
+   vmdq_conf = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
+
+   for (i = 0; i < vmdq_conf->nb_pool_maps; i++) {
+   if (!vmdq_conf->pool_map[i].pools)
+   continue;
+   fm10k_mbx_lock(hw);
+   fm10k_update_vlan(hw, vmdq_conf->pool_map[i].vlan_id, 0, true);
+   fm10k_mbx_unlock(hw);
+   }
+}
+
+static void
+fm10k_dev_pf_main_vsi_reset(struct rte_eth_dev *dev)
+{
+   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   /* Add default mac address */
+   fm10k_MAC_filter_set(dev, hw->mac.addr, true,
+   MAIN_VSI_POOL_NUMBER);
+}
+
+static void
+fm10k_dev_rss_configure(struct rte_eth_dev *dev)
 {
struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct rte_eth_conf *dev_conf = &dev->data->dev_conf;
@@ -409,6 +442,78 @@ fm10k_dev_mq_rx_configure(struct rte_eth_dev *dev)
FM10K_WRITE_REG(hw, FM10K_MRQC(0), mrqc);
 }

+static void
+fm10k_dev_logic_port_update(struct rte_eth_dev *dev,
+   uint16_t nb_lport_old, uint16_t nb_lport_new)
+{
+   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t i;
+
+   fm10k_mbx_lock(hw);
+   /* Disable previous logic ports */
+   if (nb_lport_old)
+   hw->mac.ops.update_lport_state(hw, hw->mac.dglort_map,
+   nb_lport_old, false);
+   /* Enable new logic ports */
+   hw->mac.ops.update_lport_state(hw, hw->mac.dglort_map,
+   nb_lport_new, true);
+   fm10k_mbx_unlock(hw);
+
+   for (i = 0; i < nb_lport_new; i++) {
+   /* Set unicast mode by default. App can change
+* to other mode in other API func.
+*/
+   fm10k_mbx_lock(hw);
+   hw->mac.ops.update_xcast_mode(hw, hw->mac.dglort_map + i,
+   FM10K_XCAST_MODE_NONE);
+   fm10k_mbx_unlock(hw);
+   }
+}
+
+static void
+fm10k_dev_mq_rx_configure(struct rte_eth_dev *dev)
+{
+   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct rte_eth_vmdq_rx_conf *vmdq_conf;
+   struct rte_eth_conf *dev_conf = &dev->data->dev_conf;
+   struct fm10k_macvlan_filter_info *macvlan;
+   uint16_t nb_queue_pools = 0; /* pool number in configuration */
+   uint16_t nb_lport_new, nb_lport_old;
+
+   macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data->dev_private);
+   vmdq_conf = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
+
+   fm10k_dev_rss_configure(dev);
+
+   /* only PF supports VMDQ */
+   if (hw->mac.type != fm10k_mac_pf)
+   return;
+
+   if (dev_conf->rxmode.mq_mode & ETH_MQ_RX_VMDQ_FLAG)
+   nb_queue_pools = vmdq_conf->nb_queue_pools;
+
+   /* no pool number change, no need to update logic port and VLAN/MAC */
+   if (macvlan->nb_queue_pools == nb_queue_pools)
+   return;
+
+   nb_lport_old = macvlan->nb_queue_pools ? macvlan->nb_queue_pools : 1;
+   nb_lport_new = nb_queue_pools ? nb_queue_pools : 1;
+   fm10k_dev_logic_port_update(dev, nb_lport_old, nb_lport_new);
+
+   /* reset MAC/VLAN as it's based on VMDQ or PF main VSI */
+   memset(dev->data->mac_addrs, 0,
+   ETHER_ADDR_LEN * FM10K_MAX_MACADDR_NUM);
+   ether_addr_copy((const struct ether_addr *)hw->mac.addr,
+   &dev->data->mac_addrs[0]);
+   memset(macvlan, 0, sizeof(*macvlan));
+   macvlan->nb_queue_pools = nb_queue_pools;
+
+   if (nb_queue_pools)
+   fm10k_dev_vmdq_rx_configure(dev);
+   else
+   fm10k_dev_pf_main_vsi_reset(dev);
+}
+
 static int
 fm10k_dev_tx_init(struct rte_eth_dev *dev)
 {
@@ -517,7 +622,7 @@ fm10k_dev_rx_init(struct rte_eth_dev *dev)
FM10K_WRITE_FLUSH(hw);
}

-   /* Configure RSS if applicable */
+   /* Configure VMDQ/RSS if applicable */
   

[dpdk-dev] [PATCH v3 2/4] fm10k: add VMDQ support in MAC/VLAN filter

2015-10-27 Thread Shaopeng He
This patch does the following for the fm10k MAC/VLAN filter:
- Add separate functions for VMDQ and main VSI to change
  MAC filter.
- Disable modification to VLAN filter in VMDQ mode.
- In device close phase, delete logic ports to remove all
  MAC/VLAN filters belonging to those ports.

Signed-off-by: Shaopeng He 
---
 drivers/net/fm10k/fm10k.h|   3 +
 drivers/net/fm10k/fm10k_ethdev.c | 150 +--
 2 files changed, 99 insertions(+), 54 deletions(-)

diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
index c089882..439e95f 100644
--- a/drivers/net/fm10k/fm10k.h
+++ b/drivers/net/fm10k/fm10k.h
@@ -126,6 +126,9 @@
 struct fm10k_macvlan_filter_info {
uint16_t vlan_num;   /* Total VLAN number */
uint16_t mac_num;/* Total mac number */
+   uint16_t nb_queue_pools; /* Active queue pools number */
+   /* VMDQ ID for each MAC address */
+   uint8_t  mac_vmdq_id[FM10K_MAX_MACADDR_NUM];
uint32_t vfta[FM10K_VFTA_SIZE];/* VLAN bitmap */
 };

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 082937d..cf48cd5 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -45,6 +45,8 @@
 #define FM10K_MBXLOCK_DELAY_US 20
 #define UINT64_LOWER_32BITS_MASK 0x00000000ffffffffULL

+#define MAIN_VSI_POOL_NUMBER 0
+
 /* Max try times to acquire switch status */
 #define MAX_QUERY_SWITCH_STATE_TIMES 10
 /* Wait interval to get switch status */
@@ -61,10 +63,8 @@ static void fm10k_dev_allmulticast_disable(struct 
rte_eth_dev *dev);
 static inline int fm10k_glort_valid(struct fm10k_hw *hw);
 static int
 fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on);
-static void
-fm10k_MAC_filter_set(struct rte_eth_dev *dev, const u8 *mac, bool add);
-static void
-fm10k_MACVLAN_remove_all(struct rte_eth_dev *dev);
+static void fm10k_MAC_filter_set(struct rte_eth_dev *dev,
+   const u8 *mac, bool add, uint32_t pool);
 static void fm10k_tx_queue_release(void *queue);
 static void fm10k_rx_queue_release(void *queue);

@@ -883,10 +883,17 @@ static void
 fm10k_dev_close(struct rte_eth_dev *dev)
 {
struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint16_t nb_lport;
+   struct fm10k_macvlan_filter_info *macvlan;

PMD_INIT_FUNC_TRACE();

-   fm10k_MACVLAN_remove_all(dev);
+   macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data->dev_private);
+   nb_lport = macvlan->nb_queue_pools ? macvlan->nb_queue_pools : 1;
+   fm10k_mbx_lock(hw);
+   hw->mac.ops.update_lport_state(hw, hw->mac.dglort_map,
+   nb_lport, false);
+   fm10k_mbx_unlock(hw);

/* Stop mailbox service first */
fm10k_close_mbx_service(hw);
@@ -1023,6 +1030,11 @@ fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t 
vlan_id, int on)
hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data->dev_private);

+   if (macvlan->nb_queue_pools > 0) { /* VMDQ mode */
+   PMD_INIT_LOG(ERR, "Cannot change VLAN filter in VMDQ mode");
+   return (-EINVAL);
+   }
+
if (vlan_id > ETH_VLAN_ID_MAX) {
PMD_INIT_LOG(ERR, "Invalid vlan_id: must be < 4096");
return (-EINVAL);
@@ -1100,38 +1112,80 @@ fm10k_vlan_offload_set(__rte_unused struct rte_eth_dev 
*dev, int mask)
}
 }

-/* Add/Remove a MAC address, and update filters */
-static void
-fm10k_MAC_filter_set(struct rte_eth_dev *dev, const u8 *mac, bool add)
+/* Add/Remove a MAC address, and update filters to main VSI */
+static void fm10k_MAC_filter_set_main_vsi(struct rte_eth_dev *dev,
+   const u8 *mac, bool add, uint32_t pool)
 {
-   uint32_t i, j, k;
-   struct fm10k_hw *hw;
+   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct fm10k_macvlan_filter_info *macvlan;
+   uint32_t i, j, k;

-   hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data->dev_private);

-   i = 0;
-   for (j = 0; j < FM10K_VFTA_SIZE; j++) {
-   if (macvlan->vfta[j]) {
-   for (k = 0; k < FM10K_UINT32_BIT_SIZE; k++) {
-   if (macvlan->vfta[j] & (1 << k)) {
-   if (i + 1 > macvlan->vlan_num) {
-   PMD_INIT_LOG(ERR, "vlan number "
-   "not match");
-   return;
-   }
-   fm10k_mbx_lock(hw);
-   fm10k_update_uc_addr(hw,
-   hw->mac.dglort_map, mac,
-   j * FM10K_UINT32_BIT_SIZE + k,
-

[dpdk-dev] [PATCH v3 1/4] fm10k: add multi-queue checking

2015-10-27 Thread Shaopeng He
Add multi-queue checking in device configure function.
Currently, VMDQ and RSS are supported.

Signed-off-by: Shaopeng He 
---
 drivers/net/fm10k/fm10k_ethdev.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index a69c990..082937d 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -283,12 +283,56 @@ tx_queue_disable(struct fm10k_hw *hw, uint16_t qnum)
 }

 static int
+fm10k_check_mq_mode(struct rte_eth_dev *dev)
+{
+   enum rte_eth_rx_mq_mode rx_mq_mode = dev->data->dev_conf.rxmode.mq_mode;
+   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct rte_eth_vmdq_rx_conf *vmdq_conf;
+   uint16_t nb_rx_q = dev->data->nb_rx_queues;
+
+   vmdq_conf = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
+
+   if (rx_mq_mode & ETH_MQ_RX_DCB_FLAG) {
+   PMD_INIT_LOG(ERR, "DCB mode is not supported.");
+   return -EINVAL;
+   }
+
+   if (!(rx_mq_mode & ETH_MQ_RX_VMDQ_FLAG))
+   return 0;
+
+   if (hw->mac.type == fm10k_mac_vf) {
+   PMD_INIT_LOG(ERR, "VMDQ mode is not supported in VF.");
+   return -EINVAL;
+   }
+
+   /* Check VMDQ queue pool number */
+   if (vmdq_conf->nb_queue_pools >
+   sizeof(vmdq_conf->pool_map[0].pools) * CHAR_BIT ||
+   vmdq_conf->nb_queue_pools > nb_rx_q) {
+   PMD_INIT_LOG(ERR, "Too many queue pools: %d",
+   vmdq_conf->nb_queue_pools);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int
 fm10k_dev_configure(struct rte_eth_dev *dev)
 {
+   int ret;
+
PMD_INIT_FUNC_TRACE();

if (dev->data->dev_conf.rxmode.hw_strip_crc == 0)
PMD_INIT_LOG(WARNING, "fm10k always strip CRC");
+   /* multiple queue mode checking */
+   ret = fm10k_check_mq_mode(dev);
+   if (ret != 0) {
+   PMD_DRV_LOG(ERR, "fm10k_check_mq_mode fails with %d.",
+   ret);
+   return ret;
+   }

return 0;
 }
-- 
1.9.3
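
The pool-count check in fm10k_check_mq_mode() above can be tried in
isolation. The following is only a minimal sketch, not the driver code:
struct pool_map_entry and check_pool_count() are hypothetical names, and
the 64-bit width of the pools bitmap is an assumption made for this
illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of vmdq_conf->pool_map[]. */
struct pool_map_entry {
	uint16_t vlan_id;
	uint64_t pools;	/* one bit per queue pool (assumed 64 bits wide) */
};

/*
 * Return 0 if the requested number of pools is acceptable, -EINVAL
 * otherwise. The pool count is rejected when it exceeds the number of
 * bits in the pools bitmap or the number of configured RX queues.
 */
static int
check_pool_count(unsigned int nb_queue_pools, unsigned int nb_rx_q)
{
	/* sizeof does not evaluate its operand, so the null pointer is safe. */
	const unsigned int max_pools =
		sizeof(((struct pool_map_entry *)0)->pools) * CHAR_BIT;

	if (nb_queue_pools > max_pools || nb_queue_pools > nb_rx_q)
		return -EINVAL;

	return 0;
}
```

Under these assumptions, 64 pools on 64 RX queues pass, while 65 pools
on 128 queues or 8 pools on 4 queues are rejected.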



[dpdk-dev] [PATCH v3 0/4] fm10k: add VMDQ support

2015-10-27 Thread Shaopeng He
This patch series adds VMDQ support for fm10k.
It includes the functions to configure VMDQ mode and
add MAC address for each VMDQ queue pool.
It also includes logic to do sanity check for
multi-queue settings.

Changes in v3:
- Keep device default MAC address even in VMDQ mode after
  queue pool config was changed, because some applications
  (e.g. vmdq_app) always need a valid MAC address there.

Changes in v2:
- Reword some comments and commit messages
- Updated release note

Shaopeng He (4):
  fm10k: add multi-queue checking
  fm10k: add VMDQ support in MAC/VLAN filter
  fm10k: add VMDQ support in multi-queue configure
  doc: update release note for fm10k VMDQ support

 doc/guides/rel_notes/release_2_2.rst |   5 +
 drivers/net/fm10k/fm10k.h|   3 +
 drivers/net/fm10k/fm10k_ethdev.c | 358 +++
 3 files changed, 289 insertions(+), 77 deletions(-)

-- 
1.9.3



[dpdk-dev] [PATCH v3 1/1] vmxnet3: add PCI Port Hotplug support

2015-10-27 Thread Bernard Iremonger
Signed-off-by: Bernard Iremonger 
---
 doc/guides/rel_notes/release_2_2.rst |  2 ++
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 33 +
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index 455b5a2..a26c5b8 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -13,6 +13,8 @@ New Features

   * This change required modifications to librte_ether and all vdev and pdev 
PMD's.

+* **Added port hotplug support to the vmxnet3 PMD.**
+
 Resolved Issues
 ---

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c 
b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 2beee3e..5cd708e 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -70,6 +70,7 @@
 #define PROCESS_SYS_EVENTS 0

 static int eth_vmxnet3_dev_init(struct rte_eth_dev *eth_dev);
+static int eth_vmxnet3_dev_uninit(struct rte_eth_dev *eth_dev);
 static int vmxnet3_dev_configure(struct rte_eth_dev *dev);
 static int vmxnet3_dev_start(struct rte_eth_dev *dev);
 static void vmxnet3_dev_stop(struct rte_eth_dev *dev);
@@ -296,13 +297,37 @@ eth_vmxnet3_dev_init(struct rte_eth_dev *eth_dev)
return 0;
 }

+static int
+eth_vmxnet3_dev_uninit(struct rte_eth_dev *eth_dev)
+{
+   struct vmxnet3_hw *hw = eth_dev->data->dev_private;
+
+   PMD_INIT_FUNC_TRACE();
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
+   if (hw->adapter_stopped == 0)
+   vmxnet3_dev_close(eth_dev);
+
+   eth_dev->dev_ops = NULL;
+   eth_dev->rx_pkt_burst = NULL;
+   eth_dev->tx_pkt_burst = NULL;
+
+   rte_free(eth_dev->data->mac_addrs);
+   eth_dev->data->mac_addrs = NULL;
+
+   return 0;
+}
+
 static struct eth_driver rte_vmxnet3_pmd = {
.pci_drv = {
.name = "rte_vmxnet3_pmd",
.id_table = pci_id_vmxnet3_map,
-   .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
},
.eth_dev_init = eth_vmxnet3_dev_init,
+   .eth_dev_uninit = eth_vmxnet3_dev_uninit,
.dev_private_size = sizeof(struct vmxnet3_hw),
 };

@@ -581,7 +606,7 @@ vmxnet3_dev_stop(struct rte_eth_dev *dev)

PMD_INIT_FUNC_TRACE();

-   if (hw->adapter_stopped == TRUE) {
+   if (hw->adapter_stopped == 1) {
PMD_INIT_LOG(DEBUG, "Device already closed.");
return;
}
@@ -597,7 +622,7 @@ vmxnet3_dev_stop(struct rte_eth_dev *dev)
/* reset the device */
VMXNET3_WRITE_BAR1_REG(hw, VMXNET3_REG_CMD, VMXNET3_CMD_RESET_DEV);
PMD_INIT_LOG(DEBUG, "Device reset.");
-   hw->adapter_stopped = FALSE;
+   hw->adapter_stopped = 0;

vmxnet3_dev_clear_queues(dev);

@@ -617,7 +642,7 @@ vmxnet3_dev_close(struct rte_eth_dev *dev)
PMD_INIT_FUNC_TRACE();

vmxnet3_dev_stop(dev);
-   hw->adapter_stopped = TRUE;
+   hw->adapter_stopped = 1;
 }

 static void
-- 
1.9.1



[dpdk-dev] [PATCH v3 0/1] vmxnet3 hotplug support

2015-10-27 Thread Bernard Iremonger
Add PCI Port Hotplug support to the vmxnet3 PMD.

This patch depends on v5 of the following patch set:

remove-pci-driver-from-vdevs.patch

Changes in v3:
Rebase.

Changes in v2:
Update release notes.

Bernard Iremonger (1):
  vmxnet3: add PCI Port Hotplug support

 doc/guides/rel_notes/release_2_2.rst |  2 ++
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 33 +
 2 files changed, 31 insertions(+), 4 deletions(-)

-- 
1.9.1



[dpdk-dev] [PATCH v3] examples/vmdq: Fix the core dump issue when mem_pool is more than 34

2015-10-27 Thread Xutao Sun
The macro MAX_QUEUES was defined as 128, which in theory allows only
16 vmdq_pools. Running vmdq_app with more than 34 vmdq_pools causes
a core dump.
Changing MAX_QUEUES to 1024 solves this issue.

Signed-off-by: Xutao Sun 
---
v2:
 - Rectify the NUM_MBUFS_PER_PORT since MAX_QUEUES has been changed

v3:
 - Change the comments above the relevant code.

 examples/vmdq/main.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/examples/vmdq/main.c b/examples/vmdq/main.c
index a142d49..178af2f 100644
--- a/examples/vmdq/main.c
+++ b/examples/vmdq/main.c
@@ -69,12 +69,13 @@
 #include 
 #include 

-#define MAX_QUEUES 128
+#define MAX_QUEUES 1024
 /*
- * For 10 GbE, 128 queues require roughly
- * 128*512 (RX/TX_queue_nb * RX/TX_ring_descriptors_nb) per port.
+ * 1024 queues are required to meet the needs of a large number of vmdq_pools.
+ * (RX/TX_queue_nb * RX/TX_ring_descriptors_nb) per port.
  */
-#define NUM_MBUFS_PER_PORT (128*512)
+#define NUM_MBUFS_PER_PORT (MAX_QUEUES * RTE_MAX(RTE_TEST_RX_DESC_DEFAULT, \
+   RTE_TEST_TX_DESC_DEFAULT))
 #define MBUF_CACHE_SIZE 64

 #define MAX_PKT_BURST 32
-- 
1.9.3
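
The sizing rule behind the new NUM_MBUFS_PER_PORT can be checked with a
few lines of C. This is a sketch under stated assumptions:
mbufs_per_port() and MAX_OF() are hypothetical helpers, and the
descriptor defaults of 128 (RX) and 512 (TX) are assumed values chosen
so that the old constant 128*512 is reproduced; the real values should
be taken from the example's source.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define MAX_QUEUES       1024
#define RX_DESC_DEFAULT  128
#define TX_DESC_DEFAULT  512

/* Local stand-in for RTE_MAX(). */
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))

/*
 * Number of mbufs needed per port: one ring's worth of descriptors
 * (the larger of the RX and TX ring sizes) for every queue.
 */
static uint32_t
mbufs_per_port(uint32_t nb_queues, uint32_t rx_desc, uint32_t tx_desc)
{
	return nb_queues * MAX_OF(rx_desc, tx_desc);
}
```

With these assumed defaults, 128 queues give 65536 mbufs (the old
128*512), and 1024 queues give 524288.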



[dpdk-dev] [PATCH] vhost: Fix wrong handling of virtqueue array index

2015-10-27 Thread Yuanhan Liu
On Tue, Oct 27, 2015 at 08:46:48AM +, Xie, Huawei wrote:
> On 10/27/2015 4:39 PM, Yuanhan Liu wrote:
> > On Tue, Oct 27, 2015 at 08:24:00AM +, Xie, Huawei wrote:
> >> On 10/27/2015 3:52 PM, Tetsuya Mukawa wrote:
> >>> The patch fixes wrong handling of virtqueue array index when
> >>> GET_VRING_BASE message comes.
> >>> The vhost backend will receive the message per virtqueue.
> >>> Also we should call a destroy callback handler when both RXQ
> >>> and TXQ receives the message.
> >>>
> >>> Signed-off-by: Tetsuya Mukawa 
> >>> ---
> >>>  lib/librte_vhost/vhost_user/virtio-net-user.c | 20 ++--
> >>>  1 file changed, 10 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
> >>> b/lib/librte_vhost/vhost_user/virtio-net-user.c
> >>> index a998ad8..99c075f 100644
> >>> --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
> >>> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
> >>> @@ -283,12 +283,10 @@ user_get_vring_base(struct vhost_device_ctx ctx,
> >>>   struct vhost_vring_state *state)
> >>>  {
> >>>   struct virtio_net *dev = get_device(ctx);
> >>> + uint16_t base_idx = state->index / VIRTIO_QNUM * VIRTIO_QNUM;
> >>>  
> >>>   if (dev == NULL)
> >>>   return -1;
> >>> - /* We have to stop the queue (virtio) if it is running. */
> >>> - if (dev->flags & VIRTIO_DEV_RUNNING)
> >>> - notify_ops->destroy_device(dev);
> >> Hi Tetsuya:
> >> I don't understand why we move it to the end of the function.
> >> If we don't tell the application to remove the virtio device from the
> > As you stated, he just moved it to the end of the function: it
> > > still does invoke notify_ops->destroy_device() in the end.
> The problem is before calling destroy_device, we shouldn't modify the
> virtio_net data structure as data plane is also using it.

Right then, maybe we should not move it to the end.

> >
> > And the reason he moved it to the end is he wants to invoke the
> > callback just when the second GET_VRING_BASE message is received
> > for the queue pair.
> Don't get it. What issue it fixes?

I guess Tetsuya thinks that'd be a more proper time to invoke the
callback, but in fact, it's not, as we have MQ enabled :)

--yliu

> >  And while thinking twice, it's not necessary,
> > as we will do the "flags & VIRTIO_DEV_RUNNING" check first, it
> > doesn't matter on which virt queue we invoke the callback.
> >
> >
> > --yliu
> >
> >> data plane, then the vhost application is still operating on that
> >> device, we shouldn't do anything to the virtio_net device.
> >> For this case, as vhost doesn't use kickfd, it will not cause issue, but
> >> i think it is best practice firstly to remove it from data plan through
> >> destroy_device.
> >>
> >> I think we could call destroy_device the first time we receive this
> >> message. Currently we don't have per queue granularity control to only
> >> remove one queue from data plane.
> >>
> >> I am Okay to only close the kickfd for the specified queue index.
> >>
> >> Btw, do you meet issue with previous implementation?
> >>>  
> >>>   /* Here we are safe to get the last used index */
> >>>   ops->get_vring_base(ctx, state->index, state);
> >>> @@ -300,15 +298,17 @@ user_get_vring_base(struct vhost_device_ctx ctx,
> >>>* sent and only sent in vhost_vring_stop.
> >>>* TODO: cleanup the vring, it isn't usable since here.
> >>>*/
> >>> - if (dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd >= 0) {
> >>> - close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
> >>> - dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
> >>> - }
> >>> - if (dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd >= 0) {
> >>> - close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
> >>> - dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
> >>> + if (dev->virtqueue[state->index]->kickfd >= 0) {
> >>> + close(dev->virtqueue[state->index]->kickfd);
> >>> + dev->virtqueue[state->index]->kickfd = -1;
> >>>   }
> >>>  
> >>> + /* We have to stop the queue (virtio) if it is running. */
> >>> + if ((dev->flags & VIRTIO_DEV_RUNNING) &&
> >>> + (dev->virtqueue[base_idx + VIRTIO_RXQ]->kickfd == -1) &&
> >>> + (dev->virtqueue[base_idx + VIRTIO_TXQ]->kickfd == -1))
> >>> + notify_ops->destroy_device(dev);
> >>> +
> >>>   return 0;
> >>>  }
> >>>  
> 


[dpdk-dev] [PATCH 0/3] Add VHOST PMD

2015-10-27 Thread Tetsuya Mukawa
The patch below has been submitted as a separate patch.

-  [dpdk-dev,1/3] vhost: Fix wrong handling of virtqueue array index
(http://dpdk.org/dev/patchwork/patch/8038/)

Tetsuya

On 2015/10/27 15:12, Tetsuya Mukawa wrote:
> The patch introduces a new PMD. This PMD is implemented as thin wrapper
> of librte_vhost. The patch will work on below patch series.
>  - [PATCH v5 00/28] remove pci driver from vdevs
>
> * Known issue.
> We may see issues while handling RESET_OWNER message.
> These handlings are done in vhost library, so not a part of vhost PMD.
> So far, we are waiting for QEMU fixing.
>
> PATCH v4 changes:
>  - Support vhost multiple queues.
>  - Rebase on "remove pci driver from vdevs".
>  - Optimize RX/TX functions.
>  - Fix resource leaks.
>  - Fix compile issue.
>  - Add patch to fix vhost library.
>
> PATCH v3 changes:
>  - Optimize performance.
>In RX/TX functions, change code to access only per core data.
>  - Add below API to allow user to use vhost library APIs for a port managed
>by vhost PMD. There are a few limitations. See "rte_eth_vhost.h".
> - rte_eth_vhost_portid2vdev()
>To support this functionality, vhost library is also changed.
>Anyway, if users doesn't use vhost PMD, can fully use vhost library APIs.
>  - Add code to support vhost multiple queues.
>Actually, multiple queues functionality is not enabled so far.
>
> PATCH v2 changes:
>  - Fix issues reported by checkpatch.pl
>(Thanks to Stephen Hemminger)
>
>
> Tetsuya Mukawa (3):
>   vhost: Fix wrong handling of virtqueue array index
>   vhost: Add callback and private data for vhost PMD
>   vhost: Add VHOST PMD
>
>  config/common_linuxapp|   6 +
>  doc/guides/nics/index.rst |   1 +
>  doc/guides/nics/vhost.rst |  82 +++
>  doc/guides/rel_notes/release_2_2.rst  |   2 +
>  drivers/net/Makefile  |   4 +
>  drivers/net/vhost/Makefile|  62 +++
>  drivers/net/vhost/rte_eth_vhost.c | 765 
> ++
>  drivers/net/vhost/rte_eth_vhost.h |  65 +++
>  drivers/net/vhost/rte_pmd_vhost_version.map   |   8 +
>  lib/librte_vhost/rte_vhost_version.map|   6 +
>  lib/librte_vhost/rte_virtio_net.h |   3 +
>  lib/librte_vhost/vhost_user/virtio-net-user.c |  33 +-
>  lib/librte_vhost/virtio-net.c |  61 +-
>  lib/librte_vhost/virtio-net.h |   4 +-
>  mk/rte.app.mk |   8 +-
>  15 files changed, 1085 insertions(+), 25 deletions(-)
>  create mode 100644 doc/guides/nics/vhost.rst
>  create mode 100644 drivers/net/vhost/Makefile
>  create mode 100644 drivers/net/vhost/rte_eth_vhost.c
>  create mode 100644 drivers/net/vhost/rte_eth_vhost.h
>  create mode 100644 drivers/net/vhost/rte_pmd_vhost_version.map
>



[dpdk-dev] [PATCH] vhost: Fix wrong handling of virtqueue array index

2015-10-27 Thread Tetsuya Mukawa
The patch fixes wrong handling of the virtqueue array index when a
GET_VRING_BASE message arrives.
The vhost backend receives the message once per virtqueue.
Also, we should call the destroy callback handler when both RXQ
and TXQ have received the message.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_vhost/vhost_user/virtio-net-user.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
b/lib/librte_vhost/vhost_user/virtio-net-user.c
index a998ad8..99c075f 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -283,12 +283,10 @@ user_get_vring_base(struct vhost_device_ctx ctx,
struct vhost_vring_state *state)
 {
struct virtio_net *dev = get_device(ctx);
+   uint16_t base_idx = state->index / VIRTIO_QNUM * VIRTIO_QNUM;

if (dev == NULL)
return -1;
-   /* We have to stop the queue (virtio) if it is running. */
-   if (dev->flags & VIRTIO_DEV_RUNNING)
-   notify_ops->destroy_device(dev);

/* Here we are safe to get the last used index */
ops->get_vring_base(ctx, state->index, state);
@@ -300,15 +298,17 @@ user_get_vring_base(struct vhost_device_ctx ctx,
 * sent and only sent in vhost_vring_stop.
 * TODO: cleanup the vring, it isn't usable since here.
 */
-   if (dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd >= 0) {
-   close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
-   dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
-   }
-   if (dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd >= 0) {
-   close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
-   dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
+   if (dev->virtqueue[state->index]->kickfd >= 0) {
+   close(dev->virtqueue[state->index]->kickfd);
+   dev->virtqueue[state->index]->kickfd = -1;
}

+   /* We have to stop the queue (virtio) if it is running. */
+   if ((dev->flags & VIRTIO_DEV_RUNNING) &&
+   (dev->virtqueue[base_idx + VIRTIO_RXQ]->kickfd == -1) &&
+   (dev->virtqueue[base_idx + VIRTIO_TXQ]->kickfd == -1))
+   notify_ops->destroy_device(dev);
+
return 0;
 }

-- 
2.1.4
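
The index arithmetic in the patch above can be illustrated on its own.
This is only a sketch, assuming that the virtqueues of queue pair N sit
at indexes N * 2 (RX) and N * 2 + 1 (TX); QNUM, RXQ and TXQ are local
stand-ins for VIRTIO_QNUM, VIRTIO_RXQ and VIRTIO_TXQ, and pair_base()
and pair_is_closed() are hypothetical helper names.

```c
#include <stdint.h>

#define QNUM 2	/* virtqueues per queue pair; stand-in for VIRTIO_QNUM */
#define RXQ  0	/* RX offset inside a pair; stand-in for VIRTIO_RXQ */
#define TXQ  1	/* TX offset inside a pair; stand-in for VIRTIO_TXQ */

/* Round a virtqueue index down to the first index of its queue pair. */
static uint16_t
pair_base(uint16_t index)
{
	return index / QNUM * QNUM;
}

/*
 * Return non-zero when both kick descriptors of the pair that contains
 * `index` are closed (-1). This is the condition the patch tests before
 * invoking the destroy callback.
 */
static int
pair_is_closed(const int *kickfd, uint16_t index)
{
	uint16_t base = pair_base(index);

	return kickfd[base + RXQ] == -1 && kickfd[base + TXQ] == -1;
}
```

For example, indexes 4 and 5 both map to base 4, so the callback
condition for that pair only holds once both kickfd[4] and kickfd[5]
are -1.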



[dpdk-dev] [PATCH] vhost: Fix wrong handling of virtqueue array index

2015-10-27 Thread Yuanhan Liu
On Tue, Oct 27, 2015 at 08:24:00AM +, Xie, Huawei wrote:
> On 10/27/2015 3:52 PM, Tetsuya Mukawa wrote:
> > The patch fixes wrong handling of virtqueue array index when
> > GET_VRING_BASE message comes.
> > The vhost backend will receive the message per virtqueue.
> > Also we should call a destroy callback handler when both RXQ
> > and TXQ receives the message.
> >
> > Signed-off-by: Tetsuya Mukawa 
> > ---
> >  lib/librte_vhost/vhost_user/virtio-net-user.c | 20 ++--
> >  1 file changed, 10 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
> > b/lib/librte_vhost/vhost_user/virtio-net-user.c
> > index a998ad8..99c075f 100644
> > --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
> > +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
> > @@ -283,12 +283,10 @@ user_get_vring_base(struct vhost_device_ctx ctx,
> > struct vhost_vring_state *state)
> >  {
> > struct virtio_net *dev = get_device(ctx);
> > +   uint16_t base_idx = state->index / VIRTIO_QNUM * VIRTIO_QNUM;
> >  
> > if (dev == NULL)
> > return -1;
> > -   /* We have to stop the queue (virtio) if it is running. */
> > -   if (dev->flags & VIRTIO_DEV_RUNNING)
> > -   notify_ops->destroy_device(dev);
> Hi Tetsuya:
> I don't understand why we move it to the end of the function.
> If we don't tell the application to remove the virtio device from the

As you stated, he just moved it to the end of the function: it
still does invoke notify_ops->destroy_device() in the end.

And the reason he moved it to the end is that he wants to invoke the
callback just when the second GET_VRING_BASE message is received
for the queue pair. But on second thought, it's not necessary:
as we do the "flags & VIRTIO_DEV_RUNNING" check first, it
doesn't matter on which virtqueue we invoke the callback.


--yliu

> data plane, then the vhost application is still operating on that
> device, we shouldn't do anything to the virtio_net device.
> For this case, as vhost doesn't use kickfd, it will not cause issue, but
> i think it is best practice firstly to remove it from data plan through
> destroy_device.
> 
> I think we could call destroy_device the first time we receive this
> message. Currently we don't have per queue granularity control to only
> remove one queue from data plane.
> 
> I am Okay to only close the kickfd for the specified queue index.
> 
> Btw, do you meet issue with previous implementation?
> >  
> > /* Here we are safe to get the last used index */
> > ops->get_vring_base(ctx, state->index, state);
> > @@ -300,15 +298,17 @@ user_get_vring_base(struct vhost_device_ctx ctx,
> >  * sent and only sent in vhost_vring_stop.
> >  * TODO: cleanup the vring, it isn't usable since here.
> >  */
> > -   if (dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd >= 0) {
> > -   close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
> > -   dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
> > -   }
> > -   if (dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd >= 0) {
> > -   close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
> > -   dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
> > +   if (dev->virtqueue[state->index]->kickfd >= 0) {
> > +   close(dev->virtqueue[state->index]->kickfd);
> > +   dev->virtqueue[state->index]->kickfd = -1;
> > }
> >  
> > +   /* We have to stop the queue (virtio) if it is running. */
> > +   if ((dev->flags & VIRTIO_DEV_RUNNING) &&
> > +   (dev->virtqueue[base_idx + VIRTIO_RXQ]->kickfd == -1) &&
> > +   (dev->virtqueue[base_idx + VIRTIO_TXQ]->kickfd == -1))
> > +   notify_ops->destroy_device(dev);
> > +
> > return 0;
> >  }
> >  
> 


[dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86

2015-10-27 Thread Jan Viktorin
Hi Konstantin,

On Tue, 27 Oct 2015 15:31:44 +
"Ananyev, Konstantin"  wrote:

> Hi Jan,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Viktorin
> > Sent: Monday, October 26, 2015 4:38 PM
> > To: Thomas Monjalon; Hunt, David; dev at dpdk.org
> > Cc: Vlastimil Kosar
> > Subject: [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 
> > using rte_lpm_lookup_bulk on for-x86
> > 
> > From: Vlastimil Kosar 
> > 
> > LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
> > the function is reimplemented using non-vector operations for non-x86
> > architectures. In the future, each architecture should have vectorized code.
> > This patch includes rudimentary emulation of intrinsic functions 
> > _mm_set_epi32(),
> > _mm_loadu_si128() and _mm_load_si128() for easy portability of existing
> > applications.
> > 
> > LPM builds now when on ARM.
> > 
> > FIXME: to be reworked
> > 
> > Signed-off-by: Vlastimil Kosar 
> > Signed-off-by: Jan Viktorin 
> > ---
> >  config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
> >  lib/librte_lpm/rte_lpm.h  | 71 
> > +++
> >  2 files changed, 71 insertions(+), 1 deletion(-)
> > 
> > diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc 
> > b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > index 5b582a8..33afb33 100644
> > --- a/config/defconfig_arm-armv7-a-linuxapp-gcc
> > +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > @@ -58,7 +58,6 @@ CONFIG_XMM_SIZE=16
> > 
> >  # fails to compile on ARM
> >  CONFIG_RTE_LIBRTE_ACL=n
> > -CONFIG_RTE_LIBRTE_LPM=n
> > 
> >  # cannot use those on ARM
> >  CONFIG_RTE_KNI_KMOD=n
> > diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> > index c299ce2..4619992 100644
> > --- a/lib/librte_lpm/rte_lpm.h
> > +++ b/lib/librte_lpm/rte_lpm.h
> > @@ -47,7 +47,9 @@
> >  #include 
> >  #include 
> >  #include 
> > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
> >  #include 
> > +#endif
> > 
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -358,6 +360,7 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, 
> > const uint32_t * ips,
> > return 0;
> >  }
> > 
> > +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
> >  /* Mask four results. */
> >  #define RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
> > 
> > @@ -472,6 +475,74 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i 
> > ip, uint16_t hop[4],
> > hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
> > hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
> >  }
> > +#else  
> 
> Probably better to create an 
> lib/librte_eal/common/include/arch/arm/rte_vect.h,
> and move all these x86 vector support emulation there?
> Konstantin

Sure. This patch is terribly wrong and it's not to be merged. It is a
question whether to make it this way (with the refactoring as you
suggested) or to make some general abstraction of the SSE calls in DPDK.

Jan

> 
> > +// TODO: this code should be reworked.
> > +
> > +typedef struct {
> > +   union uint128 {
> > +   uint8_t uint8[16];
> > +   uint32_t uint32[4];
> > +   } val;
> > +} __m128i;
> > +
> > +static inline __m128i
> > +_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
> > +{
> > +   __m128i res;
> > +   res.val.uint32[0] = v0;
> > +   res.val.uint32[1] = v1;
> > +   res.val.uint32[2] = v2;
> > +   res.val.uint32[3] = v3;
> > +   return res;
> > +}
> > +
> > +static inline __m128i
> > +_mm_loadu_si128(__m128i * v)
> > +{
> > +   __m128i res;
> > +   res = *v;
> > +   return res;
> > +}
> > +
> > +static inline __m128i
> > +_mm_load_si128(__m128i * v)
> > +{
> > +   __m128i res;
> > +   res = *v;
> > +   return res;
> > +}
> > +
> > +/**
> > + * Lookup four IP addresses in an LPM table.
> > + *
> > + * @param lpm
> > + *   LPM object handle
> > + * @param ip
> > + *   Four IPs to be looked up in the LPM table
> > + * @param hop
> > + *   Next hop of the most specific rule found for IP (valid on lookup hit 
> > only).
> > + *   This is an 4 elements array of two byte values.
> > + *   If the lookup was succesfull for the given IP, then least significant 
> > byte
> > + *   of the corresponding element is the  actual next hop and the most
> > + *   significant byte is zero.
> > + *   If the lookup for the given IP failed, then corresponding element 
> > would
> > + *   contain default value, see description of then next parameter.
> > + * @param defv
> > + *   Default value to populate into corresponding element of hop[] array,
> > + *   if lookup would fail.
> > + */
> > +static inline void
> > +rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
> > +   uint16_t defv)
> > +{
> > +   rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
> > +
> > +   hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
> > +   hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
> > +   

[dpdk-dev] [PATCH] doc: update release note for i40e base driver update

2015-10-27 Thread Jingjing Wu
Signed-off-by: Jingjing Wu 
---
 doc/guides/rel_notes/release_2_2.rst | 16 
 1 file changed, 16 insertions(+)

diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index de6916e..1665ec7 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -11,6 +11,22 @@ New Features

 * **Added vhost-user multiple queue support.**

+* **Updated the i40e base driver.**
+
+  The i40e base driver was updated with several changes including the
+  following:
+
+  *  Added promiscuous on VLAN support
+  *  Added a workaround to drop all flow control frames
+  *  Added VF capabilities to virtual channel interface
+  *  Added Tx Scheduling related AQ commands
+  *  Added additional PCTYPES supported for FortPark RSS
+  *  Added parsing for CEE DCBX TLVs
+  *  Added FortPark specific registers
+  *  Added AQ functions to handle RSS Key and LUT programming
+  *  Increased pf reset max loop limit
+
+  See the git log for full details of the i40e/base changes.

 Resolved Issues
 ---
-- 
2.4.0



[dpdk-dev] [PATCH v3 00/36] update e1000 base code

2015-10-27 Thread Thomas Monjalon
2015-10-16 10:50, Wenzhuo Lu:
> Wenzhuo Lu (36):
>   e1000/base: update readme and copyright
>   e1000/base: add new devices
>   e1000/base: fix issue with link flap on 82579
>   e1000/base: fix issue with jumbo frame CRC failures in client
>   e1000/base: redundant PHY power down for i210
>   e1000/base: add return value to the functions of setting receive
> address register
>   e1000/base: add defaults for i210 TX/RX PBSIZE
>   e1000/base: remove E1000_WRITE_FLUSH for DH89XXCC_SGMII after
> commencing HW reset
>   e1000/base: add evaluation of e1000_nvm_read return value
>   e1000/base: change invariant return to not use variables
>   e1000/base: add return value handler when check manage mode
>   e1000/base: add return value handler for ESB2 controller init and
> reset
>   e1000/base: add support for inverted format ETrackId
>   e1000/base: add EEARBC_I210 for i210
>   e1000/base: apply paranoia to macro arguments
>   e1000/base: add flags to set eee advertisement modes
>   e1000/base: prevent ulp flow if cable connected
>   e1000/base: fix TIPG value for non 10 half duplex mode
>   e1000/base: add return value for resume workaround
>   e1000/base: fix link detect flow
>   e1000/base: cleanup NAHUM6LP_HW tags
>   e1000/base: add bit for disable packetbuffer read
>   e1000/base: K1 flow fixes
>   e1000/base: remove FIXME comment
>   e1000/base: set correct value of beacon duration
>   e1000/base: disable extension header parsing for IPv6
>   e1000/base: fix for i354 88E1112 PHY using AutoMedia Detect
>   e1000/base: increase timeout of polling bit RSPCIPHY in
> check_reset_block
>   e1000/base: implement 88E1543 PHY initialization
>   e1000/base: use the correct i210 register for EEMNGCTL
>   e1000/base: move the print to the right position
>   e1000/base: synchronization of MAC-PHY interface only on non- ME
> systems
>   e1000/base: fix to enable both ulp and EEE in Sx state
>   e1000/base: some minor change
>   e1000: add new devices
>   doc: update release notes for e1000 base code update

Applied without the release notes patch which is not ready.
The git titles have been reworded to be easier to read and understand.
It may help to reword the release notes.
Thanks


[dpdk-dev] [PATCH v3 36/36] doc: update release notes for e1000 base code update

2015-10-27 Thread Thomas Monjalon
2015-10-16 10:51, Wenzhuo Lu:
>  New Features
>  
>  
> +* **Updated the e1000 base driver.**
> +  The e1000 base driver was updated with several changes including the
> +  following:
> +
> +  * Add some new i218 devices
> +  * Fix issue with link flap on 82579
> +  * Fix issue with jumbo frame CRC failures in client
> +  * Add support for inverted format ETrackId
> +  * Add flags to set eee advertisement modes
> +  * Prevent ulp flow if cable connected
> +  * Cleanup NAHUM6LP_HW tags

Is it really a feature?

> +  * Use the correct i210 register for EEMNGCTL
> +  * Fix to enable both ulp and EEE in Sx state
> +  * Fix link detect flow
> +  * Set correct value of beacon duration
> +  * Disable extension header parsing for IPv6
> +  * Fix for i354 88E1112 PHY using AutoMedia Detect
> +  * Implement 88E1543 PHY initialization
> +  * Increase timeout of polling bit RSPCIPHY in check_reset_block
> +  * Synchronization of MAC-PHY interface only on non- ME systems

Some of these features are not easy to understand.
Please avoid mentioning registers in the release notes.
Something more high-level is expected.

> +  See the git log for full details of the e1000/base changes.

This last sentence is valuable for any entry in the release notes.
Let's say it is implicit.

>  Resolved Issues
>  ---

Some of the fixes above should be in this section.

John, please could you check?

Thanks


[dpdk-dev] [PATCH 1/3] vhost: Fix wrong handling of virtqueue array index

2015-10-27 Thread Tetsuya Mukawa
Hi Yuanhan,

I appreciate your checking.
I hadn't noticed that SET_BACKEND is only supported by vhost-cuse. :-(
I will follow your comments, then submit again.

Thanks,
Tetsuya

On 2015/10/27 15:47, Yuanhan Liu wrote:
> On Tue, Oct 27, 2015 at 03:12:53PM +0900, Tetsuya Mukawa wrote:
>> The patch fixes wrong handling of virtqueue array index.
>>
>> GET_VRING_BASE:
>> The vhost backend will receive the message per virtqueue.
>> Also we should call a destroy callback when both RXQ and TXQ receives
>> the message.
>>
>> SET_BACKEND:
>> Because vhost library supports multiple queue, the index may be over 2.
>> Also a vhost frontend(QEMU) may send such a index.
> Note that only vhost-user supports MQ. vhost-cuse does not.
>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  lib/librte_vhost/vhost_user/virtio-net-user.c | 22 +++---
>>  lib/librte_vhost/virtio-net.c |  5 +++--
>>  2 files changed, 14 insertions(+), 13 deletions(-)
>>
>> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
>> b/lib/librte_vhost/vhost_user/virtio-net-user.c
>> index a998ad8..3e8dfea 100644
>> --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
>> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
>> @@ -283,12 +283,10 @@ user_get_vring_base(struct vhost_device_ctx ctx,
>>  struct vhost_vring_state *state)
>>  {
>>  struct virtio_net *dev = get_device(ctx);
>> +uint16_t base_idx = state->index / VIRTIO_QNUM;
> So, correcting what my first reply said: for the Nth queue pair,
> state->index is "N * 2 + is_tx", so the base should be "state->index / 2 * 2".
>
>>  
>>  if (dev == NULL)
>>  return -1;
>> -/* We have to stop the queue (virtio) if it is running. */
>> -if (dev->flags & VIRTIO_DEV_RUNNING)
>> -notify_ops->destroy_device(dev);
>>  
>>  /* Here we are safe to get the last used index */
>>  ops->get_vring_base(ctx, state->index, state);
>> @@ -300,15 +298,17 @@ user_get_vring_base(struct vhost_device_ctx ctx,
>>   * sent and only sent in vhost_vring_stop.
>>   * TODO: cleanup the vring, it isn't usable since here.
>>   */
>> -if (dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd >= 0) {
>> -close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
>> -dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
>> -}
>> -if (dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd >= 0) {
>> -close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
>> -dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
>> +if (dev->virtqueue[state->index]->kickfd >= 0) {
>> +close(dev->virtqueue[state->index]->kickfd);
>> +dev->virtqueue[state->index]->kickfd = -1;
>>  }
>>  
>> +/* We have to stop the queue (virtio) if it is running. */
>> +if ((dev->flags & VIRTIO_DEV_RUNNING) &&
>> +(dev->virtqueue[base_idx + VIRTIO_RXQ]->kickfd == -1) &&
>> +(dev->virtqueue[base_idx + VIRTIO_TXQ]->kickfd == -1))
>> +notify_ops->destroy_device(dev);
> This is a proper fix then. (You just need to fix base_idx.)
>
>>  return 0;
>>  }
>>  
>> @@ -321,7 +321,7 @@ user_set_vring_enable(struct vhost_device_ctx ctx,
>>struct vhost_vring_state *state)
>>  {
>>  struct virtio_net *dev = get_device(ctx);
>> -uint16_t base_idx = state->index;
>> +uint16_t base_idx = state->index / VIRTIO_QNUM;
> user_set_vring_enable is sent per queue pair (I'm sure this time), so
> base_idx equals state->index. No fix is needed here.
>
>>  int enable = (int)state->num;
>>  
>>  RTE_LOG(INFO, VHOST_CONFIG,
>> diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
>> index 97213c5..ee2e84d 100644
>> --- a/lib/librte_vhost/virtio-net.c
>> +++ b/lib/librte_vhost/virtio-net.c
>> @@ -778,6 +778,7 @@ static int
>>  set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file)
>>  {
>>  struct virtio_net *dev;
>> +uint32_t base_idx = file->index / VIRTIO_QNUM;
> As stated, vhost-cuse does not support MQ.
>
>   --yliu
>>  
>>  dev = get_device(ctx);
>>  if (dev == NULL)
>> @@ -791,8 +792,8 @@ set_backend(struct vhost_device_ctx ctx, struct 
>> vhost_vring_file *file)
>>   * we add the device.
>>   */
>>  if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
>> -if (((int)dev->virtqueue[VIRTIO_TXQ]->backend != 
>> VIRTIO_DEV_STOPPED) &&
>> -((int)dev->virtqueue[VIRTIO_RXQ]->backend != 
>> VIRTIO_DEV_STOPPED)) {
>> +if (((int)dev->virtqueue[base_idx + VIRTIO_RXQ]->backend != 
>> VIRTIO_DEV_STOPPED) &&
>> +((int)dev->virtqueue[base_idx + VIRTIO_TXQ]->backend != 
>> VIRTIO_DEV_STOPPED)) {
>>  return notify_ops->new_device(dev);
>>  }
>>  /* Otherwise we remove it. */
>> -- 
>> 2.1.4



[dpdk-dev] [PATCH v2 2/2] xenvirt: free queues in dev_close

2015-10-27 Thread Bernard Iremonger
Signed-off-by: Bernard Iremonger 
---
 drivers/net/xenvirt/rte_eth_xenvirt.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.c 
b/drivers/net/xenvirt/rte_eth_xenvirt.c
index 084a753..6c2c067 100644
--- a/drivers/net/xenvirt/rte_eth_xenvirt.c
+++ b/drivers/net/xenvirt/rte_eth_xenvirt.c
@@ -75,6 +75,9 @@ static struct rte_eth_link pmd_link = {
.link_status = 0
 };

+static void
+eth_xenvirt_free_queues(struct rte_eth_dev *dev);
+
 static inline struct rte_mbuf *
 rte_rxmbuf_alloc(struct rte_mempool *mp)
 {
@@ -326,7 +329,7 @@ eth_dev_stop(struct rte_eth_dev *dev)
 static void
 eth_dev_close(struct rte_eth_dev *dev)
 {
-   RTE_SET_USED(dev);
+   eth_xenvirt_free_queues(dev);
 }

 static void
@@ -362,8 +365,9 @@ eth_stats_reset(struct rte_eth_dev *dev)
 }

 static void
-eth_queue_release(void *q __rte_unused)
+eth_queue_release(void *q)
 {
+   rte_free(q);
 }

 static int
@@ -524,7 +528,23 @@ eth_tx_queue_setup(struct rte_eth_dev *dev, uint16_t 
tx_queue_id,
return 0;
 }

+static void
+eth_xenvirt_free_queues(struct rte_eth_dev *dev)
+{
+   int i;

+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   eth_queue_release(dev->data->rx_queues[i]);
+   dev->data->rx_queues[i] = NULL;
+   }
+   dev->data->nb_rx_queues = 0;
+
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   eth_queue_release(dev->data->tx_queues[i]);
+   dev->data->tx_queues[i] = NULL;
+   }
+   dev->data->nb_tx_queues = 0;
+}

 static const struct eth_dev_ops ops = {
.dev_start = eth_dev_start,
-- 
1.9.1
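
The release loop in eth_xenvirt_free_queues() above frees each queue, clears the slot and zeroes the count. A minimal sketch of the same pattern with plain malloc()/free() follows; struct toy_dev, MAX_QUEUES and the toy_* functions are hypothetical stand-ins for illustration, not DPDK types. Because every slot is set to NULL and the counts are reset, calling the function a second time is harmless.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_QUEUES 4	/* hypothetical upper bound, for this sketch only */

/* Hypothetical stand-in for the queue bookkeeping in dev->data. */
struct toy_dev {
	void *rx_queues[MAX_QUEUES];
	void *tx_queues[MAX_QUEUES];
	uint16_t nb_rx_queues;
	uint16_t nb_tx_queues;
};

/* Free every queue in one array, clear the slots, reset the count. */
static void
toy_release_array(void **queues, uint16_t *count)
{
	uint16_t i;

	for (i = 0; i < *count; i++) {
		free(queues[i]);	/* free(NULL) is a no-op */
		queues[i] = NULL;
	}
	*count = 0;
}

/* Release both directions; safe to call more than once. */
static void
toy_free_queues(struct toy_dev *dev)
{
	toy_release_array(dev->rx_queues, &dev->nb_rx_queues);
	toy_release_array(dev->tx_queues, &dev->nb_tx_queues);
}
```

Clearing the pointers is what makes a later close or a repeated release safe, since no stale address is left behind in the arrays.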



[dpdk-dev] [PATCH v2 1/2] xenvirt: add support for Port Hotplug

2015-10-27 Thread Bernard Iremonger
update release notes.

Signed-off-by: Bernard Iremonger 
---
 doc/guides/rel_notes/release_2_2.rst  |  2 ++
 drivers/net/xenvirt/rte_eth_xenvirt.c | 49 ++-
 drivers/net/xenvirt/rte_xen_lib.c | 26 ---
 drivers/net/xenvirt/rte_xen_lib.h |  5 +++-
 4 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index 455b5a2..c88392b 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -13,6 +13,8 @@ New Features

   * This change required modifications to librte_ether and all vdev and pdev 
PMD's.

+* **Added Port Hotplug support to xenvirt PMD.**
+
 Resolved Issues
 ---

diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.c 
b/drivers/net/xenvirt/rte_eth_xenvirt.c
index 1b13758..084a753 100644
--- a/drivers/net/xenvirt/rte_eth_xenvirt.c
+++ b/drivers/net/xenvirt/rte_eth_xenvirt.c
@@ -661,7 +661,7 @@ eth_dev_xenvirt_create(const char *name, const char *params,

eth_dev->data = data;
eth_dev->dev_ops = &ops;
-   eth_dev->data->dev_flags = 0;
+   eth_dev->data->dev_flags = RTE_PCI_DRV_DETACHABLE;
eth_dev->data->kdrv = RTE_KDRV_NONE;
eth_dev->data->drv_name = NULL;
eth_dev->driver = NULL;
@@ -683,6 +683,38 @@ err:
 }


+static int
+eth_dev_xenvirt_free(const char *name, const unsigned numa_node)
+{
+   struct rte_eth_dev *eth_dev = NULL;
+
+   RTE_LOG(DEBUG, PMD,
+   "Free virtio rings backed ethdev on numa socket %u\n",
+   numa_node);
+
+   /* find an ethdev entry */
+   eth_dev = rte_eth_dev_allocated(name);
+   if (eth_dev == NULL)
+   return -1;
+
+   if (eth_dev->data->dev_started == 1) {
+   eth_dev_stop(eth_dev);
+   eth_dev_close(eth_dev);
+   }
+
+   eth_dev->rx_pkt_burst = NULL;
+   eth_dev->tx_pkt_burst = NULL;
+   eth_dev->dev_ops = NULL;
+
+   rte_free(eth_dev->data);
+   rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data->mac_addrs);
+
+   virtio_idx--;
+
+   return 0;
+}
+
 /*TODO: Support multiple process model */
 static int
 rte_pmd_xenvirt_devinit(const char *name, const char *params)
@@ -701,10 +733,25 @@ rte_pmd_xenvirt_devinit(const char *name, const char 
*params)
return 0;
 }

+static int
+rte_pmd_xenvirt_devuninit(const char *name)
+{
+   eth_dev_xenvirt_free(name, rte_socket_id());
+
+   if (virtio_idx == 0) {
+   if (xenstore_uninit() != 0)
+   RTE_LOG(ERR, PMD, "%s: xenstore uninit failed\n", 
__func__);
+
+   gntalloc_close();
+   }
+   return 0;
+}
+
 static struct rte_driver pmd_xenvirt_drv = {
.name = "eth_xenvirt",
.type = PMD_VDEV,
.init = rte_pmd_xenvirt_devinit,
+   .uninit = rte_pmd_xenvirt_devuninit,
 };

 PMD_REGISTER_DRIVER(pmd_xenvirt_drv);
diff --git a/drivers/net/xenvirt/rte_xen_lib.c 
b/drivers/net/xenvirt/rte_xen_lib.c
index b3932f0..5900b53 100644
--- a/drivers/net/xenvirt/rte_xen_lib.c
+++ b/drivers/net/xenvirt/rte_xen_lib.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -50,6 +50,7 @@

 #include 
 #include 
+#include 

 #include "rte_xen_lib.h"

@@ -72,6 +73,8 @@ int gntalloc_fd = -1;
 static char *dompath = NULL;
 /* handle to xenstore read/write operations */
 static struct xs_handle *xs = NULL;
+/* flag to indicate if xenstore cleanup is required */
+static bool is_xenstore_cleaned_up;

 /*
  * Reserve a virtual address space.
@@ -275,7 +278,6 @@ xenstore_init(void)
 {
unsigned int len, domid;
char *buf;
-   static int cleanup = 0;
char *end;

xs = xs_domain_open();
@@ -301,16 +303,32 @@ xenstore_init(void)

xs_transaction_start(xs); /* When to stop transaction */

-   if (cleanup == 0) {
+   if (is_xenstore_cleaned_up == 0) {
if (xenstore_cleanup())
return -1;
-   cleanup = 1;
+   is_xenstore_cleaned_up = 1;
}

return 0;
 }

 int
+xenstore_uninit(void)
+{
+   xs_close(xs);
+
+   if (is_xenstore_cleaned_up == 0) {
+   if (xenstore_cleanup())
+   return -1;
+   is_xenstore_cleaned_up = 1;
+   }
+   free(dompath);
+   dompath = NULL;
+
+   return 0;
+}
+
+int
 xenstore_write(const char *key_str, const char *val_str)
 {
char grant_path[PATH_MAX];
diff --git a/drivers/net/xenvirt/rte_xen_lib.h 
b/drivers/net/xenvirt/rte_xen_lib.h
index 0ba7148..d973eac 100644
--- a/drivers/net/xenvirt/rte_xen_lib.h
+++ b/drivers/net/xenvirt/rte_xen_lib.h
@@ -1,7 +1,7 @@
 /*-
  *   BSD L
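
One thing that may be worth a second look in the eth_dev_xenvirt_free() hunk above: eth_dev->data is passed to rte_free() before eth_dev->data->dev_private and eth_dev->data->mac_addrs are read. Reading through a pointer after it has been freed is undefined behaviour in C. A minimal sketch of the members-first order, using plain free() and hypothetical toy_* types rather than the real rte_eth_dev structures:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the ethdev data structures. */
struct toy_dev_data {
	void *dev_private;
	void *mac_addrs;
};

struct toy_eth_dev {
	struct toy_dev_data *data;
};

/* Free the members first, then the container that holds them. */
static void
toy_dev_data_free(struct toy_eth_dev *dev)
{
	if (dev->data == NULL)
		return;

	free(dev->data->dev_private);
	free(dev->data->mac_addrs);
	free(dev->data);
	dev->data = NULL;	/* guard against a second call */
}
```

This only illustrates the ordering; whether the real driver owns all three allocations is not something the sketch can confirm.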

[dpdk-dev] [PATCH v2 0/2] xenvirt hotplug support

2015-10-27 Thread Bernard Iremonger
add Port Hotplug support to the xenvirt PMD

This patch depends on v5 of the following patch set:

remove-pci-driver-from-vdevs.patch

Changes in  v2:
Rebase
Update release notes.

Bernard Iremonger (2):
  xenvirt: add support for Port Hotplug
  xenvirt: free queues in dev_close

 doc/guides/rel_notes/release_2_2.rst  |  2 +
 drivers/net/xenvirt/rte_eth_xenvirt.c | 73 +--
 drivers/net/xenvirt/rte_xen_lib.c | 26 +++--
 drivers/net/xenvirt/rte_xen_lib.h |  5 ++-
 4 files changed, 98 insertions(+), 8 deletions(-)

-- 
1.9.1



[dpdk-dev] [PATCH] vhost: Fix wrong handling of virtqueue array index

2015-10-27 Thread Yuanhan Liu

On Tue, Oct 27, 2015 at 04:51:46PM +0900, Tetsuya Mukawa wrote:
> The patch fixes wrong handling of virtqueue array index when
> GET_VRING_BASE message comes.
> The vhost backend will receive the message per virtqueue.
> Also we should call a destroy callback handler when both RXQ
> and TXQ receives the message.
> 
> Signed-off-by: Tetsuya Mukawa 


Acked-by: Yuanhan Liu 

Thanks.

--yliu
> ---
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
> b/lib/librte_vhost/vhost_user/virtio-net-user.c
> index a998ad8..99c075f 100644
> --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
> @@ -283,12 +283,10 @@ user_get_vring_base(struct vhost_device_ctx ctx,
>   struct vhost_vring_state *state)
>  {
>   struct virtio_net *dev = get_device(ctx);
> + uint16_t base_idx = state->index / VIRTIO_QNUM * VIRTIO_QNUM;
>  
>   if (dev == NULL)
>   return -1;
> - /* We have to stop the queue (virtio) if it is running. */
> - if (dev->flags & VIRTIO_DEV_RUNNING)
> - notify_ops->destroy_device(dev);
>  
>   /* Here we are safe to get the last used index */
>   ops->get_vring_base(ctx, state->index, state);
> @@ -300,15 +298,17 @@ user_get_vring_base(struct vhost_device_ctx ctx,
>* sent and only sent in vhost_vring_stop.
>* TODO: cleanup the vring, it isn't usable since here.
>*/
> - if (dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd >= 0) {
> - close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
> - dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
> - }
> - if (dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd >= 0) {
> - close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
> - dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
> + if (dev->virtqueue[state->index]->kickfd >= 0) {
> + close(dev->virtqueue[state->index]->kickfd);
> + dev->virtqueue[state->index]->kickfd = -1;
>   }
>  
> + /* We have to stop the queue (virtio) if it is running. */
> + if ((dev->flags & VIRTIO_DEV_RUNNING) &&
> + (dev->virtqueue[base_idx + VIRTIO_RXQ]->kickfd == -1) &&
> + (dev->virtqueue[base_idx + VIRTIO_TXQ]->kickfd == -1))
> + notify_ops->destroy_device(dev);
> +
>   return 0;
>  }
>  
> -- 
> 2.1.4
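
The base_idx expression in the hunk above relies on integer division to round a vring index down to the first vring of its queue pair. A small sketch of that arithmetic, assuming VIRTIO_RXQ = 0, VIRTIO_TXQ = 1 and VIRTIO_QNUM = 2 (consistent with the "N * 2 + is_tx" layout described elsewhere in this thread); vring_pair_base() is a hypothetical helper name, not a function from the vhost library:

```c
#include <stdint.h>

/* Assumed values, matching the "N * 2 + is_tx" layout from the thread. */
#define VIRTIO_RXQ  0
#define VIRTIO_TXQ  1
#define VIRTIO_QNUM 2

/*
 * Round a vring index down to the index of the RX vring of the same
 * queue pair. Integer division drops the is_tx bit; multiplying back
 * turns the pair number into an array index again.
 */
static inline uint16_t
vring_pair_base(uint16_t vring_idx)
{
	return vring_idx / VIRTIO_QNUM * VIRTIO_QNUM;
}
```

For vring index 5 (TX vring of pair 2) this gives 4, so base + VIRTIO_RXQ is 4 and base + VIRTIO_TXQ is 5. Dividing without multiplying back would give 2, which is the pair number rather than an index into the virtqueue array.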


[dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support

2015-10-27 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Viktorin
> Sent: Monday, October 26, 2015 4:38 PM
> To: Thomas Monjalon; Hunt, David; dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 16/16] acl: check for SSE 4.1 support
> 
> The main goal of this check is to avoid passing the -msse4.1
> option to the GCC that does not support it (like arm toolchains).
> 
> Anyway, the ACL library does not compile on ARM.
> 
> Signed-off-by: Jan Viktorin 
> ---
>  lib/librte_acl/Makefile | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
> index 7a1cf8a..401fb8c 100644
> --- a/lib/librte_acl/Makefile
> +++ b/lib/librte_acl/Makefile
> @@ -50,7 +50,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
>  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
>  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> 
> +CC_SSE4_1_SUPPORT := $(shell $(CC) -msse4.1 -dM -E - < /dev/null >/dev/null 
> 2>&1 && echo 1)
> +
> +ifeq ($(CC_SSE4_1_SUPPORT),1)
>  CFLAGS_acl_run_sse.o += -msse4.1
> +endif

I don't think acl_run_sse.c would compile if SSE4.1 is not supported.
So I think you need to do the same thing as is done for AVX2:
compile acl_run_sse.c only if SSE4.1 is supported by the compiler:

- SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
-
-CFLAGS_acl_run_sse.o += -msse4.1

+CC_SSE41_SUPPORT=$(shell $(CC) -msse4.1 -dM -E - </dev/null 2>&1 | \
grep -q  && echo 1)
+ifeq ($(CC_SSE41_SUPPORT), 1)
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
+CFLAGS_rte_acl.o += -DCC_SSE41_SUPPORT
+CFLAGS_acl_run_sse.o += -msse4.1   
+endif

And then change rte_acl_init() accordingly.
Something like:

int __attribute__ ((weak))
rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
__rte_unused const uint8_t **data,
__rte_unused uint32_t *results,
__rte_unused uint32_t num,
__rte_unused uint32_t categories)
{
return -ENOTSUP;
}



static void __attribute__((constructor))
rte_acl_init(void)
{
enum rte_acl_classify_alg alg = RTE_ACL_CLASSIFY_DEFAULT;

#if defined(CC_AVX2_SUPPORT)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
alg = RTE_ACL_CLASSIFY_AVX2;
else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
#elif defined (CC_SSE41_SUPPORT)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
alg = RTE_ACL_CLASSIFY_SSE;
#endif

rte_acl_set_default_classify(alg);
}

After that, I suppose, you should be able to build (and probably use)
librte_acl on ARM.
Konstantin
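
As quoted, the rte_acl_init() outline above leaves the "else if" without a statement of its own when CC_AVX2_SUPPORT is defined, so the following rte_acl_set_default_classify() call would become its body. A sketch of one way to keep each conditional block self-contained is below. The enum, struct cpu_flags and pick_classify_alg() are hypothetical names for illustration, and the #define stands in for a -DCC_SSE41_SUPPORT flag from the Makefile; none of this is the librte_acl API.

```c
#include <stdbool.h>

/* Stand-in for -DCC_SSE41_SUPPORT on the compiler command line (assumption). */
#define CC_SSE41_SUPPORT 1

/* Hypothetical algorithm ids, ordered from least to most capable. */
enum classify_alg { ALG_SCALAR, ALG_SSE, ALG_AVX2 };

/* Hypothetical snapshot of the run-time CPU feature flags. */
struct cpu_flags {
	bool sse4_1;
	bool avx2;
};

/*
 * Start from the scalar default and upgrade step by step. Each #ifdef
 * block is a complete statement, so any combination of compile-time
 * macros leaves well-formed code behind.
 */
static enum classify_alg
pick_classify_alg(const struct cpu_flags *cpu)
{
	enum classify_alg alg = ALG_SCALAR;

#ifdef CC_SSE41_SUPPORT
	if (cpu->sse4_1)
		alg = ALG_SSE;
#endif
#ifdef CC_AVX2_SUPPORT
	if (cpu->avx2)
		alg = ALG_AVX2;
#endif
	return alg;
}
```

With only CC_SSE41_SUPPORT defined, a CPU reporting both SSE4.1 and AVX2 still gets the SSE path, because the AVX2 code was never compiled in.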





[dpdk-dev] [PATCH 1/3] vhost: Fix wrong handling of virtqueue array index

2015-10-27 Thread Yuanhan Liu
On Tue, Oct 27, 2015 at 04:28:58PM +0900, Tetsuya Mukawa wrote:
> Hi Yuanhan,
> 
> I appreciate your checking.

Welcome! And thank you for catching out my faults.

--yliu
> I haven't noticed SET_BACKEND is only supported by vhost-cuse. :-(
> I will follow your comments, then submit again.
> 
> Thanks,
> Tetsuya
> 
> On 2015/10/27 15:47, Yuanhan Liu wrote:
> > On Tue, Oct 27, 2015 at 03:12:53PM +0900, Tetsuya Mukawa wrote:
> >> The patch fixes wrong handling of virtqueue array index.
> >>
> >> GET_VRING_BASE:
> >> The vhost backend will receive the message per virtqueue.
> >> Also we should call a destroy callback when both RXQ and TXQ receives
> >> the message.
> >>
> >> SET_BACKEND:
> >> Because vhost library supports multiple queue, the index may be over 2.
> >> Also a vhost frontend(QEMU) may send such a index.
> > Note that only vhost-user supports MQ. vhost-cuse does not.
> >
> >> Signed-off-by: Tetsuya Mukawa 
> >> ---
> >>  lib/librte_vhost/vhost_user/virtio-net-user.c | 22 +++---
> >>  lib/librte_vhost/virtio-net.c |  5 +++--
> >>  2 files changed, 14 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
> >> b/lib/librte_vhost/vhost_user/virtio-net-user.c
> >> index a998ad8..3e8dfea 100644
> >> --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
> >> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
> >> @@ -283,12 +283,10 @@ user_get_vring_base(struct vhost_device_ctx ctx,
> >>struct vhost_vring_state *state)
> >>  {
> >>struct virtio_net *dev = get_device(ctx);
> >> +  uint16_t base_idx = state->index / VIRTIO_QNUM;
> > > So, correcting what my first reply said: for the Nth queue pair,
> > > state->index is "N * 2 + is_tx", so the base should be "state->index / 2 * 2".
> >
> >>  
> >>if (dev == NULL)
> >>return -1;
> >> -  /* We have to stop the queue (virtio) if it is running. */
> >> -  if (dev->flags & VIRTIO_DEV_RUNNING)
> >> -  notify_ops->destroy_device(dev);
> >>  
> >>/* Here we are safe to get the last used index */
> >>ops->get_vring_base(ctx, state->index, state);
> >> @@ -300,15 +298,17 @@ user_get_vring_base(struct vhost_device_ctx ctx,
> >> * sent and only sent in vhost_vring_stop.
> >> * TODO: cleanup the vring, it isn't usable since here.
> >> */
> >> -  if (dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd >= 0) {
> >> -  close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
> >> -  dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
> >> -  }
> >> -  if (dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd >= 0) {
> >> -  close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
> >> -  dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
> >> +  if (dev->virtqueue[state->index]->kickfd >= 0) {
> >> +  close(dev->virtqueue[state->index]->kickfd);
> >> +  dev->virtqueue[state->index]->kickfd = -1;
> >>}
> >>  
> >> +  /* We have to stop the queue (virtio) if it is running. */
> >> +  if ((dev->flags & VIRTIO_DEV_RUNNING) &&
> >> +  (dev->virtqueue[base_idx + VIRTIO_RXQ]->kickfd == -1) &&
> >> +  (dev->virtqueue[base_idx + VIRTIO_TXQ]->kickfd == -1))
> >> +  notify_ops->destroy_device(dev);
> > > This is a proper fix then. (You just need to fix base_idx.)
> >
> >>return 0;
> >>  }
> >>  
> >> @@ -321,7 +321,7 @@ user_set_vring_enable(struct vhost_device_ctx ctx,
> >>  struct vhost_vring_state *state)
> >>  {
> >>struct virtio_net *dev = get_device(ctx);
> >> -  uint16_t base_idx = state->index;
> >> +  uint16_t base_idx = state->index / VIRTIO_QNUM;
> > > user_set_vring_enable is sent per queue pair (I'm sure this time), so
> > > base_idx equals state->index. No fix is needed here.
> >
> >>int enable = (int)state->num;
> >>  
> >>RTE_LOG(INFO, VHOST_CONFIG,
> >> diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
> >> index 97213c5..ee2e84d 100644
> >> --- a/lib/librte_vhost/virtio-net.c
> >> +++ b/lib/librte_vhost/virtio-net.c
> >> @@ -778,6 +778,7 @@ static int
> >>  set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file)
> >>  {
> >>struct virtio_net *dev;
> >> +  uint32_t base_idx = file->index / VIRTIO_QNUM;
> > > As stated, vhost-cuse does not support MQ.
> >
> > --yliu
> >>  
> >>dev = get_device(ctx);
> >>if (dev == NULL)
> >> @@ -791,8 +792,8 @@ set_backend(struct vhost_device_ctx ctx, struct 
> >> vhost_vring_file *file)
> >> * we add the device.
> >> */
> >>if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
> >> -  if (((int)dev->virtqueue[VIRTIO_TXQ]->backend != 
> >> VIRTIO_DEV_STOPPED) &&
> >> -  ((int)dev->virtqueue[VIRTIO_RXQ]->backend != 
> >> VIRTIO_DEV_STOPPED)) {
> >> +  if (((int)dev->virtqueue[base_idx + VIRTIO_RXQ]->backend != 
> >> VIRTIO_DEV_STOPPED) &&
> >> +  ((int)dev->virtqueue[base_idx + VIRTIO_TXQ]->

[dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86

2015-10-27 Thread Ananyev, Konstantin
Hi Jan,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jan Viktorin
> Sent: Monday, October 26, 2015 4:38 PM
> To: Thomas Monjalon; Hunt, David; dev at dpdk.org
> Cc: Vlastimil Kosar
> Subject: [dpdk-dev] [PATCH v2 15/16] lpm/arm: implement rte_lpm_lookupx4 
> using rte_lpm_lookup_bulk on for-x86
> 
> From: Vlastimil Kosar 
> 
> LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
> the function is reimplemented using non-vector operations for non-x86
> architectures. In the future, each architecture should have vectorized code.
> This patch includes rudimentary emulation of intrinsic functions 
> _mm_set_epi32(),
> _mm_loadu_si128() and _mm_load_si128() for easy portability of existing
> applications.
> 
> LPM builds now when on ARM.
> 
> FIXME: to be reworked
> 
> Signed-off-by: Vlastimil Kosar 
> Signed-off-by: Jan Viktorin 
> ---
>  config/defconfig_arm-armv7-a-linuxapp-gcc |  1 -
>  lib/librte_lpm/rte_lpm.h  | 71 
> +++
>  2 files changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc 
> b/config/defconfig_arm-armv7-a-linuxapp-gcc
> index 5b582a8..33afb33 100644
> --- a/config/defconfig_arm-armv7-a-linuxapp-gcc
> +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> @@ -58,7 +58,6 @@ CONFIG_XMM_SIZE=16
> 
>  # fails to compile on ARM
>  CONFIG_RTE_LIBRTE_ACL=n
> -CONFIG_RTE_LIBRTE_LPM=n
> 
>  # cannot use those on ARM
>  CONFIG_RTE_KNI_KMOD=n
> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> index c299ce2..4619992 100644
> --- a/lib/librte_lpm/rte_lpm.h
> +++ b/lib/librte_lpm/rte_lpm.h
> @@ -47,7 +47,9 @@
>  #include 
>  #include 
>  #include 
> +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
>  #include 
> +#endif
> 
>  #ifdef __cplusplus
>  extern "C" {
> @@ -358,6 +360,7 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const 
> uint32_t * ips,
>   return 0;
>  }
> 
> +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
>  /* Mask four results. */
>  #define   RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
> 
> @@ -472,6 +475,74 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, 
> uint16_t hop[4],
>   hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
>   hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
>  }
> +#else

Probably better to create a lib/librte_eal/common/include/arch/arm/rte_vect.h
and move all of this x86 vector support emulation there?
Konstantin

> +// TODO: this code should be reworked.
> +
> +typedef struct {
> + union uint128 {
> + uint8_t uint8[16];
> + uint32_t uint32[4];
> + } val;
> +} __m128i;
> +
> +static inline __m128i
> +_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
> +{
> + __m128i res;
> + res.val.uint32[0] = v0;
> + res.val.uint32[1] = v1;
> + res.val.uint32[2] = v2;
> + res.val.uint32[3] = v3;
> + return res;
> +}
> +
> +static inline __m128i
> +_mm_loadu_si128(__m128i * v)
> +{
> + __m128i res;
> + res = *v;
> + return res;
> +}
> +
> +static inline __m128i
> +_mm_load_si128(__m128i * v)
> +{
> + __m128i res;
> + res = *v;
> + return res;
> +}
> +
> +/**
> + * Lookup four IP addresses in an LPM table.
> + *
> + * @param lpm
> + *   LPM object handle
> + * @param ip
> + *   Four IPs to be looked up in the LPM table
> + * @param hop
> + *   Next hop of the most specific rule found for IP (valid on lookup hit 
> only).
> + *   This is an 4 elements array of two byte values.
> + *   If the lookup was successful for the given IP, then least significant 
> byte
> + *   of the corresponding element is the  actual next hop and the most
> + *   significant byte is zero.
> + *   If the lookup for the given IP failed, then corresponding element would
> + *   contain default value, see description of then next parameter.
> + * @param defv
> + *   Default value to populate into corresponding element of hop[] array,
> + *   if lookup would fail.
> + */
> +static inline void
> +rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
> + uint16_t defv)
> +{
> + rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
> +
> + hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
> + hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
> + hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
> + hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
> +}
> +#endif
> 
>  #ifdef __cplusplus
>  }
> --
> 2.6.1
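
A note on the emulation quoted above: Intel's _mm_set_epi32(e3, e2, e1, e0) places its last argument in the lowest 32-bit lane, whereas the quoted emulation stores its first argument in element 0. That difference may be worth checking when the code is reworked, since callers written for x86 expect the x86 lane order. A scalar sketch that keeps the x86 order follows; toy_xmm_t and the toy_* functions are hypothetical names, not a proposed rte_vect.h interface.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector of four 32-bit lanes. */
typedef struct {
	uint32_t u32[4];
} toy_xmm_t;

/*
 * Same argument order as Intel's _mm_set_epi32(e3, e2, e1, e0):
 * the last argument lands in lane 0, the first in lane 3.
 */
static inline toy_xmm_t
toy_set_epi32(uint32_t e3, uint32_t e2, uint32_t e1, uint32_t e0)
{
	toy_xmm_t v;

	v.u32[0] = e0;
	v.u32[1] = e1;
	v.u32[2] = e2;
	v.u32[3] = e3;
	return v;
}

/* Aligned and unaligned loads are the same plain copy in scalar code. */
static inline toy_xmm_t
toy_loadu_si128(const toy_xmm_t *p)
{
	return *p;
}
```

With this order, toy_set_epi32(4, 3, 2, 1) yields lanes {1, 2, 3, 4} when read from index 0 upwards, which is what a bulk lookup over the u32 array would see.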



[dpdk-dev] [PATCH v8 3/8] vhost: vring queue setup for multiple queue support

2015-10-27 Thread Tetsuya Mukawa
On 2015/10/26 14:42, Yuanhan Liu wrote:
> On Mon, Oct 26, 2015 at 02:24:08PM +0900, Tetsuya Mukawa wrote:
>> On 2015/10/22 21:35, Yuanhan Liu wrote:
> ...
>>> @@ -292,13 +300,13 @@ user_get_vring_base(struct vhost_device_ctx ctx,
>>>  * sent and only sent in vhost_vring_stop.
>>>  * TODO: cleanup the vring, it isn't usable since here.
>>>  */
>>> -   if ((dev->virtqueue[VIRTIO_RXQ]->kickfd) >= 0) {
>>> -   close(dev->virtqueue[VIRTIO_RXQ]->kickfd);
>>> -   dev->virtqueue[VIRTIO_RXQ]->kickfd = -1;
>>> +   if ((dev->virtqueue[state->index]->kickfd + VIRTIO_RXQ) >= 0) {
>>> +   close(dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd);
>>> +   dev->virtqueue[state->index + VIRTIO_RXQ]->kickfd = -1;
>>> }
>> Hi Yuanhan,
>>
>> Please let me check whether the line below is correct.
>> if ((dev->virtqueue[state->index]->kickfd + VIRTIO_RXQ) >= 0) {
>>
>>> -   if ((dev->virtqueue[VIRTIO_TXQ]->kickfd) >= 0) {
>>> -   close(dev->virtqueue[VIRTIO_TXQ]->kickfd);
>>> -   dev->virtqueue[VIRTIO_TXQ]->kickfd = -1;
>>> +   if ((dev->virtqueue[state->index]->kickfd + VIRTIO_TXQ) >= 0) {
>>> +   close(dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd);
>>> +   dev->virtqueue[state->index + VIRTIO_TXQ]->kickfd = -1;
>> Also, same question here.
> Oops, silly typos... Thanks for catching it out!
>
> Here is an update patch (Thomas, please let me know if you prefer me
> to send the whole patchset for you to apply):

Hi Yuanhan,

I may have missed one more issue here.
Could you please look at the patch below, which I submitted today?
(I may have found a similar issue, so I've also fixed it in the patch below.)

- http://dpdk.org/dev/patchwork/patch/8038/

Thanks,
Tetsuya


