[dpdk-dev] Does I210 NIC support Flow director filters?

2015-01-13 Thread Kamraan Nasim
Hello,

I've been using the DPDK fdir filter APIs for the 82599 NIC (Niantic) and they
work very well.

Was wondering if these could also be used for I210 1Gbps NICs?

The other option is to use 5tuple filters (rte_eth_dev_add_5tuple_filter),
however these do not support IPv6 yet.


Have people in the community had any luck with configuring L3/L4 hardware
filters for the I210 NIC?

Thanks,
Kam


[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve

2015-01-13 Thread Ferriter, Cian
Comments on alternative solutions:
1) how would this solution work when there is no NIC present, and 
"rte_eth_from_rings" is called? Here, could you have an else where the socket 
id of the master core is passed to the "memzone_reserve"?
2) how would you advise making this change? I have looked at where 
"rte_eth_dev_allocate" is being called and in all but one case, there is a 
"numa_id" that could be passed in. This isn't the case for "rte_eth_dev_init", 
however. Is there an easy solution for this? Would there now need to be an 
"rte_eth_dev_data" struct for each socket that there is a NIC attached to, 
reserving memory from that socket?

Cian
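
The fallback asked about in (1) can be sketched in plain C as below. This is only an illustration under stated assumptions: pick_socket_id() and SOCKET_UNKNOWN are hypothetical names, not DPDK API. In a real build the two inputs would come from the device's NUMA node and from rte_lcore_to_socket_id(rte_get_master_lcore()).

```c
/* Hypothetical helper, not part of DPDK: choose the socket on which to
 * reserve per-port memory. */
#define SOCKET_UNKNOWN (-1)  /* assumed marker for "no NIC / node unknown" */

/*
 * dev_numa_node: NUMA node the NIC is attached to, or SOCKET_UNKNOWN for
 *                ports without hardware behind them (e.g. built from rings).
 * master_socket: socket of the master lcore, which does not change.
 */
static int
pick_socket_id(int dev_numa_node, int master_socket)
{
	if (dev_numa_node != SOCKET_UNKNOWN)
		return dev_numa_node;  /* alternative 1: follow the NIC */
	return master_socket;          /* else: fall back to the master lcore */
}
```

With this shape, a ring-backed port would still get a stable socket, while NICs on different sockets would keep their memory local.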

-----Original Message-----
From: Richardson, Bruce 
Sent: Tuesday, January 13, 2015 1:56 PM
To: Ferriter, Cian
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to 
rte_memzone_reserve

On Tue, Jan 13, 2015 at 09:23:16AM +, Ferriter, Cian wrote:
> Passing a socket id of "rte_socket_id()" can cause problems in non DPDK 
> applications as there is a dependency on the current logical core we are 
> running on.
> Passing " rte_lcore_to_socket_id(rte_get_master_lcore())" as the socket id to 
> rte_memzone_reserve resolves these issues as the master lcore doesn't change.
> 

The only trouble is that when affinitizing the memory for the NICs to the 
socket of the master lcore, it gives us no way to correctly configure an app to 
use NICs connected to two different sockets on the one system. All memory for 
all NICs will end up on the same socket. Two possible alternative solutions:
1) affinitize memory to the socket the NIC is connected to
2) add a socket parameter to the API calls to allow the user complete control 
over their memory allocations

Obviously the second one breaks backward compatibility (assume we modify 
existing API call), but is more powerful.

Thoughts?

/Bruce

> -----Original Message-----
> From: Ferriter, Cian
> Sent: Tuesday, January 13, 2015 9:22 AM
> To: dev at dpdk.org
> Cc: Ferriter, Cian
> Subject: [PATCH] lib/librte_ether: change socket_id passed to 
> rte_memzone_reserve
> 
> Change the socket id that is passed to rte_memzone_reserve from the socket id 
> of current logical core to the socket id of the master_lcore.
> ---
>  lib/librte_ether/rte_ethdev.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>  mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> old mode 100644
> new mode 100755
> index 95f2ceb..835540d
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void)
>   if (rte_eal_process_type() == RTE_PROC_PRIMARY){
>   mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
>   RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
> - rte_socket_id(), flags);
> + rte_lcore_to_socket_id(rte_get_master_lcore()), 
> flags);
>   } else
>   mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
>   if (mz == NULL)
> --
> 1.7.4.1
> 


[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve

2015-01-13 Thread Stephen Hemminger
On Tue, 13 Jan 2015 09:22:00 +
Cian Ferriter  wrote:

> Change the socket id that is passed to rte_memzone_reserve from
> the socket id of current logical core to the socket id of the
> master_lcore.
> ---
>  lib/librte_ether/rte_ethdev.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>  mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> old mode 100644
> new mode 100755
> index 95f2ceb..835540d
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void)
>   if (rte_eal_process_type() == RTE_PROC_PRIMARY){
>   mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
>   RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
> - rte_socket_id(), flags);
> + rte_lcore_to_socket_id(rte_get_master_lcore()), 
> flags);
>   } else
>   mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
>   if (mz == NULL)


Why is this a memzone at all?
Seems like it should be allocated on a per-device basis on the same NUMA node
as the device, probably with rte_malloc_socket().



[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve

2015-01-13 Thread Bruce Richardson
On Tue, Jan 13, 2015 at 09:23:16AM +, Ferriter, Cian wrote:
> Passing a socket id of "rte_socket_id()" can cause problems in non DPDK 
> applications as there is a dependency on the current logical core we are 
> running on.
> Passing " rte_lcore_to_socket_id(rte_get_master_lcore())" as the socket id to 
> rte_memzone_reserve resolves these issues as the master lcore doesn't change.
> 

The only trouble is that when affinitizing the memory for the NICs to the socket
of the master lcore, it gives us no way to correctly configure an app
to use NICs connected to two different sockets on the one system. All memory for
all NICs will end up on the same socket. Two possible alternative solutions:
1) affinitize memory to the socket the NIC is connected to
2) add a socket parameter to the API calls to allow the user complete control
over their memory allocations

Obviously the second one breaks backward compatibility (assume we modify
existing API call), but is more powerful.

Thoughts?

/Bruce

> -----Original Message-----
> From: Ferriter, Cian 
> Sent: Tuesday, January 13, 2015 9:22 AM
> To: dev at dpdk.org
> Cc: Ferriter, Cian
> Subject: [PATCH] lib/librte_ether: change socket_id passed to 
> rte_memzone_reserve
> 
> Change the socket id that is passed to rte_memzone_reserve from the socket id 
> of current logical core to the socket id of the master_lcore.
> ---
>  lib/librte_ether/rte_ethdev.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>  mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> old mode 100644
> new mode 100755
> index 95f2ceb..835540d
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void)
>   if (rte_eal_process_type() == RTE_PROC_PRIMARY){
>   mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
>   RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
> - rte_socket_id(), flags);
> + rte_lcore_to_socket_id(rte_get_master_lcore()), 
> flags);
>   } else
>   mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
>   if (mz == NULL)
> --
> 1.7.4.1
> 


[dpdk-dev] [PATCH 2/2] testpmd: fix dcb in vt mode

2015-01-13 Thread Vlad Zolotarov

On 01/12/15 17:50, Michal Jastrzebski wrote:
> From: Pawel Wodkowski 
>
> This patch incorporate fixes to support DCB in SRIOV mode for testpmd.
> It also clean up some old code that is not needed or wrong.

The same here: could u, pls., separate the "cleanup" part of the patch 
from the "fixes" part into separate patches?

thanks,
vlad

>
> Signed-off-by: Pawel Wodkowski 
> ---
>   app/test-pmd/cmdline.c |4 ++--
>   app/test-pmd/testpmd.c |   39 +--
>   app/test-pmd/testpmd.h |   10 --
>   3 files changed, 31 insertions(+), 22 deletions(-)
>
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 882a5a2..3c60087 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -1947,9 +1947,9 @@ cmd_config_dcb_parsed(void *parsed_result,
>   
>   /* DCB in VT mode */
>   if (!strncmp(res->vt_en, "on",2))
> - dcb_conf.dcb_mode = DCB_VT_ENABLED;
> + dcb_conf.vt_en = 1;
>   else
> - dcb_conf.dcb_mode = DCB_ENABLED;
> + dcb_conf.vt_en = 0;
>   
>   if (!strncmp(res->pfc_en, "on",2)) {
>   dcb_conf.pfc_en = 1;
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 8c69756..6677a5e 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -1733,7 +1733,8 @@ const uint16_t vlan_tags[] = {
>   };
>   
>   static  int
> -get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct dcb_config *dcb_conf)
> +get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct dcb_config *dcb_conf,
> + uint16_t sriov)
>   {
>   uint8_t i;
>   
> @@ -1741,7 +1742,7 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct 
> dcb_config *dcb_conf)
>* Builds up the correct configuration for dcb+vt based on the vlan 
> tags array
>* given above, and the number of traffic classes available for use.
>*/
> - if (dcb_conf->dcb_mode == DCB_VT_ENABLED) {
> + if (dcb_conf->vt_en == 1) {
>   struct rte_eth_vmdq_dcb_conf vmdq_rx_conf;
>   struct rte_eth_vmdq_dcb_tx_conf vmdq_tx_conf;
>   
> @@ -1758,9 +1759,17 @@ get_eth_dcb_conf(struct rte_eth_conf *eth_conf, struct 
> dcb_config *dcb_conf)
>   vmdq_rx_conf.pool_map[i].vlan_id = vlan_tags[ i ];
>   vmdq_rx_conf.pool_map[i].pools = 1 << (i % 
> vmdq_rx_conf.nb_queue_pools);
>   }
> - for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
> - vmdq_rx_conf.dcb_queue[i] = i;
> - vmdq_tx_conf.dcb_queue[i] = i;
> +
> + if (sriov == 0) {
> + for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
> + vmdq_rx_conf.dcb_queue[i] = i;
> + vmdq_tx_conf.dcb_queue[i] = i;
> + }
> + } else {
> + for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++) {
> + vmdq_rx_conf.dcb_queue[i] = i % 
> dcb_conf->num_tcs;
> + vmdq_tx_conf.dcb_queue[i] = i % 
> dcb_conf->num_tcs;
> + }
>   }
>   
>   /*set DCB mode of RX and TX of multiple queues*/
> @@ -1818,22 +1827,32 @@ init_port_dcb_config(portid_t pid,struct dcb_config 
> *dcb_conf)
>   uint16_t nb_vlan;
>   uint16_t i;
>   
> - /* rxq and txq configuration in dcb mode */
> - nb_rxq = 128;
> - nb_txq = 128;
>   rx_free_thresh = 64;
>   
> + rte_port = &ports[pid];
>   memset(&port_conf,0,sizeof(struct rte_eth_conf));
>   /* Enter DCB configuration status */
>   dcb_config = 1;
>   
>   nb_vlan = sizeof( vlan_tags )/sizeof( vlan_tags[ 0 ]);
>   /*set configuration of DCB in vt mode and DCB in non-vt mode*/
> - retval = get_eth_dcb_conf(&port_conf, dcb_conf);
> + retval = get_eth_dcb_conf(&port_conf, dcb_conf, 
> rte_port->dev_info.max_vfs);
> +
> + /* rxq and txq configuration in dcb mode */
> + nb_rxq = rte_port->dev_info.max_rx_queues;
> + nb_txq = rte_port->dev_info.max_tx_queues;
> +
> + if (rte_port->dev_info.max_vfs) {
> + if (port_conf.rxmode.mq_mode == ETH_MQ_RX_VMDQ_DCB)
> + nb_rxq /= 
> port_conf.rx_adv_conf.vmdq_dcb_conf.nb_queue_pools;
> +
> + if (port_conf.txmode.mq_mode == ETH_MQ_TX_VMDQ_DCB)
> + nb_txq /= 
> port_conf.tx_adv_conf.vmdq_dcb_tx_conf.nb_queue_pools;
> + }
> +
>   if (retval < 0)
>   return retval;
>   
> - rte_port = &ports[pid];
>   memcpy(&rte_port->dev_conf, &port_conf,sizeof(struct rte_eth_conf));
>   
>   rte_port->rx_conf.rx_thresh = rx_thresh;
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index f8b0740..8976acc 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -227,20 +227,10 @@ struct fwd_config {
>   portid_t   nb_fwd_ports;/**< Nb. of ports involved. *

[dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe

2015-01-13 Thread Vlad Zolotarov

On 01/12/15 17:50, Michal Jastrzebski wrote:
> From: Pawel Wodkowski 
>
> This patch add support for DCB in SRIOV mode. When no PFC
> is enabled this feature might be used as multiple queues
> (up to 8 or 4) for VF.
>
> It incorporate following modifications:
>   - Allow zero rx/tx queues to be passed to rte_eth_dev_configure().
> Rationale:
> in SRIOV mode PF use first free VF to RX/TX. If VF count
> is 16 or 32 all recources are assigned to VFs so PF can
> be used only for configuration.
>   - split nb_q_per_pool to nb_rx_q_per_pool and nb_tx_q_per_pool
> Rationale:
> rx and tx number of queue might be different if RX and TX are
> configured in different mode. This allow to inform VF about
> proper number of queues.
>   - extern mailbox API for DCB mode

IMHO each bullet above is worth a separate patch. ;)
It would be much easier to review.

thanks,
vlad

>
> Signed-off-by: Pawel Wodkowski 
> ---
>   lib/librte_ether/rte_ethdev.c   |   84 +-
>   lib/librte_ether/rte_ethdev.h   |5 +-
>   lib/librte_pmd_e1000/igb_pf.c   |3 +-
>   lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   10 ++--
>   lib/librte_pmd_ixgbe/ixgbe_ethdev.h |1 +
>   lib/librte_pmd_ixgbe/ixgbe_pf.c |   98 
> ++-
>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |7 ++-
>   7 files changed, 159 insertions(+), 49 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 95f2ceb..4c1a494 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -333,7 +333,7 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
> uint16_t nb_queues)
>   dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
>   sizeof(dev->data->rx_queues[0]) * nb_queues,
>   RTE_CACHE_LINE_SIZE);
> - if (dev->data->rx_queues == NULL) {
> + if (dev->data->rx_queues == NULL && nb_queues > 0) {
>   dev->data->nb_rx_queues = 0;
>   return -(ENOMEM);
>   }
> @@ -475,7 +475,7 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
> uint16_t nb_queues)
>   dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
>   sizeof(dev->data->tx_queues[0]) * nb_queues,
>   RTE_CACHE_LINE_SIZE);
> - if (dev->data->tx_queues == NULL) {
> + if (dev->data->tx_queues == NULL && nb_queues > 0) {
>   dev->data->nb_tx_queues = 0;
>   return -(ENOMEM);
>   }
> @@ -507,6 +507,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
> nb_rx_q, uint16_t nb_tx_q,
> const struct rte_eth_conf *dev_conf)
>   {
>   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> + struct rte_eth_dev_info dev_info;
>   
>   if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
>   /* check multi-queue mode */
> @@ -524,11 +525,33 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
> nb_rx_q, uint16_t nb_tx_q,
>   return (-EINVAL);
>   }
>   
> + if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_DCB) &&
> + (dev_conf->txmode.mq_mode == ETH_MQ_TX_VMDQ_DCB)) {
> + enum rte_eth_nb_pools rx_pools =
> + 
> dev_conf->rx_adv_conf.vmdq_dcb_conf.nb_queue_pools;
> + enum rte_eth_nb_pools tx_pools =
> + 
> dev_conf->tx_adv_conf.vmdq_dcb_tx_conf.nb_queue_pools;
> +
> + if (rx_pools != tx_pools) {
> + /* Only equal number of pools is supported when
> +  * DCB+VMDq in SRIOV */
> + PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
> + " SRIOV active, DCB+VMDQ mode, "
> + "number of rx and tx pools is 
> not eqaul\n",
> + port_id);
> + return (-EINVAL);
> + }
> + }
> +
> + uint16_t nb_rx_q_per_pool = 
> RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool;
> + uint16_t nb_tx_q_per_pool = 
> RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool;
> +
>   switch (dev_conf->rxmode.mq_mode) {
> - case ETH_MQ_RX_VMDQ_RSS:
>   case ETH_MQ_RX_VMDQ_DCB:
> + break;
> + case ETH_MQ_RX_VMDQ_RSS:
>   case ETH_MQ_RX_VMDQ_DCB_RSS:
> - /* DCB/RSS VMDQ in SRIOV mode, not implement yet */
> + /* RSS, DCB+RSS VMDQ in SRIOV mode, not implement yet */
>   PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
>   " SRIOV active, "
>  

[dpdk-dev] [PATCH 0/2] Enable DCB in SRIOV mode for ixgbe driver

2015-01-13 Thread Vlad Zolotarov

On 01/12/15 17:50, Michal Jastrzebski wrote:
> From: Pawel Wodkowski 
>
> Hi,
> this patchset enables DCB in SRIOV (ETH_MQ_RX_VMDQ_DCB and ETH_MQ_TX_VMDQ_DCB)
> for each VF and PF for ixgbe driver.
>
> As a side effect this allow to use multiple queues for TX in VF (8 if there is
> 16 or less VFs or 4 if there is 32 or less VFs) when PFC is not enabled.

Here it is! ;) Thanks. Pls., ignore my previous email about the 
respinning... ;)

>
>
> Pawel Wodkowski (2):
>pmd: add DCB for VF for ixgbe
>testpmd: fix dcb in vt mode
>
>   app/test-pmd/cmdline.c  |4 +-
>   app/test-pmd/testpmd.c  |   39 ++
>   app/test-pmd/testpmd.h  |   10 
>   lib/librte_ether/rte_ethdev.c   |   84 +-
>   lib/librte_ether/rte_ethdev.h   |5 +-
>   lib/librte_pmd_e1000/igb_pf.c   |3 +-
>   lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   10 ++--
>   lib/librte_pmd_ixgbe/ixgbe_ethdev.h |1 +
>   lib/librte_pmd_ixgbe/ixgbe_pf.c |   98 
> ++-
>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |7 ++-
>   10 files changed, 190 insertions(+), 71 deletions(-)
>
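
The queue counts quoted in the cover letter can be written down as a small lookup. A minimal sketch, assuming a hypothetical helper vf_dcb_queues() that is not part of the patchset. The cover letter says nothing about more than 32 VFs, so the sketch returns 0 there rather than guessing.

```c
/* Hypothetical helper, not taken from the patchset: number of queues a VF
 * can use with DCB in SR-IOV mode when PFC is not enabled, following the
 * figures given in the cover letter. */
static unsigned int
vf_dcb_queues(unsigned int num_vfs)
{
	if (num_vfs == 0)
		return 0;  /* no VFs: the SR-IOV case does not apply */
	if (num_vfs <= 16)
		return 8;  /* 16 or fewer VFs */
	if (num_vfs <= 32)
		return 4;  /* 32 or fewer VFs */
	return 0;          /* not covered by the cover letter */
}
```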



[dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe

2015-01-13 Thread Vlad Zolotarov

On 01/12/15 16:43, Michal Jastrzebski wrote:
> Date: Mon, 12 Jan 2015 15:39:40 +0100
> Message-Id: <1421073581-6644-2-git-send-email-michalx.k.jastrzebski at 
> intel.com>
> X-Mailer: git-send-email 2.1.1
> In-Reply-To: <1421073581-6644-1-git-send-email-michalx.k.jastrzebski at 
> intel.com>
> References: <1421073581-6644-1-git-send-email-michalx.k.jastrzebski at 
> intel.com>
>
> From: Pawel Wodkowski 
>
>
> This patch add support for DCB in SRIOV mode. When no PFC
>
> is enabled this feature might be used as multiple queues
>
> (up to 8 or 4) for VF.
>
>
>
> It incorporate following modifications:
>
>   - Allow zero rx/tx queues to be passed to rte_eth_dev_configure().
>
> Rationale:
>
> in SRIOV mode PF use first free VF to RX/TX. If VF count
>
> is 16 or 32 all recources are assigned to VFs so PF can
>
> be used only for configuration.
>
>   - split nb_q_per_pool to nb_rx_q_per_pool and nb_tx_q_per_pool
>
> Rationale:
>
> rx and tx number of queue might be different if RX and TX are
>
> configured in different mode. This allow to inform VF about
>
> proper number of queues.


Nice move! Ouyang, this is a nice answer to my recent remarks about your 
PATCH4 in "Enable VF RSS for Niantic" series.

Michal, could u, pls., respin this series after fixing the formatting 
and (maybe) using "git send-email" for sending? ;)

thanks,
vlad


>
>   - extern mailbox API for DCB mode
>
>
>
> Signed-off-by: Pawel Wodkowski 
>
> ---
>
>   lib/librte_ether/rte_ethdev.c   |   84 +-
>
>   lib/librte_ether/rte_ethdev.h   |5 +-
>
>   lib/librte_pmd_e1000/igb_pf.c   |3 +-
>
>   lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   10 ++--
>
>   lib/librte_pmd_ixgbe/ixgbe_ethdev.h |1 +
>
>   lib/librte_pmd_ixgbe/ixgbe_pf.c |   98 
> ++-
>
>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |7 ++-
>
>   7 files changed, 159 insertions(+), 49 deletions(-)
>
>
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>
> index 95f2ceb..4c1a494 100644
>
> --- a/lib/librte_ether/rte_ethdev.c
>
> +++ b/lib/librte_ether/rte_ethdev.c
>
> @@ -333,7 +333,7 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
> uint16_t nb_queues)
>
>   dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
>
>   sizeof(dev->data->rx_queues[0]) * nb_queues,
>
>   RTE_CACHE_LINE_SIZE);
>
> - if (dev->data->rx_queues == NULL) {
>
> + if (dev->data->rx_queues == NULL && nb_queues > 0) {
>
>   dev->data->nb_rx_queues = 0;
>
>   return -(ENOMEM);
>
>   }
>
> @@ -475,7 +475,7 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
> uint16_t nb_queues)
>
>   dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
>
>   sizeof(dev->data->tx_queues[0]) * nb_queues,
>
>   RTE_CACHE_LINE_SIZE);
>
> - if (dev->data->tx_queues == NULL) {
>
> + if (dev->data->tx_queues == NULL && nb_queues > 0) {
>
>   dev->data->nb_tx_queues = 0;
>
>   return -(ENOMEM);
>
>   }
>
> @@ -507,6 +507,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
> nb_rx_q, uint16_t nb_tx_q,
>
> const struct rte_eth_conf *dev_conf)
>
>   {
>
>   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>
> + struct rte_eth_dev_info dev_info;
>
>   
>
>   if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
>
>   /* check multi-queue mode */
>
> @@ -524,11 +525,33 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
> nb_rx_q, uint16_t nb_tx_q,
>
>   return (-EINVAL);
>
>   }
>
>   
>
> + if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_DCB) &&
>
> + (dev_conf->txmode.mq_mode == ETH_MQ_TX_VMDQ_DCB)) {
>
> + enum rte_eth_nb_pools rx_pools =
>
> + 
> dev_conf->rx_adv_conf.vmdq_dcb_conf.nb_queue_pools;
>
> + enum rte_eth_nb_pools tx_pools =
>
> + 
> dev_conf->tx_adv_conf.vmdq_dcb_tx_conf.nb_queue_pools;
>
> +
>
> + if (rx_pools != tx_pools) {
>
> + /* Only equal number of pools is supported when
>
> +  * DCB+VMDq in SRIOV */
>
> + PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
>
> + " SRIOV active, DCB+VMDQ mode, "
>
> + "number of rx and tx pools is 
> not eqaul\n",
>
> + port_id);
>
> + return (-EINVAL);
>
> + }
>
> + }
>
> +
>
> + uint16_t nb_rx_q_per_pool = 
> RTE_ETH_DEV

[dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe

2015-01-13 Thread Vlad Zolotarov

On 01/12/15 17:46, Jastrzebski, MichalX K wrote:
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michal Jastrzebski
>> Sent: Monday, January 12, 2015 3:43 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe
>>
>> Date: Mon, 12 Jan 2015 15:39:40 +0100
>> Message-Id: <1421073581-6644-2-git-send-email-
>> michalx.k.jastrzebski at intel.com>
>> X-Mailer: git-send-email 2.1.1
>> In-Reply-To: <1421073581-6644-1-git-send-email-
>> michalx.k.jastrzebski at intel.com>
>> References: <1421073581-6644-1-git-send-email-
>> michalx.k.jastrzebski at intel.com>
>>
>> From: Pawel Wodkowski 
>>
>>
>> This patch add support for DCB in SRIOV mode. When no PFC
>>
>> is enabled this feature might be used as multiple queues
>>
>> (up to 8 or 4) for VF.
>>
>>
>>
>> It incorporate following modifications:
>>
>>   - Allow zero rx/tx queues to be passed to rte_eth_dev_configure().
>>
>> Rationale:
>>
>> in SRIOV mode PF use first free VF to RX/TX. If VF count
>>
>> is 16 or 32 all recources are assigned to VFs so PF can
>>
>> be used only for configuration.
>>
>>   - split nb_q_per_pool to nb_rx_q_per_pool and nb_tx_q_per_pool
>>
>> Rationale:
>>
>> rx and tx number of queue might be different if RX and TX are
>>
>> configured in different mode. This allow to inform VF about
>>
>> proper number of queues.
>>
>>   - extern mailbox API for DCB mode
>>
>>
>>
>> Signed-off-by: Pawel Wodkowski 
>>
>> ---
>>
>>   lib/librte_ether/rte_ethdev.c   |   84 +-
>>
>>   lib/librte_ether/rte_ethdev.h   |5 +-
>>
>>   lib/librte_pmd_e1000/igb_pf.c   |3 +-
>>
>>   lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   10 ++--
>>
>>   lib/librte_pmd_ixgbe/ixgbe_ethdev.h |1 +
>>
>>   lib/librte_pmd_ixgbe/ixgbe_pf.c |   98 ++--
>> ---
>>
>>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c   |7 ++-
>>
>>   7 files changed, 159 insertions(+), 49 deletions(-)
>>
>>
>>
>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>>
>> index 95f2ceb..4c1a494 100644
>>
>> --- a/lib/librte_ether/rte_ethdev.c
>>
>> +++ b/lib/librte_ether/rte_ethdev.c
>>
>> @@ -333,7 +333,7 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev
>> *dev, uint16_t nb_queues)
>>
>>  dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
>>
>>  sizeof(dev->data->rx_queues[0]) * nb_queues,
>>
>>  RTE_CACHE_LINE_SIZE);
>>
>> -if (dev->data->rx_queues == NULL) {
>>
>> +if (dev->data->rx_queues == NULL && nb_queues > 0) {
>>
>>  dev->data->nb_rx_queues = 0;
>>
>>  return -(ENOMEM);
>>
>>  }
>>
>> @@ -475,7 +475,7 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev
>> *dev, uint16_t nb_queues)
>>
>>  dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
>>
>>  sizeof(dev->data->tx_queues[0]) * nb_queues,
>>
>>  RTE_CACHE_LINE_SIZE);
>>
>> -if (dev->data->tx_queues == NULL) {
>>
>> +if (dev->data->tx_queues == NULL && nb_queues > 0) {
>>
>>  dev->data->nb_tx_queues = 0;
>>
>>  return -(ENOMEM);
>>
>>  }
>>
>> @@ -507,6 +507,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id,
>> uint16_t nb_rx_q, uint16_t nb_tx_q,
>>
>>const struct rte_eth_conf *dev_conf)
>>
>>   {
>>
>>  struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>>
>> +struct rte_eth_dev_info dev_info;
>>
>>
>>
>>  if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
>>
>>  /* check multi-queue mode */
>>
>> @@ -524,11 +525,33 @@ rte_eth_dev_check_mq_mode(uint8_t port_id,
>> uint16_t nb_rx_q, uint16_t nb_tx_q,
>>
>>  return (-EINVAL);
>>
>>  }
>>
>>
>>
>> +if ((dev_conf->rxmode.mq_mode ==
>> ETH_MQ_RX_VMDQ_DCB) &&
>>
>> +(dev_conf->txmode.mq_mode ==
>> ETH_MQ_TX_VMDQ_DCB)) {
>>
>> +enum rte_eth_nb_pools rx_pools =
>>
>> +dev_conf-
>>> rx_adv_conf.vmdq_dcb_conf.nb_queue_pools;
>> +enum rte_eth_nb_pools tx_pools =
>>
>> +dev_conf-
>>> tx_adv_conf.vmdq_dcb_tx_conf.nb_queue_pools;
>> +
>>
>> +if (rx_pools != tx_pools) {
>>
>> +/* Only equal number of pools is supported
>> when
>>
>> + * DCB+VMDq in SRIOV */
>>
>> +PMD_DEBUG_TRACE("ethdev port_id=%"
>> PRIu8
>>
>> +" SRIOV active, DCB+VMDQ
>> mode, "
>>
>> +"number of rx and tx pools is
>> not eqaul\n",
>>
>> +port_id);
>>
>> +return (-EINVA

[dpdk-dev] [PATCH 2/2] testpmd: fix dcb in vt mode

2015-01-13 Thread Wodkowski, PawelX
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vlad Zolotarov
> Sent: Tuesday, January 13, 2015 11:16 AM
> To: Jastrzebski, MichalX K; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/2] testpmd: fix dcb in vt mode
> 
> 
> On 01/12/15 17:50, Michal Jastrzebski wrote:
> > From: Pawel Wodkowski 
> >
> > This patch incorporate fixes to support DCB in SRIOV mode for testpmd.
> > It also clean up some old code that is not needed or wrong.
> 
> The same here: could u, pls., separate the "cleanup" part of the patch
> from the "fixes" part into separate patches?
> 

Maybe I introduced a little confusion by saying cleanups. Some code became
obsolete (like enum dcb_mode_enable) when I fixed DCB in VT mode, so
removing those parts I called "cleanups". Please consider them to be fixes.

Pawel
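
One functional change visible in the quoted diff is the user-priority to queue mapping in get_eth_dcb_conf(). The sketch below restates that loop in standalone C. It rests on assumptions: fill_dcb_queue_map() is a hypothetical name, and NUM_USER_PRIORITIES stands in for ETH_DCB_NUM_USER_PRIORITIES with an assumed value of 8.

```c
#include <stdint.h>

/* Stand-in for ETH_DCB_NUM_USER_PRIORITIES; 8 user priorities assumed. */
#define NUM_USER_PRIORITIES 8

/*
 * Hypothetical helper mirroring the loop in the patch: without VFs each
 * user priority keeps its own index; with VFs the priorities wrap around
 * the configured number of traffic classes.
 * num_tcs must be non-zero whenever max_vfs is non-zero.
 */
static void
fill_dcb_queue_map(uint8_t map[NUM_USER_PRIORITIES], uint16_t max_vfs,
		   uint8_t num_tcs)
{
	uint8_t i;

	for (i = 0; i < NUM_USER_PRIORITIES; i++)
		map[i] = (max_vfs == 0) ? i : (uint8_t)(i % num_tcs);
}
```

With 4 traffic classes and VFs present, priorities 4..7 land on the same queues as priorities 0..3.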



[dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe

2015-01-13 Thread Wodkowski, PawelX
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vlad Zolotarov
> Sent: Tuesday, January 13, 2015 11:14 AM
> To: Jastrzebski, MichalX K; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/2] pmd: add DCB for VF for ixgbe
> 
> 
> On 01/12/15 17:50, Michal Jastrzebski wrote:
> > From: Pawel Wodkowski 
> >
> > This patch add support for DCB in SRIOV mode. When no PFC
> > is enabled this feature might be used as multiple queues
> > (up to 8 or 4) for VF.
> >
> > It incorporate following modifications:
> >   - Allow zero rx/tx queues to be passed to rte_eth_dev_configure().
> > Rationale:
> > in SRIOV mode PF use first free VF to RX/TX. If VF count
> > is 16 or 32 all recources are assigned to VFs so PF can
> > be used only for configuration.
> >   - split nb_q_per_pool to nb_rx_q_per_pool and nb_tx_q_per_pool
> > Rationale:
> > rx and tx number of queue might be different if RX and TX are
> > configured in different mode. This allow to inform VF about
> > proper number of queues.
> >   - extern mailbox API for DCB mode
> 
> IMHO each bullet above is worth a separate patch. ;)
> It would be much easier to review.
> 

Good point. I will send the next version shortly.

Pawel


[dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode

2015-01-13 Thread Vlad Zolotarov

On 01/13/15 03:50, Ouyang, Changchun wrote:
>
> *From:*Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> *Sent:* Monday, January 12, 2015 9:59 PM
> *To:* Ouyang, Changchun; dev at dpdk.org
> *Subject:* Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode
>
> On 01/12/15 05:41, Ouyang, Changchun wrote:
>
> *From:*Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> *Sent:* Friday, January 09, 2015 9:50 PM
> *To:* Ouyang, Changchun; dev at dpdk.org 
> *Subject:* Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode
>
> On 01/09/15 07:54, Ouyang, Changchun wrote:
>
>   
>
>   
>
> -----Original Message-----
>
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>
> Sent: Friday, January 9, 2015 2:49 AM
>
> To: Ouyang, Changchun;dev at dpdk.org  
>
> Subject: Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode
>
>   
>
>   
>
> On 01/08/15 11:19, Vlad Zolotarov wrote:
>
>   
>
> On 01/07/15 08:32, Ouyang Changchun wrote:
>
> Check mq mode for VMDq RSS, handle it correctly instead 
> of returning
>
> an error; Also remove the limitation of per pool queue 
> number has max
>
> value of 1, because the per pool queue number could be 2 
> or 4 if it
>
> is VMDq RSS mode;
>
>   
>
> The number of rxq specified in config will determine the 
> mq mode for
>
> VMDq RSS.
>
>   
>
> Signed-off-by: Changchun Ouyang intel.com>  
>
>   
>
> changes in v5:
>
> - Fix '<' issue, it should be '<=' to test rxq number;
>
> - Extract a function to remove the embeded 
> switch-case statement.
>
>   
>
> ---
>
>lib/librte_ether/rte_ethdev.c | 50
>
> ++-
>
>1 file changed, 45 insertions(+), 5 deletions(-)
>
>   
>
> diff --git a/lib/librte_ether/rte_ethdev.c
>
> b/lib/librte_ether/rte_ethdev.c index 95f2ceb..8363e26 
> 100644
>
> --- a/lib/librte_ether/rte_ethdev.c
>
> +++ b/lib/librte_ether/rte_ethdev.c
>
> @@ -503,6 +503,31 @@ rte_eth_dev_tx_queue_config(struct
>
> rte_eth_dev
>
> *dev, uint16_t nb_queues)
>
>}
>
>  static int
>
> +rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, 
> uint16_t nb_rx_q)
>
> +{
>
> +struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>
> +switch (nb_rx_q) {
>
> +case 1:
>
> +case 2:
>
> +RTE_ETH_DEV_SRIOV(dev).active =
>
> +ETH_64_POOLS;
>
> +break;
>
> +case 4:
>
> +RTE_ETH_DEV_SRIOV(dev).active =
>
> +ETH_32_POOLS;
>
> +break;
>
> +default:
>
> +return -EINVAL;
>
> +}
>
> +
>
> +RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = nb_rx_q;
>
> +RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
>
> +dev->pci_dev->max_vfs * nb_rx_q;
>
> +
>
> +return 0;
>
> +}
>
> +
>
> +static int
>
>rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
> nb_rx_q,
>
> uint16_t nb_tx_q,
>
>  const struct rte_eth_conf *dev_conf)
>
>{
>
> @@ -510,8 +535,7 @@ rte_eth_dev_check_mq_mode(uint8_t 
> port_id,
>
> uint16_t nb_rx_q, uint16_t nb_tx_q,
>
>  if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
>
>/* check multi-queue mode */
>
> -if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_RSS) 
> ||
>
> -(dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) 
> ||
>
> +if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) 
> ||
>
>(dev_conf->rxmode.mq_mode == 
> ETH_MQ_RX_DCB_RSS) ||
>
>(dev_conf->txmode.mq_mode == 
> ETH_MQ_TX_DCB)) {
>
>  
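
The rule that rte_eth_dev_check_vf_rss_rxq_num() applies in the quoted v5 patch can be isolated as follows. This is a sketch under assumptions: vf_rss_pools_for_rxq() is a hypothetical name, and POOLS_32 / POOLS_64 stand in for ETH_32_POOLS / ETH_64_POOLS, assumed here to equal the pool counts.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins for ETH_32_POOLS and ETH_64_POOLS (values assumed). */
enum { POOLS_32 = 32, POOLS_64 = 64 };

/*
 * Hypothetical helper: map the per-pool RX queue count requested for
 * VMDq RSS to the pool count, as the switch statement in the patch does.
 * Returns a pool count, or -EINVAL for unsupported queue counts.
 */
static int
vf_rss_pools_for_rxq(uint16_t nb_rx_q)
{
	switch (nb_rx_q) {
	case 1:
	case 2:
		return POOLS_64;  /* 1 or 2 queues per pool */
	case 4:
		return POOLS_32;  /* 4 queues per pool */
	default:
		return -EINVAL;
	}
}
```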

[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine

2015-01-13 Thread Olivier MATZ
Hi Jijiang,

On 01/13/2015 04:04 AM, Liu, Jijiang wrote:
> the following two commands are.
>
> 1. tx_checksum set sw-tunnel-mode on/off
>
> 2. tx_checksum set hw-tunnel-mode on/off
>
> For command 1, If the sw-tunnel-mode is set/clear, which will set/clear a 
> testpmd flag that is used in the process of analyzing incoming packet., the 
> pseudo-codes are list below,
>
> If (sw-tunnel-mode)
>
>   Csum fwd engine will analyze if incoming packet is a tunneling packet.
> tunnel = 1;
> else
> Csum fwd engine will not analyze if incoming packet is a 
> tunneling packet, and treat all the incoming packets as non-tunneling packets.
> It is used for A.

What about "recognize-tunnel" instead of "sw-tunnel-mode"?
Or "parse-tunnel"?

To me, using a "sw-" or "hw-" prefix is confusing because in any case
the checksums can be calculated in software or hardware depending on
"tx_checksum set outer-ip hw|sw".

Moreover, this command has an impact on the receive side, but the name
is still "tx_checksum". Maybe this is also confusing.

> For command 2, setting or clearing hw-tunnel-mode sets or clears a testpmd
> flag that controls how tunneling packets are handled. The pseudo-code is
> listed below:
>
> if (tunnel == 1) { // this is a tunneling packet
>    if (hw-tunnel-mode)
>       ol_flags |= PKT_TX_UDP_TUNNEL_PKT;
>       Csum fwd engine sets the PKT_TX_UDP_TUNNEL_PKT offload flag, which tells
>       HW to treat the transmit packet as a tunneling packet for checksum offload.
>       It is used for B.1
>    else
>       Csum fwd engine doesn't set the PKT_TX_UDP_TUNNEL_PKT offload flag, which
>       tells HW to treat the packet as an ordinary (non-tunnelled) packet.
>       It is used for B.2
> }

What about:
   tx_checksum set tunnel-method normal|outer

It would select if we use lX_len or outer_lX_len. Is it what you mean?

And this only makes sense when we use hw checksum right?

>> And will it be possible to support future hardware that will be able to 
>> compute
>> both outer l3, outer l4, l3 and l4 checksums?
>
> Yes.
> Currently, i40e support outer l3, outer l4, l3 and l4 checksums offload at 
> the same time.

I probably missed something here: we only have PKT_TX_OUTER_IP_CKSUM
but there is no PKT_TX_OUTER_UDP_CKSUM. Is outer UDP checksum supported

> test case C:
> tx_checksum set tunnel-mode hw
> tx_checksum set  outer-ip   hw
> tx_checksum set  ip   hw
> tx_checksum set  tcp   hw
>
> Of course, outer udp is not listed here for VXLAN.

I don't understand why. Could you detail it?

>> I have another idea, please let me know if you find it clearer or not.
>> The commands format would be:
>>
>> tx_checksum  ...
>>
>> [...]
>>
>> What do you think?
>
> Thanks for your proposal.
> It is clear for me.
>
> But there are two questions for me.
>
> As far as I know, in the current command line framework, an option in the
> command line is an exact match, so you probably have to add duplicated code
> when you want to support a new packet type.

I don't think it's really a problem. The cmdline library supports
string lists, so we can have the following 3 command definitions:

1. tx_checksum ip-udp|ip-tcp|ip-sctp|vxlan-ip-udp|vxlan-ip-tcp|vxlan-ip-sctp l3 off|sw|hw l4 off|sw|hw
2. tx_checksum ip-other|vxlan-ip-other l3 off|sw|hw
3. tx_checksum vxlan outer-l3 off|sw|hw outer-l4 off|sw|hw

Maybe 1 and 2 could be split into non-vxlan and vxlan. But only
the structure would need to be redefined to have a different help string,
not the callback function.
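
To illustrate why a choice list keeps the definitions short, here is a minimal
sketch in plain C. It does not use the DPDK cmdline library; match_choice() and
enum csum_mode are hypothetical names made up for this example. It only shows
how one "off|sw|hw" style string could be matched by a single shared helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical checksum modes, in the same order as the "off|sw|hw" list. */
enum csum_mode { CSUM_OFF = 0, CSUM_SW = 1, CSUM_HW = 2 };

/*
 * Return the zero-based index of `token` inside a '|'-separated choice
 * list such as "off|sw|hw", or -1 if the token is not one of the choices.
 * Matching is exact: "h" or "hww" do not match "hw".
 */
static int
match_choice(const char *choices, const char *token)
{
	size_t tok_len;
	int idx = 0;

	assert(choices != NULL && token != NULL);
	tok_len = strlen(token);
	if (tok_len == 0)
		return -1;

	while (*choices != '\0') {
		/* Length of the current choice, up to the next '|' or the end. */
		size_t len = strcspn(choices, "|");

		if (len == tok_len && strncmp(choices, token, len) == 0)
			return idx;

		choices += len;
		if (*choices == '|')
			choices++;
		idx++;
	}
	return -1;
}
```

With such a helper, the packet-type token and the l3/l4 mode tokens of all
three commands could go through the same callback, with only the choice
strings differing.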

> Other question:
>
> Currently, the following testpmd flags are per port, not per packet type.
> When they are set, they affect the whole port, not just one packet type or
> format. If you add   option in cmdline, you have to make other changes.
>
> /** Offload IP checksum in csum forward engine */
> #define TESTPMD_TX_OFFLOAD_IP_CKSUM  0x0001
> /** Offload UDP checksum in csum forward engine */
> #define TESTPMD_TX_OFFLOAD_UDP_CKSUM 0x0002
> /** Offload TCP checksum in csum forward engine */
> #define TESTPMD_TX_OFFLOAD_TCP_CKSUM 0x0004
> /** Offload SCTP checksum in csum forward engine */
> #define TESTPMD_TX_OFFLOAD_SCTP_CKSUM 0x0008
> /** Offload VxLAN checksum in csum forward engine */
> #define TESTPMD_TX_OFFLOAD_VXLAN_CKSUM   0x0010

We can add a portid in each command.
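
As a rough sketch of what "a portid in each command" could mean for the flags
quoted above: each port keeps its own bitmask, and a command sets or clears
one bit for one port. The flag values are copied from the quoted defines;
SKETCH_MAX_PORTS, port_tx_ol_flags and set_port_tx_offload() are hypothetical
names for this example, not testpmd code.

```c
#include <stdint.h>

/* Flag values copied from the testpmd defines quoted above. */
#define TESTPMD_TX_OFFLOAD_IP_CKSUM    0x0001
#define TESTPMD_TX_OFFLOAD_UDP_CKSUM   0x0002
#define TESTPMD_TX_OFFLOAD_TCP_CKSUM   0x0004
#define TESTPMD_TX_OFFLOAD_SCTP_CKSUM  0x0008
#define TESTPMD_TX_OFFLOAD_VXLAN_CKSUM 0x0010

/* Hypothetical port limit and per-port flag table, for this sketch only. */
#define SKETCH_MAX_PORTS 32
static uint16_t port_tx_ol_flags[SKETCH_MAX_PORTS];

/*
 * Set (enable != 0) or clear (enable == 0) one offload flag on one port.
 * Returns 0 on success, -1 if the port id is out of range.
 */
static int
set_port_tx_offload(uint8_t port_id, uint16_t flag, int enable)
{
	if (port_id >= SKETCH_MAX_PORTS)
		return -1;

	if (enable)
		port_tx_ol_flags[port_id] |= flag;
	else
		port_tx_ol_flags[port_id] &= (uint16_t)~flag;

	return 0;
}
```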

> Of course, it is welcome if you can send this patch set with this idea for 
> community review.

Let's first agree on the user API :)

Regards,
Olivier





[dpdk-dev] [PATCH 0/2] Enable DCB in SRIOV mode for ixgbe driver

2015-01-13 Thread Wodkowski, PawelX
Comments are more than welcome :)

Pawel



[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve

2015-01-13 Thread Ferriter, Cian
Passing a socket id of "rte_socket_id()" can cause problems in non DPDK 
applications as there is a dependency on the current logical core we are 
running on.
Passing " rte_lcore_to_socket_id(rte_get_master_lcore())" as the socket id to 
rte_memzone_reserve resolves these issues as the master lcore doesn't change.

-Original Message-
From: Ferriter, Cian 
Sent: Tuesday, January 13, 2015 9:22 AM
To: dev at dpdk.org
Cc: Ferriter, Cian
Subject: [PATCH] lib/librte_ether: change socket_id passed to 
rte_memzone_reserve

Change the socket id that is passed to rte_memzone_reserve from the socket id 
of current logical core to the socket id of the master_lcore.
---
 lib/librte_ether/rte_ethdev.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
 mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
old mode 100644
new mode 100755
index 95f2ceb..835540d
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void)
if (rte_eal_process_type() == RTE_PROC_PRIMARY){
mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
-   rte_socket_id(), flags);
+   rte_lcore_to_socket_id(rte_get_master_lcore()), flags);
} else
mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
if (mz == NULL)
--
1.7.4.1



[dpdk-dev] [PATCH] lib/librte_ether: change socket_id passed to rte_memzone_reserve

2015-01-13 Thread Cian Ferriter
Change the socket id that is passed to rte_memzone_reserve from
the socket id of current logical core to the socket id of the
master_lcore.
---
 lib/librte_ether/rte_ethdev.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
 mode change 100644 => 100755 lib/librte_ether/rte_ethdev.c

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
old mode 100644
new mode 100755
index 95f2ceb..835540d
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -184,7 +184,7 @@ rte_eth_dev_data_alloc(void)
if (rte_eal_process_type() == RTE_PROC_PRIMARY){
mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
-   rte_socket_id(), flags);
+   rte_lcore_to_socket_id(rte_get_master_lcore()), flags);
} else
mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
if (mz == NULL)
-- 
1.7.4.1



[dpdk-dev] daemon process problem in DPDK

2015-01-13 Thread Hiroshi Shimamoto
Hi,

> Subject: Re: [dpdk-dev] daemon process problem in DPDK
> 
> Much appreciated, got it now.
> 
> Thanks,
> Xun
> 
> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, January 13, 2015 3:14 AM
> To: Neil Horman
> Cc: Ni, Xun; dev at dpdk.org
> Subject: Re: [dpdk-dev] daemon process problem in DPDK
> 
> On Mon, 12 Jan 2015 09:52:10 -0500
> Neil Horman  wrote:
> 
> > On Mon, Jan 12, 2015 at 02:28:20PM +, Ni, Xun wrote:
> > > Hello:
> > >
> > >I have basic questions related to dpdk and trying to find help.
> > >
> > >I am about to create a daemon process, is there a way for other
> > > process to know whether the daemon is already created? I doesn't mean
> > > to get the pid, because it changes every time.
> > >
> > >If the daemon is created, how do other process to communicate with
> > > this daemon? Dpdk seems to have rte ring but it only exists on the
> > > Ethernet, while I am talking about the process within the same computer,
> > > and the way like share-memory, but I didn't find examples about the
> > > share memory between processes.
> > >
> > > Thanks,
> > > Xun
> > >
> > >
> >
> > That's not really a DPDK question, that's a generic programming question.
> > You can do this lots of ways.  Open a socket that other processes can
> > connect to on an agreed port, create a shared memory segment, write a
> > file with connect information to a well-known location, etc.
> > Neil
> >
> 
> We did have to make some changes to the basic application model (not in DPDK) 
> to allow for a daemon.
> 
> The normal/correct way to make a daemon is to use the daemon glibc call, and 
> this closes all file descriptors etc. Therefore
> the DPDK (eal) must be initialized after the daemon call.

How about having a daemon option in the DPDK EAL?

I think that many network service programs work as daemons.
If DPDK had a daemon option, it might be helpful.

thanks,
Hiroshi

> 
> Also, we wanted to make the daemon optional for debugging.
> This led to a change where the main program processes the application argv
> first, then passes DPDK args as a second group. This is the inverse of the
> example applications.
> 
> 
> int
> main(int argc, char **argv)
> {
>   int ret;
> char *progname;
> 
>   progname = strrchr(argv[0], '/');
>   progname = strdup(progname ? progname + 1 : argv[0]);
> 
>   ret = parse_args(argc, argv);
>   if (ret < 0)
>   return -1;
> 
>   argc -= ret;
>   argv += ret;
> 
>   if (daemon_mode && daemon(1, 1) < 0)
>   return -1;
> 
>   /* workaround fact that EAL expects progname as first argument */
>   argv[0] = progname;
> 
>   ret = rte_eal_init(argc, argv);
>   if (ret < 0)
>   return -1;


[dpdk-dev] [PATCH RFC 00/13] Update build system

2015-01-13 Thread Neil Horman
On Mon, Jan 12, 2015 at 04:33:53PM +, Sergio Gonzalez Monroy wrote:
> This patch series updates the DPDK build system.
> 
> Following are the goals it tries to accomplish:
>  - Create a library containing core DPDK libraries (librte_eal,
>librte_malloc, librte_mempool, librte_mbuf and librte_ring).
>The idea of core libraries is to group those libraries that are
>always required for any DPDK application.
>  - Remove config option to build a combined library.
>  - For shared libraries, explicitly link against dependent
>libraries (adding entries to DT_NEEDED).
>  - Update app linking flags against static/shared DPDK libs.
> 
> Note that this patch turns up being quite big because of moving lib
> directories to a new subdirectory.
> I have omitted the actual diff from the patch doing the move of librte_eal
> as it is quite big (6MB). Probably a different approach is preferred.
> 
> Sergio Gonzalez Monroy (13):
>   mk: Remove combined library and related options
>   lib/core: create new core dir and makefiles
>   core: move librte_eal to core subdir
>   core: move librte_malloc to core subdir
>   core: move librte_mempool to core subdir
>   core: move librte_mbuf to core subdir
>   core: move librte_ring to core subdir
>   Update path of core libraries
>   mk: new corelib makefile
>   lib: Set LDLIBS for each library
>   mk: Use LDLIBS when linking shared libraries
>   mk: update apps build
>   mk: add -lpthread to linuxapp EXECENV_LDLIBS
> 
Series
Acked-by: Neil Horman 



[dpdk-dev] Callbacks after buffer (mbuf) sent out

2015-01-13 Thread Vithal S Mohare
Hi,

I am looking for application callbacks after mbufs are sent (tx) out
successfully. One of the use cases is async multicast (over different GRE
tunnels etc). Using direct/indirect buffers along with ref-count alone is
not enough, as the actual 'pkt-data' itself changes while flooding on a list of
tunnels.

Thanks,
-Vithal


[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine

2015-01-13 Thread Liu, Jijiang
Hi,

> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, January 12, 2015 7:43 PM
> To: Liu, Jijiang; Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> csum forwarding engine
> 
> Hi Jijiang,
> 
> Please find some comments below.
> 
> On 01/12/2015 04:41 AM, Liu, Jijiang wrote:
> > There are some examples for the different packet types:
> >
> > 1. For L2 Packet types:
> > MAC, ARP
> > MAC, PAY2
> > ...
> > They are forwarded without being modified no matter if these above
> > commands are set.
> 
> ok
> 
> >  2. For Non Tunneled IPv4/6 packet
> > MAC, IPV4, UDP, PAY4
> > MAC, IPV6, UDP, PAY4
> > ...
> > Ipv4:
> > tx_checksum set  ip   hw
> > tx_checksum set  udp   hw
> >
> > IPv6:
> > tx_checksum set  udp   hw
> >
> > They are forwarded with TX checksum offload if these above commands are
> > set.
> 
> Two questions here:
> 
> - today, we also have the "sw" argument that allows to calculate the
>   checksum in software. Do you plan to keep this behavior?

Yes

> - today, the csumonly forward engine modifies the IP addresses to
>   validate that it is able to recalculate the checksum. Do you plan
>   to keep this behavior? 
Yes

> I'm not opposed to remove it if it makes
>   the code more complex.


> > 3. For Tunneled IPv4/6 packet
> >
> > See the above test cases:
> > Test case A
> > test case B.1
> > test case B.2
> > test case C
> >
> > They are forwarded with TX checksum offload if these above commands are
> > set.
> >
> >> I think that the test-pmd command API should define a behavior for
> >> the csum forward engine for any packet. What do you think?
> >
> > Agree.
> >
> > Let me explain the checksum offload behavior of different packet type
> > below,
> >
> > 1. For L2 Packet types:
> > Checksum offload behavior definition:
> > tx_checksum set sw-tunnel-mode on :   NONE
> > tx_checksum set hw-tunnel-mode on:   NONE
> > tx_checksum set  outer-ip|ip|tcp|udp|sctp   hw: NONE
> >
> > 2. For Non Tunneled IPv4/6 packet
> >
> > Checksum offload behavior definition:
> >
> > tx_checksum set sw-tunnel-mode on :NONE
> > tx_checksum set hw-tunnel-mode on: NONE
> > tx_checksum set  outer-ip|ip|tcp|udp|sctp   hw: ip|tcp|udp|sctp options
> > are VALID
> >
> > 3. For Tunneled IPv4/6 packet
> > Checksum offload behavior definition:
> >
> > tx_checksum set sw-tunnel-mode on :VALID
> > tx_checksum set hw-tunnel-mode on: VALID
> > tx_checksum set  outer-ip|ip|tcp|udp|sctp   hw: VALID
> >
> > You are very welcome to propose a better solution that is able to cover
> > all the cases in
> > http://dpdk.org/ml/archives/dev/2014-December/009213.html and all
> > packet types in csum fwd engine.
> 
> Thank you for your efforts to explain your proposition. I still have some
> difficulties understanding the naming "sw-tunnel" and "hw-tunnel".


Again, I'd like to explain the behavior of the following two commands.

1. tx_checksum set sw-tunnel-mode on/off

2. tx_checksum set hw-tunnel-mode on/off

For command 1, setting or clearing sw-tunnel-mode sets or clears a testpmd
flag that is used when analyzing incoming packets. The pseudo-code is listed
below:

if (sw-tunnel-mode)
   Csum fwd engine will analyze whether the incoming packet is a tunneling packet.
   tunnel = 1;
else
   Csum fwd engine will not analyze whether the incoming packet is a tunneling
   packet, and will treat all incoming packets as non-tunneling packets.
   It is used for A.


For command 2, setting or clearing hw-tunnel-mode sets or clears a testpmd
flag that controls how tunneling packets are handled. The pseudo-code is
listed below:

if (tunnel == 1) { // this is a tunneling packet
   if (hw-tunnel-mode)
      ol_flags |= PKT_TX_UDP_TUNNEL_PKT;
      Csum fwd engine sets the PKT_TX_UDP_TUNNEL_PKT offload flag, which tells
      HW to treat the transmit packet as a tunneling packet for checksum offload.
      It is used for B.1
   else
      Csum fwd engine doesn't set the PKT_TX_UDP_TUNNEL_PKT offload flag, which
      tells HW to treat the packet as an ordinary (non-tunnelled) packet.
      It is used for B.2
}
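
The two pieces of pseudo-code above can be combined into one small C sketch.
This is only an illustration under stated assumptions: struct tunnel_cfg and
tunnel_ol_flags() are hypothetical names, and SKETCH_PKT_TX_UDP_TUNNEL_PKT is
a placeholder bit standing in for the real PKT_TX_UDP_TUNNEL_PKT from
rte_mbuf.h, whose value is not reproduced here.

```c
#include <stdint.h>

/* Placeholder bit for this sketch; not the real PKT_TX_UDP_TUNNEL_PKT value. */
#define SKETCH_PKT_TX_UDP_TUNNEL_PKT (1ULL << 0)

/* Hypothetical testpmd-side switches corresponding to commands 1 and 2. */
struct tunnel_cfg {
	int parse_tunnel; /* command 1: analyze whether a packet is tunneled */
	int hw_tunnel;    /* command 2: ask HW to treat it as a tunneling packet */
};

/*
 * is_tunnel_pkt is what the packet parser would report for this packet.
 * Returns ol_flags, with the tunnel bit added only in case B.1.
 */
static uint64_t
tunnel_ol_flags(const struct tunnel_cfg *cfg, int is_tunnel_pkt,
		uint64_t ol_flags)
{
	/* Command 1 off: every packet is treated as non-tunneling (case A). */
	int tunnel = cfg->parse_tunnel && is_tunnel_pkt;

	if (tunnel && cfg->hw_tunnel)
		ol_flags |= SKETCH_PKT_TX_UDP_TUNNEL_PKT; /* case B.1 */
	/* Otherwise case B.2 or a non-tunnel packet: flags stay untouched. */

	return ol_flags;
}
```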


> From the user point of view "sw" means "software" and "hw" means "hardware".
> I think it's difficult to understand how both can be on at the same time.
> Maybe it's just a naming problem?
>


Yes.
Your comments make sense, and I think it's just a naming problem. I will
combine the two hw/sw-tunnel-mode commands into one command in order to make
it as simple and understandable as possible.

tx_checksum set tunnel-mode (hw|none)

When the user sets the 'hw' option, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag
will be set in cmdline; actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will
be set if the testpmd f

[dpdk-dev] daemon process problem in DPDK

2015-01-13 Thread Ni, Xun
Much appreciated, got it now.

Thanks,
Xun

-Original Message-
From: Stephen Hemminger [mailto:step...@networkplumber.org] 
Sent: Tuesday, January 13, 2015 3:14 AM
To: Neil Horman
Cc: Ni, Xun; dev at dpdk.org
Subject: Re: [dpdk-dev] daemon process problem in DPDK

On Mon, 12 Jan 2015 09:52:10 -0500
Neil Horman  wrote:

> On Mon, Jan 12, 2015 at 02:28:20PM +, Ni, Xun wrote:
> > Hello:
> > 
> >I have basic questions related to dpdk and trying to find help.
> > 
> >I am about to create a daemon process, is there a way for other process 
> > to know whether the daemon is already created? I doesn't mean to get the 
> > pid, because it changes every time.
> > 
> >If the daemon is created, how do other process to communicate with this 
> > daemon? Dpdk seems to have rte ring but it only exists on the Ethernet, 
> > while I am talking about the process within the same computer, and the way 
> > like share-memory, but I didn't find examples about the share memory 
> > between processes.
> > 
> > Thanks,
> > Xun
> > 
> > 
> 
> That's not really a DPDK question, that's a generic programming question.
> You can do this lots of ways.  Open a socket that other processes can
> connect to on an agreed port, create a shared memory segment, write a
> file with connect information to a well-known location, etc.
> Neil
> 

We did have to make some changes to the basic application model (not in DPDK) 
to allow for a daemon.

The normal/correct way to make a daemon is to use the daemon glibc call, and 
this closes all file descriptors etc. Therefore the DPDK (eal) must be 
initialized after the daemon call.

Also, we wanted to make the daemon optional for debugging.
This led to a change where the main program processes the application argv
first, then passes DPDK args as a second group. This is the inverse of the
example applications.


int
main(int argc, char **argv)
{
	int ret;
	char *progname;

	progname = strrchr(argv[0], '/');
	progname = strdup(progname ? progname + 1 : argv[0]);

	ret = parse_args(argc, argv);
	if (ret < 0)
		return -1;

	argc -= ret;
	argv += ret;

	if (daemon_mode && daemon(1, 1) < 0)
		return -1;

	/* workaround fact that EAL expects progname as first argument */
	argv[0] = progname;

	ret = rte_eal_init(argc, argv);
	if (ret < 0)
		return -1;
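
The parse_args() helper is not shown in the message above. One way it could
work, purely as an assumption, is to consume application options up to a "--"
separator and return the index of that separator, so that after
"argc -= ret; argv += ret;" the slot argv[0] can be overwritten with the
program name and the EAL options follow it. The "-d" option and the "--"
separator are hypothetical choices for this sketch.

```c
#include <string.h>

/* Set when the hypothetical application option "-d" is given. */
static int daemon_mode;

/*
 * Consume application options that precede a "--" separator.
 * Returns the index of the "--" entry, which the caller uses to shift
 * argc/argv so that the EAL arguments start at argv[1].
 * Returns -1 on an unknown option or when the separator is missing.
 */
static int
parse_args(int argc, char **argv)
{
	int i;

	for (i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--") == 0)
			return i;
		if (strcmp(argv[i], "-d") == 0)
			daemon_mode = 1;
		else
			return -1;
	}
	return -1;
}
```

For example, with a command line of the form "app -d -- -c 0x3", this helper
would return 2, leaving "-c 0x3" as the second group that is handed to
rte_eal_init().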


[dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode

2015-01-13 Thread Ouyang, Changchun


From: Vlad Zolotarov [mailto:vl...@cloudius-systems.com]
Sent: Monday, January 12, 2015 9:59 PM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode


On 01/12/15 05:41, Ouyang, Changchun wrote:


From: Vlad Zolotarov [mailto:vl...@cloudius-systems.com]
Sent: Friday, January 09, 2015 9:50 PM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode


On 01/09/15 07:54, Ouyang, Changchun wrote:





-Original Message-
From: Vlad Zolotarov [mailto:vl...@cloudius-systems.com]
Sent: Friday, January 9, 2015 2:49 AM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode

On 01/08/15 11:19, Vlad Zolotarov wrote:

On 01/07/15 08:32, Ouyang Changchun wrote:
Check mq mode for VMDq RSS, handle it correctly instead of returning
an error; Also remove the limitation of per pool queue number has max
value of 1, because the per pool queue number could be 2 or 4 if it
is VMDq RSS mode;

The number of rxq specified in config will determine the mq mode for
VMDq RSS.

Signed-off-by: Changchun Ouyang 

changes in v5:
   - Fix '<' issue, it should be '<=' to test rxq number;
   - Extract a function to remove the embedded switch-case statement.

---
  lib/librte_ether/rte_ethdev.c | 50 ++-
  1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 95f2ceb..8363e26 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -503,6 +503,31 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
  }
static int
+rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
+{
+struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+switch (nb_rx_q) {
+case 1:
+case 2:
+RTE_ETH_DEV_SRIOV(dev).active =
+ETH_64_POOLS;
+break;
+case 4:
+RTE_ETH_DEV_SRIOV(dev).active =
+ETH_32_POOLS;
+break;
+default:
+return -EINVAL;
+}
+
+RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = nb_rx_q;
+RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
+dev->pci_dev->max_vfs * nb_rx_q;
+
+return 0;
+}
+
+static int
  rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
const struct rte_eth_conf *dev_conf)
  {
@@ -510,8 +535,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
  /* check multi-queue mode */
-if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_RSS) ||
-(dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
+if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
  (dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB_RSS) ||
  (dev_conf->txmode.mq_mode == ETH_MQ_TX_DCB)) {
  /* SRIOV only works in VMDq enable mode */
@@ -525,7 +549,6 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
  }
switch (dev_conf->rxmode.mq_mode) {
-case ETH_MQ_RX_VMDQ_RSS:
  case ETH_MQ_RX_VMDQ_DCB:
  case ETH_MQ_RX_VMDQ_DCB_RSS:
  /* DCB/RSS VMDQ in SRIOV mode, not implement yet */
@@ -534,6 +557,25 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
  "unsupported VMDQ mq_mode rx %u\n",
  port_id, dev_conf->rxmode.mq_mode);
  return (-EINVAL);
+case ETH_MQ_RX_RSS:
+PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
+" SRIOV active, "
+"Rx mq mode is changed from:"
+"mq_mode %u into VMDQ mq_mode %u\n",
+port_id,
+dev_conf->rxmode.mq_mode,
+dev->data->dev_conf.rxmode.mq_mode);
+case ETH_MQ_RX_VMDQ_RSS:
+dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+if (nb_rx_q <= RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)
+if (rte_eth_dev_check_vf_rss_rxq_num(port_id, nb_rx_q) != 0) {
+PMD_DEBUG_TRACE("ethdev port_id=%d"
+" SRIOV active, invalid queue"
+" number for VMDQ RSS\n",
+port_id);

Some nitpicking here: I'd add the allowed values descriptions to the
error message. Something like: "invalid queue number for VMDQ RSS.
Allowed values are 1, 2 or 4\n".

+return -EINVAL;
+}
+break;
  default: /* ETH_MQ_RX_VMDQ_ONLY or ETH_MQ_RX_NONE */
  /* if nothing mq mode configure, use default scheme */
  dev->d