[dpdk-dev] [PATCH v1] afpacket: fix critical issue reported by klocwork

2015-02-25 Thread Liang, Cunming


> -----Original Message-----
> From: John W. Linville [mailto:linville at tuxdriver.com]
> Sent: Saturday, February 21, 2015 2:39 AM
> To: Thomas Monjalon
> Cc: Liang, Cunming; dev at dpdk.org; John Linville
> Subject: Re: [dpdk-dev] [PATCH v1] afpacket: fix critical issue reported by
> klocwork
> 
> On Fri, Feb 20, 2015 at 11:19:59AM +0100, Thomas Monjalon wrote:
> > Hi Cunming,
> >
> > You would have more chance to have a review by CC'ing John.
> > I checked your patch and have a comment below.
> >
> > 2015-02-12 17:08, Cunming Liang:
> > > Klocwork reports that 'req' might be used uninitialized.
> > > In some cases the code can 'goto error' before '*internals' has been set.
> > > Checking the value of '*internals' then gives an unexpected result.
> > >
> > > Signed-off-by: Cunming Liang 
> > > ---
> > >  lib/librte_pmd_af_packet/rte_eth_af_packet.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> b/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > index 1ffe1cd..185607d 100644
> > > --- a/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > +++ b/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > @@ -439,13 +439,15 @@ rte_pmd_init_internals(const char *name,
> > >   size_t ifnamelen;
> > >   unsigned k_idx;
> > >   struct sockaddr_ll sockaddr;
> > > - struct tpacket_req *req;
> > > + struct tpacket_req *req = NULL;
> >
> > If *internals is set to NULL, there should be no case where req used
> > and undefined.
[LCM] Agreed. That is why I also add '*internals = NULL' below.
> 
> I agree -- it looks to me like req is protected by checking for
> *internals == NULL.  I don't think this patch is necessary.
[LCM] The main point of the patch is adding the '*internals = NULL;' initialization.
> 
> > >   struct pkt_rx_queue *rx_queue;
> > >   struct pkt_tx_queue *tx_queue;
> > >   int rc, qsockfd, tpver, discard;
> > >   unsigned int i, q, rdsize;
> > >   int fanout_arg __rte_unused, bypass __rte_unused;
> > >
> > > + *internals = NULL;
> > > +
> > >   for (k_idx = 0; k_idx < kvlist->count; k_idx++) {
> > >   pair = &kvlist->pairs[k_idx];
> > >   if (strstr(pair->key, ETH_AF_PACKET_IFACE_ARG) != NULL)
> > >
> >
> >
> >
> >
> 
> --
> John W. Linville  Someday the world will need a hero, and you
> linville at tuxdriver.com might be all we have.  Be ready.
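
The point under discussion (set the out-parameter to NULL before the first 'goto error', so the cleanup path can test it safely) can be shown in isolation. The following is only a minimal sketch, not the af_packet driver code: the struct, the function name init_internals() and the fail_early argument are hypothetical stand-ins.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private data. */
struct pmd_internals {
	int sockfd;
	void *req;
};

/*
 * Hypothetical init helper: returns 0 on success, -1 on failure.
 * 'fail_early' simulates an argument check that fails before anything
 * has been allocated.
 */
static int
init_internals(int fail_early, struct pmd_internals **internals)
{
	/* Set the out-parameter first so every 'goto error' sees a defined value. */
	*internals = NULL;

	if (fail_early)
		goto error;

	*internals = calloc(1, sizeof(**internals));
	if (*internals == NULL)
		goto error;

	(*internals)->req = malloc(64);
	if ((*internals)->req == NULL)
		goto error;

	return 0;

error:
	/* On the early path *internals is NULL, so nothing is dereferenced. */
	if (*internals != NULL) {
		free((*internals)->req);
		free(*internals);
		*internals = NULL;
	}
	return -1;
}
```

Without the first assignment, a caller passing an uninitialized pointer would reach the error label with an indeterminate '*internals', which is the situation the patch is guarding against.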


[dpdk-dev] [PATCH v8 00/19] support multi-pthread per core

2015-02-25 Thread Liang, Cunming


> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, February 25, 2015 2:53 AM
> To: Liang, Cunming
> Cc: Ananyev, Konstantin; dev at dpdk.org; olivier.matz at 6wind.com;
> nhorman at tuxdriver.com
> Subject: Re: [PATCH v8 00/19] support multi-pthread per core
> 
> > > v8 changes:
> > >   keep using strlen for trusted input string
> > >
> > > v7 changes:
> > >   update EAL version map for new public EAL API
> > >   rollback to use strnlen() passing EAL core option
> > >
> > > v6 changes:
> > >   rename RTE_RING_PAUSE_REP(_COUNT) and set default to 0
> > >   rollback to use RTE_MAX_LCORE when checking valid lcore_id for EAL thread
> > >
> > > v5 changes:
> > >   reorder some patches and split into two additional patches
> > >   rte_thread_get_affinity() return type change to avoid
> > >   add RTE_RING_PAUSE_REP into config and by default turn off
> > >
> > > v4 changes:
> > >   new patch fixing strnlen() invalid return in 32bit icc [03/17]
> > >   update and add more comments on sched_yield() [16/17]
> > >
> > > v3 changes:
> > >   new patch adding sched_yield() in rte_ring to avoid long spin [16/17]
> > >
> > > v2 changes:
> > >   add '-' support for EAL option '--lcores' [02/17]
> > >
> > > The patch series contains EAL enhancements and library fixes that allow
> > > running multiple pthreads (either EAL or non-EAL threads) per physical core.
> > > The two major changes are listed below:
> > > - Extend the core affinity of each EAL thread to 1:n.
> > >   Each lcore stands for an EAL thread rather than a logical core.
> > >   The change adds a new EAL option to allow static lcore-to-cpuset
> > >   assignment.
> > >   An lcore (EAL thread) then has affinity to a cpuset; the original 1:1
> > >   mapping is the special case.
> > > - Fix the libraries to allow running on any non-EAL thread.
> > >   This closes the gaps when running libraries in non-EAL threads
> > >   (dynamically created by the user).
> > >   Each fixed library handles the case of rte_lcore_id() >= RTE_MAX_LCORE.
> > >
> > > Thanks a million for the comments from Konstantin, Bruce, Mirek and
> > > Stephen in RFC review.
> > >
> > > Cunming Liang (19):
> > >   eal: add cpuset into per EAL thread lcore_config
> > >   eal: fix PAGE_SIZE redefine complaint on freebsd
> > >   eal: new eal option '--lcores' for cpu assignment
> > >   eal: fix wrong strnlen() return value in 32bit icc
> > >   eal: add public function parsing socket_id from cpu_id
> > >   eal: new TLS definition and API declaration
> > >   eal: add eal_common_thread.c for common thread API
> > >   eal: standardize init sequence between linux and bsd
> > >   eal: add rte_gettid() to acquire unique system tid
> > >   eal: apply affinity of EAL thread by assigned cpuset
> > >   enic: fix re-define freebsd compile complain
> > >   malloc: fix the issue of SOCKET_ID_ANY
> > >   log: fix the gap to support non-EAL thread
> > >   eal: set _lcore_id and _socket_id to (-1) by default
> > >   eal: fix recursive spinlock in non-EAL thraed
> > >   mempool: add support to non-EAL thread
> > >   ring: add support to non-EAL thread
> > >   ring: add sched_yield to avoid spin forever
> > >   timer: add support to non-EAL thread
> >
> > Acked-by: Konstantin Ananyev 
> 
> I tried to fix many english typos. Please consider it during reviews.
> Cunming, you'll repeat 10 times "non-EAL threads compute more than none" ;)
> 
> Applied, thanks
[Liang, Cunming] Thanks, Thomas. I'll take care of it next time. :)
> 
> My main concern in this patchset is about naming. Now lcore means thread
> in many places. I would prefer to have a cleanup to use right term at
> right place, even if it requires breaking API.
[Liang, Cunming] 'lcore' is used only to mean an EAL thread.
Compared with the legacy usage, the difference is that such an lcore (logical
core) is no longer necessarily affinitized to a single physical core.
If we widen the meaning of 'lcore' a little (as the prog_guide doc says, let's
consider that a logical core stands for an EAL thread), it makes sense to keep
the original APIs.
That helps existing apps migrate transparently.
> 
> Are we going to deprecate the fresh option -l in favor of --lcores/--threads?
[Liang, Cunming] I think so, as '--lcores' already covers the '-l' pattern.
Shall we mark it as deprecated and remove it in the next version?
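
The cover letter above says each fixed library handles the case of rte_lcore_id() >= RTE_MAX_LCORE, and one patch sets _lcore_id to (-1) by default. A minimal sketch of that idea follows. It does not use DPDK itself: MAX_LCORE, LCORE_ID_ANY, this_lcore_id and cache_for_current_thread() are hypothetical names chosen for illustration, and __thread assumes GCC or Clang.

```c
#include <stddef.h>

#define MAX_LCORE    128              /* stands in for RTE_MAX_LCORE */
#define LCORE_ID_ANY ((unsigned)-1)   /* id of a thread not created by EAL */

/* Hypothetical per-lcore resource, e.g. a local object cache. */
struct local_cache {
	unsigned len;
};

static struct local_cache caches[MAX_LCORE];

/* Per-thread id; the default value marks a non-EAL thread. */
static __thread unsigned this_lcore_id = LCORE_ID_ANY;

/*
 * Return the per-lcore cache for the calling thread, or NULL when the
 * thread has no valid lcore id. The caller must then fall back to the
 * shared (slower) path instead of indexing the array out of bounds.
 */
static struct local_cache *
cache_for_current_thread(void)
{
	if (this_lcore_id >= MAX_LCORE)
		return NULL;
	return &caches[this_lcore_id];
}
```

Because the default id is (unsigned)-1, a single unsigned comparison against the array size rejects every non-EAL thread.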



[dpdk-dev] [PATCH v1] doc: prog guide update for eal multi-pthread

2015-02-25 Thread Liang, Cunming
I'm afraid not yet, so I would appreciate any suggestions for revision.

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, February 25, 2015 3:11 AM
> To: Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] doc: prog guide update for eal 
> multi-pthread
> 
> 2015-02-16 15:34, Cunming Liang:
> > The patch add the multi-pthread section under EAL chapter of prog_guide.
> >
> > Signed-off-by: Cunming Liang 
> 
> I guess this documentation has been co-written with a native english?
> 
> Applied, thanks



[dpdk-dev] Manage DPDK port capability via KNI

2015-02-25 Thread Tim Deng
Hi,


I am wondering how we could manage the offload capabilities of a DPDK port.
For example, if we want to disable the TSO capability on a DPDK port, is it
feasible to configure a KNI device with ethtool and have that configuration
synced to the DPDK port?


Thanks,
Tim


[dpdk-dev] Cannot compile l2fwd-jobstats example

2015-02-25 Thread Tetsuya Mukawa
Hi,

I cannot compile l2fwd-jobstats on the master branch.
Here is the log:

$ T=x86_64-native-linuxapp-gcc make examples
== Build examples for x86_64-native-linuxapp-gcc
== bond
== cmdline
== distributor
== exception_path
== helloworld
== ip_pipeline
== ip_reassembly
== ipv4_multicast
== kni
== l2fwd
== l2fwd-jobstats
make: *** l2fwd-jobstats: No such file or directory.  Stop.
make[2]: *** [l2fwd-jobstats] Error 2
make[1]: *** [x86_64-native-linuxapp-gcc_examples] Error 2
make: *** [examples] Error 2


Bisecting shows that this error appears after the commit below was
applied.

commit 2caeb8c0141dcf488f2d68aa8e8c44d1f85ed28b
Author: Pawel Wodkowski 
Date:   Tue Feb 24 17:33:24 2015 +0100

examples/l2fwd-jobstats: new example


Thanks,
Tetsuya



[dpdk-dev] Missing symbol error

2015-02-25 Thread Tetsuya Mukawa
Hi,

I get the following error when I enable CONFIG_RTE_BUILD_SHARED_LIB.

dpdk/x86_64-native-linuxapp-gcc/lib/libethdev.so: undefined reference to
`per_lcore__socket_id'
collect2: error: ld returned 1 exit status
make[5]: *** [dump_cfg] Error 1
make[4]: *** [dump_cfg] Error 2
make[4]: *** Waiting for unfinished jobs


It seems this issue occurs after the commit below was applied.
8baacdd... eal: apply thread affinity by assigned cpuset

Thanks,
Tetsuya


[dpdk-dev] [PATCH] virtio: Changed variable types to prevent possible data loss in virtio_ethdev.c, virtio_ethdev.h and virtio_rxtx.c

2015-02-25 Thread Ouyang, Changchun


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maciej Gajdzica
> Sent: Saturday, February 21, 2015 12:13 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] virtio: Changed variable types to prevent
> possible data loss in virtio_ethdev.c, virtio_ethdev.h and virtio_rxtx.c
> 
> Changed vtpci_queue_idx type in function virtio_dev_queue_setup from
> uint8_t to uint16_t to prevent possible data loss. Also changed type of head
> variable in function virtio_send_command from uint32_t to uint16_t.
> Issues found with static code analysis tool.
> 
> Signed-off-by: Maciej Gajdzica 
> ---

Acked-by: Changchun Ouyang 


[dpdk-dev] Missing symbol error

2015-02-25 Thread Liang, Cunming
Hi,

You're right, it's missing from the version map.
I will send a patch to fix it.

Thanks.

> -----Original Message-----
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Wednesday, February 25, 2015 10:49 AM
> To: dev at dpdk.org
> Cc: Liang, Cunming
> Subject: Missing symbol error
> 
> Hi,
> 
> I get the following error when I enable CONFIG_RTE_BUILD_SHARED_LIB.
> 
> dpdk/x86_64-native-linuxapp-gcc/lib/libethdev.so: undefined reference to
> `per_lcore__socket_id'
> collect2: error: ld returned 1 exit status
> make[5]: *** [dump_cfg] Error 1
> make[4]: *** [dump_cfg] Error 2
> make[4]: *** Waiting for unfinished jobs
> 
> 
> It seems this issue occurs after the commit below was applied.
> 8baacdd... eal: apply thread affinity by assigned cpuset
> 
> Thanks,
> Tetsuya


[dpdk-dev] [PATCH] app/test: add crc32 algorithms equivalence check

2015-02-25 Thread Yerden Zhumabekov

24.02.2015 20:57, Bruce Richardson wrote:
> +#define CRC32_ITERATIONS (1U << 16)
> This test takes almost no time at all, so maybe we want to do a few more
> iterations e.g. 2^18 - 2^20. 
Noted, I'll put (1U << 20).
>> +printf("# CRC32 implementations equivalence test\n");
>> +for (i = 0; i < CRC32_ITERATIONS; i++) {
>> +/* Randomizing data_len of data set */
>> +data_len = (size_t) (rte_rand() % sizeof(data64) + 1);
> I suggest parenthesis around the % operation for clarity.
Noted.
>> +init_val = (uint32_t) rte_rand();
>> +
>> +/* Fill the data set */
>> +for (j = 0; j < CRC32_DWORDS; j++) {
>> +data64[j] = rte_rand();
>> +}
> As a matter of style, we generally omit braces for single-statement loop 
> bodies.
Noted.
>> +
>> +/* Calculate software CRC32 */
>> +rte_hash_crc_set_alg(CRC32_SW);
>> +hash_val = rte_hash_crc(data64, data_len, init_val);
>> +
>> +/* Check against 4-byte-operand sse4.2 CRC32 if available */
>> +rte_hash_crc_set_alg(CRC32_SSE42);
>> +if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
>> +res = -1;
> I think you need a print statement here, stating that the test failed, and
> why exactly it failed.
> Also, rather than setting res to -1, you can just do a print and break, and
> change "return res" below to "return i == CRC32_ITERATIONS ? 0 : -1", making
> use of the fact that you can check i to detect early termination on error.

Noted. In that case I'll also print the test data that caused the break;
it might be handy for further investigation.

-- 
Sincerely,

Yerden Zhumabekov
State Technical Service
Astana, KZ
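
The loop structure agreed on in the review (print on mismatch, break, and derive the result from the loop counter) can be sketched without DPDK. This is an illustration under stated assumptions, not the submitted test: it compares two software CRC32C routines written here (bitwise and table-driven) instead of rte_hash_crc() with CRC32_SW and CRC32_SSE42, and it uses a small xorshift generator in place of rte_rand(). All function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CRC32C_POLY 0x82F63B78u   /* reflected Castagnoli polynomial */
#define DATA_BYTES  64

/* Reference implementation: one bit at a time. */
static uint32_t
crc32c_bitwise(const uint8_t *data, size_t len, uint32_t crc)
{
	size_t i;
	int b;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (b = 0; b < 8; b++)
			crc = (crc >> 1) ^ (CRC32C_POLY & (0u - (crc & 1u)));
	}
	return crc;
}

/* Second implementation: one byte at a time through a lookup table. */
static uint32_t
crc32c_table(const uint8_t *data, size_t len, uint32_t crc)
{
	static uint32_t table[256];
	static int ready;
	size_t i;

	if (!ready) {
		uint32_t n;

		for (n = 0; n < 256; n++) {
			uint8_t byte = (uint8_t)n;

			table[n] = crc32c_bitwise(&byte, 1, 0);
		}
		ready = 1;
	}
	for (i = 0; i < len; i++)
		crc = table[(crc ^ data[i]) & 0xff] ^ (crc >> 8);
	return crc;
}

/* Small xorshift generator standing in for rte_rand(). */
static uint32_t
next_rand(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Returns 0 if both implementations agree on every iteration, -1 otherwise. */
static int
crc_equivalence_test(uint32_t iterations)
{
	uint8_t data[DATA_BYTES];
	uint32_t seed = 2463534242u;
	uint32_t i, init, a, b;
	size_t j, len;

	for (i = 0; i < iterations; i++) {
		/* Randomize length (1..DATA_BYTES), initial value and contents. */
		len = (next_rand(&seed) % sizeof(data)) + 1;
		init = next_rand(&seed);
		for (j = 0; j < sizeof(data); j++)
			data[j] = (uint8_t)next_rand(&seed);

		a = crc32c_bitwise(data, len, init);
		b = crc32c_table(data, len, init);
		if (a != b) {
			/* Report the failing input so it can be reproduced. */
			printf("mismatch at iteration %u: len=%zu init=0x%08x "
			       "bitwise=0x%08x table=0x%08x\n",
			       (unsigned)i, len, (unsigned)init,
			       (unsigned)a, (unsigned)b);
			break;
		}
	}
	/* An early break leaves i < iterations. */
	return i == iterations ? 0 : -1;
}
```

The final return expression is the one suggested in the review: no separate result variable is needed because the loop counter already records whether the loop ran to completion.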



[dpdk-dev] [PATCH v4 3/7] pmd: igb/ixgbe split nb_q_per_pool to rx and tx nb_q_per_pool

2015-02-25 Thread Ouyang, Changchun


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pawel Wodkowski
> Sent: Thursday, February 19, 2015 11:55 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 3/7] pmd: igb/ixgbe split nb_q_per_pool to rx
> and tx nb_q_per_pool
> 
> The number of rx and tx queues might differ if RX and TX are configured in
> different modes. This allows informing the VF about the proper number of queues.
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  lib/librte_ether/rte_ethdev.c   | 12 ++--
>  lib/librte_ether/rte_ethdev.h   |  3 ++-
>  lib/librte_pmd_e1000/igb_pf.c   |  3 ++-
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |  2 +-
>  lib/librte_pmd_ixgbe/ixgbe_pf.c |  9 +
>  5 files changed, 16 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 2e814db..4007054 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -520,7 +520,7 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id,
> uint16_t nb_rx_q)
>   return -EINVAL;
>   }
> 
> - RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = nb_rx_q;
> + RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = nb_rx_q;
>   RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
>   dev->pci_dev->max_vfs * nb_rx_q;
> 
> @@ -567,7 +567,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id,
> uint16_t nb_rx_q, uint16_t nb_tx_q,
>   dev->data-
> >dev_conf.rxmode.mq_mode);
>   case ETH_MQ_RX_VMDQ_RSS:
>   dev->data->dev_conf.rxmode.mq_mode =
> ETH_MQ_RX_VMDQ_RSS;
> - if (nb_rx_q <=
> RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)
> + if (nb_rx_q <=
> RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool)
>   if
> (rte_eth_dev_check_vf_rss_rxq_num(port_id, nb_rx_q) != 0) {
>   PMD_DEBUG_TRACE("ethdev
> port_id=%d"
>   " SRIOV active, invalid queue"
> @@ -580,8 +580,8 @@ rte_eth_dev_check_mq_mode(uint8_t port_id,
> uint16_t nb_rx_q, uint16_t nb_tx_q,
>   default: /* ETH_MQ_RX_VMDQ_ONLY or
> ETH_MQ_RX_NONE */
>   /* if nothing mq mode configure, use default scheme
> */
>   dev->data->dev_conf.rxmode.mq_mode =
> ETH_MQ_RX_VMDQ_ONLY;
> - if (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool > 1)
> - RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
> 1;
> + if (RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool > 1)
> +
>   RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = 1;
>   break;
>   }
> 
> @@ -600,8 +600,8 @@ rte_eth_dev_check_mq_mode(uint8_t port_id,
> uint16_t nb_rx_q, uint16_t nb_tx_q,
>   }
> 
>   /* check valid queue number */
> - if ((nb_rx_q > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) ||
> - (nb_tx_q > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)) {
> + if ((nb_rx_q > RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool)

Here, how about using nb_rx_q_per_pool instead of nb_tx_q_per_pool?
That would make it clearer that this checks the rx queue number.

> ||
> + (nb_tx_q > RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool))
> {
>   PMD_DEBUG_TRACE("ethdev port_id=%d SRIOV
> active, "
>   "queue number must less equal to %d\n",
>   port_id,
> RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool);
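
The review comment above asks that each direction be checked against its own per-pool limit. A minimal sketch of such a check follows; struct sriov_info and check_queue_numbers() are hypothetical stand-ins for illustration and are not the ethdev code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical subset of the SRIOV info kept per device. */
struct sriov_info {
	uint16_t nb_rx_q_per_pool;
	uint16_t nb_tx_q_per_pool;
};

/*
 * Validate the requested queue counts: rx against the rx limit and
 * tx against the tx limit. Returns 0 if both fit, -EINVAL otherwise.
 */
static int
check_queue_numbers(const struct sriov_info *sriov,
		    uint16_t nb_rx_q, uint16_t nb_tx_q)
{
	if (nb_rx_q > sriov->nb_rx_q_per_pool)
		return -EINVAL;
	if (nb_tx_q > sriov->nb_tx_q_per_pool)
		return -EINVAL;
	return 0;
}
```

With limits of 4 rx and 2 tx queues per pool, a request for 4 rx and 2 tx queues passes here, whereas comparing nb_rx_q against the tx limit would reject it.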



[dpdk-dev] [PATCH v4 5/7] pmd ixgbe: enable DCB in SRIOV

2015-02-25 Thread Ouyang, Changchun


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pawel Wodkowski
> Sent: Thursday, February 19, 2015 11:55 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 5/7] pmd ixgbe: enable DCB in SRIOV
> 
> Enable DCB in SRIOV mode for ixgbe driver.
> 
> To use DCB in a VF, the PF must configure its port as DCB + VMDQ and the VF
> must configure its port as DCB only. VFs are not allowed to change DCB
> settings that are common to all ports, such as the number of TCs.
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |  2 +-
>  lib/librte_pmd_ixgbe/ixgbe_pf.c | 19 ---
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 18 +++---
>  3 files changed, 24 insertions(+), 15 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> index 8e9da3b..7551bcc 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> @@ -1514,7 +1514,7 @@ ixgbe_dev_configure(struct rte_eth_dev *dev)
>   if (conf->nb_queue_pools != ETH_16_POOLS &&
>  conf->nb_queue_pools != ETH_32_POOLS) {
>   PMD_INIT_LOG(ERR, " VMDQ+DCB selected, "
> - "number of TX qqueue pools must
> be %d or %d\n",
> + "number of TX queue pools must
> be %d or %d\n",
>   ETH_16_POOLS, ETH_32_POOLS);
>   return (-EINVAL);
>   }
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c
> b/lib/librte_pmd_ixgbe/ixgbe_pf.c index a7b9333..7c4afba 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
> @@ -109,9 +109,12 @@ int ixgbe_pf_host_init(struct rte_eth_dev *eth_dev)
>   /* Fill sriov structure using default configuration. */
>   retval = ixgbe_pf_configure_mq_sriov(eth_dev);
>   if (retval != 0) {
> - if (retval < 0)
> - PMD_INIT_LOG(ERR, " Setting up SRIOV with default
> device "
> + if (retval < 0) {
> + PMD_INIT_LOG(ERR, "Setting up SRIOV with default
> device "
>   "configuration should not fail. This is 
> a
> BUG.");
> + return retval;
> + }
> +
>   return 0;
>   }
> 
> @@ -652,7 +655,9 @@ ixgbe_get_vf_queues(struct rte_eth_dev *dev,
> uint32_t vf, uint32_t *msgbuf)  {
>   struct ixgbe_vf_info *vfinfo =
>   *IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data-
> >dev_private);
> - uint32_t default_q = vf *
> RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool;
> + struct ixgbe_dcb_config *dcbinfo =
> + IXGBE_DEV_PRIVATE_TO_DCB_CFG(dev->data-
> >dev_private);
> + uint32_t default_q = RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx;

Why does default_q need to change here?

> 
>   /* Verify if the PF supports the mbox APIs version or not */
>   switch (vfinfo[vf].api_version) {
> @@ -670,10 +675,10 @@ ixgbe_get_vf_queues(struct rte_eth_dev *dev,
> uint32_t vf, uint32_t *msgbuf)
>   /* Notify VF of default queue */
>   msgbuf[IXGBE_VF_DEF_QUEUE] = default_q;
> 
> - /*
> -  * FIX ME if it needs fill msgbuf[IXGBE_VF_TRANS_VLAN]
> -  * for VLAN strip or VMDQ_DCB or VMDQ_DCB_RSS
> -  */
> + if (dcbinfo->num_tcs.pg_tcs)
> + msgbuf[IXGBE_VF_TRANS_VLAN] = dcbinfo-
> >num_tcs.pg_tcs;
> + else
> + msgbuf[IXGBE_VF_TRANS_VLAN] = 1;
> 
>   return 0;
>  }
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> index e6766b3..2e3522c 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> @@ -3166,10 +3166,9 @@ void ixgbe_configure_dcb(struct rte_eth_dev
> *dev)
> 
>   /* check support mq_mode for DCB */
>   if ((dev_conf->rxmode.mq_mode != ETH_MQ_RX_VMDQ_DCB) &&
> - (dev_conf->rxmode.mq_mode != ETH_MQ_RX_DCB))
> - return;
> -
> - if (dev->data->nb_rx_queues != ETH_DCB_NUM_QUEUES)
> + (dev_conf->rxmode.mq_mode != ETH_MQ_RX_DCB) &&
> + (dev_conf->txmode.mq_mode != ETH_MQ_TX_VMDQ_DCB) &&
> + (dev_conf->txmode.mq_mode != ETH_MQ_TX_DCB))
>   return;
> 
>   /** Configure DCB hardware **/
> @@ -3442,8 +3441,13 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev
> *dev)
>   ixgbe_config_vf_rss(dev);
>   break;
> 
> - /* FIXME if support DCB/RSS together with VMDq & SRIOV */
> + /*
> +  * DCB will be configured during port startup.
> +  */
>   case ETH_MQ_RX_VMDQ_DCB:
> + break;
> +
> + /* FIXME if support DCB+RSS together with VMDq & SRIOV
> */
>   case ETH_MQ_RX_VMDQ_DCB_RSS:
>   PMD_INIT_LOG(ERR,
>   "Could not support DCB with VMDq &
> SRIOV"); @@ -3488,8 +3492,8 @

[dpdk-dev] [PATCH v1 0/2] eal: fix symbol missing in version map

2015-02-25 Thread Cunming Liang
These two patches fix the build error seen when
CONFIG_RTE_BUILD_SHARED_LIB=y.
The root cause is that *per_lcore__socket_id* and *rte_sys_gettid* are missing
from the version map.
Thanks to Tetsuya Mukawa for reporting it.

Cunming Liang (2):
  eal/linux: fix symbol missing in version map
  eal/bsd: fix symbol missing in version map

 lib/librte_eal/bsdapp/eal/rte_eal_version.map   | 2 ++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++
 2 files changed, 4 insertions(+)

-- 
1.8.1.4



[dpdk-dev] [PATCH v1 1/2] eal/linux: fix symbol missing in version map

2015-02-25 Thread Cunming Liang
per_lcore__socket_id and rte_sys_gettid are missing from the version map,
which causes a build error when CONFIG_RTE_BUILD_SHARED_LIB is enabled.

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index c207cee..17515a9 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -10,6 +10,7 @@ DPDK_2.0 {
pci_driver_list;
per_lcore__lcore_id;
per_lcore__rte_errno;
+   per_lcore__socket_id;
rte_cpu_check_supported;
rte_cpu_get_flag_enabled;
rte_cycles_vmware_tsc_map;
@@ -83,6 +84,7 @@ DPDK_2.0 {
rte_snprintf;
rte_strerror;
rte_strsplit;
+   rte_sys_gettid;
rte_thread_get_affinity;
rte_thread_set_affinity;
rte_vlog;
-- 
1.8.1.4



[dpdk-dev] [PATCH v1 2/2] eal/bsd: fix symbol missing in version map

2015-02-25 Thread Cunming Liang
per_lcore__socket_id and rte_sys_gettid are missing from the version map,
which causes a build error when CONFIG_RTE_BUILD_SHARED_LIB is enabled.

Signed-off-by: Cunming Liang 
---
 lib/librte_eal/bsdapp/eal/rte_eal_version.map | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index c207cee..17515a9 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -10,6 +10,7 @@ DPDK_2.0 {
pci_driver_list;
per_lcore__lcore_id;
per_lcore__rte_errno;
+   per_lcore__socket_id;
rte_cpu_check_supported;
rte_cpu_get_flag_enabled;
rte_cycles_vmware_tsc_map;
@@ -83,6 +84,7 @@ DPDK_2.0 {
rte_snprintf;
rte_strerror;
rte_strsplit;
+   rte_sys_gettid;
rte_thread_get_affinity;
rte_thread_set_affinity;
rte_vlog;
-- 
1.8.1.4



[dpdk-dev] [PATCH v14 00/13] Port Hotplug Framework

2015-02-25 Thread Tetsuya Mukawa
This patch series adds a dynamic port hotplug framework to DPDK.
With the patches, DPDK apps can attach or detach ports at runtime.

The basic concepts of port hotplug are as follows.
- DPDK apps are responsible for managing ports.
  Only DPDK apps know which ports are attached or detached at any moment.
  The port hotplug framework is implemented to allow DPDK apps to manage ports.
  For example, when a DPDK app calls the port attach function, the number of
  the attached port is returned. A DPDK app can also detach a port by port number.
- Kernel support is needed for attaching or detaching physical device ports.
  To attach a new physical device port, the device must first be recognized by
  a userspace direct I/O framework in the kernel. Then DPDK apps can
  call the port hotplug functions to attach ports.
  For detaching, the steps are reversed.
- Before ports are detached, they must be stopped and closed.
  A DPDK application must call rte_eth_dev_stop() and rte_eth_dev_close() before
  detaching ports. These functions call the finalization code of the PMDs.
  But so far, no PMD frees all the resources allocated during initialization.
  This means PMDs need to be fixed to support port hotplug.
  'RTE_PCI_DRV_DETACHABLE' is a new flag indicating that a PMD supports detaching.
  Without this flag, detaching will fail.
- Legacy DPDK apps must not be affected.
  No DPDK EAL behavior changes if the port hotplug functions aren't called,
  so all legacy DPDK apps still work without modification.

There are also a few limitations.
- The port hotplug functions are not thread safe.
  DPDK apps should handle this themselves.
- Only Linux and igb_uio are supported so far.
  BSD and VFIO are not supported. I will at least send VFIO patches, but I have
  no plan to submit a BSD patch so far.


Here are the port hotplug APIs.
---
/**
 * Attach a new device.
 *
 * @param devargs
 *   A pointer to a strings array describing the new device
 *   to be attached. The strings should be a pci address like
 *   ':01:00.0' or virtual device name like 'eth_pcap0'.
 * @param port_id
 *  A pointer to a port identifier actually attached.
 * @return
 *  0 on success and port_id is filled, negative on error
 */
int rte_eal_dev_attach(const char *devargs, uint8_t *port_id);

/**
 * Detach a device.
 *
 * @param port_id
 *   The port identifier of the device to detach.
 * @param devname
 *  A pointer to a device name actually detached.
 * @return
 *  0 on success and devname is filled, negative on error
 */
int rte_eal_dev_detach(uint8_t port_id, char *devname);
---

This patch series is for DPDK EAL. For DPDK apps to use the port hotplug
functions, each PMD must be fixed to support the 'RTE_PCI_DRV_DETACHABLE' flag.
Please check the patch for the pcap PMD.

Also, please check the testpmd patch. It shows how to fix your legacy
applications to support the port hotplug feature.
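
The ordering rule described in this cover letter (a port must be stopped and closed before it is detached, and the driver must advertise detach support) can be modelled as a small state machine. This is only a sketch of the rule as stated above. It does not call the DPDK API: the port states, struct port and the port_*() helpers are hypothetical, and drv_detachable merely plays the role of the 'RTE_PCI_DRV_DETACHABLE' flag.

```c
#include <stdbool.h>

/* Hypothetical port life cycle used only for this illustration. */
enum port_state {
	PORT_DETACHED,
	PORT_ATTACHED,
	PORT_STARTED,
	PORT_STOPPED,
	PORT_CLOSED
};

struct port {
	enum port_state state;
	bool drv_detachable;   /* plays the role of RTE_PCI_DRV_DETACHABLE */
};

static int
port_attach(struct port *p)
{
	if (p->state != PORT_DETACHED)
		return -1;
	p->state = PORT_ATTACHED;
	return 0;
}

static void
port_start(struct port *p)
{
	if (p->state == PORT_ATTACHED || p->state == PORT_STOPPED)
		p->state = PORT_STARTED;
}

static void
port_stop(struct port *p)
{
	if (p->state == PORT_STARTED)
		p->state = PORT_STOPPED;
}

static void
port_close(struct port *p)
{
	if (p->state == PORT_STOPPED || p->state == PORT_ATTACHED)
		p->state = PORT_CLOSED;
}

/* Detach succeeds only for a closed port whose driver supports detaching. */
static int
port_detach(struct port *p)
{
	if (!p->drv_detachable)
		return -1;
	if (p->state != PORT_CLOSED)
		return -1;
	p->state = PORT_DETACHED;
	return 0;
}
```

An application following the cover letter would therefore call stop, then close, then detach, and should treat a failed detach as a sign that the PMD has not yet been fixed to release its resources.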

PATCH v14 changes
 - Remove needless if statement.
   (Thanks to Maxime Leroy)

PATCH v13 changes
 - Change log level when error occurs in rte_eal_vdev_init() and
   rte_eal_dev_init().
 - Return value of driver init and uninit functions.
 - Replace rte_panic by RTE_LOG in rte_eal_dev_init()
 - Fix return value of rte_eal_vdev_uninit().
 - Fix rte_eal_dev_attach_vdev to set port_id correctly.
   (Thanks to Maxime Leroy)

PATCH v12 changes
 - Add missing symbol in version map.
   (Thanks to Iremonger, Bernard)

PATCH v11 changes
 - Remove needless devargs handling codes.
 - Replace get_vdev_name() by rte_eal_parse_devargs_str().
 - Replace rte_eal_vdev_find_and_init by rte_eal_vdev_init()
 - Replace rte_eal_vdev_find_and_uninit by rte_eal_vdev_uninit()
 - Fix rte_eal_dev_init() to use rte_eal_vdev_init().
 - Remove needless patch.
   (Thanks to Maxime Leroy)

PATCH v10 changes
 - Add comments.
 - Change order of version.map.
 - Fix comment of "rte_ethdev.h".
   (Thanks to Thomas Monjalon)
 - Add size parameter to rte_eth_dev_create_unique_device_name().
   (Thanks to Iremonger, Bernard)

PATCH v9 changes
 - Fix commit title.
 - Fix commit log.
 - Fix comments.
 - Define CONFIG_RTE_LIBRTE_EAL_HOTPLUG at the top of this patch series.
 - DEV_INVALID/VALID are removed.
 - DEV_DISCONNECTED is replaced by DEV_DETACHED.
 - DEV_CONNECTED is replaced by DEV_ATTACHED.
 - rte_eth_dev_allocate_new_port() is renamed to
   rte_eth_dev_find_free_port().
 - rte_eth_dev_validate_port() is renamed to rte_eth_dev_is_valid_port().
 - rte_eth_dev_is_valid_port() is changed not to handle log toggle.
 - eal_compare_pci_addr() is replaced by rte_eal_compare_pci_addr().
 - rte_eth_dev_free() is replaced by rte_eth_dev_release_port().
 - Add a function to create a unique device name.
 - Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
 - Remove code that initializes the ethdev callback from
   rte_eth_dev_uninit().
 - Remove pci_unmap_device(). It will be implemented in later patch.
 -

[dpdk-dev] [PATCH v14 01/13] eal: Enable port Hotplug framework in Linux

2015-02-25 Thread Tetsuya Mukawa
The patch adds CONFIG_RTE_LIBRTE_EAL_HOTPLUG to the Linux and BSD
configurations. So far, the hotplug functions are supported only on Linux.

v9:
- Move this patch at the top of this patch series.
  (Thanks to Thomas Monjalon)

Signed-off-by: Tetsuya Mukawa 
---
 config/common_bsdapp   | 6 ++
 config/common_linuxapp | 5 +
 2 files changed, 11 insertions(+)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 83a62a6..4108c01 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -116,6 +116,12 @@ CONFIG_RTE_LIBRTE_EAL_BSDAPP=y
 CONFIG_RTE_LIBRTE_EAL_LINUXAPP=n

 #
+# Compile Environment Abstraction Layer to support hotplug
+# So far, Hotplug functions only support linux
+#
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
+
+#
 # Compile Environment Abstraction Layer to support Vmware TSC map
 #
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2716381..8ba0258 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -114,6 +114,11 @@ CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE=0
 CONFIG_RTE_LIBRTE_EAL_LINUXAPP=y

 #
+# Compile Environment Abstraction Layer to support hotplug
+#
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=y
+
+#
 # Compile Environment Abstraction Layer to support Vmware TSC map
 #
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
-- 
1.9.1



[dpdk-dev] [PATCH v14 02/13] eal_pci: Add flag to hold kernel driver type

2015-02-25 Thread Tetsuya Mukawa
From: Michael Qiu 

Currently, DPDK has no way to know which type of driver
(vfio-pci/igb_uio/uio_pci_generic) a device uses. It can only
check statically whether vfio is enabled.

Having such a flag is really useful, because the different types need to be
handled differently at runtime, for example for PCI memory mapping,
port hotplug, and so on.

This patch adds a flag field to the PCI device structure to solve the above
issue.

Signed-off-by: Michael Qiu 
Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |  8 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 53 +++--
 2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 3df07e8..a87b4b3 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -142,6 +142,13 @@ struct rte_pci_addr {

 struct rte_devargs;

+enum rte_pt_driver {
+   RTE_PT_UNKNOWN  = 0,
+   RTE_PT_IGB_UIO  = 1,
+   RTE_PT_VFIO = 2,
+   RTE_PT_UIO_GENERIC  = 3,
+};
+
 /**
  * A structure describing a PCI device.
  */
@@ -155,6 +162,7 @@ struct rte_pci_device {
uint16_t max_vfs;   /**< sriov enable if not zero */
int numa_node;  /**< NUMA node connection */
struct rte_devargs *devargs;/**< Device user arguments */
+   enum rte_pt_driver pt_driver;   /**< Driver of passthrough */
 };

 /** Any PCI device identifier (vendor, device, ...) */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index a4fd5f5..4615756 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -97,6 +97,35 @@ error:
return -1;
 }

+static int
+pci_get_kernel_driver_by_path(const char *filename, char *dri_name)
+{
+   int count;
+   char path[PATH_MAX];
+   char *name;
+
+   if (!filename || !dri_name)
+   return -1;
+
+   count = readlink(filename, path, PATH_MAX);
+   if (count >= PATH_MAX)
+   return -1;
+
+   /* For device does not have a driver */
+   if (count < 0)
+   return 1;
+
+   path[count] = '\0';
+
+   name = strrchr(path, '/');
+   if (name) {
+   strncpy(dri_name, name + 1, strlen(name + 1) + 1);
+   return 0;
+   }
+
+   return -1;
+}
+
 void *
 pci_find_max_end_va(void)
 {
@@ -221,11 +250,12 @@ pci_scan_one(const char *dirname, uint16_t domain, 
uint8_t bus,
char filename[PATH_MAX];
unsigned long tmp;
struct rte_pci_device *dev;
+   char driver[PATH_MAX];
+   int ret;

dev = malloc(sizeof(*dev));
-   if (dev == NULL) {
+   if (dev == NULL)
return -1;
-   }

memset(dev, 0, sizeof(*dev));
dev->addr.domain = domain;
@@ -304,6 +334,25 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t 
bus,
return -1;
}

+   /* parse driver */
+   snprintf(filename, sizeof(filename), "%s/driver", dirname);
+   ret = pci_get_kernel_driver_by_path(filename, driver);
+   if (!ret) {
+   if (!strcmp(driver, "vfio-pci"))
+   dev->pt_driver = RTE_PT_VFIO;
+   else if (!strcmp(driver, "igb_uio"))
+   dev->pt_driver = RTE_PT_IGB_UIO;
+   else if (!strcmp(driver, "uio_pci_generic"))
+   dev->pt_driver = RTE_PT_UIO_GENERIC;
+   else
+   dev->pt_driver = RTE_PT_UNKNOWN;
+   } else if (ret < 0) {
+   RTE_LOG(ERR, EAL, "Fail to get kernel driver\n");
+   free(dev);
+   return -1;
+   } else
+   dev->pt_driver = RTE_PT_UNKNOWN;
+
/* device is valid, add in list (sorted) */
if (TAILQ_EMPTY(&pci_device_list)) {
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
-- 
1.9.1



[dpdk-dev] [PATCH v14 03/13] eal_pci: pci memory map work with driver type

2015-02-25 Thread Tetsuya Mukawa
From: Michael Qiu 

With the driver type flag in struct rte_pci_device, UIO devices no longer
need to be mapped through the VFIO-related functions when VFIO is
enabled.

Signed-off-by: Michael Qiu 
Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 4615756..3291c68 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -555,25 +555,29 @@ pci_config_space_set(struct rte_pci_device *dev)
 static int
 pci_map_device(struct rte_pci_device *dev)
 {
-   int ret, mapped = 0;
+   int ret = -1;

/* try mapping the NIC resources using VFIO if it exists */
+   switch (dev->pt_driver) {
+   case RTE_PT_VFIO:
 #ifdef VFIO_PRESENT
-   if (pci_vfio_is_enabled()) {
-   ret = pci_vfio_map_resource(dev);
-   if (ret == 0)
-   mapped = 1;
-   else if (ret < 0)
-   return ret;
-   }
+   if (pci_vfio_is_enabled())
+   ret = pci_vfio_map_resource(dev);
 #endif
-   /* map resources for devices that use uio_pci_generic or igb_uio */
-   if (!mapped) {
+   break;
+   case RTE_PT_IGB_UIO:
+   case RTE_PT_UIO_GENERIC:
+   /* map resources for devices that use uio */
ret = pci_uio_map_resource(dev);
-   if (ret != 0)
-   return ret;
+   break;
+   default:
+   RTE_LOG(DEBUG, EAL, "  Not managed by known pt driver,"
+   " skipped\n");
+   ret = 1;
+   break;
}
-   return 0;
+
+   return ret;
 }

 /*
-- 
1.9.1



[dpdk-dev] [PATCH v14 04/13] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-25 Thread Tetsuya Mukawa
This patch adds "RTE_PCI_DRV_DETACHABLE" to drv_flags of the rte_pci_driver
structure. The flag indicates that the driver can detach devices at runtime.

The patch also removes the assumption that a port will not be detached:
- Add 'attached' member to rte_eth_dev structure.
  This member indicates whether the port is attached or not.
  DEV_ATTACHED indicates a port is attached.
  DEV_DETACHED indicates a port is detached.
- Add rte_eth_dev_find_free_port().
  This function finds a free (detached) port slot for a new port.
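
For illustration only, here is a minimal standalone sketch of slot
allocation driven by an 'attached' flag. It is not the patch code: MAX_PORTS,
struct port, port_allocate() and port_release() are hypothetical names, and
the table is shrunk to four entries to keep the example small.

```c
#include <stddef.h>

#define MAX_PORTS 4 /* hypothetical; stands in for RTE_MAX_ETHPORTS */

enum { PORT_DETACHED = 0, PORT_ATTACHED };

struct port {
	int attached; /* PORT_ATTACHED or PORT_DETACHED */
};

static struct port ports[MAX_PORTS]; /* zeroed, so all slots start detached */
static unsigned nb_ports;

/* Return the first detached slot, or MAX_PORTS if the table is full. */
static unsigned
find_free_port(void)
{
	unsigned i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (ports[i].attached == PORT_DETACHED)
			return i;
	}
	return MAX_PORTS;
}

/* Claim a free slot; return its index, or -1 if none is left. */
static int
port_allocate(void)
{
	unsigned id = find_free_port();

	if (id == MAX_PORTS)
		return -1;
	ports[id].attached = PORT_ATTACHED;
	nb_ports++;
	return (int)id;
}

/* A port id is valid only if it is in range and currently attached. */
static int
port_is_valid(unsigned id)
{
	return id < MAX_PORTS && ports[id].attached == PORT_ATTACHED;
}

/* Mark a slot as detached so that it can be reused later. */
static int
port_release(unsigned id)
{
	if (!port_is_valid(id))
		return -1;
	ports[id].attached = PORT_DETACHED;
	nb_ports--;
	return 0;
}
```

Because a released slot is found again by the linear search, port ids are
reused, and "port_id < nb_ports" is no longer a valid range check. That is
why validity is tested per slot in this sketch.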

v9:
- DEV_INVALID/VALID are removed.
- DEV_DISCONNECTED is replaced by DEV_DETACHED.
- DEV_CONNECTED is replaced by DEV_ATTACHED.
- rte_eth_dev_allocate_new_port() is renamed to
  rte_eth_dev_find_free_port().
- rte_eth_dev_validate_port() is renamed to rte_eth_dev_is_valid_port().
- rte_eth_dev_is_valid_port() is changed not to handle log toggle.
- Fix commit log to describe DEV_ATTACHED and DEV_DETACHED.
  (Thanks to Thomas Monjalon)
v8:
- NONE_TRACE is changed to NO_TRACE.
  (Thanks to Iremonger, Bernard)
v5:
- Change parameters of rte_eth_dev_validate_port() to cleanup code.
v4:
- Use braces with 'for' loop.
- Fix indent of 'if' statement.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |   2 +
 lib/librte_ether/rte_ethdev.c   | 248 
 lib/librte_ether/rte_ethdev.h   |   5 +
 3 files changed, 164 insertions(+), 91 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index a87b4b3..255a77b 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -210,6 +210,8 @@ struct rte_pci_driver {
 #define RTE_PCI_DRV_FORCE_UNBIND 0x0004
 /** Device driver supports link state interrupt */
 #define RTE_PCI_DRV_INTR_LSC   0x0008
+/** Device driver supports detaching capability */
+#define RTE_PCI_DRV_DETACHABLE 0x0010

 /**< Internal use only - Macro used by pci addr parsing functions **/
 #define GET_PCIADDR_FIELD(in, fd, lim, dlm)   \
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ecbe93c..b702039 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -175,6 +175,11 @@ enum {
STAT_QMAP_RX
 };

+enum {
+   DEV_DETACHED = 0,
+   DEV_ATTACHED
+};
+
 static inline void
 rte_eth_dev_data_alloc(void)
 {
@@ -201,19 +206,34 @@ rte_eth_dev_allocated(const char *name)
 {
unsigned i;

-   for (i = 0; i < nb_ports; i++) {
-   if (strcmp(rte_eth_devices[i].data->name, name) == 0)
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if ((rte_eth_devices[i].attached == DEV_ATTACHED) &&
+   strcmp(rte_eth_devices[i].data->name, name) == 0)
return &rte_eth_devices[i];
}
return NULL;
 }

+static uint8_t
+rte_eth_dev_find_free_port(void)
+{
+   unsigned i;
+
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if (rte_eth_devices[i].attached == DEV_DETACHED)
+   return i;
+   }
+   return RTE_MAX_ETHPORTS;
+}
+
 struct rte_eth_dev *
 rte_eth_dev_allocate(const char *name)
 {
+   uint8_t port_id;
struct rte_eth_dev *eth_dev;

-   if (nb_ports == RTE_MAX_ETHPORTS) {
+   port_id = rte_eth_dev_find_free_port();
+   if (port_id == RTE_MAX_ETHPORTS) {
PMD_DEBUG_TRACE("Reached maximum number of Ethernet ports\n");
return NULL;
}
@@ -226,10 +246,12 @@ rte_eth_dev_allocate(const char *name)
return NULL;
}

-   eth_dev = &rte_eth_devices[nb_ports];
-   eth_dev->data = &rte_eth_dev_data[nb_ports];
+   eth_dev = &rte_eth_devices[port_id];
+   eth_dev->data = &rte_eth_dev_data[port_id];
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
-   eth_dev->data->port_id = nb_ports++;
+   eth_dev->data->port_id = port_id;
+   eth_dev->attached = DEV_ATTACHED;
+   nb_ports++;
return eth_dev;
 }

@@ -283,6 +305,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
(unsigned) pci_dev->id.device_id);
if (rte_eal_process_type() == RTE_PROC_PRIMARY)
rte_free(eth_dev->data->dev_private);
+   eth_dev->attached = DEV_DETACHED;
nb_ports--;
return diag;
 }
@@ -308,10 +331,20 @@ rte_eth_driver_register(struct eth_driver *eth_drv)
rte_eal_pci_register(&eth_drv->pci_drv);
 }

+static int
+rte_eth_dev_is_valid_port(uint8_t port_id)
+{
+   if (port_id >= RTE_MAX_ETHPORTS ||
+   rte_eth_devices[port_id].attached != DEV_ATTACHED)
+   return 0;
+   else
+   return 1;
+}
+
 int
 rte_eth_dev_socket_id(uint8_t port_id)
 {
-   if (port_id >= nb_ports)
+   if (!rte_eth_dev_is_valid_port(port_id))
return -1;
return rte_eth_device

[dpdk-dev] [PATCH v14 05/13] eal/pci: Consolidate pci address comparison APIs

2015-02-25 Thread Tetsuya Mukawa
This patch replaces pci_addr_comparison() and memcmp() of pci addresses by
rte_eal_compare_pci_addr().

rte_eal_compare_pci_addr() does not use memcmp() to compare PCI addresses.
This is because sizeof(struct rte_pci_addr) returns 6, although the fields
of the structure, shown below, occupy only 5 bytes.

struct rte_pci_addr {
uint16_t domain;/**< Device domain */
uint8_t bus;/**< Device bus */
uint8_t devid;  /**< Device ID */
uint8_t function;   /**< Device function. */
};

If the structure is dynamically allocated without being zeroed (e.g. with
bzero), the last byte, which is padding, may hold an arbitrary value. As a
result, memcmp() may report a difference between equal addresses.
To avoid this, rte_eal_compare_pci_addr() compares the following values.

dev_addr = (addr->domain << 24) | (addr->bus << 16) |
(addr->devid << 8) | addr->function;
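
The padding issue can be illustrated with a small standalone sketch. It is
not the patch code: struct pci_addr, pci_addr_key(), pci_addr_compare() and
fill_addr() are hypothetical names. The sketch assumes a three-way result
(negative, zero, positive) and widens each field to 64 bits before shifting,
which is a choice of this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_pci_addr: 5 bytes of fields,
 * usually padded to 6 bytes because of the uint16_t alignment. */
struct pci_addr {
	uint16_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/* Pack the four fields into one integer; padding never takes part. */
static uint64_t
pci_addr_key(const struct pci_addr *a)
{
	return ((uint64_t)a->domain << 24) | ((uint64_t)a->bus << 16) |
	       ((uint64_t)a->devid << 8) | (uint64_t)a->function;
}

/* Return 0 if equal, positive if a > b, negative if a < b
 * (or if either pointer is NULL). */
static int
pci_addr_compare(const struct pci_addr *a, const struct pci_addr *b)
{
	uint64_t ka, kb;

	if (a == NULL || b == NULL)
		return -1;

	ka = pci_addr_key(a);
	kb = pci_addr_key(b);
	if (ka > kb)
		return 1;
	if (ka < kb)
		return -1;
	return 0;
}

/* Pre-fill the whole object with 'fill' to imitate memory that was not
 * zeroed, then set the fields. The padding byte keeps the fill value. */
static void
fill_addr(struct pci_addr *a, unsigned char fill, uint16_t domain,
	  uint8_t bus, uint8_t devid, uint8_t function)
{
	memset(a, fill, sizeof(*a));
	a->domain = domain;
	a->bus = bus;
	a->devid = devid;
	a->function = function;
}
```

Two addresses built with different fill bytes but identical fields compare
equal with pci_addr_compare(), whereas a memcmp() over sizeof(struct
pci_addr) would also look at the padding byte.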

v9:
- eal_compare_pci_addr() is replaced by rte_eal_compare_pci_addr().
- Fix commit log.
  (Thanks to Thomas Monjalon)
v8:
- Fix pci_scan_one() to update sysfs values.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v5:
- Fix pci_scan_one to handle pt_driver correctly.
v4:
- Fix calculation method of eal_compare_pci_addr().
- Add parameter checking.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c   | 29 --
 lib/librte_eal/common/eal_common_pci.c|  2 +-
 lib/librte_eal/common/include/rte_pci.h   | 34 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c | 30 +--
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  2 +-
 5 files changed, 63 insertions(+), 34 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 74ecce7..9193f80 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -270,20 +270,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
return (0);
 }

-/* Compare two PCI device addresses. */
-static int
-pci_addr_comparison(struct rte_pci_addr *addr, struct rte_pci_addr *addr2)
-{
-   uint64_t dev_addr = (addr->domain << 24) + (addr->bus << 16) + (addr->devid << 8) + addr->function;
-   uint64_t dev_addr2 = (addr2->domain << 24) + (addr2->bus << 16) + (addr2->devid << 8) + addr2->function;
-
-   if (dev_addr > dev_addr2)
-   return 1;
-   else
-   return 0;
-}
-
-
 /* Scan one pci sysfs entry, and fill the devices list from it. */
 static int
 pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
@@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
}
else {
struct rte_pci_device *dev2 = NULL;
+   int ret;

TAILQ_FOREACH(dev2, &pci_device_list, next) {
-   if (pci_addr_comparison(&dev->addr, &dev2->addr))
+   ret = rte_eal_compare_pci_addr(&dev->addr, &dev2->addr);
+   if (ret > 0)
continue;
-   else {
+   else if (ret < 0) {
TAILQ_INSERT_BEFORE(dev2, dev, next);
return 0;
+   } else { /* already registered */
+   /* update pt_driver */
+   dev2->pt_driver = dev->pt_driver;
+   dev2->max_vfs = dev->max_vfs;
+   memmove(dev2->mem_resource,
+   dev->mem_resource,
+   sizeof(dev->mem_resource));
+   free(dev);
+   return 0;
}
}
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index f3c7f71..bf2793f 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -93,7 +93,7 @@ static struct rte_devargs *pci_devargs_lookup(struct rte_pci_device *dev)
if (devargs->type != RTE_DEVTYPE_BLACKLISTED_PCI &&
devargs->type != RTE_DEVTYPE_WHITELISTED_PCI)
continue;
-   if (!memcmp(&dev->addr, &devargs->pci.addr, sizeof(dev->addr)))
+   if (!rte_eal_compare_pci_addr(&dev->addr, &devargs->pci.addr))
return devargs;
}
return NULL;
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 255a77b..dcf9c81 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -272,6 +272,40 @@ eal_parse_pci_DomBDF(const char *input, struct rte_pci_addr *dev_addr)
 }
 #undef GET_PCIADDR_FIELD

+/* Compare two PCI device addresses. */
+/**
+ * Utility function to compare two PCI device addresses.
+ *
+ 

[dpdk-dev] [PATCH v14 06/13] ethdev: Add rte_eth_dev_release_port to release specified port

2015-02-25 Thread Tetsuya Mukawa
This patch adds rte_eth_dev_release_port(). The function changes the
attached status of the specified device to detached.

v9:
- rte_eth_dev_free() is replaced by rte_eth_dev_release_port().
  (Thanks to Thomas Monjalon)
v6:
- Use rte_eth_dev structure as the parameter of rte_eth_dev_free().
v4:
- Add parameter checking.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c | 11 +++
 lib/librte_ether/rte_ethdev.h | 12 
 2 files changed, 23 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index b702039..a089557 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -255,6 +255,17 @@ rte_eth_dev_allocate(const char *name)
return eth_dev;
 }

+int
+rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
+{
+   if (eth_dev == NULL)
+   return -EINVAL;
+
+   eth_dev->attached = 0;
+   nb_ports--;
+   return 0;
+}
+
 static int
 rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 struct rte_pci_device *pci_dev)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 110ddba..7963e56 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1539,6 +1539,18 @@ extern uint8_t rte_eth_dev_count(void);
  */
 struct rte_eth_dev *rte_eth_dev_allocate(const char *name);

+/**
+ * Function for internal use by dummy drivers primarily, e.g. ring-based
+ * driver.
+ * Release the specified ethdev port.
+ *
+ * @param eth_dev
+ * The *eth_dev* pointer is the address of the *rte_eth_dev* structure.
+ * @return
+ *   - 0 on success, negative on error
+ */
+int rte_eth_dev_release_port(struct rte_eth_dev *eth_dev);
+
 struct eth_driver;
 /**
  * @internal
-- 
1.9.1



[dpdk-dev] [PATCH v14 07/13] eal, ethdev: Add a function and function pointers to close ether device

2015-02-25 Thread Tetsuya Mukawa
The patch adds function pointers to the rte_pci_driver and eth_driver
structures. These function pointers are used when ports are detached.
The patch also adds rte_eth_dev_uninit(). It is not called from anywhere
yet, but it will be called once the port hotplug function is
implemented.

v10:
- Add size parameter to rte_eth_dev_create_unique_device_name().
  (Thanks to Iremonger, Bernard)
v9:
- Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
- Remove code that initializes the ethdev callback from
  rte_eth_dev_uninit().
- Add a function to create a unique device name.
  (Thanks to Thomas Monjalon)
v6:
- Fix rte_eth_dev_uninit() to handle a return value of uninit
  function of PMD.
v4:
- Add parameter checking.
- Change function names.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |  6 
 lib/librte_ether/rte_ethdev.c   | 64 +++--
 lib/librte_ether/rte_ethdev.h   | 24 +
 3 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index dcf9c81..ecde36f 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -192,12 +192,18 @@ struct rte_pci_driver;
 typedef int (pci_devinit_t)(struct rte_pci_driver *, struct rte_pci_device *);

 /**
+ * Uninitialisation function for the driver called during hotplugging.
+ */
+typedef int (pci_devuninit_t)(struct rte_pci_device *);
+
+/**
  * A structure describing a PCI driver.
  */
 struct rte_pci_driver {
TAILQ_ENTRY(rte_pci_driver) next;   /**< Next in list. */
const char *name;   /**< Driver name. */
pci_devinit_t *devinit; /**< Device init. function. */
+   pci_devuninit_t *devuninit; /**< Device uninit function. */
struct rte_pci_id *id_table;/**< ID table, NULL terminated. */
uint32_t drv_flags; /**< Flags contolling handling of device. */
 };
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a089557..165ec74 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -266,6 +266,24 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return 0;
 }

+static inline int
+rte_eth_dev_create_unique_device_name(char *name, size_t size,
+   struct rte_pci_device *pci_dev)
+{
+   int ret;
+
+   if ((name == NULL) || (pci_dev == NULL))
+   return -EINVAL;
+
+   ret = snprintf(name, size, "%d:%d.%d",
+   pci_dev->addr.bus, pci_dev->addr.devid,
+   pci_dev->addr.function);
+   if (ret < 0)
+   return ret;
+
+   return 0;
+}
+
 static int
 rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 struct rte_pci_device *pci_dev)
@@ -279,8 +297,8 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
eth_drv = (struct eth_driver *)pci_drv;

/* Create unique Ethernet device name using PCI address */
-   snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d",
-   pci_dev->addr.bus, pci_dev->addr.devid, pci_dev->addr.function);
+   rte_eth_dev_create_unique_device_name(ethdev_name,
+   sizeof(ethdev_name), pci_dev);

eth_dev = rte_eth_dev_allocate(ethdev_name);
if (eth_dev == NULL)
@@ -321,6 +339,47 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
return diag;
 }

+static int
+rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
+{
+   const struct eth_driver *eth_drv;
+   struct rte_eth_dev *eth_dev;
+   char ethdev_name[RTE_ETH_NAME_MAX_LEN];
+   int ret;
+
+   if (pci_dev == NULL)
+   return -EINVAL;
+
+   /* Create unique Ethernet device name using PCI address */
+   rte_eth_dev_create_unique_device_name(ethdev_name,
+   sizeof(ethdev_name), pci_dev);
+
+   eth_dev = rte_eth_dev_allocated(ethdev_name);
+   if (eth_dev == NULL)
+   return -ENODEV;
+
+   eth_drv = (const struct eth_driver *)pci_dev->driver;
+
+   /* Invoke PMD device uninit function */
+   if (*eth_drv->eth_dev_uninit) {
+   ret = (*eth_drv->eth_dev_uninit)(eth_drv, eth_dev);
+   if (ret)
+   return ret;
+   }
+
+   /* free ether device */
+   rte_eth_dev_release_port(eth_dev);
+
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   rte_free(eth_dev->data->dev_private);
+
+   eth_dev->pci_dev = NULL;
+   eth_dev->driver = NULL;
+   eth_dev->data = NULL;
+
+   return 0;
+}
+
 /**
  * Register an Ethernet [Poll Mode] driver.
  *
@@ -339,6 +398,7 @@ void
 rte_eth_driver_register(struct eth_driver *eth_drv)
 {
eth_drv->pci_drv.devinit = rte_eth_dev_init;
+   eth_drv->pci_drv.devuninit = rte_eth_dev_uninit;
rte_eal_pci_regis

[dpdk-dev] [PATCH v14 08/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-25 Thread Tetsuya Mukawa
The patch adds following functions.

- rte_eth_dev_save()
  The function is used for saving current rte_eth_dev structures.
- rte_eth_dev_get_changed_port()
  The function receives the saved rte_eth_dev structures, then compares
  them with the current values to find out which port was actually
  attached or detached.
- rte_eth_dev_get_addr_by_port()
  The function returns a pci address of an ethdev specified by port
  identifier.
- rte_eth_dev_get_port_by_addr()
  The function returns a port identifier of an ethdev specified by
  pci address.
- rte_eth_dev_get_name_by_port()
  The function returns a unique identifier name of an ethdev
  specified by port identifier.
- rte_eth_dev_is_detachable()
  The function returns whether a PMD supports the detach function.

The patch also changes the scope of rte_eth_dev_allocated() to global,
because virtual PMDs will call this function to support port hotplug.
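
As a rough illustration of the save-and-compare idea, the following
standalone sketch snapshots a port table and then reports the first slot
whose attached state differs. It is not the patch code: MAX_PORTS, struct
port, ports_save() and ports_get_changed() are hypothetical names. Unlike
the patch, the sketch writes *port_id only when a change is found.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS 4 /* hypothetical; stands in for RTE_MAX_ETHPORTS */

struct port {
	int attached; /* 0 = detached, 1 = attached */
};

static struct port ports[MAX_PORTS]; /* the "current" port table */

/* Copy the current table into a caller-provided snapshot buffer. */
static int
ports_save(struct port *snap, size_t size)
{
	if (snap == NULL || size != sizeof(ports))
		return -EINVAL;

	memcpy(snap, ports, size);
	return 0;
}

/* Find the first port whose attached state differs from the snapshot. */
static int
ports_get_changed(const struct port *snap, uint8_t *port_id)
{
	uint8_t i;

	if (snap == NULL || port_id == NULL)
		return -EINVAL;

	for (i = 0; i < MAX_PORTS; i++) {
		if (ports[i].attached != snap[i].attached) {
			*port_id = i;
			return 0;
		}
	}
	return -ENODEV; /* nothing was attached or detached */
}
```

A caller would take the snapshot before asking a driver to attach or detach
a device, and call ports_get_changed() afterwards to learn which port id the
operation affected.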

v10:
- Change order of version.map.
  (Thanks to Thomas Monjalon)
v9:
- rte_eth_dev_check_detachable() is replaced by
  rte_eth_dev_is_detachable().
- strncpy() is replaced by strcpy().
  (Thanks to Thomas Monjalon)
- Add missing symbol in version map.
  (Thanks to Neil Horman)
v8:
- Add size parameter to rte_eth_dev_save().
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Add pt_driver checking to rte_eth_dev_check_detachable().
  (Thanks to Qiu, Michael)
v5:
- Fix return value of below functions.
  rte_eth_dev_get_changed_port().
  rte_eth_dev_get_port_by_addr().
v4:
- Add parameter checking.
v3:
- Fix if-condition bug while comparing pci addresses.
- Add error checking codes.
Reported-by: Mark Enright 

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c  | 103 -
 lib/librte_ether/rte_ethdev.h  |  83 ++
 lib/librte_ether/rte_ether_version.map |   7 +++
 3 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 165ec74..1f6a066 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -201,7 +201,7 @@ rte_eth_dev_data_alloc(void)
RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
 }

-static struct rte_eth_dev *
+struct rte_eth_dev *
 rte_eth_dev_allocated(const char *name)
 {
unsigned i;
@@ -426,6 +426,107 @@ rte_eth_dev_count(void)
return (nb_ports);
 }

+int
+rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
+{
+   if ((devs == NULL) ||
+   (size != sizeof(struct rte_eth_dev) * RTE_MAX_ETHPORTS))
+   return -EINVAL;
+
+   /* save current rte_eth_devices */
+   memcpy(devs, rte_eth_devices, size);
+   return 0;
+}
+
+int
+rte_eth_dev_get_changed_port(struct rte_eth_dev *devs, uint8_t *port_id)
+{
+   if ((devs == NULL) || (port_id == NULL))
+   return -EINVAL;
+
+   /* check which port was attached or detached */
+   for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++, devs++) {
+   if (rte_eth_devices[*port_id].attached ^ devs->attached)
+   return 0;
+   }
+   return -ENODEV;
+}
+
+int
+rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
+{
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -EINVAL;
+   }
+
+   if (addr == NULL) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   *addr = rte_eth_devices[port_id].pci_dev->addr;
+   return 0;
+}
+
+int
+rte_eth_dev_get_port_by_addr(struct rte_pci_addr *addr, uint8_t *port_id)
+{
+   struct rte_pci_addr *tmp;
+
+   if ((addr == NULL) || (port_id == NULL)) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++) {
+   if (!rte_eth_devices[*port_id].attached)
+   continue;
+   if (!rte_eth_devices[*port_id].pci_dev)
+   continue;
+   tmp = &rte_eth_devices[*port_id].pci_dev->addr;
+   if (rte_eal_compare_pci_addr(tmp, addr) == 0)
+   return 0;
+   }
+   return -ENODEV;
+}
+
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
+{
+   char *tmp;
+
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -EINVAL;
+   }
+
+   if (name == NULL) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   /* shouldn't check 'rte_eth_devices[i].data',
+* because it might be overwritten by VDEV PMD */
+   tmp = rte_eth_dev_data[port_id].name;
+   strcpy(name,

[dpdk-dev] [PATCH v14 09/13] eal/linux/pci: Add functions for unmapping igb_uio resources

2015-02-25 Thread Tetsuya Mukawa
The patch adds functions for unmapping igb_uio resources. The patch applies
only to Linux with igb_uio. VFIO and BSD are not supported.
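
For readers unfamiliar with the underlying call, the following standalone
sketch shows a guarded munmap() wrapper in the same spirit. It is not the
patch code: map_region() and unmap_region() are hypothetical helpers, an
anonymous mapping stands in for a PCI BAR, and the wrapper returns a status
instead of only logging.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= modes */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Map 'size' bytes of anonymous memory; return NULL on failure. */
static void *
map_region(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return (p == MAP_FAILED) ? NULL : p;
}

/* Unmap a region; a NULL address is treated as "nothing to do". */
static int
unmap_region(void *addr, size_t size)
{
	if (addr == NULL)
		return 0;

	if (munmap(addr, size) != 0) {
		fprintf(stderr, "munmap(%p, 0x%zx) failed: %s\n",
			addr, size, strerror(errno));
		return -1;
	}
	return 0;
}
```

The size passed to munmap() must match the size that was mapped, which is
why a mapping table that records both address and size per resource is
needed at unmap time.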

v9:
- Remove "rte_dev_hotplug.h".
- Remove needless "#ifdef".
  (Thanks to Thomas Monjalon and Neil Horman)
- Remove pci_unmap_device(). It will be implemented in later patch.
v8:
- Fix typo.
  (Thanks to Iremonger, Bernard)
v5:
- Fix pci_unmap_device() to check pt_driver.
v4:
- Add parameter checking.
- Add header file to determine if hotplug can be enabled.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c  | 17 
 lib/librte_eal/linuxapp/eal/eal_pci_init.h |  7 
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  | 65 ++
 3 files changed, 89 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 06bfc1a..d03429c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -168,6 +168,23 @@ pci_map_resource(void *requested_addr, int fd, off_t offset, size_t size,
return mapaddr;
 }

+/* unmap a particular resource */
+void
+pci_unmap_resource(void *requested_addr, size_t size)
+{
+   if (requested_addr == NULL)
+   return;
+
+   /* Unmap the PCI memory resource of device */
+   if (munmap(requested_addr, size)) {
+   RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
+   __func__, requested_addr, (unsigned long)size,
+   strerror(errno));
+   } else
+   RTE_LOG(DEBUG, EAL, "  PCI memory unmapped at %p\n",
+   requested_addr);
+}
+
 /* parse the "resource" sysfs file */
 static int
 pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_init.h b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
index 03d2b52..6af84d1 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_init.h
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
@@ -72,6 +72,13 @@ void *pci_map_resource(void *requested_addr, int fd, off_t offset,
 /* map IGB_UIO resource prototype */
 int pci_uio_map_resource(struct rte_pci_device *dev);

+void pci_unmap_resource(void *requested_addr, size_t size);
+
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+/* unmap IGB_UIO resource prototype */
+void pci_uio_unmap_resource(struct rte_pci_device *dev);
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 #ifdef VFIO_PRESENT

 #define VFIO_MAX_GROUPS 64
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index c5e0cf3..35d31c5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -386,3 +386,68 @@ pci_uio_map_resource(struct rte_pci_device *dev)

return 0;
 }
+
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+static void
+pci_uio_unmap(struct mapped_pci_resource *uio_res)
+{
+   int i;
+
+   if (uio_res == NULL)
+   return;
+
+   for (i = 0; i != uio_res->nb_maps; i++)
+   pci_unmap_resource(uio_res->maps[i].addr,
+   (size_t)uio_res->maps[i].size);
+}
+
+static struct mapped_pci_resource *
+pci_uio_find_resource(struct rte_pci_device *dev)
+{
+   struct mapped_pci_resource *uio_res;
+
+   if (dev == NULL)
+   return NULL;
+
+   TAILQ_FOREACH(uio_res, pci_res_list, next) {
+
+   /* skip this element if it doesn't match our PCI address */
+   if (!rte_eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr))
+   return uio_res;
+   }
+   return NULL;
+}
+
+/* unmap the PCI resource of a PCI device in virtual memory */
+void
+pci_uio_unmap_resource(struct rte_pci_device *dev)
+{
+   struct mapped_pci_resource *uio_res;
+
+   if (dev == NULL)
+   return;
+
+   /* find an entry for the device */
+   uio_res = pci_uio_find_resource(dev);
+   if (uio_res == NULL)
+   return;
+
+   /* secondary processes - just free maps */
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return pci_uio_unmap(uio_res);
+
+   TAILQ_REMOVE(pci_res_list, uio_res, next);
+
+   /* unmap all resources */
+   pci_uio_unmap(uio_res);
+
+   /* free uio resource */
+   rte_free(uio_res);
+
+   /* close fd if in primary process */
+   close(dev->intr_handle.fd);
+
+   dev->intr_handle.fd = -1;
+   dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
-- 
1.9.1



[dpdk-dev] [PATCH v14 10/13] eal/pci: Add probe and close functions of pci driver

2015-02-25 Thread Tetsuya Mukawa
- Add pci_close_all_drivers()
  The function tries to find a driver for the specified device, and
  then closes the driver.
- Add rte_eal_pci_probe_one() and rte_eal_pci_close_one()
  These functions probe or close a single device.
  Each function first finds the device that has the specified
  PCI address, then probes or closes it.
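
The find-then-act pattern described above can be sketched in isolation with
the <sys/queue.h> tail queue macros. This is not the patch code: struct dev,
dev_find(), dev_probe_one() and dev_close_one() are hypothetical names, the
PCI address is reduced to one packed integer, and the 'probed' flag stands
in for calling a driver's init or uninit function.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

struct dev {
	TAILQ_ENTRY(dev) next; /* linkage in the device list */
	uint32_t addr;         /* hypothetical packed PCI address */
	int probed;            /* 1 after a successful probe */
};

TAILQ_HEAD(dev_list, dev);
static struct dev_list devices = TAILQ_HEAD_INITIALIZER(devices);

/* Walk the list and return the device with a matching address. */
static struct dev *
dev_find(uint32_t addr)
{
	struct dev *d;

	TAILQ_FOREACH(d, &devices, next) {
		if (d->addr == addr)
			return d;
	}
	return NULL;
}

/* Probe exactly one device, selected by address. */
static int
dev_probe_one(uint32_t addr)
{
	struct dev *d = dev_find(addr);

	if (d == NULL)
		return -1;
	d->probed = 1; /* stands in for the driver's init function */
	return 0;
}

/* Close exactly one device and drop it from the list. */
static int
dev_close_one(uint32_t addr)
{
	struct dev *d = dev_find(addr);

	if (d == NULL)
		return -1;
	d->probed = 0; /* stands in for the driver's uninit function */
	TAILQ_REMOVE(&devices, d, next);
	return 0;
}
```

In this sketch the device is removed from the list only after the close
step has run, which mirrors the order used by rte_eal_pci_close_one() in the
diff below.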

v9:
- Fix commit title.
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
  (Thanks to Thomas Monjalon)
- Implement pci_unmap_device() in this patch.
v5:
- Remove RTE_EAL_INVOKE_TYPE_UNKNOWN, because it's unused.
v4:
- Fix parameter checking.
- Fix indent of 'if' statement.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_pci.c  | 98 -
 lib/librte_eal/common/eal_private.h | 15 +
 lib/librte_eal/common/include/rte_pci.h | 32 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 94 +++
 4 files changed, 238 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index bf2793f..5b6b55d 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -108,7 +108,10 @@ static int
 pci_probe_all_drivers(struct rte_pci_device *dev)
 {
struct rte_pci_driver *dr = NULL;
-   int rc;
+   int rc = 0;
+
+   if (dev == NULL)
+   return -1;

TAILQ_FOREACH(dr, &pci_driver_list, next) {
rc = rte_eal_pci_probe_one_driver(dr, dev);
@@ -123,6 +126,99 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
return 1;
 }

+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+/*
+ * If vendor/device ID match, call the devuninit() function of all
+ * registered driver for the given device. Return -1 if initialization
+ * failed, return 1 if no driver is found for this device.
+ */
+static int
+pci_close_all_drivers(struct rte_pci_device *dev)
+{
+   struct rte_pci_driver *dr = NULL;
+   int rc = 0;
+
+   if (dev == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dr, &pci_driver_list, next) {
+   rc = rte_eal_pci_close_one_driver(dr, dev);
+   if (rc < 0)
+   /* negative value is an error */
+   return -1;
+   if (rc > 0)
+   /* positive value means driver not found */
+   continue;
+   return 0;
+   }
+   return 1;
+}
+
+/*
+ * Find the pci device specified by pci address, then invoke probe function of
+ * the driver of the devive.
+ */
+int
+rte_eal_pci_probe_one(struct rte_pci_addr *addr)
+{
+   struct rte_pci_device *dev = NULL;
+   int ret = 0;
+
+   if (addr == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dev, &pci_device_list, next) {
+   if (rte_eal_compare_pci_addr(&dev->addr, addr))
+   continue;
+
+   ret = pci_probe_all_drivers(dev);
+   if (ret < 0)
+   goto err_return;
+   return 0;
+   }
+   return -1;
+
+err_return:
+   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
+   " cannot be used\n", dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   return -1;
+}
+
+/*
+ * Find the pci device specified by pci address, then invoke close function of
+ * the driver of the devive.
+ */
+int
+rte_eal_pci_close_one(struct rte_pci_addr *addr)
+{
+   struct rte_pci_device *dev = NULL;
+   int ret = 0;
+
+   if (addr == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dev, &pci_device_list, next) {
+   if (rte_eal_compare_pci_addr(&dev->addr, addr))
+   continue;
+
+   ret = pci_close_all_drivers(dev);
+   if (ret < 0)
+   goto err_return;
+
+   TAILQ_REMOVE(&pci_device_list, dev, next);
+   return 0;
+   }
+   return -1;
+
+err_return:
+   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
+   " cannot be used\n", dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   return -1;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 /*
  * Scan the content of the PCI bus, and call the devinit() function for
  * all registered drivers that have a matching entry in its id_table
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 159cd66..4acf5a0 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -165,6 +165,21 @@ int rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr,
struct rte_pci_device *dev);

 /**
+ * Munmap memory for single PCI device
+ *
+ * This function is private to EAL.
+ *
+ * @param  dr
+ *  The pointer to the pci driver structure
+ * @param  dev
+ *  The pointer to the pci device structure
+ * @return
+ 

[dpdk-dev] [PATCH v14 11/13] ethdev: Add one dev_type parameter to rte_eth_dev_allocate

2015-02-25 Thread Tetsuya Mukawa
This new parameter is needed to record the device type, PCI or virtual.
The port detaching process differs between PCI devices and virtual
devices.
RTE_ETH_DEV_PCI indicates the device type is PCI. RTE_ETH_DEV_VIRTUAL
indicates the device is virtual.
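
A hedged sketch of how a stored device type can steer the detach check
follows. It is not the patch code: the enum and function names are
hypothetical. The rule that a PCI port is detachable only when bound to
igb_uio or uio_pci_generic follows the rte_eth_dev_is_detachable() hunk
below. Treating every virtual port as detachable is an assumption of this
example, since the real function goes on to check further conditions.

```c
#include <errno.h>
#include <stddef.h>

enum dev_type { DEV_TYPE_UNKNOWN = 0, DEV_TYPE_PCI, DEV_TYPE_VIRTUAL };
enum pt_driver { PT_UNKNOWN = 0, PT_IGB_UIO, PT_VFIO, PT_UIO_GENERIC };

struct port {
	enum dev_type type;
	enum pt_driver pt_driver; /* meaningful only for PCI ports */
};

/* Return 0 if detaching may proceed, or a negative errno value. */
static int
port_detach_supported(const struct port *p)
{
	if (p == NULL)
		return -EINVAL;

	switch (p->type) {
	case DEV_TYPE_PCI:
		switch (p->pt_driver) {
		case PT_IGB_UIO:
		case PT_UIO_GENERIC:
			return 0;
		case PT_VFIO:
		default:
			return -ENOTSUP; /* no unmap support for this driver */
		}
	case DEV_TYPE_VIRTUAL:
		return 0; /* assumption: no passthrough driver to check */
	default:
		return -EINVAL; /* type was never set */
	}
}
```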

v12:
- Add missing symbol in version map.
  (Thanks to Iremonger, Bernard)
v10:
- Change order of version.map.
  (Thanks to Thomas Monjalon)
- Fix comment of "rte_ethdev.h".
  (Thanks to Thomas Monjalon)
v9:
- Fix commit log.
- RTE_ETH_DEV_PHYSICAL is replaced by RTE_ETH_DEV_PCI.
  (Thanks to Thomas Monjalon)
v8:
- NONE_TRACE is replaced by NO_TRACE.
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v4:
- Fix comments of rte_eth_dev_type.

Signed-off-by: Tetsuya Mukawa 
---
 app/test/virtual_pmd.c   |  2 +-
 lib/librte_ether/rte_ethdev.c| 25 +++--
 lib/librte_ether/rte_ethdev.h| 25 -
 lib/librte_ether/rte_ether_version.map   |  1 +
 lib/librte_pmd_af_packet/rte_eth_af_packet.c |  2 +-
 lib/librte_pmd_bond/rte_eth_bond_api.c   |  2 +-
 lib/librte_pmd_pcap/rte_eth_pcap.c   |  2 +-
 lib/librte_pmd_ring/rte_eth_ring.c   |  2 +-
 lib/librte_pmd_xenvirt/rte_eth_xenvirt.c |  2 +-
 9 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index 785bccc..9b07ab1 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -580,7 +580,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
goto err;

/* reserve an ethdev entry */
-   eth_dev = rte_eth_dev_allocate(name);
+   eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
if (eth_dev == NULL)
goto err;

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 1f6a066..4ebdd9f 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -227,7 +227,7 @@ rte_eth_dev_find_free_port(void)
 }

 struct rte_eth_dev *
-rte_eth_dev_allocate(const char *name)
+rte_eth_dev_allocate(const char *name, enum rte_eth_dev_type type)
 {
uint8_t port_id;
struct rte_eth_dev *eth_dev;
@@ -251,6 +251,7 @@ rte_eth_dev_allocate(const char *name)
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
eth_dev->data->port_id = port_id;
eth_dev->attached = DEV_ATTACHED;
+   eth_dev->dev_type = type;
nb_ports++;
return eth_dev;
 }
@@ -262,6 +263,7 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return -EINVAL;

eth_dev->attached = 0;
+   eth_dev->dev_type = RTE_ETH_DEV_UNKNOWN;
nb_ports--;
return 0;
 }
@@ -300,7 +302,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
rte_eth_dev_create_unique_device_name(ethdev_name,
sizeof(ethdev_name), pci_dev);

-   eth_dev = rte_eth_dev_allocate(ethdev_name);
+   eth_dev = rte_eth_dev_allocate(ethdev_name, RTE_ETH_DEV_PCI);
if (eth_dev == NULL)
return -ENOMEM;

@@ -426,6 +428,14 @@ rte_eth_dev_count(void)
return (nb_ports);
 }

+enum rte_eth_dev_type
+rte_eth_dev_get_device_type(uint8_t port_id)
+{
+   if (!rte_eth_dev_is_valid_port(port_id))
+   return -1;
+   return rte_eth_devices[port_id].dev_type;
+}
+
 int
 rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
 {
@@ -523,6 +533,17 @@ rte_eth_dev_is_detachable(uint8_t port_id)
return -EINVAL;
}

+   if (rte_eth_devices[port_id].dev_type == RTE_ETH_DEV_PCI) {
+   switch (rte_eth_devices[port_id].pci_dev->pt_driver) {
+   case RTE_PT_IGB_UIO:
+   case RTE_PT_UIO_GENERIC:
+   break;
+   case RTE_PT_VFIO:
+   default:
+   return -ENOTSUP;
+   }
+   }
+
drv_flags = rte_eth_devices[port_id].driver->pci_drv.drv_flags;
return !(drv_flags & RTE_PCI_DRV_DETACHABLE);
 }
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 5519ce0..d8e5543 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1422,6 +1422,17 @@ struct rte_eth_rxtx_callback {
void *param;
 };

+/*
+ * The eth device type
+ */
+enum rte_eth_dev_type {
+   RTE_ETH_DEV_UNKNOWN,/**< unknown device type */
+   RTE_ETH_DEV_PCI,
+   /**< Physical function and Virtual function of PCI devices */
+   RTE_ETH_DEV_VIRTUAL,/**< non hardware device */
+   RTE_ETH_DEV_MAX /**< max value of this enum */
+};
+
 /**
  * @internal
  * The generic data structure associated with each ethernet device.
@@ -1452,6 +1463,7 @@ struct rte_eth_dev {
 */
struct rte_eth_rxtx_callback **pre_tx_burst_cbs;
uint8_t attached; /**< Flag indicating the port is attached */
+   enum rte_eth_

[dpdk-dev] [PATCH v14 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-25 Thread Tetsuya Mukawa
These functions are used for attaching or detaching a port.
When rte_eal_dev_attach() is called, the function tries to parse the
device name as a PCI address. If this succeeds,
rte_eal_dev_attach() attaches a physical device port. If not, it attaches
a virtual device port.
When rte_eal_dev_detach() is called, the function gets the device type
of the port to determine whether it comes from a physical or a virtual
device, and then calls the matching detach function.
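
The attach-side decision described above can be sketched as follows. This
is only an illustration: demo_classify_ident() and enum demo_ident_kind are
hypothetical names, and the sketch assumes the conventional hexadecimal
"domain:bus:devid.function" notation for PCI addresses. It is not the
parsing code used by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result type, not part of the EAL API. */
enum demo_ident_kind {
	DEMO_IDENT_PCI,
	DEMO_IDENT_VDEV
};

/*
 * Try to read the identifier as "domain:bus:devid.function" in hex.
 * Anything that does not parse completely is treated as a virtual
 * device name (possibly followed by ",key=value" arguments).
 */
static enum demo_ident_kind
demo_classify_ident(const char *ident)
{
	unsigned int domain, bus, devid, function;
	int consumed = 0;

	assert(ident != NULL);

	if (sscanf(ident, "%4x:%2x:%2x.%1x%n",
		   &domain, &bus, &devid, &function, &consumed) == 4 &&
	    ident[consumed] == '\0')
		return DEMO_IDENT_PCI;

	return DEMO_IDENT_VDEV;
}
```

Requiring the whole string to be consumed keeps a name with trailing
text from being mistaken for a PCI address, so it falls through to the
virtual device path.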

v14:
- Remove needless if statement.
  (Thanks to Maxime Leroy)
v13:
- Change log level when error occurs in rte_eal_vdev_init() and
  rte_eal_dev_init().
- Return value of driver init and uninit functions.
- Replace rte_panic by RTE_LOG in rte_eal_dev_init()
- Fix return value of rte_eal_vdev_uninit().
- Fix rte_eal_dev_attach_vdev to set port_id correctly.
  (Thanks to Maxime Leroy)
v11:
- Remove needless devargs handling codes.
- Replace get_vdev_name() by rte_eal_parse_devargs_str().
- Replace rte_eal_vdev_find_and_init by rte_eal_vdev_init()
- Replace rte_eal_vdev_find_and_uninit by rte_eal_vdev_uninit()
- Fix rte_eal_dev_init() to use rte_eal_vdev_init().
  (Thanks to Maxime Leroy)
v10:
- Add comments.
- Change order of version.map.
  (Thanks to Thomas Monjalon)
v9:
- Fix comments.
- Use strcmp() instead of strncmp().
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
- Change definition of rte_dev_uninit_t.
  (Thanks to Thomas Monjalon and Maxime Leroy)
v8:
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Fix typo of warning messages.
  (Thanks to Qiu, Michael)
v5:
- Change function names like below.
  rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
  rte_eal_dev_invoke() to rte_eal_vdev_invoke().
- Add code to handle a return value of rte_eal_devargs_remove().
- Fix pci address format in rte_eal_dev_detach().
v4:
- Fix comment.
- Add error checking.
- Fix indent of 'if' statement.
- Change function name.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_dev.c  | 278 ++--
 lib/librte_eal/common/eal_common_devargs.c  |  54 +++--
 lib/librte_eal/common/eal_private.h |  11 +
 lib/librte_eal/common/include/rte_dev.h |  33 +++
 lib/librte_eal/common/include/rte_devargs.h |  28 +++
 lib/librte_eal/linuxapp/eal/Makefile|   1 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |   6 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   2 +
 8 files changed, 375 insertions(+), 38 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index eae5656..6d805aa 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -32,10 +32,13 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+#include 
+#include 
 #include 
 #include 
 #include 

+#include 
 #include 
 #include 
 #include 
@@ -61,6 +64,32 @@ rte_eal_driver_unregister(struct rte_driver *driver)
TAILQ_REMOVE(&dev_driver_list, driver, next);
 }

+static int
+rte_eal_vdev_init(const char *name, const char *args)
+{
+   struct rte_driver *driver;
+
+   if (name == NULL)
+   return -EINVAL;
+
+   TAILQ_FOREACH(driver, &dev_driver_list, next) {
+   if (driver->type != PMD_VDEV)
+   continue;
+
+   /*
+* search a driver prefix in virtual device name.
+* For example, if the driver is pcap PMD, driver->name
+* will be "eth_pcap", but "name" will be "eth_pcapN".
+* So use strncmp to compare.
+*/
+   if (!strncmp(driver->name, name, strlen(driver->name)))
+   return driver->init(name, args);
+   }
+
+   RTE_LOG(ERR, EAL, "no driver found for %s\n", name);
+   return -EINVAL;
+}
+
 int
 rte_eal_dev_init(void)
 {
@@ -79,22 +108,11 @@ rte_eal_dev_init(void)
if (devargs->type != RTE_DEVTYPE_VIRTUAL)
continue;

-   TAILQ_FOREACH(driver, &dev_driver_list, next) {
-   if (driver->type != PMD_VDEV)
-   continue;
-
-   /* search a driver prefix in virtual device name */
-   if (!strncmp(driver->name, devargs->virtual.drv_name,
-   strlen(driver->name))) {
-   driver->init(devargs->virtual.drv_name,
-   devargs->args);
-   break;
-   }
-   }
-
-   if (driver == NULL) {
-   rte_panic("no driver found for %s\n",
- devargs->virtual.drv_name);
+   if (rte_eal_vdev_init(devargs->virtual.drv_name,
+   devargs->args)) {
+   RTE_LOG

[dpdk-dev] [PATCH v14 13/13] doc: Add port hotplug framework section to programmers guide

2015-02-25 Thread Tetsuya Mukawa
This patch adds a new section describing the port hotplug framework.

Signed-off-by: Tetsuya Mukawa 
---
 doc/guides/prog_guide/index.rst  |   1 +
 doc/guides/prog_guide/port_hotplug_framework.rst | 110 +++
 2 files changed, 111 insertions(+)
 create mode 100644 doc/guides/prog_guide/port_hotplug_framework.rst

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index de69682..60a6ac5 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -71,6 +71,7 @@ Programmer's Guide
 packet_classif_access_ctrl
 packet_framework
 vhost_lib
+port_hotplug_framework
 source_org
 dev_kit_build_system
 dev_kit_root_make_help
diff --git a/doc/guides/prog_guide/port_hotplug_framework.rst b/doc/guides/prog_guide/port_hotplug_framework.rst
new file mode 100644
index 000..355ae28
--- /dev/null
+++ b/doc/guides/prog_guide/port_hotplug_framework.rst
@@ -0,0 +1,110 @@
+..  BSD LICENSE
+Copyright(c) 2015 IGEL Co.,Ltd. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of IGEL Co.,Ltd. nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Port Hotplug Framework
+======================
+
+The Port Hotplug Framework provides DPDK applications with the ability to
+attach and detach ports at runtime. Because the framework depends on PMD
+implementation, the ports that PMDs cannot handle are out of scope of this
+framework. Furthermore, after detaching a port from a DPDK application, the
+framework doesn't provide a way for removing the devices from the system.
+For the ports backed by a physical NIC, the kernel will need to support PCI
+Hotplug feature.
+
+Overview
+--------
+
+The basic requirements of the Port Hotplug Framework are:
+
+*   DPDK applications that use the Port Hotplug Framework must manage their
+own ports.
+
+The Port Hotplug Framework is implemented to allow DPDK applications to
+manage ports. For example, when DPDK applications call the port attach
+function, the attached port number is returned. DPDK applications can
+also detach the port by port number.
+
+*   Kernel support is needed for attaching or detaching physical device
+ports.
+
+To attach new physical device ports, the device will be recognized by
+userspace driver I/O framework in kernel at first. Then DPDK
+applications can call the Port Hotplug functions to attach the ports.
+For detaching, steps are vice versa.
+
+*   Before detaching, they must be stopped and closed.
+
+DPDK applications must call "rte_eth_dev_stop()" and
+"rte_eth_dev_close()" APIs before detaching ports. These functions will
+start finalization sequence of the PMDs.
+
+*   The framework doesn't affect legacy DPDK applications behavior.
+
+If the Port Hotplug functions aren't called, all legacy DPDK apps can
+still work without modifications.
+
+Port Hotplug API overview
+-------------------------
+
+*   Attaching a port
+
+"rte_eal_dev_attach()" API attaches a port to DPDK application, and
+returns the attached port number. Before calling the API, the device
+should be recognized by an userspace driver I/O framework. The API
+receives a pci address like ":01:00.0" or a virtual device name
+like "eth_pcap0,iface=eth0". In the case of virtual device name, the
+format is the same as the general "--vdev" option of DPDK.
+
+*   Detac

[dpdk-dev] [PATCH v14] librte_pmd_pcap: Add port hotplug support

2015-02-25 Thread Tetsuya Mukawa
This patch adds finalization code to free resources allocated by the
PMD.

v6:
 - Fix a parameter of rte_eth_dev_free().
v4:
 - Change function name.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_pmd_pcap/rte_eth_pcap.c | 40 ++
 1 file changed, 40 insertions(+)

diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c b/lib/librte_pmd_pcap/rte_eth_pcap.c
index af7fae8..5e94930 100644
--- a/lib/librte_pmd_pcap/rte_eth_pcap.c
+++ b/lib/librte_pmd_pcap/rte_eth_pcap.c
@@ -498,6 +498,13 @@ static struct eth_dev_ops ops = {
.stats_reset = eth_stats_reset,
 };

+static struct eth_driver rte_pcap_pmd = {
+   .pci_drv = {
+   .name = "rte_pcap_pmd",
+   .drv_flags = RTE_PCI_DRV_DETACHABLE,
+   },
+};
+
 /*
  * Function handler that opens the pcap file for reading a stores a
  * reference of it for use it later on.
@@ -713,6 +720,10 @@ rte_pmd_init_internals(const char *name, const unsigned nb_rx_queues,
if (*eth_dev == NULL)
goto error;

+   /* check length of device name */
+   if ((strlen((*eth_dev)->data->name) + 1) > sizeof(data->name))
+   goto error;
+
/* now put it all together
 * - store queue data in internals,
 * - store numa_node info in pci_driver
@@ -739,10 +750,13 @@ rte_pmd_init_internals(const char *name, const unsigned nb_rx_queues,
data->nb_tx_queues = (uint16_t)nb_tx_queues;
data->dev_link = pmd_link;
data->mac_addrs = &eth_addr;
+   strncpy(data->name,
+   (*eth_dev)->data->name, strlen((*eth_dev)->data->name));

(*eth_dev)->data = data;
(*eth_dev)->dev_ops = &ops;
(*eth_dev)->pci_dev = pci_dev;
+   (*eth_dev)->driver = &rte_pcap_pmd;

return 0;

@@ -927,10 +941,36 @@ rte_pmd_pcap_devinit(const char *name, const char *params)

 }

+static int
+rte_pmd_pcap_devuninit(const char *name)
+{
+   struct rte_eth_dev *eth_dev = NULL;
+
+   RTE_LOG(INFO, PMD, "Closing pcap ethdev on numa socket %u\n",
+   rte_socket_id());
+
+   if (name == NULL)
+   return -1;
+
+   /* reserve an ethdev entry */
+   eth_dev = rte_eth_dev_allocated(name);
+   if (eth_dev == NULL)
+   return -1;
+
+   rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data);
+   rte_free(eth_dev->pci_dev);
+
+   rte_eth_dev_release_port(eth_dev);
+
+   return 0;
+}
+
 static struct rte_driver pmd_pcap_drv = {
.name = "eth_pcap",
.type = PMD_VDEV,
.init = rte_pmd_pcap_devinit,
+   .uninit = rte_pmd_pcap_devuninit,
 };

 PMD_REGISTER_DRIVER(pmd_pcap_drv);
-- 
1.9.1



[dpdk-dev] [PATCH v14] testpmd: Add port hotplug support

2015-02-25 Thread Tetsuya Mukawa
The patch introduces the following commands.
- port attach [ident]
- port detach [port_id]
 - attach: attaching a port
 - detach: detaching a port
 - ident: pci address of physical device.
  Or device name and parameters of virtual device.
 (ex. :02:00.0, eth_pcap0,iface=eth0)
 - port_id: port identifier

v7:
- Fix doc.
  (Thanks to Iremonger, Bernard)
- Fix port checking implementation of start_port();
  (Thanks to Qiu, Michael)
v5:
- Add testpmd documentation.
  (Thanks to Iremonger, Bernard)
v4:
 - Fix strings of command help.

Signed-off-by: Tetsuya Mukawa 
---
 app/test-pmd/cmdline.c  | 137 +++
 app/test-pmd/config.c   | 102 --
 app/test-pmd/parameters.c   |  22 ++-
 app/test-pmd/testpmd.c  | 199 +---
 app/test-pmd/testpmd.h  |  18 ++-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  57 
 6 files changed, 409 insertions(+), 126 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4c9f423..c8312be 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -513,6 +513,12 @@ static void cmd_help_long_parsed(void *parsed_result,
"port close (port_id|all)\n"
"Close all ports or port_id.\n\n"

+   "port attach (ident)\n"
+   "Attach physical or virtual dev by pci address or virtual device name\n\n"
+
+   "port detach (port_id)\n"
+   "Detach physical or virtual dev by port_id\n\n"
+
"port config (port_id|all)"
" speed (10|100|1000|1|4|auto)"
" duplex (half|full|auto)\n"
@@ -793,6 +799,89 @@ cmdline_parse_inst_t cmd_operate_specific_port = {
},
 };

+/* *** attach a specified port *** */
+struct cmd_operate_attach_port_result {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t keyword;
+   cmdline_fixed_string_t identifier;
+};
+
+static void cmd_operate_attach_port_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_operate_attach_port_result *res = parsed_result;
+
+   if (!strcmp(res->keyword, "attach"))
+   attach_port(res->identifier);
+   else
+   printf("Unknown parameter\n");
+}
+
+cmdline_parse_token_string_t cmd_operate_attach_port_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   port, "port");
+cmdline_parse_token_string_t cmd_operate_attach_port_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   keyword, "attach");
+cmdline_parse_token_string_t cmd_operate_attach_port_identifier =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   identifier, NULL);
+
+cmdline_parse_inst_t cmd_operate_attach_port = {
+   .f = cmd_operate_attach_port_parsed,
+   .data = NULL,
+   .help_str = "port attach identifier, "
+   "identifier: pci address or virtual dev name",
+   .tokens = {
+   (void *)&cmd_operate_attach_port_port,
+   (void *)&cmd_operate_attach_port_keyword,
+   (void *)&cmd_operate_attach_port_identifier,
+   NULL,
+   },
+};
+
+/* *** detach a specified port *** */
+struct cmd_operate_detach_port_result {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t keyword;
+   uint8_t port_id;
+};
+
+static void cmd_operate_detach_port_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_operate_detach_port_result *res = parsed_result;
+
+   if (!strcmp(res->keyword, "detach"))
+   detach_port(res->port_id);
+   else
+   printf("Unknown parameter\n");
+}
+
+cmdline_parse_token_string_t cmd_operate_detach_port_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
+   port, "port");
+cmdline_parse_token_string_t cmd_operate_detach_port_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
+   keyword, "detach");
+cmdline_parse_token_num_t cmd_operate_detach_port_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_operate_detach_port_result,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_operate_detach_port = {
+   .f = cmd_operate_detach_port_parsed,
+   .data = NULL,
+   .help_str = "port detach port_id",
+   .tokens = {
+   (void *)&cmd_operate_detach_port_port,
+   (void *)&cmd_operate_detach_port_keyword,
+ 

[dpdk-dev] [PATCH v2] app/test: add crc32 algorithms equivalence check

2015-02-25 Thread Yerden Zhumabekov
New function test_crc32_hash_alg_equiv() checks whether software,
4-byte operand and 8-byte operand versions of CRC32 hash function
implementations return the same result value.
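
The idea of the check can be sketched outside DPDK as follows: two
independent implementations of the same CRC are run over the same
pseudo-random data, with the same initial value, and must agree for every
length. The sketch below compares a bit-at-a-time and a table-driven
software CRC32C. All demo_* names are hypothetical, and neither function is
the rte_hash_crc code or the SSE4.2 path exercised by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CRC32C_POLY 0x82F63B78u /* reflected Castagnoli polynomial */

/* Bit-at-a-time reference implementation. */
static uint32_t
demo_crc32c_bitwise(const uint8_t *data, size_t len, uint32_t crc)
{
	size_t i;
	int bit;

	assert(data != NULL || len == 0);

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^
			      (DEMO_CRC32C_POLY & (0u - (crc & 1u)));
	}
	return crc;
}

/* Byte-at-a-time variant using a 256-entry table built on first use. */
static uint32_t
demo_crc32c_table(const uint8_t *data, size_t len, uint32_t crc)
{
	static uint32_t table[256];
	static int ready;
	size_t i;

	if (!ready) {
		uint32_t n;

		for (n = 0; n < 256; n++) {
			uint8_t byte = (uint8_t)n;

			table[n] = demo_crc32c_bitwise(&byte, 1, 0);
		}
		ready = 1;
	}

	for (i = 0; i < len; i++)
		crc = (crc >> 8) ^ table[(crc ^ data[i]) & 0xffu];
	return crc;
}

/* Return 0 if both variants agree for all lengths and seeds tried. */
static int
demo_crc32c_equiv(void)
{
	uint8_t buf[64];
	uint32_t round, state = 1;
	size_t i, len;

	for (round = 0; round < 16; round++) {
		/* Refill the buffer from a simple LCG; reuse state as seed. */
		for (i = 0; i < sizeof(buf); i++) {
			state = state * 1664525u + 1013904223u;
			buf[i] = (uint8_t)(state >> 24);
		}
		for (len = 1; len <= sizeof(buf); len++)
			if (demo_crc32c_bitwise(buf, len, state) !=
			    demo_crc32c_table(buf, len, state))
				return -1;
	}
	return 0;
}
```

Sweeping every length from 1 up to the buffer size matters because
wide-operand implementations usually handle the trailing bytes in a
separate code path, which is where a mismatch is most likely to hide.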

Signed-off-by: Yerden Zhumabekov 
---
 app/test/test_hash.c |   63 ++
 1 file changed, 63 insertions(+)

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 76b1b8f..3e94af1 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -177,6 +177,66 @@ static struct rte_hash_parameters ut_params = {
.socket_id = 0,
 };

+#define CRC32_ITERATIONS (1U << 20)
+#define CRC32_DWORDS (1U << 6)
+/*
+ * Test if all CRC32 implementations yield the same hash value
+ */
+static int
+test_crc32_hash_alg_equiv(void)
+{
+   uint32_t hash_val;
+   uint32_t init_val;
+   uint64_t data64[CRC32_DWORDS];
+   unsigned i, j;
+   size_t data_len;
+
+   printf("# CRC32 implementations equivalence test\n");
+   for (i = 0; i < CRC32_ITERATIONS; i++) {
+   /* Randomizing data_len of data set */
+   data_len = (size_t) ((rte_rand() % sizeof(data64)) + 1);
+   init_val = (uint32_t) rte_rand();
+
+   /* Fill the data set */
+   for (j = 0; j < CRC32_DWORDS; j++)
+   data64[j] = rte_rand();
+
+   /* Calculate software CRC32 */
+   rte_hash_crc_set_alg(CRC32_SW);
+   hash_val = rte_hash_crc(data64, data_len, init_val);
+
+   /* Check against 4-byte-operand sse4.2 CRC32 if available */
+   rte_hash_crc_set_alg(CRC32_SSE42);
+   if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
+   printf("Failed checking CRC32_SW against CRC32_SSE42\n");
+   break;
+   }
+
+   /* Check against 8-byte-operand sse4.2 CRC32 if available */
+   rte_hash_crc_set_alg(CRC32_SSE42_x64);
+   if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
+   printf("Failed checking CRC32_SW against CRC32_SSE42_x64\n");
+   break;
+   }
+   }
+
+   /* Resetting to best available algorithm */
+   rte_hash_crc_set_alg(CRC32_SSE42_x64);
+
+   if (i == CRC32_ITERATIONS)
+   return 0;
+
+   printf("Failed test data (hex):\n");
+
+   for (j = 0; j < data_len; j++) {
+   printf("%02X", ((uint8_t *)data64)[j]);
+   if ((j+1) % 16 == 0 || j == data_len - 1)
+   printf("\n");
+   }
+
+   return -1;
+}
+
 /*
  * Test a hash function.
  */
@@ -1356,6 +1416,9 @@ test_hash(void)

run_hash_func_tests();

+   if (test_crc32_hash_alg_equiv() < 0)
+   return -1;
+
return 0;
 }

-- 
1.7.9.5



[dpdk-dev] [PATCH v1 0/2] eal: fix symbol missing in version map

2015-02-25 Thread Tetsuya Mukawa
On 2015/02/25 12:39, Cunming Liang wrote:
> These two patches fix the compile error seen when
> CONFIG_RTE_BUILD_SHARED_LIB=y.
> The root cause is that *per_lcore__socket_id* and *rte_sys_gettid* are
> missing from the version map.
> Thanks to Tetsuya Mukawa for reporting this.
>
> Cunming Liang (2):
>   eal/linux: fix symbol missing in version map
>   eal/bsd: fix symbol missing in version map
>
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   | 2 ++
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++
>  2 files changed, 4 insertions(+)
>
Hi Liang,

I've confirmed it works on my Linux environment.

Thanks,
Tetsuya


[dpdk-dev] ixgbe vector mode not working.

2015-02-25 Thread Liang, Cunming
Hi Stephen,

I tried the latest master branch with testpmd.
With 2 rxq and 2 txq as below, and the vector PMD on both rx and tx, I can't
reproduce it.
I checked your log; on the tx side, it looks like the tx vector path isn't
enabled (it shows vpmd on rx, spmd on tx).
Could you share the parameters below from your app?
RX desc=128 - RX free threshold=32
TX desc=512 - TX free threshold=32
TX RS bit threshold=32 - TXQ flags=0xf01
Since your case uses 2 rxq and 1 txq, could you explain the traffic flow
between them?
Does one thread poll packets from each rxq and send them to the specified txq?
./x86_64-native-linuxapp-gcc/app/testpmd -c 0xff00 -n 4 -- -i --coremask=f000 
--txfreet=32 --rxfreet=32 --txqflags=0xf01 --txrst=32 --rxq=2 --txq=2 --numa
 [...]
Configuring Port 0 (socket 1)
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f99cace9ac0 hw_ring=0x7f99c9c3f480 
dma_addr=0x1fdd83f480
PMD: set_tx_function(): Using simple tx code path
PMD: set_tx_function(): Vector tx enabled.
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f99cace7980 hw_ring=0x7f99c9c4f480 
dma_addr=0x1fdd84f480
PMD: set_tx_function(): Using simple tx code path
PMD: set_tx_function(): Vector tx enabled.
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f99cace7100 hw_ring=0x7f99c9c5f480 
dma_addr=0x1fdd85f480
PMD: ixgbe_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are 
satisfied. Rx Burst Bulk Alloc function will be used on port=0, queue=0.
PMD: ixgbe_dev_rx_queue_setup(): Vector rx enabled, please make sure RX burst 
size no less than 32.
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f99cace6880 hw_ring=0x7f99c9c6f500 
dma_addr=0x1fdd86f500
PMD: ixgbe_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are 
satisfied. Rx Burst Bulk Alloc function will be used on port=0, queue=1.
PMD: ixgbe_dev_rx_queue_setup(): Vector rx enabled, please make sure RX burst 
size no less than 32.
Port 0: 90:E2:BA:30:A0:75
Configuring Port 1 (socket 1)
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f99cace4540 hw_ring=0x7f99c9c7f580 
dma_addr=0x1fdd87f580
PMD: set_tx_function(): Using simple tx code path
PMD: set_tx_function(): Vector tx enabled.
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f99cace2400 hw_ring=0x7f99c9c8f580 
dma_addr=0x1fdd88f580
PMD: set_tx_function(): Using simple tx code path
PMD: set_tx_function(): Vector tx enabled.
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f99cace1b80 hw_ring=0x7f99c9c9f580 
dma_addr=0x1fdd89f580
PMD: ixgbe_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are 
satisfied. Rx Burst Bulk Alloc function will be used on port=1, queue=0.
PMD: ixgbe_dev_rx_queue_setup(): Vector rx enabled, please make sure RX burst 
size no less than 32.
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f99cace1300 hw_ring=0x7f99c9caf600 
dma_addr=0x1fdd8af600
PMD: ixgbe_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are 
satisfied. Rx Burst Bulk Alloc function will be used on port=1, queue=1.
PMD: ixgbe_dev_rx_queue_setup(): Vector rx enabled, please make sure RX burst 
size no less than 32.
Port 1: 90:E2:BA:06:90:59
Checking link statuses...
Port 0 Link Up - speed 1 Mbps - full-duplex
Port 1 Link Up - speed 1 Mbps - full-duplex
Done
testpmd> show config rxtx
  io packet forwarding - CRC stripping disabled - packets/burst=32
  nb forwarding cores=4 - nb forwarding ports=2
  RX queues=2 - RX desc=128 - RX free threshold=32
  RX threshold registers: pthresh=8 hthresh=8 wthresh=0
  TX queues=2 - TX desc=512 - TX free threshold=32
  TX threshold registers: pthresh=32 hthresh=0 wthresh=0
  TX RS bit threshold=32 - TXQ flags=0xf01

-Cunming

> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Wednesday, February 25, 2015 8:16 AM
> To: Nemeth, Balazs; Richardson, Bruce; Liang, Cunming; Neil Horman
> Cc: dev at dpdk.org
> Subject: ixgbe vector mode not working.
> 
> The ixgbe driver (from 1.8 or 2.0) works fine in normal (non-vectored) mode.
> But when vector mode is enabled, it gets a few packets through then hangs.
> We use 2 Rx queues and 1 Tx queue per interface.
> 
> Devices:
> 01:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+
> Network Connection (rev 01)
> 02:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit 
> X540-
> AT2 (rev 01)
> 
> Log:
> EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
> PMD: eth_ixgbe_dev_init(): MAC: 2, PHY: 17, SFP+: 5
> PMD: eth_ixgbe_dev_init(): port 0 vendorID=0x8086 deviceID=0x10fb
> EAL:   probe driver: 8086:1528 rte_ixgbe_pmd
> PMD: eth_ixgbe_dev_init(): MAC: 4, PHY: 3
> PMD: eth_ixgbe_dev_init(): port 1 vendorID=0x8086 deviceID=0x1528
> [0.43] DATAPLANE: Port 0 rte_ixgbe_pmd on socket 0
> [0.53] DATAPLANE: Port 1 rte_ixgbe_pmd on socket 0
> [0.031638] PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fc5ac6a1b40
> hw_ring=0x7fc5ab548300 dma_addr=0x67348300
> [0.031647] PMD: ixgbe_dev_rx_queue_setup(): Rx Burst Bulk Alloc
> Preconditions are satisfied. Rx Burst Bulk Alloc f

[dpdk-dev] Manage DPDK port capability via KNI

2015-02-25 Thread Zhou, Danny
You can do it, but it will not sync with DPDK. In the current KNI
implementation, the devices' I/O address spaces are mapped to both userspace
DPDK and kernelspace KNI, so one can control the NIC device independently
(using ethtool for KNI and ethdev APIs for DPDK) without synchronization.

In theory, KNI should route all device control requests from ethtool to DPDK.
Unfortunately, a shortcut is used at the moment because DPDK reuses a lot of
legacy kernel code under a BSD license.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tim Deng
> Sent: Wednesday, February 25, 2015 9:57 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Manage DPDK port capability via KNI
> 
> Hi,
> 
> 
> I am wondering how could we manage a DPDK port offload capabilities,
> e.g. if we want to disable TSO capability on a DPDK port, is it feasible
> that we use ethtool to configure a KNI then the config will be sync to a DPDK 
> port?
> 
> 
> Thanks,
> Tim


[dpdk-dev] [PATCH 5/5] Fix usage of fgets in various places

2015-02-25 Thread Panu Matilainen
On 02/24/2015 09:01 PM, Stephen Hemminger wrote:
> On Tue, 24 Feb 2015 11:20:33 +0200
> Panu Matilainen  wrote:
>
>> The tool is technically correct, even if loss of precision might be
>> unlikely to occur in this context
>
> Overflow is not there in the code.
> That is why I said "shooting Unicorns"; this is all about
> fixing bugs that don't exist because there is nothing there
> in the real world.
>
> In this code buffer is always something normal in size and does
> not exceed 2^32-1.

Oh, if the tool is complaining about fixed-size buffers then yeah, the
tool is being silly. Sorry, I didn't actually look up the cases.

- Panu -



[dpdk-dev] [PATCH v3 00/11] qemu vhost-user support

2015-02-25 Thread Xie, Huawei
PC:
Thanks a lot for the effort.
During one of the rebases, I moved the eventfd copy into
eventfd_copy.c but forgot to update virtio-net.c, so it isn't
compilable until a later commit.
Sorry for the trouble. I will check that each commit compiles in
future.

On 2/24/2015 1:36 AM, Przemyslaw Czesnowicz wrote:
> v3 changes:
>   * move things around to make all patches compile
>   
>
> Xie, Huawei (11):
>   lib/librte_vhost: enable VIRTIO_NET_F_CTRL_RX VIRTIO_NET_F_CTRL_RX is
> dependant on VIRTIO_NET_F_CTRL_VQ. Observed that virtio-net driver
> in guest would crash with only CTRL_RX enabled.
>   lib/librte_vhost: create vhost_cuse directory and move
> vhost-net-cdev.c into vhost_cuse
>   lib/librte_vhost: rename vhost-net-cdev.h to vhost-net.h
>   lib/librte_vhost: move fd copying(from qemu process into vhost
> process) to eventfd_copy.c
>   lib/librte_vhost: copy host_memory_map from virtio-net.c to a new file
> virtio-net-cdev.c
>   lib/librte_vhost: make host_memory_map a more generic function.
>   lib/librte_vhost: implement cuse_set_memory_table
>   lib/librte_vhost: add select based event driven processing
>   lib/librte_vhost: vhost user support
>   lib/librte_vhost: support dev->ifname for vhost-user
>   lib/librte_vhost: support dynamically registering vhost server
>
>  lib/librte_vhost/Makefile |   8 +-
>  lib/librte_vhost/rte_virtio_net.h |   5 +-
>  lib/librte_vhost/vhost-net-cdev.c | 389 
>  lib/librte_vhost/vhost-net-cdev.h | 113 --
>  lib/librte_vhost/vhost-net.h  | 118 +++
>  lib/librte_vhost/vhost_cuse/eventfd_copy.c|  88 +
>  lib/librte_vhost/vhost_cuse/eventfd_copy.h|  39 ++
>  lib/librte_vhost/vhost_cuse/vhost-net-cdev.c  | 417 ++
>  lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 423 ++
>  lib/librte_vhost/vhost_cuse/virtio-net-cdev.h |  48 +++
>  lib/librte_vhost/vhost_rxtx.c |   2 +-
>  lib/librte_vhost/vhost_user/fd_man.c  | 258 ++
>  lib/librte_vhost/vhost_user/fd_man.h  |  67 
>  lib/librte_vhost/vhost_user/vhost-net-user.c  | 472 +
>  lib/librte_vhost/vhost_user/vhost-net-user.h  | 106 ++
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 314 
>  lib/librte_vhost/vhost_user/virtio-net-user.h |  49 +++
>  lib/librte_vhost/virtio-net.c | 491 
> ++
>  lib/librte_vhost/virtio-net.h |  43 +++
>  19 files changed, 2491 insertions(+), 959 deletions(-)
>  delete mode 100644 lib/librte_vhost/vhost-net-cdev.c
>  delete mode 100644 lib/librte_vhost/vhost-net-cdev.h
>  create mode 100644 lib/librte_vhost/vhost-net.h
>  create mode 100644 lib/librte_vhost/vhost_cuse/eventfd_copy.c
>  create mode 100644 lib/librte_vhost/vhost_cuse/eventfd_copy.h
>  create mode 100644 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
>  create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
>  create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.h
>  create mode 100644 lib/librte_vhost/vhost_user/fd_man.c
>  create mode 100644 lib/librte_vhost/vhost_user/fd_man.h
>  create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.c
>  create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.h
>  create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.c
>  create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.h
>  create mode 100644 lib/librte_vhost/virtio-net.h
>



[dpdk-dev] [PATCH v4 4/7] move rte_eth_dev_check_mq_mode() logic to driver

2015-02-25 Thread Ouyang, Changchun


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pawel Wodkowski
> Sent: Thursday, February 19, 2015 11:55 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 4/7] move rte_eth_dev_check_mq_mode()
> logic to driver
> 
> Function rte_eth_dev_check_mq_mode() is driver specific. It should be
> done in PF configuration phase. This patch move igb/ixgbe driver specific mq
> check and SRIOV configuration code to driver part. Also rewriting log
> messages to be shorter and more descriptive.
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  lib/librte_ether/rte_ethdev.c   | 197 ---
>  lib/librte_pmd_e1000/igb_ethdev.c   |  43 
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 105 ++-
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   5 +-
>  lib/librte_pmd_ixgbe/ixgbe_pf.c | 202 +++-
>  5 files changed, 327 insertions(+), 225 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 4007054..aa27e39 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -502,195 +502,6 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev
> *dev, uint16_t nb_queues)
>   return (0);
>  }
> 
> -static int
> -rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
> -{
> - struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> - switch (nb_rx_q) {
> - case 1:
> - case 2:
> - RTE_ETH_DEV_SRIOV(dev).active =
> - ETH_64_POOLS;
> - break;
> - case 4:
> - RTE_ETH_DEV_SRIOV(dev).active =
> - ETH_32_POOLS;
> - break;
> - default:
> - return -EINVAL;
> - }
> -
> - RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = nb_rx_q;
> - RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
> - dev->pci_dev->max_vfs * nb_rx_q;
> -
> - return 0;
> -}
> -
> -static int
> -rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t
> nb_tx_q,
> -   const struct rte_eth_conf *dev_conf)
> -{
> - struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> -
> - if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
> - /* check multi-queue mode */
> - if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
> - (dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB_RSS)
> ||
> - (dev_conf->txmode.mq_mode == ETH_MQ_TX_DCB)) {
> - /* SRIOV only works in VMDq enable mode */
> - PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
> - " SRIOV active, "
> - "wrong VMDQ mq_mode rx %u
> tx %u\n",
> - port_id,
> - dev_conf->rxmode.mq_mode,
> - dev_conf->txmode.mq_mode);
> - return (-EINVAL);
> - }
> -
> - switch (dev_conf->rxmode.mq_mode) {
> - case ETH_MQ_RX_VMDQ_DCB:
> - case ETH_MQ_RX_VMDQ_DCB_RSS:
> - /* DCB/RSS VMDQ in SRIOV mode, not implement
> yet */
> - PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
> - " SRIOV active, "
> - "unsupported VMDQ mq_mode
> rx %u\n",
> - port_id, dev_conf-
> >rxmode.mq_mode);
> - return (-EINVAL);
> - case ETH_MQ_RX_RSS:
> - PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
> - " SRIOV active, "
> - "Rx mq mode is changed from:"
> - "mq_mode %u into VMDQ
> mq_mode %u\n",
> - port_id,
> - dev_conf->rxmode.mq_mode,
> - dev->data-
> >dev_conf.rxmode.mq_mode);
> - case ETH_MQ_RX_VMDQ_RSS:
> - dev->data->dev_conf.rxmode.mq_mode =
> ETH_MQ_RX_VMDQ_RSS;
> - if (nb_rx_q <=
> RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool)
> - if
> (rte_eth_dev_check_vf_rss_rxq_num(port_id, nb_rx_q) != 0) {
> - PMD_DEBUG_TRACE("ethdev
> port_id=%d"
> - " SRIOV active, invalid queue"
> - " number for VMDQ RSS,
> allowed"
> - " value are 1, 2 or 4\n",
> - port_id);
> - return -EINVAL;
> - }
> - break;
> - default: /* ETH_MQ_RX_VMDQ_ONLY or
> ETH_MQ_RX_NONE */
> - /* if nothing mq mode configure, use default scheme
> 

[dpdk-dev] [PATCH v5 5/6] eal: add per rx queue interrupt handling based on VFIO

2015-02-25 Thread Zhou, Danny
Thanks for comments and please see my answers inline.

From: David Marchand [mailto:david.march...@6wind.com]
Sent: Tuesday, February 24, 2015 6:42 PM
To: Zhou, Danny
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v5 5/6] eal: add per rx queue interrupt handling 
based on VFIO

Hello Danny,

On Mon, Feb 23, 2015 at 5:55 PM, Zhou Danny <danny.zhou at intel.com> wrote:

[snip]

+/**
+ * @param intr_handle
+ *   pointer to the interrupt handle.
+ * @param queue_id
+ *   the queue id
+ * @return
+ *   - On success, return 0
+ *   - On failure, returns -1.
+ */
+int rte_intr_wait_rx_pkt(struct rte_intr_handle *intr_handle,
+   uint8_t queue_id);
+


[dpdk-dev] Cannot compile l2fwd-jobstats example

2015-02-25 Thread Pawel Wodkowski
On 2015-02-25 03:26, Tetsuya Mukawa wrote:
> Hi,
>
> I cannot compile l2fwd-jobstats using master branch.
> Here is log
>
> $ T=x86_64-native-linuxapp-gcc make examples
> == Build examples for x86_64-native-linuxapp-gcc
> == bond
> == cmdline
> == distributor
> == exception_path
> == helloworld
> == ip_pipeline
> == ip_reassembly
> == ipv4_multicast
> == kni
> == l2fwd
> == l2fwd-jobstats
> make: *** l2fwd-jobstats: No such file or directory.  Stop.
> make[2]: *** [l2fwd-jobstats] Error 2
> make[1]: *** [x86_64-native-linuxapp-gcc_examples] Error 2
> make: *** [examples] Error 2
>
>
> As a result of bisecting, it seems after applying below commit, this
> error can be seen.
>
> commit 2caeb8c0141dcf488f2d68aa8e8c44d1f85ed28b
> Author: Pawel Wodkowski 
> Date:   Tue Feb 24 17:33:24 2015 +0100
>
>  examples/l2fwd-jobstats: new example
>
>
> Thanks,
> Tetsuya
>

Looking at the git log, two files are missing there:

  examples/l2fwd-jobstats/Makefile
  examples/l2fwd-jobstats/main.c

from patch http://dpdk.org/ml/archives/dev/2015-February/014107.html

-- 
Pawel


[dpdk-dev] Vhost-user early adopter feedback

2015-02-25 Thread Xie, Huawei
On 2/18/2015 3:59 PM, Benoît Canet wrote:
> Hello Xie,
>
> As promised I integrated your vhost-user patchset from January in my vswitch.
>
> I just tried it, it works pretty well.
>
> I just had a minor bug with rte_vhost_driver_register taking ownership of the
> string path pointer too late. I freed it out of habit just after registering
> in the caller, and when ifname[IFNAMESIZ] was written the pointer was used
> for a new string I allocated later. Maybe an early strdup() would fix this.
Thanks.
Do you mean we should duplicate the string passed as the first parameter,
path, like vserver->path = strdup(path)?
If so, I had already considered that. We will do this if
necessary.
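
The early copy discussed here can be sketched roughly as follows. This is
only an illustration under stated assumptions: struct vserver,
vserver_create() and vserver_destroy() are hypothetical names, not the real
librte_vhost code. The point is that the registration path takes its own
copy of the string before returning, so the caller may free its buffer
immediately.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() under strict -std=c99 */
#include <stdlib.h>
#include <string.h>

/* Hypothetical server record; not the real librte_vhost structure. */
struct vserver {
	char *path;	/* private copy owned by the server */
};

/* Copy the caller's path up front so the caller may free its buffer at once. */
static struct vserver *
vserver_create(const char *path)
{
	struct vserver *vs;

	if (path == NULL)
		return NULL;

	vs = calloc(1, sizeof(*vs));
	if (vs == NULL)
		return NULL;

	vs->path = strdup(path);
	if (vs->path == NULL) {
		free(vs);
		return NULL;
	}
	return vs;
}

/* Release the private copy together with the record. */
static void
vserver_destroy(struct vserver *vs)
{
	if (vs == NULL)
		return;
	free(vs->path);
	free(vs);
}
```

With this shape, a caller that frees or reuses its own string right after
registering no longer affects the server's copy.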
>
> The last patch of your new version is really a great idea since it will
> simplify a lot the socket creation and management code.
>
> Best regards
>
> Benoît
>
>



[dpdk-dev] [PATCH v4 3/7] pmd: igb/ixgbe split nb_q_per_pool to rx and tx nb_q_per_pool

2015-02-25 Thread Pawel Wodkowski
On 2015-02-25 04:24, Ouyang, Changchun wrote:
>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pawel Wodkowski
>> Sent: Thursday, February 19, 2015 11:55 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH v4 3/7] pmd: igb/ixgbe split nb_q_per_pool to rx
>> and tx nb_q_per_pool
>>
[...]
>>
>>  /* check valid queue number */
>> -if ((nb_rx_q > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) ||
>> -(nb_tx_q > RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)) {
>> +if ((nb_rx_q > RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool)
>
> Here,  how about use nb_rx_q_per_pool to replace nb_tx_q_per_pool ?
> so it will be more clear to check rx queue number.

Yes, this should be nb_rx_q_per_pool. I missed this because in the next 
patch I moved this code and corrected it "on the fly" :). I will correct this 
in the next version.
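
For reference, the corrected check can be sketched as below. This is a
minimal standalone illustration: struct sriov_limits and
check_queue_numbers() are hypothetical stand-ins for the per-pool fields
reached through RTE_ETH_DEV_SRIOV(dev), not the code from the patch. Each
direction is compared against its own limit.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-pool limits kept in RTE_ETH_DEV_SRIOV(dev). */
struct sriov_limits {
	uint16_t nb_rx_q_per_pool;
	uint16_t nb_tx_q_per_pool;
};

/* Compare the rx count with the rx limit and the tx count with the tx limit. */
static int
check_queue_numbers(const struct sriov_limits *lim,
		    uint16_t nb_rx_q, uint16_t nb_tx_q)
{
	if (nb_rx_q > lim->nb_rx_q_per_pool ||
	    nb_tx_q > lim->nb_tx_q_per_pool)
		return -EINVAL;
	return 0;
}
```

With limits of 2 rx and 4 tx queues per pool, a request for 4 rx queues is
rejected here, whereas comparing nb_rx_q against the tx limit would have
let it through.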


-- 
Pawel


[dpdk-dev] [PATCH] eal: add missing symbol export for rte_sys_gettid()

2015-02-25 Thread Panu Matilainen
Signed-off-by: Panu Matilainen 
---
 lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index c207cee..117246a 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -83,6 +83,7 @@ DPDK_2.0 {
rte_snprintf;
rte_strerror;
rte_strsplit;
+   rte_sys_gettid;
rte_thread_get_affinity;
rte_thread_set_affinity;
rte_vlog;
-- 
2.1.0



[dpdk-dev] [PATCH] eal: add missing symbol export for rte_sys_gettid()

2015-02-25 Thread Panu Matilainen
On 02/25/2015 09:50 AM, Panu Matilainen wrote:
> Signed-off-by: Panu Matilainen 
> ---
>   lib/librte_eal/linuxapp/eal/rte_eal_version.map | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
> b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> index c207cee..117246a 100644
> --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
> @@ -83,6 +83,7 @@ DPDK_2.0 {
>   rte_snprintf;
>   rte_strerror;
>   rte_strsplit;
> + rte_sys_gettid;
>   rte_thread_get_affinity;
>   rte_thread_set_affinity;
>   rte_vlog;

Never mind, already addressed here: 
http://dpdk.org/dev/patchwork/patch/3683/

NAK for myself...

- Panu -



[dpdk-dev] [PATCH v1 0/2] eal: fix symbol missing in version map

2015-02-25 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cunming Liang
> Sent: Wednesday, February 25, 2015 3:40 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v1 0/2] eal: fix symbol missing in version map
> 
> These two patches fix the compile error seen when
> CONFIG_RTE_BUILD_SHARED_LIB=y.
> The root cause is that *per_lcore__socket_id* and *rte_sys_gettid* are
> missing in the version map.
> Thanks for the notification from Tetsuya Mukawa.
> 
> Cunming Liang (2):
>   eal/linux: fix symbol missing in version map
>   eal/bsd: fix symbol missing in version map
> 
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   | 2 ++
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++
>  2 files changed, 4 insertions(+)
> 

Series Acked-by: John McNamara 




[dpdk-dev] [PATCH v1] afpacket: fix critical issue reported by klocwork

2015-02-25 Thread Thomas Monjalon
2015-02-25 00:57, Liang, Cunming:
> From: John W. Linville [mailto:linville at tuxdriver.com]
> > On Fri, Feb 20, 2015 at 11:19:59AM +0100, Thomas Monjalon wrote:
> > > 2015-02-12 17:08, Cunming Liang:
> > > > --- a/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > > +++ b/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > > @@ -439,13 +439,15 @@ rte_pmd_init_internals(const char *name,
> > > > size_t ifnamelen;
> > > > unsigned k_idx;
> > > > struct sockaddr_ll sockaddr;
> > > > -   struct tpacket_req *req;
> > > > +   struct tpacket_req *req = NULL;
> > >
> > > If *internals is set to NULL, there should be no case where req used
> > > and undefined.
> 
> [LCM] Agree, so that's why I add '*internals = NULL' below as well.
> > 
> > I agree -- it looks to me like req is protected by checking for
> > *internals == NULL.  I don't think this patch is necessary.
> 
> [LCM] The major piece of the patch is add setting for '*internals=NULL;'.

Yes understood, but it is already initialized to NULL before calling
rte_pmd_init_internals():
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_af_packet/rte_eth_af_packet.c#n706
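
The invariant under discussion can be sketched in isolation as follows. All
names here (struct ring_req, struct pmd_internals, init_internals(), enum
fail_at) are hypothetical and only mimic the shape of the driver code. The
sketch assumes the caller sets the output pointer to NULL before the call,
so the error path touches req only after it has been assigned.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; the real driver types are different. */
struct ring_req { unsigned int frame_count; };
struct pmd_internals { struct ring_req req; };

enum fail_at { FAIL_NONE, FAIL_EARLY, FAIL_LATE };

/* The caller must pass *internals == NULL, as the real caller does. */
static int
init_internals(enum fail_at fail, struct pmd_internals **internals)
{
	struct ring_req *req;	/* no initializer, as in the original code */

	if (fail == FAIL_EARLY)
		goto error;	/* *internals is still the caller's NULL */

	*internals = calloc(1, sizeof(**internals));
	if (*internals == NULL)
		goto error;

	req = &(*internals)->req;
	req->frame_count = 8;

	if (fail == FAIL_LATE)
		goto error;

	return 0;

error:
	/* req is read only when *internals is set, i.e. after req was assigned */
	if (*internals != NULL) {
		req->frame_count = 0;
		free(*internals);
		*internals = NULL;
	}
	return -1;
}
```

A static analyzer that does not track the link between *internals and req
may still flag the read in the error path, which appears to be the
situation reported in this thread.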



[dpdk-dev] ixgbe vector mode not working.

2015-02-25 Thread Liang, Cunming
Hi Stephen,

Thanks for the info. With rxd=4000, I can reproduce it.
At that point, it runs out of mbufs.
I'll follow up on this issue.

> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Wednesday, February 25, 2015 3:37 PM
> To: Liang, Cunming
> Cc: Nemeth, Balazs; Richardson, Bruce; Neil Horman; dev at dpdk.org
> Subject: Re: ixgbe vector mode not working.
> 
> On Wed, 25 Feb 2015 04:55:09 +
> "Liang, Cunming"  wrote:
> 
> > Hi Stephen,
> >
> > I tried on the latest master branch with testpmd.
> > 2 rxq and 2 txq as below, vector pmd on both rx and tx. I can't reproduce
> > it.
> > I checked your log; on the tx side, it looks like the tx vector path is not
> > enabled (it shows vpmd on rx, spmd on tx).
> > Would you share the params below from your app?
> > RX desc=128 - RX free threshold=32
> > TX desc=512 - TX free threshold=32
> > TX RS bit threshold=32 - TXQ flags=0xf01
> > As your case uses 2 rxq and 1 txq, would you explain the traffic flow
> > between them?
> > Is one thread polling packets from each rxq and sending to the specified txq?
> 
> Basic thread model of application is same as examples/qos_sched.
> 
> On ixgbe:
>   RX desc = 4000 - RX free threshold=32
>   TX desc = 512  - TX free threshold=0 so driver sets default of 32
> 
> I was setting rx/tx conf but, since the examples don't, went away from that.

[LCM] All these params are defined in rte_eth_rxconf/rte_eth_txconf, which are 
used during rte_eth_rx/tx_queue_setup.
If you don't care about a value and assign nothing to it, the per-device 
default is used.
For ixgbe, the default_txconf will use the vpmd. In your log, it doesn't. 
That's why I asked for these params.

> 
> The whole RX/TX tuning parameters are a very poor programming model only
> a hardware engineer could love. Requiring the application to look at
> driver string and choose the magic parameter settings, is in my opinion
> an indication of using an incorrect abstraction.
[LCM] It's not necessary for the application to look at such parameters. As 
you said, they are only for RX/TX tuning.
When tuning, it makes sense to understand what these parameters mean.




[dpdk-dev] [PATCH v2 0/3] timer: fix rte_timer_reset

2015-02-25 Thread Olivier MATZ
Hi Robert,

On 02/25/2015 05:09 AM, Robert Sanford wrote:
> Changes in v2:
> - split into multiple patches
> - minor coding-style changes
>
> Robert Sanford (3):
>timer: fix return value of rte_timer_reset(),
>  insert rte_pause() into rte_timer_reset_sync() wait-loop
>app/test: fix timer stress test to succeed on multiple runs,
>  display number of times that rte_timer_reset() fails
>  (expected) due to races with other cores
>
> app/test/test_timer.c|   26 +++---
> lib/librte_timer/rte_timer.c |7 +++
> 2 files changed, 26 insertions(+), 7 deletions(-)
>

Series:
Acked-by: Olivier Matz 

Thanks!



[dpdk-dev] [PATCH v1 0/2] eal: fix symbol missing in version map

2015-02-25 Thread Thomas Monjalon
> > These two patches fix the compile error seen when
> > CONFIG_RTE_BUILD_SHARED_LIB=y.
> > The root cause is that *per_lcore__socket_id* and *rte_sys_gettid* are
> > missing in the version map.
> > Thanks for the notification from Tetsuya Mukawa.

Please use Reported-by: in such case.

Fixes: ef76436c6834 ("eal: get unique thread id")
Fixes: 9e29251b2afa ("eal: thread affinity API")

> > Cunming Liang (2):
> >   eal/linux: fix symbol missing in version map
> >   eal/bsd: fix symbol missing in version map

Merged together

> Series Acked-by: John McNamara 

Applied, thanks



[dpdk-dev] ixgbe vector mode not working.

2015-02-25 Thread Thomas Monjalon
2015-02-24 23:36, Stephen Hemminger:
> On Wed, 25 Feb 2015 04:55:09 +
> "Liang, Cunming"  wrote:
> 
> > Hi Stephen,
> > 
> > I tried on the latest master branch with testpmd.
> > 2 rxq and 2 txq as below, vector pmd on both rx and tx. I can't reproduce
> > it.
> > I checked your log; on the tx side, it looks like the tx vector path is not
> > enabled (it shows vpmd on rx, spmd on tx).
> > Would you share the params below from your app?
> > RX desc=128 - RX free threshold=32
> > TX desc=512 - TX free threshold=32
> > TX RS bit threshold=32 - TXQ flags=0xf01
> > As your case uses 2 rxq and 1 txq, would you explain the traffic flow
> > between them?
> > Is one thread polling packets from each rxq and sending to the specified txq?
> 
> Basic thread model of application is same as examples/qos_sched.
> 
> On ixgbe:
>   RX desc = 4000 - RX free threshold=32
>   TX desc = 512  - TX free threshold=0 so driver sets default of 32
> 
> I was setting rx/tx conf but, since the examples don't, went away from that.
> 
> The whole RX/TX tuning parameters are a very poor programming model only
> a hardware engineer could love. Requiring the application to look at
> driver string and choose the magic parameter settings, is in my opinion
> an indication of using an incorrect abstraction.

Yes, improvements are welcome.


[dpdk-dev] Cannot compile l2fwd-jobstats example

2015-02-25 Thread Thomas Monjalon
2015-02-25 08:38, Pawel Wodkowski:
> On 2015-02-25 03:26, Tetsuya Mukawa wrote:
> > Hi,
> >
> > I cannot compile l2fwd-jobstats using master branch.
> > Here is log
> >
> > $ T=x86_64-native-linuxapp-gcc make examples
> > == Build examples for x86_64-native-linuxapp-gcc
> > == bond
> > == cmdline
> > == distributor
> > == exception_path
> > == helloworld
> > == ip_pipeline
> > == ip_reassembly
> > == ipv4_multicast
> > == kni
> > == l2fwd
> > == l2fwd-jobstats
> > make: *** l2fwd-jobstats: No such file or directory.  Stop.
> > make[2]: *** [l2fwd-jobstats] Error 2
> > make[1]: *** [x86_64-native-linuxapp-gcc_examples] Error 2
> > make: *** [examples] Error 2
> >
> >
> > As a result of bisecting, it seems after applying below commit, this
> > error can be seen.
> >
> > commit 2caeb8c0141dcf488f2d68aa8e8c44d1f85ed28b
> > Author: Pawel Wodkowski 
> > Date:   Tue Feb 24 17:33:24 2015 +0100
> >
> >  examples/l2fwd-jobstats: new example
> >
> >
> > Thanks,
> > Tetsuya
> >
> 
> Looking at the git log, two files are missing there:
> 
>   examples/l2fwd-jobstats/Makefile
>   examples/l2fwd-jobstats/main.c
> 
> from patch http://dpdk.org/ml/archives/dev/2015-February/014107.html

Yes, it explains why it works on my machine...
I forgot to add them after fixing the merge.
It's fixed now. Sorry for the inconvenience.




[dpdk-dev] : ixgbe: why bulk allocation is not used for a scattered Rx flow?

2015-02-25 Thread Vlad Zolotarov
Hi, I have a question about the "scattered Rx" feature: why does enabling it 
disable the "bulk allocation" feature?
There is a somewhat unclear comment in ixgbe_recv_scattered_pkts():

/*
 * Descriptor done.
 *
 * Allocate a new mbuf to replenish the RX ring descriptor.
 * If the allocation fails:
 *- arrange for that RX descriptor to be the first one
 *  being parsed the next time the receive function is
 *  invoked [on the same queue].
 *
 *- Stop parsing the RX ring and return immediately.
 *
 * This policy does not drop the packet received in the RX
 * descriptor for which the allocation of a new mbuf failed.
 * Thus, it allows that packet to be later retrieved if
 * mbuf have been freed in the mean time.
 * As a side effect, holding RX descriptors instead of
 * systematically giving them back to the NIC may lead to
 * RX ring exhaustion situations.
 * However, the NIC can gracefully prevent such situations
 * to happen by sending specific "back-pressure" flow control
 * frames to its peer(s).
 */

Why can't the same "policy" be applied in the bulk allocation context, i.e. 
don't advance the RDT until you've refilled the ring? What am I missing here?

Another question is about the LRO feature - is there a reason why it's 
not implemented? I've implemented the LRO support in ixgbe PMD to begin 
with - I used a "scattered Rx" as a template and now I'm tuning it 
(things like the stuff above).

Is there any philosophical reason why it hasn't been implemented in 
*any* PMD so far? ;)

thanks,
vlad


[dpdk-dev] [PATCH v2 0/3] timer: fix rte_timer_reset

2015-02-25 Thread Thomas Monjalon
> > Changes in v2:
> > - split into multiple patches
> > - minor coding-style changes
> >
> > Robert Sanford (3):
> >timer: fix return value of rte_timer_reset(),
> >  insert rte_pause() into rte_timer_reset_sync() wait-loop
> >app/test: fix timer stress test to succeed on multiple runs,
> >  display number of times that rte_timer_reset() fails
> >  (expected) due to races with other cores
> 
> Series:
> Acked-by: Olivier Matz 

Applied, thanks

Robert, as you know rte_timer well and are working on it,
maybe you are interested in becoming its maintainer?


[dpdk-dev] [PATCH v1] afpacket: fix critical issue reported by klocwork

2015-02-25 Thread Liang, Cunming


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, February 25, 2015 4:46 PM
> To: Liang, Cunming
> Cc: John W. Linville; dev at dpdk.org; John Linville
> Subject: Re: [dpdk-dev] [PATCH v1] afpacket: fix critical issue reported by
> klocwork
> 
> 2015-02-25 00:57, Liang, Cunming:
> > From: John W. Linville [mailto:linville at tuxdriver.com]
> > > On Fri, Feb 20, 2015 at 11:19:59AM +0100, Thomas Monjalon wrote:
> > > > 2015-02-12 17:08, Cunming Liang:
> > > > > --- a/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > > > +++ b/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > > > @@ -439,13 +439,15 @@ rte_pmd_init_internals(const char *name,
> > > > >   size_t ifnamelen;
> > > > >   unsigned k_idx;
> > > > >   struct sockaddr_ll sockaddr;
> > > > > - struct tpacket_req *req;
> > > > > + struct tpacket_req *req = NULL;
> > > >
> > > > If *internals is set to NULL, there should be no case where req used
> > > > and undefined.
> >
> > [LCM] Agree, so that's why I add '*internals = NULL' below as well.
> > >
> > > I agree -- it looks to me like req is protected by checking for
> > > *internals == NULL.  I don't think this patch is necessary.
> >
> > [LCM] The major piece of the patch is add setting for '*internals=NULL;'.
> 
> Yes understood, but it is already initialized to NULL before calling
> rte_pmd_init_internals():
> http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_af_packet/rte_eth_af_packet.c#n706
[LCM] I see; the complaint comes from klocwork.
So either adding 'internals=NULL' or adding a comment would help to avoid 
checking this again on the next scan.
What do you think?


[dpdk-dev] dpdk - poll mode - context switches

2015-02-25 Thread Jog Lie
Hello,

I am not sure I understand the mechanism behind DPDK concerning context 
switches.
I have two user space applications that need access to the NIC according to 
incoming port rules (port 80 and port 443).

How can I be sure that DPDK spreads the load to the right application?

Will 2 DPDK instances be needed (one per app), meaning each incoming packet is 
analysed twice to "know" whether it should be forwarded to 
the user space process? That would basically be the same thing as an 
inefficient promiscuous mode.

I don't understand that "filtering" point.

Could you please clarify?

Thanks

-- 
Jog


[dpdk-dev] Manage DPDK port capability via KNI

2015-02-25 Thread Tim Deng
Thanks Danny,
That means DPDK ports have to have a dedicated control path other than KNI.


I originally got confused by the statement at 
http://dpdk.org/doc/guides/prog_guide/kernel_nic_interface.html:
"...Allows management of DPDK ports using standard Linux net tools such as 
ethtool, ifconfig and tcpdump."
Thanks,
Tim


At 2015-02-25 13:05:08, "Zhou, Danny"  wrote:
>You can do it but it will not sync with DPDK. In the current KNI
>implementation, the devices' I/O address spaces are mapped to both userspace
>DPDK and kernelspace KNI, so one can control the NIC device independently
>(using ethtool for KNI and ethdev APIs for DPDK) without synchronization.
>
>In theory, KNI should route all device control requests from ethtool to DPDK.
>But unfortunately, a short path is adopted at the moment because DPDK reused
>a lot of legacy kernel code with a BSD license.
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tim Deng
>> Sent: Wednesday, February 25, 2015 9:57 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] Manage DPDK port capability via KNI
>> 
>> Hi,
>> 
>> 
>> I am wondering how we could manage a DPDK port's offload capabilities,
>> e.g. if we want to disable the TSO capability on a DPDK port, is it feasible
>> to use ethtool to configure a KNI and have the config synced to a
>> DPDK port?
>> 
>> 
>> Thanks,
>> Tim


[dpdk-dev] [PATCH v1] afpacket: fix critical issue reported by klocwork

2015-02-25 Thread Thomas Monjalon
2015-02-25 09:52, Liang, Cunming:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2015-02-25 00:57, Liang, Cunming:
> > > From: John W. Linville [mailto:linville at tuxdriver.com]
> > > > On Fri, Feb 20, 2015 at 11:19:59AM +0100, Thomas Monjalon wrote:
> > > > > 2015-02-12 17:08, Cunming Liang:
> > > > > > --- a/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > > > > +++ b/lib/librte_pmd_af_packet/rte_eth_af_packet.c
> > > > > > @@ -439,13 +439,15 @@ rte_pmd_init_internals(const char *name,
> > > > > > size_t ifnamelen;
> > > > > > unsigned k_idx;
> > > > > > struct sockaddr_ll sockaddr;
> > > > > > -   struct tpacket_req *req;
> > > > > > +   struct tpacket_req *req = NULL;
> > > > >
> > > > > If *internals is set to NULL, there should be no case where req used
> > > > > and undefined.
> > >
> > > [LCM] Agree, so that's why I add '*internals = NULL' below as well.
> > > >
> > > > I agree -- it looks to me like req is protected by checking for
> > > > *internals == NULL.  I don't think this patch is necessary.
> > >
> > > [LCM] The major piece of the patch is add setting for '*internals=NULL;'.
> > 
> > Yes understood, but it is already initialized to NULL before calling
> > rte_pmd_init_internals():
> > http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_af_packet/rte_eth_af_packet.c#n706
> [LCM] I see; the complaint comes from klocwork.
> So either adding 'internals=NULL' or adding a comment would help to avoid
> checking this again on the next scan.
> What do you think?

No, we don't have to pollute the code for a tool.
You should check how to disable this false positive in your tool.



[dpdk-dev] [PATCH v4 4/7] move rte_eth_dev_check_mq_mode() logic to driver

2015-02-25 Thread Pawel Wodkowski
On 2015-02-25 07:14, Ouyang, Changchun wrote:
>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pawel Wodkowski
>> Sent: Thursday, February 19, 2015 11:55 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH v4 4/7] move rte_eth_dev_check_mq_mode()
>> logic to driver
>>
>> Function rte_eth_dev_check_mq_mode() is driver specific. It should be
>> done in PF configuration phase. This patch move igb/ixgbe driver specific mq
>> check and SRIOV configuration code to driver part. Also rewriting log
>> messages to be shorter and more descriptive.
>>
>> Signed-off-by: Pawel Wodkowski 
>> ---
>>   lib/librte_ether/rte_ethdev.c   | 197 ---
>>   lib/librte_pmd_e1000/igb_ethdev.c   |  43 
>>   lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 105 ++-
>>   lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   5 +-
>>   lib/librte_pmd_ixgbe/ixgbe_pf.c | 202 +++-
>>   5 files changed, 327 insertions(+), 225 deletions(-)
>>
>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>> index 4007054..aa27e39 100644
>> --- a/lib/librte_ether/rte_ethdev.c
>> +++ b/lib/librte_ether/rte_ethdev.c
>> @@ -502,195 +502,6 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev
>> *dev, uint16_t nb_queues)
>>  return (0);
>>   }
>>
>> -static int
>> -rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
>> -{
>> -struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>> -switch (nb_rx_q) {
>> -case 1:
>> -case 2:
>> -RTE_ETH_DEV_SRIOV(dev).active =
>> -ETH_64_POOLS;
>> -break;
>> -case 4:
>> -RTE_ETH_DEV_SRIOV(dev).active =
>> -ETH_32_POOLS;
>> -break;
>> -default:
>> -return -EINVAL;
>> -}
>> -
>> -RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool = nb_rx_q;
>> -RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
>> -dev->pci_dev->max_vfs * nb_rx_q;
>> -
>> -return 0;
>> -}
>> -
>> -static int
>> -rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t
>> nb_tx_q,
>> -  const struct rte_eth_conf *dev_conf)
>> -{
>> -struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>> -
>> -if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
>> -/* check multi-queue mode */
>> -if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
>> -(dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB_RSS)
>> ||
>> -(dev_conf->txmode.mq_mode == ETH_MQ_TX_DCB)) {
>> -/* SRIOV only works in VMDq enable mode */
>> -PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
>> -" SRIOV active, "
>> -"wrong VMDQ mq_mode rx %u
>> tx %u\n",
>> -port_id,
>> -dev_conf->rxmode.mq_mode,
>> -dev_conf->txmode.mq_mode);
>> -return (-EINVAL);
>> -}
>> -
>> -switch (dev_conf->rxmode.mq_mode) {
>> -case ETH_MQ_RX_VMDQ_DCB:
>> -case ETH_MQ_RX_VMDQ_DCB_RSS:
>> -/* DCB/RSS VMDQ in SRIOV mode, not implement
>> yet */
>> -PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
>> -" SRIOV active, "
>> -"unsupported VMDQ mq_mode
>> rx %u\n",
>> -port_id, dev_conf-
>>> rxmode.mq_mode);
>> -return (-EINVAL);
>> -case ETH_MQ_RX_RSS:
>> -PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
>> -" SRIOV active, "
>> -"Rx mq mode is changed from:"
>> -"mq_mode %u into VMDQ
>> mq_mode %u\n",
>> -port_id,
>> -dev_conf->rxmode.mq_mode,
>> -dev->data-
>>> dev_conf.rxmode.mq_mode);
>> -case ETH_MQ_RX_VMDQ_RSS:
>> -dev->data->dev_conf.rxmode.mq_mode =
>> ETH_MQ_RX_VMDQ_RSS;
>> -if (nb_rx_q <=
>> RTE_ETH_DEV_SRIOV(dev).nb_rx_q_per_pool)
>> -if
>> (rte_eth_dev_check_vf_rss_rxq_num(port_id, nb_rx_q) != 0) {
>> -PMD_DEBUG_TRACE("ethdev
>> port_id=%d"
>> -" SRIOV active, invalid queue"
>> -" number for VMDQ RSS,
>> allowed"
>> -" value are 1, 2 or 4\n",
>> -port_id);
>> -return -EINVAL;
>> -}
>> -break;
>> -default: /* ETH_MQ_RX_VM

[dpdk-dev] [PATCH v5 5/6] eal: add per rx queue interrupt handling based on VFIO

2015-02-25 Thread David Marchand
Hello Danny,

On Wed, Feb 25, 2015 at 7:58 AM, Zhou, Danny  wrote:

>
> +int
> +rte_intr_wait_rx_pkt(struct rte_intr_handle *intr_handle, uint8_t queue_id)
> +{
> +   struct epoll_event ev;
> +   unsigned numfds = 0;
> +
> +   if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
> +   return -1;
> +   if (queue_id >= VFIO_MAX_QUEUE_ID)
> +   return -1;
> +
> +   /* create epoll fd */
> +   int pfd = epoll_create(1);
> +   if (pfd < 0) {
> +   RTE_LOG(ERR, EAL, "Cannot create epoll instance\n");
> +   return -1;
> +   }
>
>
>
> Why recreate the epoll instance at each call to this function ?
>
>
>
> DZ: To avoid recreating the epoll instance for each queue, the struct
> rte_intr_handle (or a new structure added to ethdev) should be extended
> with fields storing a per-queue pfd. This would avoid the user/kernel
> context switch overhead of calling epoll_create() each time.
>
>
>
> Sounds good?
>

You don't need an epfd per queue. And hardcoding epfd == eventfd will give a
not very usable API.

Plus, epoll is something linux-specific, so you can't move it out of
eal/linux.
I suppose you need an abstraction here (and in the future we could add
something for bsd ?).



>
> Looking at this patchset, I think there is a design issue.
>
> eal does not need to know about portid or queueid.
>
>
>
> eal can provide an api to retrieve the interrupt fds, configure an epoll
> instance, wait on an epoll instance etc...
>
> ethdev is then responsible to setup the mapping between port id / queue id
> and interrupt fds by asking the eal about those fds.
>
>
>
> This would result in an eal api even simpler and we could add other fds in
> a single epoll fd for other uses.
>
>
>
> DZ: The queueid is just an index into the queue-related eventfd array stored
> in EAL. If this array stays in the EAL and ethdev can request it and set up
> the mapping for a certain queue, there might be an issue in the
> multi-process use case, where the fd resources allocated for a secondary
> process are not freed if the secondary process exits unexpectedly.
>

Not sure I follow you.
If a secondary process exits, the eventfds created in primary process
should still be valid and reusable.
Why would you need to free them ? Something to do with vfio ?



>
> Probably we can set up the eventfd array inside ethdev, and we just need an
> EAL API to wait on an ethdev fd. So the application invokes an ethdev API
> with portid and queueid, and ethdev calls the EAL API to wait on an ethdev
> fd which corresponds to the specified portid and queueid.
>
>
>
> Sounds ok to you?
>

eventfd creation cannot be handled by ethdev, since it needs
infrastructure and information from within eal/linux.
Again, do we need an abstraction ?

ethdev must be the one that does the mappings between port/queue and
eventfds (or any object that represents a way to wake up for a given
port/queue).
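
To make the split concrete, here is a minimal Linux-only sketch of the EAL side: one epoll instance shared by several per-queue eventfds, with the queue index carried in the epoll user data. The names sketch_epoll_setup, sketch_epoll_wait_queue and SKETCH_NB_QUEUES are hypothetical and are not part of any DPDK API; a portable design would still need the abstraction mentioned above.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

#define SKETCH_NB_QUEUES 4	/* illustrative queue count, not a DPDK constant */

/*
 * Create one eventfd per queue and register all of them in a single
 * epoll instance. The queue index travels in the epoll user data, so
 * the layer above can map a wake-up back to its port/queue.
 * Returns the epoll fd, or -1 on failure.
 */
static int
sketch_epoll_setup(int efd[SKETCH_NB_QUEUES])
{
	uint32_t q;
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;

	for (q = 0; q < SKETCH_NB_QUEUES; q++) {
		struct epoll_event ev = { .events = EPOLLIN, .data.u32 = q };

		efd[q] = eventfd(0, EFD_NONBLOCK);
		if (efd[q] < 0 ||
		    epoll_ctl(epfd, EPOLL_CTL_ADD, efd[q], &ev) < 0) {
			/* undo everything created so far */
			if (efd[q] >= 0)
				close(efd[q]);
			while (q-- > 0)
				close(efd[q]);
			close(epfd);
			return -1;
		}
	}
	return epfd;
}

/*
 * Wait until one queue signals. Returns the queue index,
 * or -1 on timeout or error.
 */
static int
sketch_epoll_wait_queue(int epfd, const int efd[SKETCH_NB_QUEUES],
			int timeout_ms)
{
	struct epoll_event ev;
	uint64_t counter;

	if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
		return -1;

	/* reading resets the eventfd counter so the fd stops being readable */
	if (read(efd[ev.data.u32], &counter, sizeof(counter)) !=
	    (ssize_t)sizeof(counter))
		return -1;

	return (int)ev.data.u32;
}
```

In this sketch nothing below the caller knows about port ids: the caller (ethdev, in the design discussed here) would own the table that turns the returned index into a port/queue pair.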


-- 
David Marchand


[dpdk-dev] KNI as kernel vHost backend failing

2015-02-25 Thread Xie, Huawei
On 1/1/2015 5:02 PM, sai kiran wrote:
> Hi,
>
>
>
> We are trying to experiment with DPDK's KNI application, with KNI working
> as Kernel vHost backend.
>
>
> 1.  After starting the KNI application, KNI application has detected link
> up.
>
>
> *[root at localhost kni]# ./build/app/kni -c 0xf0 -n 4 -- -p 0x3 -P
> --config="(0,4,6),(1,5,7)"*
>
>
> APP: Initialising port 0 ...
>
> KNI: pci: 10:00:01   8086:10fb
>
> APP: Initialising port 1 ...
>
> PMD: To improve 1G driver performance, consider setting the TX WTHRESH
> value to 4, 8, or 16.
>
> KNI: pci: 16:00:01   8086:10e7
>
> Checking link status
>
> .done
>
> Port 0 Link Up - speed 1 Mbps - full-duplex
>
> Port 1 Link Up - speed 1000 Mbps - full-duplex
>
> APP: Lcore 5 is reading from port 1
>
> APP: Lcore 7 is writing to port 1
>
> APP: Lcore 6 is writing to port 0
>
> APP: Lcore 4 is reading from port 0
>
>
> 2. As mentioned in Programming guide, *sock_en* variable in sysfs is
> enabled and a fd is generated
>
> [root at localhost dpdk-1.7.1]# cat /sys/class/net/vEth0/sock_en
> 1
> [root at localhost dpdk-1.7.1]# cat /sys/class/net/vEth1/sock_en
> 1
> [root at localhost dpdk-1.7.1]# cat /sys/class/net/vEth0/sock_fd
> 11
> [root at localhost dpdk-1.7.1]# cat /sys/class/net/vEth1/sock_fd
> 12
>
> 3. But when a VM is launched with this file-descriptor as the
> vhost-backend, the qemu-kvm is throwing an ioctl-failure error. This
> ioctl
> is making the vhost-backend fallback to virtio-userspace.
>
>
>
> [root at localhost qemu-kvm-1.2.0]# /usr/bin/qemu-kvm -m 2048 -enable-kvm
> -cpu
> host -smp 2 -name VSK1 -drive file=/root/SAI/NSVPX-KVM-11.0-28.1_nc.raw
> -netdev tap,fd=12,id=mynet_kni,vhost=on -device
> virtio-net-pci,netdev=mynet_kni,bus=pci.0,addr=0x4,ioeventfd=on
>
> qemu-kvm: -netdev tap,fd=12,id=mynet_kni,vhost=on: TUNGETIFF ioctl()
> failed: Bad file descriptor
>
> TUNSETOFFLOAD ioctl() failed: Bad file descriptor
>
> qemu-kvm: unable to start vhost net: 88: falling back on userspace virtio
>
> qemu-kvm: unable to start vhost net: 88: falling back on userspace virtio
>
> qemu-kvm: unable to start vhost net: 88: falling back on userspace virtio
>
> With this failure, the traffic from VM is not flowing through KNI
> interface.
>
>
>
> The above mentioned ioctl failure does NOT happen consistently. During
> the
> instances when failure is not seen, traffic flows successfully through
> the
> KNI interfaces.
>
>
>
> Can someone please shed some light as to what is happening in this case.
> Are we missing something here? Is there a known issue?
>
>
Hi Kiran:
Would it be possible for you to switch to user space vhost?
>
> Thanks,
>
> Kiran
>
>
>



[dpdk-dev] [PATCH v2 0/4] DPDK memcpy optimization

2015-02-25 Thread Thomas Monjalon
> > This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
> > It also extends memcpy test coverage with unaligned cases and more test
> > points.
> > 
> > Optimization techniques are summarized below:
> > 
> > 1. Utilize full cache bandwidth
> > 
> > 2. Enforce aligned stores
> > 
> > 3. Apply load address alignment based on architecture features
> > 
> > 4. Make load/store address available as early as possible
> > 
> > 5. General optimization techniques like inlining, branch reducing, prefetch
> > pattern access
> > 
> > --
> > Changes in v2:
> > 
> > 1. Reduced constant test cases in app/test/test_memcpy_perf.c for fast
> > build
> > 
> > 2. Modified macro definition for better code readability & safety
> > 
> > Zhihong Wang (4):
> >   app/test: Disabled VTA for memcpy test in app/test/Makefile
> >   app/test: Removed unnecessary test cases in app/test/test_memcpy.c
> >   app/test: Extended test coverage in app/test/test_memcpy_perf.c
> >   lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE
> > and AVX platforms
> 
> Acked-by: Pablo de Lara 

Applied, thanks for the great work!

Note: we are still looking for a maintainer of x86 EAL.



[dpdk-dev] [PATCH v2 0/3] timer: fix rte_timer_reset

2015-02-25 Thread Robert Sanford
Hi Thomas,

Yes, I'm interested in becoming a maintainer of rte_timer. What are the
responsibilities?


One question about lib rte_timer that's been troubling me for a while: How
are skip lists better than BSD-style timer wheels?

--
Regards,
Robert


On Wed, Feb 25, 2015 at 4:46 AM, Thomas Monjalon 
wrote:

> > > Changes in v2:
> > > - split into multiple patches
> > > - minor coding-style changes
> > >
> > > Robert Sanford (3):
> > >timer: fix return value of rte_timer_reset(),
> > >  insert rte_pause() into rte_timer_reset_sync() wait-loop
> > >app/test: fix timer stress test to succeed on multiple runs,
> > >  display number of times that rte_timer_reset() fails
> > >  (expected) due to races with other cores
> >
> > Series:
> > Acked-by: Olivier Matz 
>
> Applied, thanks
>
> Robert, as you well know rte_timer and you work on it,
> maybe you are interested in becoming maintainer?
>


[dpdk-dev] : ixgbe: why bulk allocation is not used for a scattered Rx flow?

2015-02-25 Thread Bruce Richardson
On Wed, Feb 25, 2015 at 11:40:36AM +0200, Vlad Zolotarov wrote:
> Hi, I have a question about the "scattered Rx" feature: why enabling it
> disabled "bulk allocation" feature?

The "bulk-allocation" feature is one where a more optimized RX code path is
used. For the sake of performance, when doing that code path, certain 
assumptions
were made, one of which was that packets would fit inside a single mbuf. Not
having this assumption makes the receiving of packets much more complicated and
therefore slower. [For similar reasons, the optimized TX routines e.g. vector
TX, are only used if it is guaranteed that no hardware offload features are
going to be used].

Now, it is possible, though challenging, to write optimized code for these more
complicated cases, such as scattered RX, or TX with offloads or scattered 
packets.
In general, we will always want separate routines for the simple case and the
complicated cases, as the performance hit of checking for the offloads, or
multi-mbuf packets will be significant enough to hit our performance badly when
they are not needed. In the case of the vector PMD for ixgbe - our highest
performance path right now - we have indeed two receive routines, for simple
and scattered cases. For TX, we only have an optimized path for the simple case,
but that is not to say that someone may not provide one for the offload case
at some point.

A final note on scattered packets in particular: if packets are too big to fit
in a single mbuf, then they are not small packets, and the processing time per
packet available is, by definition, larger than for packets that fit in a 
single mbuf. For 64-byte packets, the packet inter-arrival time is 67ns @ 10G, or
approx 200 cycles at 3GHz. If we assume a standard 2k mbuf, then a packet which
spans two mbufs takes at least 1654ns on the wire, and therefore a 3GHz CPU has
nearly 5000 cycles to process that same packet. Since the processing budget is
so much bigger, the need to optimize is much less. It is therefore more important
to focus on the small packet case, which is what we have done.
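
The arithmetic behind those numbers can be sketched as below. It assumes the figures count 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap), which is an assumption on my part but reproduces both 67ns and 1654ns. The helper names wire_time_ns and cycle_budget are made up for illustration.

```c
#include <assert.h>

/* Preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12) */
#define ETH_WIRE_OVERHEAD_BYTES 20

/* Time one frame occupies the link, in nanoseconds. */
static double
wire_time_ns(unsigned int frame_bytes, double link_gbps)
{
	assert(link_gbps > 0.0);
	/* 1 Gbit/s is exactly 1 bit per nanosecond */
	return (frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8.0 / link_gbps;
}

/* CPU cycles available per frame when frames arrive back to back. */
static double
cycle_budget(unsigned int frame_bytes, double link_gbps, double cpu_ghz)
{
	return wire_time_ns(frame_bytes, link_gbps) * cpu_ghz;
}
```

With these definitions, wire_time_ns(64, 10.0) is 67.2 and wire_time_ns(2048, 10.0) is 1654.4, giving budgets of about 202 and 4963 cycles at 3GHz.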

> There is some unclear comment in the ixgbe_recv_scattered_pkts():
> 
>   /*
>* Descriptor done.
>*
>* Allocate a new mbuf to replenish the RX ring descriptor.
>* If the allocation fails:
>*- arrange for that RX descriptor to be the first one
>*  being parsed the next time the receive function is
>*  invoked [on the same queue].
>*
>*- Stop parsing the RX ring and return immediately.
>*
>* This policy does not drop the packet received in the RX
>* descriptor for which the allocation of a new mbuf failed.
>* Thus, it allows that packet to be later retrieved if
>* mbuf have been freed in the mean time.
>* As a side effect, holding RX descriptors instead of
>* systematically giving them back to the NIC may lead to
>* RX ring exhaustion situations.
>* However, the NIC can gracefully prevent such situations
>* to happen by sending specific "back-pressure" flow control
>* frames to its peer(s).
>*/
> 
> Why the same "policy" can't be done in the bulk-context allocation? - Don't
> advance the RDT until u've refilled the ring. What do I miss here?

A lot of the optimizations done in other code paths, such as bulk alloc, may 
well
be applicable here, it's just that the work has not been done yet, as the focus
is elsewhere. For vector PMD RX, we have now routines that work on both regular
and scattered packets, and both perform much better than the scalar equivalents.
Also to note that in every RX (and TX) routine, the NIC tail pointer update is
always done just once at the end of the function. 

> 
> Another question is about the LRO feature - is there a reason why it's not
> implemented? I've implemented the LRO support in ixgbe PMD to begin with - I
> used a "scattered Rx" as a template and now I'm tuning it (things like the
> stuff above).
> 
> Is there any philosophical reason why it hasn't been implemented in *any*
> PMD so far? ;)

I'm not aware of any philosophical reasons why it hasn't been done. Patches
are welcome, as always. :-)

/Bruce



[dpdk-dev] dpdk - poll mode - context switches

2015-02-25 Thread Bruce Richardson
On Wed, Feb 25, 2015 at 10:54:51AM +0100, Jog Lie wrote:
> Hello,
> 
> I am not sure to understand the mechanism behind dpdk concerning the context 
> switches.
> I have two user space applications that need access to the NIC according to 
> incoming port rules (port 80 and port 443).
> 
> How to be sure that DPDK spreads the load to the right application ? 
> 
> Will 2 dpdk instances be needed (one per app) -> two incoming packets 
> analysis to "know" if the packet should be forwarded to 
> the user space process ? Which would basically be the same thing as 
> inefficient promiscuous mode.
> 
> i don't understand that "filtering" point.
> 
> Could you please clarify ?
> 
> Thanks
> 
> -- 
> Jog

Hi Jog,

The missing link in connecting applications which receive packets from port
80/443 and DPDK itself is the TCP/IP stack in use. DPDK itself does not include
any stack, so you'll need to select a stack to use with your applications. The
mechanics of how apps talk to ports and how traffic gets filtered to them is
largely the stack's responsibility.

/Bruce


[dpdk-dev] [PATCH v2 0/3] timer: fix rte_timer_reset

2015-02-25 Thread Thomas Monjalon
2015-02-25 06:02, Robert Sanford:
> Hi Thomas,
> 
> Yes, I'm interested in becoming a maintainer of rte_timer. What are the
> responsibilities?

It means we know someone who can answer our questions about rte_timer.
Having your email in the MAINTAINERS file helps us to CC you.
We also expect the maintainer to try to review patches for their area,
though reviews may be done by someone else.
In general, a technical review from the maintainer is more trusted.

If you are still interested, please drop a patch like this one:
http://dpdk.org/browse/dpdk/commit/?id=a7d7ece480093



[dpdk-dev] [PATCH v2 0/3] timer: fix rte_timer_reset

2015-02-25 Thread Bruce Richardson
On Wed, Feb 25, 2015 at 06:02:24AM -0500, Robert Sanford wrote:
> Hi Thomas,
> 
> Yes, I'm interested in becoming a maintainer of rte_timer. What are the
> responsibilities?
> 
> 
> One question about lib rte_timer that's been troubling me for a while: How
> are skip lists better than BSD-style timer wheels?
> 
> --

The skip list may not be any better than a timer wheel - it's just what is used
now, and it does give pretty good performance (insert O(log n) [up to a few 
million timers per core], expiry O(1)).
Originally in DPDK, the timers were maintained in a regular sorted linked list,
but that suffered from scalability issues when timers were started, or stopped
before expiry. The skip-list was therefore a big improvement on that, and gave us
much greater scalability in timers, without any regressions in performance. I
don't know if anyone has tried to implement and benchmark a timer-wheel based
rte_timer library replacement. I'd be interested to see a performance comparison
between the two implementations! :-)
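
For anyone wanting to try such a comparison, the basic shape of a single-level hashed timer wheel is sketched below. This is only an illustration of the technique, not rte_timer code and not the BSD callout implementation; every name in it (timer_wheel, wheel_add, wheel_tick, WHEEL_SLOTS) is hypothetical, and it ignores locking and multi-core concerns entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SLOTS 256	/* power of two, so the slot index is a cheap mask */

struct wheel_timer {
	struct wheel_timer *next;
	uint64_t expiry_tick;		/* absolute tick at which to fire */
	void (*cb)(void *arg);		/* may be NULL in this sketch */
	void *arg;
};

struct timer_wheel {
	struct wheel_timer *slot[WHEEL_SLOTS];
	uint64_t now;			/* current tick */
};

/* O(1) insert: hash the absolute expiry tick onto a slot. */
static void
wheel_add(struct timer_wheel *w, struct wheel_timer *t, uint64_t delay_ticks)
{
	unsigned int idx;

	assert(w != NULL && t != NULL);
	if (delay_ticks == 0)
		delay_ticks = 1;	/* earliest possible expiry is the next tick */
	t->expiry_tick = w->now + delay_ticks;
	idx = (unsigned int)(t->expiry_tick & (WHEEL_SLOTS - 1));
	t->next = w->slot[idx];
	w->slot[idx] = t;
}

/*
 * Advance the wheel by one tick and fire the timers due at that tick.
 * Timers sharing the slot but belonging to a later revolution stay put.
 * Returns the number of timers fired.
 */
static unsigned int
wheel_tick(struct timer_wheel *w)
{
	struct wheel_timer **pp;
	unsigned int fired = 0;

	w->now++;
	pp = &w->slot[w->now & (WHEEL_SLOTS - 1)];
	while (*pp != NULL) {
		struct wheel_timer *t = *pp;

		if (t->expiry_tick == w->now) {
			*pp = t->next;	/* unlink before running the callback */
			t->next = NULL;
			fired++;
			if (t->cb != NULL)
				t->cb(t->arg);
		} else {
			pp = &t->next;
		}
	}
	return fired;
}
```

Insert is constant time, but expiry has to walk every timer that hashes to the current slot, so the cost depends on how timers spread across the slots. That trade-off is what a benchmark against the skip list would need to measure.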

Regards,
/Bruce

> Regards,
> Robert
> 
> 
> On Wed, Feb 25, 2015 at 4:46 AM, Thomas Monjalon  6wind.com>
> wrote:
> 
> > > > Changes in v2:
> > > > - split into multiple patches
> > > > - minor coding-style changes
> > > >
> > > > Robert Sanford (3):
> > > >timer: fix return value of rte_timer_reset(),
> > > >  insert rte_pause() into rte_timer_reset_sync() wait-loop
> > > >app/test: fix timer stress test to succeed on multiple runs,
> > > >  display number of times that rte_timer_reset() fails
> > > >  (expected) due to races with other cores
> > >
> > > Series:
> > > Acked-by: Olivier Matz 
> >
> > Applied, thanks
> >
> > Robert, as you well know rte_timer and you work on it,
> > maybe you are interested in becoming maintainer?
> >


[dpdk-dev] dpdk - poll mode - context switches

2015-02-25 Thread Jog Lie
Hi Bruce,

Ok. understood. 

Thanks !


-- 
Jog


[dpdk-dev] [PATCH v14 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-25 Thread Thomas Monjalon
2015-02-25 13:04, Tetsuya Mukawa:
> --- a/lib/librte_eal/common/eal_common_dev.c
> +++ b/lib/librte_eal/common/eal_common_dev.c
> @@ -32,10 +32,13 @@
>   *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>   */
>  
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
>  
> +#include 
>  #include 
>  #include 

No, you must not include ethdev in EAL.
The ethdev layer is by design on top of EAL.
Maxime already asked why you did it. He was implicitly asking to remove it.
You said that you are calling ethdev_is_detachable() but you should
call a function eal_is_detachable() or something like that.
The detachable state must be only device-related, i.e. in EAL.
The ethdev API is only a wrapper (with port id) in such case.
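
A rough sketch of the layering I have in mind follows. All names here (sketch_device, sketch_eal_dev_is_detachable, sketch_ethdev_is_detachable, sketch_port_to_dev) are invented for illustration and do not exist in DPDK; the point is only the direction of the dependency.

```c
#include <stdbool.h>
#include <stddef.h>

/* --- lower layer ("EAL"): knows devices, knows nothing about ports --- */

struct sketch_device {
	bool detachable;	/* purely device-related state */
};

static bool
sketch_eal_dev_is_detachable(const struct sketch_device *dev)
{
	return dev != NULL && dev->detachable;
}

/* --- upper layer ("ethdev"): maps port ids to devices and delegates --- */

#define SKETCH_MAX_PORTS 4

static struct sketch_device *sketch_port_to_dev[SKETCH_MAX_PORTS];

static bool
sketch_ethdev_is_detachable(unsigned int port_id)
{
	if (port_id >= SKETCH_MAX_PORTS)
		return false;
	/* thin wrapper: translate the port id, then ask the lower layer */
	return sketch_eal_dev_is_detachable(sketch_port_to_dev[port_id]);
}
```

The upper layer calls down into the lower one and never the reverse, so the lower layer needs no ethdev header.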

> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -45,6 +45,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
>  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
>  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
>  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf

By removing ethdev dependency, you can remove this ugly mbuf dependency.

Thanks Tetsuya



[dpdk-dev] [PATCH v4 5/7] pmd ixgbe: enable DCB in SRIOV

2015-02-25 Thread Pawel Wodkowski
On 2015-02-25 04:36, Ouyang, Changchun wrote:
>> @@ -652,7 +655,9 @@ ixgbe_get_vf_queues(struct rte_eth_dev *dev,
>> >uint32_t vf, uint32_t *msgbuf)  {
>> >struct ixgbe_vf_info *vfinfo =
>> >*IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data-
>>> > >dev_private);
>> >-   uint32_t default_q = vf *
>> >RTE_ETH_DEV_SRIOV(dev).nb_tx_q_per_pool;
>> >+   struct ixgbe_dcb_config *dcbinfo =
>> >+   IXGBE_DEV_PRIVATE_TO_DCB_CFG(dev->data-
>>> > >dev_private);
>> >+   uint32_t default_q = RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx;
> Why need change the default_q here?
>

Because this field holds default queue index.

-- 
Pawel


[dpdk-dev] Vhost-user early adopter feedback

2015-02-25 Thread Benoît Canet
On Wednesday 25 Feb 2015 at 07:46:56 (+), Xie, Huawei wrote:
> On 2/18/2015 3:59 PM, Benoît Canet wrote:
> > Hello Xie,
> >
> > As promised, I integrated your vhost-user patchset from January in my
> > vswitch.
> >
> > I just tried it, it works pretty well.
> >
> > I just had a minor bug with rte_vhost_driver_register taking ownership of
> > the string path pointer too late. I freed it out of habit just after
> > registering in the caller, and by the time ifname[IFNAMESIZ] was written
> > the pointer was being used for a new string I allocated later. Maybe an
> > early strdup() would fix this.
> Thanks.
> Do you mean we duplicate a string from the first parameter path, like
> vserver->path = strdup(path) ?

Yes I was thinking about this.
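
Something along these lines is what I mean, as a sketch only. The struct and function names below (sketch_vserver, sketch_driver_register, sketch_driver_unregister) are placeholders and not the actual vhost library code; the one idea taken from this thread is copying the path with strdup() at registration time.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sketch_vserver {
	char *path;	/* private copy, owned by the server object */
};

/*
 * Take a private copy of the path immediately, so the caller is free
 * to release or reuse its own buffer as soon as this returns.
 */
static struct sketch_vserver *
sketch_driver_register(const char *path)
{
	struct sketch_vserver *vs;

	assert(path != NULL);
	vs = calloc(1, sizeof(*vs));
	if (vs == NULL)
		return NULL;

	vs->path = strdup(path);
	if (vs->path == NULL) {
		free(vs);
		return NULL;
	}
	return vs;
}

static void
sketch_driver_unregister(struct sketch_vserver *vs)
{
	if (vs == NULL)
		return;
	free(vs->path);
	free(vs);
}
```

With the copy made up front, freeing the original string right after registering (as I did) would be harmless.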

Best regards

Benoît

> If this was the case, it was ever in my mind. We would do this if
> necessary.

> >
> > The last patch of your new version is really a great idea since it will
> > simplify a lot the socket creation and management code.
> >
> > Best regards
> >
> > Benoît
> >
> >
> 


[dpdk-dev] [PATCH v2] app/test: add crc32 algorithms equivalence check

2015-02-25 Thread Bruce Richardson
On Wed, Feb 25, 2015 at 10:08:32AM +0600, Yerden Zhumabekov wrote:
> New function test_crc32_hash_alg_equiv() checks whether software,
> 4-byte operand and 8-byte operand versions of CRC32 hash function
> implementations return the same result value.
> 
> Signed-off-by: Yerden Zhumabekov 

Two small notes below for improving output on error.

Acked-by: Bruce Richardson 

> ---
>  app/test/test_hash.c |   63 
> ++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/app/test/test_hash.c b/app/test/test_hash.c
> index 76b1b8f..3e94af1 100644
> --- a/app/test/test_hash.c
> +++ b/app/test/test_hash.c
> @@ -177,6 +177,66 @@ static struct rte_hash_parameters ut_params = {
>   .socket_id = 0,
>  };
>  
> +#define CRC32_ITERATIONS (1U << 20)
> +#define CRC32_DWORDS (1U << 6)
> +/*
> + * Test if all CRC32 implementations yield the same hash value
> + */
> +static int
> +test_crc32_hash_alg_equiv(void)
> +{
> + uint32_t hash_val;
> + uint32_t init_val;
> + uint64_t data64[CRC32_DWORDS];
> + unsigned i, j;
> + size_t data_len;
> +
> + printf("# CRC32 implementations equivalence test\n");
> + for (i = 0; i < CRC32_ITERATIONS; i++) {
> + /* Randomizing data_len of data set */
> + data_len = (size_t) ((rte_rand() % sizeof(data64)) + 1);
> + init_val = (uint32_t) rte_rand();
> +
> + /* Fill the data set */
> + for (j = 0; j < CRC32_DWORDS; j++)
> + data64[j] = rte_rand();
> +
> + /* Calculate software CRC32 */
> + rte_hash_crc_set_alg(CRC32_SW);
> + hash_val = rte_hash_crc(data64, data_len, init_val);
> +
> + /* Check against 4-byte-operand sse4.2 CRC32 if available */
> + rte_hash_crc_set_alg(CRC32_SSE42);
> + if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
> + printf("Failed checking CRC32_SW against 
> CRC32_SSE42\n");
> + break;
> + }
> +
> + /* Check against 8-byte-operand sse4.2 CRC32 if available */
> + rte_hash_crc_set_alg(CRC32_SSE42_x64);
> + if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
> + printf("Failed checking CRC32_SW against 
> CRC32_SSE42_x64\n");
> + break;
> + }
> + }
> +
> + /* Resetting to best available algorithm */
> + rte_hash_crc_set_alg(CRC32_SSE42_x64);
> +
> + if (i == CRC32_ITERATIONS)
> + return 0;
> +
> + printf("Failed test data (hex):\n");
> +
> + for (j = 0; j < data_len; j++) {
> + printf("%02X", ((uint8_t *)data64)[j]);
Put in a space after each hex byte, otherwise it comes out like:

Failed test data (hex):
AAD292776348010C7A18D3080DB3A300
FD
Test Failed

[I forced a failure by changing a != to == to test it, don't worry, the
hash calculations are fine! :-)]

> + if ((j+1) % 16 == 0 || j == data_len - 1)
> + printf("\n");
> + }
Maybe also print out here, or before the hex digits, the length of the data
that was tested. e.g. "printf("%zu bytes total\n", data_len);" or similar.
> +
> + return -1;
> +}
> +
>  /*
>   * Test a hash function.
>   */
> @@ -1356,6 +1416,9 @@ test_hash(void)
>  
>   run_hash_func_tests();
>  
> + if (test_crc32_hash_alg_equiv() < 0)
> + return -1;
> +
>   return 0;
>  }
>  
> -- 
> 1.7.9.5
> 
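
Taken together, the two output notes above might look like the sketch below. It formats into a caller-supplied buffer rather than calling printf() directly, purely so that the result is easy to check; the function name sketch_hex_dump is made up and is not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write "<len> bytes total", then the data as space-separated hex
 * bytes, 16 per line. Returns the number of characters written,
 * or 0 if the output buffer is too small.
 */
static size_t
sketch_hex_dump(char *out, size_t out_len,
		const uint8_t *data, size_t data_len)
{
	size_t pos, j;
	int n;

	n = snprintf(out, out_len, "%zu bytes total\n", data_len);
	if (n < 0 || (size_t)n >= out_len)
		return 0;
	pos = (size_t)n;

	for (j = 0; j < data_len; j++) {
		/* newline after every 16th byte and after the last byte */
		int eol = ((j + 1) % 16 == 0) || (j == data_len - 1);

		n = snprintf(out + pos, out_len - pos, "%02X%c",
			     (unsigned int)data[j], eol ? '\n' : ' ');
		if (n < 0 || (size_t)n >= out_len - pos)
			return 0;
		pos += (size_t)n;
	}
	return pos;
}
```

For the three bytes AA D2 92 this yields "3 bytes total" followed by the line "AA D2 92".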


[dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst

2015-02-25 Thread Hemant Agrawal
From: Hemant Agrawal 

If any buffer is read from the tx_q, MAX_BURST buffers will be allocated and
attempted to be added to the alloc_q.
This seems terribly inefficient and it also looks like the alloc_q will quickly
fill to its maximum capacity. If the system buffers are low in number, it will
reach an "out of memory" situation.

This patch allocates only as many buffers as were dequeued from tx_q.

Signed-off-by: Hemant Agrawal 
---
 lib/librte_kni/rte_kni.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 4e70fa0..4cf8e30 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -128,7 +128,7 @@ struct rte_kni_memzone_pool {


 static void kni_free_mbufs(struct rte_kni *kni);
-static void kni_allocate_mbufs(struct rte_kni *kni);
+static void kni_allocate_mbufs(struct rte_kni *kni, int num);

 static volatile int kni_fd = -1;
 static struct rte_kni_memzone_pool kni_memzone_pool = {
@@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf 
**mbufs, unsigned num)

/* If buffers removed, allocate mbufs and then put them into alloc_q */
if (ret)
-   kni_allocate_mbufs(kni);
+   kni_allocate_mbufs(kni, ret);

return ret;
 }
@@ -594,7 +594,7 @@ kni_free_mbufs(struct rte_kni *kni)
 }

 static void
-kni_allocate_mbufs(struct rte_kni *kni)
+kni_allocate_mbufs(struct rte_kni *kni, int num)
 {
int i, ret;
struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
@@ -620,7 +620,10 @@ kni_allocate_mbufs(struct rte_kni *kni)
return;
}

-   for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
+   if (num == 0 || num > MAX_MBUF_BURST_NUM)
+   num = MAX_MBUF_BURST_NUM;
+
+   for (i = 0; i < num; i++) {
pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
if (unlikely(pkts[i] == NULL)) {
/* Out of memory */
@@ -636,7 +639,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
ret = kni_fifo_put(kni->alloc_q, (void **)pkts, i);

/* Check if any mbufs not put into alloc_q, and then free them */
-   if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
+   if (ret >= 0 && ret < i && ret < num) {
int j;

for (j = ret; j < i; j++)
-- 
1.9.1



[dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst

2015-02-25 Thread Olivier Deme
Thank you Hemant, I think there might be one issue left with the patch 
though.
The alloc_q must initially be filled with mbufs before getting mbuf back 
on the tx_q.

So the patch should allow rte_kni_rx_burst to check if alloc_q is empty.
If so, it should invoke kni_allocate_mbufs(kni, 0)
(to fill the alloc_q with MAX_MBUF_BURST_NUM mbufs)

The patch for rte_kni_rx_burst would then look like:

@@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct 
rte_mbuf **mbufs, unsigned num)

  /* If buffers removed, allocate mbufs and then put them into 
alloc_q */
  if (ret)
-kni_allocate_mbufs(kni);
+  kni_allocate_mbufs(kni, ret);
+  else if (unlikely(kni->alloc_q->write == kni->alloc_q->read))
+  kni_allocate_mbufs(kni, 0);


Olivier.

On 25/02/15 11:48, Hemant Agrawal wrote:
> From: Hemant Agrawal 
>
> If any buffer is read from the tx_q, MAX_BURST buffers will be allocated and
> attempted to be added to the alloc_q.
> This seems terribly inefficient and it also looks like the alloc_q will
> quickly fill to its maximum capacity. If the system buffers are low in
> number, it will reach an "out of memory" situation.
>
> This patch allocates only as many buffers as were dequeued from tx_q.
>
> Signed-off-by: Hemant Agrawal 
> ---
>   lib/librte_kni/rte_kni.c | 13 -
>   1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> index 4e70fa0..4cf8e30 100644
> --- a/lib/librte_kni/rte_kni.c
> +++ b/lib/librte_kni/rte_kni.c
> @@ -128,7 +128,7 @@ struct rte_kni_memzone_pool {
>   
>   
>   static void kni_free_mbufs(struct rte_kni *kni);
> -static void kni_allocate_mbufs(struct rte_kni *kni);
> +static void kni_allocate_mbufs(struct rte_kni *kni, int num);
>   
>   static volatile int kni_fd = -1;
>   static struct rte_kni_memzone_pool kni_memzone_pool = {
> @@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf 
> **mbufs, unsigned num)
>   
>   /* If buffers removed, allocate mbufs and then put them into alloc_q */
>   if (ret)
> - kni_allocate_mbufs(kni);
> + kni_allocate_mbufs(kni, ret);
>   
>   return ret;
>   }
> @@ -594,7 +594,7 @@ kni_free_mbufs(struct rte_kni *kni)
>   }
>   
>   static void
> -kni_allocate_mbufs(struct rte_kni *kni)
> +kni_allocate_mbufs(struct rte_kni *kni, int num)
>   {
>   int i, ret;
>   struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
> @@ -620,7 +620,10 @@ kni_allocate_mbufs(struct rte_kni *kni)
>   return;
>   }
>   
> - for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
> + if (num == 0 || num > MAX_MBUF_BURST_NUM)
> + num = MAX_MBUF_BURST_NUM;
> +
> + for (i = 0; i < num; i++) {
>   pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
>   if (unlikely(pkts[i] == NULL)) {
>   /* Out of memory */
> @@ -636,7 +639,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
>   ret = kni_fifo_put(kni->alloc_q, (void **)pkts, i);
>   
>   /* Check if any mbufs not put into alloc_q, and then free them */
> - if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
> + if (ret >= 0 && ret < i && ret < num) {
>   int j;
>   
>   for (j = ret; j < i; j++)

-- 
*Olivier Deme*
*Druid Software Ltd.*
*Tel: +353 1 202 1831*
*Email: odeme at druidsoftware.com *
*URL: http://www.druidsoftware.com*
*Hall 7, stand 7F70.*
Druid Software: Monetising enterprise small cells solutions.



[dpdk-dev] [PATCH 1/2] doc: Update GSG for uio_pci_generic use

2015-02-25 Thread Iremonger, Bernard


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Tuesday, February 24, 2015 4:28 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 1/2] doc: Update GSG for uio_pci_generic use
> 
> Since DPDK now has support for the in-tree uio_pci_generic driver, update the 
> GSG document to
> reference this module, and to use it in preference to the igb_uio driver, 
> which is DPDK-specific.
> 
> Signed-off-by: Bruce Richardson 
> ---
>  doc/guides/linux_gsg/build_dpdk.rst| 63 
> +-
>  doc/guides/linux_gsg/build_sample_apps.rst |  5 ++-
>  doc/guides/linux_gsg/enable_func.rst   |  2 +
>  3 files changed, 40 insertions(+), 30 deletions(-)
> 
> diff --git a/doc/guides/linux_gsg/build_dpdk.rst 
> b/doc/guides/linux_gsg/build_dpdk.rst
> index d09c69d..255d6dc 100644
> --- a/doc/guides/linux_gsg/build_dpdk.rst
> +++ b/doc/guides/linux_gsg/build_dpdk.rst
> @@ -133,7 +133,8 @@ use the make config T= command:
> 
>  .. warning::
> 
> -The igb_uio module must be compiled with the same kernel as the one 
> running on the target.
> +Any kernel modules to be used, e.g. igb_uio, kni, must be compiled with 
> the
> +same kernel as the one running on the target.
>  If the DPDK is not being built on the target machine,
>  the RTE_KERNELDIR environment variable should be used to point the 
> compilation at a copy of the
> kernel version to be used on the target machine.
> 
> @@ -154,28 +155,29 @@ Browsing the Installed DPDK Environment Target
> 
>  Once a target is created it contains all libraries and header files for the 
> DPDK environment that are
> required to build customer applications.
>  In addition, the test and testpmd applications are built under the build/app 
> directory, which may be
> used for testing.
> -In the case of Linux, a kmod  directory is also present that contains a 
> module to install:
> +A kmod  directory is also present that contains kernel modules which may be 
> loaded if needed:
> 
>  .. code-block:: console
> 
>  $ ls x86_64-native-linuxapp-gcc
>  app build hostapp include kmod lib Makefile
> 
> -Loading the DPDK igb_uio Module
> 
> +Loading Modules to Enable Userspace IO for DPDK
> +---
> 
> -To run any DPDK application, the igb_uio module can be loaded into the 
> running kernel.
> -The module is found in the kmod sub-directory of the DPDK target directory.
> -This module should be loaded using the insmod command as shown below 
> (assuming that the
> current directory is the DPDK target directory).
> -In many cases, the uio support in the Linux* kernel is compiled as a module 
> rather than as part of the
> kernel, -so it is often necessary to load the uio module first:


Hi Bruce,

Should the information about igb_uio be retained alongside the new information 
about uio_pci_generic?

> +To run any DPDK application, a suitable uio module can be loaded into the 
> running kernel.
> +In most cases, the standard uio_pci_generic module included in the
> +linux kernel can provide the uio capability. This module can be loaded
> +using the command
> 
>  .. code-block:: console
> 
> -sudo modprobe uio
> -sudo insmod kmod/igb_uio.ko

Should the information about igb_uio be retained alongside the new information 
about uio_pci_generic?

> +sudo modprobe uio_pci_generic
> 
> -Since DPDK release 1.7 provides VFIO support, compilation and use of igb_uio 
> module has become
> optional for platforms that support using VFIO.
> +As an alternative to the uio_pci_generic, the DPDK also includes the
> +igb_uio module which can be found in the kmod subdirectory referred to above.
> +
> +Since DPDK release 1.7 onward provides VFIO support, use of UIO is
> +optional for platforms that support using VFIO.
> 
>  Loading VFIO Module
>  ---
> @@ -195,24 +197,29 @@ Also, to use VFIO, both kernel and BIOS must support 
> and be configured to
> use IO  For proper operation of VFIO when running DPDK applications as a 
> non-privileged user, correct
> permissions should also be set up.
>  This can be done by using the DPDK setup script (called setup.sh and located 
> in the tools directory).
> 
> -Binding and Unbinding Network Ports to/from the igb_uioor VFIO Modules
> +Binding and Unbinding Network Ports to/from the Kernel Modules
>  --
> 
>  As of release 1.4, DPDK applications no longer automatically unbind all 
> supported network ports from
> the kernel driver in use.
> -Instead, all ports that are to be used by an DPDK application must be bound 
> to the igb_uio or vfio-pci
> module before the application is run.
> +Instead, all ports that are to be used by an DPDK application must be
> +bound to the uio_pci_generic, igb_uio or vfio-pci module before the 
> application is run.
>  Any network ports under Linux* control will

[dpdk-dev] [PATCH 2/2] doc: update programmers guide for uio_pci_generic

2015-02-25 Thread Iremonger, Bernard


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Tuesday, February 24, 2015 4:28 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 2/2] doc: update programmers guide for 
> uio_pci_generic
> 
> Since DPDK now has support for the in-tree uio_pci_generic driver, update the 
> programmers guide
> document to reference this module, and to use it in preference to the igb_uio 
> driver, which is DPDK-
> specific.
> 
> Signed-off-by: Bruce Richardson 
> ---
>  doc/guides/prog_guide/env_abstraction_layer.rst  | 8 
>  doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst | 6 +++---
>  doc/guides/prog_guide/kernel_nic_interface.rst   | 2 +-
>  doc/guides/prog_guide/poll_mode_drv_emulated_virtio_nic.rst  | 8 
>  doc/guides/prog_guide/poll_mode_drv_paravirtual_vmxnets_nic.rst  | 2 +-
>  5 files changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst
> b/doc/guides/prog_guide/env_abstraction_layer.rst
> index 231e266..b5321c3 100644
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> @@ -66,7 +66,7 @@ EAL in a Linux-userland Execution Environment
>  -
> 
>  In a Linux user space environment, the DPDK application runs as a user-space 
> application using the
> pthread library.
> -PCI information about devices and address space is discovered through the 
> /sys kernel interface and
> through a module called igb_uio.
> +PCI information about devices and address space is discovered through the 
> /sys kernel interface and
> through kernel modules such as uio_pci_generic, or igb_uio.
>  Refer to the UIO: User-space drivers documentation in the Linux kernel. This 
> memory is mmap'd in
> the application.
> 
>  The EAL performs physical memory allocation using mmap() in hugetlbfs (using 
> huge page sizes to
> increase performance).
> @@ -134,10 +134,10 @@ PCI Access
>  ~~
> 
>  The EAL uses the /sys/bus/pci utilities provided by the kernel to scan the 
> content on the PCI bus.
> -
> -To access PCI memory, a kernel module called igb_uio provides a /dev/uioX 
> device file
> +To access PCI memory, a kernel module called uio_pci_generic provides a
> +/dev/uioX device file and resource files in /sys
>  that can be mmap'd to obtain access to PCI address space from the 
> application.
> -It uses the uio kernel feature (userland driver).
> +The DPDK-specific igb_uio module can also be used for this. Both drivers use 
> the uio kernel feature
> (userland driver).
> 
>  Per-lcore and Shared Variables
>  ~~
> diff --git a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> index 1f1e04f..a0dd959 100644
> --- a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> +++ b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> @@ -306,12 +306,12 @@ Building and Running the Switching Backend
>  Refer to the *DPDK Getting Started Guide* for more information on 
> memory management in the
> DPDK.
>  In the above command, 4 GB memory is reserved (2048 of 2 MB pages) 
> for DPDK.
> 
> -#.  Load igb_uio and bind one Intel NIC controller to igb_uio:
> +#.  Load uio_pci_generic and bind one Intel NIC controller to it:
> 
>  .. code-block:: console
> 
> -insmod x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
> -python tools/dpdk_nic_bind.py -b igb_uio :09:00:00.0


Hi Bruce,

Should the information about igb_uio be retained alongside the new information 
about uio_pci_generic?

> +modprobe uio_pci_generic
> +python tools/dpdk_nic_bind.py -b uio_pci_generic
> + :09:00:00.0
> 
>  In this case, :09:00.0 is the PCI address for the NIC controller.
> 
> diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst
> b/doc/guides/prog_guide/kernel_nic_interface.rst
> index 0276019..9ed7688 100644
> --- a/doc/guides/prog_guide/kernel_nic_interface.rst
> +++ b/doc/guides/prog_guide/kernel_nic_interface.rst
> @@ -224,7 +224,7 @@ Otherwise, by default, KNI will not enable its backend 
> support capability.
> 
>  Of course, as a prerequisite, the vhost/vhost-net kernel CONFIG should be 
> chosen before compiling
> the kernel.
> 
> -#.  Compile the DPDK and insert igb_uio as normal.
> +#.  Compile the DPDK and insert uio_pci_generic/igb_uio kernel modules as 
> normal.
> 
>  #.  Insert the KNI kernel module:
> 
> diff --git a/doc/guides/prog_guide/poll_mode_drv_emulated_virtio_nic.rst
> b/doc/guides/prog_guide/poll_mode_drv_emulated_virtio_nic.rst
> index 86f4f60..b0a6250 100644
> --- a/doc/guides/prog_guide/poll_mode_drv_emulated_virtio_nic.rst
> +++ b/doc/guides/prog_guide/poll_mode_drv_emulated_virtio_nic.rst
> @@ -113,7 +113,7 @@ Host2VM communication example
> 
>

[dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst

2015-02-25 Thread hem...@freescale.com
Hi Olivier
 Comments inline.
Regards,
Hemant

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Deme
> Sent: 25/Feb/2015 5:44 PM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst
> 
> Thank you Hemant, I think there might be one issue left with the patch though.
> The alloc_q must initially be filled with mbufs before getting mbuf back on 
> the
> tx_q.
> 
> So the patch should allow rte_kni_rx_burst to check if alloc_q is empty.
> If so, it should invoke kni_allocate_mbufs(kni, 0) (to fill the alloc_q with
> MAX_MBUF_BURST_NUM mbufs)
> 
> The patch for rte_kni_rx_burst would then look like:
> 
> @@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf
> **mbufs, unsigned num)
> 
>   /* If buffers removed, allocate mbufs and then put them into alloc_q */
>   if (ret)
> -kni_allocate_mbufs(kni);
> +  kni_allocate_mbufs(kni, ret);
> +  else if (unlikely(kni->alloc_q->write == kni->alloc_q->read))
> +  kni_allocate_mbufs(kni, 0);
> 
[hemant]  This will introduce a run-time check.

I forgot to include the other change in the patch.
I am doing it in kni_alloc, i.e. initializing the alloc_q with the default
burst size:
kni_allocate_mbufs(ctx, 0);

In a way, we are now suggesting reducing the size of alloc_q to only the
default burst size.

Can we reach a situation where the kernel is adding packets to tx_q faster
than the application is able to dequeue them?
In that case alloc_q can be empty and the kernel will be starving.
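
For readers following the discussion, here is a minimal, self-contained sketch of the two ideas above: treating "write == read" as the empty condition of an index ring, and mapping a requested refill count of 0 (or an oversized one) to a full burst. It is not the rte_kni.c code. The demo_* names, the ring length and the value 32 used for the burst limit are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BURST_MAX 32u  /* stand-in for MAX_MBUF_BURST_NUM; value assumed */
#define DEMO_FIFO_LEN  8u   /* hypothetical ring length, must be a power of two */

/* Hypothetical single-producer/single-consumer index ring. */
struct demo_fifo {
    unsigned write;              /* next slot to fill */
    unsigned read;               /* next slot to drain */
    void *slot[DEMO_FIFO_LEN];
};

/* The ring is empty when both indices coincide. */
static int demo_fifo_empty(const struct demo_fifo *f)
{
    return f->write == f->read;
}

/* Enqueue up to n pointers; returns how many were accepted. */
static unsigned demo_fifo_put(struct demo_fifo *f, void **data, unsigned n)
{
    unsigned i;

    assert(f != NULL && data != NULL);
    for (i = 0; i < n; i++) {
        unsigned next = (f->write + 1) & (DEMO_FIFO_LEN - 1);

        if (next == f->read)     /* full: one slot always stays unused */
            break;
        f->slot[f->write] = data[i];
        f->write = next;
    }
    return i;
}

/* 0 or an oversized request means "allocate a full burst". */
static unsigned demo_refill_count(unsigned num)
{
    if (num == 0 || num > DEMO_BURST_MAX)
        num = DEMO_BURST_MAX;
    return num;
}
```

With this shape, the initial fill and the per-burst top-up can share one code path: the caller passes 0 once at creation time and the number of dequeued buffers afterwards.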

> 
> Olivier.
> 
> On 25/02/15 11:48, Hemant Agrawal wrote:
> > From: Hemant Agrawal 
> >
> > if any buffer is read from the tx_q, MAX_BURST buffers will be allocated and
> attempted to be added to to the alloc_q.
> > This seems terribly inefficient and it also looks like the alloc_q will 
> > quickly fill
> to its maximum capacity. If the system buffers are low in number, it will 
> reach
> "out of memory" situation.
> >
> > This patch allocates the number of buffers as many dequeued from tx_q.
> >
> > Signed-off-by: Hemant Agrawal 
> > ---
> >   lib/librte_kni/rte_kni.c | 13 -
> >   1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c index
> > 4e70fa0..4cf8e30 100644
> > --- a/lib/librte_kni/rte_kni.c
> > +++ b/lib/librte_kni/rte_kni.c
> > @@ -128,7 +128,7 @@ struct rte_kni_memzone_pool {
> >
> >
> >   static void kni_free_mbufs(struct rte_kni *kni); -static void
> > kni_allocate_mbufs(struct rte_kni *kni);
> > +static void kni_allocate_mbufs(struct rte_kni *kni, int num);
> >
> >   static volatile int kni_fd = -1;
> >   static struct rte_kni_memzone_pool kni_memzone_pool = { @@ -575,7
> > +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf
> > **mbufs, unsigned num)
> >
> > /* If buffers removed, allocate mbufs and then put them into alloc_q
> */
> > if (ret)
> > -   kni_allocate_mbufs(kni);
> > +   kni_allocate_mbufs(kni, ret);
> >
> > return ret;
> >   }
> > @@ -594,7 +594,7 @@ kni_free_mbufs(struct rte_kni *kni)
> >   }
> >
> >   static void
> > -kni_allocate_mbufs(struct rte_kni *kni)
> > +kni_allocate_mbufs(struct rte_kni *kni, int num)
> >   {
> > int i, ret;
> > struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM]; @@ -620,7 +620,10
> @@
> > kni_allocate_mbufs(struct rte_kni *kni)
> > return;
> > }
> >
> > -   for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
> > +   if (num == 0 || num > MAX_MBUF_BURST_NUM)
> > +   num = MAX_MBUF_BURST_NUM;
> > +
> > +   for (i = 0; i < num; i++) {
> > pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
> > if (unlikely(pkts[i] == NULL)) {
> > /* Out of memory */
> > @@ -636,7 +639,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
> > ret = kni_fifo_put(kni->alloc_q, (void **)pkts, i);
> >
> > /* Check if any mbufs not put into alloc_q, and then free them */
> > -   if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
> > +   if (ret >= 0 && ret < i && ret < num) {
> > int j;
> >
> > for (j = ret; j < i; j++)
> 
> --
>   *Olivier Deme*
> *Druid Software Ltd.*
> *Tel: +353 1 202 1831*
> *Email: odeme at druidsoftware.com *
> *URL: http://www.druidsoftware.com*
>   *Hall 7, stand 7F70.*
> Druid Software: Monetising enterprise small cells solutions.



[dpdk-dev] [PATCH 1/2] doc: Update GSG for uio_pci_generic use

2015-02-25 Thread Bruce Richardson
On Wed, Feb 25, 2015 at 12:14:15PM +, Iremonger, Bernard wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Tuesday, February 24, 2015 4:28 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 1/2] doc: Update GSG for uio_pci_generic use
> > 
> > Since DPDK now has support for the in-tree uio_pci_generic driver, update 
> > the GSG document to
> > reference this module, and to use it in preference to the igb_uio driver, 
> > which is DPDK-specific.
> > 
> > Signed-off-by: Bruce Richardson 
> > ---
> >  doc/guides/linux_gsg/build_dpdk.rst| 63 
> > +-
> >  doc/guides/linux_gsg/build_sample_apps.rst |  5 ++-
> >  doc/guides/linux_gsg/enable_func.rst   |  2 +
> >  3 files changed, 40 insertions(+), 30 deletions(-)
> > 
> > diff --git a/doc/guides/linux_gsg/build_dpdk.rst 
> > b/doc/guides/linux_gsg/build_dpdk.rst
> > index d09c69d..255d6dc 100644
> > --- a/doc/guides/linux_gsg/build_dpdk.rst
> > +++ b/doc/guides/linux_gsg/build_dpdk.rst
> > @@ -133,7 +133,8 @@ use the make config T= command:
> > 
> >  .. warning::
> > 
> > -The igb_uio module must be compiled with the same kernel as the one 
> > running on the target.
> > +Any kernel modules to be used, e.g. igb_uio, kni, must be compiled 
> > with the
> > +same kernel as the one running on the target.
> >  If the DPDK is not being built on the target machine,
> >  the RTE_KERNELDIR environment variable should be used to point the 
> > compilation at a copy of the
> > kernel version to be used on the target machine.
> > 
> > @@ -154,28 +155,29 @@ Browsing the Installed DPDK Environment Target
> > 
> >  Once a target is created it contains all libraries and header files for 
> > the DPDK environment that are
> > required to build customer applications.
> >  In addition, the test and testpmd applications are built under the 
> > build/app directory, which may be
> > used for testing.
> > -In the case of Linux, a kmod  directory is also present that contains a 
> > module to install:
> > +A kmod  directory is also present that contains kernel modules which may 
> > be loaded if needed:
> > 
> >  .. code-block:: console
> > 
> >  $ ls x86_64-native-linuxapp-gcc
> >  app build hostapp include kmod lib Makefile
> > 
> > -Loading the DPDK igb_uio Module
> > 
> > +Loading Modules to Enable Userspace IO for DPDK
> > +---
> > 
> > -To run any DPDK application, the igb_uio module can be loaded into the 
> > running kernel.
> > -The module is found in the kmod sub-directory of the DPDK target directory.
> > -This module should be loaded using the insmod command as shown below 
> > (assuming that the
> > current directory is the DPDK target directory).
> > -In many cases, the uio support in the Linux* kernel is compiled as a 
> > module rather than as part of the
> > kernel, -so it is often necessary to load the uio module first:
> 
> 
> Hi Bruce,
> 
> Should the information about igb_uio be retained alongside the new 
> information about uio_pci_generic?
>  

This is obviously a matter of opinion, but: "no".
This doc is a Getting Started Guide, and is therefore meant to cover just the
minimum needed to get up and running, leaving out advanced details.
"uio_pci_generic" is the simplest path to getting up and running quickly, and
keeping the mention of igb_uio just adds to the complexity of the documentation.

Since uio_pci_generic also works on most Linux distros, I'd also be tempted to
move the vfio details out of the main GSG body - perhaps to the extra chapter
covering KNI and running as non-root, again with the objective of simplifying
things for the beginner. VFIO and igb_uio are provided for those who want
something extra above what uio_pci_generic provides, e.g. security, or the
ability to create VF devices on all kernels while having the PF in use by DPDK.

Regards,
/Bruce


[dpdk-dev] [PATCH 2/2] doc: update programmers guide for uio_pci_generic

2015-02-25 Thread Bruce Richardson
On Wed, Feb 25, 2015 at 12:19:10PM +, Iremonger, Bernard wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Tuesday, February 24, 2015 4:28 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 2/2] doc: update programmers guide for 
> > uio_pci_generic
> > 
> > Since DPDK now has support for the in-tree uio_pci_generic driver, update 
> > the programmers guide
> > document to reference this module, and to use it in preference to the 
> > igb_uio driver, which is DPDK-
> > specific.
> > 
> > Signed-off-by: Bruce Richardson 
> > ---
> >  doc/guides/prog_guide/env_abstraction_layer.rst  | 8 
> > 
> >  doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst | 6 +++---
> >  doc/guides/prog_guide/kernel_nic_interface.rst   | 2 +-
> >  doc/guides/prog_guide/poll_mode_drv_emulated_virtio_nic.rst  | 8 
> > 
> >  doc/guides/prog_guide/poll_mode_drv_paravirtual_vmxnets_nic.rst  | 2 +-
> >  5 files changed, 13 insertions(+), 13 deletions(-)
> > 
> > diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst
> > b/doc/guides/prog_guide/env_abstraction_layer.rst
> > index 231e266..b5321c3 100644
> > --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> > +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> > @@ -66,7 +66,7 @@ EAL in a Linux-userland Execution Environment
> >  -
> > 
> >  In a Linux user space environment, the DPDK application runs as a 
> > user-space application using the
> > pthread library.
> > -PCI information about devices and address space is discovered through the 
> > /sys kernel interface and
> > through a module called igb_uio.
> > +PCI information about devices and address space is discovered through the 
> > /sys kernel interface and
> > through kernel modules such as uio_pci_generic, or igb_uio.
> >  Refer to the UIO: User-space drivers documentation in the Linux kernel. 
> > This memory is mmap'd in
> > the application.
> > 
> >  The EAL performs physical memory allocation using mmap() in hugetlbfs 
> > (using huge page sizes to
> > increase performance).
> > @@ -134,10 +134,10 @@ PCI Access
> >  ~~
> > 
> >  The EAL uses the /sys/bus/pci utilities provided by the kernel to scan the 
> > content on the PCI bus.
> > -
> > -To access PCI memory, a kernel module called igb_uio provides a /dev/uioX 
> > device file
> > +To access PCI memory, a kernel module called uio_pci_generic provides a
> > +/dev/uioX device file and resource files in /sys
> >  that can be mmap'd to obtain access to PCI address space from the 
> > application.
> > -It uses the uio kernel feature (userland driver).
> > +The DPDK-specific igb_uio module can also be used for this. Both drivers 
> > use the uio kernel feature
> > (userland driver).
> > 
> >  Per-lcore and Shared Variables
> >  ~~
> > diff --git 
> > a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > index 1f1e04f..a0dd959 100644
> > --- a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > +++ b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > @@ -306,12 +306,12 @@ Building and Running the Switching Backend
> >  Refer to the *DPDK Getting Started Guide* for more information on 
> > memory management in the
> > DPDK.
> >  In the above command, 4 GB memory is reserved (2048 of 2 MB pages) 
> > for DPDK.
> > 
> > -#.  Load igb_uio and bind one Intel NIC controller to igb_uio:
> > +#.  Load uio_pci_generic and bind one Intel NIC controller to it:
> > 
> >  .. code-block:: console
> > 
> > -insmod x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
> > -python tools/dpdk_nic_bind.py -b igb_uio :09:00:00.0
> 
> 
> Hi Bruce,
> 
> Should the information about igb_uio be retained alongside the new 
> information about uio_pci_generic?
>
While the answer may not be as clear-cut as with the GSG, why would we bother
covering both here? We already ignore VFIO in these examples.

/Bruce


[dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst

2015-02-25 Thread Olivier Deme
I guess it would be unusual but possible for the kernel to enqueue 
faster to tx_q than the application dequeues.
But that would also be possible with a real NIC, so I think it is 
acceptable for the kernel to have to drop egress packets in that case.


On 25/02/15 12:24, Hemant at freescale.com wrote:
> Hi Olivier
>Comments inline.
> Regards,
> Hemant
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Deme
>> Sent: 25/Feb/2015 5:44 PM
>> To: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst
>>
>> Thank you Hemant, I think there might be one issue left with the patch 
>> though.
>> The alloc_q must initially be filled with mbufs before getting mbuf back on 
>> the
>> tx_q.
>>
>> So the patch should allow rte_kni_rx_burst to check if alloc_q is empty.
>> If so, it should invoke kni_allocate_mbufs(kni, 0) (to fill the alloc_q with
>> MAX_MBUF_BURST_NUM mbufs)
>>
>> The patch for rte_kni_rx_burst would then look like:
>>
>> @@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf
>> **mbufs, unsigned num)
>>
>>/* If buffers removed, allocate mbufs and then put them into alloc_q 
>> */
>>if (ret)
>> -kni_allocate_mbufs(kni);
>> +  kni_allocate_mbufs(kni, ret);
>> +  else if (unlikely(kni->alloc_q->write == kni->alloc_q->read))
>> +  kni_allocate_mbufs(kni, 0);
>>
> [hemant]  This will introduce a run-time check.
>
> I forgot to include the other change in the patch.
> I am doing it in kni_alloc, i.e. initializing the alloc_q with the default
> burst size:
> kni_allocate_mbufs(ctx, 0);
>
> In a way, we are now suggesting reducing the size of alloc_q to only the
> default burst size.
>
> Can we reach a situation where the kernel is adding packets to tx_q faster
> than the application is able to dequeue them?
> In that case alloc_q can be empty and the kernel will be starving.
>
>> Olivier.
>>
>> On 25/02/15 11:48, Hemant Agrawal wrote:
>>> From: Hemant Agrawal 
>>>
>>> if any buffer is read from the tx_q, MAX_BURST buffers will be allocated and
>> attempted to be added to to the alloc_q.
>>> This seems terribly inefficient and it also looks like the alloc_q will 
>>> quickly fill
>> to its maximum capacity. If the system buffers are low in number, it will 
>> reach
>> "out of memory" situation.
>>> This patch allocates the number of buffers as many dequeued from tx_q.
>>>
>>> Signed-off-by: Hemant Agrawal 
>>> ---
>>>lib/librte_kni/rte_kni.c | 13 -
>>>1 file changed, 8 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c index
>>> 4e70fa0..4cf8e30 100644
>>> --- a/lib/librte_kni/rte_kni.c
>>> +++ b/lib/librte_kni/rte_kni.c
>>> @@ -128,7 +128,7 @@ struct rte_kni_memzone_pool {
>>>
>>>
>>>static void kni_free_mbufs(struct rte_kni *kni); -static void
>>> kni_allocate_mbufs(struct rte_kni *kni);
>>> +static void kni_allocate_mbufs(struct rte_kni *kni, int num);
>>>
>>>static volatile int kni_fd = -1;
>>>static struct rte_kni_memzone_pool kni_memzone_pool = { @@ -575,7
>>> +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf
>>> **mbufs, unsigned num)
>>>
>>> /* If buffers removed, allocate mbufs and then put them into alloc_q
>> */
>>> if (ret)
>>> -   kni_allocate_mbufs(kni);
>>> +   kni_allocate_mbufs(kni, ret);
>>>
>>> return ret;
>>>}
>>> @@ -594,7 +594,7 @@ kni_free_mbufs(struct rte_kni *kni)
>>>}
>>>
>>>static void
>>> -kni_allocate_mbufs(struct rte_kni *kni)
>>> +kni_allocate_mbufs(struct rte_kni *kni, int num)
>>>{
>>> int i, ret;
>>> struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM]; @@ -620,7 +620,10
>> @@
>>> kni_allocate_mbufs(struct rte_kni *kni)
>>> return;
>>> }
>>>
>>> -   for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
>>> +   if (num == 0 || num > MAX_MBUF_BURST_NUM)
>>> +   num = MAX_MBUF_BURST_NUM;
>>> +
>>> +   for (i = 0; i < num; i++) {
>>> pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
>>> if (unlikely(pkts[i] == NULL)) {
>>> /* Out of memory */
>>> @@ -636,7 +639,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
>>> ret = kni_fifo_put(kni->alloc_q, (void **)pkts, i);
>>>
>>> /* Check if any mbufs not put into alloc_q, and then free them */
>>> -   if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
>>> +   if (ret >= 0 && ret < i && ret < num) {
>>> int j;
>>>
>>> for (j = ret; j < i; j++)
>> --
>>  *Olivier Deme*
>> *Druid Software Ltd.*
>> *Tel: +353 1 202 1831*
>> *Email: odeme at druidsoftware.com *
>> *URL: http://www.druidsoftware.com*
>>  *Hall 7, stand 7F70.*
>> Druid Software: Monetising enterprise small cells solutions.

-- 
*Olivier Deme*
*Druid Software Ltd.*
*Tel: +353 1 202 1831*
*Email: odeme at druidsoftware.com 

[dpdk-dev] [PATCH v1 0/2] eal: fix symbol missing in version map

2015-02-25 Thread Neil Horman
On Wed, Feb 25, 2015 at 11:39:47AM +0800, Cunming Liang wrote:
> These two patches fix the compile error seen when
> CONFIG_RTE_BUILD_SHARED_LIB=y.
> The root cause is that *per_lcore__socket_id* and *rte_sys_gettid* are
> missing from the version map.
> Thanks for the notification from Tetsuya Mukawa . 
> 
> Cunming Liang (2):
>   eal/linux: fix symbol missing in version map
>   eal/bsd: fix symbol missing in version map
> 
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   | 2 ++
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++
>  2 files changed, 4 insertions(+)
> 
> -- 
> 1.8.1.4
> 
> 

NAK

This is the wrong way to fix this problem. Exporting global variables is
never a good solution when it can be helped.  Instead, rte_socket_id() should be
made a non-inline function and exported.  Then the definition of
per_lcore__socket_id can be made private, protecting it from type changes.

Neil
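
As an illustration of the approach suggested above, the following is a minimal sketch of hiding a per-thread variable behind an exported, non-inline accessor. It is not the EAL code: the demo_* names are hypothetical, and `__thread` is assumed to be available as a GCC/Clang extension.

```c
#include <assert.h>

/* What a public header would declare: functions only, no variable. */
unsigned demo_socket_id_get(void);
void demo_socket_id_set(unsigned id);

/*
 * Private to this translation unit. Because it is static, it never
 * appears in the shared library's symbol table, so its type or
 * storage can change without affecting the exported ABI.
 */
static __thread unsigned demo_socket_id;

/* Exported, non-inline accessor: the symbol a version map would list. */
unsigned demo_socket_id_get(void)
{
    return demo_socket_id;
}

/* Setter; in a real library this would probably stay internal. */
void demo_socket_id_set(unsigned id)
{
    demo_socket_id = id;
}
```

The cost is one function call per read instead of an inlined TLS load, which is the trade-off being argued in this thread.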



[dpdk-dev] [PATCH v14 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-25 Thread Tetsuya Mukawa
2015-02-25 20:21 GMT+09:00 Thomas Monjalon :
> 2015-02-25 13:04, Tetsuya Mukawa:
>> --- a/lib/librte_eal/common/eal_common_dev.c
>> +++ b/lib/librte_eal/common/eal_common_dev.c
>> @@ -32,10 +32,13 @@
>>   *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>>   */
>>
>> +#include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>>
>> +#include 
>>  #include 
>>  #include 
>
> No, you must not include ethdev in EAL.
> The ethdev layer is by design on top of EAL.
> Maxime already asked why you did it. He was implicitly asking to remove it.
> You said that you are calling ethdev_is_detachable() but you should
> call a function eal_is_detachable() or something like that.
> The detachable state must be only device-related, i.e. in EAL.
> The ethdev API is only a wrapper (with port id) in such case.
>

Hi Thomas,

If the ethdev library is on top of EAL, hotplug functions like
rte_eal_dev_attach/detach should be implemented in the ethdev library.
Is that right?

If so, I will move rte_eal_dev_attach/detach to the ethdev library
and rename them to rte_eth_dev_attach/detach.
Also, I will include "rte_dev.h" and "rte_pci.h" in rte_ethdev.h, and call
the EAL functions below from the ethdev library.

- For virtual device initialization and finalization
-- rte_eth_vdev_init
-- rte_eth_vdev_uninit()
- For physical NIC initialization and finalization
-- rte_eal_pci_probe_one()
-- rte_eal_pci_close_one()

I guess this will fix this design violation.
Is this ok?

Thanks,
Tetsuya
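
To make the layering point concrete, here is a rough sketch of "detachable state lives at the device level, and the port-level call is only a wrapper keyed by port id". It is an illustration under assumptions, not the DPDK API: every demo_* name, the struct layout and the port table are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* ---- lower layer (stands in for EAL): knows devices, not ports ---- */
struct demo_device {
    const char *name;
    int detachable;              /* device-level property */
};

static int demo_dev_is_detachable(const struct demo_device *dev)
{
    return dev != NULL && dev->detachable;
}

/* ---- upper layer (stands in for ethdev): maps port ids to devices ---- */
#define DEMO_MAX_PORTS 4u

static struct demo_device *demo_ports[DEMO_MAX_PORTS];

/* Thin wrapper: resolve the port id, then ask the lower layer. */
static int demo_port_is_detachable(unsigned port_id)
{
    if (port_id >= DEMO_MAX_PORTS)
        return 0;
    return demo_dev_is_detachable(demo_ports[port_id]);
}
```

Note that the dependency only points downwards: the lower layer never includes anything from the upper one, so no mbuf or ethdev header is needed to build it.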

>> --- a/lib/librte_eal/linuxapp/eal/Makefile
>> +++ b/lib/librte_eal/linuxapp/eal/Makefile
>> @@ -45,6 +45,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
>>  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
>>  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
>>  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
>> +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
>
> By removing ethdev dependency, you can remove this ugly mbuf dependency.
>
> Thanks Tetsuya
>


[dpdk-dev] [PATCH v3] app/test: add crc32 algorithms equivalence check

2015-02-25 Thread Yerden Zhumabekov
New function test_crc32_hash_alg_equiv() checks whether software,
4-byte operand and 8-byte operand versions of CRC32 hash function
implementations return the same result value.

Signed-off-by: Yerden Zhumabekov 
---
 app/test/test_hash.c |   60 ++
 1 file changed, 60 insertions(+)

diff --git a/app/test/test_hash.c b/app/test/test_hash.c
index 76b1b8f..653dd86 100644
--- a/app/test/test_hash.c
+++ b/app/test/test_hash.c
@@ -177,6 +177,63 @@ static struct rte_hash_parameters ut_params = {
.socket_id = 0,
 };

+#define CRC32_ITERATIONS (1U << 20)
+#define CRC32_DWORDS (1U << 6)
+/*
+ * Test if all CRC32 implementations yield the same hash value
+ */
+static int
+test_crc32_hash_alg_equiv(void)
+{
+   uint32_t hash_val;
+   uint32_t init_val;
+   uint64_t data64[CRC32_DWORDS];
+   unsigned i, j;
+   size_t data_len;
+
+   printf("# CRC32 implementations equivalence test\n");
+   for (i = 0; i < CRC32_ITERATIONS; i++) {
+   /* Randomizing data_len of data set */
+   data_len = (size_t) ((rte_rand() % sizeof(data64)) + 1);
+   init_val = (uint32_t) rte_rand();
+
+   /* Fill the data set */
+   for (j = 0; j < CRC32_DWORDS; j++)
+   data64[j] = rte_rand();
+
+   /* Calculate software CRC32 */
+   rte_hash_crc_set_alg(CRC32_SW);
+   hash_val = rte_hash_crc(data64, data_len, init_val);
+
+   /* Check against 4-byte-operand sse4.2 CRC32 if available */
+   rte_hash_crc_set_alg(CRC32_SSE42);
+   if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
+   printf("Failed checking CRC32_SW against CRC32_SSE42\n");
+   break;
+   }
+
+   /* Check against 8-byte-operand sse4.2 CRC32 if available */
+   rte_hash_crc_set_alg(CRC32_SSE42_x64);
+   if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
+   printf("Failed checking CRC32_SW against CRC32_SSE42_x64\n");
+   break;
+   }
+   }
+
+   /* Resetting to best available algorithm */
+   rte_hash_crc_set_alg(CRC32_SSE42_x64);
+
+   if (i == CRC32_ITERATIONS)
+   return 0;
+
+   printf("Failed test data (hex, %lu bytes total):\n", data_len);
+   for (j = 0; j < data_len; j++)
+   printf("%02X%c", ((uint8_t *)data64)[j],
+   ((j+1) % 16 == 0 || j == data_len - 1) ? '\n' : ' ');
+
+   return -1;
+}
+
 /*
  * Test a hash function.
  */
@@ -1356,6 +1413,9 @@ test_hash(void)

run_hash_func_tests();

+   if (test_crc32_hash_alg_equiv() < 0)
+   return -1;
+
return 0;
 }

-- 
1.7.9.5
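
For anyone who wants to try the same kind of equivalence check without a DPDK build, the sketch below compares two independent software CRC-32C (Castagnoli) routines on pseudo-random inputs of random length. It only mirrors the structure of the test in the patch: it does not call rte_hash_crc() or use the SSE4.2 paths, and the function names, the 64-byte buffer and the xorshift generator are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CRC32C_POLY 0x82F63B78u  /* reflected Castagnoli polynomial */

/* Reference implementation: one bit at a time. */
static uint32_t demo_crc32c_bitwise(const uint8_t *p, size_t len, uint32_t crc)
{
    assert(p != NULL);
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (DEMO_CRC32C_POLY & (0u - (crc & 1u)));
    }
    return crc;
}

/* Alternative implementation: one byte at a time via a 256-entry table. */
static uint32_t demo_crc32c_table(const uint8_t *p, size_t len, uint32_t crc)
{
    static uint32_t tab[256];
    static int ready;

    if (!ready) {
        for (uint32_t i = 0; i < 256; i++) {
            uint32_t c = i;

            for (int k = 0; k < 8; k++)
                c = (c >> 1) ^ (DEMO_CRC32C_POLY & (0u - (c & 1u)));
            tab[i] = c;
        }
        ready = 1;
    }
    while (len--)
        crc = tab[(crc ^ *p++) & 0xFFu] ^ (crc >> 8);
    return crc;
}

/* Small xorshift generator so the sketch needs no external RNG. */
static uint32_t demo_rand(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Returns 0 if both routines agree on every trial, -1 on the first mismatch. */
static int demo_crc32c_equiv_check(unsigned iterations)
{
    uint8_t buf[64];
    uint32_t seed = 2463534242u;

    for (unsigned i = 0; i < iterations; i++) {
        size_t len = demo_rand(&seed) % sizeof(buf) + 1;  /* 1..64 bytes */
        uint32_t init = demo_rand(&seed);

        for (size_t j = 0; j < sizeof(buf); j++)
            buf[j] = (uint8_t)demo_rand(&seed);

        if (demo_crc32c_bitwise(buf, len, init) !=
            demo_crc32c_table(buf, len, init))
            return -1;
    }
    return 0;
}
```

As in the patch, the length is randomized so that inputs which are not a multiple of the widest operand size get exercised too; that is where wide-operand implementations are most likely to diverge.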



[dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst

2015-02-25 Thread Marc Sune

On 25/02/15 13:24, Hemant at freescale.com wrote:
> Hi Olivier
>Comments inline.
> Regards,
> Hemant
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Deme
>> Sent: 25/Feb/2015 5:44 PM
>> To: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst
>>
>> Thank you Hemant, I think there might be one issue left with the patch 
>> though.
>> The alloc_q must initially be filled with mbufs before getting mbuf back on 
>> the
>> tx_q.
>>
>> So the patch should allow rte_kni_rx_burst to check if alloc_q is empty.
>> If so, it should invoke kni_allocate_mbufs(kni, 0) (to fill the alloc_q with
>> MAX_MBUF_BURST_NUM mbufs)
>>
>> The patch for rte_kni_rx_burst would then look like:
>>
>> @@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf
>> **mbufs, unsigned num)
>>
>>/* If buffers removed, allocate mbufs and then put them into alloc_q 
>> */
>>if (ret)
>> -kni_allocate_mbufs(kni);
>> +  kni_allocate_mbufs(kni, ret);
>> +  else if (unlikely(kni->alloc_q->write == kni->alloc_q->read))
>> +  kni_allocate_mbufs(kni, 0);
>>
> [hemant]  This will introduce a run-time check.
>
> I forgot to include the other change in the patch.
> I am doing it in kni_alloc, i.e. initializing the alloc_q with the default
> burst size:
> kni_allocate_mbufs(ctx, 0);
>
> In a way, we are now suggesting reducing the size of alloc_q to only the
> default burst size.

As an aside, I think we should allow tuning of the userspace <-> kernel
queue sizes (rx_q, tx_q, free_q and alloc_q). Whether this should be a
build configuration option or a parameter to rte_kni_init() is not
completely clear to me, but I guess rte_kni_init() is the better option.

Having said that, the original mail from Hemant described KNI reporting
an out-of-memory condition. To me this indicates that the pool is
incorrectly dimensioned. Even if KNI does not pre-allocate into the
alloc_q, or only partially, you will get this same "out of memory" under
high load.

We can reduce the usage of buffers by the KNI subsystem in kernel space 
and in userspace, but the kernel will always need a small cache of 
pre-allocated buffers (coming from user-space), since the KNI kernel 
module does not know where to grab the packets from (which pool). So my 
guess is that the dimensioning problem experienced by Hemant would be 
the same, even with the proposed changes.

>
> Can we reach a situation where the kernel is adding packets to tx_q faster
> than the application is able to dequeue them?

I think so. We cannot control much how the kernel will schedule the KNI
thread(s), especially if the number of threads in relation to the cores is
incorrect (not enough), hence we need at least a reasonable amount of
buffering to prevent early drops caused by those "internal" burst side effects.

Marc

> In that case alloc_q can be empty and the kernel will be starving.
>
>> Olivier.
>>
>> On 25/02/15 11:48, Hemant Agrawal wrote:
>>> From: Hemant Agrawal 
>>>
>>> if any buffer is read from the tx_q, MAX_BURST buffers will be allocated and
>> attempted to be added to to the alloc_q.
>>> This seems terribly inefficient and it also looks like the alloc_q will 
>>> quickly fill
>> to its maximum capacity. If the system buffers are low in number, it will 
>> reach
>> "out of memory" situation.
>>> This patch allocates the number of buffers as many dequeued from tx_q.
>>>
>>> Signed-off-by: Hemant Agrawal 
>>> ---
>>>lib/librte_kni/rte_kni.c | 13 -
>>>1 file changed, 8 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c index
>>> 4e70fa0..4cf8e30 100644
>>> --- a/lib/librte_kni/rte_kni.c
>>> +++ b/lib/librte_kni/rte_kni.c
>>> @@ -128,7 +128,7 @@ struct rte_kni_memzone_pool {
>>>
>>>
>>>static void kni_free_mbufs(struct rte_kni *kni); -static void
>>> kni_allocate_mbufs(struct rte_kni *kni);
>>> +static void kni_allocate_mbufs(struct rte_kni *kni, int num);
>>>
>>>static volatile int kni_fd = -1;
>>>static struct rte_kni_memzone_pool kni_memzone_pool = { @@ -575,7
>>> +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf
>>> **mbufs, unsigned num)
>>>
>>> /* If buffers removed, allocate mbufs and then put them into alloc_q
>> */
>>> if (ret)
>>> -   kni_allocate_mbufs(kni);
>>> +   kni_allocate_mbufs(kni, ret);
>>>
>>> return ret;
>>>}
>>> @@ -594,7 +594,7 @@ kni_free_mbufs(struct rte_kni *kni)
>>>}
>>>
>>>static void
>>> -kni_allocate_mbufs(struct rte_kni *kni)
>>> +kni_allocate_mbufs(struct rte_kni *kni, int num)
>>>{
>>> int i, ret;
>>> struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM]; @@ -620,7 +620,10
>> @@
>>> kni_allocate_mbufs(struct rte_kni *kni)
>>> return;
>>> }
>>>
>>> -   for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
>>> +   if (num == 0 || num > MAX_MBUF_BURST_NUM)
>>> +  

[dpdk-dev] [PATCH v2] app/test: add crc32 algorithms equivalence check

2015-02-25 Thread Yerden Zhumabekov
All notes taken into account. v3 posted.

25.02.2015 17:34, Bruce Richardson wrote:
> On Wed, Feb 25, 2015 at 10:08:32AM +0600, Yerden Zhumabekov wrote:
>> New function test_crc32_hash_alg_equiv() checks whether software,
>> 4-byte operand and 8-byte operand versions of CRC32 hash function
>> implementations return the same result value.
>>
>> Signed-off-by: Yerden Zhumabekov 
> Two small notes below for improving output on error.
>
> Acked-by: Bruce Richardson 
>
>> ---
>>  app/test/test_hash.c |   63 
>> ++
>>  1 file changed, 63 insertions(+)
>>
>> diff --git a/app/test/test_hash.c b/app/test/test_hash.c
>> index 76b1b8f..3e94af1 100644
>> --- a/app/test/test_hash.c
>> +++ b/app/test/test_hash.c
>> @@ -177,6 +177,66 @@ static struct rte_hash_parameters ut_params = {
>>  .socket_id = 0,
>>  };
>>  
>> +#define CRC32_ITERATIONS (1U << 20)
>> +#define CRC32_DWORDS (1U << 6)
>> +/*
>> + * Test if all CRC32 implementations yield the same hash value
>> + */
>> +static int
>> +test_crc32_hash_alg_equiv(void)
>> +{
>> +uint32_t hash_val;
>> +uint32_t init_val;
>> +uint64_t data64[CRC32_DWORDS];
>> +unsigned i, j;
>> +size_t data_len;
>> +
>> +printf("# CRC32 implementations equivalence test\n");
>> +for (i = 0; i < CRC32_ITERATIONS; i++) {
>> +/* Randomizing data_len of data set */
>> +data_len = (size_t) ((rte_rand() % sizeof(data64)) + 1);
>> +init_val = (uint32_t) rte_rand();
>> +
>> +/* Fill the data set */
>> +for (j = 0; j < CRC32_DWORDS; j++)
>> +data64[j] = rte_rand();
>> +
>> +/* Calculate software CRC32 */
>> +rte_hash_crc_set_alg(CRC32_SW);
>> +hash_val = rte_hash_crc(data64, data_len, init_val);
>> +
>> +/* Check against 4-byte-operand sse4.2 CRC32 if available */
>> +rte_hash_crc_set_alg(CRC32_SSE42);
>> +if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
>> +printf("Failed checking CRC32_SW against 
>> CRC32_SSE42\n");
>> +break;
>> +}
>> +
>> +/* Check against 8-byte-operand sse4.2 CRC32 if available */
>> +rte_hash_crc_set_alg(CRC32_SSE42_x64);
>> +if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
>> +printf("Failed checking CRC32_SW against 
>> CRC32_SSE42_x64\n");
>> +break;
>> +}
>> +}
>> +
>> +/* Resetting to best available algorithm */
>> +rte_hash_crc_set_alg(CRC32_SSE42_x64);
>> +
>> +if (i == CRC32_ITERATIONS)
>> +return 0;
>> +
>> +printf("Failed test data (hex):\n");
>> +
>> +for (j = 0; j < data_len; j++) {
>> +printf("%02X", ((uint8_t *)data64)[j]);
> Put in a space after each hex character, otherwise it comes out like:
>
> Failed test data (hex):
> AAD292776348010C7A18D3080DB3A300
> FD
> Test Failed
>
> [I forced a failure by changing a != to == to test it, don't worry, the
> hash calculations are fine! :-)]
>
>> +if ((j+1) % 16 == 0 || j == data_len - 1)
>> +printf("\n");
>> +}
> Maybe also print out here, or before the hex digits, the length of the data
> that was tested. e.g. "printf("%u bytes total\n", data_len);" or similar.
>> +
>> +return -1;
>> +}
>> +
>>  /*
>>   * Test a hash function.
>>   */
>> @@ -1356,6 +1416,9 @@ test_hash(void)
>>  
>>  run_hash_func_tests();
>>  
>> +if (test_crc32_hash_alg_equiv() < 0)
>> +return -1;
>> +
>>  return 0;
>>  }
>>  
>> -- 
>> 1.7.9.5
>>

-- 
Sincerely,

Yerden Zhumabekov
State Technical Service
Astana, KZ



[dpdk-dev] [PATCH v2 0/4] Fix issues reported by static analysis tool

2015-02-25 Thread Pawel Wodkowski
Static analysis reports some issues against the current DPDK version. Most of
them need only cosmetic code changes (changing the type of a variable).

One issue, in the ring PMD, is a real memory leak, which this series fixes.

PATCH v2 changes:
 - remove patch 5/5 as it was NACKed
 - reword commit logs according to mailing list suggestions.

Pawel Wodkowski (4):
  rte_timer: change declaration of rte_timer_cb_t
  librte_kvargs: make rte_kvargs_free() be consistent with other
"free()" functions
  pmd ring: fix possible memory leak during devinit
  cmdline: make parse_set_list() use size_t instead of int for low/high 
   parameter

 lib/librte_cmdline/cmdline_parse_portlist.c | 4 ++--
 lib/librte_kvargs/rte_kvargs.c  | 4 
 lib/librte_kvargs/rte_kvargs.h  | 3 ++-
 lib/librte_pmd_ring/rte_eth_ring.c  | 6 +++---
 lib/librte_timer/rte_timer.h| 4 ++--
 5 files changed, 13 insertions(+), 8 deletions(-)

-- 
1.9.1



[dpdk-dev] [PATCH v2 1/4] rte_timer: change declaration of rte_timer_cb_t

2015-02-25 Thread Pawel Wodkowski
This patch removes the inconsistency between the declaration of type
rte_timer_cb_t, field f in struct rte_timer and function
__rte_timer_reset().

Although the compiler treats both forms the same, static analysis tools
complain about the mismatch.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_timer/rte_timer.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 35b8719..77547c6 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -115,7 +115,7 @@ struct rte_timer;
 /**
  * Callback function type for timer expiry.
  */
-typedef void (rte_timer_cb_t)(struct rte_timer *, void *);
+typedef void (*rte_timer_cb_t)(struct rte_timer *, void *);

 #define MAX_SKIPLIST_DEPTH 10

@@ -128,7 +128,7 @@ struct rte_timer
struct rte_timer *sl_next[MAX_SKIPLIST_DEPTH];
volatile union rte_timer_status status; /**< Status of timer. */
uint64_t period;   /**< Period of timer (0 if not periodic). */
-   rte_timer_cb_t *f; /**< Callback function. */
+   rte_timer_cb_t f;  /**< Callback function. */
void *arg; /**< Argument to callback function. */
 };

-- 
1.9.1
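
The difference between the two typedef styles touched by this patch can be
shown with a minimal sketch. All demo_* names below are hypothetical and are
not part of rte_timer; the point is only that a typedef naming a function
type needs an explicit '*' at each use, while a typedef naming a
pointer-to-function type does not, and that both end up as the same pointer
type.

```c
#include <stddef.h>

struct demo_timer;

/* Style A: the typedef names a function type; users must add '*'. */
typedef void (demo_cb_fn_t)(struct demo_timer *, void *);

/* Style B: the typedef names a pointer-to-function type directly. */
typedef void (*demo_cb_ptr_t)(struct demo_timer *, void *);

struct demo_timer {
	demo_cb_fn_t *f_a;   /* pointer built from the function type */
	demo_cb_ptr_t f_b;   /* already a pointer, no '*' needed */
	void *arg;           /* argument handed to the callbacks */
	int fired;           /* number of callback invocations */
};

/* Example callback: counts how often it was called. */
static void
demo_expire(struct demo_timer *tim, void *arg)
{
	(void)arg;
	tim->fired++;
}

/* Invoke both callbacks if they are set. */
static void
demo_fire(struct demo_timer *tim)
{
	if (tim->f_a != NULL)
		tim->f_a(tim, tim->arg);
	if (tim->f_b != NULL)
		tim->f_b(tim, tim->arg);
}
```

Using one style consistently for the typedef, the struct field and the
function parameters is what keeps a checker from flagging a mismatch.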



[dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst

2015-02-25 Thread Olivier Deme
Hi Marc,

I think one of the observations is that currently the alloc_q grows very 
quickly to the maximum fifo size (1024).
The patch suggests limiting the alloc_q to a fixed size and maybe making that 
size configurable in rte_kni_alloc or rte_kni_init.

It should then be up to the application to provision the mempool 
accordingly.
Currently the out-of-memory problem shows up if the mempool doesn't have 
1024 buffers per KNI.

Olivier.

On 25/02/15 12:38, Marc Sune wrote:
>
> On 25/02/15 13:24, Hemant at freescale.com wrote:
>> Hi OIivier
>>  Comments inline.
>> Regards,
>> Hemant
>>
>>> -Original Message-
>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Deme
>>> Sent: 25/Feb/2015 5:44 PM
>>> To: dev at dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst
>>>
>>> Thank you Hemant, I think there might be one issue left with the 
>>> patch though.
>>> The alloc_q must initially be filled with mbufs before getting mbuf 
>>> back on the
>>> tx_q.
>>>
>>> So the patch should allow rte_kni_rx_burst to check if alloc_q is 
>>> empty.
>>> If so, it should invoke kni_allocate_mbufs(kni, 0) (to fill the 
>>> alloc_q with
>>> MAX_MBUF_BURST_NUM mbufs)
>>>
>>> The patch for rte_kni_rx_burst would then look like:
>>>
>>> @@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct 
>>> rte_mbuf
>>> **mbufs, unsigned num)
>>>
>>>/* If buffers removed, allocate mbufs and then put them into 
>>> alloc_q */
>>>if (ret)
>>> -kni_allocate_mbufs(kni);
>>> +  kni_allocate_mbufs(kni, ret);
>>> +  else if (unlikely(kni->alloc_q->write == kni->alloc_q->read))
>>> +  kni_allocate_mbufs(kni, 0);
>>>
>> [hemant]  This will introduce a run-time check.
>>
>> I missed to include the other change in the patch.
>>   I am doing it in kni_alloc i.e. initiate the alloc_q with default 
>> burst size.
>> kni_allocate_mbufs(ctx, 0);
>>
>> In a way, we are now suggesting to reduce the size of alloc_q to only 
>> default burst size.
>
> As an aside comment here, I think that we should allow to tweak the 
> userspace <-> kernel queue sizes (rx_q, tx_q, free_q and alloc_q) . 
> Whether this should be a build configuration option or a parameter to 
> rte_kni_init(), it is not completely clear to me, but I guess 
> rte_kni_init() is a better option.
>
> Having said that, the original mail from Hemant was describing that 
> KNI was giving an out-of-memory. This to me indicates that the pool is 
> incorrectly dimensioned. Even if KNI will not pre-allocate in the 
> alloc_q, or not completely, in the event of high load, you will get 
> this same "out of memory".
>
> We can reduce the usage of buffers by the KNI subsystem in kernel 
> space and in userspace, but the kernel will always need a small cache 
> of pre-allocated buffers (coming from user-space), since the KNI 
> kernel module does not know where to grab the packets from (which 
> pool). So my guess is that the dimensioning problem experienced by 
> Hemant would be the same, even with the proposed changes.
>
>>
>> Can we reach is situation, when the kernel is adding packets faster 
>> in tx_q than the application is able to dequeue?
>
> I think so. We cannot control much how the kernel will schedule the 
> KNI thread(s), specially if the # of threads in relation to the cores 
> is incorrect (not enough), hence we need at least a reasonable amount 
> of buffering to prevent early dropping to those "internal" burst side 
> effects.
>
> Marc
>
>>   alloc_q  can be empty in this case and kernel will be striving.
>>
>>> Olivier.
>>>
>>> On 25/02/15 11:48, Hemant Agrawal wrote:
>>>> From: Hemant Agrawal
>>>>
>>>> if any buffer is read from the tx_q, MAX_BURST buffers will be
>>>> allocated and
>>> attempted to be added to to the alloc_q.
>>>> This seems terribly inefficient and it also looks like the alloc_q
>>>> will quickly fill
>>> to its maximum capacity. If the system buffers are low in number, it 
>>> will reach
>>> "out of memory" situation.
>>>> This patch allocates the number of buffers as many dequeued from tx_q.
>>>>
>>>> Signed-off-by: Hemant Agrawal
>>>> ---
>>>>    lib/librte_kni/rte_kni.c | 13 -
>>>>    1 file changed, 8 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c index
>>>> 4e70fa0..4cf8e30 100644
>>>> --- a/lib/librte_kni/rte_kni.c
>>>> +++ b/lib/librte_kni/rte_kni.c
>>>> @@ -128,7 +128,7 @@ struct rte_kni_memzone_pool {
>>>>
>>>>
>>>>    static void kni_free_mbufs(struct rte_kni *kni); -static void
>>>> kni_allocate_mbufs(struct rte_kni *kni);
>>>> +static void kni_allocate_mbufs(struct rte_kni *kni, int num);
>>>>
>>>>    static volatile int kni_fd = -1;
>>>>    static struct rte_kni_memzone_pool kni_memzone_pool = { @@ -575,7
>>>> +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf
>>>> **mbufs, unsigned num)
>>>>
>>>>    /* If buffers removed, allocate mbufs and then put them into
>>>> alloc_

[dpdk-dev] [PATCH v2 2/4] librte_kvargs: make rte_kvargs_free() be consistent with other "free()" functions

2015-02-25 Thread Pawel Wodkowski
By convention, free() functions should ignore a NULL parameter. This patch
adds this behaviour to rte_kvargs_free().

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_kvargs/rte_kvargs.c | 4 
 lib/librte_kvargs/rte_kvargs.h | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
index 8bc1e46..c2dd051 100644
--- a/lib/librte_kvargs/rte_kvargs.c
+++ b/lib/librte_kvargs/rte_kvargs.c
@@ -174,8 +174,12 @@ rte_kvargs_process(const struct rte_kvargs *kvlist,
 void
 rte_kvargs_free(struct rte_kvargs *kvlist)
 {
+   if (!kvlist)
+   return;
+
if (kvlist->str != NULL)
free(kvlist->str);
+
free(kvlist);
 }

diff --git a/lib/librte_kvargs/rte_kvargs.h b/lib/librte_kvargs/rte_kvargs.h
index ef4efab..ae9ae79 100644
--- a/lib/librte_kvargs/rte_kvargs.h
+++ b/lib/librte_kvargs/rte_kvargs.h
@@ -115,7 +115,8 @@ void rte_kvargs_free(struct rte_kvargs *kvlist);
  *
  * For each key/value association that matches the given key, calls the
  * handler function with the for a given arg_name passing the value on the
- * dictionary for that key and a given extra argument.
+ * dictionary for that key and a given extra argument. If *kvlist* is NULL
+ * function does nothing.
  *
  * @param kvlist
  *   The rte_kvargs structure
-- 
1.9.1



[dpdk-dev] [PATCH v2 3/4] pmd ring: fix possible memory leak during devinit

2015-02-25 Thread Pawel Wodkowski
Free kvlist on function exit to avoid memory leak.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_pmd_ring/rte_eth_ring.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_ring/rte_eth_ring.c 
b/lib/librte_pmd_ring/rte_eth_ring.c
index a5dc71e..f049bb3 100644
--- a/lib/librte_pmd_ring/rte_eth_ring.c
+++ b/lib/librte_pmd_ring/rte_eth_ring.c
@@ -527,7 +527,7 @@ out:
 static int
 rte_pmd_ring_devinit(const char *name, const char *params)
 {
-   struct rte_kvargs *kvlist;
+   struct rte_kvargs *kvlist = NULL;
int ret = 0;
struct node_action_list *info = NULL;

@@ -548,7 +548,7 @@ rte_pmd_ring_devinit(const char *name, const char *params)
info = rte_zmalloc("struct node_action_list", 
sizeof(struct node_action_list) +
   (sizeof(struct node_action_pair) * 
ret), 0);
if (!info)
-   goto out;
+   goto out_free;

info->total = ret;
info->list = (struct node_action_pair*)(info + 1);
@@ -567,8 +567,8 @@ rte_pmd_ring_devinit(const char *name, const char *params)
}

 out_free:
+   rte_kvargs_free(kvlist);
rte_free(info);
-out:
return ret;
 }

-- 
1.9.1



[dpdk-dev] [PATCH v2 4/4] cmdline: make parse_set_list() use size_t instead of int for low/high parameter

2015-02-25 Thread Pawel Wodkowski
Fix the warning reported by static analysis about a size_t to int cast
when passing parameters to parse_set_list().

This patch also fixes the code formatting errors that checkpatch.pl reports
after generating the patch.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_cmdline/cmdline_parse_portlist.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse_portlist.c 
b/lib/librte_cmdline/cmdline_parse_portlist.c
index fc6c14e..9c1fe3e 100644
--- a/lib/librte_cmdline/cmdline_parse_portlist.c
+++ b/lib/librte_cmdline/cmdline_parse_portlist.c
@@ -78,7 +78,7 @@ struct cmdline_token_ops cmdline_token_portlist_ops = {
 };

 static void
-parse_set_list(cmdline_portlist_t * pl, int low, int high)
+parse_set_list(cmdline_portlist_t *pl, size_t low, size_t high)
 {
do {
pl->map |= (1 << low++);
@@ -86,7 +86,7 @@ parse_set_list(cmdline_portlist_t * pl, int low, int high)
 }

 static int
-parse_ports(cmdline_portlist_t * pl, const char * str)
+parse_ports(cmdline_portlist_t *pl, const char *str)
 {
size_t ps, pe;
const char *first, *last;
-- 
1.9.1



[dpdk-dev] [PATCH 2/2] doc: update programmers guide for uio_pci_generic

2015-02-25 Thread Iremonger, Bernard


> -Original Message-
> From: Richardson, Bruce
> Sent: Wednesday, February 25, 2015 12:28 PM
> To: Iremonger, Bernard
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/2] doc: update programmers guide for 
> uio_pci_generic
> 
> On Wed, Feb 25, 2015 at 12:19:10PM +, Iremonger, Bernard wrote:
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce
> > > Richardson
> > > Sent: Tuesday, February 24, 2015 4:28 PM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] [PATCH 2/2] doc: update programmers guide for
> > > uio_pci_generic
> > >
> > > Since DPDK now has support for the in-tree uio_pci_generic driver,
> > > update the programmers guide document to reference this module, and
> > > to use it in preference to the igb_uio driver, which is DPDK- specific.
> > >
> > > Signed-off-by: Bruce Richardson 
> > > ---
> > >  doc/guides/prog_guide/env_abstraction_layer.rst  | 8 
> > > 
> > >  doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst | 6 
> > > +++---
> > >  doc/guides/prog_guide/kernel_nic_interface.rst   | 2 +-
> > >  doc/guides/prog_guide/poll_mode_drv_emulated_virtio_nic.rst  | 8 
> > > 
> > >  doc/guides/prog_guide/poll_mode_drv_paravirtual_vmxnets_nic.rst  |
> > > 2 +-
> > >  5 files changed, 13 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst
> > > b/doc/guides/prog_guide/env_abstraction_layer.rst
> > > index 231e266..b5321c3 100644
> > > --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> > > +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> > > @@ -66,7 +66,7 @@ EAL in a Linux-userland Execution Environment
> > >  -
> > >
> > >  In a Linux user space environment, the DPDK application runs as a
> > > user-space application using the pthread library.
> > > -PCI information about devices and address space is discovered
> > > through the /sys kernel interface and through a module called igb_uio.
> > > +PCI information about devices and address space is discovered
> > > +through the /sys kernel interface and
> > > through kernel modules such as uio_pci_generic, or igb_uio.
> > >  Refer to the UIO: User-space drivers documentation in the Linux
> > > kernel. This memory is mmap'd in the application.
> > >
> > >  The EAL performs physical memory allocation using mmap() in
> > > hugetlbfs (using huge page sizes to increase performance).
> > > @@ -134,10 +134,10 @@ PCI Access
> > >  ~~
> > >
> > >  The EAL uses the /sys/bus/pci utilities provided by the kernel to scan 
> > > the content on the PCI bus.
> > > -
> > > -To access PCI memory, a kernel module called igb_uio provides a
> > > /dev/uioX device file
> > > +To access PCI memory, a kernel module called uio_pci_generic
> > > +provides a /dev/uioX device file and resource files in /sys
> > >  that can be mmap'd to obtain access to PCI address space from the 
> > > application.
> > > -It uses the uio kernel feature (userland driver).
> > > +The DPDK-specific igb_uio module can also be used for this. Both
> > > +drivers use the uio kernel feature
> > > (userland driver).
> > >
> > >  Per-lcore and Shared Variables
> > >  ~~
> > > diff --git
> > > a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > > b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > > index 1f1e04f..a0dd959 100644
> > > ---
> > > a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > > +++ b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.r
> > > +++ st
> > > @@ -306,12 +306,12 @@ Building and Running the Switching Backend
> > >  Refer to the *DPDK Getting Started Guide* for more
> > > information on memory management in the DPDK.
> > >  In the above command, 4 GB memory is reserved (2048 of 2 MB 
> > > pages) for DPDK.
> > >
> > > -#.  Load igb_uio and bind one Intel NIC controller to igb_uio:
> > > +#.  Load uio_pci_generic and bind one Intel NIC controller to it:
> > >
> > >  .. code-block:: console
> > >
> > > -insmod x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
> > > -python tools/dpdk_nic_bind.py -b igb_uio :09:00:00.0
> >
> >
> > Hi Bruce,
> >
> > Should the information about igb_uio be retained alongside the new 
> > information about
> uio_pci_generic?
> >
> While the answer may not be as clear-cut as with the GSG, why would be bother 
> covering both here.
> We already ignore VFIO in these examples.
> 
> /Bruce

Hi Bruce,

The method of loading differs between the two modules: igb_uio uses insmod and 
uio_pci_generic uses modprobe.
It would be useful to retain this igb_uio information. Maybe vfio information 
should be added too.
This comment also applies to the GSG; the differences need to be documented.

Regards,

Bernard.




[dpdk-dev] [PATCH v3] app/test: add crc32 algorithms equivalence check

2015-02-25 Thread Bruce Richardson
On Wed, Feb 25, 2015 at 06:34:06PM +0600, Yerden Zhumabekov wrote:
> New function test_crc32_hash_alg_equiv() checks whether software,
> 4-byte operand and 8-byte operand versions of CRC32 hash function
> implementations return the same result value.
> 
> Signed-off-by: Yerden Zhumabekov 

Acked-by: Bruce Richardson 

> ---
>  app/test/test_hash.c |   60 
> ++
>  1 file changed, 60 insertions(+)
> 
> diff --git a/app/test/test_hash.c b/app/test/test_hash.c
> index 76b1b8f..653dd86 100644
> --- a/app/test/test_hash.c
> +++ b/app/test/test_hash.c
> @@ -177,6 +177,63 @@ static struct rte_hash_parameters ut_params = {
>   .socket_id = 0,
>  };
>  
> +#define CRC32_ITERATIONS (1U << 20)
> +#define CRC32_DWORDS (1U << 6)
> +/*
> + * Test if all CRC32 implementations yield the same hash value
> + */
> +static int
> +test_crc32_hash_alg_equiv(void)
> +{
> + uint32_t hash_val;
> + uint32_t init_val;
> + uint64_t data64[CRC32_DWORDS];
> + unsigned i, j;
> + size_t data_len;
> +
> + printf("# CRC32 implementations equivalence test\n");
> + for (i = 0; i < CRC32_ITERATIONS; i++) {
> + /* Randomizing data_len of data set */
> + data_len = (size_t) ((rte_rand() % sizeof(data64)) + 1);
> + init_val = (uint32_t) rte_rand();
> +
> + /* Fill the data set */
> + for (j = 0; j < CRC32_DWORDS; j++)
> + data64[j] = rte_rand();
> +
> + /* Calculate software CRC32 */
> + rte_hash_crc_set_alg(CRC32_SW);
> + hash_val = rte_hash_crc(data64, data_len, init_val);
> +
> + /* Check against 4-byte-operand sse4.2 CRC32 if available */
> + rte_hash_crc_set_alg(CRC32_SSE42);
> + if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
> + printf("Failed checking CRC32_SW against 
> CRC32_SSE42\n");
> + break;
> + }
> +
> + /* Check against 8-byte-operand sse4.2 CRC32 if available */
> + rte_hash_crc_set_alg(CRC32_SSE42_x64);
> + if (hash_val != rte_hash_crc(data64, data_len, init_val)) {
> + printf("Failed checking CRC32_SW against 
> CRC32_SSE42_x64\n");
> + break;
> + }
> + }
> +
> + /* Resetting to best available algorithm */
> + rte_hash_crc_set_alg(CRC32_SSE42_x64);
> +
> + if (i == CRC32_ITERATIONS)
> + return 0;
> +
> + printf("Failed test data (hex, %lu bytes total):\n", data_len);
> + for (j = 0; j < data_len; j++)
> + printf("%02X%c", ((uint8_t *)data64)[j],
> + ((j+1) % 16 == 0 || j == data_len - 1) ? '\n' : 
> ' ');
> +
> + return -1;
> +}
> +
>  /*
>   * Test a hash function.
>   */
> @@ -1356,6 +1413,9 @@ test_hash(void)
>  
>   run_hash_func_tests();
>  
> + if (test_crc32_hash_alg_equiv() < 0)
> + return -1;
> +
>   return 0;
>  }
>  
> -- 
> 1.7.9.5
> 
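
The equivalence-check idea in this patch can be tried without DPDK or
SSE4.2 hardware. The sketch below is an assumption-laden stand-in: it
compares two portable CRC-32C implementations (bit-at-a-time and
table-driven) written for this illustration, not the rte_hash_crc() code
paths, and uses rand() where the patch uses rte_rand(). On a mismatch it
prints the failing input in the same style as the v3 patch, using %zu for
the size_t length.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Reflected form of the CRC-32C (Castagnoli) polynomial. */
#define DEMO_CRC32C_POLY 0x82F63B78u

/* Reference version: consumes the input one bit at a time. */
static uint32_t
demo_crc32c_bitwise(const uint8_t *data, size_t len, uint32_t crc)
{
	size_t i;
	int b;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (b = 0; b < 8; b++)
			crc = (crc >> 1) ^ (DEMO_CRC32C_POLY & (0u - (crc & 1u)));
	}
	return crc;
}

/* Faster version: one table lookup per byte (table built on first use,
 * not thread-safe). */
static uint32_t
demo_crc32c_table(const uint8_t *data, size_t len, uint32_t crc)
{
	static uint32_t table[256];
	static int ready;
	size_t i;

	if (!ready) {
		uint32_t n;

		for (n = 0; n < 256; n++) {
			uint8_t byte = (uint8_t)n;

			table[n] = demo_crc32c_bitwise(&byte, 1, 0);
		}
		ready = 1;
	}

	for (i = 0; i < len; i++)
		crc = (crc >> 8) ^ table[(crc ^ data[i]) & 0xFF];
	return crc;
}

/* Returns 0 if both versions agree on every random input, -1 otherwise. */
static int
demo_crc32c_equiv(unsigned iterations, unsigned seed)
{
	uint8_t buf[512];
	uint32_t init_val;
	size_t len, j;
	unsigned i;

	srand(seed);
	for (i = 0; i < iterations; i++) {
		/* Random length in 1..sizeof(buf), random initial value. */
		len = (size_t)(rand() % (int)sizeof(buf)) + 1;
		init_val = (uint32_t)rand();
		for (j = 0; j < len; j++)
			buf[j] = (uint8_t)rand();

		if (demo_crc32c_bitwise(buf, len, init_val) ==
				demo_crc32c_table(buf, len, init_val))
			continue;

		printf("Failed test data (hex, %zu bytes total):\n", len);
		for (j = 0; j < len; j++)
			printf("%02X%c", buf[j],
				((j + 1) % 16 == 0 || j == len - 1) ? '\n' : ' ');
		return -1;
	}
	return 0;
}
```

Randomising both the length and the initial value matters because
word-at-a-time implementations tend to differ from the reference only in
how they handle the tail bytes.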


[dpdk-dev] [PATCH 2/2] doc: update programmers guide for uio_pci_generic

2015-02-25 Thread Bruce Richardson
On Wed, Feb 25, 2015 at 01:12:43PM +, Iremonger, Bernard wrote:
> 
> 
> > -Original Message-
> > From: Richardson, Bruce
> > Sent: Wednesday, February 25, 2015 12:28 PM
> > To: Iremonger, Bernard
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/2] doc: update programmers guide for 
> > uio_pci_generic
> > 
> > On Wed, Feb 25, 2015 at 12:19:10PM +, Iremonger, Bernard wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce
> > > > Richardson
> > > > Sent: Tuesday, February 24, 2015 4:28 PM
> > > > To: dev at dpdk.org
> > > > Subject: [dpdk-dev] [PATCH 2/2] doc: update programmers guide for
> > > > uio_pci_generic
> > > >
> > > > Since DPDK now has support for the in-tree uio_pci_generic driver,
> > > > update the programmers guide document to reference this module, and
> > > > to use it in preference to the igb_uio driver, which is DPDK- specific.
> > > >
> > > > Signed-off-by: Bruce Richardson 
> > > > ---
> > > >  doc/guides/prog_guide/env_abstraction_layer.rst  | 8 
> > > > 
> > > >  doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst | 6 
> > > > +++---
> > > >  doc/guides/prog_guide/kernel_nic_interface.rst   | 2 +-
> > > >  doc/guides/prog_guide/poll_mode_drv_emulated_virtio_nic.rst  | 8 
> > > > 
> > > >  doc/guides/prog_guide/poll_mode_drv_paravirtual_vmxnets_nic.rst  |
> > > > 2 +-
> > > >  5 files changed, 13 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst
> > > > b/doc/guides/prog_guide/env_abstraction_layer.rst
> > > > index 231e266..b5321c3 100644
> > > > --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> > > > +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> > > > @@ -66,7 +66,7 @@ EAL in a Linux-userland Execution Environment
> > > >  -
> > > >
> > > >  In a Linux user space environment, the DPDK application runs as a
> > > > user-space application using the pthread library.
> > > > -PCI information about devices and address space is discovered
> > > > through the /sys kernel interface and through a module called igb_uio.
> > > > +PCI information about devices and address space is discovered
> > > > +through the /sys kernel interface and
> > > > through kernel modules such as uio_pci_generic, or igb_uio.
> > > >  Refer to the UIO: User-space drivers documentation in the Linux
> > > > kernel. This memory is mmap'd in the application.
> > > >
> > > >  The EAL performs physical memory allocation using mmap() in
> > > > hugetlbfs (using huge page sizes to increase performance).
> > > > @@ -134,10 +134,10 @@ PCI Access
> > > >  ~~
> > > >
> > > >  The EAL uses the /sys/bus/pci utilities provided by the kernel to scan 
> > > > the content on the PCI bus.
> > > > -
> > > > -To access PCI memory, a kernel module called igb_uio provides a
> > > > /dev/uioX device file
> > > > +To access PCI memory, a kernel module called uio_pci_generic
> > > > +provides a /dev/uioX device file and resource files in /sys
> > > >  that can be mmap'd to obtain access to PCI address space from the 
> > > > application.
> > > > -It uses the uio kernel feature (userland driver).
> > > > +The DPDK-specific igb_uio module can also be used for this. Both
> > > > +drivers use the uio kernel feature
> > > > (userland driver).
> > > >
> > > >  Per-lcore and Shared Variables
> > > >  ~~
> > > > diff --git
> > > > a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > > > b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > > > index 1f1e04f..a0dd959 100644
> > > > ---
> > > > a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
> > > > +++ b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.r
> > > > +++ st
> > > > @@ -306,12 +306,12 @@ Building and Running the Switching Backend
> > > >  Refer to the *DPDK Getting Started Guide* for more
> > > > information on memory management in the DPDK.
> > > >  In the above command, 4 GB memory is reserved (2048 of 2 MB 
> > > > pages) for DPDK.
> > > >
> > > > -#.  Load igb_uio and bind one Intel NIC controller to igb_uio:
> > > > +#.  Load uio_pci_generic and bind one Intel NIC controller to it:
> > > >
> > > >  .. code-block:: console
> > > >
> > > > -insmod x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
> > > > -python tools/dpdk_nic_bind.py -b igb_uio :09:00:00.0
> > >
> > >
> > > Hi Bruce,
> > >
> > > Should the information about igb_uio be retained alongside the new 
> > > information about
> > uio_pci_generic?
> > >
> > While the answer may not be as clear-cut as with the GSG, why would be 
> > bother covering both here.
> > We already ignore VFIO in these examples.
> > 
> > /Bruce
> 
> Hi Bruce,
> 
> The method of loading is different for both modules, igb_uio uses insmod and 
> uio_

[dpdk-dev] [PATCH] headers: typeof -> __typeof__ to unbreak C++11 code

2015-02-25 Thread Simon Kagstrom
When compiling C++11-code or above (--std=c++11), the build fails with
lots of

  rte_eth_ctrl.h:517:3: note: in expansion of macro RTE_ALIGN
(RTE_ALIGN(RTE_ETH_FLOW_MAX, UINT32_BIT)/UINT32_BIT)
^

When reading the GCC info pages, I get the feeling that __typeof__ is
a better choice, and that indeed works when including the headers in
C++ files (--std=c++11).

There are some typeof()s left in C files, the patch only touches the
public API.

Signed-off-by: Simon Kagstrom 
---
 lib/librte_acl/acl_vect.h  |  8 
 lib/librte_eal/common/include/rte_common.h | 17 +
 lib/librte_eal/common/include/rte_pci.h|  2 +-
 3 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/librte_acl/acl_vect.h b/lib/librte_acl/acl_vect.h
index 6cc1999..de47071 100644
--- a/lib/librte_acl/acl_vect.h
+++ b/lib/librte_acl/acl_vect.h
@@ -52,8 +52,8 @@ extern "C" {
  * hi - contains high 32 bits of given N transitions.
  */
 #defineACL_TR_HILO(P, TC, tr0, tr1, lo, hi)do 
{ \
-   lo = (typeof(lo))_##P##_shuffle_ps((TC)(tr0), (TC)(tr1), 0x88);  \
-   hi = (typeof(hi))_##P##_shuffle_ps((TC)(tr0), (TC)(tr1), 0xdd);  \
+   lo = (__typeof__(lo))_##P##_shuffle_ps((TC)(tr0), (TC)(tr1), 0x88);  \
+   hi = (__typeof__(hi))_##P##_shuffle_ps((TC)(tr0), (TC)(tr1), 0xdd);  \
 } while (0)


@@ -74,8 +74,8 @@ extern "C" {
addr, index_mask, next_input, shuffle_input,\
ones_16, range_base, tr_lo, tr_hi)   do {   \
\
-   typeof(addr) in, node_type, r, t;   \
-   typeof(addr) dfa_msk, dfa_ofs, quad_ofs;\
+   __typeof__(addr) in, node_type, r, t;   \
+   __typeof__(addr) dfa_msk, dfa_ofs, quad_ofs;\
\
t = _##P##_xor_si##S(index_mask, index_mask);   \
in = _##P##_shuffle_epi8(next_input, shuffle_input);\
diff --git a/lib/librte_eal/common/include/rte_common.h 
b/lib/librte_eal/common/include/rte_common.h
index 8ac940c..40c2603 100644
--- a/lib/librte_eal/common/include/rte_common.h
+++ b/lib/librte_eal/common/include/rte_common.h
@@ -43,6 +43,7 @@

 #ifdef __cplusplus
 extern "C" {
+
 #endif

 #include 
@@ -112,7 +113,7 @@ rte_align_floor_int(uintptr_t ptr, uintptr_t align)
  * must be a power-of-two value.
  */
 #define RTE_PTR_ALIGN_FLOOR(ptr, align) \
-   (typeof(ptr))rte_align_floor_int((uintptr_t)ptr, align)
+   (__typeof__(ptr))rte_align_floor_int((uintptr_t)ptr, align)

 /**
  * Macro to align a value to a given power-of-two. The resultant value
@@ -121,7 +122,7 @@ rte_align_floor_int(uintptr_t ptr, uintptr_t align)
  * power-of-two value.
  */
 #define RTE_ALIGN_FLOOR(val, align) \
-   (typeof(val))((val) & (~((typeof(val))((align) - 1))))
+   (__typeof__(val))((val) & (~((__typeof__(val))((align) - 1))))

 /**
  * Macro to align a pointer to a given power-of-two. The resultant
@@ -130,7 +131,7 @@ rte_align_floor_int(uintptr_t ptr, uintptr_t align)
  * must be a power-of-two value.
  */
 #define RTE_PTR_ALIGN_CEIL(ptr, align) \
-   RTE_PTR_ALIGN_FLOOR((typeof(ptr))RTE_PTR_ADD(ptr, (align) - 1), align)
+   RTE_PTR_ALIGN_FLOOR((__typeof__(ptr))RTE_PTR_ADD(ptr, (align) - 1), 
align)

 /**
  * Macro to align a value to a given power-of-two. The resultant value
@@ -139,7 +140,7 @@ rte_align_floor_int(uintptr_t ptr, uintptr_t align)
  * value.
  */
 #define RTE_ALIGN_CEIL(val, align) \
-   RTE_ALIGN_FLOOR(((val) + ((typeof(val)) (align) - 1)), align)
+   RTE_ALIGN_FLOOR(((val) + ((__typeof__(val)) (align) - 1)), align)

 /**
  * Macro to align a pointer to a given power-of-two. The resultant
@@ -257,8 +258,8 @@ rte_align64pow2(uint64_t v)
  * Macro to return the minimum of two numbers
  */
 #define RTE_MIN(a, b) ({ \
-   typeof (a) _a = (a); \
-   typeof (b) _b = (b); \
+   __typeof__ (a) _a = (a); \
+   __typeof__ (b) _b = (b); \
_a < _b ? _a : _b; \
})

@@ -266,8 +267,8 @@ rte_align64pow2(uint64_t v)
  * Macro to return the maximum of two numbers
  */
 #define RTE_MAX(a, b) ({ \
-   typeof (a) _a = (a); \
-   typeof (b) _b = (b); \
+   __typeof__ (a) _a = (a); \
+   __typeof__ (b) _b = (b); \
_a > _b ? _a : _b; \
})

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 3df07e8..bc065d4 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -212,7 +212,7 @@ do {
   \
val = strtoul((in), &end, 16);  \
if (errno != 0 || end[0] != (dlm) || val > (lim))   \
return (-EINVAL); 
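
A minimal sketch of header macros written with __typeof__, assuming a GCC
or Clang compatible compiler (both __typeof__ and the ({ ... }) statement
expression are GNU extensions). The DEMO_* names are hypothetical and only
mirror the shape of the macros in the patch. Unlike plain typeof, the
double-underscore spelling stays available when the compiler is run in a
strict ISO mode such as -std=c99 or -std=c++11.

```c
#include <stdint.h>

/* Minimum of two values; each argument is evaluated exactly once. */
#define DEMO_MIN(a, b) ({ \
		__typeof__(a) _a = (a); \
		__typeof__(b) _b = (b); \
		_a < _b ? _a : _b; \
	})

/* Round val down to a multiple of align (align must be a power of two). */
#define DEMO_ALIGN_FLOOR(val, align) \
	(__typeof__(val))((val) & (~((__typeof__(val))((align) - 1))))

/* Round val up to a multiple of align (align must be a power of two). */
#define DEMO_ALIGN_CEIL(val, align) \
	DEMO_ALIGN_FLOOR(((val) + ((__typeof__(val))(align) - 1)), align)
```

Because the temporaries in DEMO_MIN take the type of their argument, an
expression with side effects such as i++ is applied once rather than twice.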

[dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst

2015-02-25 Thread Jay Rolette
On Wed, Feb 25, 2015 at 6:38 AM, Marc Sune  wrote:

>
> On 25/02/15 13:24, Hemant at freescale.com wrote:
>
>> Hi OIivier
>>  Comments inline.
>> Regards,
>> Hemant
>>
>>  -Original Message-
>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Deme
>>> Sent: 25/Feb/2015 5:44 PM
>>> To: dev at dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst
>>>
>>> Thank you Hemant, I think there might be one issue left with the patch
>>> though.
>>> The alloc_q must initially be filled with mbufs before getting mbuf back
>>> on the
>>> tx_q.
>>>
>>> So the patch should allow rte_kni_rx_burst to check if alloc_q is empty.
>>> If so, it should invoke kni_allocate_mbufs(kni, 0) (to fill the alloc_q
>>> with
>>> MAX_MBUF_BURST_NUM mbufs)
>>>
>>> The patch for rte_kni_rx_burst would then look like:
>>>
>>> @@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf
>>> **mbufs, unsigned num)
>>>
>>>/* If buffers removed, allocate mbufs and then put them into
>>> alloc_q */
>>>if (ret)
>>> -kni_allocate_mbufs(kni);
>>> +  kni_allocate_mbufs(kni, ret);
>>> +  else if (unlikely(kni->alloc_q->write == kni->alloc_q->read))
>>> +  kni_allocate_mbufs(kni, 0);
>>>
>>>  [hemant]  This will introduce a run-time check.
>>
>> I missed to include the other change in the patch.
>>   I am doing it in kni_alloc i.e. initiate the alloc_q with default burst
>> size.
>> kni_allocate_mbufs(ctx, 0);
>>
>> In a way, we are now suggesting to reduce the size of alloc_q to only
>> default burst size.
>>
>
> As an aside comment here, I think that we should allow to tweak the
> userspace <-> kernel queue sizes (rx_q, tx_q, free_q and alloc_q) . Whether
> this should be a build configuration option or a parameter to
> rte_kni_init(), it is not completely clear to me, but I guess
> rte_kni_init() is a better option.
>

rte_kni_init() is definitely a better option. It allows things to be tuned
based on individual system config rather than requiring different builds.


> Having said that, the original mail from Hemant described KNI reporting an
> out-of-memory error. To me this indicates that the pool is incorrectly
> dimensioned. Even if KNI does not pre-allocate in the alloc_q, or only
> partially, under high load you will get this same "out of memory".
>
> We can reduce the usage of buffers by the KNI subsystem in kernel space
> and in userspace, but the kernel will always need a small cache of
> pre-allocated buffers (coming from user-space), since the KNI kernel module
> does not know where to grab the packets from (which pool). So my guess is
> that the dimensioning problem experienced by Hemant would be the same, even
> with the proposed changes.
>
>
>> Can we reach a situation where the kernel is adding packets to tx_q
>> faster than the application is able to dequeue them?
>>
>
> I think so. We cannot control much how the kernel will schedule the KNI
> thread(s), especially if the number of threads in relation to the cores is
> incorrect (not enough), hence we need at least a reasonable amount of
> buffering to prevent early drops caused by those "internal" burst side
> effects.
>
> Marc


Strongly agree with Marc here. We *really* don't want just a single burst
worth of mbufs available to the kernel in alloc_q. That's just asking for
congestion when there's no need for it.

The original problem reported by Olivier is more of a resource tuning
problem than anything else. The number of mbufs you need in the system has
to take into account internal queue depths.

Jay
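
A rough way to see the point about queue depths: the sketch below adds up every place an mbuf can sit at the same time and treats the sum as a lower bound for the pool size. The struct, its field names and the function min_mbufs() are hypothetical and are not part of DPDK; the assumption that all four KNI FIFOs (rx_q, tx_q, alloc_q, free_q) can be full at once is deliberately conservative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative inputs only; none of these names exist in DPDK. */
struct pool_budget {
	uint32_t nb_rx_queues, rx_desc_per_queue; /* NIC RX rings */
	uint32_t nb_tx_queues, tx_desc_per_queue; /* NIC TX rings */
	uint32_t nb_lcores, cache_per_lcore;      /* per-lcore mempool caches */
	uint32_t nb_kni, kni_fifo_depth;          /* depth of each KNI FIFO */
	uint32_t burst;                           /* mbufs in flight per lcore */
};

/* Conservative lower bound: sum of everything that can hold an mbuf. */
static uint32_t min_mbufs(const struct pool_budget *b)
{
	uint32_t n = 0;

	assert(b != NULL);
	n += b->nb_rx_queues * b->rx_desc_per_queue;
	n += b->nb_tx_queues * b->tx_desc_per_queue;
	n += b->nb_lcores * b->cache_per_lcore;
	/* rx_q, tx_q, alloc_q and free_q, assumed full simultaneously */
	n += b->nb_kni * 4u * b->kni_fifo_depth;
	n += b->nb_lcores * b->burst;
	return n;
}
```

With a pool smaller than this bound, allocation failures under load are expected regardless of how alloc_q is refilled.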


[dpdk-dev] [PATCH v2 0/3] timer: fix rte_timer_reset

2015-02-25 Thread Robert Sanford
On Wed, Feb 25, 2015 at 6:16 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Wed, Feb 25, 2015 at 06:02:24AM -0500, Robert Sanford wrote:
>
> >
> > One question about lib rte_timer that's been troubling me for a while:
> > How are skip lists better than BSD-style timer wheels?
> >
> > --
>
> The skip list may not be any better than a timer wheel - it's just what is
> used now, and it does give pretty good performance (insert O(log n) [up to
> a few million timers per core], expiry O(1)).
> Originally in DPDK, the timers were maintained in a regular sorted linked
> list, but that suffered from scalability issues when timers were started,
> or stopped before expiry. The skip list was therefore a big improvement on
> that, and gave us much greater scalability in timers, without any
> regressions in performance. I don't know if anyone has tried to implement
> and benchmark a timer-wheel based rte_timer library replacement. I'd be
> interested to see a performance comparison between the two
> implementations! :-)
>
> Regards,
> /Bruce
>
>
I've wanted to try out the timer-wheels since before the skip-list version,
but it just hasn't made it to the top of my priority list.
The other thing that concerns me about the skip-list implementation is the
extra cache line that all those pointers consume.

--
Thanks,
Robert
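
For readers unfamiliar with the alternative being discussed, here is a minimal sketch of a hashed timer wheel. It is not the rte_timer implementation and not a proposed replacement; all names (timer_wheel, wheel_add, wheel_tick, WHEEL_SLOTS) are made up for illustration. It shows the trade-off: insertion is O(1) and each timer carries a single link pointer, but every tick has to walk the whole bucket for the current slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SLOTS 256u /* power of two, so the slot index is a mask */

struct wheel_timer {
	struct wheel_timer *next;         /* single link per timer */
	uint64_t expiry_tick;             /* absolute tick at which to fire */
	void (*cb)(struct wheel_timer *); /* optional expiry callback */
};

struct timer_wheel {
	uint64_t now;                          /* current tick */
	struct wheel_timer *slot[WHEEL_SLOTS]; /* one bucket list per slot */
};

/* O(1) insert: hash the absolute expiry tick onto a slot. */
static void wheel_add(struct timer_wheel *w, struct wheel_timer *t,
		      uint64_t delay_ticks)
{
	struct wheel_timer **head;

	assert(delay_ticks > 0);
	t->expiry_tick = w->now + delay_ticks;
	head = &w->slot[t->expiry_tick & (WHEEL_SLOTS - 1)];
	t->next = *head;
	*head = t;
}

/* Advance one tick and fire the timers that are due; returns the count. */
static unsigned int wheel_tick(struct timer_wheel *w)
{
	unsigned int fired = 0;
	struct wheel_timer **pp;

	w->now++;
	pp = &w->slot[w->now & (WHEEL_SLOTS - 1)];
	while (*pp != NULL) {
		struct wheel_timer *t = *pp;

		if (t->expiry_tick == w->now) {
			*pp = t->next; /* unlink before calling back */
			t->next = NULL;
			if (t->cb != NULL)
				t->cb(t);
			fired++;
		} else {
			pp = &t->next; /* due on a later wheel rotation */
		}
	}
	return fired;
}
```

Timers further out than WHEEL_SLOTS ticks share a bucket with nearer ones and are skipped until their rotation comes up, which is where a hierarchical wheel or a sorted structure such as the skip list starts to pay off.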


[dpdk-dev] Looking forward to DPDK 2.1

2015-02-25 Thread Butler, Siobhan A
Hi all,

The progress on DPDK 2.0 has been really positive and thanks to everyone for 
contributing and helping to grow our community. We now look onwards to DPDK 2.1 
planning which is due to release at the end of July, and we'd like to inform 
the community of the features that we hope to submit to that release. The 
current list of features, along with brief descriptions, is included below.

This list is provisional and will naturally change over the lifecycle of the 
release, and should be taken as guidance on what we hope to submit, not a 
commitment.

Our aim in providing this information now is to solicit input from the 
community. We'd like to make sure we avoid duplication or conflicts with work 
that others are planning, so please feel free to let the community know of any 
plans that you have for contributions to DPDK in this timeframe. This will 
allow us to build a complete picture and ensure we avoid duplication of effort. 
We have seen great community collaboration in DPDK 2.0 and hope that this will 
increase in 2.1.

I'm sure people will have questions, and will be looking for more information 
on these features. Further details will be provided by the individual 
developers over the next few months. We aim to provide early outlines of the 
features so that we can obtain community feedback as soon as possible. In 
addition, community calls can be arranged to discuss features as required.


2.1 (Q2 2015) DPDK Features:

* Cuckoo hash - Provide a new hash library based on the cuckoo hashing scheme 
(see http://www.cs.cmu.edu/~dongz/papers/cuckooswitch.pdf), which shall 
guarantee worst-case constant lookup time with a better memory utilization, 
compared to the current implementation.

* IEEE1588 Support for i40e - Support IEEE1588 Standard (PTP) for i40e   
Ethernet Controller

* Continued development of PCI Hot Plug - Add support for PCI Hotplug Framework 
in librte_pmd drivers (librte_pmd_ixgbe, librte_pmd_bond, librte_pmd_e1000, 
librte_pmd_i40e, librte_pmd_virtio, librte_pmd_vmxnet3)

* Packet Framework Enhancements - Enhancements to Packet Framework Port and 
Table Libraries, as well as IP Pipeline  Application, to include additional 
statistics, better pipeline encapsulation and CLI simplification.

* i40e DCB (ETS only) - Support DCB Enhanced Transmission Selection algorithm 
with i40e Ethernet controller.

* i40e Mirroring Rule - Add support for port mirroring using i40e Ethernet 
Controller.

* Additional FM10K Features - Add support for additional usage models for FM10K 
including: promiscuous mode, mac vlan filter, statistics, vlan offload (strip, 
insertion, dual), flow control, Tx offload (checksum).

* Dynamic Configuration of RSS on Bonded Slave devices - Support dynamic queue
assignment for RX packets. Implementation for a bonding device will require
multiple RX queue support on a bonding slave and its dynamic reconfiguration.

* VXLAN Offload Sample Application - Provide a sample application to 
demonstrate the usage of VXLAN overlay encapsulation protocol in DPDK.

* Dynamic Memory Management - Add DPDK API's (Rte_free_unused_pages, 
rte_attach_pages, rte_detach_pages, rte_lazy_allocation) for Dynamic Memory 
Management for NFV use cases.
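
As background for the cuckoo hash item in the list above, the following is a textbook two-table cuckoo hash sketch. It is not the planned DPDK library and not the scheme from the linked paper; the names (ck_table, ck_lookup, ck_insert) and the hash constants are assumptions chosen for illustration. It shows why lookup cost is bounded: a key can only ever live in one of two slots.

```c
#include <assert.h>
#include <stdint.h>

#define CK_BUCKETS 64u  /* slots per table; must match the shift below */
#define CK_MAX_KICKS 32 /* displacement limit before giving up */

struct ck_entry { uint32_t key; uint32_t val; int used; };
struct ck_table { struct ck_entry t[2][CK_BUCKETS]; };

/* Two multiplicative hashes; the odd constants are arbitrary. */
static uint32_t ck_hash(int which, uint32_t key)
{
	uint32_t m = which ? 0x85ebca6bu : 0x9e3779b1u;

	return (uint32_t)(key * m) >> 26; /* top 6 bits: [0, CK_BUCKETS) */
}

/* Lookup inspects exactly two slots, whatever the table occupancy. */
static int ck_lookup(const struct ck_table *c, uint32_t key, uint32_t *val)
{
	for (int i = 0; i < 2; i++) {
		const struct ck_entry *e = &c->t[i][ck_hash(i, key)];

		if (e->used && e->key == key) {
			*val = e->val;
			return 1;
		}
	}
	return 0;
}

/* Insert displaces ("kicks") a resident entry to its alternate table. */
static int ck_insert(struct ck_table *c, uint32_t key, uint32_t val)
{
	struct ck_entry cur = { key, val, 1 };
	int which = 0;

	/* Overwrite in place if the key is already present. */
	for (int i = 0; i < 2; i++) {
		struct ck_entry *e = &c->t[i][ck_hash(i, key)];

		if (e->used && e->key == key) {
			e->val = val;
			return 1;
		}
	}
	for (int kick = 0; kick < CK_MAX_KICKS; kick++) {
		struct ck_entry *slot = &c->t[which][ck_hash(which, cur.key)];
		struct ck_entry evicted;

		if (!slot->used) {
			*slot = cur;
			return 1;
		}
		evicted = *slot;
		*slot = cur;
		cur = evicted;
		which ^= 1; /* the evicted entry moves to the other table */
	}
	/* Gave up: the entry left in `cur` has no home. A real table
	 * would rehash or grow here rather than drop it. */
	return 0;
}
```

The cost moves to insertion, which may relocate several entries and can fail outright at high load.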


Thanks,
Siobhan Butler



[dpdk-dev] [PATCH v1 0/2] eal: fix symbol missing in version map

2015-02-25 Thread Thomas Monjalon
2015-02-25 07:30, Neil Horman:
> On Wed, Feb 25, 2015 at 11:39:47AM +0800, Cunming Liang wrote:
> > These two patches fix the compilation error when
> > CONFIG_RTE_BUILD_SHARED_LIB=y.
> > The root cause is that *per_lcore__socket_id* and *rte_sys_gettid* are
> > missing in the version map.
> > Thanks to Tetsuya Mukawa for the notification.
> > 
> > Cunming Liang (2):
> >   eal/linux: fix symbol missing in version map
> >   eal/bsd: fix symbol missing in version map
> > 
> >  lib/librte_eal/bsdapp/eal/rte_eal_version.map   | 2 ++
> >  lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++
> >  2 files changed, 4 insertions(+)
> > 
> 
> NAK
> 
> This is the wrong way to fix this problem. Exporting global variables is
> never a good solution when it can be helped.  Instead, rte_socket_id should be
> made a non-inline function and exported.  Then the definition of
> per_lcore_socket_id can be made private, protecting it from type changes.

Neil, I applied the patches to fix compilation on HEAD.
In case your comment makes sense, a cleanup would be appreciated.

Thanks
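
For reference, the cleanup Neil describes could look roughly like the sketch below: the per-thread variable stays private to one file and only an accessor function is exported. The demo_* names and DEMO_MAX_SOCKETS are hypothetical, and C11 _Thread_local stands in for whatever per-lcore mechanism the EAL actually uses.

```c
#include <assert.h>

#define DEMO_MAX_SOCKETS 8u /* illustrative bound, not a DPDK constant */

/* Private to this file: it would not appear in the version map, so its
 * type can change without breaking the ABI. */
static _Thread_local unsigned int demo_per_lcore_socket_id;

/* Exported, non-inline accessor: the only symbol callers depend on. */
unsigned int demo_socket_id(void)
{
	return demo_per_lcore_socket_id;
}

/* Called from a (hypothetical) thread initialization path. */
void demo_socket_id_set(unsigned int id)
{
	assert(id < DEMO_MAX_SOCKETS);
	demo_per_lcore_socket_id = id;
}
```

The price is a function call where the inline version had a direct thread-local load.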


[dpdk-dev] [PATCH v3 0/3] Mellanox ConnectX-3 PMD

2015-02-25 Thread Adrien Mazarguil
This PMD adds support for Mellanox ConnectX-3-based adapters through the
verbs framework. It relies on external libraries (libibverbs and user space
driver libmlx4) and kernel support to do so.

While these libraries and kernel modules are available on OpenFabrics
Alliance's website [1] and provided by package managers on most
distributions, this PMD requires Ethernet extensions that may not be
supported at the moment (this is a work in progress).

Mellanox OFED [2] includes the necessary support and should be used in the
meantime. For DPDK, only libibverbs, libmlx4 and mlnx-ofed-kernel packages
are required from that distribution.

The following kernel modules must be loaded before using this PMD:

- mlx4_core (hardware driver, does global initialization)
- mlx4_en (Ethernet device driver)
- mlx4_ib (InfiniBand device driver)
- ib_uverbs (user space driver for verbs)

[1] https://www.openfabrics.org/
[2] http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers

v2:
 - Include minor bugfix for VLAN filtering.
 - Add maintainers entry.
 - Add documentation.

v3:
 - Add script and documentation to MAINTAINERS.
 - Make cosmetic changes to copyright notices.
 - Remove unwanted executable bits.
 - Fix coding style and typos found by checkpatch.
 - Add shared library compilation support.

Adrien Mazarguil (3):
  scripts: check features to generate configuration header
  mlx4: new poll mode driver
  doc: add librte_pmd_mlx4 documentation

 MAINTAINERS                                  |    6 +
 config/common_bsdapp                         |   11 +
 config/common_linuxapp                       |   11 +
 doc/guides/prog_guide/index.rst              |    1 +
 doc/guides/prog_guide/mlx4_poll_mode_drv.rst |  326 ++
 doc/guides/prog_guide/source_org.rst         |    1 +
 lib/Makefile                                 |    1 +
 lib/librte_pmd_mlx4/Makefile                 |  121 +
 lib/librte_pmd_mlx4/mlx4.c                   | 4749 ++
 lib/librte_pmd_mlx4/mlx4.h                   |  165 +
 lib/librte_pmd_mlx4/rte_pmd_mlx4_version.map |    4 +
 mk/rte.app.mk                                |    8 +
 scripts/auto-config-h.sh                     |  136 +
 13 files changed, 5540 insertions(+)
 create mode 100644 doc/guides/prog_guide/mlx4_poll_mode_drv.rst
 create mode 100644 lib/librte_pmd_mlx4/Makefile
 create mode 100644 lib/librte_pmd_mlx4/mlx4.c
 create mode 100644 lib/librte_pmd_mlx4/mlx4.h
 create mode 100644 lib/librte_pmd_mlx4/rte_pmd_mlx4_version.map
 create mode 100755 scripts/auto-config-h.sh

-- 
2.1.0



[dpdk-dev] [PATCH v3 1/3] scripts: check features to generate configuration header

2015-02-25 Thread Adrien Mazarguil
This script looks for types, macros and functions in header files using
compilation options found in the environment (CC, CFLAGS, CPPFLAGS) to
define feature macros in a generated header.

Useful in combination with external headers that do not provide such macros.

Signed-off-by: Adrien Mazarguil 
---
 MAINTAINERS              |   1 +
 scripts/auto-config-h.sh | 136 +++
 2 files changed, 137 insertions(+)
 create mode 100755 scripts/auto-config-h.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index 349ad2b..631e8ea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -45,6 +45,7 @@ F: Makefile
 F: config/
 F: mk/
 F: pkg/
+F: scripts/auto-config-h.sh
 F: scripts/depdirs-rule.sh
 F: scripts/gen-build-mk.sh
 F: scripts/gen-config-h.sh
diff --git a/scripts/auto-config-h.sh b/scripts/auto-config-h.sh
new file mode 100755
index 000..4356d7e
--- /dev/null
+++ b/scripts/auto-config-h.sh
@@ -0,0 +1,136 @@
+#!/bin/sh
+#
+#   BSD LICENSE
+#
+#   Copyright 2014-2015 6WIND S.A.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of 6WIND S.A. nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+# Crude script to detect whether particular types, macros and functions are
+# defined by trying to compile a file with a given header. Can be used to
+# perform cross-platform checks since the resulting object file is not
+# executed.
+#
+# Set VERBOSE=1 in the environment to display compiler output and errors.
+#
+# CC, CPPFLAGS, CFLAGS, EXTRA_CPPFLAGS and EXTRA_CFLAGS are taken from the
+# environment.
+#
+# AUTO_CONFIG_CFLAGS may append additional CFLAGS without modifying the
+# above variables.
+
+file=${1:?output file name required (config.h)}
+macro=${2:?output macro name required (HAVE_*)}
+include=${3:?include name required (foo.h)}
+type=${4:?object type required (define, enum, type, field, func)}
+name=${5:?define/type/function name required}
+
+: ${CC:=cc}
+
+temp=/tmp/${0##*/}.$$.c
+
+case $type in
+define)
+   code="\
+#ifndef $name
+#error $name not defined
+#endif
+"
+   ;;
+enum)
+   code="\
+long test = $name;
+"
+   ;;
+type)
+   code="\
+$name test;
+"
+   ;;
+field)
+   code="\
+void test(void)
+{
+   ${name%%.*} test_;
+
+   (void)test_.${name#*.};
+}
+"
+   ;;
+func)
+   code="\
+void (*test)() = (void (*)())$name;
+"
+   ;;
+*)
+   unset error
+   : ${error:?unknown object type \"$type\"}
+   exit
+esac
+
+if [ "${VERBOSE}" = 1 ]
+then
+   err=2
+   out=1
+   eol='
+'
+else
+   exec 3> /dev/null ||
+   exit
+   err=3
+   out=3
+   eol=' '
+fi &&
+printf 'Looking for %s %s in %s.%s' \
+   "${name}" "${type}" "${include}" "${eol}" &&
+printf "\
+#include <%s>
+
+%s
+" "$include" "$code" > "${temp}" &&
+if ${CC} ${CPPFLAGS} ${EXTRA_CPPFLAGS} ${CFLAGS} ${EXTRA_CFLAGS} \
+   ${AUTO_CONFIG_CFLAGS} \
+   -c -o /dev/null "${temp}" 1>&${out} 2>&${err}
+then
+   rm -f "${temp}"
+   printf "\
+#ifndef %s
+#define %s 1
+#endif /* %s */
+
+" "${macro}" "${macro}" "${macro}" >> "${file}" &&
+   printf 'Defining %s.\n' "${macro}"
+else
+   rm -f "${temp}"
+   printf "\
+/* %s is not defined. */
+
+" "${macro}" >> "${file}" &&
+   printf 'Not defining %s.\n' "${macro}"
+fi
+
+exit
-- 
2.1.0
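
To make the script in the patch above more concrete, here is what one probe and its result might look like on the C side. The inputs are example values chosen for illustration, not taken from the patch: include "sys/socket.h", type "field", name "struct sockaddr.sa_family" and macro HAVE_SOCKADDR_SA_FAMILY. The probe is kept under #if 0 because in practice it is a separate temporary file that is only compiled, never linked or run. The function sockaddr_family() is a hypothetical consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

#if 0
/* Probe body generated for type=field (preceded by the #include line).
 * ${name%%.*} yields "struct sockaddr", ${name#*.} yields "sa_family". */
void test(void)
{
	struct sockaddr test_;

	(void)test_.sa_family;
}
#endif

/* Appended to the output header when the probe compiles. */
#ifndef HAVE_SOCKADDR_SA_FAMILY
#define HAVE_SOCKADDR_SA_FAMILY 1
#endif /* HAVE_SOCKADDR_SA_FAMILY */

/* Consumer code branches at compile time on the generated macro. */
static int sockaddr_family(const struct sockaddr *sa)
{
	assert(sa != NULL);
#ifdef HAVE_SOCKADDR_SA_FAMILY
	return (int)sa->sa_family;
#else
	return -1; /* field not available with these headers */
#endif
}
```

When the probe fails to compile, the script writes only a comment to the header, so the #else branch is taken.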



[dpdk-dev] [PATCH v3 2/3] mlx4: new poll mode driver

2015-02-25 Thread Adrien Mazarguil
This PMD manages all variants of Mellanox ConnectX-3 (EN 40, EN 10, Pro EN
40) as well as their virtual functions in SR-IOV context through IB Verbs
(libibverbs) and the dedicated user-space driver (libmlx4).

It is disabled by default due to dependencies on these libraries and only
supports Linux userland at the moment partly because /sys (sysfs) support is
required.

Also claim responsibility in the MAINTAINERS file.

Signed-off-by: Adrien Mazarguil 
Signed-off-by: Olga Shern 
---
 MAINTAINERS                                  |    4 +
 config/common_bsdapp                         |   11 +
 config/common_linuxapp                       |   11 +
 lib/Makefile                                 |    1 +
 lib/librte_pmd_mlx4/Makefile                 |  121 +
 lib/librte_pmd_mlx4/mlx4.c                   | 4749 ++
 lib/librte_pmd_mlx4/mlx4.h                   |  165 +
 lib/librte_pmd_mlx4/rte_pmd_mlx4_version.map |    4 +
 mk/rte.app.mk                                |    8 +
 9 files changed, 5074 insertions(+)
 create mode 100644 lib/librte_pmd_mlx4/Makefile
 create mode 100644 lib/librte_pmd_mlx4/mlx4.c
 create mode 100644 lib/librte_pmd_mlx4/mlx4.h
 create mode 100644 lib/librte_pmd_mlx4/rte_pmd_mlx4_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 631e8ea..d8b0fbc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -220,6 +220,10 @@ Intel fm10k
 M: Jing Chen 
 F: lib/librte_pmd_fm10k/

+Mellanox mlx4
+M: Adrien Mazarguil 
+F: lib/librte_pmd_mlx4/
+
 RedHat virtio
 M: Changchun Ouyang 
 F: lib/librte_pmd_virtio/
diff --git a/config/common_bsdapp b/config/common_bsdapp
index 83a62a6..4bbacaf 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -194,6 +194,17 @@ CONFIG_RTE_LIBRTE_FM10K_DEBUG_DRIVER=n
 CONFIG_RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE=y

 #
+# Compile burst-oriented Mellanox ConnectX-3 (MLX4) PMD
+#
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
+CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N=4
+CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE=0
+CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
+CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS=1
+CONFIG_RTE_LIBRTE_MLX4_COMPAT_VMWARE=1
+
+#
 # Compile burst-oriented Cisco ENIC PMD driver
 #
 CONFIG_RTE_LIBRTE_ENIC_PMD=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2716381..2ea6711 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -192,6 +192,17 @@ CONFIG_RTE_LIBRTE_FM10K_DEBUG_DRIVER=n
 CONFIG_RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE=y

 #
+# Compile burst-oriented Mellanox ConnectX-3 (MLX4) PMD
+#
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
+CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N=4
+CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE=0
+CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
+CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS=1
+CONFIG_RTE_LIBRTE_MLX4_COMPAT_VMWARE=1
+
+#
 # Compile burst-oriented Cisco ENIC PMD driver
 #
 CONFIG_RTE_LIBRTE_ENIC_PMD=y
diff --git a/lib/Makefile b/lib/Makefile
index 7dc12af..3ebd394 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -45,6 +45,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_E1000_PMD) += librte_pmd_e1000
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += librte_pmd_ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e
 DIRS-$(CONFIG_RTE_LIBRTE_FM10K_PMD) += librte_pmd_fm10k
+DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += librte_pmd_mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += librte_pmd_enic
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring
diff --git a/lib/librte_pmd_mlx4/Makefile b/lib/librte_pmd_mlx4/Makefile
new file mode 100644
index 000..de50a5a
--- /dev/null
+++ b/lib/librte_pmd_mlx4/Makefile
@@ -0,0 +1,121 @@
+#   BSD LICENSE
+#
+#   Copyright 2012-2015 6WIND S.A.
+#   Copyright 2012 Mellanox.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of 6WIND S.A. nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; O
