from:"Qiu, Michael"

[dpdk-dev] DPDK and HW offloads

2016-03-23 Thread Qiu, Michael

On 3/22/2016 6:20 PM, Richardson, Bruce wrote:
> On Tue, Mar 22, 2016 at 05:50:28AM +0000, Qiu, Michael wrote:
>> On 3/21/2016 11:27 PM, Kyle Larose wrote:
>>> On Mon, Mar 21, 2016 at 10:52 AM, Bruce Richardson
>>>  wrote:
>>>> On Sun, Mar 20, 2016 at 08:18:57PM +0100, Thomas Monjalon wrote:
>>>>> 2016-03-20 14:17, Zhang, Helin:
>>>>>> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
>>>>>>> 2016-03-18 10:16, Stephen Hemminger:
>>>>>>>> Right now, all those offload features are pretty much unusable in a
>>>>>>>> real product without lots and lots of extra codes and huge bug
>>>>>>>> surface. It bothers me enough that I would recommend removing much of 
>>>>>>>> the
>>>>>>> filter/offload/ptype stuff from DPDK!
>>>>>>>
>>>>>>> One of the biggest challenge is to think about a good filtering API.
>>>>>>> The offloading has some interaction with the mbuf struct.
>>>>>>>
>>>>>>> I would like to suggest rewriting ethdev API by keeping it as is for 
>>>>>>> some time for
>>>>>>> compatibility while creating a new one. What about the prefix 
>>>>>>> dpdk_netdev_ to
>>>>>>> progressively replace rte_eth_dev?
>>>>>> I totally agree with to add new and generic APIs for user applications. 
>>>>>> But I don't
>>>>>> think we need to remove all current APIs. Generic APIs may not support 
>>>>>> all advanced
>>>>>> hardware features, while specific APIs can. Why not support all? One 
>>>>>> generic APIs for
>>>>>> common users, and others APIs for advanced users.
>>>>> Yes we cannot access to every features of a device through generic API.
>>>>> Until now we were trying to add an ethdev API for every features even if 
>>>>> it
>>>>> is used by only one driver.
>>>>> I think we should allow a direct access to the driver by the applications 
>>>>> and
>>>>> work on generic API only for common features.
>>>> Definite +1.
>>>> I think that we need to start pushing driver-specific functionality to get 
>>>> exposed
>>>> via a driver's header files. That allow users who want to extract the max
>>>> functionality from a particular NIC to do so via those APIs calls, while 
>>>> not
>>>> polluting the generic ethdev layer.
>>>>
>>> What sort of requirements on ABI/API compatibility would this place on
>>> the drivers? I would hope that it would be treated like any other
>>> public API within DPDK. I don't think this would be too onerous, but
>>> it would require that the drivers be designed to deal with it. (I.e.
>>> don't just expose any old internal driver function).
>> Why not to implement one simple API with variable arguments, just like
>> syscall ioctl() does. And drivers implement it's specific hardware
>> features with a feature bit param, and other needed variable arguments.
>>
>> Thanks,
>> Michael
> A very much dislike that idea. 
> * It makes the code much harder to read as you have to closely examine all the
>   parameters to work out what a function call is actually meant to do.

It's not a big deal, if we have a document.

> * It makes it much harder to see that you have an implicit dependency on a
>   specific device. Having to include a driver specific header file e.g. 
> i40e.h,
>   and call a function named e.g. i40e_do_magic_stuff(), makes it pretty 
> explicit
>   that you have a dependency on i40e-based hardware

Software does not want to bind to specific hardware I think, what about
the transportability?

> * It prevents the compiler from doing type-checking on parameters and 
> informing
>   you of little inconsistencies.

Maybe, we could do self-check for the parameters I think.

>
> For all these reasons, I prefer the device-specific functions option. However,
> at the same time, we also need to ensure we have a reasonable set of generic
> APIs so that the cases where users are forced to drop down to the lower-level
> device-specific primitives are reduced.

For software, it do not care which hardware it is, it only cares about
what ability you have.

Thanks,
Michael

> Regards,
> /Bruce
>
>>>> On the other hand, I don't like the idea of dpdk_netdev. I think we can 
>>>> work
>>>> within the existing rte_eth_dev framework.
>>>>
>>>> /Bruce
>>>>
>>

[dpdk-dev] [PATCH v2] testpmd: fix build on FreeBSD

2016-03-22 Thread Qiu, Michael

On 3/22/2016 2:51 PM, Marvin Liu wrote:
> Build log:
> /root/dpdk/app/test-pmd/cmdline.c:6687:45: error: no member named
> 's6_addr32' in 'struct in6_addr'
> rte_be_to_cpu_32(res->ip_value.addr.ipv6.s6_addr32[i]);
>
> This is caused by macro "s6_addr32" not defined on FreeBSD and testpmd
> swap big endian parameter to host endian. Move the swap action to i40e
> ethdev will fix this issue.
>
> Fixes: 7b1312891b69 ("ethdev: add IP in GRE tunnel")
>
> Signed-off-by: Marvin Liu 
>
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 9d52b8c..4f3d1e4 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -6678,14 +6678,12 @@ cmd_tunnel_filter_parsed(void *parsed_result,
>  
>   if (res->ip_value.family == AF_INET) {
>   tunnel_filter_conf.ip_addr.ipv4_addr =
> - rte_be_to_cpu_32(res->ip_value.addr.ipv4.s_addr);
> + res->ip_value.addr.ipv4.s_addr;
>   tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV4;
>   } else {
> - int i;
> - for (i = 0; i < 4; i++) {
> - tunnel_filter_conf.ip_addr.ipv6_addr[i] =
> - rte_be_to_cpu_32(res->ip_value.addr.ipv6.s6_addr32[i]);
> - }
> + memcpy(&(tunnel_filter_conf.ip_addr.ipv6_addr),
> + &(res->ip_value.addr.ipv6),
> + sizeof(struct in6_addr));
>   tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV6;
>   }
>  
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index 10e0d38..43c2d5c 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -6015,6 +6015,7 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
>   uint8_t add)
>  {
>   uint16_t ip_type;
> + uint32_t ipv4_addr;
>   uint8_t i, tun_type = 0;
>   /* internal varialbe to convert ipv6 byte order */
>   uint32_t convert_ipv6[4];
> @@ -6040,14 +6041,15 @@ i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
>   pfilter->inner_vlan = rte_cpu_to_le_16(tunnel_filter->inner_vlan);
>   if (tunnel_filter->ip_type == RTE_TUNNEL_IPTYPE_IPV4) {
>   ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV4;
> + ipv4_addr = rte_be_to_cpu_32(tunnel_filter->ip_addr.ipv4_addr);

As I checked "ipv4_addr" is the host Endian, so does it need to covert
again? Maybe I'm wrong.

Thanks,
Michael
>   rte_memcpy(&pfilter->ipaddr.v4.data,
> - 
> &rte_cpu_to_le_32(tunnel_filter->ip_addr.ipv4_addr),
> + &rte_cpu_to_le_32(ipv4_addr),
>   sizeof(pfilter->ipaddr.v4.data));
>   } else {
>   ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV6;
>   for (i = 0; i < 4; i++) {
>   convert_ipv6[i] =
> - rte_cpu_to_le_32(tunnel_filter->ip_addr.ipv6_addr[i]);
> + 
> rte_cpu_to_le_32(rte_be_to_cpu_32(tunnel_filter->ip_addr.ipv6_addr[i]));
>   }
>   rte_memcpy(&pfilter->ipaddr.v6.data, &convert_ipv6,
>   sizeof(pfilter->ipaddr.v6.data));

[dpdk-dev] [PATCH] ixgbe: add TX queue number check

2016-03-22 Thread Qiu, Michael

On 3/22/2016 4:10 PM, Wenzhuo Lu wrote:
> Ixgbe supports at most 128 TX queues. But in none VT nor DCB mode
> the queues 64 ~ 127 should not be used. Ixgbe doesn't do any check
> about that. If a queue larger than 64 is used, the TX packets will
> be dropped silently. It's hard to debug.
> This check is added to forbid using queue number larger than 64
> during device configuration, so the user can know the problem as
> early as possible.
>
> Signed-off-by: Wenzhuo Lu 
> Reported-by: Antonio Fischetti 
> ---

Acked-by: Michael Qiu 

>  drivers/net/ixgbe/ixgbe_ethdev.c | 11 ++-
>  drivers/net/ixgbe/ixgbe_ethdev.h |  1 +
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 5371720..dd6d00e 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -1862,7 +1862,7 @@ ixgbe_check_mq_mode(struct rte_eth_dev *dev)
>  {
>   struct rte_eth_conf *dev_conf = &dev->data->dev_conf;
>   uint16_t nb_rx_q = dev->data->nb_rx_queues;
> - uint16_t nb_tx_q = dev->data->nb_rx_queues;
> + uint16_t nb_tx_q = dev->data->nb_tx_queues;
>  
>   if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
>   /* check multi-queue mode */
> @@ -2002,6 +2002,15 @@ ixgbe_check_mq_mode(struct rte_eth_dev *dev)
>   return -EINVAL;
>   }
>   }
> +
> + if (dev_conf->txmode.mq_mode == ETH_MQ_TX_NONE) {
> + if (nb_tx_q > IXGBE_NONE_VT_DCB_MAX_TXQ_NB) {
> + PMD_INIT_LOG(ERR,
> +  "None VT nor DCB, nb_tx_q > %d.",
> +  IXGBE_NONE_VT_DCB_MAX_TXQ_NB);
> + return -EINVAL;
> + }
> + }
>   }
>   return 0;
>  }
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h 
> b/drivers/net/ixgbe/ixgbe_ethdev.h
> index 5c3aa16..50ee73f 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.h
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.h
> @@ -61,6 +61,7 @@
>  #define IXGBE_MAX_RX_QUEUE_NUM   128
>  #define IXGBE_VMDQ_DCB_NB_QUEUES IXGBE_MAX_RX_QUEUE_NUM
>  #define IXGBE_DCB_NB_QUEUES  IXGBE_MAX_RX_QUEUE_NUM
> +#define IXGBE_NONE_VT_DCB_MAX_TXQ_NB 64
>  
>  #ifndef NBBY
>  #define NBBY 8   /* number of bits in a byte */

[dpdk-dev] Reg: promiscuous mode on VF

2016-03-22 Thread Qiu, Michael

Yes, we could let ovs using 82599 VF to do rx/tx. I don't know what's
your l2 bridge, but since ovs could work I think your bridge also could
work. But I only tested with one VF.

Make sure below two patches (bifurcate driver) are included in your kernel:

_https://patchwork.ozlabs.org/patch/476511/_
_https://patchwork.ozlabs.org/patch/476516/_

Mostly, if your kernel version in 4.2 or newer, it should be included.

After you create VF, before you passthrough the VF to guest:

(vf +1) << 32 + queue-index,


 1. where vf is the VF index starting from 0
 2. the queue-index is 0 if multi-queue support is not turned on, and
this value is [0,1] if multiple-queue is turned on


echo 1 > /sys/bus/pci/devices/\:05\:00.0/sriov_numvfs
ifconfig $(PF_INTF) up
ifconfig $(VF0_INFT) up
ip link set $(PF_INTF) promisc on
ethtool -K $(PF_INTF) ntuple on
ethtool -N $(PF_INTF) flow-type udp4 dst-port 4789 action 0x1  
(VF0 queue 0)

Here we using flow director to all let packets according to the rules to
the VF, But I don't know if it could let the packets to other VFs at the
same time.

Thanks,
Michael

On 3/17/2016 2:43 PM, bharath paulraj wrote:
> Hi Lu, Helin, Greg,
>
>   Many thanks for your response, which is really quick. Now, If I want
> to implement L2 bridging with Intel virtualization technologies, using
> 82599 controller, then Michael is my only hope, as getting the new
> kernel versions and upstream support will take considerable amount of
> time.
>
>Michael, Could you please share your experience on L2 bridging
> using Intel virtualization technologies. 
>
> Thanks,
> Bharath
>
> On Wed, Mar 16, 2016 at 9:40 PM, Rose, Gregory V
> mailto:gregory.v.rose at intel.com>> wrote:
>
> Intel has not supported promiscuous mode for virtual functions due
> to the security concerns mentioned below.
>
> There will be upstream support in an upcoming Linux kernel for
> setting virtual functions as "trusted" and when that is available
> then Intel will allow virtual functions to enter unicast
> promiscuous mode on those Ethernet controllers that support
> promiscuous mode for virtual functions in the HW/FW.  Be aware
> that not all Intel Ethernet controllers have support for unicast
> promiscuous mode for virtual functions.  The only currently
> released product that does is the X710/XL710.
>
> The key take away is that unicast promiscuous mode for X710/XL710
> virtual functions requires Linux kernel support, iproute2 package
> support and driver support.  Only when all three of these are in
> place will the feature work.
>
> Thanks,
>
> - Greg
>
> -Original Message-
> From: Zhang, Helin
> Sent: Wednesday, March 16, 2016 9:04 AM
> To: bharath paulraj  <mailto:bharathpaul at gmail.com>>; Lu, Wenzhuo  <mailto:wenzhuo.lu at intel.com>>; Rowden, Aaron F
> mailto:aaron.f.rowden at intel.com>>;
> Rose, Gregory V  <mailto:gregory.v.rose at intel.com>>
> Cc: dev at dpdk.org <mailto:dev at dpdk.org>; Qiu, Michael
> mailto:michael.qiu at intel.com>>; Jayakumar,
> Muthurajan  <mailto:muthurajan.jayakumar at intel.com>>
> Subject: RE: [dpdk-dev] Reg: promiscuous mode on VF
>
> Hi Bharath
>
> For your question of "why intel does not support unicast
> promiscuos mode?", I'd ask Aaron or Greg to give answers.
> Thank you very much!
>
> Regards,
> Helin
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org
> <mailto:dev-bounces at dpdk.org>] On Behalf Of bharath paulraj
> > Sent: Wednesday, March 16, 2016 11:29 PM
> > To: Lu, Wenzhuo
> > Cc: dev at dpdk.org <mailto:dev at dpdk.org>
> > Subject: Re: [dpdk-dev] Reg: promiscuous mode on VF
> >
> > Hi Lu,
> >
> > Many thanks for your response. Again I have few more queries.
> > If VF unicast promiscuous mode is not supported then can't we
> > implement a Layer 2 bridging functionality using intel
> virtualization
> > technologies? Or Is there any other way, say tweeking some hardware
> > registers or drivers, which may help us in implementing Layer 2
> bridging.
> > Also I would like to know, why intel does not support unicast
> promiscuos mode?
> > It could have been optional register settings and user should
> have had
> > a previleage to set or unset it. Besides, security reasons, is there
> > any other big reason why Intel does not support this?
> >
> > Thanks,
> &

[dpdk-dev] DPDK and HW offloads

2016-03-22 Thread Qiu, Michael

On 3/21/2016 11:27 PM, Kyle Larose wrote:
> On Mon, Mar 21, 2016 at 10:52 AM, Bruce Richardson
>  wrote:
>> On Sun, Mar 20, 2016 at 08:18:57PM +0100, Thomas Monjalon wrote:
>>> 2016-03-20 14:17, Zhang, Helin:
 From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> 2016-03-18 10:16, Stephen Hemminger:
>> Right now, all those offload features are pretty much unusable in a
>> real product without lots and lots of extra codes and huge bug
>> surface. It bothers me enough that I would recommend removing much of the
> filter/offload/ptype stuff from DPDK!
>
> One of the biggest challenge is to think about a good filtering API.
> The offloading has some interaction with the mbuf struct.
>
> I would like to suggest rewriting ethdev API by keeping it as is for some 
> time for
> compatibility while creating a new one. What about the prefix 
> dpdk_netdev_ to
> progressively replace rte_eth_dev?
 I totally agree with to add new and generic APIs for user applications. 
 But I don't
 think we need to remove all current APIs. Generic APIs may not support all 
 advanced
 hardware features, while specific APIs can. Why not support all? One 
 generic APIs for
 common users, and others APIs for advanced users.
>>> Yes we cannot access to every features of a device through generic API.
>>> Until now we were trying to add an ethdev API for every features even if it
>>> is used by only one driver.
>>> I think we should allow a direct access to the driver by the applications 
>>> and
>>> work on generic API only for common features.
>> Definite +1.
>> I think that we need to start pushing driver-specific functionality to get 
>> exposed
>> via a driver's header files. That allow users who want to extract the max
>> functionality from a particular NIC to do so via those APIs calls, while not
>> polluting the generic ethdev layer.
>>
> What sort of requirements on ABI/API compatibility would this place on
> the drivers? I would hope that it would be treated like any other
> public API within DPDK. I don't think this would be too onerous, but
> it would require that the drivers be designed to deal with it. (I.e.
> don't just expose any old internal driver function).

Why not to implement one simple API with variable arguments, just like
syscall ioctl() does. And drivers implement it's specific hardware
features with a feature bit param, and other needed variable arguments.

Thanks,
Michael
>> On the other hand, I don't like the idea of dpdk_netdev. I think we can work
>> within the existing rte_eth_dev framework.
>>
>> /Bruce
>>

[dpdk-dev] [PATCH 1/2 v2] fm10k: Add Atwood Channel Support

2016-03-09 Thread Qiu, Michael

Hi, Bruce

What about this patch?

Thanks,
Michael

On 2/4/2016 4:36 PM, Qiu, Michael wrote:
> Atwood Channel is intel 25G NIC, and this patch add the support
> in DPDK.
>
> Signed-off-by: Michael Qiu
> Acked-by: John McNamara 
> ---
>  drivers/net/fm10k/base/fm10k_osdep.h| 4 
>  lib/librte_eal/common/include/rte_pci_dev_ids.h | 2 ++
>  2 files changed, 6 insertions(+)
>
> diff --git a/drivers/net/fm10k/base/fm10k_osdep.h 
> b/drivers/net/fm10k/base/fm10k_osdep.h
> index 6852ef0..9cb46ff 100644
> --- a/drivers/net/fm10k/base/fm10k_osdep.h
> +++ b/drivers/net/fm10k/base/fm10k_osdep.h
> @@ -48,6 +48,10 @@ POSSIBILITY OF SUCH DAMAGE.
>  #define BOULDER_RAPIDS_HW
>  #endif
>  
> +#ifndef ATWOOD_CHANNEL_HW
> +#define ATWOOD_CHANNEL_HW
> +#endif
> +
>  #define STATIC  static
>  #define DEBUGFUNC(F)DEBUGOUT(F "\n");
>  #define DEBUGOUT(S, args...)PMD_DRV_LOG_RAW(DEBUG, S, ##args)
> diff --git a/lib/librte_eal/common/include/rte_pci_dev_ids.h 
> b/lib/librte_eal/common/include/rte_pci_dev_ids.h
> index e31b934..cb0d177 100644
> --- a/lib/librte_eal/common/include/rte_pci_dev_ids.h
> +++ b/lib/librte_eal/common/include/rte_pci_dev_ids.h
> @@ -530,9 +530,11 @@ RTE_PCI_DEV_ID_DECL_I40E(PCI_VENDOR_ID_INTEL, 
> I40E_DEV_ID_10G_BASE_T_X722)
>  
>  #define FM10K_DEV_ID_PF   0x15A4
>  #define FM10K_DEV_ID_SDI_FM10420_QDA2 0x15D0
> +#define FM10K_DEV_ID_SDI_FM10420_DA2  0x15D5
>  
>  RTE_PCI_DEV_ID_DECL_FM10K(PCI_VENDOR_ID_INTEL, FM10K_DEV_ID_PF)
>  RTE_PCI_DEV_ID_DECL_FM10K(PCI_VENDOR_ID_INTEL, FM10K_DEV_ID_SDI_FM10420_QDA2)
> +RTE_PCI_DEV_ID_DECL_FM10K(PCI_VENDOR_ID_INTEL, FM10K_DEV_ID_SDI_FM10420_DA2)
>  
>  /** Virtual IGB devices from e1000_hw.h **/
>

[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api

2016-03-04 Thread Qiu, Michael

On 3/2/2016 10:48 AM, Yuanhan Liu wrote:
> On Wed, Mar 02, 2016 at 02:10:14AM +0000, Qiu, Michael wrote:
>> On 3/1/2016 5:46 PM, Santosh Shukla wrote:
>>> On Tue, Mar 1, 2016 at 2:41 PM, Qiu, Michael  
>>> wrote:
>>>> On 2/26/2016 4:53 PM, Santosh Shukla wrote:
>>>>> Check cpuflag macro before using vectored api.
>>>>> -virtio_recv_pkts_vec() uses _sse3__ simd instruction for now so added 
>>>>> cpuflag.
>>>>> - Also wrap other vectored freind api ie..
>>>>> 1) virtqueue_enqueue_recv_refill_simple
>>>>> 2) virtio_rxq_vec_setup
>>>>>
>>>>> todo:
>>>>> 1) Move virtio_recv_pkts_vec() implementation to
>>>>>drivers/virtio/virtio_vec_.h file.
>>>>> 2) Remove use_simple_rxtx flag, so that virtio/virtio_vec_.h
>>>>>files to provide vectored/non-vectored rx/tx apis.
>>>>>
>>>>> Signed-off-by: Santosh Shukla 
>>>>> ---
>>>>> - v1: This is a rework of patch [1].
>>>>> Note: This patch will let non-x86 arch to use virtio pmd.
>>>>>
>>>>> [1] http://dpdk.org/dev/patchwork/patch/10429/
>>>>>
>>>>>  drivers/net/virtio/virtio_rxtx.c|   16 +++-
>>>>>  drivers/net/virtio/virtio_rxtx.h|2 ++
>>>>>  drivers/net/virtio/virtio_rxtx_simple.c |   11 ++-
>>>>>  3 files changed, 27 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/virtio/virtio_rxtx.c 
>>>>> b/drivers/net/virtio/virtio_rxtx.c
>>>>> index 41a1366..ec0b8de 100644
>>>>> --- a/drivers/net/virtio/virtio_rxtx.c
>>>>> +++ b/drivers/net/virtio/virtio_rxtx.c
>>>>> @@ -67,7 +67,9 @@
>>>>>  #define VIRTIO_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
>>>>>   ETH_TXQ_FLAGS_NOOFFLOADS)
>>>>>
>>>>> +#ifdef RTE_MACHINE_CPUFLAG_SSSE3
>>>>>  static int use_simple_rxtx;
>>>>> +#endif
>>>>>
>>>>>
>>>> I don't think so much #ifdef ... #endif in *.c file is a good choice.
>>>> Would you consider let it only in header file like:
>>>>
>>>> in drivers/net/virtio/virtio_rxtx.h
>>>>
>>>> [...]
>>>>
>>>> #ifdef RTE_MACHINE_CPUFLAG_SSSE3
>>>> int virtio_rxq_vec_setup(struct virtqueue *rxq);
>>>>
>>>> int virtqueue_enqueue_recv_refill_simple(struct virtqueue *vq,
>>>> struct rte_mbuf *m);
>>>> #else
>>>> int virtio_rxq_vec_setup(__rte_unused struct virtqueue *rxq) {return -1;}
>>>> int virtqueue_enqueue_recv_refill_simple(__rte_unused struct virtqueue *vq,
>>>>  __rte_unused struct rte_mbuf *m) {
>>>> return -1;
>>>> }
>>>> #endif
>>>>
>>>> and remove most #ifdef ... #endif in *.c file.
>>>>
>>> I guess, above approach wont work for non-x86 arch, ad those func are
>>> dummy, right? also code wont build for arm/non-86 arch because
>>> tx/rx_pkt_burst callback will be using x86 specific virtio vec rx/tx
>>> api.
>> You may right, but you really need to reduce the #ifdef in *.c files.
> In general, yes. But for this case, no: those vec stuff are for
> platforms that support it. For other platforms, we should not
> invoke a dummy function like virtio_rxq_vec_setup() at all.
>
> The right way to go is to add another wrapper beyond the vector
> stuff, something like:
>
>   virtio_rxq_setup()
>   {
>
>   if (has_vec_support)
>   virtio_rxq_vec_setup();
>   else
>   virtio_rxq_generic_setup();
>   }

Actually, we could call vec first and if set up failed, fall back to
simple mode. Thus we could use dummy func, and it could make lift simple.

Thanks,
Michael
> Where virtio_rxq_vec_setup() could have a per-arch implementation,
> say for X86, or ARM.
>
> It touchs more code, but for now, I'd like to make it simple first.
> With the virtio_rxtx_simple.c isolated from Makefile, there aren't
> many #ifdef after all.
>
>   --yliu
>

[dpdk-dev] [PATCH 1/4] ixgbe: support UDP tunnel add/del

2016-03-03 Thread Qiu, Michael

On 1/11/2016 3:08 PM, Wenzhuo Lu wrote:
> Add UDP tunnel add/del support on ixgbe. Now it only support
> VxLAN port configuration.
> Although the VxLAN port has a default value 4789, it can be
> changed. We support VxLAN port configuration to meet the
> change.
> Note, the default value of VxLAN port in ixgbe NICs is 0. So
> please set it when using VxLAN off-load.
>
> Signed-off-by: Wenzhuo Lu 
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 93 
> 
>  1 file changed, 93 insertions(+)
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 4c4c6df..381cbad 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -337,6 +337,10 @@ static int ixgbe_timesync_read_time(struct rte_eth_dev 
> *dev,
>  struct timespec *timestamp);
>  static int ixgbe_timesync_write_time(struct rte_eth_dev *dev,
>  const struct timespec *timestamp);
> +static int ixgbe_dev_udp_tunnel_add(struct rte_eth_dev *dev,
> + struct rte_eth_udp_tunnel *udp_tunnel);
> +static int ixgbe_dev_udp_tunnel_del(struct rte_eth_dev *dev,
> + struct rte_eth_udp_tunnel *udp_tunnel);
>  
>  /*
>   * Define VF Stats MACRO for Non "cleared on read" register
> @@ -495,6 +499,8 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
>   .timesync_adjust_time = ixgbe_timesync_adjust_time,
>   .timesync_read_time   = ixgbe_timesync_read_time,
>   .timesync_write_time  = ixgbe_timesync_write_time,
> + .udp_tunnel_add   = ixgbe_dev_udp_tunnel_add,
> + .udp_tunnel_del   = ixgbe_dev_udp_tunnel_del,
>  };
>  
>  /*
> @@ -6191,6 +6197,93 @@ ixgbe_dev_get_dcb_info(struct rte_eth_dev *dev,
>   return 0;
>  }
>  
> +#define DEFAULT_VXLAN_PORT 4789
> +
> +/* on x550, there's only one register for VxLAN UDP port.
> + * So, we cannot add or del the port. We only update it.
> + */
> +static int
> +ixgbe_update_vxlan_port(struct ixgbe_hw *hw,
> + uint16_t port)
> +{
> + IXGBE_WRITE_REG(hw, IXGBE_VXLANCTRL, port);
> + IXGBE_WRITE_FLUSH(hw);
> +
> + return 0;
> +}
> +
> +/* Add UDP tunneling port */
> +static int
> +ixgbe_dev_udp_tunnel_add(struct rte_eth_dev *dev,
> +  struct rte_eth_udp_tunnel *udp_tunnel)
> +{
> + int ret = 0;
> + struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> +
> + if (hw->mac.type != ixgbe_mac_X550 &&
> + hw->mac.type != ixgbe_mac_X550EM_x) {
> + return -ENOTSUP;
> + }
> +
> + if (udp_tunnel == NULL)
> + return -EINVAL;
> +
> + switch (udp_tunnel->prot_type) {
> + case RTE_TUNNEL_TYPE_VXLAN:
> + /* cannot add a port, update the port value */
> + ret = ixgbe_update_vxlan_port(hw, udp_tunnel->udp_port);
> + break;
> +
> + case RTE_TUNNEL_TYPE_GENEVE:
> + case RTE_TUNNEL_TYPE_TEREDO:
> + PMD_DRV_LOG(ERR, "Tunnel type is not supported now.");
> + ret = -1;
> + break;
> +
> + default:
> + PMD_DRV_LOG(ERR, "Invalid tunnel type");
> + ret = -1;
> + break;
> + }
> +
> + return ret;
> +}
> +
> +/* Remove UDP tunneling port */
> +static int
> +ixgbe_dev_udp_tunnel_del(struct rte_eth_dev *dev,
> +  struct rte_eth_udp_tunnel *udp_tunnel)
> +{
> + int ret = 0;
> + struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> +
> + if (hw->mac.type != ixgbe_mac_X550 &&
> + hw->mac.type != ixgbe_mac_X550EM_x) {
> + return -ENOTSUP;
> + }
> +
> + if (udp_tunnel == NULL)
> + return -EINVAL;
> +
> + switch (udp_tunnel->prot_type) {
> + case RTE_TUNNEL_TYPE_VXLAN:
> + /* cannot del the port, reset it to default */
> + ret = ixgbe_update_vxlan_port(hw, DEFAULT_VXLAN_PORT);
> + break;
> + case RTE_TUNNEL_TYPE_GENEVE:
> + case RTE_TUNNEL_TYPE_TEREDO:
> + PMD_DRV_LOG(ERR, "Tunnel type is not supported now.");
> + ret = -1;

Better to use the -EINVAL or other, mixed style always not good.

Thanks,
Michael
> + break;
> + default:
> + PMD_DRV_LOG(ERR, "Invalid tunnel type");
> + ret = -1;
> + break;
> + }
> +
> + return ret;
> +}
> +
>  static struct rte_driver rte_ixgbe_driver = {
>   .type = PMD_PDEV,
>   .init = rte_ixgbe_pmd_init,

[dpdk-dev] New driver (large patch) question.

2016-03-03 Thread Qiu, Michael

On 3/3/2016 7:11 AM, Stephen Hurd wrote:
> On Wed, Mar 2, 2016 at 2:15 PM, Thomas Monjalon 
> wrote:
>
>>> The comments in it are the only publicly available
>>> documentation on the hardware I'm aware of.
>> So you must keep the comments.
>>
> That's my goal, but the comments are well over the 300k limit.
>
>
>>> The driver itself doesn't have a lot of optional features in it, it's the
>>> header file that's too big.
>> It is big because there are many different things.
>> You can split the file in different patches.
>> Examples:
>> - a patch for RSS will bring the hardware structures for RSS
>> - a patch for the stats will bring the hardware stats structures
>> etc
>>
> Should I split additional definitions/documentation that's not currently
> used in the driver as well?  Or should it stay as only enough to document
> what the driver already does?
>
> The header file is expected to be publicly released in the future, so I
> tried to keep it as close to the original as possible.  I'm not strongly
> attached to this approach, but it does make it easier to support future
> firmware releases.
>
> It's a fairly work-intensive project to deconstruct the existing driver
> into a series of small patches that work at each step, is this a hard
> requirement? (if so, I'd better get cracking)

Does original header file has it's own commit log(like it in other
project)? If yes, it could make your life simpler.

Thanks,
Michael 
> PS: please answer inline
> Sorry, $work just switched us to GMail and I'm still learning the ropes.
>

[dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment

2016-03-03 Thread Qiu, Michael

On 3/3/2016 11:36 AM, Dhana Eadala wrote:
> We found a problem in dpdk-2.2 using under multi-process environment.
> Here is the brief description how we are using the dpdk:
>
> We have two processes proc1, proc2 using dpdk. These proc1 and proc2 are two 
> different compiled binaries.
> proc1 is started as primary process and proc2 as secondary process.
>
> proc1:
> Calls srcHash = rte_hash_create("src_hash_name") to create rte_hash structure.
> As part of this, this api initalized the rte_hash structure and set the 
> srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1 address space.
>
> proc2:
> calls srcHash =  rte_hash_find_existing("src_hash_name"). This returns the 
> rte_hash created by proc1.
> This srcHash->rte_hash_cmp_eq still points to the address of memcmp() from 
> proc1 address space.
> Later proc2  calls rte_hash_lookup_with_hash(srcHash, (const void*) &key, 
> key.sig);
> Under the hood, rte_hash_lookup_with_hash() invokes 
> __rte_hash_lookup_with_hash(), which in turn calls h->rte_hash_cmp_eq(key, 
> k->key, h->key_len).
> This leads to a crash as h->rte_hash_cmp_eq is an address from proc1 address 
> space and is invalid address in proc2 address space.
>
> We found, from dpdk documentation, that
>
> "
>  The use of function pointers between multiple processes running based of 
> different compiled
>  binaries is not supported, since the location of a given function in one 
> process may be different to
>  its location in a second. This prevents the librte_hash library from 
> behaving properly as in a  multi-
>  threaded instance, since it uses a pointer to the hash function internally.
>
>  To work around this issue, it is recommended that multi-process applications 
> perform the hash
>  calculations by directly calling the hashing function from the code and then 
> using the
>  rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions instead of 
> the functions which do
>  the hashing internally, such as rte_hash_add()/rte_hash_lookup().
> "
>
> We did follow the recommended steps by invoking rte_hash_lookup_with_hash().
> It was no issue up to and including dpdk-2.0. In later releases started 
> crashing because rte_hash_cmp_eq is introduced in dpdk-2.1
>
> We fixed it with the following patch and would like to submit the patch to 
> dpdk.org.
> Patch is created such that, if anyone wanted to use dpdk in multi-process 
> environment with function pointers not shared, they need to
> define RTE_LIB_MP_NO_FUNC_PTR in their Makefile. Without defining this flag 
> in Makefile, it works as it is now.
>
> Signed-off-by: Dhana Eadala 
> ---
>

Some comments:

1.  your commit log need to refactor, better to limit every line less
than 80 character.

2. I think you could add the ifdef here in
lib/librte_hash/rte_cuckoo_hash.c :
/*
 * If x86 architecture is used, select appropriate compare function,
 * which may use x86 instrinsics, otherwise use memcmp
 */
#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) ||\
 defined(RTE_ARCH_X86_X32) || defined(RTE_ARCH_ARM64)
/* Select function to compare keys */
switch (params->key_len) {
case 16:
h->rte_hash_cmp_eq = rte_hash_k16_cmp_eq;
break;
[...]
break;
default:
/* If key is not multiple of 16, use generic memcmp */
h->rte_hash_cmp_eq = memcmp;
}
#else
h->rte_hash_cmp_eq = memcmp;
#endif

So that could remove other #ifdef in those lines.

3. I don't think ask others to write RTE_LIB_MP_NO_FUNC_PTR in makefile
is a good idea, if you really want to do that, please add a doc so that
others could know it.

Thanks,
Michael

[dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process

2016-03-02 Thread Qiu, Michael

On 3/2/2016 2:57 AM, Dhananjaya Reddy Eadala wrote:
> Hi
>
> We found a problem in dpdk-2.2 using under multi-process environment.
> Here is the brief description how we are using the dpdk:
>
> We have two processes proc1, proc2 using dpdk. These proc1 and proc2 are
> two different compiled binaries.
> proc1 is started as primary process and proc2 as secondary process.
>
> proc1:
> Calls srcHash = rte_hash_create("src_hash_name") to create rte_hash
> structure.
> As part of this, this api initalized the rte_hash structure and set the
> srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1 address
> space.
>
> proc2:
> calls srcHash =  rte_hash_find_existing("src_hash_name"). This returns the
> rte_hash created by proc1.
> This srcHash->rte_hash_cmp_eq still points to the address of memcmp() from
> proc1 address space.
> Later proc2  calls rte_hash_lookup_with_hash(srcHash, (const void*) &key,
> key.sig);
> Under the hood, rte_hash_lookup_with_hash() invokes
> __rte_hash_lookup_with_hash(), which in turn calls h->rte_hash_cmp_eq(key,
> k->key, h->key_len).
> This leads to a crash as h->rte_hash_cmp_eq is an address from proc1
> address space and is invalid address in proc2 address space.
>
> We found, from dpdk documentation, that
> "
>  The use of function pointers between multiple processes running based of
> different compiled
>  binaries is not supported, since the location of a given function in one
> process may be different to
>  its location in a second. This prevents the librte_hash library from
> behaving properly as in a  multi-
>  threaded instance, since it uses a pointer to the hash function internally.
>
>
>  To work around this issue, it is recommended that multi-process
> applications perform the hash
>  calculations by directly calling the hashing function from the code and
> then using the
>  rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions instead of
> the functions which do
>  the hashing internally, such as rte_hash_add()/rte_hash_lookup().
> "
>
> We did follow the recommended steps by invoking rte_hash_lookup_with_hash().
> It was no issue up to and including dpdk-2.0. In later releases started
> crashing because rte_hash_cmp_eq is introduced in dpdk-2.1
>
> We fixed it with the following patch and would like to submit the patch to
> dpdk.org.

Could you send the patch in the mail?

Learn how to send a patch:

http://www.dpdk.org/dev

Thanks,
Michael
> Patch is created such that, if anyone wanted to use dpdk in multi-process
> environment with function pointers not shared, they need to
> define RTE_LIB_MP_NO_FUNC_PTR in their Makefile. Without defining this flag
> in Makefile, it works as it is now.
>
>
> Please find here attached is the patch file.
>
> Thanks
> Dhana
>

[dpdk-dev] [PATCH v9 0/2] Add VHOST PMD

2016-03-02 Thread Qiu, Michael

On 3/1/2016 10:19 AM, Tetsuya Mukawa wrote:
> On 2016/03/01 11:00, Qiu, Michael wrote:
>> On 2/26/2016 4:36 PM, Tetsuya Mukawa wrote:
>>> On 2016/02/26 13:29, Tetsuya Mukawa wrote:
>>>>

[...]

>>>>
>>>> BTW, I have set the frontend mergeable off.
>>>> I have checked below cases.
>>>>  - Case1: Disable mergeable feature in virtio-net PMD.
>>>>  - Case2: Disable mergeable feature in virtio-net PMD and use
>>>> '--txqflags=0xf01' option to use simple ring deploying.
>>>> Both cases,  I still cannot see the drop.
>>>>
>>>> Anyway, I will send a few patch-series to determine the cause of drop.
>>>> So, could you please apply them and check the performance to determine
>>>> which cause the drop?
>>> Hi Michael,
>>>
>>> I may find what causes the drop.
>>> Could you please restart testpmd on guest when you see the drop, then
>>> check performance again?
>>>
>>> I guess the drop will occur only first time when testpmd on guest and
>>> host is connected.
>>> Here are rough steps.
>>>
>>> 1. Start testpmd on host
>>> 2. Start QEMU
>>> 3. Start testpmd on guest
>>>
>>> Then you will see the drop.
>>> Probably, if testpmd on guest is restarted, then you don't see the drop
>>> again.
>>>
>>> 4. Type 'quit' on guest.
>>> 5. Start testpmd on guest again.
> Hi Michael,
>
> I am sorry that above was caused by my miss configuration.
> So please ignore it.
> If you can have time today, could you please check v7 and v8 performance?

Hi, Tetsuya

I have tried the qemu case but seems it does not have any difference,
maybe my configuration is wrong.

What I used to test is container case from Jianfeng.  And I make a
mistake that V6 compiled by GCC 5.3, but V9 with GCC 4.8, after using
the same compiler, the performance almost the same.

Thanks,
Michael


> Thanks,
> Tetsuya
>
>> OK, I will help to tested today.
>>
>> Thanks,
>> Michael
>

[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api

2016-03-02 Thread Qiu, Michael

On 3/1/2016 5:46 PM, Santosh Shukla wrote:
> On Tue, Mar 1, 2016 at 2:41 PM, Qiu, Michael  wrote:
>> On 2/26/2016 4:53 PM, Santosh Shukla wrote:
>>> Check cpuflag macro before using vectored api.
>>> -virtio_recv_pkts_vec() uses _sse3__ simd instruction for now so added 
>>> cpuflag.
>>> - Also wrap other vectored freind api ie..
>>> 1) virtqueue_enqueue_recv_refill_simple
>>> 2) virtio_rxq_vec_setup
>>>
>>> todo:
>>> 1) Move virtio_recv_pkts_vec() implementation to
>>>drivers/virtio/virtio_vec_.h file.
>>> 2) Remove use_simple_rxtx flag, so that virtio/virtio_vec_.h
>>>files to provide vectored/non-vectored rx/tx apis.
>>>
>>> Signed-off-by: Santosh Shukla 
>>> ---
>>> - v1: This is a rework of patch [1].
>>> Note: This patch will let non-x86 arch to use virtio pmd.
>>>
>>> [1] http://dpdk.org/dev/patchwork/patch/10429/
>>>
>>>  drivers/net/virtio/virtio_rxtx.c|   16 +++-
>>>  drivers/net/virtio/virtio_rxtx.h|2 ++
>>>  drivers/net/virtio/virtio_rxtx_simple.c |   11 ++-
>>>  3 files changed, 27 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio/virtio_rxtx.c 
>>> b/drivers/net/virtio/virtio_rxtx.c
>>> index 41a1366..ec0b8de 100644
>>> --- a/drivers/net/virtio/virtio_rxtx.c
>>> +++ b/drivers/net/virtio/virtio_rxtx.c
>>> @@ -67,7 +67,9 @@
>>>  #define VIRTIO_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
>>>   ETH_TXQ_FLAGS_NOOFFLOADS)
>>>
>>> +#ifdef RTE_MACHINE_CPUFLAG_SSSE3
>>>  static int use_simple_rxtx;
>>> +#endif
>>>
>>>
>> I don't think so much #ifdef ... #endif in *.c file is a good choice.
>> Would you consider let it only in header file like:
>>
>> in drivers/net/virtio/virtio_rxtx.h
>>
>> [...]
>>
>> #ifdef RTE_MACHINE_CPUFLAG_SSSE3
>> int virtio_rxq_vec_setup(struct virtqueue *rxq);
>>
>> int virtqueue_enqueue_recv_refill_simple(struct virtqueue *vq,
>> struct rte_mbuf *m);
>> #else
>> int virtio_rxq_vec_setup(__rte_unused struct virtqueue *rxq) {return -1;}
>> int virtqueue_enqueue_recv_refill_simple(__rte_unused struct virtqueue *vq,
>>  __rte_unused struct rte_mbuf *m) {
>> return -1;
>> }
>> #endif
>>
>> and remove most #ifdef ... #endif in *.c file.
>>
> I guess, above approach wont work for non-x86 arch, ad those func are
> dummy, right? also code wont build for arm/non-86 arch because
> tx/rx_pkt_burst callback will be using x86 specific virtio vec rx/tx
> api.

You may right, but you really need to reduce the #ifdef in *.c files.

Thanks,
Michael

>> Thanks,
>> Michael

[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api

2016-03-01 Thread Qiu, Michael

On 2/26/2016 4:53 PM, Santosh Shukla wrote:
> Check cpuflag macro before using vectored api.
> -virtio_recv_pkts_vec() uses _sse3__ simd instruction for now so added 
> cpuflag.
> - Also wrap other vectored freind api ie..
> 1) virtqueue_enqueue_recv_refill_simple
> 2) virtio_rxq_vec_setup
>
> todo:
> 1) Move virtio_recv_pkts_vec() implementation to
>drivers/virtio/virtio_vec_.h file.
> 2) Remove use_simple_rxtx flag, so that virtio/virtio_vec_.h
>files to provide vectored/non-vectored rx/tx apis.
>
> Signed-off-by: Santosh Shukla 
> ---
> - v1: This is a rework of patch [1].
> Note: This patch will let non-x86 arch to use virtio pmd.
>
> [1] http://dpdk.org/dev/patchwork/patch/10429/
>
>  drivers/net/virtio/virtio_rxtx.c|   16 +++-
>  drivers/net/virtio/virtio_rxtx.h|2 ++
>  drivers/net/virtio/virtio_rxtx_simple.c |   11 ++-
>  3 files changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_rxtx.c 
> b/drivers/net/virtio/virtio_rxtx.c
> index 41a1366..ec0b8de 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -67,7 +67,9 @@
>  #define VIRTIO_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
>   ETH_TXQ_FLAGS_NOOFFLOADS)
>  
> +#ifdef RTE_MACHINE_CPUFLAG_SSSE3
>  static int use_simple_rxtx;
> +#endif
>  
>

I don't think so much #ifdef ... #endif in *.c file is a good choice.
Would you consider let it only in header file like:

in drivers/net/virtio/virtio_rxtx.h

[...]

#ifdef RTE_MACHINE_CPUFLAG_SSSE3
int virtio_rxq_vec_setup(struct virtqueue *rxq);

int virtqueue_enqueue_recv_refill_simple(struct virtqueue *vq,
struct rte_mbuf *m);
#else
int virtio_rxq_vec_setup(__rte_unused struct virtqueue *rxq) {return -1;}
int virtqueue_enqueue_recv_refill_simple(__rte_unused struct virtqueue *vq,
 __rte_unused struct rte_mbuf *m) {
return -1;
}
#endif

and remove most #ifdef ... #endif in *.c file.

Thanks,
Michael

[dpdk-dev] [PATCH v9 0/2] Add VHOST PMD

2016-03-01 Thread Qiu, Michael

On 2/26/2016 4:36 PM, Tetsuya Mukawa wrote:
> On 2016/02/26 13:29, Tetsuya Mukawa wrote:
>> On 2016/02/25 16:51, Qiu, Michael wrote:
>>> On 2/24/2016 1:10 PM, Tetsuya Mukawa wrote:
>>>> On 2016/02/24 11:45, Qiu, Michael wrote:
>>>>> Hi,  Tetsuya
>>>>>
>>>>> When I applied your v6 patch, I could reach 9.5Mpps with 64B packet.
>>>>>
>>>>> But when apply v9 only 8.4 Mpps, could you figure out why has
>>>>> performance drop?
>>>> Hi Michael,
>>>>
>>>> Thanks for checking it.
>>>> I tried to re-produce it, but I don't see the drop on my environment.
>>>> (My cpu is Xeon E5-2697-v2, and the performances of v6 and v9 patch are
>>>> almost 5.9Mpps)
>>>> Did you use totally same code except for vhost PMD?
>>> Yes, totally same code and same platform, only difference is versions of
>>> vhost PMD.
>>>
>>> BTW, I have set the frontend mergeable off.
>> I have checked below cases.
>>  - Case1: Disable mergeable feature in virtio-net PMD.
>>  - Case2: Disable mergeable feature in virtio-net PMD and use
>> '--txqflags=0xf01' option to use simple ring deploying.
>> Both cases,  I still cannot see the drop.
>>
>> Anyway, I will send a few patch-series to determine the cause of drop.
>> So, could you please apply them and check the performance to determine
>> which cause the drop?
> Hi Michael,
>
> I may find what causes the drop.
> Could you please restart testpmd on guest when you see the drop, then
> check performance again?
>
> I guess the drop will occur only first time when testpmd on guest and
> host is connected.
> Here are rough steps.
>
> 1. Start testpmd on host
> 2. Start QEMU
> 3. Start testpmd on guest
>
> Then you will see the drop.
> Probably, if testpmd on guest is restarted, then you don't see the drop
> again.
>
> 4. Type 'quit' on guest.
> 5. Start testpmd on guest again.

OK, I will help to tested today.

Thanks,
Michael
> If so, I guess the drop is caused by queue notifying.
> Could you please let me know whether your issue is above case?
>
> Thanks,
> Tetsuya
>
>> Thanks,
>> Tetsuya
>>
>>> Thanks,
>>> Michael
>>>> Thanks,
>>>> Tetsuya
>>>>
>>>>> Thanks,
>>>>> Michael
>>>>> On 2/9/2016 5:38 PM, Tetsuya Mukawa wrote:
>>>>>> The patch introduces a new PMD. This PMD is implemented as thin wrapper
>>>>>> of librte_vhost.
>>>>>>
>>>>>>
>>>>>> PATCH v9 changes:
>>>>>>  - Fix a null pointer access issue implemented in v8 patch.
>>>>>>
>>>>>> PATCH v8 changes:
>>>>>>  - Manage ether devices list instead of internal structures list.
>>>>>>  - Remove needless NULL checking.
>>>>>>  - Replace "pthread_exit" to "return NULL".
>>>>>>  - Replace rte_panic to RTE_LOG, also add error handling.
>>>>>>  - Remove duplicated lines.
>>>>>>  - Remove needless casting.
>>>>>>  - Follow coding style.
>>>>>>  - Remove needless parenthesis.
>>>>>>
>>>>>> PATCH v7 changes:
>>>>>>  - Remove needless parenthesis.
>>>>>>  - Add release note.
>>>>>>  - Remove needless line wraps.
>>>>>>  - Add null pointer check in vring_state_changed().
>>>>>>  - Free queue memory in eth_queue_release().
>>>>>>  - Fix wrong variable name.
>>>>>>  - Fix error handling code of eth_dev_vhost_create() and
>>>>>>rte_pmd_vhost_devuninit().
>>>>>>  - Remove needless null checking from rte_pmd_vhost_devinit/devuninit().
>>>>>>  - Use port id to create mac address.
>>>>>>  - Add doxygen style comments in "rte_eth_vhost.h".
>>>>>>  - Fix wrong comment in "mk/rte.app.mk".
>>>>>>
>>>>>> PATCH v6 changes:
>>>>>>  - Remove rte_vhost_driver_pmd_callback_registe().
>>>>>>  - Support link status interrupt.
>>>>>>  - Support queue state changed interrupt.
>>>>>>  - Add rte_eth_vhost_get_queue_event().
>>>>>>  - Support numa node detection when new device is connected.
>>>>>>
>>>>>> PATCH v5 changes:
>>>>>&

[dpdk-dev] [PATCH] vhost: broadcast RARP pkt by injecting it to receiving mbuf array

2016-02-25 Thread Qiu, Michael

On 2/24/2016 4:27 PM, Yuanhan Liu wrote:
> On Wed, Feb 24, 2016 at 08:15:36AM +0000, Qiu, Michael wrote:
>> On 2/22/2016 10:35 PM, Yuanhan Liu wrote:
>>> Broadcast RARP packet by injecting it to receiving mbuf array at
>>> rte_vhost_dequeue_burst().
>>>
>>> Commit 33226236a35e ("vhost: handle request to send RARP") iterates
>>> all host interfaces and then broadcast it by all of them.  It did
>>> notify the switches about the new location of the migrated VM, however,
>>> the mac learning table in the target host is wrong (at least in my
>>> test with OVS):
>>>
>>> $ ovs-appctl fdb/show ovsbr0
>>>  port  VLAN  MACAge
>>> 1 0  b6:3c:72:71:cd:4d   10
>>> LOCAL 0  b6:3c:72:71:cd:4e   10
>>> LOCAL 0  52:54:00:12:34:689
>>> 1 0  56:f6:64:2c:bc:c01
>>>
>>> Where 52:54:00:12:34:68 is the mac of the VM. As you can see from the
>>> above, the port learned is "LOCAL", which is the "ovsbr0" port. That
>>> is reasonable, since we indeed send the pkt by the "ovsbr0" interface.
>>>
>>> The wrong mac table lead all the packets to the VM go to the "ovsbr0"
>>> in the end, which ends up with all packets being lost, until the guest
>>> send a ARP quest (or reply) to refresh the mac learning table.
>>>
>>> Jianfeng then came up with a solution I have thought of firstly but NAKed
>> Is it suitable to mention someone in the commit log?
> Why it's not? It's not a secret name or something like that after all :)
>
> On the other hand, it's way of thanking Jianfeng's contribution to this
> patch.

OK, I've never seen this fashion before, forgive me.

Thanks,
Michael
>
>   --yliu
>

[dpdk-dev] [PATCH v9 0/2] Add VHOST PMD

2016-02-25 Thread Qiu, Michael

On 2/24/2016 1:10 PM, Tetsuya Mukawa wrote:
> On 2016/02/24 11:45, Qiu, Michael wrote:
>> Hi,  Tetsuya
>>
>> When I applied your v6 patch, I could reach 9.5Mpps with 64B packet.
>>
>> But when apply v9 only 8.4 Mpps, could you figure out why has
>> performance drop?
> Hi Michael,
>
> Thanks for checking it.
> I tried to re-produce it, but I don't see the drop on my environment.
> (My cpu is Xeon E5-2697-v2, and the performances of v6 and v9 patch are
> almost 5.9Mpps)
> Did you use totally same code except for vhost PMD?

Yes, totally same code and same platform, only difference is versions of
vhost PMD.

BTW, I have set the frontend mergeable off.

Thanks,
Michael
>
> Thanks,
> Tetsuya
>
>> Thanks,
>> Michael
>> On 2/9/2016 5:38 PM, Tetsuya Mukawa wrote:
>>> The patch introduces a new PMD. This PMD is implemented as thin wrapper
>>> of librte_vhost.
>>>
>>>
>>> PATCH v9 changes:
>>>  - Fix a null pointer access issue implemented in v8 patch.
>>>
>>> PATCH v8 changes:
>>>  - Manage ether devices list instead of internal structures list.
>>>  - Remove needless NULL checking.
>>>  - Replace "pthread_exit" to "return NULL".
>>>  - Replace rte_panic to RTE_LOG, also add error handling.
>>>  - Remove duplicated lines.
>>>  - Remove needless casting.
>>>  - Follow coding style.
>>>  - Remove needless parenthesis.
>>>
>>> PATCH v7 changes:
>>>  - Remove needless parenthesis.
>>>  - Add release note.
>>>  - Remove needless line wraps.
>>>  - Add null pointer check in vring_state_changed().
>>>  - Free queue memory in eth_queue_release().
>>>  - Fix wrong variable name.
>>>  - Fix error handling code of eth_dev_vhost_create() and
>>>rte_pmd_vhost_devuninit().
>>>  - Remove needless null checking from rte_pmd_vhost_devinit/devuninit().
>>>  - Use port id to create mac address.
>>>  - Add doxygen style comments in "rte_eth_vhost.h".
>>>  - Fix wrong comment in "mk/rte.app.mk".
>>>
>>> PATCH v6 changes:
>>>  - Remove rte_vhost_driver_pmd_callback_registe().
>>>  - Support link status interrupt.
>>>  - Support queue state changed interrupt.
>>>  - Add rte_eth_vhost_get_queue_event().
>>>  - Support numa node detection when new device is connected.
>>>
>>> PATCH v5 changes:
>>>  - Rebase on latest master.
>>>  - Fix RX/TX routine to count RX/TX bytes.
>>>  - Fix RX/TX routine not to count as error packets if enqueue/dequeue
>>>cannot send all packets.
>>>  - Fix if-condition checking for multiqueues.
>>>  - Add "static" to pthread variable.
>>>  - Fix format.
>>>  - Change default behavior not to receive queueing event from driver.
>>>  - Split the patch to separate rte_eth_vhost_portid2vdev().
>>>
>>> PATCH v4 changes:
>>>  - Rebase on latest DPDK tree.
>>>  - Fix cording style.
>>>  - Fix code not to invoke multiple messaging handling threads.
>>>  - Fix code to handle vdev parameters correctly.
>>>  - Remove needless cast.
>>>  - Remove needless if-condition before rt_free().
>>>
>>> PATCH v3 changes:
>>>  - Rebase on latest matser
>>>  - Specify correct queue_id in RX/TX function.
>>>
>>> PATCH v2 changes:
>>>  - Remove a below patch that fixes vhost library.
>>>The patch was applied as a separate patch.
>>>- vhost: fix crash with multiqueue enabled
>>>  - Fix typos.
>>>(Thanks to Thomas, Monjalon)
>>>  - Rebase on latest tree with above bernard's patches.
>>>
>>> PATCH v1 changes:
>>>  - Support vhost multiple queues.
>>>  - Rebase on "remove pci driver from vdevs".
>>>  - Optimize RX/TX functions.
>>>  - Fix resource leaks.
>>>  - Fix compile issue.
>>>  - Add patch to fix vhost library.
>>>
>>> RFC PATCH v3 changes:
>>>  - Optimize performance.
>>>In RX/TX functions, change code to access only per core data.
>>>  - Add below API to allow user to use vhost library APIs for a port managed
>>>by vhost PMD. There are a few limitations. See "rte_eth_vhost.h".
>>> - rte_eth_vhost_portid2vdev()
>>>To support this functionality, vhost library is also changed.
>>>Anyway, if users doesn't use vhost PMD, can fully use vhost library APIs.
>>&g

[dpdk-dev] [PATCH] vhost: broadcast RARP pkt by injecting it to receiving mbuf array

2016-02-24 Thread Qiu, Michael

On 2/22/2016 10:35 PM, Yuanhan Liu wrote:
> Broadcast RARP packet by injecting it to receiving mbuf array at
> rte_vhost_dequeue_burst().
>
> Commit 33226236a35e ("vhost: handle request to send RARP") iterates
> all host interfaces and then broadcast it by all of them.  It did
> notify the switches about the new location of the migrated VM, however,
> the mac learning table in the target host is wrong (at least in my
> test with OVS):
>
> $ ovs-appctl fdb/show ovsbr0
>  port  VLAN  MACAge
> 1 0  b6:3c:72:71:cd:4d   10
> LOCAL 0  b6:3c:72:71:cd:4e   10
> LOCAL 0  52:54:00:12:34:689
> 1 0  56:f6:64:2c:bc:c01
>
> Where 52:54:00:12:34:68 is the mac of the VM. As you can see from the
> above, the port learned is "LOCAL", which is the "ovsbr0" port. That
> is reasonable, since we indeed send the pkt by the "ovsbr0" interface.
>
> The wrong mac table lead all the packets to the VM go to the "ovsbr0"
> in the end, which ends up with all packets being lost, until the guest
> send a ARP quest (or reply) to refresh the mac learning table.
>
> Jianfeng then came up with a solution I have thought of firstly but NAKed

Is it suitable to mention someone in the commit log?

Thanks,
Michael
> by myself, concerning it has potential issues [0]. The solution is as title
> stated: broadcast the RARP packet by injecting it to the receiving mbuf
> arrays at rte_vhost_dequeue_burst(). The re-bring of that idea made me
> think it twice; it looked like a false concern to me then. And I had done
> a rough verification: it worked as expected.
>
> [0]: http://dpdk.org/ml/archives/dev/2016-February/033527.html
>
> Another note is that while preparing this version, I found that DPDK has
> some ARP related structures and macros defined. So, use them instead of
> the one from standard header files here.
>
> Cc: Thibaut Collet 
> Suggested-by: Jianfeng Tan 
> Signed-off-by: Yuanhan Liu 
> ---
>  lib/librte_vhost/rte_virtio_net.h |   5 +-
>  lib/librte_vhost/vhost_rxtx.c |  80 +++-
>  lib/librte_vhost/vhost_user/vhost-net-user.c  |   2 +-
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 128 
> --
>  lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +-
>  5 files changed, 104 insertions(+), 113 deletions(-)
>
> diff --git a/lib/librte_vhost/rte_virtio_net.h 
> b/lib/librte_vhost/rte_virtio_net.h
> index 4a2303a..7d1fde2 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -49,6 +49,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  struct rte_mbuf;
>  
> @@ -133,7 +134,9 @@ struct virtio_net {
>   void*priv;  /**< private context */
>   uint64_tlog_size;   /**< Size of log area */
>   uint64_tlog_base;   /**< Where dirty pages are 
> logged */
> - uint64_treserved[62];   /**< Reserve some spaces for 
> future extension. */
> + struct ether_addr   mac;/**< MAC address */
> + rte_atomic16_t  broadcast_rarp; /**< A flag to tell if we need 
> broadcast rarp packet */
> + uint64_treserved[61];   /**< Reserve some spaces for 
> future extension. */
>   struct vhost_virtqueue  *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];  /**< 
> Contains all virtqueue information. */
>  } __rte_cache_aligned;
>  
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 12ce0cc..9d23eb1 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -43,6 +43,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "vhost-net.h"
>  
> @@ -761,11 +762,50 @@ vhost_dequeue_offload(struct virtio_net_hdr *hdr, 
> struct rte_mbuf *m)
>   }
>  }
>  
> +#define RARP_PKT_SIZE64
> +
> +static int
> +make_rarp_packet(struct rte_mbuf *rarp_mbuf, const struct ether_addr *mac)
> +{
> + struct ether_hdr *eth_hdr;
> + struct arp_hdr  *rarp;
> +
> + if (rarp_mbuf->buf_len < 64) {
> + RTE_LOG(WARNING, VHOST_DATA,
> + "failed to make RARP; mbuf size too small %u (< %d)\n",
> + rarp_mbuf->buf_len, RARP_PKT_SIZE);
> + return -1;
> + }
> +
> + /* Ethernet header. */
> + eth_hdr = rte_pktmbuf_mtod_offset(rarp_mbuf, struct ether_hdr *, 0);
> + memset(eth_hdr->d_addr.addr_bytes, 0xff, ETHER_ADDR_LEN);
> + ether_addr_copy(mac, ð_hdr->s_addr);
> + eth_hdr->ether_type = htons(ETHER_TYPE_RARP);
> +
> + /* RARP header. */
> + rarp = (struct arp_hdr *)(eth_hdr + 1);
> + rarp->arp_hrd = htons(ARP_HRD_ETHER);
> + rarp->arp_pro = htons(ETHER_TYPE_IPv4);
> + rarp->arp_hln = ETHER_ADDR_LEN;
> + rarp->arp_pln = 4;
> + rarp->arp_op  = htons(ARP_OP_REVREQUEST);
> +
> + ether_addr_copy(mac, &rarp->arp_data.arp_sha);
> + ether_addr_copy(mac, &rarp->arp_data.arp_t

[dpdk-dev] [PATCH v9 0/2] Add VHOST PMD

2016-02-24 Thread Qiu, Michael

Hi,  Tetsuya

When I applied your v6 patch, I could reach 9.5Mpps with 64B packet.

But when apply v9 only 8.4 Mpps, could you figure out why has
performance drop?

Thanks,
Michael
On 2/9/2016 5:38 PM, Tetsuya Mukawa wrote:
> The patch introduces a new PMD. This PMD is implemented as thin wrapper
> of librte_vhost.
>
>
> PATCH v9 changes:
>  - Fix a null pointer access issue implemented in v8 patch.
>
> PATCH v8 changes:
>  - Manage ether devices list instead of internal structures list.
>  - Remove needless NULL checking.
>  - Replace "pthread_exit" to "return NULL".
>  - Replace rte_panic to RTE_LOG, also add error handling.
>  - Remove duplicated lines.
>  - Remove needless casting.
>  - Follow coding style.
>  - Remove needless parenthesis.
>
> PATCH v7 changes:
>  - Remove needless parenthesis.
>  - Add release note.
>  - Remove needless line wraps.
>  - Add null pointer check in vring_state_changed().
>  - Free queue memory in eth_queue_release().
>  - Fix wrong variable name.
>  - Fix error handling code of eth_dev_vhost_create() and
>rte_pmd_vhost_devuninit().
>  - Remove needless null checking from rte_pmd_vhost_devinit/devuninit().
>  - Use port id to create mac address.
>  - Add doxygen style comments in "rte_eth_vhost.h".
>  - Fix wrong comment in "mk/rte.app.mk".
>
> PATCH v6 changes:
>  - Remove rte_vhost_driver_pmd_callback_registe().
>  - Support link status interrupt.
>  - Support queue state changed interrupt.
>  - Add rte_eth_vhost_get_queue_event().
>  - Support numa node detection when new device is connected.
>
> PATCH v5 changes:
>  - Rebase on latest master.
>  - Fix RX/TX routine to count RX/TX bytes.
>  - Fix RX/TX routine not to count as error packets if enqueue/dequeue
>cannot send all packets.
>  - Fix if-condition checking for multiqueues.
>  - Add "static" to pthread variable.
>  - Fix format.
>  - Change default behavior not to receive queueing event from driver.
>  - Split the patch to separate rte_eth_vhost_portid2vdev().
>
> PATCH v4 changes:
>  - Rebase on latest DPDK tree.
>  - Fix cording style.
>  - Fix code not to invoke multiple messaging handling threads.
>  - Fix code to handle vdev parameters correctly.
>  - Remove needless cast.
>  - Remove needless if-condition before rt_free().
>
> PATCH v3 changes:
>  - Rebase on latest matser
>  - Specify correct queue_id in RX/TX function.
>
> PATCH v2 changes:
>  - Remove a below patch that fixes vhost library.
>The patch was applied as a separate patch.
>- vhost: fix crash with multiqueue enabled
>  - Fix typos.
>(Thanks to Thomas, Monjalon)
>  - Rebase on latest tree with above bernard's patches.
>
> PATCH v1 changes:
>  - Support vhost multiple queues.
>  - Rebase on "remove pci driver from vdevs".
>  - Optimize RX/TX functions.
>  - Fix resource leaks.
>  - Fix compile issue.
>  - Add patch to fix vhost library.
>
> RFC PATCH v3 changes:
>  - Optimize performance.
>In RX/TX functions, change code to access only per core data.
>  - Add below API to allow user to use vhost library APIs for a port managed
>by vhost PMD. There are a few limitations. See "rte_eth_vhost.h".
> - rte_eth_vhost_portid2vdev()
>To support this functionality, vhost library is also changed.
>Anyway, if users doesn't use vhost PMD, can fully use vhost library APIs.
>  - Add code to support vhost multiple queues.
>Actually, multiple queues functionality is not enabled so far.
>
> RFC PATCH v2 changes:
>  - Fix issues reported by checkpatch.pl
>(Thanks to Stephen Hemminger)
>
>
> Tetsuya Mukawa (2):
>   ethdev: Add a new event type to notify a queue state changed event
>   vhost: Add VHOST PMD
>
>  MAINTAINERS |   4 +
>  config/common_linuxapp  |   6 +
>  doc/guides/nics/index.rst   |   1 +
>  doc/guides/rel_notes/release_2_3.rst|   4 +
>  drivers/net/Makefile|   4 +
>  drivers/net/vhost/Makefile  |  62 ++
>  drivers/net/vhost/rte_eth_vhost.c   | 911 
> 
>  drivers/net/vhost/rte_eth_vhost.h   | 109 
>  drivers/net/vhost/rte_pmd_vhost_version.map |  11 +
>  lib/librte_ether/rte_ethdev.h   |   2 +
>  mk/rte.app.mk   |   6 +
>  11 files changed, 1120 insertions(+)
>  create mode 100644 drivers/net/vhost/Makefile
>  create mode 100644 drivers/net/vhost/rte_eth_vhost.c
>  create mode 100644 drivers/net/vhost/rte_eth_vhost.h
>  create mode 100644 drivers/net/vhost/rte_pmd_vhost_version.map
>

[dpdk-dev] [PATCH v2] ixgbe: Fix disable interrupt twice

2016-02-19 Thread Qiu, Michael

On 2016/2/2 19:03, Ananyev, Konstantin wrote:
>

[...]

 I don't think i40e miss it, because it not the right please to disable 
 interrupt.
 because all interrupts are enabled in init stage.

 Actually, ixgbe enable the interrupt in init stage, but in dev_start, it 
 disable it
 first and re-enable, so it just the same with doing nothing about 
 interrupt.

 Just think below:

 1. start the port.(interrupt already enabled in init stage, disable -->
 re-enable)
 2. stop the port.(disable interrupt)
 3. start port again(Try to disable, but failed, already disabled)

 Would you think the code has issue?
>>> [Zhang, Helin] in ixgbe PMD, it can be seen that uninit() calls dev_close(),
>>> which calls dev_stop(). So I think the disabling can be done only in 
>>> dev_stop().
>>> All others can make use of dev_stop to disable the interrupt.
>> As I said, if it is in dev_stop, it will has issue when dev_start -->
>> dev_stop --> dev_start, this also could applied in i40e and fm10k. If
>> you want to put it in dev_stop, better to remove enable interrupts in
>> init stage, and only put it in dev_start.
> We can't remove enabling interrupt at init stage and put it only in 
> dev_start().
> That means PF couldn't handle interrupts from VF till dev_start() will be 
> executed on PF
>  - which could never happen.
> For same reason we can't disable all interrupts in dev_stop().
> See: http://dpdk.org/ml/archives/dev/2015-November/027238.html

Hi, Konstantin

Yes, you are right.

So the only way to fix this issue should remove it in dev_stop(), and
left it in uinit() stage, which my patch does.

Am I right?

Thanks,
Michael
> Konstantin
>
>> Thanks,
>> Michael
>>> Regards,
>>> Helin
>>>
 Thanks,
 Michael

> Maybe we can follow fm10k's style.
>
>> On other hand, if we remove it in dev_stop, any side effect? In ixgbe
>> start, it will always disable it first and then re-enable it, so it's 
>> safe.
> I think you mean we can disable intr anyway even if it has been disabled.
 Actually, we couldn't, DPDK call VFIO ioctl to kernel to disable 
 interrupts, and
 if we try disable twice, it will return and error.
 That's why I mean we need a flag to show the interrupts stats. If it 
 already
 disabled, we do not need call in to kernel. just return and give a warning
 message.

 Thanks,
 Michael

>  Sounds more like why we don't
> need this patch :)
>
>> Thanks,
>> Michael
>

[dpdk-dev] [PATCH 2/2 v2] fm10k: update doc for Atwood Channel

2016-02-05 Thread Qiu, Michael

On 2/4/2016 5:05 PM, Thomas Monjalon wrote:
> Hi Michael,
>
> 2016-02-04 16:36, Michael Qiu:
>> Atwood Channel is 25GbE NIC and belongs to Intel FM10K family,
>> update the doc for it.
>>
>> Signed-off-by: Michael Qiu 
>> Acked-by: John McNamara 
> Next time, it would be better to send the doc changes and the related code
> changes in the same patch. Thanks

OK, I will do it next time.

Thanks,
Michael

[dpdk-dev] [PATCH v2] ixgbe: Fix disable interrupt twice

2016-02-02 Thread Qiu, Michael

On 2/2/2016 11:07 AM, Zhang, Helin wrote:
>
>> -Original Message-
>> From: Qiu, Michael
>> Sent: Tuesday, February 2, 2016 10:57 AM
>> To: Zhang, Helin ; Lu, Wenzhuo
>> ; dev at dpdk.org
>> Cc: Zhou, Danny ; Liu, Yong ;
>> Liang, Cunming 
>> Subject: Re: [PATCH v2] ixgbe: Fix disable interrupt twice
>>
>> On 2/2/2016 10:14 AM, Zhang, Helin wrote:
>>>> -Original Message-
>>>> From: Qiu, Michael
>>>> Sent: Tuesday, February 2, 2016 10:07 AM
>>>> To: Lu, Wenzhuo; dev at dpdk.org
>>>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming; Zhang, Helin
>>>> Subject: Re: [PATCH v2] ixgbe: Fix disable interrupt twice
>>>>
>>>> [+cc helin]
>>>>
>>>> On 2/2/2016 9:03 AM, Lu, Wenzhuo wrote:
>>>>> Hi Michael,
>>>>>
>>>>>> -Original Message-
>>>>>> From: Qiu, Michael
>>>>>> Sent: Monday, February 1, 2016 4:05 PM
>>>>>> To: Lu, Wenzhuo; dev at dpdk.org
>>>>>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming
>>>>>> Subject: Re: [PATCH v2] ixgbe: Fix disable interrupt twice
>>>>>>
>>>>>> On 1/29/2016 4:07 PM, Lu, Wenzhuo wrote:
>>>>>>> Hi Michael,
>>>>>>>
>>>>>>>> -Original Message-
>>>>>>>> From: Qiu, Michael
>>>>>>>> Sent: Friday, January 29, 2016 1:58 PM
>>>>>>>> To: dev at dpdk.org
>>>>>>>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming; Lu, Wenzhuo; Qiu,
>>>>>>>> Michael
>>>>>>>> Subject: [PATCH v2] ixgbe: Fix disable interrupt twice
>>>>>>>>
>>>>>>>> Currently, ixgbe vf and pf will disable interrupt twice in stop
>>>>>>>> stage and uninit stage. It will cause an error:
>>>>>>>>
>>>>>>>> testpmd> quit
>>>>>>>>
>>>>>>>> Shutting down port 0...
>>>>>>>> Stopping ports...
>>>>>>>> Done
>>>>>>>> Closing ports...
>>>>>>>> EAL: Error disabling MSI-X interrupts for fd 26
>>>>>>>> Done
>>>>>>>>
>>>>>>>> Becasue the interrupt already been disabled in stop stage.
>>>>>>>> Since it is enabled in init stage, better remove from stop stage.
>>>>>>> I'm afraid it?s not a good idea to just remove the intr_disable
>>>>>>> from
>>>> dev_stop.
>>>>>>> I think dev_stop have the chance to be used independently with
>>>>>>> dev_unint. In
>>>>>> this scenario, we still need intr_disable, right?
>>>>>>> Maybe what we need is some check before we disable the intr:)
>>>>>> Yes, indeed we need some check in disable intr, but it need
>>>>>> additional fields in "struct rte_intr_handle",  and it's much saft
>>>>>> to do so, but as I check i40e/fm10k code, only ixgbe disable it in
>> dev_stop().
>>>>> I found fm10k doesn?t enable intr in dev_start. So, I think it's OK.
>>>>> But i40e
>>>> enables intr in dev_start.
>>>>> To my opinion, it's more like i40e misses the intr_disable in dev_stop.
>>>> I don't think i40e miss it, because it not the right please to disable 
>>>> interrupt.
>>>> because all interrupts are enabled in init stage.
>>>>
>>>> Actually, ixgbe enable the interrupt in init stage, but in dev_start,
>>>> it disable it first and re-enable, so it just the same with doing nothing 
>>>> about
>> interrupt.
>>>> Just think below:
>>>>
>>>> 1. start the port.(interrupt already enabled in init stage, disable
>>>> -->
>>>> re-enable)
>>>> 2. stop the port.(disable interrupt)
>>>> 3. start port again(Try to disable, but failed, already disabled)
>>>>
>>>> Would you think the code has issue?
>>> [Zhang, Helin] in ixgbe PMD, it can be seen that uninit() calls
>>> dev_close(), which calls dev_stop(). So I think the disabling can be done 
>>> only in
>> dev_stop().
>>> All others can make use of dev_stop to disable the interrupt.
>> As I said, if it is in dev_stop

[dpdk-dev] [PATCH 1/3] fm10k: enable FTAG based forwarding

2016-02-02 Thread Qiu, Michael

On 1/25/2016 4:08 PM, Wang Xiao W wrote:
> This patch enables reading sglort info into mbuf for RX and inserting
> an FTAG at the beginning of the packet for TX. The vlan_tci_outer field
> selected from rte_mbuf structure for sglort is not used in fm10k now.
> In FTAG based forwarding mode, the switch will forward packets according
> to glort info in FTAG rather than mac and vlan table.
>
> To activate this feature, user needs to turn 
> ``CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD``
> to y in common_linuxapp or common_bsdapp. Currently this feature is supported
> only on PF, because FM10K_PFVTCTL register is read-only for VF.
>
> Signed-off-by: Wang Xiao W 
> ---
>  config/common_bsdapp   |  1 +
>  config/common_linuxapp |  1 +
>  drivers/net/fm10k/fm10k_ethdev.c   |  8 
>  drivers/net/fm10k/fm10k_rxtx.c | 17 +
>  drivers/net/fm10k/fm10k_rxtx_vec.c |  9 +
>  5 files changed, 36 insertions(+)
>
> diff --git a/config/common_bsdapp b/config/common_bsdapp
> index ed7c31c..451f81a 100644
> --- a/config/common_bsdapp
> +++ b/config/common_bsdapp
> @@ -208,6 +208,7 @@ CONFIG_RTE_LIBRTE_FM10K_DEBUG_TX=n
>  CONFIG_RTE_LIBRTE_FM10K_DEBUG_TX_FREE=n
>  CONFIG_RTE_LIBRTE_FM10K_DEBUG_DRIVER=n
>  CONFIG_RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE=y
> +CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD=n
>  
>  #
>  # Compile burst-oriented Mellanox ConnectX-3 (MLX4) PMD
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 74bc515..c928bce 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -207,6 +207,7 @@ CONFIG_RTE_LIBRTE_FM10K_DEBUG_TX_FREE=n
>  CONFIG_RTE_LIBRTE_FM10K_DEBUG_DRIVER=n
>  CONFIG_RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE=y
>  CONFIG_RTE_LIBRTE_FM10K_INC_VECTOR=y
> +CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD=n
>  
>  #
>  # Compile burst-oriented Mellanox ConnectX-3 (MLX4) PMD
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index e4aed94..cc8317f 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -668,6 +668,14 @@ fm10k_dev_tx_init(struct rte_eth_dev *dev)
>   PMD_INIT_LOG(ERR, "failed to disable queue %d", i);
>   return -1;
>   }
> +#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
> + /* Enable use of FTAG bit in TX descriptor, PFVTCTL
> +  * register is read-only for VF.
> +  */
> + if (hw->mac.type == fm10k_mac_pf)
> + FM10K_WRITE_REG(hw, FM10K_PFVTCTL(i),
> + FM10K_PFVTCTL_FTAG_DESC_ENABLE);

So here if somebody enable FTAG, when compile, but he use VF, what will
happen? We'd better to give a error message when he try to use VF with FTAG.

Thanks,
Michael
> +#endif
>  
>   /* set location and size for descriptor ring */
>   FM10K_WRITE_REG(hw, FM10K_TDBAL(i),
> diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
> index e958865..f87987d 100644
> --- a/drivers/net/fm10k/fm10k_rxtx.c
> +++ b/drivers/net/fm10k/fm10k_rxtx.c
> @@ -152,6 +152,13 @@ fm10k_recv_pkts(void *rx_queue, struct rte_mbuf 
> **rx_pkts,
>*/
>   mbuf->ol_flags |= PKT_RX_VLAN_PKT;
>   mbuf->vlan_tci = desc.w.vlan;
> +#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
> + /**
> +  * mbuf->vlan_tci_outer is an idle field in fm10k driver,
> +  * so it can be selected to store sglort value.
> +  */
> + mbuf->vlan_tci_outer = rte_le_to_cpu_16(desc.w.sglort);
> +#endif
>  
>   rx_pkts[count] = mbuf;
>   if (++next_dd == q->nb_desc) {
> @@ -307,6 +314,13 @@ fm10k_recv_scattered_pkts(void *rx_queue, struct 
> rte_mbuf **rx_pkts,
>*/
>   mbuf->ol_flags |= PKT_RX_VLAN_PKT;
>   first_seg->vlan_tci = desc.w.vlan;
> +#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
> + /**
> +  * mbuf->vlan_tci_outer is an idle field in fm10k driver,
> +  * so it can be selected to store sglort value.
> +  */
> + first_seg->vlan_tci_outer = rte_le_to_cpu_16(desc.w.sglort);
> +#endif
>  
>   /* Prefetch data of first segment, if configured to do so. */
>   rte_packet_prefetch((char *)first_seg->buf_addr +
> @@ -432,6 +446,9 @@ static inline void tx_xmit_pkt(struct fm10k_tx_queue *q, 
> struct rte_mbuf *mb)
>   q->nb_free -= mb->nb_segs;
>  
>   q->hw_ring[q->next_free].flags = 0;
> +#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
> + q->hw_ring[q->next_free].flags |= FM10K_TXD_FLAG_FTAG;
> +#endif
>   /* set checksum flags on first descriptor of packet. SCTP checksum
>* offload is not supported, but we do not explicitly check for this
>* case in favor of greatly simplified processing. */
> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
> b/drivers/net/fm10k/fm10k_rxtx_vec.c
> index 2a57eef..0b0f2e3 100644
> --- a/drivers/net

[dpdk-dev] i40evf DPDK init_adminq failed: -53

2016-02-02 Thread Qiu, Michael

On 2/2/2016 6:44 AM, Saurabh Mishra wrote:
> Hi, on KVM system, after doing NVM upgrade to new firmware and I don't
> see init_adminq failed messages.
>

Glade to see you have fix the issue:)

Thanks,
Michael
> Thanks,
> /Saurabh
>
> On Mon, Feb 1, 2016 at 11:49 AM, Saurabh Mishra
> mailto:saurabh.globe at gmail.com>> wrote:
>
> Hi,
>
> So I tried to update the firmware and it says "Update not
> available" for i40e
>
> Intel(R) Ethernet NVM Update Tool
>
> NVMUpdate version 1.25.20.03
>
> Copyright (C) 2013 - 2015 Intel Corporation.
>
>
>
> WARNING: TO AVOID DAMAGE TO YOUR DEVICE, DO NOT EXIT OR REBOOT OR
> POWER OFF THE SYSTEM DURING THIS UPDATE
>
> Inventory in progress. Please wait [|.]
>
> +
>
>
>
> Num DescriptionDevice-Id B:D   Adapter
> Status
>
> === == = =
> 
>
> 01) Intel(R) Ethernet Server Adapter X520- 8086-10FB 130:00 Update
> not available
>
> 02) Intel(R) Ethernet Converged Network Ad 8086-1572 07:00 Update
> not available
>
>
>
> Tool execution completed with the following status: All operations
> completed successfully
>
> Press any key to exit.
>
>
> [root] esxcfg-nics -l
>
> NamePCI  Driver  Link Speed Duplex MAC Address
>   MTUDescription   
>
> vmnic6  :07:00.0 i40eUp   1Mbps Full  
> 3c:fd:fe:04:11:c0 1500   Intel Corporation Ethernet Controller
> X710 for 10GbE SFP+
>
> vmnic7  :07:00.1 i40eUp   1Mbps Full  
> 3c:fd:fe:04:11:c2 1500   Intel Corporation Ethernet Controller
> X710 for 10GbE SFP+
>
> vmnic8  :82:00.0 ixgbe   Up   1Mbps Full  
> 00:1b:21:90:f9:f8 1500   Intel(R) 82599 10 Gigabit Dual Port
> Network Connection
>
> [root] ethtool -i vmnic6
>
> driver: i40e
>
> version: 1.3.38
>
> firmware-version: 4.41 0x80001866 16.5.20
>
> bus-info: :07:00.0
>
>
> On Mon, Feb 1, 2016 at 10:25 AM, Saurabh Mishra
> mailto:saurabh.globe at gmail.com>> wrote:
>
> Hi Michael --
>
> What are the steps to upgrade i40e firmware. We are using CentOS7
>
> It didn't work with guest VF driver either on ESXi and KVM.
>
> Sure. I will blacklist i40evf driver and try it out.
>
> Thanks,
> /Saurabh
>
>
> On Mon, Feb 1, 2016 at 12:16 AM, Qiu, Michael
> mailto:michael.qiu at intel.com>> wrote:
>
> Hi, Saurabh
>
> It's known issue, to fix this you'd better to upgrade the
> firmware
> version of i40e.
>
> BTW, will it work in guest with kernel driver?
>
> If yes, to workaround(somebody reports it does not work
> for them):
> Remove i40e.ko in guest, so that it will not auto-loaded
> when boot up.
>
> Hope it works for you.
>
> Thanks,
> Michael
>
>
> On 1/30/2016 4:35 AM, Saurabh Mishra wrote:
> > Has anybody seen this before? What's the workaround or
> fix? We are using
> > dpdk-2.2.0 on KVM centos:
> >
> > Host PF version: 1.0.11-k on Centos7
> >
> >
> > [root@ ~]# ./symmetric_mp fakeelf -c 2 -m2048 -n4
> --proc-type=primary -- -p
> > 3 --num-procs=2 --proc-id=0
> >
> > [.]
> >
> > EAL: Virtual area found at 0x7fff7580 (size = 0x20)
> >
> > EAL: Requesting 1024 pages of size 2MB from socket 0
> >
> > EAL: TSC frequency is ~2600141 KHz
> >
> > EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no
> -> using unreliable
> > clock cycles !
> >
> > EAL: Master lcore 1 is ready (tid=f7fed880;cpuset=[1])
> >
> > EAL: PCI device :00:04.0 on NUMA socket 0
> >
> > EAL:   probe driver: 8086:154c rte_i40evf_pmd
> >
> > EAL:   PCI memory mapped at 0x7620
> >
> > EAL:   PCI memory mapped at 0x7621
> >
> > EAL: PCI device

[dpdk-dev] [PATCH v2] ixgbe: Fix disable interrupt twice

2016-02-02 Thread Qiu, Michael

On 2/2/2016 10:14 AM, Zhang, Helin wrote:
>
>> -Original Message-
>> From: Qiu, Michael
>> Sent: Tuesday, February 2, 2016 10:07 AM
>> To: Lu, Wenzhuo; dev at dpdk.org
>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming; Zhang, Helin
>> Subject: Re: [PATCH v2] ixgbe: Fix disable interrupt twice
>>
>> [+cc helin]
>>
>> On 2/2/2016 9:03 AM, Lu, Wenzhuo wrote:
>>> Hi Michael,
>>>
>>>> -Original Message-
>>>> From: Qiu, Michael
>>>> Sent: Monday, February 1, 2016 4:05 PM
>>>> To: Lu, Wenzhuo; dev at dpdk.org
>>>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming
>>>> Subject: Re: [PATCH v2] ixgbe: Fix disable interrupt twice
>>>>
>>>> On 1/29/2016 4:07 PM, Lu, Wenzhuo wrote:
>>>>> Hi Michael,
>>>>>
>>>>>> -Original Message-
>>>>>> From: Qiu, Michael
>>>>>> Sent: Friday, January 29, 2016 1:58 PM
>>>>>> To: dev at dpdk.org
>>>>>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming; Lu, Wenzhuo; Qiu,
>>>>>> Michael
>>>>>> Subject: [PATCH v2] ixgbe: Fix disable interrupt twice
>>>>>>
>>>>>> Currently, ixgbe vf and pf will disable interrupt twice in stop
>>>>>> stage and uninit stage. It will cause an error:
>>>>>>
>>>>>> testpmd> quit
>>>>>>
>>>>>> Shutting down port 0...
>>>>>> Stopping ports...
>>>>>> Done
>>>>>> Closing ports...
>>>>>> EAL: Error disabling MSI-X interrupts for fd 26
>>>>>> Done
>>>>>>
>>>>>> Becasue the interrupt already been disabled in stop stage.
>>>>>> Since it is enabled in init stage, better remove from stop stage.
>>>>> I'm afraid it?s not a good idea to just remove the intr_disable from
>> dev_stop.
>>>>> I think dev_stop have the chance to be used independently with
>>>>> dev_unint. In
>>>> this scenario, we still need intr_disable, right?
>>>>> Maybe what we need is some check before we disable the intr:)
>>>> Yes, indeed we need some check in disable intr, but it need
>>>> additional fields in "struct rte_intr_handle",  and it's much saft to
>>>> do so, but as I check i40e/fm10k code, only ixgbe disable it in dev_stop().
>>> I found fm10k doesn?t enable intr in dev_start. So, I think it's OK. But 
>>> i40e
>> enables intr in dev_start.
>>> To my opinion, it's more like i40e misses the intr_disable in dev_stop.
>> I don't think i40e miss it, because it not the right please to disable 
>> interrupt.
>> because all interrupts are enabled in init stage.
>>
>> Actually, ixgbe enable the interrupt in init stage, but in dev_start, it 
>> disable it
>> first and re-enable, so it just the same with doing nothing about interrupt.
>>
>> Just think below:
>>
>> 1. start the port.(interrupt already enabled in init stage, disable -->
>> re-enable)
>> 2. stop the port.(disable interrupt)
>> 3. start port again(Try to disable, but failed, already disabled)
>>
>> Would you think the code has issue?
> [Zhang, Helin] in ixgbe PMD, it can be seen that uninit() calls dev_close(),
> which calls dev_stop(). So I think the disabling can be done only in 
> dev_stop().
> All others can make use of dev_stop to disable the interrupt.

As I said, if it is in dev_stop, it will has issue when dev_start -->
dev_stop --> dev_start, this also could applied in i40e and fm10k. If
you want to put it in dev_stop, better to remove enable interrupts in
init stage, and only put it in dev_start.

Thanks,
Michael
> Regards,
> Helin
>
>> Thanks,
>> Michael
>>
>>> Maybe we can follow fm10k's style.
>>>
>>>> On other hand, if we remove it in dev_stop, any side effect? In ixgbe
>>>> start, it will always disable it first and then re-enable it, so it's safe.
>>> I think you mean we can disable intr anyway even if it has been disabled.
>> Actually, we couldn't, DPDK call VFIO ioctl to kernel to disable interrupts, 
>> and
>> if we try disable twice, it will return and error.
>> That's why I mean we need a flag to show the interrupts stats. If it already
>> disabled, we do not need call in to kernel. just return and give a warning
>> message.
>>
>> Thanks,
>> Michael
>>
>>>  Sounds more like why we don't
>>> need this patch :)
>>>
>>>> Thanks,
>>>> Michael
>

[dpdk-dev] [PATCH v2] ixgbe: Fix disable interrupt twice

2016-02-02 Thread Qiu, Michael

[+cc helin]

On 2/2/2016 9:03 AM, Lu, Wenzhuo wrote:
> Hi Michael,
>
>> -Original Message-----
>> From: Qiu, Michael
>> Sent: Monday, February 1, 2016 4:05 PM
>> To: Lu, Wenzhuo; dev at dpdk.org
>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming
>> Subject: Re: [PATCH v2] ixgbe: Fix disable interrupt twice
>>
>> On 1/29/2016 4:07 PM, Lu, Wenzhuo wrote:
>>> Hi Michael,
>>>
>>>> -Original Message-
>>>> From: Qiu, Michael
>>>> Sent: Friday, January 29, 2016 1:58 PM
>>>> To: dev at dpdk.org
>>>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming; Lu, Wenzhuo; Qiu, Michael
>>>> Subject: [PATCH v2] ixgbe: Fix disable interrupt twice
>>>>
>>>> Currently, ixgbe vf and pf will disable interrupt twice in stop stage
>>>> and uninit stage. It will cause an error:
>>>>
>>>> testpmd> quit
>>>>
>>>> Shutting down port 0...
>>>> Stopping ports...
>>>> Done
>>>> Closing ports...
>>>> EAL: Error disabling MSI-X interrupts for fd 26
>>>> Done
>>>>
>>>> Becasue the interrupt already been disabled in stop stage.
>>>> Since it is enabled in init stage, better remove from stop stage.
>>> I'm afraid it?s not a good idea to just remove the intr_disable from 
>>> dev_stop.
>>> I think dev_stop have the chance to be used independently with dev_unint. In
>> this scenario, we still need intr_disable, right?
>>> Maybe what we need is some check before we disable the intr:)
>> Yes, indeed we need some check in disable intr, but it need additional 
>> fields in
>> "struct rte_intr_handle",  and it's much saft to do so, but as I check 
>> i40e/fm10k
>> code, only ixgbe disable it in dev_stop().
> I found fm10k doesn?t enable intr in dev_start. So, I think it's OK. But i40e 
> enables intr in dev_start.
> To my opinion, it's more like i40e misses the intr_disable in dev_stop.

I don't think i40e miss it, because it not the right please to disable
interrupt. because all interrupts are enabled in init stage.

Actually, ixgbe enable the interrupt in init stage, but in dev_start, it
disable it first and re-enable, so it just the same with doing nothing
about interrupt.

Just think below:

1. start the port.(interrupt already enabled in init stage, disable -->
re-enable)
2. stop the port.(disable interrupt)
3. start port again(Try to disable, but failed, already disabled)

Would you think the code has issue?

Thanks,
Michael

> Maybe we can follow fm10k's style.
>
>> On other hand, if we remove it in dev_stop, any side effect? In ixgbe start, 
>> it will
>> always disable it first and then re-enable it, so it's safe.
> I think you mean we can disable intr anyway even if it has been disabled.

Actually, we couldn't, DPDK call VFIO ioctl to kernel to disable
interrupts, and if we try disable twice, it will return and error.
That's why I mean we need a flag to show the interrupts stats. If it
already disabled, we do not need call in to kernel. just return and give
a warning message.

Thanks,
Michael

>  Sounds more like why we don't
> need this patch :)
>
>> Thanks,
>> Michael
>

[dpdk-dev] [PATCH 0/8] support E-tag offloading and forwarding on Intel X550 NIC

2016-02-01 Thread Qiu, Michael

On 2/1/2016 9:38 AM, Yuanhan Liu wrote:
> On Mon, Feb 01, 2016 at 01:04:52AM +, Lu, Wenzhuo wrote:
>> Hi,
>>
>>> -Original Message-
>>> From: Qiu, Michael
>>> Sent: Friday, January 29, 2016 3:16 PM
>>> To: Lu, Wenzhuo; dev at dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH 0/8] support E-tag offloading and forwarding 
>>> on
>>> Intel X550 NIC
>>>
>>> Hi, Wenzhuo
>>>
>>> Better to explain what E-tag is, so that reviewers could known it.
>> Yes, it's better. But not sure where should I add this info. In release note 
>> or just cover letter? Any suggestion? Thanks.
> It should be done in the first patch introduced E-tag, so that it will
> be in the git log history. And of course, it does no harm at all to
> mention (briefly) it again in cover letter, so that reviewer/maintainer
> has a brief understanding of your whole patchset first.
>
>   --yliu

Yes, in my view, cover letter is a good place, but as want to be in git
log history, it would be better to include in the right patch of the
feature, because most of time, the first patch is not core related to
new feature, perhaps only some prepare code.

My opinion is to explain it where it first be mentioned in the code.

But again, it OK for Yuanhan's solution, the only thing you want to do
is think you are a reviewer, and want to review you patch, what do you want.

Thanks,
Michael

[dpdk-dev] i40evf DPDK init_adminq failed: -53

2016-02-01 Thread Qiu, Michael

Hi, Saurabh

It's known issue, to fix this you'd better to upgrade the firmware
version of i40e.

BTW, will it work in guest with kernel driver?

If yes, to workaround(somebody reports it does not work for them):
Remove i40e.ko in guest, so that it will not auto-loaded when boot up.

Hope it works for you.

Thanks,
Michael


On 1/30/2016 4:35 AM, Saurabh Mishra wrote:
> Has anybody seen this before? What's the workaround or fix? We are using
> dpdk-2.2.0 on KVM centos:
>
> Host PF version: 1.0.11-k on Centos7
>
>
> [root@ ~]# ./symmetric_mp fakeelf -c 2 -m2048 -n4 --proc-type=primary -- -p
> 3 --num-procs=2 --proc-id=0
>
> [.]
>
> EAL: Virtual area found at 0x7fff7580 (size = 0x20)
>
> EAL: Requesting 1024 pages of size 2MB from socket 0
>
> EAL: TSC frequency is ~2600141 KHz
>
> EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
> clock cycles !
>
> EAL: Master lcore 1 is ready (tid=f7fed880;cpuset=[1])
>
> EAL: PCI device :00:04.0 on NUMA socket 0
>
> EAL:   probe driver: 8086:154c rte_i40evf_pmd
>
> EAL:   PCI memory mapped at 0x7620
>
> EAL:   PCI memory mapped at 0x7621
>
> EAL: PCI device :00:05.0 on NUMA socket 0
>
> EAL:   probe driver: 8086:154c rte_i40evf_pmd
>
> EAL:   PCI memory mapped at 0x76214000
>
> EAL:   PCI memory mapped at 0x76224000
>
> PMD: i40evf_init_vf(): init_adminq failed: -53
>
> PMD: i40evf_dev_init(): Init vf failed
>
> EAL: Error - exiting with code: 1
>
>   Cause: Requested device :00:05.0 cannot be used
>
> [root at PA-VM ~]# ./dpdk-2.2.0/tools/dpdk_nic_bind.py --status
>
>
> Network devices using DPDK-compatible driver
>
> 
>
> :00:04.0 'Device 154c' drv=igb_uio unused=uio_pci_generic
>
> :00:05.0 'Device 154c' drv=igb_uio unused=uio_pci_generic
>
>
> Network devices using kernel driver
>
> ===
>
> :00:03.0 'RTL-8139/8139C/8139C+' if=eth0 drv=8139cp
> unused=igb_uio,uio_pci_generic *Active*
>
>
> Other network devices
>
> =
>
> 
>
> [root@ ~]#
>
>
>
> 04:00.0 *Ether*net controller: Intel Corporation *Ether*net Controller X710
> for 10GbE SFP+ (rev 01)
>
> 04:00.1 *Ether*net controller: Intel Corporation *Ether*net Controller X710
> for 10GbE SFP+ (rev 01)
>
> 04:02.0 *Ether*net controller: Intel Corporation XL710/X710 Virtual
> Function (rev 01)
>
> 04:02.1 *Ether*net controller: Intel Corporation XL710/X710 Virtual
> Function (rev 01)
>
> 04:0a.0 *Ether*net controller: Intel Corporation XL710/X710 Virtual
> Function (rev 01)
>
> 04:0a.1 *Ether*net controller: Intel Corporation XL710/X710 Virtual
> Function (rev 01)
>
>
>
> [root at oscompute3 ~]# dmesg | tail
>
> [2064188.042835] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.062836] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.082862] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.102838] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.122850] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.142852] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.162850] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.182845] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.202845] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [2064188.222858] i40evf :04:0a.1: i40evf_add_ether_addrs: command 15
> pending
>
> [root at oscompute3 ~]#
>
>
> /Saurabh
>

[dpdk-dev] [PATCH v2] ixgbe: Fix disable interrupt twice

2016-02-01 Thread Qiu, Michael

On 1/29/2016 4:07 PM, Lu, Wenzhuo wrote:
> Hi Michael,
>
>> -Original Message-----
>> From: Qiu, Michael
>> Sent: Friday, January 29, 2016 1:58 PM
>> To: dev at dpdk.org
>> Cc: Zhou, Danny; Liu, Yong; Liang, Cunming; Lu, Wenzhuo; Qiu, Michael
>> Subject: [PATCH v2] ixgbe: Fix disable interrupt twice
>>
>> Currently, ixgbe vf and pf will disable interrupt twice in stop stage and 
>> uninit
>> stage. It will cause an error:
>>
>> testpmd> quit
>>
>> Shutting down port 0...
>> Stopping ports...
>> Done
>> Closing ports...
>> EAL: Error disabling MSI-X interrupts for fd 26
>> Done
>>
>> Becasue the interrupt already been disabled in stop stage.
>> Since it is enabled in init stage, better remove from stop stage.
> I'm afraid it?s not a good idea to just remove the intr_disable from dev_stop.
> I think dev_stop have the chance to be used independently with dev_unint. In 
> this scenario, we still need intr_disable, right?
> Maybe what we need is some check before we disable the intr:)

Yes, indeed we need some check in disable intr, but it need additional
fields in "struct rte_intr_handle",  and it's much saft to do so, but as
I check i40e/fm10k code, only ixgbe disable it in dev_stop().

On other hand, if we remove it in dev_stop, any side effect? In ixgbe
start, it will always disable it first and then re-enable it, so it's safe.

Thanks,
Michael
>

[dpdk-dev] [PATCH 0/8] support E-tag offloading and forwarding on Intel X550 NIC

2016-01-29 Thread Qiu, Michael

Hi, Wenzhuo

Better to explain what E-tag is, so that reviewers could known it.

Thanks,
Michael

On 1/29/2016 3:05 PM, Wenzhuo Lu wrote:
> This patch set adds the support of E-tag offloading and forwarding
> on X550.
> The offloading means E-tag can be inserted and stripped by HW.
> And E-tag packets can be recognized and forwarded to specific pools
> based on GRP and E-CID_base in E-tag.
>
> Wenzhuo Lu (8):
>   ixgbe: select pool by MAC when using double VLAN
>   lib/librte_ether: support l2 tunnel config
>   ixgbe: support l2 tunnel config
>   app/testpmd: add CLIs for l2 tunnel config
>   lib/librte_ether: support new l2 tunnel operation
>   ixgbe: support l2 tunnel operation
>   app/testpmd: add CLIs for E-tag operation
>   doc: add release note for E-tag
>
>  app/test-pmd/cmdline.c   | 599 
> +++
>  doc/guides/rel_notes/release_2_3.rst |   6 +
>  drivers/net/ixgbe/ixgbe_ethdev.c | 507 +
>  lib/librte_ether/rte_eth_ctrl.h  |   9 +
>  lib/librte_ether/rte_ethdev.c| 239 ++
>  lib/librte_ether/rte_ethdev.h| 288 +
>  6 files changed, 1648 insertions(+)
>

[dpdk-dev] [PATCH] lib/librte_eal: Fix compile issue with gcc 5.3.1

2016-01-28 Thread Qiu, Michael

On 1/28/2016 4:32 PM, Thomas Monjalon wrote:
> 2016-01-28 15:30, Michael Qiu:
>> In fedora 22 with GCC version 5.3.1, when compile,
>> will result an error:
>>
>> include/rte_memcpy.h:309:7: error: "RTE_MACHINE_CPUFLAG_AVX2"
>> is not defined [-Werror=undef]
>> #elif RTE_MACHINE_CPUFLAG_AVX2
>>
>> Fixes: 9484092baad3 ("eal/x86: optimize memcpy for AVX512 platforms")
> Thanks for the quick fix.
>
> Note about the title formatting:
> As you see in the "Fixes:" line, the title of the original commit
> was starting with "eal/x86". Yours should adopt the same convention.
> A lot of patches are sent with "lib/librte_eal" which was never used
> in the git history. If not sure when writing the title, please check
> the history to keep it consistent.

OK, next time I will do it.

Thanks for point it out.

Thanks,
Michael
> As usual, I will reword it.
> Thanks for your attention
>

[dpdk-dev] [PATCH 4/4] virtio/vdev: add a new vdev named eth_cvio

2016-01-27 Thread Qiu, Michael

On 1/11/2016 2:43 AM, Tan, Jianfeng wrote:
> Add a new virtual device named eth_cvio, it can be used just like
> eth_ring, eth_null, etc.
>
> Configured parameters include:
> - rx (optional, 1 by default): number of rx, only allowed to be
>  1 for now.
> - tx (optional, 1 by default): number of tx, only allowed to be
>  1 for now.


>From APP side, virtio is something HW, in your implementation rx/tx is
max queue numbers virtio supported. Does it make sense?

Why need user tell HW, how much queues it support? We'd better make it
un-configurable, only let users query it like the real HW, and then
decide how much queues it need to enable.


> - cq (optional, 0 by default): if ctrl queue is enabled, not
>  supported for now.
> - mac (optional): mac address, random value will be given if not
> specified.
> - queue_num (optional, 256 by default): size of virtqueue.

Better change it to queue_size.

Thanks,
Michael

> - path (madatory): path of vhost, depends on the file type:
>  vhost-user is used if the given path points to
>  a unix socket; vhost-net is used if the given
>  path points to a char device.
>
> The major difference with original virtio for vm is that, here we
> use virtual address instead of physical address for vhost to
> calculate relative address.
>
> When enable CONFIG_RTE_VIRTIO_VDEV (enabled by default), the compiled
> library can be used in both VM and container environment.
>
> Examples:
> a. Use vhost-net as a backend
> sudo numactl -N 1 -m 1 ./examples/l2fwd/build/l2fwd -c 0x10 -n 4 \
> -m 1024 --no-pci --single-file --file-prefix=l2fwd \
> --vdev=eth_cvio0,mac=00:01:02:03:04:05,path=/dev/vhost-net \
> -- -p 0x1
>
> b. Use vhost-user as a backend
> numactl -N 1 -m 1 ./examples/l2fwd/build/l2fwd -c 0x10 -n 4 -m 1024 \
> --no-pci --single-file --file-prefix=l2fwd \
> --vdev=eth_cvio0,mac=00:01:02:03:04:05,path= \
> -- -p 0x1
>
> Signed-off-by: Huawei Xie 
> Signed-off-by: Jianfeng Tan 
> ---
>

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-26 Thread Qiu, Michael

On 1/26/2016 5:04 PM, Van Haaren, Harry wrote:
>> From: Qiu, Michael
>> On 1/25/2016 7:51 PM, Van Haaren, Harry wrote:
>>> Not really, the secondary process will need some CPU,
>>> however it can sleep so it doesn't have to use 100% of it.
>>> It shouldn't be run on a core that is used by the primary
>>> for packet-forwarding though - that will impact performance.
>> If not, what will happen if the primary been killed after you check
>> alive? At that time, the secondary may be doing some work need primary
>> alive.
> What work are you thinking of? Apart from the shared config
> and hugepages, primary and secondary processes are running
> in their own address-space, and if the primary gets killed,
> the secondary will notice when it next polls rte_eal_primary_proc_alive().
>
> Whatever work the secondary was performing (in its own address space)
> won't be directly changed by the primary being killed, because the
> shared config and hugepages stay (EAL "cleans up" when the primary
> is re-launched, not on quit).

OK,  when primary quit or be killed, the queues will be freed, it will
be a potential issue when secondary try to access, maybe I'm wrong.

Thanks,
Michael

> -Harry
>
>

[dpdk-dev] [PATCH 0/4] virtio support for container

2016-01-26 Thread Qiu, Michael

On 1/11/2016 2:43 AM, Tan, Jianfeng wrote:
> This patchset is to provide high performance networking interface (virtio)
> for container-based DPDK applications. The way of starting DPDK apps in
> containers with ownership of NIC devices exclusively is beyond the scope.
> The basic idea here is to present a new virtual device (named eth_cvio),
> which can be discovered and initialized in container-based DPDK apps using
> rte_eal_init(). To minimize the change, we reuse already-existing virtio
> frontend driver code (driver/net/virtio/).
>  
> Compared to QEMU/VM case, virtio device framework (translates I/O port r/w
> operations into unix socket/cuse protocol, which is originally provided in
> QEMU), is integrated in virtio frontend driver. So this converged driver
> actually plays the role of original frontend driver and the role of QEMU
> device framework.
>  
> The major difference lies in how to calculate relative address for vhost.
> The principle of virtio is that: based on one or multiple shared memory
> segments, vhost maintains a reference system with the base addresses and
> length for each segment so that an address from VM comes (usually GPA,
> Guest Physical Address) can be translated into vhost-recognizable address
> (named VVA, Vhost Virtual Address). To decrease the overhead of address
> translation, we should maintain as few segments as possible. In VM's case,
> GPA is always locally continuous. In container's case, CVA (Container
> Virtual Address) can be used. Specifically:
> a. when set_base_addr, CVA address is used;
> b. when preparing RX's descriptors, CVA address is used;
> c. when transmitting packets, CVA is filled in TX's descriptors;
> d. in TX and CQ's header, CVA is used.
>  
> How to share memory? In VM's case, qemu always shares all physical layout
> to backend. But it's not feasible for a container, as a process, to share
> all virtual memory regions to backend. So only specified virtual memory
> regions (with type of shared) are sent to backend. It's a limitation that
> only addresses in these areas can be used to transmit or receive packets.
>
> Known issues
>
> a. When used with vhost-net, root privilege is required to create tap
> device inside.
> b. Control queue and multi-queue are not supported yet.
> c. When --single-file option is used, socket_id of the memory may be
> wrong. (Use "numactl -N x -m x" to work around this for now)
>  
> How to use?
>
> a. Apply this patchset.
>
> b. To compile container apps:
> $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
>
> c. To build a docker image using Dockerfile below.
> $: cat ./Dockerfile
> FROM ubuntu:latest
> WORKDIR /usr/src/dpdk
> COPY . /usr/src/dpdk
> ENV PATH "$PATH:/usr/src/dpdk/examples/l2fwd/build/"
> $: docker build -t dpdk-app-l2fwd .
>
> d. Used with vhost-user
> $: ./examples/vhost/build/vhost-switch -c 3 -n 4 \
>   --socket-mem 1024,1024 -- -p 0x1 --stats 1
> $: docker run -i -t -v :/var/run/usvhost \
>   -v /dev/hugepages:/dev/hugepages \
>   dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
>   --vdev=eth_cvio0,path=/var/run/usvhost -- -p 0x1
>
> f. Used with vhost-net
> $: modprobe vhost
> $: modprobe vhost-net
> $: docker run -i -t --privileged \
>   -v /dev/vhost-net:/dev/vhost-net \
>   -v /dev/net/tun:/dev/net/tun \
>   -v /dev/hugepages:/dev/hugepages \
>   dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
>   --vdev=eth_cvio0,path=/dev/vhost-net -- -p 0x1

We'd better add a ifname, like
--vdev=eth_cvio0,path=/dev/vhost-net,ifname=tap0, so that user could add
the tap to the bridge first.

Thanks,
Michael
>
> By the way, it's not necessary to run in a container.
>
> Signed-off-by: Huawei Xie 
> Signed-off-by: Jianfeng Tan 
>
> Jianfeng Tan (4):
>   mem: add --single-file to create single mem-backed file
>   mem: add API to obstain memory-backed file info
>   virtio/vdev: add ways to interact with vhost
>   virtio/vdev: add a new vdev named eth_cvio
>
>  config/common_linuxapp |   5 +
>  drivers/net/virtio/Makefile|   4 +
>  drivers/net/virtio/vhost.c | 734 
> +
>  drivers/net/virtio/vhost.h | 192 
>  drivers/net/virtio/virtio_ethdev.c | 338 ++---
>  drivers/net/virtio/virtio_ethdev.h |   4 +
>  drivers/net/virtio/virtio_pci.h|  52 +-
>  drivers/net/virtio/virtio_rxtx.c   |  11 +-
>  drivers/net/virtio/virtio_rxtx_simple.c|  14 +-
>  drivers/net/virtio/virtqueue.h |  13 +-
>  lib/librte_eal/common/eal_common_options.c |  17 +
>  lib/librte_eal/common/eal_internal_cfg.h   |   1 +
>  lib/librte_eal/common/eal_options.h|   2 +
>  lib/librte_eal/common/include/rte_memory.h |  16 +
>  lib/l

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-26 Thread Qiu, Michael

On 1/25/2016 7:51 PM, Van Haaren, Harry wrote:
>> From: Qiu, Michael
>> Subject: Re: [dpdk-dev] [PATCH] eal: add function to check if primary proc 
>> alive
>>
>> So secondary will waste a whole lcore to do such polling?
> Not really, the secondary process will need some CPU,
> however it can sleep so it doesn't have to use 100% of it.
> It shouldn't be run on a core that is used by the primary
> for packet-forwarding though - that will impact performance.

If not, what will happen if the primary been killed after you check
alive? At that time, the secondary may be doing some work need primary
alive.

Thanks,
Michael
> -Harry
>

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-25 Thread Qiu, Michael

On 1/20/2016 9:26 PM, Harry van Haaren wrote:
> This patch adds a new function to the EAL API:
> int rte_eal_primary_proc_alive(const char *path);
>
> The function indicates if a primary process is alive right now.
> This functionality is implemented by testing for a write-
> lock on the config file, and the function tests for a lock.
>
> The use case for this functionality is that a secondary
> process can wait until a primary process starts by polling
> the function and waiting. When the primary is running, the
> secondary continues to poll to detect if the primary process
> has quit unexpectedly, the secondary process can detect this.
>
> The RTE_MAGIC number is written to the shared config by the
> primary process, this is the signal to the secondary process
> that the EAL is set up, and ready to be used. The function
> rte_eal_mcfg_complete() writes RTE_MAGIC. This has been
> delayed in the EAL init proceedure, as the PCI probing in
> the primary process can interfere with the secondary running.
>
> Signed-off-by: Harry van Haaren 
> ---
>  

Hi, Harry

So secondary  will waste a whole lcore to do such polling?

Thanks,
Michael

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-25 Thread Qiu, Michael

On 1/23/2016 1:38 AM, Richardson, Bruce wrote:
> On Thu, Jan 21, 2016 at 09:02:41AM +, Van Haaren, Harry wrote:
>>> From: Qiu, Michael
>>> Sent: Thursday, January 21, 2016 6:14 AM
>>> To: Van Haaren, Harry ; david.marchand at 
>>> 6wind.com
>>> Cc: dev at dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH] eal: add function to check if primary proc 
>>> alive
>>> 
>>> As we could start up many primaries, how does your secondary process
>>> work with them?
>> When a primary process initializes, the location of the config file is 
>> important. The default is /var/run/.rte_config
>>
>> To run multiple primary processes, the --file-prefix= option is used to 
>> specific a custom location for the config file. Eg: --file-prefix=testing
>> /var/run/.testing_config
>>
>> The rte_eal_check_primary_alive(const char*) function takes a char* 
>> parameter - this is the location of the config file that the secondary 
>> process will wait for. Setting it to the correct value will make this 
>> secondary process wait for the corresponding primary process.
>>
>> Regards, -Harry
> Since a given secondary process only works with a single primary process, I'm 
> not
> sure why the user should want or need to pass in this parameter. What's the 
> use
> case for a secondary process wanting to know about a different primary 
> process?
> The details of what the config file is should largely be hidden from the user
> IMHO.

So using the prefix, and get the file name inside the
API(--file-prefix=xxx then the config file /var/run/.xxx_config), if no
perfix, then could be /var/run/.rte_config.

Just a suggestion. Maybe there are better solutions .

Thanks,
Michael
> If you want to allow a secondary to query an arbitrary primary process can you
> still allow a NULL string to query the default primary based on the passed in
> file-prefix parameter (if any)?
>
> /Bruce
>

[dpdk-dev] [PATCH] eal: add function to check if primary proc alive

2016-01-21 Thread Qiu, Michael

On 1/20/2016 9:26 PM, Harry van Haaren wrote:
> This patch adds a new function to the EAL API:
> int rte_eal_primary_proc_alive(const char *path);
>
> The function indicates if a primary process is alive right now.
> This functionality is implemented by testing for a write-
> lock on the config file, and the function tests for a lock.
>
> The use case for this functionality is that a secondary
> process can wait until a primary process starts by polling
> the function and waiting. When the primary is running, the
> secondary continues to poll to detect if the primary process
> has quit unexpectedly, the secondary process can detect this.
>
> The RTE_MAGIC number is written to the shared config by the
> primary process, this is the signal to the secondary process
> that the EAL is set up, and ready to be used. The function
> rte_eal_mcfg_complete() writes RTE_MAGIC. This has been
> delayed in the EAL init proceedure, as the PCI probing in
> the primary process can interfere with the secondary running.
>
> Signed-off-by: Harry van Haaren 
> ---

one question:

As we could start up many primaries, how does your secondary process
work with them?

Thanks,
Michael

[dpdk-dev] Getting error while running DPDK test app on X-Gene1

2016-01-14 Thread Qiu, Michael

On 1/14/2016 12:15 PM, Jerin Jacob wrote:
> On Wed, Jan 13, 2016 at 03:52:01PM +0530, Ankit Jindal wrote:
>> Hi,
>>
>> We are trying to run dpdk on our arm64 based SOC having Intel 10G
>> ixgbe PCIe card plugged. While running any test app, we are getting
>> following error.
>>
>> EAL: PCI device :01:00.0 on NUMA socket 0
>> EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
>> EAL: Cannot open /sys/bus/pci/devices/:01:00.0/resource0: No such
>> file or directory
>> EAL: Error - exiting with code: 1
>>   Cause: Requested device :01:00.0 cannot be used
>
> pci resource creation patch is not yet part of the arm64 mainline kernel.
> The following patch should fix the problem.
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358906.html
>
> Jerin

What's the status of your arm kernel patch?

Thanks,
Michael
>> Below are the details on modules, hugepages and device binding.
>> root at arm64:~# lsmod
>> Module  Size  Used by
>> rte_kni   292795  0
>> igb_uio 4338  0
>> ixgbe 184456  0
>>
>> root at arm64:~/dpdk# cat 
>> /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>> 2048
>>
>> root at arm64:~/dpdk# ./tools/dpdk_nic_bind.py --status
>>
>> Network devices using DPDK-compatible driver
>> 
>> :01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection'
>> drv=igb_uio unused=
>> :01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection'
>> drv=igb_uio unused=
>>
>> Network devices using kernel driver
>> ===
>> 
>>
>> Other network devices
>> =
>> 
>> root at arm64:~/dpdk#
>>
>> Thanks,
>> Ankit

[dpdk-dev] Getting error while running DPDK test app on X-Gene1

2016-01-14 Thread Qiu, Michael

Could you show what's  exists in

/sys/bus/pci/devices/:01:00.0/


Thanks, Michael


On 1/13/2016 6:23 PM, Ankit Jindal wrote:
> Hi,
>
> We are trying to run dpdk on our arm64 based SOC having Intel 10G
> ixgbe PCIe card plugged. While running any test app, we are getting
> following error.
>
> EAL: PCI device :01:00.0 on NUMA socket 0
> EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
> EAL: Cannot open /sys/bus/pci/devices/:01:00.0/resource0: No such
> file or directory
> EAL: Error - exiting with code: 1
>   Cause: Requested device :01:00.0 cannot be used
>
> Below are the details on modules, hugepages and device binding.
> root at arm64:~# lsmod
> Module  Size  Used by
> rte_kni   292795  0
> igb_uio 4338  0
> ixgbe 184456  0
>
> root at arm64:~/dpdk# cat 
> /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> 2048
>
> root at arm64:~/dpdk# ./tools/dpdk_nic_bind.py --status
>
> Network devices using DPDK-compatible driver
> 
> :01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection'
> drv=igb_uio unused=
> :01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection'
> drv=igb_uio unused=
>
> Network devices using kernel driver
> ===
> 
>
> Other network devices
> =
> 
> root at arm64:~/dpdk#
>
> Thanks,
> Ankit
>

[dpdk-dev] [PATCH 00/12] Add API to get packet type info

2016-01-13 Thread Qiu, Michael

On 12/31/2015 9:53 PM, Jianfeng Tan wrote:
> HAPPRY NEW YEAR!
>
> A new ether API rte_eth_dev_get_ptype_info() is added to query what
> packet type information will be provided by current pmd driver of the
> specifed port.
>
> To achieve this, a new function pointer, dev_ptype_info_get, is added
> into struct eth_dev_ops. For those devices who do not implement it, it
> means it will not provide any ptype info.

I haven't go through all the patches, but I have a question, what's the
usercase of this API?

Thanks,
Michael
> Jianfeng Tan (12):
>   ethdev: add API to query what/if packet type is set
>   pmd/cxgbe: add dev_ptype_info_get implementation
>   pmd/e1000: add dev_ptype_info_get implementation
>   pmd/enic: add dev_ptype_info_get implementation
>   pmd/fm10k: add dev_ptype_info_get implementation
>   pmd/i40e: add dev_ptype_info_get implementation
>   pmd/ixgbe: add dev_ptype_info_get implementation
>   pmd/mlx4: add dev_ptype_info_get implementation
>   pmd/mlx5: add dev_ptype_info_get implementation
>   pmd/nfp: add dev_ptype_info_get implementation
>   pmd/vmxnet3: add dev_ptype_info_get implementation
>   examples/l3fwd: add option to parse ptype
>
>  drivers/net/cxgbe/cxgbe_ethdev.c | 17 +++
>  drivers/net/e1000/igb_ethdev.c   | 48 
>  drivers/net/enic/enic_ethdev.c   | 20 +
>  drivers/net/fm10k/fm10k_ethdev.c | 60 +
>  drivers/net/fm10k/fm10k_rxtx.c   |  5 +++
>  drivers/net/fm10k/fm10k_rxtx_vec.c   |  5 +++
>  drivers/net/i40e/i40e_ethdev.c   |  1 +
>  drivers/net/i40e/i40e_ethdev_vf.c|  1 +
>  drivers/net/i40e/i40e_rxtx.c | 69 -
>  drivers/net/i40e/i40e_rxtx.h |  2 +
>  drivers/net/ixgbe/ixgbe_ethdev.c | 50 +
>  drivers/net/ixgbe/ixgbe_ethdev.h |  2 +
>  drivers/net/ixgbe/ixgbe_rxtx.c   |  5 ++-
>  drivers/net/mlx4/mlx4.c  | 27 +++
>  drivers/net/mlx5/mlx5.c  |  1 +
>  drivers/net/mlx5/mlx5.h  |  2 +
>  drivers/net/mlx5/mlx5_ethdev.c   | 25 +++
>  drivers/net/mlx5/mlx5_rxtx.c |  2 +
>  drivers/net/nfp/nfp_net.c| 18 
>  drivers/net/vmxnet3/vmxnet3_ethdev.c | 20 +
>  examples/l3fwd/main.c| 86 
> 
>  lib/librte_ether/rte_ethdev.c| 12 +
>  lib/librte_ether/rte_ethdev.h| 22 +
>  lib/librte_mbuf/rte_mbuf.h   | 13 ++
>  24 files changed, 511 insertions(+), 2 deletions(-)
>

[dpdk-dev] [RFC PATCH 0/2] Virtio-net PMD Extension to work on host.

2015-12-28 Thread Qiu, Michael

Hi, Tetsuya

I have a question about your solution, as I know you plan to run qemu
and dpdk both in container right?

If so, I think it's a bit tricky, DPDK is a lib, and qemu is a App,
seems it is not suitable to let a lib depends on Apps.

Also, till now I don't see any usecase to run qemu inside container.

Thanks,
Michael

On 11/19/2015 6:58 PM, Tetsuya Mukawa wrote:
> THIS IS A PoC IMPLEMENATION.
>
> [Abstraction]
>
> Normally, virtio-net PMD only works on VM, because there is no virtio-net 
> device on host.
> This RFC patch extends virtio-net PMD to be able to work on host as virtual 
> PMD.
> But we didn't implement virtio-net device as a part of virtio-net PMD.
> To prepare virtio-net device for the PMD, start QEMU process with special 
> QTest mode, then connect it from virtio-net PMD through unix domain socket.
>
> The PMD can connect to anywhere QEMU virtio-net device can.
> For example, the PMD can connects to vhost-net kernel module and vhost-user 
> backend application.
> Similar to virtio-net PMD on QEMU, application memory that uses virtio-net 
> PMD will be shared between vhost backend application.
> But vhost backend application memory will not be shared.
>
> Main target of this PMD is container like docker, rkt, lxc and etc.
> We can isolate related processes(virtio-net PMD process, QEMU and vhost-user 
> backend process) by container.
> But, to communicate through unix domain socket, shared directory will be 
> needed.
>
>
> [How to use]
>
> So far, we need QEMU patch to connect to vhost-user backend.
> Please check known issue in later section.
> Because of this, I will describe example of using vhost-net kernel module.
>
>  - Compile
>  Set "CONFIG_RTE_VIRTIO_VDEV=y" in config/common_linux.
>  Then compile it.
>
>  - Start QEMU like below.
>  $ sudo qemu-system-x86_64 -qtest unix:/tmp/qtest0,server -machine 
> accel=qtest \
>-display none -qtest-log /dev/null \
>-netdev 
> type=tap,script=/etc/qemu-ifup,id=net0,vhost=on \
>-device virtio-net-pci,netdev=net0 \
>-chardev socket,id=chr1,path=/tmp/ivshmem0,server \
>-device ivshmem,size=1G,chardev=chr1,vectors=1
>
>  - Start DPDK application like below
>  $ sudo ./testpmd -c f -n 1 -m 1024 --shm \
>   --vdev="eth_cvio0,qtest=/tmp/qtest0,ivshmem=/tmp/ivshmem0" 
> -- \
>   --disable-hw-vlan --txqflags=0xf00 -i
>
>  - Check created tap device.
>
> (*1) Please Specify same memory size in QEMU and DPDK command line.
>
>
> [Detailed Description]
>
>  - virtio-net device implementation
> The PMD uses QEMU virtio-net device. To do that, QEMU QTest functionality is 
> used.
> QTest is a test framework of QEMU devices. It allows us to implement a device 
> driver outside of QEMU.
> With QTest, we can implement DPDK application and virtio-net PMD as 
> standalone process on host.
> When QEMU is invoked as QTest mode, any guest code will not run.
> To know more about QTest, see below.
> http://wiki.qemu.org/Features/QTest
>
>  - probing devices
> QTest provides a unix domain socket. Through this socket, driver process can 
> access to I/O port and memory of QEMU virtual machine.
> The PMD will send I/O port accesses to probe pci devices.
> If we can find virtio-net and ivshmem device, initialize the devices.
> Also, I/O port accesses of virtio-net PMD will be sent through socket, and 
> virtio-net PMD can initialize vitio-net device on QEMU correctly.
>
>  - ivshmem device to share memory
> To share memory that virtio-net PMD process uses, ivshmem device will be used.
> Because ivshmem device can only handle one file descriptor, shared memory 
> should be consist of one file.
> To allocate such a memory, EAL has new option called "--shm".
> If the option is specified, EAL will open a file and allocate memory from 
> hugepages.
> While initializing ivshmem device, we can set BAR(Base Address Register).
> It represents which memory QEMU vcpu can access to this shared memory.
> We will specify host physical address of shared memory as this address.
> It is very useful because we don't need to apply patch to QEMU to calculate 
> address offset.
> (For example, if virtio-net PMD process will allocate memory from shared 
> memory, then specify the physical address of it to virtio-net register, QEMU 
> virtio-net device can understand it without calculating address offset.)
>
>  - Known limitation
> So far, the PMD doesn't handle interrupts from QEMU devices.
> Because of this, VIRTIO_NET_F_STATUS functionality is dropped.
> But without it, we can use all virtio-net functions.
>
>  - Known issues
> So far, to use vhost-user, we need to apply vhost-user patch to QEMU and DPDK 
> vhost library.
> This is because, QEMU will not send memory information and file descriptor of 
> ivshmem device to vhost-user backend.
> (Anyway, vhost-net kernel module can receive the information. So vhost-user

[dpdk-dev] [PATCH 1/3] librte_ether: remove RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET

2015-12-28 Thread Qiu, Michael

On 12/25/2015 1:40 AM, Pattan, Reshma wrote:
>
>> -Original Message-
>> From: Qiu, Michael
>> On 12/23/2015 8:19 PM, Reshma Pattan wrote:
>>> Macros RTE_PROC_PRIMARY_OR_ERR_RET and
>> RTE_PROC_PRIMARY_OR_RET are
>>> blocking the secondary process from using the APIs.
>>> API access should be given to both secondary and primary.
>> Just as the log says, is it safe to do so?
>
> Hi,
>
> Some parts of the code still need these macros, which I am not sure yet. But 
> as and when we identify those we have to add the macros to the needed places. 
> But it is safe to remove from start of function to allow secondary process to 
> do device configuration and queue setups for vdev. 
> Please let me know if you know any of such cases where these macros should be 
> added.

You you have removed almost all secondary check, and as I know, with
your patch, secondary almost has full control of a device, what's the
exact demand for secondary to have full API access?

It's a big change for DPDK I think, but patch set itself is good for me:)

Thanks,
Michael
> Thanks,
> Reshma
>   
>

[dpdk-dev] [PATCH] mk: fix examples build failure

2015-12-28 Thread Qiu, Michael

On 12/24/2015 8:38 PM, steeven lee wrote:
> 1. Fix examples build failure
> 2. make build as default output folder name
>
> Signed-off-by: steeven 
> ---
>  mk/internal/rte.extvars.mk | 4 ++--
>  mk/rte.extsubdir.mk| 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mk/internal/rte.extvars.mk b/mk/internal/rte.extvars.mk
> index 040d39f..cabef0a 100644
> --- a/mk/internal/rte.extvars.mk
> +++ b/mk/internal/rte.extvars.mk
> @@ -52,9 +52,9 @@ RTE_EXTMK ?= $(RTE_SRCDIR)/Makefile
>  export RTE_EXTMK
>
>  # RTE_SDK_BIN must point to .config, include/ and lib/.
> -RTE_SDK_BIN := $(RTE_SDK)/$(RTE_TARGET)
> +RTE_SDK_BIN := $(RTE_SDK)/build
>  ifeq ($(wildcard $(RTE_SDK_BIN)/.config),)
> -$(error Cannot find .config in $(RTE_SDK))
> +$(error Cannot find .config in $(RTE_SDK_BIN))
>  endif
>
>  #
> diff --git a/mk/rte.extsubdir.mk b/mk/rte.extsubdir.mk
> index f50f006..819020a 100644
> --- a/mk/rte.extsubdir.mk
> +++ b/mk/rte.extsubdir.mk
> @@ -46,7 +46,7 @@ $(DIRS-y):
> @echo "== $@"
> $(Q)$(MAKE) -C $(@) \
> M=$(CURDIR)/$(@)/Makefile \
> -   O=$(BASE_OUTPUT)/$(CUR_SUBDIR)/$(@)/$(RTE_TARGET) \
> +   O=$(BASE_OUTPUT)/$(CUR_SUBDIR)/build \
> BASE_OUTPUT=$(BASE_OUTPUT) \
> CUR_SUBDIR=$(CUR_SUBDIR)/$(@) \
> S=$(CURDIR)/$(@) \

Could you show your compile error log? And how to reproduce it?

Thanks,
Michael

[dpdk-dev] [PATCH v2 0/3] Handle SIGINT and SIGTERM in DPDK examples

2015-12-28 Thread Qiu, Michael

On 2015/12/25 17:40, Wang, Zhihong wrote:
> This patch handles SIGINT and SIGTERM in testpmd, l2fwd, and l3fwd, make sure 
> all ports are properly stopped and closed.
> For virtual ports, the stop and close function may deal with resource 
> cleanup, such as socket files unlinking.
>
> --
> Changes in v2:
>
> 1. Make sure graceful exit for all running phases
>
> 2. Make sure program exits with the right status
>
> Zhihong Wang (3):
>   app/test-pmd: Handle SIGINT and SIGTERM in testpmd
>   examples/l2fwd: Handle SIGINT and SIGTERM in l2fwd
>   examples/l3fwd: Handle SIGINT and SIGTERM in l3fwd
>
>  app/test-pmd/cmdline.c |  19 ++---
>  app/test-pmd/testpmd.c |  38 ++---
>  app/test-pmd/testpmd.h |   1 +
>  examples/l2fwd/main.c  |  60 +++
>  examples/l3fwd/main.c  | 110 
> -
>  5 files changed, 196 insertions(+), 32 deletions(-)
>

Next time, you'd better not to top post for V2 :)

Acked-by: Michael Qiu

[dpdk-dev] [RFC PATCH 0/2] Reduce DPDK initialization time

2015-12-24 Thread Qiu, Michael

On 11/18/2015 6:30 PM, Zhihong Wang wrote:
> This RFC patch aims to reduce DPDK initialization time, which is important in 
> cases such as micro service.
>
> Changes are:
>
> 1. Reduce timer initialization time
>
> 2. Remove unnecessary hugepage zero-filling operations
>
> With this patch:
>
> 1. Timer initialization time can be reduced by 4/10 second
>
> 2. Memory initialization time can be reduced nearly by half
>
> The 2nd topic has been brought up before in this thread:
> http://dpdk.org/dev/patchwork/patch/4219/
>
> Zhihong Wang (2):
>   lib/librte_eal: Reduce timer initialization time
>   lib/librte_eal: Remove unnecessary hugepage zero-filling
>
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 5 +
>  lib/librte_eal/linuxapp/eal/eal_timer.c  | 2 +-
>  2 files changed, 2 insertions(+), 5 deletions(-)
>

As I tested with 8192 hugepages(size 2M), one nic 82599 bind, using time
to get the seconds used:
with this patch:

echo quit | time ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 -- -i
2.15 user
5.55 system
0:07.82 elapsed

Without patch:
echo quit | time ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 -- -i
3.18 user
5.63 system
0:09.32 elapsed

1.5s saved,  16% improvement, I don't know if this is good enough, but
indeed save lots of time.

Thanks,
Michael

[dpdk-dev] [PATCH 2/3] examples/l2fwd: Handle SIGINT and SIGTERM in l2fwd

2015-12-24 Thread Qiu, Michael

On 12/24/2015 11:07 AM, Zhihong Wang wrote:
> Handle SIGINT and SIGTERM in l2fwd.
>
> Signed-off-by: Zhihong Wang 
> ---
>  examples/l2fwd/main.c | 25 +
>  1 file changed, 25 insertions(+)
>
> diff --git a/examples/l2fwd/main.c b/examples/l2fwd/main.c
> index 720fd5a..0594037 100644
> --- a/examples/l2fwd/main.c
> +++ b/examples/l2fwd/main.c
> @@ -44,6 +44,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -534,6 +535,27 @@ check_all_ports_link_status(uint8_t port_num, uint32_t 
> port_mask)
>   }
>  }
>  
> +/* When we receive a INT signal, close all ports */
> +static void
> +sigint_handler(__rte_unused int signum)
> +{
> + unsigned portid, nb_ports;
> +
> + printf("Preparing to exit...\n");

Same here and l3fwd, better to show the reason of this exit.

Thanks,
Michael
> + nb_ports = rte_eth_dev_count();
> + for (portid = 0; portid < nb_ports; portid++) {
> + if ((l2fwd_enabled_port_mask & (1 << portid)) == 0) {
> + continue;
> + }
> + printf("Stopping port %d...", portid);
> + rte_eth_dev_stop(portid);
> + rte_eth_dev_close(portid);
> + printf(" Done\n");
> + }
> + printf("Bye...\n");
> + exit(0);
> +}
> +
>  int
>  main(int argc, char **argv)
>  {
> @@ -546,6 +568,9 @@ main(int argc, char **argv)
>   unsigned lcore_id, rx_lcore_id;
>   unsigned nb_ports_in_mask = 0;
>  
> + signal(SIGINT, sigint_handler);
> + signal(SIGTERM, sigint_handler);
> +
>   /* init EAL */
>   ret = rte_eal_init(argc, argv);
>   if (ret < 0)

[dpdk-dev] [PATCH 1/3] app/test-pmd: Handle SIGINT and SIGTERM in testpmd

2015-12-24 Thread Qiu, Michael

On 12/24/2015 11:07 AM, Zhihong Wang wrote:
> Handle SIGINT and SIGTERM in testpmd.
>
> Signed-off-by: Zhihong Wang 
> ---
>  app/test-pmd/testpmd.c | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index 98ae46d..c259ba3 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -1573,6 +1573,7 @@ pmd_test_exit(void)
>   FOREACH_PORT(pt_id, ports) {
>   printf("Stopping port %d...", pt_id);
>   fflush(stdout);
> + rte_eth_dev_stop(pt_id);
>   rte_eth_dev_close(pt_id);
>   printf("done\n");
>   }
> @@ -1984,12 +1985,34 @@ init_port(void)
>   ports[pid].enabled = 1;
>  }
>  
> +/* When we receive a INT signal, close all ports */
> +static void
> +sigint_handler(__rte_unused int signum)
> +{
> + unsigned portid;
> +
> + printf("Preparing to exit...\n");

Better to notice user "Signal xxx received, reparing to exit... "

> + FOREACH_PORT(portid, ports) {
> + if (port_id_is_invalid(portid, ENABLED_WARN))
> + continue;
> + printf("Stopping port %d...", portid);
> + rte_eth_dev_stop(portid);
> + rte_eth_dev_close(portid);
> + printf(" Done\n");
> + }
> + printf("Bye...\n");

Here why don't call pmd_test_exit()? Any issue with that func?

Thanks,
Michael
> + exit(0);
> +}
> +
>  int
>  main(int argc, char** argv)
>  {
>   int  diag;
>   uint8_t port_id;
>  
> + signal(SIGINT, sigint_handler);
> + signal(SIGTERM, sigint_handler);
> +
>   diag = rte_eal_init(argc, argv);
>   if (diag < 0)
>   rte_panic("Cannot init EAL\n");

[dpdk-dev] [PATCH v5 0/6] interrupt mode for fm10k

2015-12-24 Thread Qiu, Michael

On 12/23/2015 3:38 PM, He, Shaopeng wrote:
> This patch series adds interrupt mode support for fm10k,
> contains four major parts:
>
> 1. implement rx_descriptor_done function in fm10k
> 2. add rx interrupt support in fm10k PF and VF
> 3. make sure default VID available in dev_init in fm10k
> 4. fix a memory leak for non-ip packet in l3fwd-power,
>which happens mostly when testing fm10k interrupt mode.
>
> v5 changes:
>   - remove one unnecessary NULL check for rte_free
>   - fix a wrong error message
>   - add more clean up when memory allocation fails
>   - split line over 80 characters to 2 lines
>   - update interrupt mode limitation in fm10k.rst
>
> v4 changes:
>   - rebase to latest code
>   - update release 2.3 note in corresponding patches
>
> v3 changes:
>   - rebase to latest code
>   - macro renaming according to the EAL change
>
> v2 changes:
>   - reword some comments and commit messages
>   - split one big patch into three smaller ones
>
> Shaopeng He (6):
>   fm10k: implement rx_descriptor_done function
>   fm10k: setup rx queue interrupts for PF and VF
>   fm10k: remove rx queue interrupts when dev stops
>   fm10k: add rx queue interrupt en/dis functions
>   fm10k: make sure default VID available in dev_init
>   l3fwd-power: fix a memory leak for non-ip packet
>
>  doc/guides/nics/fm10k.rst|   7 ++
>  doc/guides/rel_notes/release_2_3.rst |   8 ++
>  drivers/net/fm10k/fm10k.h|   6 ++
>  drivers/net/fm10k/fm10k_ethdev.c | 174 
> ---
>  drivers/net/fm10k/fm10k_rxtx.c   |  25 +
>  examples/l3fwd-power/main.c  |   3 +-
>  6 files changed, 211 insertions(+), 12 deletions(-)
>

Acked-by: Michael Qiu

[dpdk-dev] [PATCH 1/3] librte_ether: remove RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET

2015-12-24 Thread Qiu, Michael

On 12/23/2015 8:19 PM, Reshma Pattan wrote:
> Macros RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET
> are blocking the secondary process from using the APIs.
> API access should be given to both secondary and primary.

Just as the log says, is it safe to do so?

Thanks,
Michael
> Fix minor checkpath issues in rte_ethdev.h
>
> Reported-by: Sean Harte 
> Signed-off-by: Reshma Pattan 
> ---
>  lib/librte_ether/rte_ethdev.c | 50 
> +--
>  lib/librte_ether/rte_ethdev.h | 20 -
>  2 files changed, 11 insertions(+), 59 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index ed971b4..5849102 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -711,10 +711,6 @@ rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t 
> rx_queue_id)
>  {
>   struct rte_eth_dev *dev;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
> -
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  
>   dev = &rte_eth_devices[port_id];
> @@ -741,10 +737,6 @@ rte_eth_dev_rx_queue_stop(uint8_t port_id, uint16_t 
> rx_queue_id)
>  {
>   struct rte_eth_dev *dev;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
> -
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  
>   dev = &rte_eth_devices[port_id];
> @@ -771,10 +763,6 @@ rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t 
> tx_queue_id)
>  {
>   struct rte_eth_dev *dev;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
> -
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  
>   dev = &rte_eth_devices[port_id];
> @@ -801,10 +789,6 @@ rte_eth_dev_tx_queue_stop(uint8_t port_id, uint16_t 
> tx_queue_id)
>  {
>   struct rte_eth_dev *dev;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
> -
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  
>   dev = &rte_eth_devices[port_id];
> @@ -874,10 +858,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, 
> uint16_t nb_tx_q,
>   struct rte_eth_dev_info dev_info;
>   int diag;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
> -
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  
>   if (nb_rx_q > RTE_MAX_QUEUES_PER_PORT) {
> @@ -1059,10 +1039,6 @@ rte_eth_dev_start(uint8_t port_id)
>   struct rte_eth_dev *dev;
>   int diag;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
> -
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  
>   dev = &rte_eth_devices[port_id];
> @@ -1096,10 +1072,6 @@ rte_eth_dev_stop(uint8_t port_id)
>  {
>   struct rte_eth_dev *dev;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_RET();
> -
>   RTE_ETH_VALID_PORTID_OR_RET(port_id);
>   dev = &rte_eth_devices[port_id];
>  
> @@ -1121,10 +1093,6 @@ rte_eth_dev_set_link_up(uint8_t port_id)
>  {
>   struct rte_eth_dev *dev;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
> -
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  
>   dev = &rte_eth_devices[port_id];
> @@ -1138,10 +1106,6 @@ rte_eth_dev_set_link_down(uint8_t port_id)
>  {
>   struct rte_eth_dev *dev;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
> -
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>  
>   dev = &rte_eth_devices[port_id];
> @@ -1155,10 +1119,6 @@ rte_eth_dev_close(uint8_t port_id)
>  {
>   struct rte_eth_dev *dev;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*/
> - RTE_PROC_PRIMARY_OR_RET();
> -
>   RTE_ETH_VALID_PORTID_OR_RET(port_id);
>   dev = &rte_eth_devices[port_id];
>  
> @@ -1183,10 +1143,6 @@ rte_eth_rx_queue_setup(uint8_t port_id, uint16_t 
> rx_queue_id,
>   struct rte_eth_dev *dev;
>   struct rte_eth_dev_info dev_info;
>  
> - /* This function is only safe when called from the primary process
> -  * in a multi-process setup*

[dpdk-dev] [PATCH] hash: fix CRC32c computation

2015-12-23 Thread Qiu, Michael

Is it suitable to put so many code in commit log?

Thanks,
Michael
On 12/22/2015 5:36 PM, Didier Pallard wrote:
> As demonstrated by the following code, CRC32c computation is not valid
> when buffer length is not a multiple of 4 bytes:
> (Output obtained by code below)
>
> CRC of 1 NULL bytes expected: 0x527d5351
> soft: 527d5351
> rte accelerated: 48674bc7
> rte soft: 48674bc7
> CRC of 2 NULL bytes expected: 0xf16177d2
> soft: f16177d2
> rte accelerated: 48674bc7
> rte soft: 48674bc7
> CRC of 2x1 NULL bytes expected: 0xf16177d2
> soft: f16177d2
> rte accelerated: 8c28b28a
> rte soft: 8c28b28a
> CRC of 3 NULL bytes expected: 0x6064a37a
> soft: 6064a37a
> rte accelerated: 48674bc7
> rte soft: 48674bc7
> CRC of 4 NULL bytes expected: 0x48674bc7
> soft: 48674bc7
> rte accelerated: 48674bc7
> rte soft: 48674bc7
>
> Values returned by rte_hash_crc functions does not match the one
> computed by a trivial crc32c implementation.
>
> ARM code is a guess, it is not tested, neither compiled.
>
> code showing the problem:
>
> uint8_t null_test[32] = {0};
>
> static uint32_t crc32c_trivial(uint8_t *buffer, uint32_t length, uint32_t crc)
> {
> uint32_t i, j;
> for (i = 0; i < length; ++i)
> {
> crc = crc ^ buffer[i];
> for (j = 0; j < 8; j++)
> crc = (crc >> 1) ^ 0x8000 ^ ((~crc & 1) * 0x82f63b78);
> }
> return crc;
> }
>
> void hash_test(void);
> void hash_test(void)
> {
>   printf("CRC of 1 nul byte expected: 0x527d5351\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 1, 0));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 1, 
> 0x));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 1, 0x));
>
>   printf("CRC of 2 nul bytes expected: 0xf16177d2\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 2, 0));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 2, 
> 0x));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 2, 0x));
>
>   printf("CRC of 2x1 nul bytes expected: 0xf16177d2\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 1, 
> crc32c_trivial(null_test, 1, 0)));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 1, 
> rte_hash_crc(null_test, 1, 0x)));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 1, 
> rte_hash_crc(null_test, 1, 0x)));
>
>   printf("CRC of 3 nul bytes expected: 0x6064a37a\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 3, 0));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 3, 
> 0x));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 3, 0x));
>
>   printf("CRC of 4 nul bytes expected: 0x48674bc7\n");
>   printf("soft: %08x\n", crc32c_trivial(null_test, 4, 0));
>   rte_hash_crc_init_alg();
>   printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 4, 
> 0x));
>   rte_hash_crc_set_alg(CRC32_SW);
>   printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 4, 0x));
> }
>
> Signed-off-by: Didier Pallard 
> Acked-by: David Marchand 
> ---
>  lib/librte_hash/rte_crc_arm64.h |  64 
>  lib/librte_hash/rte_hash_crc.h  | 125 
> +++-
>  2 files changed, 162 insertions(+), 27 deletions(-)
>
> diff --git a/lib/librte_hash/rte_crc_arm64.h b/lib/librte_hash/rte_crc_arm64.h
> index 02e26bc..44ef460 100644
> --- a/lib/librte_hash/rte_crc_arm64.h
> +++ b/lib/librte_hash/rte_crc_arm64.h
> @@ -50,6 +50,28 @@ extern "C" {
>  #include 
>  
>  static inline uint32_t
> +crc32c_arm64_u8(uint8_t data, uint32_t init_val)
> +{
> + asm(".arch armv8-a+crc");
> + __asm__ volatile(
> + "crc32cb %w[crc], %w[crc], %b[value]"
> + : [crc] "+r" (init_val)
> + : [value] "r" (data));
> + return init_val;
> +}
> +
> +static inline uint32_t
> +crc32c_arm64_u16(uint16_t data, uint32_t init_val)
> +{
> + asm(".arch armv8-a+crc");
> + __asm__ volatile(
> + "crc32ch %w[crc], %w[crc], %h[value]"
> + : [crc] "+r" (init_val)
> + : [value] "r" (data));
> + return init_val;
> +}
> +
> +static inline uint32_t
>  crc32c_arm64_u32(uint32_t data, uint32_t init_val)
>  {
>   asm(".arch armv8-a+crc");
> @@ -103,6 +125,48 @@ rte_hash_crc_init_alg(void)
>  }
>  
>  /**
> + * Use single crc32 instruction to perform a hash on a 1 byte value.
> + * Fall back to software crc32 implementation in case arm64 crc intrinsics is

[dpdk-dev] [PATCH v4 2/6] fm10k: setup rx queue interrupts for PF and VF

2015-12-22 Thread Qiu, Michael

On 12/21/2015 6:20 PM, Shaopeng He wrote:
> In interrupt mode, each rx queue can have one interrupt to notify the up
> layer application when packets are available in that queue. Some queues
> also can share one interrupt.
> Currently, fm10k needs one separate interrupt for mailbox. So, only those
> drivers which support multiple interrupt vectors e.g. vfio-pci can work
> in fm10k interrupt mode.
> This patch uses the RXINT/INT_MAP registers to map interrupt causes
> (rx queue and other events) to vectors, and enable these interrupts
> through kernel drivers like vfio-pci.
>
> Signed-off-by: Shaopeng He 
> Acked-by: Jing Chen 
> ---
>  doc/guides/rel_notes/release_2_3.rst |   2 +
>  drivers/net/fm10k/fm10k.h|   3 ++
>  drivers/net/fm10k/fm10k_ethdev.c | 101 
> +++
>  3 files changed, 95 insertions(+), 11 deletions(-)
>
> diff --git a/doc/guides/rel_notes/release_2_3.rst 
> b/doc/guides/rel_notes/release_2_3.rst
> index 99de186..2cb5ebd 100644
> --- a/doc/guides/rel_notes/release_2_3.rst
> +++ b/doc/guides/rel_notes/release_2_3.rst
> @@ -4,6 +4,8 @@ DPDK Release 2.3
>  New Features
>  
>  
> +* **Added fm10k Rx interrupt support.**
> +
>  
>  Resolved Issues
>  ---
> diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> index e2f677a..770d6ba 100644
> --- a/drivers/net/fm10k/fm10k.h
> +++ b/drivers/net/fm10k/fm10k.h
> @@ -129,6 +129,9 @@
>  #define RTE_FM10K_TX_MAX_FREE_BUF_SZ64
>  #define RTE_FM10K_DESCS_PER_LOOP4
>  
> +#define FM10K_MISC_VEC_ID   RTE_INTR_VEC_ZERO_OFFSET
> +#define FM10K_RX_VEC_START  RTE_INTR_VEC_RXTX_OFFSET
> +
>  #define FM10K_SIMPLE_TX_FLAG ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
>   ETH_TXQ_FLAGS_NOOFFLOADS)
>  
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index d39c33b..a34c5e2 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -54,6 +54,8 @@
>  /* Number of chars per uint32 type */
>  #define CHARS_PER_UINT32 (sizeof(uint32_t))
>  #define BIT_MASK_PER_UINT32 ((1 << CHARS_PER_UINT32) - 1)
> +/* default 1:1 map from queue ID to interrupt vector ID */
> +#define Q2V(dev, queue_id) (dev->pci_dev->intr_handle.intr_vec[queue_id])
>  
>  static void fm10k_close_mbx_service(struct fm10k_hw *hw);
>  static void fm10k_dev_promiscuous_enable(struct rte_eth_dev *dev);
> @@ -109,6 +111,8 @@ struct fm10k_xstats_name_off 
> fm10k_hw_stats_tx_q_strings[] = {
>  
>  #define FM10K_NB_XSTATS (FM10K_NB_HW_XSTATS + FM10K_MAX_QUEUES_PF * \
>   (FM10K_NB_RX_Q_XSTATS + FM10K_NB_TX_Q_XSTATS))
> +static int
> +fm10k_dev_rxq_interrupt_setup(struct rte_eth_dev *dev);
>  
>  static void
>  fm10k_mbx_initlock(struct fm10k_hw *hw)
> @@ -687,6 +691,7 @@ static int
>  fm10k_dev_rx_init(struct rte_eth_dev *dev)
>  {
>   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
>   int i, ret;
>   struct fm10k_rx_queue *rxq;
>   uint64_t base_addr;
> @@ -694,10 +699,23 @@ fm10k_dev_rx_init(struct rte_eth_dev *dev)
>   uint32_t rxdctl = FM10K_RXDCTL_WRITE_BACK_MIN_DELAY;
>   uint16_t buf_size;
>  
> - /* Disable RXINT to avoid possible interrupt */
> - for (i = 0; i < hw->mac.max_queues; i++)
> + /* enable RXINT for interrupt mode */
> + i = 0;
> + if (rte_intr_dp_is_en(intr_handle)) {
> + for (; i < dev->data->nb_rx_queues; i++) {
> + FM10K_WRITE_REG(hw, FM10K_RXINT(i), Q2V(dev, i));
> + if (hw->mac.type == fm10k_mac_pf)
> + FM10K_WRITE_REG(hw, FM10K_ITR(Q2V(dev, i)),
> + FM10K_ITR_AUTOMASK | 
> FM10K_ITR_MASK_CLEAR);
> + else
> + FM10K_WRITE_REG(hw, FM10K_VFITR(Q2V(dev, i)),
> + FM10K_ITR_AUTOMASK | 
> FM10K_ITR_MASK_CLEAR);
> + }
> + }
> + /* Disable other RXINT to avoid possible interrupt */
> + for (; i < hw->mac.max_queues; i++)
>   FM10K_WRITE_REG(hw, FM10K_RXINT(i),
> - 3 << FM10K_RXINT_TIMER_SHIFT);
> + 3 << FM10K_RXINT_TIMER_SHIFT);
>  
>   /* Setup RX queues */
>   for (i = 0; i < dev->data->nb_rx_queues; ++i) {
> @@ -1053,6 +1071,9 @@ fm10k_dev_start(struct rte_eth_dev *dev)
>   return diag;
>   }
>  
> + if (fm10k_dev_rxq_interrupt_setup(dev))
> + return -EIO;
> +
>   diag = fm10k_dev_rx_init(dev);
>   if (diag) {
>   PMD_INIT_LOG(ERR, "RX init failed: %d", diag);
> @@ -2072,7 +2093,7 @@ fm10k_dev_enable_intr_pf(struct rte_eth_dev *dev)
>   uint32_t int_map = FM10K_INT_MAP_IMMEDIATE;
>  
>   /* Bind all local non-queue interrupt to vector 0 */
> - int_map |= 0;
> + int_map |= FM10K

[dpdk-dev] [PATCH v2 7/7] doc: release note update for fm10k intr mode

2015-12-22 Thread Qiu, Michael

On 10/26/2015 11:48 AM, He, Shaopeng wrote:
> Signed-off-by: Shaopeng He 
> ---
>  doc/guides/rel_notes/release_2_2.rst | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/doc/guides/rel_notes/release_2_2.rst 
> b/doc/guides/rel_notes/release_2_2.rst
> index 73dba47..44b3aea 100644
> --- a/doc/guides/rel_notes/release_2_2.rst
> +++ b/doc/guides/rel_notes/release_2_2.rst
> @@ -5,11 +5,13 @@ New Features
>  
>  
>  * Support interrupt mode on i40e
> +* Support interrupt mode on fm10k
>  

Do we need to declare the limitation like only VFIO support for RRC, and
other limitations like could not work on the management port for
BR(maybe I'm wrong, but need BAR4 to run Testpoint)

>  Resolved Issues
>  ---
>  
>  * Fix ixgbe/igb rx interrupt compatible issue with mbox
> +* Fix l3fwd-power memory leak for non-ip packet
>  
>  Known Issues
>

[dpdk-dev] [PATCH v4 3/6] fm10k: remove rx queue interrupts when dev stops

2015-12-22 Thread Qiu, Michael

On 12/21/2015 6:20 PM, Shaopeng He wrote:
> Previous dev_stop function stops the rx/tx queues. This patch adds logic
> to disable rx queue interrupt, clean the datapath event and queue/vec map.
>
> Signed-off-by: Shaopeng He 
> Acked-by: Jing Chen 
> ---
>  drivers/net/fm10k/fm10k_ethdev.c | 22 ++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index a34c5e2..b5b809c 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -1125,6 +1125,8 @@ fm10k_dev_start(struct rte_eth_dev *dev)
>  static void
>  fm10k_dev_stop(struct rte_eth_dev *dev)
>  {
> + struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + struct rte_intr_handle *intr_handle = &dev->pci_dev->intr_handle;
>   int i;
>  
>   PMD_INIT_FUNC_TRACE();
> @@ -1136,6 +1138,26 @@ fm10k_dev_stop(struct rte_eth_dev *dev)
>   if (dev->data->rx_queues)
>   for (i = 0; i < dev->data->nb_rx_queues; i++)
>   fm10k_dev_rx_queue_stop(dev, i);
> +
> + /* Disable datapath event */
> + if (rte_intr_dp_is_en(intr_handle)) {
> + for (i = 0; i < dev->data->nb_rx_queues; i++) {
> + FM10K_WRITE_REG(hw, FM10K_RXINT(i),
> + 3 << FM10K_RXINT_TIMER_SHIFT);
> + if (hw->mac.type == fm10k_mac_pf)
> + FM10K_WRITE_REG(hw, FM10K_ITR(Q2V(dev, i)),
> + FM10K_ITR_MASK_SET);
> + else
> + FM10K_WRITE_REG(hw, FM10K_VFITR(Q2V(dev, i)),
> + FM10K_ITR_MASK_SET);
> + }
> + }
> + /* Clean datapath event and queue/vec mapping */
> + rte_intr_efd_disable(intr_handle);
> + if (intr_handle->intr_vec != NULL) {

This line could be removed, because rte_free already do the check, see
below:
void rte_free(void *addr)
{
if (addr == NULL) return;
if (malloc_elem_free(malloc_elem_from_data(addr)) < 0)
rte_panic("Fatal error: Invalid memory\n");
}

> + rte_free(intr_handle->intr_vec);
> + intr_handle->intr_vec = NULL;
> + }
>  }
>  
>  static void

[dpdk-dev] [PATCH v4 1/6] fm10k: implement rx_descriptor_done function

2015-12-22 Thread Qiu, Michael

On 12/21/2015 6:20 PM, Shaopeng He wrote:
> rx_descriptor_done is used by interrupt mode example application
> (l3fwd-power) to check rxd DD bit to decide the RX trend,
> then l3fwd-power will adjust the cpu frequency according to
> the result.
>
> Signed-off-by: Shaopeng He 
> Acked-by: Jing Chen 
> ---
>  drivers/net/fm10k/fm10k.h|  3 +++
>  drivers/net/fm10k/fm10k_ethdev.c |  1 +
>  drivers/net/fm10k/fm10k_rxtx.c   | 25 +
>  3 files changed, 29 insertions(+)
>
> diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> index cd38af2..e2f677a 100644
> --- a/drivers/net/fm10k/fm10k.h
> +++ b/drivers/net/fm10k/fm10k.h
> @@ -345,6 +345,9 @@ uint16_t fm10k_recv_pkts(void *rx_queue, struct rte_mbuf 
> **rx_pkts,
>  uint16_t fm10k_recv_scattered_pkts(void *rx_queue,
>   struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
>  
> +int
> +fm10k_dev_rx_descriptor_done(void *rx_queue, uint16_t offset);
> +
>  uint16_t fm10k_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>   uint16_t nb_pkts);
>  
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index e4aed94..d39c33b 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -2435,6 +2435,7 @@ static const struct eth_dev_ops fm10k_eth_dev_ops = {
>   .rx_queue_release   = fm10k_rx_queue_release,
>   .tx_queue_setup = fm10k_tx_queue_setup,
>   .tx_queue_release   = fm10k_tx_queue_release,
> + .rx_descriptor_done = fm10k_dev_rx_descriptor_done,
>   .reta_update= fm10k_reta_update,
>   .reta_query = fm10k_reta_query,
>   .rss_hash_update= fm10k_rss_hash_update,
> diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
> index e958865..36d3002 100644
> --- a/drivers/net/fm10k/fm10k_rxtx.c
> +++ b/drivers/net/fm10k/fm10k_rxtx.c
> @@ -369,6 +369,31 @@ fm10k_recv_scattered_pkts(void *rx_queue, struct 
> rte_mbuf **rx_pkts,
>   return nb_rcv;
>  }
>  
> +int
> +fm10k_dev_rx_descriptor_done(void *rx_queue, uint16_t offset)
> +{
> + volatile union fm10k_rx_desc *rxdp;
> + struct fm10k_rx_queue *rxq = rx_queue;
> + uint16_t desc;
> + int ret;
> +
> + if (unlikely(offset >= rxq->nb_desc)) {
> + PMD_DRV_LOG(ERR, "Invalid RX queue id %u", offset);

Sorry, here makes my confuse: offset for RX queue id?

> + return 0;
> + }
> +
> + desc = rxq->next_dd + offset;
> + if (desc >= rxq->nb_desc)
> + desc -= rxq->nb_desc;
> +
> + rxdp = &rxq->hw_ring[desc];
> +
> + ret = !!(rxdp->w.status &
> + rte_cpu_to_le_16(FM10K_RXD_STATUS_DD));
> +
> + return ret;
> +}
> +
>  static inline void tx_free_descriptors(struct fm10k_tx_queue *q)
>  {
>   uint16_t next_rs, count = 0;

[dpdk-dev] [PATCH] librte_ether: fix crashes in rte_ethdev functions.

2015-12-21 Thread Qiu, Michael

On 2015/12/18 1:24, Bernard Iremonger wrote:
> The nb_rx_queues and nb_tx_queues are initialised before
> the tx_queue and rx_queue arrays are allocated. The arrays
> are allocated when the ethdev port is started.
>
> If any of the following functions are called before the ethdev
> port is started there is a segmentation fault:
>
> rte_eth_stats_get
> rte_eth_stats_reset
> rte_eth_xstats_get
> rte_eth_xstats_reset
>
> Fixes: af75078fece3 ("first public release")
> Fixes: ce757f5c9a4d ("ethdev: new method to retrieve extended statistics")
> Fixes: d4fef8b0d5e5 ("ethdev: expose generic and driver specific stats in 
> xstats")
> Signed-off-by: Bernard Iremonger 
> ---
>  lib/librte_ether/rte_ethdev.c | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index ed971b4..a0ee84d 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1441,7 +1441,10 @@ rte_eth_stats_get(uint8_t port_id, struct 
> rte_eth_stats *stats)
>   memset(stats, 0, sizeof(*stats));
>  
>   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_get, -ENOTSUP);
> - (*dev->dev_ops->stats_get)(dev, stats);
> +
> + if (dev->data->dev_started)
> + (*dev->dev_ops->stats_get)(dev, stats);
> +

My question is should we mark an error or a warning here and return an
error so that the caller knows what happens?

Thanks,
Michael

>   stats->rx_nombuf = dev->data->rx_mbuf_alloc_failed;
>   return 0;
>  }
> @@ -1455,7 +1458,10 @@ rte_eth_stats_reset(uint8_t port_id)
>   dev = &rte_eth_devices[port_id];
>  
>   RTE_FUNC_PTR_OR_RET(*dev->dev_ops->stats_reset);
> - (*dev->dev_ops->stats_reset)(dev);
> +
> + if (dev->data->dev_started)
> + (*dev->dev_ops->stats_reset)(dev);
> +
>   dev->data->rx_mbuf_alloc_failed = 0;
>  }
>  
> @@ -1479,7 +1485,8 @@ rte_eth_xstats_get(uint8_t port_id, struct 
> rte_eth_xstats *xstats,
>   (dev->data->nb_tx_queues * RTE_NB_TXQ_STATS);
>  
>   /* implemented by the driver */
> - if (dev->dev_ops->xstats_get != NULL) {
> + if ((dev->dev_ops->xstats_get != NULL) &&
> + (dev->data->dev_started)) {
>   /* Retrieve the xstats from the driver at the end of the
>* xstats struct.
>*/
> @@ -1548,7 +1555,8 @@ rte_eth_xstats_reset(uint8_t port_id)
>   dev = &rte_eth_devices[port_id];
>  
>   /* implemented by the driver */
> - if (dev->dev_ops->xstats_reset != NULL) {
> + if ((dev->dev_ops->xstats_reset != NULL) &&
> + (dev->data->dev_started)) {
>   (*dev->dev_ops->xstats_reset)(dev);
>   return;
>   }

[dpdk-dev] [PATCH v3] mem: calculate space left in a hugetlbfs

2015-12-21 Thread Qiu, Michael

On 2015/11/18 17:42, Jianfeng Tan wrote:
> Currently DPDK does not respect the quota of a hugetblfs mount.
> It will fail to init the EAL because it tries to map the number of
> free hugepages in the system rather than using the number specified
> in the quota for that mount.
>
> To solve this issue, we take the quota into consideration when
> calculating the number of hugepages to map.  We use either the number
> specified in the quota, or number of available hugepages, whichever
> is lower.
>
> There are possible race conditions when multiple applications
> allocate hugepages in different hugetlbfs mounts of the same size,
> so the suggested system would have a pool with enough hugepages for
> all hugetlbfs mount quotas.
>
> There is, however, still an open issue with
> CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS. When this option is enabled
> (IVSHMEM target does this by default), having hugetlbfs mounts with
> quota will fail to remap hugepages because it relies on having
> mapped all free hugepages in the system.
>
> Signed-off-by: Jianfeng Tan 
>

Acked-by: Michael Qiu

[dpdk-dev] Issue with patch "app/testpmd: detect numa socket count"

2015-12-19 Thread Qiu, Michael

Sorry, I forgot to do that, add CC to mailing list 

Thanks,
Michael

> ? 2015?12?197:32?Stephen Hurd  ???
> 
> Thanks, I can reproduce it now.
> 
> The intent was to set max_socket when detecting lcores, not when selecting 
> active ones, so there's a bug in my patch.  I'll actually be working on DPDK 
> next week, so I'll have fix on Monday or Tuesday.
> 
> As for parsing /sys/, that's not likely to work on FreeBSD, and since we 
> already detect all lcores and their socket, it makes sense to set max_socket 
> there.
> 
> My request for the additional logging was so I could see the lines that 
> correspond to these:
> 
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 0 on socket 1
> EAL: Detected lcore 2 as core 1 on socket 0
> EAL: Detected lcore 3 as core 1 on socket 1
> 
> Which would indicate that EAL was able to detect a higher max socket even 
> though it's not enabled.
> 
> Is there a reason we're not CCing the mailing list?
> 
> 
> -- Stephen Hurd
> 
> 
> -Original Message-
> From: Qiu, Michael [mailto:michael.qiu at intel.com] 
> Sent: Thursday, December 17, 2015 10:31 PM
> To: Stephen Hurd; De Lara Guarch, Pablo
> Cc: Tan, Jianfeng; Gonzalez Monroy, Sergio
> Subject: Re: Issue with patch "app/testpmd: detect numa socket count"
> 
> That's the bug, I have two socket in my system, also from log you could see:
> 
> EAL: Requesting 512 pages of size 2MB from socket 1
> 
> command is : ./testpmd -c 0x3 -n 4 -- -i -socket-num=1
> 
> the root cause is that you only check the enabled lcore which is on
> socket 0, and then mark max socket as 1,  also your could has issue when
> in --numa
> 
> Because when numa enabled, still we run with coremask 0x03, the max
> socket will never be 2, which will lead lots of issues.
> 
> Thanks,
> Michael
> 
> 
>> On 2015/12/18 9:57, Stephen Hurd wrote:
>> Can you provide more log info (the socket to core mapping especially) and 
>> the entire command line (or at least the entire EAL set)?  The error 
>> suggests that you only have one numa socket (socket 0) on that system and 
>> are attempting to allocate memory on socket 1.
>> 
>> My patch redefined the max socket to be the last socket the system has 
>> whereas the old code allowed specifying sockets that aren't physically 
>> present in the system.
>> 
>> 
>> -- Stephen Hurd
>> 
>> 
>> -Original Message-
>> From: Qiu, Michael [mailto:michael.qiu at intel.com] 
>> Sent: Thursday, December 17, 2015 5:42 PM
>> To: Stephen Hurd; De Lara Guarch, Pablo
>> Subject: Issue with patch "app/testpmd: detect numa socket count"
>> 
>> Hi, Stephen
>> 
>> I just see this patch and found some issue with it.
>> 
>> When I start testpmd with -c 0x3 but with socket-num 1, that means run
>> lcore in socket 0, but want hugepage allocated in socket 1, in previous,
>> it works, I don't know why you force it in the socket lcore locates.I
>> would like to give a warning instead of failure. just like before:
>> 
>> EAL: Requesting 512 pages of size 2MB from socket 1
>> EAL: TSC frequency is ~2294689 KHz
>> EAL: WARNING: Master core has no memory on local socket!
>> 
>> After your patch:
>> 
>> EAL: No probed ethernet devices
>> Interactive-mode selected
>> EAL: Error - exiting with code: 1
>>  Cause: The socket number should be < 1
>> 
>> 
>> Thanks,
>> Michael
> 
>

[dpdk-dev] [PATCH] ip_pipeline: Fix compile issue with strict-aliasing

2015-12-09 Thread Qiu, Michael

Sorry please ignore this :)

Thanks,
Michael

-Original Message-
From: Qiu, Michael 
Sent: Wednesday, December 9, 2015 4:40 PM
To: dev at dpdk.org
Cc: Singh, Jasvinder; Dumitrescu, Cristian; root; Qiu, Michael
Subject: [PATCH] ip_pipeline: Fix compile issue with strict-aliasing

From: root 

strict-aliasing

Signed-off-by: Michael Qiu 
---
 .../ip_pipeline/pipeline/pipeline_routing_be.c |   11 ---
 1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/examples/ip_pipeline/pipeline/pipeline_routing_be.c 
b/examples/ip_pipeline/pipeline/pipeline_routing_be.c
index 4a95c7d..9baabd0 100644
--- a/examples/ip_pipeline/pipeline/pipeline_routing_be.c
+++ b/examples/ip_pipeline/pipeline/pipeline_routing_be.c
@@ -1461,8 +1461,7 @@ pipeline_routing_msg_req_route_add_handler(struct 
pipeline *p, void *msg)
uint64_t macaddr_dst;
uint64_t ethertype = ETHER_TYPE_IPv4;

-   *((struct ether_addr *) &macaddr_dst) =
-   req->data.ethernet.macaddr;
+   macaddr_dst = *((uint64_t *)&(req->data.ethernet.macaddr));
macaddr_dst = rte_bswap64(macaddr_dst << 16);

entry_arp0.slab[0] =
@@ -1503,8 +1502,7 @@ pipeline_routing_msg_req_route_add_handler(struct 
pipeline *p, void *msg)
uint64_t svlan = req->data.l2.qinq.svlan;
uint64_t cvlan = req->data.l2.qinq.cvlan;

-   *((struct ether_addr *) &macaddr_dst) =
-   req->data.ethernet.macaddr;
+   macaddr_dst = *((uint64_t *)&(req->data.ethernet.macaddr));
macaddr_dst = rte_bswap64(macaddr_dst << 16);

entry_arp0.slab[0] = rte_bswap64((svlan << 48) | @@ -1563,8 
+1561,7 @@ pipeline_routing_msg_req_route_add_handler(struct pipeline *p, void 
*msg)
uint64_t label3 = req->data.l2.mpls.labels[3];
uint32_t n_labels = req->data.l2.mpls.n_labels;

-   *((struct ether_addr *) &macaddr_dst) =
-   req->data.ethernet.macaddr;
+   macaddr_dst = *((uint64_t *)&(req->data.ethernet.macaddr));
macaddr_dst = rte_bswap64(macaddr_dst << 16);

switch (n_labels) {
@@ -1814,7 +1811,7 @@ pipeline_routing_msg_req_arp_add_handler(struct pipeline 
*p, void *msg)
return rsp;
}

-   *((struct ether_addr *) &entry.macaddr) = req->macaddr;
+   entry.macaddr = *((uint64_t *)&(req->macaddr));
entry.macaddr = entry.macaddr << 16;

rsp->status = rte_pipeline_table_entry_add(p->p,
--
1.7.1

[dpdk-dev] Compile error with CONFIG_RTE_BUILD_COMBINE_LIBS=y

2015-12-08 Thread Qiu, Michael

My mistake. please ignore this issue.

Thanks,
Michael

On 2015/12/8 15:08, Qiu, Michael wrote:
> Hi, Thomas
>
> I see you recently merged one commit:
>
> commit 8f1c704fb0f1b867471fc692ed2c0fc5610831e2
> Author: Thomas Monjalon 
> Date:   Tue Dec 8 01:50:17 2015 +0100
>
> mk: fix external library build when combine is enabled
> 
> The object files are copied to prepare the internal combined library.
> It must be disabled when building an external library.
> 
> It has been seen because the directory was missing:
> examples/ethtool/lib/x86_64-native-linuxapp-gcc/build/lib:
> No such file or directory
> 
> Signed-off-by: Thomas Monjalon 
>
> It lead compile error when open CONFIG_RTE_BUILD_COMBINE_LIBS.
>
> See below:
>
> make -j install T=x86_64-native-linuxapp-gcc
>
> ...
>
>  CC pipeline_acl.o  
>
>   [1491/1853]
>   CC parameters.o
>   CC macfwd-retry.o
>   CC testpmd.o
>   CC macswap.o
>   CC macfwd.o
>   CC rxonly.o
>   CC flowgen.o
>   CC txonly.o
>   CC csumonly.o
>   CC icmpecho.o
>   CC mempool_anon.o
>   LD cmdline_test
>   CC test.o
>   CC commands.o
>   CC test_pci.o
>   CC test_prefetch.o
>   CC test_byteorder.o
>   CC test_per_lcore.o
>   CC test_atomic.o
> cmdline_test.o: In function `main':
> cmdline_test.c:(.text.startup+0xc): undefined reference to
> `cmdline_stdin_new'
> cmdline_test.c:(.text.startup+0x1c): undefined reference to
> `cmdline_interact'
> cmdline_test.c:(.text.startup+0x24): undefined reference to
> `cmdline_stdin_exit'
> commands.o: In function `cmd_quit_parsed':
> commands.c:(.text+0x4): undefined reference to `cmdline_quit'
> commands.o: In function `cmd_single_parsed':
> commands.c:(.text+0x1b): undefined reference to `cmdline_printf'
> commands.o: In function `cmd_single_long_parsed':
> commands.c:(.text+0x2b): undefined reference to `cmdline_printf'
> commands.o: In function `cmd_autocomplete_1_parsed':
> commands.c:(.text+0x3b): undefined reference to `cmdline_printf'
> commands.o: In function `cmd_autocomplete_2_parsed':
> commands.c:(.text+0x4b): undefined reference to `cmdline_printf'
> commands.o: In function `cmd_num_parsed':
> commands.c:(.text+0x60): undefined reference to `cmdline_printf'
> commands.o:commands.c:(.text+0x7b): more undefined references to
> `cmdline_printf' follow
> commands.o: In function `cmd_clear_history_parsed':
> commands.c:(.text+0xb5): undefined reference to `rdline_clear_history'
> commands.o:(.data+0x90): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0xe0): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0x130): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0x150): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0x1a0): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0x1c0): more undefined references to
> `cmdline_token_string_ops' follow
> commands.o:(.data+0x210): undefined reference to `cmdline_token_num_ops'
> commands.o:(.data+0x260): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0x2b0): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0x300): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0x350): undefined reference to `cmdline_token_string_ops'
> commands.o:(.data+0x3a0): undefined reference to `cmdline_token_string_ops'
>
> 
>
>
>
> Thanks,
> Michael
>

[dpdk-dev] Compile error with CONFIG_RTE_BUILD_COMBINE_LIBS=y

2015-12-08 Thread Qiu, Michael

Hi, Thomas

I see you recently merged one commit:

commit 8f1c704fb0f1b867471fc692ed2c0fc5610831e2
Author: Thomas Monjalon 
Date:   Tue Dec 8 01:50:17 2015 +0100

mk: fix external library build when combine is enabled

The object files are copied to prepare the internal combined library.
It must be disabled when building an external library.

It has been seen because the directory was missing:
examples/ethtool/lib/x86_64-native-linuxapp-gcc/build/lib:
No such file or directory

Signed-off-by: Thomas Monjalon 

It lead compile error when open CONFIG_RTE_BUILD_COMBINE_LIBS.

See below:

make -j install T=x86_64-native-linuxapp-gcc

...

 CC pipeline_acl.o  

  [1491/1853]
  CC parameters.o
  CC macfwd-retry.o
  CC testpmd.o
  CC macswap.o
  CC macfwd.o
  CC rxonly.o
  CC flowgen.o
  CC txonly.o
  CC csumonly.o
  CC icmpecho.o
  CC mempool_anon.o
  LD cmdline_test
  CC test.o
  CC commands.o
  CC test_pci.o
  CC test_prefetch.o
  CC test_byteorder.o
  CC test_per_lcore.o
  CC test_atomic.o
cmdline_test.o: In function `main':
cmdline_test.c:(.text.startup+0xc): undefined reference to
`cmdline_stdin_new'
cmdline_test.c:(.text.startup+0x1c): undefined reference to
`cmdline_interact'
cmdline_test.c:(.text.startup+0x24): undefined reference to
`cmdline_stdin_exit'
commands.o: In function `cmd_quit_parsed':
commands.c:(.text+0x4): undefined reference to `cmdline_quit'
commands.o: In function `cmd_single_parsed':
commands.c:(.text+0x1b): undefined reference to `cmdline_printf'
commands.o: In function `cmd_single_long_parsed':
commands.c:(.text+0x2b): undefined reference to `cmdline_printf'
commands.o: In function `cmd_autocomplete_1_parsed':
commands.c:(.text+0x3b): undefined reference to `cmdline_printf'
commands.o: In function `cmd_autocomplete_2_parsed':
commands.c:(.text+0x4b): undefined reference to `cmdline_printf'
commands.o: In function `cmd_num_parsed':
commands.c:(.text+0x60): undefined reference to `cmdline_printf'
commands.o:commands.c:(.text+0x7b): more undefined references to
`cmdline_printf' follow
commands.o: In function `cmd_clear_history_parsed':
commands.c:(.text+0xb5): undefined reference to `rdline_clear_history'
commands.o:(.data+0x90): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0xe0): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0x130): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0x150): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0x1a0): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0x1c0): more undefined references to
`cmdline_token_string_ops' follow
commands.o:(.data+0x210): undefined reference to `cmdline_token_num_ops'
commands.o:(.data+0x260): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0x2b0): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0x300): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0x350): undefined reference to `cmdline_token_string_ops'
commands.o:(.data+0x3a0): undefined reference to `cmdline_token_string_ops'





Thanks,
Michael

[dpdk-dev] [PATCH v2] lib/librte_sched: Fix compile with gcc 4.3.4

2015-12-02 Thread Qiu, Michael

I will make v3 patch to fix the issue.

Thanks,
Michael

On 2015/12/2 10:19, Thomas Monjalon wrote:
> 2015-12-02 10:09, Michael Qiu:
>> gcc 4.3.4 does not include "immintrin.h", and will post below error:
>> lib/librte_sched/rte_sched.c:56:23: error:
>> immintrin.h: No such file or directory
> This compiler issue is fixed with rte_vect.h.
>
>> To avoid this issue, a gcc version check is need and a flag to indicate
>> vector ablility.
> It is another issue: we need SSE2 support.
>
>> --- a/lib/librte_sched/rte_sched.c
>> +++ b/lib/librte_sched/rte_sched.c
>> @@ -42,6 +42,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
> Shouldn't be in #ifdef RTE_SCHED_VECTOR ?
>
>>  #include "rte_sched.h"
>>  #include "rte_bitmap.h"
>> @@ -53,7 +54,11 @@
>>  #endif
>>  
>>  #ifdef RTE_SCHED_VECTOR
>> -#include 
>> +
>> +#if defined(__SSE2__)
>> +#define SCHED_VECTOR_ENABLE
>> +#endif
> I think the flag should SCHED_VECTOR_SSE2
>
> With this fix, the need for disabling SCHED_VECTOR for non-x86 platforms
> should disappear.
> But it may be safe to disable it (another patch).
> Thanks
>

[dpdk-dev] [PATCH] lib/librte_sched: Fix compile with gcc 4.3.4

2015-11-27 Thread Qiu, Michael

Sorry for not explaining clearly. 

For gcc version start from version 4.4, x86intrin.h will be include, and inside 
x86intrin.h, immintrin.h will be directly include without check AVX (as I know, 
AVX is not exist when gcc >= 4.4),so no AVX macro does not mean vector disable.

Only gcc < 4.4 and no macro AVX will disable vector.

This is my understanding, may be wrong :)

Thanks,
Michael

> ? 2015?11?2710:09?Ananyev, Konstantin  
> ???
> 
> 
> 
>> -Original Message-
>> From: Qiu, Michael
>> Sent: Friday, November 27, 2015 2:02 PM
>> To: Ananyev, Konstantin
>> Cc: Thomas Monjalon; dev at dpdk.org
>> Subject: Re: [PATCH] lib/librte_sched: Fix compile with gcc 4.3.4
>> 
>> I just replied that Marco AVX only exist in gcc version < 4.4 , otherwise it 
>> will not exist.
> 
> If macro __AVX__ not defined, then 
> #if defined(__AVX__)
> would always be false and SCHED_VECTOR_ENABLE also wouldn't be defined.
> So still don't understand why that is a problem
> Konstantin
> 
>> 
>> What's your suggest will not work if gcc version greater than 4.3.
>> 
>> So still need to check gcc version. Any other solution?
>> 
>> Thanks,
>> Michael
>> 
>>> ? 2015?11?278:34?Ananyev, Konstantin  
>>> ???
>>> 
>>> 
>>>> -Original Message-
>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Qiu, Michael
>>>> Sent: Friday, November 27, 2015 11:53 AM
>>>> To: Thomas Monjalon
>>>> Cc: dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH] lib/librte_sched: Fix compile with gcc 
>>>> 4.3.4
>>>> 
>>>> really?I don't think so.
>>>> 
>>>> AVX Marco only exist in the gcc version below 4.4,  I still need to check 
>>>> if below or beyond 4.4 am I right?
>>>> 
>>>> Thanks,
>>>> Michael
>>> 
>>> 
>>> If you look at lib/librte_eal/common/include/arch/x86/rte_vect.h, you'll 
>>> see the code similar
>>> to one you are trying to put into rte_shed.c:
>>> 
>>> lib/librte_eal/common/include/arch/x86/rte_vect.h:
>>> ...
>>> #if (defined(__ICC) || (__GNUC__ == 4 &&  __GNUC_MINOR__ < 4))
>>> 
>>> #ifdef __SSE__
>>> #include 
>>> #endif
>>> 
>>> #ifdef __SSE2__
>>> #include 
>>> #endif
>>> 
>>> #ifdef __SSE3__
>>> #include 
>>> #endif
>>> 
>>> #if defined(__SSE4_2__) || defined(__SSE4_1__)
>>> #include 
>>> #endif
>>> 
>>> #if defined(__AVX__)
>>> #include 
>>> #endif
>>> 
>>> #else
>>> 
>>> #include 
>>> 
>>> #endif
>>> ...
>>> 
>>> So I think you can do just like that:
>>> 
>>> #include 
>>> #if defined(__AVX__)
>>> #define SCHED_VECTOR_ENABLE
>>> #endif
>>> 
>>> inside rte_sched.c
>>> 
>>> Konstantin
>>> 
>>> 
>>>> 
>>>> 
>>>>> ? 2015?11?275:01?Thomas Monjalon  ???
>>>>> 
>>>>> 2015-11-27 02:26, Qiu, Michael:
>>>>>>>> On 2015/11/27 5:29, Thomas Monjalon wrote:
>>>>>>>> 2015-11-26 18:49, Michael Qiu:
>>>>>>>> gcc 4.3.4 does not include "immintrin.h", and will post below error:
>>>>>>>>  lib/librte_sched/rte_sched.c:56:23: error:
>>>>>>>>  immintrin.h: No such file or directory
>>>>>>>> 
>>>>>>>> To avoid this issue, a gcc version check is need and a flag to indicate
>>>>>>>> vector ablility.
>>>>>>> [...]
>>>>>>>> +#if (defined(__ICC) || (__GNUC__ == 4 &&  __GNUC_MINOR__ < 4))
>>>>>>>> +
>>>>>>>> +#if defined(__AVX__)
>>>>>>>> #include 
>>>>>>>> +#define SCHED_VECTOR_ENABLE
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>> +#else
>>>>>>>> +
>>>>>>>> +#include 
>>>>>>>> +#define SCHED_VECTOR_ENABLE
>>>>>>>> +
>>>>>>>> +#endif
>>>>>>> This kind of complication is managed by EAL.
>>>>>>> I think we should include rte_vect.h.
>>>>>> 
>>>>>> As I know here it needs a flag to identify whether the platform support
>>>>>> AVX, if not it will not use it, so I don't know if we could only simply
>>>>>> include rte_vect.h?
>>>>> 
>>>>> It's not exclusive.
>>>>> You can include rte_vect.h and check AVX to define SCHED_VECTOR_ENABLE.
>>>>>

[dpdk-dev] [PATCH] lib/librte_sched: Fix compile with gcc 4.3.4

2015-11-27 Thread Qiu, Michael

I just replied that Marco AVX only exist in gcc version < 4.4 , otherwise it 
will not exist.

What's your suggest will not work if gcc version greater than 4.3.

So still need to check gcc version. Any other solution?

Thanks,
Michael

> ? 2015?11?278:34?Ananyev, Konstantin  ???
> 
> 
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Qiu, Michael
>> Sent: Friday, November 27, 2015 11:53 AM
>> To: Thomas Monjalon
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] lib/librte_sched: Fix compile with gcc 4.3.4
>> 
>> really?I don't think so.
>> 
>> AVX Marco only exist in the gcc version below 4.4,  I still need to check if 
>> below or beyond 4.4 am I right?
>> 
>> Thanks,
>> Michael
> 
> 
> If you look at lib/librte_eal/common/include/arch/x86/rte_vect.h, you'll see 
> the code similar
> to one you are trying to put into rte_shed.c:
> 
> lib/librte_eal/common/include/arch/x86/rte_vect.h:
> ...
> #if (defined(__ICC) || (__GNUC__ == 4 &&  __GNUC_MINOR__ < 4))
> 
> #ifdef __SSE__
> #include 
> #endif
> 
> #ifdef __SSE2__
> #include 
> #endif
> 
> #ifdef __SSE3__
> #include 
> #endif
> 
> #if defined(__SSE4_2__) || defined(__SSE4_1__)
> #include 
> #endif
> 
> #if defined(__AVX__)
> #include 
> #endif
> 
> #else
> 
> #include 
> 
> #endif
> ...
> 
> So I think you can do just like that:
> 
> #include 
> #if defined(__AVX__)
> #define SCHED_VECTOR_ENABLE
> #endif
> 
> inside rte_sched.c
> 
> Konstantin
> 
> 
>> 
>> 
>>> ? 2015?11?275:01?Thomas Monjalon  ???
>>> 
>>> 2015-11-27 02:26, Qiu, Michael:
>>>>>> On 2015/11/27 5:29, Thomas Monjalon wrote:
>>>>>> 2015-11-26 18:49, Michael Qiu:
>>>>>> gcc 4.3.4 does not include "immintrin.h", and will post below error:
>>>>>>   lib/librte_sched/rte_sched.c:56:23: error:
>>>>>>   immintrin.h: No such file or directory
>>>>>> 
>>>>>> To avoid this issue, a gcc version check is need and a flag to indicate
>>>>>> vector ablility.
>>>>> [...]
>>>>>> +#if (defined(__ICC) || (__GNUC__ == 4 &&  __GNUC_MINOR__ < 4))
>>>>>> +
>>>>>> +#if defined(__AVX__)
>>>>>> #include 
>>>>>> +#define SCHED_VECTOR_ENABLE
>>>>>> +#endif
>>>>>> +
>>>>>> +#else
>>>>>> +
>>>>>> +#include 
>>>>>> +#define SCHED_VECTOR_ENABLE
>>>>>> +
>>>>>> +#endif
>>>>> This kind of complication is managed by EAL.
>>>>> I think we should include rte_vect.h.
>>>> 
>>>> As I know here it needs a flag to identify whether the platform support
>>>> AVX, if not it will not use it, so I don't know if we could only simply
>>>> include rte_vect.h?
>>> 
>>> It's not exclusive.
>>> You can include rte_vect.h and check AVX to define SCHED_VECTOR_ENABLE.
>>>

[dpdk-dev] [PATCH] lib/librte_sched: Fix compile with gcc 4.3.4

2015-11-27 Thread Qiu, Michael

really?I don't think so.

AVX Marco only exist in the gcc version below 4.4,  I still need to check if 
below or beyond 4.4 am I right?

Thanks,
Michael


> ? 2015?11?275:01?Thomas Monjalon  ???
> 
> 2015-11-27 02:26, Qiu, Michael:
>>> On 2015/11/27 5:29, Thomas Monjalon wrote:
>>> 2015-11-26 18:49, Michael Qiu:
>>>> gcc 4.3.4 does not include "immintrin.h", and will post below error:
>>>>lib/librte_sched/rte_sched.c:56:23: error:
>>>>immintrin.h: No such file or directory
>>>> 
>>>> To avoid this issue, a gcc version check is need and a flag to indicate
>>>> vector ablility.
>>> [...]
>>>> +#if (defined(__ICC) || (__GNUC__ == 4 &&  __GNUC_MINOR__ < 4))
>>>> +
>>>> +#if defined(__AVX__)
>>>> #include 
>>>> +#define SCHED_VECTOR_ENABLE
>>>> +#endif
>>>> +
>>>> +#else
>>>> +
>>>> +#include 
>>>> +#define SCHED_VECTOR_ENABLE
>>>> +
>>>> +#endif
>>> This kind of complication is managed by EAL.
>>> I think we should include rte_vect.h.
>> 
>> As I know here it needs a flag to identify whether the platform support
>> AVX, if not it will not use it, so I don't know if we could only simply
>> include rte_vect.h?
> 
> It's not exclusive.
> You can include rte_vect.h and check AVX to define SCHED_VECTOR_ENABLE.
>

[dpdk-dev] [PATCH] lib/librte_sched: Fix compile with gcc 4.3.4

2015-11-27 Thread Qiu, Michael

On 2015/11/27 5:29, Thomas Monjalon wrote:
> 2015-11-26 18:49, Michael Qiu:
>> gcc 4.3.4 does not include "immintrin.h", and will post below error:
>> lib/librte_sched/rte_sched.c:56:23: error:
>> immintrin.h: No such file or directory
>>
>> To avoid this issue, a gcc version check is need and a flag to indicate
>> vector ablility.
> [...]
>> +#if (defined(__ICC) || (__GNUC__ == 4 &&  __GNUC_MINOR__ < 4))
>> +
>> +#if defined(__AVX__)
>>  #include 
>> +#define SCHED_VECTOR_ENABLE
>> +#endif
>> +
>> +#else
>> +
>> +#include 
>> +#define SCHED_VECTOR_ENABLE
>> +
>> +#endif
> This kind of complication is managed by EAL.
> I think we should include rte_vect.h.

As I know here it needs a flag to identify whether the platform support
AVX, if not it will not use it, so I don't know if we could only simply
include rte_vect.h?

Thanks,
Michael
>
>

[dpdk-dev] [PATCH 2/2] Fix compile issue in i686 platform

2015-11-27 Thread Qiu, Michael

On 2015/11/27 5:15, Thomas Monjalon wrote:
> 2015-11-26 09:35, Michael Qiu:
>> In i686 platform, long is 32bit, so XXX_CYCLECOUNTER_MASK
>> need define as 'ULL'
>>
>> Signed-off-by: Michael Qiu 
> This patch is correct but the description is not exact:
> I have no issue with my i686 compiler.
> For future reference, please could you be more precise
> about the reproduction environment? Is it related to a specific compiler?

OK, I will be for careful about the compile and os next time.

> We also need to add these lines:
> Fixes: 9c857bf6be87 ("igb: support ieee1588 functions for device time")
> Fixes: 1c4445e1f28e ("ixgbe: support ieee1588 functions for device time")
> Fixes: f3a4e40eca0c ("i40e: support ieee1588 functions for device time")

So I will repost the patch set.

Thanks,
Michael

[dpdk-dev] [PATCH] fm10k: fix a crash bug when quit from testpmd

2015-11-17 Thread Qiu, Michael

On 2015/11/12 12:58, Chen Jing D(Mark) wrote:
> From: "Chen Jing D(Mark)" 
>
> When the fm10k port is closed, both func tx_queue_clean() and
> fm10k_tx_queue_release_mbufs_vec() will try to release buffer in
> SW ring. The latter func won't do sanity check on those pointers
> and cause crash.
>
> The fix include 2 parts.
> 1. Remove Vector TX buffer release func since it can share the
>release functions with regular TX.
> 2. Add log to print out what actual Rx/Tx func is used.
>
> Signed-off-by: Chen Jing D(Mark) 
> ---

Acked-by: Michael Qiu 
>  drivers/net/fm10k/fm10k.h  |1 -
>  drivers/net/fm10k/fm10k_ethdev.c   |   17 -
>  drivers/net/fm10k/fm10k_rxtx_vec.c |   28 
>  3 files changed, 12 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> index 754aa6a..38d5489 100644
> --- a/drivers/net/fm10k/fm10k.h
> +++ b/drivers/net/fm10k/fm10k.h
> @@ -237,7 +237,6 @@ struct fm10k_tx_queue {
>  };
>  
>  struct fm10k_txq_ops {
> - void (*release_mbufs)(struct fm10k_tx_queue *txq);
>   void (*reset)(struct fm10k_tx_queue *txq);
>  };
>  
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index cf7ada7..af7b0c2 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -386,7 +386,6 @@ fm10k_check_mq_mode(struct rte_eth_dev *dev)
>  }
>  
>  static const struct fm10k_txq_ops def_txq_ops = {
> - .release_mbufs = tx_queue_free,
>   .reset = tx_queue_reset,
>  };
>  
> @@ -1073,7 +1072,7 @@ fm10k_dev_queue_release(struct rte_eth_dev *dev)
>   for (i = 0; i < dev->data->nb_tx_queues; i++) {
>   struct fm10k_tx_queue *txq = dev->data->tx_queues[i];
>  
> - txq->ops->release_mbufs(txq);
> + tx_queue_free(txq);
>   }
>   }
>  
> @@ -1793,7 +1792,7 @@ fm10k_tx_queue_setup(struct rte_eth_dev *dev, uint16_t 
> queue_id,
>   if (dev->data->tx_queues[queue_id] != NULL) {
>   struct fm10k_tx_queue *txq = dev->data->tx_queues[queue_id];
>  
> - txq->ops->release_mbufs(txq);
> + tx_queue_free(txq);
>   dev->data->tx_queues[queue_id] = NULL;
>   }
>  
> @@ -1872,7 +1871,7 @@ fm10k_tx_queue_release(void *queue)
>   struct fm10k_tx_queue *q = queue;
>   PMD_INIT_FUNC_TRACE();
>  
> - q->ops->release_mbufs(q);
> + tx_queue_free(q);
>  }
>  
>  static int
> @@ -2439,13 +2438,16 @@ fm10k_set_tx_function(struct rte_eth_dev *dev)
>   }
>  
>   if (use_sse) {
> + PMD_INIT_LOG(ERR, "Use vector Tx func");
>   for (i = 0; i < dev->data->nb_tx_queues; i++) {
>   txq = dev->data->tx_queues[i];
>   fm10k_txq_vec_setup(txq);
>   }
>   dev->tx_pkt_burst = fm10k_xmit_pkts_vec;
> - } else
> + } else {
>   dev->tx_pkt_burst = fm10k_xmit_pkts;
> + PMD_INIT_LOG(ERR, "Use regular Tx func");
> + }
>  }
>  
>  static void __attribute__((cold))
> @@ -2469,6 +2471,11 @@ fm10k_set_rx_function(struct rte_eth_dev *dev)
>   (dev->rx_pkt_burst == fm10k_recv_scattered_pkts_vec ||
>   dev->rx_pkt_burst == fm10k_recv_pkts_vec);
>  
> + if (rx_using_sse)
> + PMD_INIT_LOG(ERR, "Use vector Rx func");
> + else
> + PMD_INIT_LOG(ERR, "Use regular Rx func");
> +
>   for (i = 0; i < dev->data->nb_rx_queues; i++) {
>   struct fm10k_rx_queue *rxq = dev->data->rx_queues[i];
>  
> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
> b/drivers/net/fm10k/fm10k_rxtx_vec.c
> index 06beca9..6042568 100644
> --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> @@ -45,8 +45,6 @@
>  #endif
>  
>  static void
> -fm10k_tx_queue_release_mbufs_vec(struct fm10k_tx_queue *txq);
> -static void
>  fm10k_reset_tx_queue(struct fm10k_tx_queue *txq);
>  
>  /* Handling the offload flags (olflags) field takes computation
> @@ -634,7 +632,6 @@ fm10k_recv_scattered_pkts_vec(void *rx_queue,
>  }
>  
>  static const struct fm10k_txq_ops vec_txq_ops = {
> - .release_mbufs = fm10k_tx_queue_release_mbufs_vec,
>   .reset = fm10k_reset_tx_queue,
>  };
>  
> @@ -795,31 +792,6 @@ fm10k_xmit_pkts_vec(void *tx_queue, struct rte_mbuf 
> **tx_pkts,
>  }
>  
>  static void __attribute__((cold))
> -fm10k_tx_queue_release_mbufs_vec(struct fm10k_tx_queue *txq)
> -{
> - unsigned i;
> - const uint16_t max_desc = (uint16_t)(txq->nb_desc - 1);
> -
> - if (txq->sw_ring == NULL || txq->nb_free == max_desc)
> - return;
> -
> - /* release the used mbufs in sw_ring */
> - for (i = txq->next_dd - (txq->rs_thresh - 1);
> -  i != txq->next_free;
> -  i = (i + 1) & max_desc)
> - rte_pktmbuf_free_seg(txq->sw_ring[i]);
> -
> - txq->nb_free = max_desc;
> -
> - /* reset tx_e

[dpdk-dev] [PATCH 4/4] fm10k: remove crc size from all byte counters

2015-11-17 Thread Qiu, Michael

Hi, Harry

Have you ever tested this patch by yourself?

fm10k's stats should already remove the crc bytes by default.

After your patch applied, if send a packet without vlan(64 bytes),
we expect receive 60 bytes, but it will disappoint you, that only
56 bytes shows in system.

Thanks,
Michael

On 2015/11/16 18:36, Harry van Haaren wrote:
> This patch removes the crc bytes from byte counter statistics.
>
> Signed-off-by: Harry van Haaren 
> ---
>  drivers/net/fm10k/fm10k_ethdev.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index 441f713..fdb2e81 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -1183,11 +1183,13 @@ fm10k_stats_get(struct rte_eth_dev *dev, struct 
> rte_eth_stats *stats)
>  
>   ipackets = opackets = ibytes = obytes = 0;
>   for (i = 0; (i < RTE_ETHDEV_QUEUE_STAT_CNTRS) &&
> - (i < hw->mac.max_queues); ++i) {
> + (i < hw->mac.max_queues); ++i) {
>   stats->q_ipackets[i] = hw_stats->q[i].rx_packets.count;
>   stats->q_opackets[i] = hw_stats->q[i].tx_packets.count;
> - stats->q_ibytes[i]   = hw_stats->q[i].rx_bytes.count;
> - stats->q_obytes[i]   = hw_stats->q[i].tx_bytes.count;
> + stats->q_ibytes[i]   = hw_stats->q[i].rx_bytes.count -
> + (stats->q_ipackets[i] * 4);
> + stats->q_obytes[i]   = hw_stats->q[i].tx_bytes.count -
> + (stats->q_opackets[i] * 4);
>   ipackets += stats->q_ipackets[i];
>   opackets += stats->q_opackets[i];
>   ibytes   += stats->q_ibytes[i];

[dpdk-dev] [PATCH] PPC64: turn off fm10k driver compilation on IBM POWER

2015-11-04 Thread Qiu, Michael

On 2015/11/4 14:14, Chao Zhu wrote:
> The fm10k vector driver is specific for x86 platform which can't compile
> on IBM POWER for lacking of tmmintrin.h header file. This patch turns
> off fm10k driver compilation on IBM POWER to prevent compile issue.
>
> Signed-off-by: Chao Zhu 

Acked-by: Michael Qiu 
> ---
>  config/defconfig_ppc_64-power8-linuxapp-gcc |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc 
> b/config/defconfig_ppc_64-power8-linuxapp-gcc
> index f1af518..03760c4 100644
> --- a/config/defconfig_ppc_64-power8-linuxapp-gcc
> +++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
> @@ -50,6 +50,7 @@ CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
>  CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
>  CONFIG_RTE_LIBRTE_PMD_BOND=n
>  CONFIG_RTE_LIBRTE_ENIC_PMD=n
> +CONFIG_RTE_LIBRTE_FM10K_PMD=n
>  
>  # This following libraries are not available on Power. So they're turned off.
>  CONFIG_RTE_LIBRTE_LPM=n

[dpdk-dev] [PATCH v7 00/28] remove pci driver from vdevs

2015-11-02 Thread Qiu, Michael

Hi, Bernard

Could we merge some patch together? I see lots of patches are simple and
doing the same thing but in different NIC, merge them almost have no
affect of review, what's more it will make reviewers more comfortable
with less and simple patches.

Then we could have a clean patch set with almost 5 patches, not totally 28.

librte_eal: add RTE_KDRV_NONE for vdevs
librte_ether: add fields from rte_pci_driver to rte_eth_dev_data
drivers: copy pci device info to eth_dev data
librte_ether: remove branches on pci_dev
drivers: remove pci device

At least, we could merge serial patches ":copy pci device info to
eth_dev data" to "drivers: copy pci device info to eth_dev data" in my mind.

Just an advise, but better to have.

Thanks,
Michael
On 2015/10/30 23:09, Bernard Iremonger wrote:
> There is a dummy pci driver in the vdev PMD's at present.
> This patch set removes the pci driver from the vdev PMD's.
> Changes have been made to librte_ether to handle vdevs and pdevs in the same 
> way.
>
> The following vdev PMD's have had the pci driver removed:
>
> null
> ring
> bonding
> pcap
> af_packet
> xenvirt
> mpipe
>
> All the pdev PMD's have been modified to copy the pci device info into 
> eth_dev data.
>
> Changes in v7:
> rebase to latest code.
>
> Changes in v6:
> Initialise data->drv_name with the PMD driver name in the vdevs.
> Remove two more branches on pci_dev from the bonding vdev.
>
> Changes in v5:
> rebase to latest code.
> refactor patches to avoid potential problems with git bisect.
>
> Changes in v4:
> rebase to latest code.
> add doxygen comments to rte_ethdev.h
> update release notes in patch 0002.
>
> Changes in v3:
> rebase to latest code.
> restructure patches 0002 and 0003 to fix compile issue in patch 0002.
>
> Changes in V2:
> rebase to latest code.
> fix compile error in rte_ethdev.c when debug disabled.
> remove Intel copyright from bnx2x, cxgbe, enic, mlx4, mpipe and null PMD's.
>
> Bernard Iremonger (28):
>   librte_eal: add RTE_KDRV_NONE for vdevs
>   librte_ether: add fields from rte_pci_driver to rte_eth_dev_data
>   librte_ether: add function rte_eth_copy_dev_info
>   ixgbe: copy pci device info to eth_dev data
>   e1000: copy pci device info to eth_dev data
>   i40e: copy pci device info to eth_dev data
>   fm10k: copy pci device info to eth_dev data
>   bnx2x: copy pci device info to eth_dev data
>   cxgbe: copy pci device info to eth_dev data
>   enic: copy pci device info to eth_dev data
>   mlx4: copy pci device info to eth_dev data
>   virtio: copy pci device info to eth_dev data
>   vmxnet3: copy pci device info to eth_dev data
>   null: copy device info to eth_dev data
>   ring: copy device info to eth_dev data
>   pcap: copy device info to eth_dev data
>   af_packet: copy device info to eth_dev data
>   xenvirt: copy device info to eth_dev data
>   mpipe: copy device info to eth_dev data
>   bonding: copy device info to eth_dev data
>   librte_ether: remove branches on pci_dev
>   null: remove pci device
>   ring: remove pci device
>   pcap: remove pci device
>   af_packet: remove pci device
>   xenvirt: remove pci device
>   mpipe: remove pci device
>   bonding: remove pci device
>
>  doc/guides/rel_notes/release_2_2.rst   |  4 ++
>  drivers/net/af_packet/rte_eth_af_packet.c  | 20 --
>  drivers/net/bnx2x/bnx2x_ethdev.c   |  3 ++
>  drivers/net/bonding/rte_eth_bond_8023ad.c  |  4 +-
>  drivers/net/bonding/rte_eth_bond_alb.c |  2 +-
>  drivers/net/bonding/rte_eth_bond_api.c | 60 
> +-
>  drivers/net/bonding/rte_eth_bond_pmd.c | 18 -
>  drivers/net/bonding/rte_eth_bond_private.h |  2 +-
>  drivers/net/cxgbe/cxgbe_ethdev.c   |  3 ++
>  drivers/net/e1000/em_ethdev.c  |  3 ++
>  drivers/net/e1000/igb_ethdev.c |  5 +++
>  drivers/net/enic/enic_ethdev.c |  1 +
>  drivers/net/fm10k/fm10k_ethdev.c   |  2 +
>  drivers/net/i40e/i40e_ethdev.c |  3 ++
>  drivers/net/i40e/i40e_ethdev_vf.c  |  2 +
>  drivers/net/ixgbe/ixgbe_ethdev.c   |  4 ++
>  drivers/net/mlx4/mlx4.c|  3 ++
>  drivers/net/mpipe/mpipe_tilegx.c   | 10 +++--
>  drivers/net/null/rte_eth_null.c| 29 ---
>  drivers/net/pcap/rte_eth_pcap.c| 31 +--
>  drivers/net/ring/rte_eth_ring.c| 37 --
>  drivers/net/virtio/virtio_ethdev.c |  3 ++
>  drivers/net/vmxnet3/vmxnet3_ethdev.c   |  2 +
>  drivers/net/xenvirt/rte_eth_xenvirt.c  | 18 -
>  lib/librte_eal/common/include/rte_pci.h|  3 +-
>  lib/librte_ether/rte_ethdev.c  | 54 +++
>  lib/librte_ether/rte_ethdev.h  | 29 +++
>  lib/librte_ether/rte_ether_version.map |  7 
>  28 files changed, 185 insertions(+), 177 deletions(-)
>

[dpdk-dev] [PATCH v6 00/28] remove pci driver from vdevs

2015-10-30 Thread Qiu, Michael

On 2015/10/30 2:37, Bernard Iremonger wrote:
> There is a dummy pci driver in the vdev PMD's at present.
> This patch set removes the pci driver from the vdev PMD's.
> Changes have been made to librte_ether to handle vdevs and pdevs in the same 
> way.
>
> The following vdev PMD's have had the pci driver removed:
>
> null
> ring
> bonding
> pcap
> af_packet
> xenvirt
> mpipe
>
> All the pdev PMD's have been modified to copy the pci device info into 
> eth_dev data.
>
> Changes in v6:
> Initialise data->drv_name with the PMD driver name in the vdevs.
> Remove two more branches on pci_dev from the bonding vdev.
>
> Changes in v5:
> rebase to latest code.
> refactor patches to avoid potential problems with git bisect.
>
> Changes in v4:
> rebase to latest code.
> add doxygen comments to rte_ethdev.h
> update release notes in patch 0002.
>
> Changes in v3:
> rebase to latest code.
> restructure patches 0002 and 0003 to fix compile issue in patch 0002.
>
> Changes in V2:
> rebase to latest code.
> fix compile error in rte_ethdev.c when debug disabled.
> remove Intel copyright from bnx2x, cxgbe, enic, mlx4, mpipe and null PMD's.
>
> Bernard Iremonger (28):
>   librte_eal: add RTE_KDRV_NONE for vdevs
>   librte_ether: add fields from rte_pci_driver to rte_eth_dev_data
>   librte_ether: add function rte_eth_copy_dev_info
>   ixgbe: copy pci device info to eth_dev data
>   e1000: copy pci device info to eth_dev data
>   i40e: copy pci device info to eth_dev data
>   fm10k: copy pci device info to eth_dev data
>   bnx2x: copy pci device info to eth_dev data
>   cxgbe: copy pci device info to eth_dev data
>   enic: copy pci device info to eth_dev data
>   mlx4: copy pci device info to eth_dev data
>   virtio: copy pci device info to eth_dev data
>   vmxnet3: copy pci device info to eth_dev data
>   null: copy device info to eth_dev data
>   ring: copy device info to eth_dev data
>   pcap: copy device info to eth_dev data
>   af_packet: copy device info to eth_dev data
>   xenvirt: copy device info to eth_dev data
>   mpipe: copy device info to eth_dev data
>   bonding: copy device info to eth_dev data
>   librte_ether: remove branches on pci_dev
>   null: remove pci device
>   ring: remove pci device
>   pcap: remove pci device
>   af_packet: remove pci device
>   xenvirt: remove pci device
>   mpipe: remove pci device
>   bonding: remove pci device
>
>  doc/guides/rel_notes/release_2_2.rst   |  3 ++
>  drivers/net/af_packet/rte_eth_af_packet.c  | 20 --
>  drivers/net/bnx2x/bnx2x_ethdev.c   |  3 ++
>  drivers/net/bonding/rte_eth_bond_8023ad.c  |  4 +-
>  drivers/net/bonding/rte_eth_bond_alb.c |  2 +-
>  drivers/net/bonding/rte_eth_bond_api.c | 60 
> +-
>  drivers/net/bonding/rte_eth_bond_pmd.c | 18 -
>  drivers/net/bonding/rte_eth_bond_private.h |  2 +-
>  drivers/net/cxgbe/cxgbe_ethdev.c   |  3 ++
>  drivers/net/e1000/em_ethdev.c  |  3 ++
>  drivers/net/e1000/igb_ethdev.c |  5 +++
>  drivers/net/enic/enic_ethdev.c |  1 +
>  drivers/net/fm10k/fm10k_ethdev.c   |  2 +
>  drivers/net/i40e/i40e_ethdev.c |  3 ++
>  drivers/net/i40e/i40e_ethdev_vf.c  |  2 +
>  drivers/net/ixgbe/ixgbe_ethdev.c   |  4 ++
>  drivers/net/mlx4/mlx4.c|  3 ++
>  drivers/net/mpipe/mpipe_tilegx.c   | 10 +++--
>  drivers/net/null/rte_eth_null.c| 29 ---
>  drivers/net/pcap/rte_eth_pcap.c| 31 +--
>  drivers/net/ring/rte_eth_ring.c| 37 --
>  drivers/net/virtio/virtio_ethdev.c |  3 ++
>  drivers/net/vmxnet3/vmxnet3_ethdev.c   |  2 +
>  drivers/net/xenvirt/rte_eth_xenvirt.c  | 18 -
>  lib/librte_eal/common/include/rte_pci.h|  3 +-
>  lib/librte_ether/rte_ethdev.c  | 54 +++
>  lib/librte_ether/rte_ethdev.h  | 29 +++
>  lib/librte_ether/rte_ether_version.map |  7 
>  28 files changed, 184 insertions(+), 177 deletions(-)
>

Acked-by: Michael Qiu

[dpdk-dev] [PATCH v3 0/4] fm10k: add VMDQ support

2015-10-30 Thread Qiu, Michael

On 2015/10/27 17:25, He, Shaopeng wrote:
> This patch series adds VMDQ support for fm10k.
> It includes the functions to configure VMDQ mode and
> add MAC address for each VMDQ queue pool.
> It also includes logic to do sanity check for
> multi-queue settings.
>
> Changes in v3:
> - Keep device default MAC address even in VMDQ mode after
>   queue pool config was changed, because some applications
>   (e.g. vmdq_app) always need a valid MAC address there.
>
> Changes in v2:
> - Reword some comments and commit messages
> - Updated release note
>
> Shaopeng He (4):
>   fm10k: add multi-queue checking
>   fm10k: add VMDQ support in MAC/VLAN filter
>   fm10k: add VMDQ support in multi-queue configure
>   doc: update release note for fm10k VMDQ support
>
>  doc/guides/rel_notes/release_2_2.rst |   5 +
>  drivers/net/fm10k/fm10k.h|   3 +
>  drivers/net/fm10k/fm10k_ethdev.c | 358 
> +++
>  3 files changed, 289 insertions(+), 77 deletions(-)
>

Acked-by: Michael Qiu

[dpdk-dev] DPDK patch backlog

2015-10-22 Thread Qiu, Michael

On 2015/10/21 17:05, Thomas Monjalon wrote:
> 2015-10-21 11:48, Panu Matilainen:
>> On 10/21/2015 11:25 AM, Thomas Monjalon wrote:
>>> 2015-10-20 21:34, Stephen Hemminger:
 Patch backlog is not getting better, now at 486.

 How can we break this logjam?
 Do I need to make a new "ready for merge" tree?
>>> What would mean "ready for merge"?
>>> A lot of patches are acked but do not compile or doc is missing.
>> Well, isn't that one quite reasonable definition of being "ready"?
>> - patch must be acked
>> - patch must apply and compile (when relevant)
>> - is appropriately documented (commit message style and all)
> Yes.
> Compilation must be tested with GCC and clang, as static and shared libraries
> and for 32-bit and 64-bit targets.
> Documented means good commit message and doc or release notes updated.

What about bug fix patches?

Thanks,
Michael
>

[dpdk-dev] DPDK patch backlog

2015-10-22 Thread Qiu, Michael

On 2015/10/16 22:25, Neil Horman wrote:
> On Fri, Oct 16, 2015 at 10:45:23AM +0200, Thomas Monjalon wrote:
>> 2015-10-15 14:44, Stephen Hemminger:
>>> There are currently 428 patches in New state in DPDK patchwork.
>>>
>>> Thomas, could you start reducing that backlog?
>> Yes
>>
>>> The simplest solution would be to merge some of the big patch series
>>> from Intel for the base drivers, then reviewers can focus on the other
>>> patches.
>> That's why having a drivers/net subtree would be useful.
>>
> Agreed, a dpdk-next tree would really be the solution here.

Can't agree more :)

Thanks,
Michael
> Neil
>
>

[dpdk-dev] [PATCH 1/3] fm10k: add multi-queue checking

2015-10-22 Thread Qiu, Michael

On 2015/10/15 19:07, He, Shaopeng wrote:
> Hi, Michael
>
>> -Original Message-----
>> From: Qiu, Michael
>> Sent: Thursday, October 15, 2015 2:28 PM
>> To: He, Shaopeng; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH 1/3] fm10k: add multi-queue checking
>>
>> On 2015/9/30 15:29, Shaopeng He wrote:
>>> Add multi-queue checking in device configure process.
>>> Currently, VMDQ and RSS are supported.
>>>
>>> Signed-off-by: Shaopeng He 
>>> ---
>>>  drivers/net/fm10k/fm10k_ethdev.c | 44
>>> 
>>>  1 file changed, 44 insertions(+)
>>>
>>> diff --git a/drivers/net/fm10k/fm10k_ethdev.c
>>> b/drivers/net/fm10k/fm10k_ethdev.c
>>> index a69c990..082937d 100644
>>> --- a/drivers/net/fm10k/fm10k_ethdev.c
>>> +++ b/drivers/net/fm10k/fm10k_ethdev.c
>>> @@ -283,12 +283,56 @@ tx_queue_disable(struct fm10k_hw *hw,
>> uint16_t
>>> qnum)  }
>>>
>>>  static int
>>> +fm10k_check_mq_mode(struct rte_eth_dev *dev) {
>>> +   enum rte_eth_rx_mq_mode rx_mq_mode = dev->data-
>>> dev_conf.rxmode.mq_mode;
>>> +   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data-
>>> dev_private);
>>> +   struct rte_eth_vmdq_rx_conf *vmdq_conf;
>>> +   uint16_t nb_rx_q = dev->data->nb_rx_queues;
>>> +
>>> +   vmdq_conf = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
>>> +
>>> +   if (rx_mq_mode & ETH_MQ_RX_DCB_FLAG) {
>>> +   PMD_INIT_LOG(ERR, "DCB mode is not supported.");
>>> +   return -EINVAL;
>>> +   }
>>> +
>>> +   if (!(rx_mq_mode & ETH_MQ_RX_VMDQ_FLAG))
>>> +   return 0;
>>> +
>>> +   if (hw->mac.type == fm10k_mac_vf) {
>>> +   PMD_INIT_LOG(ERR, "VMDQ mode is not supported in VF.");
>>> +   return -EINVAL;
>>> +   }
>> I think vf check should be the first one, then we do not need check dcb and
>> VMDq flag.
>>
>> Thanks,
>> Michael
> Thanks for the comments. There is a case of RSS support on VF, if vf check be 
> the first one, it will return fail, which is not correct.

OK, you are right.

Thanks,
Michael
> Thanks,
> --Shaopeng
>>> +
>>> +   /* Check VMDQ queue pool number */
>>> +   if (vmdq_conf->nb_queue_pools >
>>> +   sizeof(vmdq_conf->pool_map[0].pools) * CHAR_BIT
>> ||
>>> +   vmdq_conf->nb_queue_pools > nb_rx_q) {
>>> +   PMD_INIT_LOG(ERR, "Too many of queue pools: %d",
>>> +   vmdq_conf->nb_queue_pools);
>>> +   return -EINVAL;
>>> +   }
>>> +
>>> +   return 0;
>>> +}
>>> +
>>> +static int
>>>  fm10k_dev_configure(struct rte_eth_dev *dev)  {
>>> +   int ret;
>>> +
>>> PMD_INIT_FUNC_TRACE();
>>>
>>> if (dev->data->dev_conf.rxmode.hw_strip_crc == 0)
>>> PMD_INIT_LOG(WARNING, "fm10k always strip CRC");
>>> +   /* multipe queue mode checking */
>>> +   ret  = fm10k_check_mq_mode(dev);
>>> +   if (ret != 0) {
>>> +   PMD_DRV_LOG(ERR, "fm10k_check_mq_mode fails
>> with %d.",
>>> +   ret);
>>> +   return ret;
>>> +   }
>>>
>>> return 0;
>>>  }
>

[dpdk-dev] [PATCH v3 02/20] librte_ether: add fields from rte_pci_driver to rte_eth_dev_data

2015-10-20 Thread Qiu, Michael

On 2015/10/13 0:26, Bernard Iremonger wrote:
> add dev_flags to rte_eth_dev_data, add macros for dev_flags.
> add kdrv to rte_eth_dev_data.
> add numa_node to rte_eth_dev_data.
> add drv_name to rte_eth_dev_data.
> use dev_type to distinguish between vdev's and pdev's.
> remove pci_dev branches.
>
> Signed-off-by: Bernard Iremonger 

I have a question, if we only apply the patch set till here, does DPDK
work fine?

Thanks,
Michael
> ---
>  lib/librte_ether/rte_ethdev.c | 40 +---
>  lib/librte_ether/rte_ethdev.h | 15 +++
>  2 files changed, 32 insertions(+), 23 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index f593f6e..4187595 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -424,7 +424,7 @@ rte_eth_dev_socket_id(uint8_t port_id)
>  {
>   if (!rte_eth_dev_is_valid_port(port_id))
>   return -1;
> - return rte_eth_devices[port_id].pci_dev->numa_node;
> + return rte_eth_devices[port_id].data->numa_node;
>  }
>  
>  uint8_t
> @@ -503,27 +503,25 @@ rte_eth_dev_get_name_by_port(uint8_t port_id, char 
> *name)
>  static int
>  rte_eth_dev_is_detachable(uint8_t port_id)
>  {
> - uint32_t drv_flags;
> + uint32_t dev_flags;
>  
>   if (!rte_eth_dev_is_valid_port(port_id)) {
>   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>   return -EINVAL;
>   }
>  
> - if (rte_eth_devices[port_id].dev_type == RTE_ETH_DEV_PCI) {
> - switch (rte_eth_devices[port_id].pci_dev->kdrv) {
> - case RTE_KDRV_IGB_UIO:
> - case RTE_KDRV_UIO_GENERIC:
> - case RTE_KDRV_NIC_UIO:
> - break;
> - case RTE_KDRV_VFIO:
> - default:
> - return -ENOTSUP;
> - }
> + switch (rte_eth_devices[port_id].data->kdrv) {
> + case RTE_KDRV_IGB_UIO:
> + case RTE_KDRV_UIO_GENERIC:
> + case RTE_KDRV_NIC_UIO:
> + case RTE_KDRV_NONE:
> + break;
> + case RTE_KDRV_VFIO:
> + default:
> + return -ENOTSUP;
>   }
> -
> - drv_flags = rte_eth_devices[port_id].driver->pci_drv.drv_flags;
> - return !(drv_flags & RTE_PCI_DRV_DETACHABLE);
> + dev_flags = rte_eth_devices[port_id].data->dev_flags;
> + return !(dev_flags & RTE_ETH_DEV_DETACHABLE);
>  }
>  
>  /* attach the new physical device, then store port_id of the device */
> @@ -1143,14 +1141,11 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t 
> nb_rx_q, uint16_t nb_tx_q,
>* If link state interrupt is enabled, check that the
>* device supports it.
>*/
> - if (dev_conf->intr_conf.lsc == 1) {
> - const struct rte_pci_driver *pci_drv = &dev->driver->pci_drv;
> -
> - if (!(pci_drv->drv_flags & RTE_PCI_DRV_INTR_LSC)) {
> + if ((dev_conf->intr_conf.lsc == 1) &&
> + (!(dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC))) {
>   PMD_DEBUG_TRACE("driver %s does not support lsc\n",
> - pci_drv->name);
> + dev->data->drv_name);
>   return -EINVAL;
> - }
>   }
>  
>   /*
> @@ -1795,8 +1790,7 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
> rte_eth_dev_info *dev_info)
>   FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get);
>   (*dev->dev_ops->dev_infos_get)(dev, dev_info);
>   dev_info->pci_dev = dev->pci_dev;
> - if (dev->driver)
> - dev_info->driver_name = dev->driver->pci_drv.name;
> + dev_info->driver_name = dev->data->drv_name;
>  }
>  
>  void
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 8a8c82b..d440bd6 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1471,8 +1471,23 @@ struct rte_eth_dev_data {
>   all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
>   dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). 
> */
>   lro : 1;   /**< RX LRO is ON(1) / OFF(0) */
> + uint32_t dev_flags; /**< Flags controlling handling of device. */
> + enum rte_kernel_driver kdrv;/**< Kernel driver passthrough */
> + int numa_node;
> + const char *drv_name;
>  };
>  
> +/** Device needs PCI BAR mapping (done with either IGB_UIO or VFIO) */
> +#define RTE_ETH_DEV_DRV_NEED_MAPPING RTE_PCI_DRV_NEED_MAPPING
> +/** Device needs to be unbound even if no module is provided */
> +#define RTE_ETH_DEV_DRV_FORCE_UNBIND RTE_PCI_DRV_FORCE_UNBIND
> +/** Device supports link state interrupt */
> +#define RTE_ETH_DEV_INTR_LSC RTE_PCI_DRV_INTR_LSC
> +/** Device  supports detaching capability */
> +#define RTE_ETH_DEV_DETACHABLE   RTE_PCI_DRV_DETACHABLE
> +/** Device  is a bonded device */
> +#define RTE_ETH_DEV_BONDED   0x0020
> +
>  /**
>   * @internal
>   * The pool of *rte_eth_dev*

[dpdk-dev] [PATCH v3 02/20] librte_ether: add fields from rte_pci_driver to rte_eth_dev_data

2015-10-20 Thread Qiu, Michael

On 2015/10/13 0:26, Bernard Iremonger wrote:
> add dev_flags to rte_eth_dev_data, add macros for dev_flags.
> add kdrv to rte_eth_dev_data.
> add numa_node to rte_eth_dev_data.
> add drv_name to rte_eth_dev_data.
> use dev_type to distinguish between vdev's and pdev's.
> remove pci_dev branches.
>
> Signed-off-by: Bernard Iremonger 
> ---

[../..]

>  /* attach the new physical device, then store port_id of the device */
> @@ -1143,14 +1141,11 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t 
> nb_rx_q, uint16_t nb_tx_q,
>* If link state interrupt is enabled, check that the
>* device supports it.
>*/
> - if (dev_conf->intr_conf.lsc == 1) {
> - const struct rte_pci_driver *pci_drv = &dev->driver->pci_drv;
> -
> - if (!(pci_drv->drv_flags & RTE_PCI_DRV_INTR_LSC)) {
> + if ((dev_conf->intr_conf.lsc == 1) &&
> + (!(dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC))) {
>   PMD_DEBUG_TRACE("driver %s does not support lsc\n",
> - pci_drv->name);
> + dev->data->drv_name);
>   return -EINVAL;
> - }
>   }
>  
>   /*
> @@ -1795,8 +1790,7 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
> rte_eth_dev_info *dev_info)
>   FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get);
>   (*dev->dev_ops->dev_infos_get)(dev, dev_info);
>   dev_info->pci_dev = dev->pci_dev;

Here also pci_dev, I think after you remove pci_dev from vdevs, and this
field could be remove I think, as I don't see any use of this field in
dev_info, it should be more general and not only PCI.

Thanks,
Michael

> - if (dev->driver)
> - dev_info->driver_name = dev->driver->pci_drv.name;
> + dev_info->driver_name = dev->data->drv_name;
>  }
>  
>

[dpdk-dev] [dpdk-dev, PATCHv5, 1/8] ethdev: add new API to retrieve RX/TX queue information

2015-10-20 Thread Qiu, Michael

On 2015/10/20 16:09, Vincent JARDIN wrote:
> On 20/10/2015 09:53, Qiu, Michael wrote:
>> But as I know it is different all the time, am I right?
>> If yes, I don't know what's the value of this field.
> It can be used to get some snapshot/instant view informations while we 
> have to monitor and debug.

So this field is mainly for debug?

Thanks,
Michael

[dpdk-dev] [dpdk-dev, PATCHv5, 1/8] ethdev: add new API to retrieve RX/TX queue information

2015-10-20 Thread Qiu, Michael

On 2015/10/14 19:50, Ananyev, Konstantin wrote:
> Hi Amine,
>
>> -Original Message-
>> From: Amine Kherbouche [mailto:amine.kherbouche at 6wind.com]
>> Sent: Wednesday, October 14, 2015 12:40 PM
>> To: Ananyev, Konstantin; dev at dpdk.org
>> Subject: Re: [dpdk-dev, PATCHv5, 1/8] ethdev: add new API to retrieve RX/TX 
>> queue information
>>
>>
>>
>> Hi Konstantin
>>> +/**
>>> + * Ethernet device RX queue information structure.
>>> + * Used to retieve information about configured queue.
>>> + */
>>> +struct rte_eth_rxq_info {
>>> +   struct rte_mempool *mp; /**< mempool used by that queue. */
>>> +   struct rte_eth_rxconf conf; /**< queue config parameters. */
>>> +   uint8_t scattered_rx;   /**< scattered packets RX supported. */
>>> +   uint16_t nb_desc;   /**< configured number of RXDs. */
>> Here i need two more fields in this struct :
>>  uint16_t free_desc : for free queue descriptors
>>  uint16_t used_desc : for used queue descriptors

But as I know it is different all the time, am I right?
If yes, I don't know what's the value of this field.

Thanks,
Michael


>>> +} __rte_cache_aligned;
>

[dpdk-dev] [PATCH 1/3] fm10k: add multi-queue checking

2015-10-15 Thread Qiu, Michael

On 2015/9/30 15:29, Shaopeng He wrote:
> Add multi-queue checking in device configure process.
> Currently, VMDQ and RSS are supported.
>
> Signed-off-by: Shaopeng He 
> ---
>  drivers/net/fm10k/fm10k_ethdev.c | 44 
> 
>  1 file changed, 44 insertions(+)
>
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index a69c990..082937d 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -283,12 +283,56 @@ tx_queue_disable(struct fm10k_hw *hw, uint16_t qnum)
>  }
>  
>  static int
> +fm10k_check_mq_mode(struct rte_eth_dev *dev)
> +{
> + enum rte_eth_rx_mq_mode rx_mq_mode = dev->data->dev_conf.rxmode.mq_mode;
> + struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + struct rte_eth_vmdq_rx_conf *vmdq_conf;
> + uint16_t nb_rx_q = dev->data->nb_rx_queues;
> +
> + vmdq_conf = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
> +
> + if (rx_mq_mode & ETH_MQ_RX_DCB_FLAG) {
> + PMD_INIT_LOG(ERR, "DCB mode is not supported.");
> + return -EINVAL;
> + }
> +
> + if (!(rx_mq_mode & ETH_MQ_RX_VMDQ_FLAG))
> + return 0;
> +
> + if (hw->mac.type == fm10k_mac_vf) {
> + PMD_INIT_LOG(ERR, "VMDQ mode is not supported in VF.");
> + return -EINVAL;
> + }

I think vf check should be the first one, then we do not need check dcb
and VMDq flag.

Thanks,
Michael
> +
> + /* Check VMDQ queue pool number */
> + if (vmdq_conf->nb_queue_pools >
> + sizeof(vmdq_conf->pool_map[0].pools) * CHAR_BIT ||
> + vmdq_conf->nb_queue_pools > nb_rx_q) {
> + PMD_INIT_LOG(ERR, "Too many of queue pools: %d",
> + vmdq_conf->nb_queue_pools);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static int
>  fm10k_dev_configure(struct rte_eth_dev *dev)
>  {
> + int ret;
> +
>   PMD_INIT_FUNC_TRACE();
>  
>   if (dev->data->dev_conf.rxmode.hw_strip_crc == 0)
>   PMD_INIT_LOG(WARNING, "fm10k always strip CRC");
> + /* multipe queue mode checking */
> + ret  = fm10k_check_mq_mode(dev);
> + if (ret != 0) {
> + PMD_DRV_LOG(ERR, "fm10k_check_mq_mode fails with %d.",
> + ret);
> + return ret;
> + }
>  
>   return 0;
>  }

[dpdk-dev] [PATCH v2 1/2] fm10k: enable TSO support

2015-10-13 Thread Qiu, Michael

On 2015/10/12 14:38, Wang Xiao W wrote:
> This patch enables fm10k TSO feature for both non-tunneling packet
> and tunneling packet.
>
> Signed-off-by: Wang Xiao W 
> ---

Acked-by: Michael Qiu

[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding

2015-10-13 Thread Qiu, Michael

Hi, Thomas

Any comments on this patch? Is it suitable for DPDK?

Thanks,
Michael
On 2015/8/26 14:12, Liu, Jijiang wrote:
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
>> Sent: Friday, August 07, 2015 11:29 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>
>> For some ethnet-switch like intel RRC, all the packet forwarded out by DPDK
>> will be dropped in switch side, so the packet generator will never receive 
>> the
>> packet.
>>
>> Signed-off-by: Michael Qiu 
>> ---
>>  app/test-pmd/csumonly.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
>> 1bf3485..bf8af1d 100644
>> --- a/app/test-pmd/csumonly.c
>> +++ b/app/test-pmd/csumonly.c
>> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream
>> *fs)
>>   * and inner headers */
>>
>>  eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>> +ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>> +ð_hdr->d_addr);
>> +ether_addr_copy(&ports[fs->tx_port].eth_addr,
>> +ð_hdr->s_addr);
>>  parse_ethernet(eth_hdr, &info);
>>  l3_hdr = (char *)eth_hdr + info.l2_len;
>>
>> --
>> 1.9.3
> The change will affect on the csum fwd performance.
> But I also think the change is necessary, or we cannot use csumonly fwd mode 
> in guest?
>
> Acked-by: Jijiang Liu 
>
>

[dpdk-dev] [PATCH] librte_eal: Fix wrong header file for old gcc version

2015-10-13 Thread Qiu, Michael

Hi, all

Any comments on this?

Thanks,
Michael
On 2015/9/25 10:56, Qiu, Michael wrote:
> On 2015/9/7 22:46, Thomas Monjalon wrote:
>> 2015-08-24 17:22, Michael Qiu:
>>> For __SSE3__, the corresponding header file should be pmmintrin.h,
>>> tmmintrin.h works for __SSSE3__.
>> Please could you better explain the difference and what is exactly the bug
>> being fixed?
> It should solve this issue:
>
> [dpdk-dev] DPDK 2.1.0 build error: inlining failed in call to always_inline
>
> /usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/tmmintrin.h:185:1: error: 
> inlining failed in call to always_inline ?_mm_alignr_epi8?: t
> arget specific option mismatch
>  _mm_alignr_epi8(__m128i __X, __m128i __Y, const int __N)
>
>  ^
> The AMD cpu flags:
>
> flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
> pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxe
> xt fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl 
> nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lah
> f_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch 
> osvw ibs skinit wdt cpb hw_pstate npt lbrv svm_lock nrip_sa
>
>
> "_mm_alignr_epi8" only works for ssse3 or upper,
> but this AMD CPU does not support that. This function has been wrongly 
> called, because the wrong header file.
>
> Thanks,
> Michael 
>
>
>> Thanks
>>
>>
>

[dpdk-dev] [PATCH] librte_eal: Fix wrong header file for old gcc version

2015-09-25 Thread Qiu, Michael

On 2015/9/7 22:46, Thomas Monjalon wrote:
> 2015-08-24 17:22, Michael Qiu:
>> For __SSE3__, the corresponding header file should be pmmintrin.h,
>> tmmintrin.h works for __SSSE3__.
> Please could you better explain the difference and what is exactly the bug
> being fixed?
It should solve this issue:

[dpdk-dev] DPDK 2.1.0 build error: inlining failed in call to always_inline

/usr/lib/gcc/x86_64-redhat-linux/4.9.2/include/tmmintrin.h:185:1: error: 
inlining failed in call to always_inline ?_mm_alignr_epi8?: t
arget specific option mismatch
 _mm_alignr_epi8(__m128i __X, __m128i __Y, const int __N)

 ^
The AMD cpu flags:

flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxe
xt fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl 
nonstop_tsc extd_apicid aperfmperf pni monitor cx16 popcnt lah
f_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw 
ibs skinit wdt cpb hw_pstate npt lbrv svm_lock nrip_sa


"_mm_alignr_epi8" only works for ssse3 or upper,
but this AMD CPU does not support that. This function has been wrongly called, 
because the wrong header file.

Thanks,
Michael 


> Thanks
>
>

[dpdk-dev] [PATCH v2] ethdev: add new RX/TX queue state arrays in rte_eth_dev_data

2015-09-23 Thread Qiu, Michael

On 2015/9/22 6:40, Stephen Hemminger wrote:
> On Wed, 16 Sep 2015 22:51:24 +0100
> Pablo de Lara  wrote:
>
>> This is important to avoid trying to start/stop twice a queue,
>> which will result in undefined behaviour
>> (which may cause RX/TX disruption).
>>
>> Mind that only the PMDs which have queue_start/stop functions
>> have been changed to update this field, as the functions will
>> check the queue state before switching it.
>>
>> Signed-off-by: Pablo de Lara 
> I agree that the DPDK API should check for buggy manipulation
> in the control path. But this should be done in generic code.
> Anything where you have to change any driver is making more work
> than necessary.

I agree with you, but I have a question, why we need expose the queue
start and stop function to app?

In my opinion, user app will hardly to start a device but stop the
device queue. what's the purpose of it?

Thanks,
Michael

[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding

2015-09-14 Thread Qiu, Michael

Hi, all

any other comments about this patch?

Thanks,
Michael

On 8/26/2015 2:12 PM, Liu, Jijiang wrote:
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
>> Sent: Friday, August 07, 2015 11:29 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>
>> For some ethnet-switch like intel RRC, all the packet forwarded out by DPDK
>> will be dropped in switch side, so the packet generator will never receive 
>> the
>> packet.
>>
>> Signed-off-by: Michael Qiu 
>> ---
>>  app/test-pmd/csumonly.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
>> 1bf3485..bf8af1d 100644
>> --- a/app/test-pmd/csumonly.c
>> +++ b/app/test-pmd/csumonly.c
>> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream
>> *fs)
>>   * and inner headers */
>>
>>  eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>> +ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>> +ð_hdr->d_addr);
>> +ether_addr_copy(&ports[fs->tx_port].eth_addr,
>> +ð_hdr->s_addr);
>>  parse_ethernet(eth_hdr, &info);
>>  l3_hdr = (char *)eth_hdr + info.l2_len;
>>
>> --
>> 1.9.3
> The change will affect on the csum fwd performance.
> But I also think the change is necessary, or we cannot use csumonly fwd mode 
> in guest?
>
> Acked-by: Jijiang Liu 
>
>

[dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment

2015-08-24 Thread Qiu, Michael

On 5/21/2015 3:50 PM, Ouyang Changchun wrote:
> In non-SRIOV environment, VMDq RSS could be enabled by MRQC register.
> In theory, the queue number per pool could be 2 or 4, but only 2 queues are
> available due to HW limitation, the same limit also exist in Linux ixgbe 
> driver.
>
> Signed-off-by: Changchun Ouyang 
> ---
>  lib/librte_ether/rte_ethdev.c | 40 +++
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 82 
> +--
>  2 files changed, 111 insertions(+), 11 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 024fe8b..6535715 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -933,6 +933,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, 
> uint16_t nb_rx_q)
>   return 0;
>  }
>  
> +#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
> +
> +static int
> +rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t 
> nb_rx_q)
> +{
> + if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
> + return -EINVAL;
> + return 0;
> +}
> +
>  static int
>  rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t 
> nb_tx_q,
> const struct rte_eth_conf *dev_conf)
> @@ -1093,6 +1103,36 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
> nb_rx_q, uint16_t nb_tx_q,
>   return -EINVAL;
>   }
>   }
> +
> + if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
> + uint32_t nb_queue_pools =
> + 
> dev_conf->rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
> + struct rte_eth_dev_info dev_info;
> +
> + rte_eth_dev_info_get(port_id, &dev_info);
> + dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
> + if (nb_queue_pools == ETH_32_POOLS || nb_queue_pools == 
> ETH_64_POOLS)
> + RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
> + dev_info.max_rx_queues/nb_queue_pools;
> + else {
> + PMD_DEBUG_TRACE("ethdev port_id=%d VMDQ "
> + "nb_queue_pools=%d invalid "
> + "in VMDQ RSS\n"

Does here miss "," ?

Thanks,
Michael

> + port_id,
> + nb_queue_pools);
> + return -EINVAL;
> + }
> +
> + if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
> + RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
> + PMD_DEBUG_TRACE("ethdev port_id=%d"
> + " SRIOV active, invalid queue"
> + " number for VMDQ RSS, allowed"
> + " value are 1, 2 or 4\n",
> + port_id);
> + return -EINVAL;
> + }
> + }
>   }
>   return 0;
>  }
>

[dpdk-dev] [PATCH]doc: Add performance tuning guide about how to get DPDK high perf on Intel platform.

2015-08-10 Thread Qiu, Michael

On 2015/8/10 9:11, Zhang, Helin wrote:
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
>> Sent: Monday, August 10, 2015 6:38 AM
>> To: Xu, Qian Q
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH]doc: Add performance tuning guide about how
>> to get DPDK high perf on Intel platform.
>>

[../..]

>>> +1. For Intel? 40G NICs, special configurations should be set before 
>>> compiling it,
>> as follows. **Note**: This is very important::
>>> +
>>> +  for at least DPDK release 1.8, 2.0 and 2.1, in
>> /config/common_linuxapp
>>> +  CONFIG_RTE_PCI_CONFIG=y
>>> +  CONFIG_RTE_PCI_EXTENDED_TAG=?on?
>> Please insert it in a i40e doc instead of here. Then you can reference it.
> Good idea!

Yes, agree, actually, for RRC(fm10k), this configure is meaningless.

Thanks,
Michael 
> Regards,
> Helin
>
>

[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding

2015-08-10 Thread Qiu, Michael

On 2015/8/7 17:13, Ouyang, Changchun wrote:
>
>>

[.../...]

>>
>>  eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>> +ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>> +ð_hdr->d_addr);
>> +ether_addr_copy(&ports[fs->tx_port].eth_addr,
>> +ð_hdr->s_addr);
>> Is it really necessary? Why other NICs do not need this?
>>
> Seems the behavior changes from io fwd into mac fwd?

Yes, but I think it is no influence for checksum offload.

Thanks,
Michael
>>> parse_ethernet(eth_hdr, &info);
>>> l3_hdr = (char *)eth_hdr + info.l2_len;
>>>
>>> --
>>> 1.9.3
>

[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding

2015-08-07 Thread Qiu, Michael

On 2015/8/7 13:37, Zhang, Helin wrote:
>
>> -Original Message-
>> From: Qiu, Michael
>> Sent: Friday, August 7, 2015 11:53 AM
>> To: Zhang, Helin; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>
>> On 2015/8/7 9:06, Zhang, Helin wrote:
>>>> -Original Message-
>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
>>>> Sent: Thursday, August 6, 2015 8:29 PM
>>>> To: dev at dpdk.org
>>>> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum
>>>> forwarding
>>>>
>>>> For some ethnet-switch like intel RRC, all the packet forwarded out
>>>> by DPDK will be dropped in switch side, so the packet generator will never
>> receive the packet.
>>> Is it because of anti-sproof? E.g. When the hardware found that the
>>> dest mac is the port itself, then it will be dropped during TX.
>>> You need to tell the root cause, and why we need to modify like this.
>> Actually, it is not the hardware from PEP(PCI End Point) side, but the 
>> switch side.
>>
>> The TX is OK for DPDK and NIC, but in switch, it receives the packet and try 
>> to
>> forward it, but the dest mac is the same as the NIC which transmit this 
>> packet.
>> So switch will drop it as "Loopback Suppression Drop" in RRC. This should 
>> only
>> happen when switch forwarding packets using dest mac.
>>
>>
>>>> Signed-off-by: Michael Qiu 
>>>> ---
>>>>  app/test-pmd/csumonly.c | 4 
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
>>>> 1bf3485..bf8af1d 100644
>>>> --- a/app/test-pmd/csumonly.c
>>>> +++ b/app/test-pmd/csumonly.c
>>>> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream
>> *fs)
>>>> * and inner headers */
>>>>
>>>>eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>>>> +  ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>>>> +  ð_hdr->d_addr);
>>>> +  ether_addr_copy(&ports[fs->tx_port].eth_addr,
>>>> +  ð_hdr->s_addr);
>>> Is it really necessary? Why other NICs do not need this?
>> Because other NICs is connect directly to packet generator, if we using 
>> switch
>> to connect the generator and the NICs, I think it will need this.
> There are 'iofwd' and 'mac' mode in testpmd, and mac forware will modify the 
> dest
> mac before transmitting the packet. They are for different cases.
> Why not use mac forwarding mode for your testing, and just keep it as is?

Yes, I don't touch iofwd, I just modify the csum, when we test checksum
offload, especially for checksum insert in TX side.

Thanks,
Michael

> Regards,
> Helin
>
>> Thanks,
>> Michael
>>>>parse_ethernet(eth_hdr, &info);
>>>>l3_hdr = (char *)eth_hdr + info.l2_len;
>>>>
>>>> --
>>>> 1.9.3
>

[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding

2015-08-07 Thread Qiu, Michael

On 2015/8/7 9:06, Zhang, Helin wrote:
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
>> Sent: Thursday, August 6, 2015 8:29 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>
>> For some ethnet-switch like intel RRC, all the packet forwarded out by DPDK 
>> will
>> be dropped in switch side, so the packet generator will never receive the 
>> packet.
> Is it because of anti-sproof? E.g. When the hardware found that the dest mac 
> is the
> port itself, then it will be dropped during TX.
> You need to tell the root cause, and why we need to modify like this.

Actually, it is not the hardware from PEP(PCI End Point) side, but the
switch side.

The TX is OK for DPDK and NIC, but in switch, it receives the packet and
try to forward it, but the dest mac is the same as the NIC which
transmit this packet.
So switch will drop it as "Loopback Suppression Drop" in RRC. This
should only happen when switch forwarding packets using dest mac.


>
>> Signed-off-by: Michael Qiu 
>> ---
>>  app/test-pmd/csumonly.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
>> 1bf3485..bf8af1d 100644
>> --- a/app/test-pmd/csumonly.c
>> +++ b/app/test-pmd/csumonly.c
>> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>>   * and inner headers */
>>
>>  eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>> +ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>> +ð_hdr->d_addr);
>> +ether_addr_copy(&ports[fs->tx_port].eth_addr,
>> +ð_hdr->s_addr);
> Is it really necessary? Why other NICs do not need this?

Because other NICs is connect directly to packet generator, if we
using switch to connect the generator and the NICs, I think it will need
this.

Thanks,
Michael
>
>>  parse_ethernet(eth_hdr, &info);
>>  l3_hdr = (char *)eth_hdr + info.l2_len;
>>
>> --
>> 1.9.3
>

[dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding

2015-08-07 Thread Qiu, Michael

On 2015/8/7 9:05, De Lara Guarch, Pablo wrote:
> Hi Michael,
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
>> Sent: Friday, August 07, 2015 4:29 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] testpmd: modify the mac of csum forwarding
>>
>> For some ethnet-switch like intel RRC, all the packet forwarded
>> out by DPDK will be dropped in switch side, so the packet
>> generator will never receive the packet.
>>
>> Signed-off-by: Michael Qiu 
>> ---
>>  app/test-pmd/csumonly.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
>> index 1bf3485..bf8af1d 100644
>> --- a/app/test-pmd/csumonly.c
>> +++ b/app/test-pmd/csumonly.c
>> @@ -550,6 +550,10 @@ pkt_burst_checksum_forward(struct fwd_stream
>> *fs)
>>   * and inner headers */
>>
>>  eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
>> +ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
>> +ð_hdr->d_addr);
>> +ether_addr_copy(&ports[fs->tx_port].eth_addr,
>> +ð_hdr->s_addr);
>>  parse_ethernet(eth_hdr, &info);
>>  l3_hdr = (char *)eth_hdr + info.l2_len;
>>
>> --
>> 1.9.3
> Why do you make this change only in this mode? If NICs like RRC has this 
> issue,
> I assume it would happen in other modes.

Yes, exactly, but for iofwd if we change the mac, so the mode is changed
 am I right?

Thanks,
Michael

> Thanks,
> Pablo
>

[dpdk-dev] how to build dpdk in debug mode?

2015-08-04 Thread Qiu, Michael

Please refer dec/build-sdk-quick.txt

EXTRA_CFLAGS=-g

should work.

Thanks,
Michael

On 2015/8/3 9:29, Thomas Monjalon wrote:
> 2015-08-03 16:16, Montorsi, Francesco:
>> Hi all,
>> I have searched the archives for this, without much success.
>>
>> Is it possible to build dpdk user-space libraries with -O0 and -g instead of 
>> -O3 ?
>> This would make debugging via GDB much more friendly...
> The answer is EXTRA_CFLAGS :)
>

[dpdk-dev] [RFC] examples: remove l3fwd-vf example

2015-08-04 Thread Qiu, Michael

Actually, l3fwd works fine with fm10k vf.

I don't know what's the exact reason of l3fwd-vf still in DPDK, at least we 
could make full support for vf in l3fwd instead of another sample with most 
code are the same compare with l3fwd.

Thanks,
Michael

On 2015/7/22 7:51, Zhang, Helin wrote:

Marvin/Waterman

Could you help to check if l3fwd is good enough for all cases (1g/10/40g, PF 
and VF, single queue/multiple queue)?
We aim to remove l3fwd-vf to reduce an example application which is not so 
necessary.
Thank you!

Regards,
Helin



-Original Message-
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
Sent: Wednesday, July 22, 2015 3:30 AM
To: Zhang, Helin
Cc: dev at dpdk.org; Wu, Jingjing
Subject: Re: [dpdk-dev] [RFC] examples: remove l3fwd-vf example

2015-07-14 14:50, Zhang, Helin:


From: Wu, Jingjing


Because VF multi-queues can be supported, l3fwd can run on vf.
Suggest to remove the l3fwd-vf example.


Totally agree with this!
But we need the confirmation from validation guys of that l3fwd works
quite well on VF with all NICs (e.g. i350, 82599, x550, xl710, and fm10k).



Helin, any new from validation?

[dpdk-dev] [PATCH v6] Add toeplitz hash algorithm used by RSS

2015-07-29 Thread Qiu, Michael

Hi, Vladimir

You need also to fix this issue in i686 platform:

RHEL65_32,2.6.32,4.4.7,14.0.0
SUSE11SP3_32,3.0.76-0,4.3.4,14.0.0


i686-native-linuxapp-gcc/include/rte_thash.h:63: error: integer constant is too 
large for 'long' type
i686-native-linuxapp-gcc/include/rte_thash.h:63: error: integer constant is too 
large for 'long' type


Thanks,
Michael
On 2015/7/27 4:58, Vladimir Medvedkin wrote:

Hi Tony,

Sorry for the late reply, I was on vacation.
I'll prepare patch soon.

Regards,
Vladimir

2015-07-22 10:55 GMT+03:00 Tony Lu :



Hi, Vladimir

When compiling thash for no-X86 arches, it fails with the following errors.
I wonder if
it is possible to make the thash library arch-independent?

== Build app/test
  CC test_thash.o
In file included from /u/zlu.bjg/git/dpdk.org/app/test/test_thash.c:40:
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:56:22
:
error: rte_vect.h: No such file or directory
In file included from /u/zlu.bjg/git/dpdk.org/app/test/test_thash.c:40:
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:62:
error: expected '=', ',', ';', 'asm' or '__attribute__' before
'rte_thash_ipv6_bswap_mask'
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:129:
error: requested alignment is not a constant
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h: In
function 'rte_thash_load_v6_addrs':
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:160:
error: '__m128i' undeclared (first use in this function)
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:160:
error: (Each undeclared identifier is reported only once
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:160:
error: for each function it appears in.)
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:160:
error: expected ';' before 'ipv6'
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:161:
error: expected expression before ')' token
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:163:
error: 'ipv6' undeclared (first use in this function)
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:163:
warning: implicit declaration of function '_mm_loadu_si128'
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:163:
warning: nested extern declaration of '_mm_loadu_si128'
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:163:
error: expected ')' before '__m128i'
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:163:
warning: type defaults to 'int' in declaration of 'type name'
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:163:
warning: cast from pointer to integer of different size
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:164:
error: expected expression before ')' token
/u/zlu.bjg/git/dpdk.org/tile-tilegx-linuxapp-gcc/include/rte_thash.h:158:
warning: unused parameter 'targ'
make[3]: *** [test_thash.o] Error 1
make[2]: *** [test] Error 2
make[1]: *** [app] Error 2
make: *** [all] Error 2

Thanks
-Zhigang Lu



-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Vladimir Medvedkin
Sent: Wednesday, July 01, 2015 7:40 AM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH v6] Add toeplitz hash algorithm used by RSS

Software implementation of the Toeplitz hash function used by RSS.
Can be used either for packet distribution on single queue NIC or for


simulating


of RSS computation on specific NIC (for example after GRE header
decapsulating).

v6 changes
- Fix compilation error
- Rename some defines and function

v5 changes
- Fix errors reported by checkpatch.pl

v4 changes
- Fix copyright
- rename bswap_mask constant, add rte_ prefix
- change rte_ipv[46]_tuple struct
- change rte_thash_load_v6_addr prototype

v3 changes
- Rework API to be more generic
- Add sctp_tag into tuple

v2 changes
- Add ipv6 support
- Various style fixes

Signed-off-by: Vladimir Medvedkin 
---
lib/librte_hash/Makefile|   1 +
lib/librte_hash/rte_thash.h | 231

2 files changed, 232 insertions(+)
create mode 100644 lib/librte_hash/rte_thash.h

diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile index
3696cb1..981230b 100644
--- a/lib/librte_hash/Makefile
+++ b/lib/librte_hash/Makefile
@@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_hash_crc.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h

# this lib needs eal
diff --git a/lib/librte_hash/rte_thash.h b/lib/librte_hash/rte_thash.h new


file


mode 100644 index 000..1808f47
--- /dev/null
+++ b/lib/librte

[dpdk-dev] [PATCH] testpmd: Fix wrong message in testpmd

2015-07-28 Thread Qiu, Michael

On 2015/7/8 2:04, Richardson, Bruce wrote:
> On Wed, Jul 08, 2015 at 07:16:21AM +0000, Qiu, Michael wrote:

[.../...]

>>> port = &ports[pi];
>>> if (rte_atomic16_cmpset(&(port->port_status),
>>> +   RTE_PORT_CLOSED, RTE_PORT_CLOSED) == 1) {
>>> +   printf("Port %d is already closed\n", pi);
>>> +   continue;
>>> +   }
>>> +
>>> +   if (rte_atomic16_cmpset(&(port->port_status),
>>> RTE_PORT_STOPPED, RTE_PORT_HANDLING) == 0) {
>>> printf("Port %d is now not stopped\n", pi);
>>> continue;
> I know it's not part of your change, but "Now not stopped" doesn't really seem
> right to me. What is the message actually trying to report?

It is just make sure the port is in stopped state. So it will check if
it is not in RTE_PORT_STOPPED stat or fail to set to RTE_PORT_HANDLING,
it will report as "now not stopped"


Thanks,
Michael
>
> /Bruce

[dpdk-dev] [PATCH] examples/l3fwd: increase lookup burst size to 8

2015-07-23 Thread Qiu, Michael

Hi, Pablo

Is there any performance data for this change?

Thanks,
Michael

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Pablo de Lara
Sent: Thursday, July 23, 2015 9:12 AM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH] examples/l3fwd: increase lookup burst size to 8

With the new hash implementation, the minimum lookup burst size to get good 
performance is 8, since its internal pipeline consists of 4 stages of 2 entries 
each, so to avoid duplication, burst size should be 8 or more entries.

Signed-off-by: Pablo de Lara 
---
 examples/l3fwd/main.c | 234 --
 1 file changed, 191 insertions(+), 43 deletions(-)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c index 
45676ba..c8a0f66 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -1,7 +1,7 @@
 /*-
  *   BSD LICENSE
  *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -731,26 +731,34 @@ static inline void l3fwd_simple_forward(struct rte_mbuf 
*m, uint8_t portid,  #if ((APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH) && \
(ENABLE_MULTI_BUFFER_OPTIMIZE == 1))

-#define MASK_ALL_PKTS0xf
-#define EXECLUDE_1ST_PKT 0xe
-#define EXECLUDE_2ND_PKT 0xd
-#define EXECLUDE_3RD_PKT 0xb
-#define EXECLUDE_4TH_PKT 0x7
+#define MASK_ALL_PKTS0xff
+#define EXCLUDE_1ST_PKT 0xfe
+#define EXCLUDE_2ND_PKT 0xfd
+#define EXCLUDE_3RD_PKT 0xfb
+#define EXCLUDE_4TH_PKT 0xf7
+#define EXCLUDE_5TH_PKT 0xef
+#define EXCLUDE_6TH_PKT 0xdf
+#define EXCLUDE_7TH_PKT 0xbf
+#define EXCLUDE_8TH_PKT 0x7f

 static inline void
-simple_ipv4_fwd_4pkts(struct rte_mbuf* m[4], uint8_t portid, struct lcore_conf 
*qconf)
+simple_ipv4_fwd_8pkts(struct rte_mbuf *m[8], uint8_t portid, struct 
+lcore_conf *qconf)
 {
-   struct ether_hdr *eth_hdr[4];
-   struct ipv4_hdr *ipv4_hdr[4];
-   uint8_t dst_port[4];
-   int32_t ret[4];
-   union ipv4_5tuple_host key[4];
-   __m128i data[4];
+   struct ether_hdr *eth_hdr[8];
+   struct ipv4_hdr *ipv4_hdr[8];
+   uint8_t dst_port[8];
+   int32_t ret[8];
+   union ipv4_5tuple_host key[8];
+   __m128i data[8];

eth_hdr[0] = rte_pktmbuf_mtod(m[0], struct ether_hdr *);
eth_hdr[1] = rte_pktmbuf_mtod(m[1], struct ether_hdr *);
eth_hdr[2] = rte_pktmbuf_mtod(m[2], struct ether_hdr *);
eth_hdr[3] = rte_pktmbuf_mtod(m[3], struct ether_hdr *);
+   eth_hdr[4] = rte_pktmbuf_mtod(m[4], struct ether_hdr *);
+   eth_hdr[5] = rte_pktmbuf_mtod(m[5], struct ether_hdr *);
+   eth_hdr[6] = rte_pktmbuf_mtod(m[6], struct ether_hdr *);
+   eth_hdr[7] = rte_pktmbuf_mtod(m[7], struct ether_hdr *);

/* Handle IPv4 headers.*/
ipv4_hdr[0] = rte_pktmbuf_mtod_offset(m[0], struct ipv4_hdr *, @@ 
-761,32 +769,56 @@ simple_ipv4_fwd_4pkts(struct rte_mbuf* m[4], uint8_t portid, 
struct lcore_conf *
  sizeof(struct ether_hdr));
ipv4_hdr[3] = rte_pktmbuf_mtod_offset(m[3], struct ipv4_hdr *,
  sizeof(struct ether_hdr));
+   ipv4_hdr[4] = rte_pktmbuf_mtod_offset(m[4], struct ipv4_hdr *,
+ sizeof(struct ether_hdr));
+   ipv4_hdr[5] = rte_pktmbuf_mtod_offset(m[5], struct ipv4_hdr *,
+ sizeof(struct ether_hdr));
+   ipv4_hdr[6] = rte_pktmbuf_mtod_offset(m[6], struct ipv4_hdr *,
+ sizeof(struct ether_hdr));
+   ipv4_hdr[7] = rte_pktmbuf_mtod_offset(m[7], struct ipv4_hdr *,
+ sizeof(struct ether_hdr));

 #ifdef DO_RFC_1812_CHECKS
/* Check to make sure the packet is valid (RFC1812) */
uint8_t valid_mask = MASK_ALL_PKTS;
if (is_valid_ipv4_pkt(ipv4_hdr[0], m[0]->pkt_len) < 0) {
rte_pktmbuf_free(m[0]);
-   valid_mask &= EXECLUDE_1ST_PKT;
+   valid_mask &= EXCLUDE_1ST_PKT;
}
if (is_valid_ipv4_pkt(ipv4_hdr[1], m[1]->pkt_len) < 0) {
rte_pktmbuf_free(m[1]);
-   valid_mask &= EXECLUDE_2ND_PKT;
+   valid_mask &= EXCLUDE_2ND_PKT;
}
if (is_valid_ipv4_pkt(ipv4_hdr[2], m[2]->pkt_len) < 0) {
rte_pktmbuf_free(m[2]);
-   valid_mask &= EXECLUDE_3RD_PKT;
+   valid_mask &= EXCLUDE_3RD_PKT;
}
if (is_valid_ipv4_pkt(ipv4_hdr[3], m[3]->pkt_len) < 0) {
rte_pktmbuf_free(m[3]);
-   valid_mask &= EXECLUDE_4TH_PKT;
+   valid_mask &= EXCLUDE_4TH_PKT;
+   }
+   if (is_valid_ipv4_pkt(ipv4_hdr[4], m[4]->pkt_len) < 0) {
+   rte_pktmbuf_free(m[4]);
+   valid_mask &= EXCLUDE_5TH_PKT;
+

[dpdk-dev] [RFC] vfio: only map regions VFIO supports

2015-07-14 Thread Qiu, Michael

Hi, all

While I check the maillist,  I found a patch:

[dpdk-dev] [PATCH v3] vfio: Fix overflow while assigning vfio BAR region
offset and size

This patch has fixed this issue :), and will be merged.

This patch is same with mine.

So I will not post the patch again.

Thanks,
Michael
On 7/11/2015 1:24 AM, Stephen Hemminger wrote:
> On Fri, 10 Jul 2015 07:54:10 +
> "Qiu, Michael"  wrote:
>
>> Hi, Stephen
>>
>> This patch does not work for fm10k with vfio, see error below:
>>
>> EAL: PCI device :84:00.0 on NUMA socket 1
>> EAL:   probe driver: 8086:15a4 rte_pmd_fm10k
>> EAL:   PCI memory mapped at 0x7f198000
>> EAL: Trying to map BAR 2 that contains the MSI-X table. Trying offsets:
>> :, 1000:1000
>> EAL:   PCI memory mapped at 0x7f1980401000
>> EAL: pci_map_resource(): cannot mmap(105, 0x7f1980402000, 0x400,
>> 0x0): Invalid argument (0x)
>> EAL:   :84:00.0 mapping BAR4 failed: Invalid argument
>> EAL: Error - exiting with code: 1
>>   Cause: Requested device :84:00.0 cannot be used
> Yes. The patch doesn't solve the problem (but might be needed in some future 
> weird hw).
>
>

[dpdk-dev] [RFC] vfio: only map regions VFIO supports

2015-07-14 Thread Qiu, Michael

Hi, Stephen

I have found out the root cause of this bug, and generate a patch.

Will send out later after :)

Thanks,
Michael

On 7/11/2015 1:24 AM, Stephen Hemminger wrote:
> On Fri, 10 Jul 2015 07:54:10 +
> "Qiu, Michael"  wrote:
>
>> Hi, Stephen
>>
>> This patch does not work for fm10k with vfio, see error below:
>>
>> EAL: PCI device :84:00.0 on NUMA socket 1
>> EAL:   probe driver: 8086:15a4 rte_pmd_fm10k
>> EAL:   PCI memory mapped at 0x7f198000
>> EAL: Trying to map BAR 2 that contains the MSI-X table. Trying offsets:
>> :, 1000:1000
>> EAL:   PCI memory mapped at 0x7f1980401000
>> EAL: pci_map_resource(): cannot mmap(105, 0x7f1980402000, 0x400,
>> 0x0): Invalid argument (0x)
>> EAL:   :84:00.0 mapping BAR4 failed: Invalid argument
>> EAL: Error - exiting with code: 1
>>   Cause: Requested device :84:00.0 cannot be used
> Yes. The patch doesn't solve the problem (but might be needed in some future 
> weird hw).
>
>

1 2 3 4 >

1 - 100 of 337 matches

Mail list logo