[dpdk-dev] Changes to 5tuple IPv4 filters in dpdk v2.0

2015-08-05 Thread Kamraan Nasim
Hi Vladimir,

Thank you for the link. It seems to simply be an abstraction over the existing
filters, so it is safe for me to upgrade to v2.0 :)

Since we are on the subject, are you aware of any filters on the 82599 or
Fortville that may provide subnet filtering (i.e. I can specify something like
192.168.0.0/16 instead of host addresses)? What about flow director
filters?
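
For reference, here is roughly what the v2.0 ntuple path that replaced
rte_eth_dev_add_5tuple_filter() looks like as I read rte_eth_ctrl.h -- an
untested sketch, using exact /32 masks since (as Vladimir notes below) the
82599 only accepts UINT32_MAX or 0 for the IP masks:

#include <string.h>
#include <stdint.h>
#include <netinet/in.h>
#include <rte_ethdev.h>
#include <rte_eth_ctrl.h>

/* Untested sketch: exact-match 5-tuple via the v2.0 ntuple filter API.
 * Field and flag names are from my reading of rte_eth_ctrl.h, so please
 * double-check against your headers. Addresses/ports are big endian. */
static int
add_exact_5tuple(uint8_t port, uint32_t src_ip_be, uint32_t dst_ip_be,
                 uint16_t src_port_be, uint16_t dst_port_be, uint16_t queue)
{
    struct rte_eth_ntuple_filter f;

    memset(&f, 0, sizeof(f));
    f.flags         = RTE_5TUPLE_FLAGS;  /* src/dst IP + ports + proto */
    f.src_ip        = src_ip_be;
    f.src_ip_mask   = UINT32_MAX;        /* 82599: all-or-nothing, no /16 */
    f.dst_ip        = dst_ip_be;
    f.dst_ip_mask   = UINT32_MAX;
    f.src_port      = src_port_be;
    f.src_port_mask = UINT16_MAX;
    f.dst_port      = dst_port_be;
    f.dst_port_mask = UINT16_MAX;
    f.proto         = IPPROTO_TCP;
    f.proto_mask    = UINT8_MAX;
    f.priority      = 1;
    f.queue         = queue;

    return rte_eth_dev_filter_ctrl(port, RTE_ETH_FILTER_NTUPLE,
                                   RTE_ETH_FILTER_ADD, &f);
}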


--Kam

On Tue, Aug 4, 2015 at 5:40 PM, Vladimir Medvedkin 
wrote:

> Hi Kam,
>
> 1) The reason is discussed in
> http://dpdk.org/ml/archives/dev/2014-September/005179.html
> 2) No, it's still not supported (on current NICs). At the moment ntuple is
> supported only by igb and ixgbe. If you look at
> drivers/net/ixgbe/ixgbe_ethdev.c you can see the ntuple_filter_to_5tuple
> function, which translates rte_eth_ntuple_filter to ixgbe_5tuple_filter_info,
> so the mask can be either UINT32_MAX or 0. It's a hardware limitation (see the
> 82599 datasheet, section 7.1.2.5, L3/L4 5-tuple Filters).
>
> Regards,
> Vladimir
>
> 2015-08-04 23:44 GMT+03:00 Kamraan Nasim :
>
>> Hi DPDK community,
>>
>> I've been using DPDK v1.7 and v1.8 for the past year. On updating to
>> v2.0.0, I see that *rte_5tuple_filter* has been deprecated, as well as the
>> associated install/remove call, *rte_eth_dev_add_5tuple_filter()*.
>>
>> I now see that rte_eth_ntuple_filter has been added in its place.
>>
>> 1) Is there a specific reason for removing backward compatibility? As in,
>> is there a known issue with the rte_5tuple_filter infrastructure that was
>> discovered in v2.0?
>>
>>
>> 2) One limitation of rte_5tuple_filter was that it could not be used to
>> filter /24 or /16 IP addresses (subnet filtering). I now see that
>> src_ip_mask and dst_ip_mask are 32 bits and a separate
>> RTE_NTUPLE_FLAGS_SRC_IP
>> <
>> http://dpdk.org/doc/api/rte__eth__ctrl_8h.html#aff1204ca0b33628610956f840dd9b206
>> >
>> flag has been introduced. Does this imply that we NOW support subnet
>> filtering (i.e. use the mask for wildcard masking)?
>>
>>
>> Any help or pointers on the subject will be greatly appreciated!!!
>>
>>
>> Thanks,
>> Kam
>>
>
>


[dpdk-dev] Changes to 5tuple IPv4 filters in dpdk v2.0

2015-08-04 Thread Kamraan Nasim
Hi DPDK community,

I've been using DPDK v1.7 and v1.8 for the past year. On updating to
v2.0.0, I see that *rte_5tuple_filter* has been deprecated, as well as the
associated install/remove call, *rte_eth_dev_add_5tuple_filter()*.

I now see that rte_eth_ntuple_filter has been added in its place.

1) Is there a specific reason for removing backward compatibility? As in,
is there a known issue with the rte_5tuple_filter infrastructure that was
discovered in v2.0?


2) One limitation of rte_5tuple_filter was that it could not be used to
filter /24 or /16 IP addresses (subnet filtering). I now see that
src_ip_mask and dst_ip_mask are 32 bits and a separate
RTE_NTUPLE_FLAGS_SRC_IP flag has been introduced. Does this imply that we
NOW support subnet filtering (i.e. use the mask for wildcard masking)?


Any help or pointers on the subject will be greatly appreciated!!!


Thanks,
Kam


[dpdk-dev] dev Digest, Vol 22, Issue 37

2015-04-15 Thread Kamraan Nasim
>
> This had me stumped for a while as well. In my case, PostgreSQL 9.4 was also
> running on my system; it also used huge pages and came up before my DPDK
> application, causing rte_mempool_create() to fail with ENOMEM.


Check which other applications are using huge pages:
> lsof | grep huge

And see if you can disable huge pages for them, or increase the total number
of huge pages you're allocating in the kernel.
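
If it helps, the failure can at least be made loud at startup; a rough
sketch (v1.7-era mempool API, pool sizes are placeholder values I picked
for illustration):

#include <stdio.h>
#include <rte_errno.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

#define NB_MBUF   8192  /* placeholder pool size */
#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)

/* Create the pktmbuf pool and complain usefully if the hugepage memory
 * left over (after PostgreSQL etc.) is not enough for it. */
static struct rte_mempool *
create_pktmbuf_pool(void)
{
        struct rte_mempool *mp;

        mp = rte_mempool_create("pktmbuf_pool", NB_MBUF, MBUF_SIZE,
                                32, /* per-lcore cache */
                                sizeof(struct rte_pktmbuf_pool_private),
                                rte_pktmbuf_pool_init, NULL,
                                rte_pktmbuf_init, NULL,
                                rte_socket_id(), 0);
        if (mp == NULL)
                printf("rte_mempool_create failed: %s -- check "
                       "`lsof | grep huge` and /proc/sys/vm/nr_hugepages\n",
                       rte_strerror(rte_errno));
        return mp;
}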


--Kam


>


> Date: Sat, 10 Jan 2015 21:26:03 +0200
> From: Liran Zvibel 
> To: Newman Poborsky , "dev at dpdk.org"
> 
> Subject: Re: [dpdk-dev] rte_mempool_create fails with ENOMEM
> Message-ID:
>  QUKrd-ZCGn6HqBw7h6NE7wxUszf6WxOY18geg at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Hi Newman,
>
> There are two options: either one of your pools is very large and
> just does not fit in half of the memory,
> so if the physical memory must be split it can just never work, or
> what you're seeing is localized to your
> environment, and when allocating from both NUMA nodes the huge pages
> just happen to be too scattered
> for your pools to be allocated.
>
> In any case, we also have to deal with large pools that don't always
> fit into consecutive huge pages as
> allocated by the kernel. I have created a small patch to DPDK itself,
> then some more code that can live
> as part of the dpdk application that does the scattered allocation.
>
> I'm going to send both parts here (the change to the DPDK and the user
> part). I don't know what the rules are
> for pushing to the repository, so I won't try to do so.
>
> First, the DPDK patch, which just makes sure that the huge pages are
> mapped in continuous virtual memory,
> and then the memory segments are allocated continuously in virtual
> memory. I'm attaching the full mbox content to make it easier
> for you to use if you'd like. I created it against 1.7.1, since that
> is the version we're using. If you'd like, I can also create it
> against 1.8.0.
>
> 
>
> >From 10ebc74eda2c3fe9e5a34815e0f7ee1f44d99aa3 Mon Sep 17 00:00:00 2001
> From: Liran Zvibel 
> Date: Sat, 10 Jan 2015 12:46:54 +0200
> Subject: [PATCH] Add an option to allocate huge pages in continuous virtual
>  addresses
> To: dev at dpdk.org
>
> Add a configuration option: CONFIG_RTE_EAL_HUGEPAGES_SINGLE_CONT_VADDR
> that advises the memory segment allocation code to allocate as many
> hugepages as possible in a continuous range of virtual addresses.
>
> This way, a mempool may be created out of dispersed memzones allocated
> from these new continuous memory segments.
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c
> b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index f2454f4..b8d68b0 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -329,6 +329,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
>
>  #ifndef RTE_EAL_SINGLE_FILE_SEGMENTS
> else if (vma_len == 0) {
> +#ifndef RTE_EAL_HUGEPAGES_SINGLE_CONT_VADDR
> unsigned j, num_pages;
>
> /* reserve a virtual area for next contiguous
> @@ -340,6 +341,14 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
> break;
> }
> num_pages = j - i;
> +#else /* hugepages will be allocated at continuous virtual addresses */
> +   unsigned num_pages;
> +   /* We will reserve a virtual area large enough to fit ALL
> +* physical blocks. This way we can have bigger mempools
> +* even if there is no continuous physical region. */
> +   num_pages = hpi->num_pages[0] - i;
> +#endif
> vma_len = num_pages * hugepage_sz;
>
> /* get the biggest virtual memory area up to
> @@ -1268,6 +1277,16 @@ rte_eal_hugepage_init(void)
> new_memseg = 1;
>
> if (new_memseg) {
> +#ifdef RTE_EAL_HUGEPAGES_SINGLE_CONT_VADDR
> +   if (0 <= j) {
> +   RTE_LOG(DEBUG, EAL, "Closing memory segment #%d(%p) "
> +   "vaddr is %p phys is 0x%lx size is 0x%lx "
> +   "which is #%ld pages, next vaddr will be at 0x%lx\n",
> +   j, &mcfg->memseg[j], mcfg->memseg[j].addr,
> +   mcfg->memseg[j].phys_addr, mcfg->memseg[j].len,
> +   mcfg->memseg[j].len / mcfg->memseg[j].hugepage_sz,
> +   mcfg->memseg[j].addr_64 + mcfg->memseg[j].len);
> +   }
> +#endif
> j += 1;
> if (j == RTE_M

[dpdk-dev] Filtration Statistics for VLAN filters(ixgbe)

2015-03-11 Thread Kamraan Nasim
Hello,

I am trying to implement VLAN filters in DPDK on the 82599 NIC (Niantic).

I've been using *rte_eth_dev_vlan_filter(port, vlanID, 1)*, which seems to
be working; however, I would like to see some actual packet filtration
statistics.

In the case of 5tuple IPv4 filters where I was redirecting to queue X, I
could see the filter working by looking at q_errors[X] in the rte_eth_stats
struct.
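
(For reference, that per-queue check is roughly the following -- a sketch,
with X being whichever queue the filter redirects to:)

#include <inttypes.h>
#include <stdio.h>
#include <rte_ethdev.h>

/* Print the counters of the queue a filter redirects to, to confirm the
 * filter is actually steering packets there. */
static void
print_filter_queue_stats(uint8_t port, uint16_t x)
{
        struct rte_eth_stats stats;

        rte_eth_stats_get(port, &stats);
        printf("queue %u: q_ipackets=%" PRIu64 " q_ibytes=%" PRIu64
               " q_errors=%" PRIu64 "\n",
               x, stats.q_ipackets[x], stats.q_ibytes[x], stats.q_errors[x]);
}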

Is there anything in *rte_eth_stats* that can be used to see VLAN filtering
stats or anything else out there?

Greatly appreciate any help in this department :)


Thanks,
Kam


[dpdk-dev] RSS Reta redirection not working on 82599 NIC(Niantic)

2015-01-19 Thread Kamraan Nasim
Hello,

Following from a previous thread, I took up Bruce Richardson's advice to
modify the RSS redirection table (RETA) to exclude entries pointing to a
specific queue:

For example, this is the RETA table before any modifications are done:
*Default *Reta table:
0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1
2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0
1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
0 1 2 0 1 2 0 1 2 0 0 0 0 0

I have initialized 3 queues on this NIC (82599 Niantic), and as can be seen,
the NIC will forward traffic to queues 1 & 2.

Now the RETA table is modified so that everything goes to queue 0.

*Modified* Reta table:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0

However despite this, RSS is loadbalancing packets on queues 1 & 2.

The following is a snippet of my code:
/*
 * Rewrite the RETA (Redirection Table) for RSS.
 * We want all traffic to be RX'd on queue 0
 * and only filtered traffic to be mapped to specific
 * queues, with one per filter for us to report
 * statistics.
 */
struct rte_eth_rss_reta port_reta_conf;
memset(&port_reta_conf, 0, sizeof(struct rte_eth_rss_reta));
port_reta_conf.mask_lo = 0xffffffffffffffffULL;   /* select entries 0-63 */
port_reta_conf.mask_hi = 0xffffffffffffffffULL;   /* select entries 64-127 */
int reta_ret = 0;
for (int rc = 0; rc < ETH_RSS_RETA_NUM_ENTRIES; rc++) {
    port_reta_conf.reta[rc] = 0x00;   /* everything to queue 0 */
}
reta_ret = rte_eth_dev_rss_reta_update(port, &port_reta_conf);

printf("Reta return code: %d\n", reta_ret);


I have tried restarting the port, start/stop, link up/link down, but the
effect is the same: the RETA table shows that it has all been zeroed out,
however packets are still sent to all configured queues. I am using DPDK
v1.7.0.

Have others come across something similar? Any clues would be much
appreciated :)

--Kam


[dpdk-dev] Does I210 NIC support Flow director filters?

2015-01-16 Thread Kamraan Nasim
Hi Bruce,
Perhaps this can be worked around by setting rx_drop_en in the
rte_eth_rxconf <http://dpdk.org/doc/api/structrte__eth__rxconf.html> and
using a small number of RX descriptors, so that the NIC drops when it runs
out of RX descriptors for that queue?

As long as the NIC can still provide q_ipackets/q_errors, this might be
faster than doing the RX in software and dropping there.
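
Roughly what I have in mind -- an untested sketch, with the descriptor
count deliberately small:

#include <string.h>
#include <rte_ethdev.h>

/* Sketch: small RX ring plus rx_drop_en, so the NIC drops in hardware once
 * the queue's descriptors are exhausted instead of backing up the port. */
static int
setup_drop_queue(uint8_t port, uint16_t queue, struct rte_mempool *mp)
{
        struct rte_eth_rxconf rxconf;

        memset(&rxconf, 0, sizeof(rxconf));
        rxconf.rx_drop_en = 1;     /* drop when no RX descriptors are left */

        /* 64 descriptors: enough to keep q_ipackets/q_errors counting,
         * small enough that unread packets get discarded quickly. */
        return rte_eth_rx_queue_setup(port, queue, 64,
                                      rte_eth_dev_socket_id(port),
                                      &rxconf, mp);
}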

--Kam

On Fri, Jan 16, 2015 at 5:56 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Thu, Jan 15, 2015 at 08:06:52PM -0500, Kamraan Nasim wrote:
> > >>> update the RSS RETA table so that traffic doesn't get sent
> > >> to that queue via RSS. Is that what you are asking?
> >
> > Thanks Bruce, that's exactly it.  Basically each filter will forward
> > traffic to a unique RSS queue which can allow me to calculate filter
> match
> > statistics for that queue(or filter). At that point I would like to drop
> > the filtered packet. Is there any way to drop the filtered packet in the
> > RSS queue without doing a rte_eth_rx_burst() and dropping it then?
> >
> > --Kam
> >
> I don't believe there is any other way to drop them from the queue other
> than
> RX and drop. It's not the most efficient way to do so (since you have to
> allocate,
> fill and free an mbuf), but it's the only way that an app can do so without
> adding new functions to the driver(s).
>
> /Bruce
>


[dpdk-dev] Does I210 NIC support Flow director filters?

2015-01-15 Thread Kamraan Nasim
>>> update the RSS RETA table so that traffic doesn't get sent
>> to that queue via RSS. Is that what you are asking?

Thanks Bruce, that's exactly it. Basically each filter will forward
traffic to a unique RSS queue, which allows me to calculate filter match
statistics for that queue (or filter). At that point I would like to drop
the filtered packet. Is there any way to drop the filtered packet in the
RSS queue without doing an rte_eth_rx_burst() and dropping it then?

--Kam

On Thu, Jan 15, 2015 at 9:44 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Wed, Jan 14, 2015 at 04:59:17PM -0500, Kamraan Nasim wrote:
> > Many thanks Helin and Bruce :)
> >
> > Now if 1Gb NICs don't support fdir filters then I'm wondering how we would
> > count the number of packets matching a filter.
> >
> > Regular 5tuple filters don't have any stats similar to "fdirmatch"(in the
> > rte_eth_stats <http://dpdk.org/doc/api/structrte__eth__stats.html>
> struct).
> > One way I can think of is to use regular ibytes/ipackets stats for the
> > queue to which the packets are being redirected in the 5tuple filter but
> > this seems a bit hacky + there is no way to distinguish this packet
> > throughput from the regular traffic that the NIC is forwarding to that
> > specific queue.
> >
> > Is there a way to EXCLUSIVELY bind a 5tuple filter to an RSS queue so
> that
> > only matched traffic is forwarded there?
> >
> >
> > --Kam
> >
>
> What you can do is use a 5-tuple filter to send traffic to a queue. What
> you
> can also do is update the RSS RETA table so that traffic doesn't get sent
> to that queue via RSS. Is that what you are asking?
>
> /Bruce
>
> > On Wed, Jan 14, 2015 at 5:27 AM, Bruce Richardson <
> > bruce.richardson at intel.com> wrote:
> >
> > > On Tue, Jan 13, 2015 at 11:21:08PM -0500, Kamraan Nasim wrote:
> > > > Hello,
> > > >
> > > > I've been using DPDK fdir filter APIs for 82599 NIC(Niantic) and they
> > > work
> > > > very well.
> > > >
> > > > Was wondering if these could also be used for I210 1Gbps NICs?
> > > >
> > > > The other option is to use 5tuple
> filters(rte_eth_dev_add_5tuple_filter
> > > > <
> > >
> http://dpdk.org/doc/api/rte__ethdev_8h.html#aaa28adafa65a4f47d4aeceaf1b08381b
> > > >),
> > > > however these do not support IPv6 yet.
> > > >
> > > >
> > > > Have people in the community had any luck with configuring L3/L4
> hardware
> > > > filters for the I210 NIC?
> > > >
> > > > Thanks,
> > > > Kam
> > >
> > > Flow director filters are not supported for 1G NICs. Sorry.
> > >
> > > /Bruce
> > >
>


[dpdk-dev] Does I210 NIC support Flow director filters?

2015-01-14 Thread Kamraan Nasim
Many thanks Helin and Bruce :)

Now if 1Gb NICs don't support fdir filters then I'm wondering how we would
count the number of packets matching a filter.

Regular 5tuple filters don't have any stats similar to "fdirmatch" (in the
rte_eth_stats <http://dpdk.org/doc/api/structrte__eth__stats.html> struct).
One way I can think of is to use the regular ibytes/ipackets stats for the
queue to which the packets are being redirected by the 5tuple filter, but
this seems a bit hacky, plus there is no way to distinguish this packet
throughput from the regular traffic that the NIC is forwarding to that
specific queue.

Is there a way to EXCLUSIVELY bind a 5tuple filter to an RSS queue so that
only matched traffic is forwarded there?


--Kam

On Wed, Jan 14, 2015 at 5:27 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Tue, Jan 13, 2015 at 11:21:08PM -0500, Kamraan Nasim wrote:
> > Hello,
> >
> > I've been using DPDK fdir filter APIs for 82599 NIC(Niantic) and they
> work
> > very well.
> >
> > Was wondering if these could also be used for I210 1Gbps NICs?
> >
> > The other option is to use 5tuple filters(rte_eth_dev_add_5tuple_filter
> > <
> http://dpdk.org/doc/api/rte__ethdev_8h.html#aaa28adafa65a4f47d4aeceaf1b08381b
> >),
> > however these do not support IPv6 yet.
> >
> >
> > Have people in the community had any luck with configuring L3/L4 hardware
> > filters for the I210 NIC?
> >
> > Thanks,
> > Kam
>
> Flow director filters are not supported for 1G NICs. Sorry.
>
> /Bruce
>


[dpdk-dev] Does I210 NIC support Flow director filters?

2015-01-13 Thread Kamraan Nasim
Hello,

I've been using the DPDK fdir filter APIs for the 82599 NIC (Niantic) and
they work very well.

Was wondering if these could also be used for I210 1Gbps NICs?

The other option is to use 5tuple filters (rte_eth_dev_add_5tuple_filter);
however, these do not support IPv6 yet.


Have people in the community had any luck with configuring L3/L4 hardware
filters for the I210 NIC?

Thanks,
Kam


[dpdk-dev] Symmetric RSS Hashing in DPDK

2014-12-17 Thread Kamraan Nasim
Hi DPDK community,

Any better RSS hash keys out there?

--Kam

On Wed, Dec 17, 2014 at 2:12 PM, Kamraan Nasim 
wrote:
>
> Thank you Jeriel. 0x00 0x01 works and I can get bi-directional symmetry,
> but you are right, it compromises the packet distribution. I am seeing
> vastly different 5-tuples hashed to the same value.
>
> Will let you know if I find a better alternative.
>
> --Kam
>
> On Tue, Dec 16, 2014 at 5:17 PM, Jeriel Smith  wrote:
>>
>> Hi Kamraan,
>>   Even I noticed it with "0x6d5a". Currently, I use a repeating
>> pattern of "0x00 0x01", which helps in getting symmetrical hashing. But
>> the packet spraying is not as good as with "0x6d5a". Please let me know if you
>> find an alternative.
>> Thanks,
>> Jeriel
>>
>>
>>> -- Forwarded message --
>>> From: Kamraan Nasim 
>>> Date: Tue, Dec 16, 2014 at 11:52 AM
>>> Subject: [dpdk-dev] Symmetric RSS Hashing in DPDK
>>> To: dev at dpdk.org
>>> Cc: Steve Noble , Jun Du <
>>> jdu at sidebandnetworks.com>, Ashish Juneja >> sidebandnetworks.com>
>>>
>>> Hello,
>>>
>>> My DPDK application requires bidirectional TCP flows to have the same RSS
>>> hash however default RSS hashing is *asymmetric*.
>>>
>>>
>>> There are posts such as:
>>> http://dpdk.info/ml/archives/dev/2014-February/001460.html
>>>
>>> which point to a symmetric RSS key(0x6d5a). I have tried using it but it
>>> is
>>> still hashing bi-directional flows separately. I am using an 82599 NIC.
>>>
>>> Have others come across this? What other options are available(I presume
>>> S/W hashing)?
>>>
>>> Appreciate any help I can get on this :)
>>>
>>> #define RSS_HASH_KEY_LENGTH 40
>>> static uint8_t hash_key[RSS_HASH_KEY_LENGTH] = {
>>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>>> };
>>> // ethernet rx config
>>> static struct rte_eth_conf port_conf = {
>>> .rxmode = {
>>> .mq_mode= ETH_MQ_RX_RSS,
>>> .split_hdr_size = 0,
>>> .header_split   = 0, /**< Header Split disabled */
>>> .hw_ip_checksum = 1, /**< IP checksum offload enabled */
>>> .hw_vlan_filter = 0, /**< VLAN filtering disabled */
>>> .jumbo_frame= 0, /**< Jumbo Frame Support disabled */
>>> .hw_strip_crc   = 0, /**< CRC stripped by hardware */
>>> },
>>> .rx_adv_conf = {
>>> .rss_conf = {
>>> .rss_key = hash_key,
>>> .rss_hf  = ETH_RSS_PROTO_MASK,
>>> },
>>> },
>>> .txmode = {
>>> .mq_mode = ETH_MQ_TX_NONE,
>>> },
>>> };
>>>
>>>
>>> Thanks,
>>> Kam
>>>
>>


[dpdk-dev] Symmetric RSS Hashing in DPDK

2014-12-17 Thread Kamraan Nasim
Thank you Jeriel. 0x00 0x01 works and I can get bi-directional symmetry, but
you are right, it compromises the packet distribution. I am seeing vastly
different 5-tuples hashed to the same value.
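
(For anyone searching the archives later: the repeating pattern Jeriel means,
sized to the same 40-byte key as in the original post below. Just an
illustration of the pattern; as discussed, it trades distribution quality
for symmetry.)

/* Repeating 0x00 0x01 pattern, same 40-byte length as the 0x6D5A key below.
 * Reported to give src/dst-symmetric hashes, but spreads flows poorly. */
static uint8_t sym_hash_key[RSS_HASH_KEY_LENGTH] = {
    0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01,
    0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01,
    0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01,
    0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01,
    0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01,
};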

Will let you know if I find a better alternative.

--Kam
On Tue, Dec 16, 2014 at 5:17 PM, Jeriel Smith  wrote:
>
> Hi Kamraan,
>   Even I noticed it with "0x6d5a". Currently, I use a repeating
> pattern of "0x00 0x01", which helps in getting symmetrical hashing. But
> the packet spraying is not as good as with "0x6d5a". Please let me know if you
> find an alternative.
> Thanks,
> Jeriel
>
>
>> -- Forwarded message --
>> From: Kamraan Nasim 
>> Date: Tue, Dec 16, 2014 at 11:52 AM
>> Subject: [dpdk-dev] Symmetric RSS Hashing in DPDK
>> To: dev at dpdk.org
>> Cc: Steve Noble , Jun Du <
>> jdu at sidebandnetworks.com>, Ashish Juneja 
>>
>> Hello,
>>
>> My DPDK application requires bidirectional TCP flows to have the same RSS
>> hash however default RSS hashing is *asymmetric*.
>>
>>
>> There are posts such as:
>> http://dpdk.info/ml/archives/dev/2014-February/001460.html
>>
>> which point to a symmetric RSS key(0x6d5a). I have tried using it but it
>> is
>> still hashing bi-directional flows separately. I am using an 82599 NIC.
>>
>> Have others come across this? What other options are available(I presume
>> S/W hashing)?
>>
>> Appreciate any help I can get on this :)
>>
>> #define RSS_HASH_KEY_LENGTH 40
>> static uint8_t hash_key[RSS_HASH_KEY_LENGTH] = {
>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>> 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
>> };
>> // ethernet rx config
>> static struct rte_eth_conf port_conf = {
>> .rxmode = {
>> .mq_mode= ETH_MQ_RX_RSS,
>> .split_hdr_size = 0,
>> .header_split   = 0, /**< Header Split disabled */
>> .hw_ip_checksum = 1, /**< IP checksum offload enabled */
>> .hw_vlan_filter = 0, /**< VLAN filtering disabled */
>> .jumbo_frame= 0, /**< Jumbo Frame Support disabled */
>> .hw_strip_crc   = 0, /**< CRC stripped by hardware */
>> },
>> .rx_adv_conf = {
>> .rss_conf = {
>> .rss_key = hash_key,
>> .rss_hf  = ETH_RSS_PROTO_MASK,
>> },
>> },
>> .txmode = {
>> .mq_mode = ETH_MQ_TX_NONE,
>> },
>> };
>>
>>
>> Thanks,
>> Kam
>>
>


[dpdk-dev] Symmetric RSS Hashing in DPDK

2014-12-16 Thread Kamraan Nasim
Hello,

My DPDK application requires bidirectional TCP flows to have the same RSS
hash; however, the default RSS hashing is *asymmetric*.

There are posts such as:
http://dpdk.info/ml/archives/dev/2014-February/001460.html

which point to a symmetric RSS key (0x6d5a). I have tried using it, but it is
still hashing bi-directional flows separately. I am using an 82599 NIC.

Have others come across this? What other options are available (I presume
S/W hashing)?

Appreciate any help I can get on this :)

#define RSS_HASH_KEY_LENGTH 40
static uint8_t hash_key[RSS_HASH_KEY_LENGTH] = {
    0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
    0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
    0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
    0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
    0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A, 0x6D, 0x5A,
};

// ethernet rx config
static struct rte_eth_conf port_conf = {
    .rxmode = {
        .mq_mode        = ETH_MQ_RX_RSS,
        .split_hdr_size = 0,
        .header_split   = 0, /**< Header Split disabled */
        .hw_ip_checksum = 1, /**< IP checksum offload enabled */
        .hw_vlan_filter = 0, /**< VLAN filtering disabled */
        .jumbo_frame    = 0, /**< Jumbo Frame Support disabled */
        .hw_strip_crc   = 0, /**< CRC stripped by hardware */
    },
    .rx_adv_conf = {
        .rss_conf = {
            .rss_key = hash_key,
            .rss_hf  = ETH_RSS_PROTO_MASK,
        },
    },
    .txmode = {
        .mq_mode = ETH_MQ_TX_NONE,
    },
};
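
For completeness, this is roughly how the port_conf above gets applied
(1.7-era API; queue counts, descriptor counts and thresholds are placeholder
values borrowed from the sample apps):

/* Sketch: apply port_conf to a port with nb_rxq RSS RX queues. */
static const struct rte_eth_rxconf rx_conf = {
    .rx_thresh = { .pthresh = 8, .hthresh = 8, .wthresh = 4 },
};
static const struct rte_eth_txconf tx_conf = {
    .tx_thresh = { .pthresh = 36, .hthresh = 0, .wthresh = 0 },
};

static int
configure_rss_port(uint8_t port, uint16_t nb_rxq, struct rte_mempool *mp)
{
    int ret = rte_eth_dev_configure(port, nb_rxq, 1, &port_conf);
    if (ret < 0)
        return ret;

    for (uint16_t q = 0; q < nb_rxq; q++) {
        ret = rte_eth_rx_queue_setup(port, q, 128,
                                     rte_eth_dev_socket_id(port),
                                     &rx_conf, mp);
        if (ret < 0)
            return ret;
    }

    ret = rte_eth_tx_queue_setup(port, 0, 512,
                                 rte_eth_dev_socket_id(port), &tx_conf);
    if (ret < 0)
        return ret;

    return rte_eth_dev_start(port);
}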


Thanks,
Kam


[dpdk-dev] RTE mempool "used count" steadily goes down to zero despite steady packet throughput

2014-12-01 Thread Kamraan Nasim
Thanks Olivier for the helpful tip; that settles the confusion as to why
the free count grows and the used count shrinks.

I've modified the rte_mempool library to maintain atomic pktmbuf malloc and
free stats; hopefully that will help pinpoint the leak.
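
(Roughly the kind of counting I mean, sketched at the application layer with
made-up wrapper names rather than the actual librte_mempool change:)

#include <rte_atomic.h>
#include <rte_mbuf.h>

/* Hypothetical wrappers that count every mbuf alloc/free, so a path that
 * allocates without ever freeing shows up as a growing delta. */
static rte_atomic64_t mbuf_allocs;
static rte_atomic64_t mbuf_frees;

static inline struct rte_mbuf *
counted_pktmbuf_alloc(struct rte_mempool *mp)
{
        struct rte_mbuf *m = rte_pktmbuf_alloc(mp);
        if (m != NULL)
                rte_atomic64_inc(&mbuf_allocs);
        return m;
}

static inline void
counted_pktmbuf_free(struct rte_mbuf *m)
{
        rte_atomic64_inc(&mbuf_frees);
        rte_pktmbuf_free(m);
}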

Thanks again for your help!

--Kam

On Mon, Dec 1, 2014 at 4:58 AM, Olivier MATZ  wrote:

> Hi Kam,
>
> On 11/28/2014 06:34 PM, Kamraan Nasim wrote:
> > I have ~15Gbps of traffic flowing through two 10GE ports and been
> profiling
> > the rte mempool(or rather the pktmbuf mempool) memory consumption:
> >
> > I have per lcore caching disabled(cache_size is 0)
> >
> > I have noticed that:
> > - Mempool FREE cnt (as given by rte_mempool_free_count()) increases
> > - Mempool USED cnt (as given by rte_mempool_used_count()) decreases and
> > eventually drops to 0. When this happens, the mempool reports itself as EMPTY
> > - rx_nombuf stats for the eth ports start climbing
> > - Valgrind Memcheck does not indicate any obvious leaks in RTE mempool or
> > my application.
> >
> >
> > I was wondering if others have come across this issue?  Or if people here
> > have used ways, besides Valgrind to profile the mempool or the pkt mbuf
> > pool?
>
> First, be careful with rte_mempool_free_count(): it returns the number
> of free entries in the internal ring of the mempool, which actually
> corresponds to the number of allocated objects from the mempool point
> of view. See:
> http://dpdk.org/browse/dpdk/tree/lib/librte_mempool/rte_mempool.h
>
> If you have the number of allocated objects increasing, and the
> statistics rx_nombuf increasing when the mbuf pool is empty, it means
> that you have a mbuf leak in your application. Valgrind won't see it
> since it does not know about mempool alloc/free functions.
>
> Regards,
> Olivier
>


[dpdk-dev] RTE mempool "used count" steadily goes down to zero despite steady packet throughput

2014-11-28 Thread Kamraan Nasim
Hello,

I have ~15Gbps of traffic flowing through two 10GE ports and have been
profiling the rte_mempool (or rather the pktmbuf mempool) memory consumption.

I have per-lcore caching disabled (cache_size is 0).

I have noticed that:
- Mempool FREE cnt (as given by rte_mempool_free_count()) increases
- Mempool USED cnt (as given by rte_mempool_used_count()) decreases and
eventually drops to 0. When this happens, the mempool reports itself as EMPTY
- rx_nombuf stats for the eth ports start climbing
- Valgrind Memcheck does not indicate any obvious leaks in RTE mempool or
my application.


I was wondering if others have come across this issue?  Or if people here
have used ways, besides Valgrind to profile the mempool or the pkt mbuf
pool?

Thanks,
Kam


[dpdk-dev] Load-balancing position field in DPDK load_balancer sample app vs. Hash table

2014-11-13 Thread Kamraan Nasim
Hello,

So I've borrowed some code from the DPDK load_balancer sample application,
specifically the load-balancing position (the 29th byte) to determine which
worker lcore to forward the packet to.

The idea is that flow affinity should be maintained, and all packets from
the same flow would have the same checksum/5-tuple value:

worker_id = packet[load_balancing_field] % n_workers

The question is: how reliable is this load-balancing position? I am tempted
to use hash tables, but I think this position-based mechanism may be faster.
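
For what it's worth, the two options side by side, as untested sketches (the
29th byte lands on the last byte of the IPv4 source address for plain
Ethernet frames, if I'm counting right; the hash variant uses rte_hash_crc
over the 5-tuple):

#include <rte_hash_crc.h>
#include <rte_mbuf.h>

/* Option 1: the load_balancer app's approach -- one packet byte, modulo the
 * number of workers. Cheap, but only as balanced as that byte's distribution
 * happens to be in your traffic. */
static inline unsigned
worker_by_byte(struct rte_mbuf *m, unsigned pos_lb, unsigned n_workers)
{
        uint8_t *data = rte_pktmbuf_mtod(m, uint8_t *);
        return data[pos_lb] % n_workers;
}

/* Option 2: hash the whole 5-tuple with rte_hash_crc. Better spread; for
 * bidirectional affinity the src/dst fields would need to be put in a
 * canonical order first (not shown). */
struct five_tuple {
        uint32_t src_ip, dst_ip;
        uint16_t src_port, dst_port;
        uint8_t  proto;
} __attribute__((packed));

static inline unsigned
worker_by_hash(const struct five_tuple *t, unsigned n_workers)
{
        return rte_hash_crc(t, sizeof(*t), 0) % n_workers;
}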

What has people's experience with this been in general?


--Kam


[dpdk-dev] Panic in rte MEMPOOL__mempool_check_cookies()

2014-11-12 Thread Kamraan Nasim
Hi Bruce,

The issue seems to be triggered by a combination of using the
CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG option and contention in the mempool (a small
mempool size).

I disabled that and the header panics went away.

Thank you for your help!

--Kam

On Fri, Nov 7, 2014 at 4:48 AM, Bruce Richardson  wrote:

> On Thu, Nov 06, 2014 at 04:28:18PM -0500, Kamraan Nasim wrote:
> > Greetings,
> >
> > I have been hitting this issue fairly consistently for the ixgbe driver
> >
> > MEMPOOL: obj=0x7ffeed1f5d00, mempool=0x7ffeecb69bc0,
> cookie=badbadbadadd2e55
> > PANIC in __mempool_check_cookies():
> > MEMPOOL: bad header cookie (get)
> >
> > It seems to be a corruption in the mempool bound to my ixgbe port. What I
> > have observed is that this ONLY happens if I initialize dpdk(i.e. start
> > dpdk application) AFTER traffic is already flowing in through the port.
> If
> > I initialize dpdk and bind BEFORE I start traffic then things seem to
> work
> > fine.
> >
> > Any clues on why this might be happening?
> >
> > A bit stumped, so would really appreciate all the help I can get on this
> > one.
> >
>
> Hi.
> Can you confirm if your application works ok with the mempool debug options
> disabled? The mempool debug options are not often used because they slow
> things
> down so it's possible the bug could be in the cookie checking itself.
>
> /Bruce
>
> > Thanks,
> > Kam
> >
> >
> > (bt for your reference)
> >
> > #2  0x00408cc6 in __rte_panic (funcname=0x571100
> > "__mempool_check_cookies", format=
> > 0x568fb0 "MEMPOOL: bad header cookie (get)\n%.0s")
> > at
> >
> /b/knasim/bandwagon/sbn/src/share/dpdk/lib/librte_eal/linuxapp/eal/eal_debug.c:83
> > #3  0x004af027 in __mempool_check_cookies (rxq= > out>)
> > at
> >
> /b/knasim/bandwagon/sbn/src/share/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:357
> > #4  rte_mempool_get_bulk (rxq=)
> > at
> >
> /b/knasim/bandwagon/sbn/src/share/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1094
> > #5  ixgbe_rx_alloc_bufs (rxq=)
> > at
> >
> /b/knasim/bandwagon/sbn/src/share/dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx.c:997
> > #6  0x004afce9 in rx_recv_pkts (rx_queue=0x7ffeec8edbc0,
> > rx_pkts=0x900410,
> > nb_pkts=)
> > at
> >
> /b/knasim/bandwagon/sbn/src/share/dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx.c:1074
> > #7  ixgbe_recv_pkts_bulk_alloc (rx_queue=0x7ffeec8edbc0,
> rx_pkts=0x900410,
> > nb_pkts=)
> > at
> >
> /b/knasim/bandwagon/sbn/src/share/dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx.c:1124
> > #8  0x00520d36 in rte_eth_rx_burst (lp=0x900340, n_workers=14,
> > bsz_rd=, bsz_wr=
> > 144, pos_lb=0 '\000') at /usr/lib/dpdk/include/rte_ethdev.h:2368
>


[dpdk-dev] Panic in rte MEMPOOL__mempool_check_cookies()

2014-11-06 Thread Kamraan Nasim
Greetings,

I have been hitting this issue fairly consistently for the ixgbe driver

MEMPOOL: obj=0x7ffeed1f5d00, mempool=0x7ffeecb69bc0, cookie=badbadbadadd2e55
PANIC in __mempool_check_cookies():
MEMPOOL: bad header cookie (get)

It seems to be a corruption in the mempool bound to my ixgbe port. What I
have observed is that this ONLY happens if I initialize DPDK (i.e. start the
DPDK application) AFTER traffic is already flowing in through the port. If
I initialize DPDK and bind BEFORE I start traffic, then things seem to work
fine.

Any clues on why this might be happening?

A bit stumped, so would really appreciate all the help I can get on this
one.

Thanks,
Kam


(bt for your reference)

#2  0x00408cc6 in __rte_panic (funcname=0x571100
"__mempool_check_cookies", format=
0x568fb0 "MEMPOOL: bad header cookie (get)\n%.0s")
at
/b/knasim/bandwagon/sbn/src/share/dpdk/lib/librte_eal/linuxapp/eal/eal_debug.c:83
#3  0x004af027 in __mempool_check_cookies (rxq=)
at
/b/knasim/bandwagon/sbn/src/share/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:357
#4  rte_mempool_get_bulk (rxq=)
at
/b/knasim/bandwagon/sbn/src/share/dpdk/x86_64-native-linuxapp-gcc/include/rte_mempool.h:1094
#5  ixgbe_rx_alloc_bufs (rxq=)
at
/b/knasim/bandwagon/sbn/src/share/dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx.c:997
#6  0x004afce9 in rx_recv_pkts (rx_queue=0x7ffeec8edbc0,
rx_pkts=0x900410,
nb_pkts=)
at
/b/knasim/bandwagon/sbn/src/share/dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx.c:1074
#7  ixgbe_recv_pkts_bulk_alloc (rx_queue=0x7ffeec8edbc0, rx_pkts=0x900410,
nb_pkts=)
at
/b/knasim/bandwagon/sbn/src/share/dpdk/lib/librte_pmd_ixgbe/ixgbe_rxtx.c:1124
#8  0x00520d36 in rte_eth_rx_burst (lp=0x900340, n_workers=14,
bsz_rd=, bsz_wr=
144, pos_lb=0 '\000') at /usr/lib/dpdk/include/rte_ethdev.h:2368


[dpdk-dev] IPC/message passing between DPDK application and non-DPDK userspace application

2014-10-20 Thread Kamraan Nasim
Thank you Matthew.

On closer inspection, I see that DPDK is simply using pthreads and
pthread_get/set_affinity() to bind to particular cores (as specified in the
coremask), so no real magic there, as you said...

Looking at the load_balancer app for inspiration.
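
(In case it helps anyone else splitting RX this way: the ring hand-off
Matthew mentions below would look roughly like this. Names are made up, and
it assumes a single DPDK RX lcore feeding a single consumer pthread:)

#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_ring.h>

#define BURST 32

/* Single-producer/single-consumer ring: one DPDK RX lcore, one app pthread. */
static struct rte_ring *rx_to_app;

static void
init_handoff_ring(void)
{
        rx_to_app = rte_ring_create("rx_to_app", 4096, rte_socket_id(),
                                    RING_F_SP_ENQ | RING_F_SC_DEQ);
}

/* DPDK lcore: pull from the NIC and hand mbuf pointers to the app thread. */
static void
rx_loop(uint8_t port, uint16_t queue)
{
        struct rte_mbuf *pkts[BURST];

        for (;;) {
                unsigned n = rte_eth_rx_burst(port, queue, pkts, BURST);
                unsigned q = rte_ring_sp_enqueue_burst(rx_to_app,
                                                       (void **)pkts, n);
                while (q < n)                   /* ring full: drop the rest */
                        rte_pktmbuf_free(pkts[q++]);
        }
}

/* Ordinary pthread in the DPI daemon: drain the ring, process, then free. */
static unsigned
app_poll(struct rte_mbuf **pkts, unsigned max)
{
        return rte_ring_sc_dequeue_burst(rx_to_app, (void **)pkts, max);
}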

--Kam

On Sat, Oct 18, 2014 at 1:38 PM, Matthew Hall  wrote:

> On Fri, Oct 17, 2014 at 10:14:50PM -0400, Kamraan Nasim wrote:
> > I have a DPI daemon running in userspace which uses libpcap for packet RX
> > that I would like to replace with DPDK ethernet PMD. However it is not
> > feasible to convert the entire application to run within the DPDK
> framework
> > which is why I would like to split packet RX to use DPDK and have my
> > applications higher order functions continue using Linux pthread.
>
> The DPDK's lcore threads are just Linux pthreads bound onto specific cores.
> They aren't doing any secret stuff there really.
>
> So you could configure the DPDK to do stuff on some cores, and do your own
> stuff on other cores using the DPDK's "coremask" setting to say which
> cores it
> should launch itself on.
>
> > Is it possible to exchange data between DPDK processes and regular
> > userspace processes via some shared mem or IPC mechanism? I am not sure
> if
> > the SW ring buffers that DPDK provides can be used in this context?
>
> You can put extra threads in DPDK processes... the DPDK timer does this for
> example. Or you can pass stuff around... they have some example apps which
> have a master app and slave apps as separate Linux processes which can
> exchange data using the rings.
>
> > Please let me know if others have come across this. Any help from the
> DPDK
> > community would be greatly appreciated :)
>
> Yes, check through the various sample apps and see if there's something you
> can borrow from them. Good luck!
>
> Matthew.
>


[dpdk-dev] IPC/message passing between DPDK application and non-DPDK userspace application

2014-10-17 Thread Kamraan Nasim
Greetings,

I am still a DPDK newbie so not really sure how to go about doing this...

I have a DPI daemon running in userspace which uses libpcap for packet RX
that I would like to replace with a DPDK Ethernet PMD. However, it is not
feasible to convert the entire application to run within the DPDK framework,
which is why I would like to split packet RX to use DPDK and have my
application's higher-order functions continue using Linux pthreads.

Is it possible to exchange data between DPDK processes and regular
userspace processes via some shared memory or IPC mechanism? I am not sure if
the SW ring buffers that DPDK provides can be used in this context.

Please let me know if others have come across this. Any help from the DPDK
community would be greatly appreciated :)

--Kam