[dpdk-dev] Polling too often at lower packet rates?

2015-04-09 Thread Aaron Campbell
Hi Stephen,

Thanks, that was an informative talk.  In this case, are you referring to your 
comments about the thermal budget?

That's definitely interesting, but there must be more to it than that.  Again, 
if I loop over all 6 ports (i.e., continue to keep the CPU busy), it works 
around the problem.

I agree that adaptive polling makes sense and will look into it, but I'm still 
open to any further ideas on what is going on here.
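For reference, here is a minimal sketch of the kind of adaptive loop I have in 
mind (the thresholds and the rte_delay_us() back-off are only illustrative; the 
l3fwd-power example itself is more elaborate and also scales CPU frequency):

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_cycles.h>

#define BURST_SIZE 32
#define IDLE_THRESHOLD 100   /* consecutive empty polls before backing off */
#define BACKOFF_US 10        /* illustrative back-off; tune per workload */

static void
rx_loop(uint8_t port_id)
{
	struct rte_mbuf *bufs[BURST_SIZE];
	unsigned idle = 0;
	uint16_t nb_rx, i;

	for (;;) {
		nb_rx = rte_eth_rx_burst(port_id, 0, bufs, BURST_SIZE);
		if (nb_rx == 0) {
			/* NIC is idle: throttle the poll rate instead of
			 * hammering the RX path with empty reads. */
			if (++idle > IDLE_THRESHOLD)
				rte_delay_us(BACKOFF_US);
			continue;
		}
		idle = 0;
		for (i = 0; i < nb_rx; i++)
			rte_pktmbuf_free(bufs[i]);   /* placeholder "processing" */
	}
}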

-Aaron

> On Apr 8, 2015, at 3:00 PM, Stephen Hemminger wrote:
> 
> We use adaptive polling loop similar to l3fwd-power example.
> See:
>   
> 
> http://video.fosdem.org/2015/devroom-network_management_and_sdn/
> 
> On Wed, Apr 8, 2015 at 9:35 AM, Aaron Campbell <aaron at arbor.net> wrote:
> Hi,
> 
> I have a machine with 6 DPDK ports (4 igb, 2 ixgbe), with 1.23Mpps traffic 
> offered to only one of the 10G ports (the other 5 are unused).  I also have a 
> program with a pretty standard looking DPDK receive loop, where it calls 
> rte_eth_rx_burst() for each configured port.  If I configure the loop to read 
> from all 6 ports, it can read the 1.23Mpps rate with no drops.  If I 
> configure the loop to poll only 1 port (the ixgbe receiving the traffic), I 
> lose about 1/3rd of the packets (i.e., the NIC drops ~400Kpps).
> 
> Another data point is that if I configure the loop to read from 3 out of the 
> 6 ports, the drop rate is reduced to less than half (i.e., the NIC is only 
> dropping ~190Kpps now).  So it seems that in this test, throughput improves 
> by adding NICs, not removing them, which is counter-intuitive.  Again, I get 
> no drops when polling all 6 ports.  Note, the burst size is 32.
> 
> I did find a reference to a similar issue in a recent paper 
> (http://www.net.in.tum.de/fileadmin/bibtex/publications/papers/ICN2015.pdf), 
> Section III, which states:
> 
> "The DPDK L2FWD application initially only managed to forward 13.8 Mpps in 
> the single direction test at the maximum CPU frequency, a similar result can 
> be found in [11]. Reducing the CPU frequency increased the throughput to the 
> expected value of 14.88 Mpps. Our investigation of this anomaly revealed that 
> the lack of any processing combined with the fast CPU caused DPDK to poll the 
> NIC too often. DPDK does not use interrupts, it utilizes a busy wait loop 
> that polls the NIC until at least one packet is returned. This resulted in a 
> high poll rate which affected the throughput. We limited the poll rate to 
> 500,000 poll operations per second (i.e., a batch size of about 30 packets) 
> and achieved line rate in the unidirectional test with all frequencies. This 
> effect was only observed with the X520 NIC, tests with X540 NICs did not show 
> this anomaly."
> 
> Another reference, from this mailing list last year 
> (http://wiki.dpdk.org/ml/archives/dev/2014-January/001169.html):
> 
> "I suggest you to check average burst sizes on receive queues. Looks like I 
> stumbled upon a similar issue several times. If you are calling 
> rte_eth_rx_burst too frequently, NIC begins losing packets no matter how many 
> CPU horse power you have (more you have, more it loses, actually). In my case 
> this situation occured when average burst size is less than 20 packets or so. 
> I'm not sure what's the reason for this behavior, but I observed it on 
> several applications on Intel 82599 10Gb cards."
> 
> So I'm wondering if anyone can explain at a lower level what happens when you 
> poll "too often", and if there are any practical workarounds.  We're using 
> this same program and DPDK version to process 10G line-rate in other 
> scenarios, so I'm confident that the overall packet capture architecture is 
> sound.
> 
> -Aaron
> 



[dpdk-dev] [PATCH v2] eal: add option --master-lcore

2014-11-05 Thread Aaron Campbell
Acked-by: Aaron Campbell <aaron at arbor.net>

Minor comments inline below; I don't need to see another patch.

Thanks,
-Aaron

> On Nov 4, 2014, at 5:40 PM, Thomas Monjalon wrote:
> 
> + RTE_LOG(ERR, EAL, "please specify the master lcore id"
> + "after specifying the coremask\n");

Missing a space between "id" and "after".

> + RTE_LOG(ERR, EAL, "please specify the master lcore id"
> + "after specifying the coremask\n");

Same here in the Linux version of the code.


[dpdk-dev] [PATCH] eal/linuxapp: Add parameter to specify master lcore id

2014-11-05 Thread Aaron Campbell
> On Nov 4, 2014, at 3:00 PM, Thomas Monjalon wrote:
> 
> 2014-11-03 13:02, Aaron Campbell:
>>> On Jul 8, 2014, at 5:28 AM, Simon Kuenzer wrote:
>>> 
>>> +		else if (!strcmp(lgopts[option_index].name, OPT_MASTER_LCORE)) {
>>> +			if (!coremask_ok) {
>>> +				RTE_LOG(ERR, EAL, "please specify the master "
>>> +					"lcore id after specifying "
>>> +					"the coremask\n");
>>> +				eal_usage(prgname);
>>> +				return -1;
>>> +			}
>> 
>> 
>> Hi Simon,
>> 
>> I think that forcing a particular command line order is not that clean.
>> It might be better to remove the cfg->master_lcore setting from
>> eal_parse_coremask(), and defer the selection of the master lcore until
>> all of the command-line options have been parsed.  If --master-lcore was
>> specified, save the value and use that, otherwise
>> rte_get_next_lcore(-1, 0, 0) can return the first bit set in the coremask.
> 
> It's not sufficient: eal_parse_master_lcore() requires cfg->lcore_role
> to be set. There is a real dependency between these 2 options.
> I'm going to submit a v2. Feel free to improve it with another patch.

I was nit-picking; although it might be nice, if the new option is given, to 
verify that the specified lcore is actually in the coremask.  I will ack v2 
though, and this can be improved some other time.
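Something along the lines of the sketch below is what I had in mind (field and 
variable names are assumed here, not taken from the v2 patch):

/* Rough sketch only: reject a --master-lcore value that is not enabled
 * in the coremask, assuming the EAL's cfg->lcore_role[] array. */
if (master_lcore >= RTE_MAX_LCORE ||
		cfg->lcore_role[master_lcore] != ROLE_RTE) {
	RTE_LOG(ERR, EAL, "master lcore %u is not enabled in the coremask\n",
		master_lcore);
	eal_usage(prgname);
	return -1;
}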

-Aaron


[dpdk-dev] [PATCH] eal/linuxapp: Add parameter to specify master lcore id

2014-11-03 Thread Aaron Campbell
Hi Simon,

Thanks for the patch; this will be useful for us.  I responded separately to 
your original post with one suggestion.

Our application currently assumes that DPDK will assign the first bit set in 
the coremask to the master lcore.  As far as I can tell, this is hard-coded as 
of 1.7.1.  But we would like the ability for our application to specify any bit 
from the coremask to serve as the master lcore.

I don't see any compatibility issues with this.  Existing applications should 
behave as before.

Thomas, could this be accepted for the 1.8 release?  Or will that only happen 
if the BSD side can be patched as well?

-Aaron

> On Jul 23, 2014, at 9:10 AM, Simon Kuenzer  wrote:
> 
> Hi all,
> 
> the only issue I could imagine is that current DPDK applications are
> utilizing the implicit assumption that the master lcore is always set to
> the first available lcore. I would consider this as a "bug" in the
> application because it sets up its worker threads not "properly".
> 
> However, as far as I could check, the DPDK framework seems to cope with
> it correctly.
> It would be nice if somebody else could confirm my statement.
> 
> Thanks,
> 
> Simon
> 
> On 23.07.2014 10:53, Hiroshi Shimamoto wrote:
>> Hi,
>> 
>>> Subject: Re: [dpdk-dev] [PATCH] eal/linuxapp: Add parameter to specify 
>>> master lcore id
>>> 
>>> Hi Hiroshi,
>>> 
>>> 2014-07-22 23:40, Hiroshi Shimamoto:
 does anyone have interest in this functionality?
 
 I think this is important and useful.
 Since we should care about core assignment to get high performance
 and the master lcore thread is special in DPDK, we will want to
 assign the master to the target core.
 For example, with hyperthreading I'd like to make a pair of packet
 processing threads into one physical core and separate the master
 thread which does some management.
>>> 
>>> Thank you for showing your interest.
>>> Does it mean you carefully reviewed this patch? In this case, I'd appreciate
>>> a note "Reviewed-by:".
>> 
>> Not yet deeply, wait a bit, we're testing this patch in our application.
>> Will report if it works fine.
>> 
>> By the way, we should add the same code into the BSD code, right?
>> 
>> thanks,
>> Hiroshi
>> 
>>> 
>>> Thanks
>>> --
>>> Thomas
> 



[dpdk-dev] [PATCH] eal/linuxapp: Add parameter to specify master lcore id

2014-11-03 Thread Aaron Campbell
> On Jul 8, 2014, at 5:28 AM, Simon Kuenzer  wrote:
> 
> +		else if (!strcmp(lgopts[option_index].name, OPT_MASTER_LCORE)) {
> +			if (!coremask_ok) {
> +				RTE_LOG(ERR, EAL, "please specify the master "
> +					"lcore id after specifying "
> +					"the coremask\n");
> +				eal_usage(prgname);
> +				return -1;
> +			}


Hi Simon,

I think that forcing a particular command line order is not that clean.  It 
might be better to remove the cfg->master_lcore setting from 
eal_parse_coremask(), and defer the selection of the master lcore until all of 
the command-line options have been parsed.  If --master-lcore was specified, 
save the value and use that, otherwise rte_get_next_lcore(-1, 0, 0) can return 
the first bit set in the coremask.
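To illustrate the deferral, a rough sketch (variable names are made up and not 
from any actual patch):

/* Sketch only: remember whether --master-lcore was seen during option
 * parsing, then resolve the master lcore once parsing is complete. */
static int master_lcore_parsed;     /* set when --master-lcore is seen */
static unsigned master_lcore_arg;   /* its value, if given */

/* ... after all command-line options have been parsed ... */
if (master_lcore_parsed)
	cfg->master_lcore = master_lcore_arg;
else
	/* default: first lcore enabled in the coremask */
	cfg->master_lcore = rte_get_next_lcore(-1, 0, 0);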

-Aaron


[dpdk-dev] [PATCH 1/3] stringfns: remove rte_snprintf

2014-06-26 Thread Aaron Campbell
On Jun 26, 2014, at 12:09 PM, Richardson, Bruce wrote:

>> I agree we should try to use the "deprecated" attribute when possible.
>> So application porting effort will be smoother.
>> 
>> But in this case, there is something different: as Stephen wrote, 
>> rte_snprintf
>> is useless. It's useless inside the DPDK so it's even more useless for user
>> applications.
>> As it's really useless, it has no sense to keep it as deprecated.
>> Please, let's simply remove it.
>> 
> 
> The reason to keep it as deprecated is so that those customers who don't want 
> to do a huge amount of search-replace immediately can get things working 
> again temporarily using -Wno-deprecated. It provides a simple temporary 
> fallback cushion, and then we can completely remove the function later. 
> So, I'd like to see us remove all our usage of the function internally in 
> 1.7, along with marking as deprecated, and then completely remove in 1.8, 
> (i.e. in a week's time or so) :-)

As a DPDK user, I'd vote to kill it now.  I doubt it is widely used in any 
external applications; my guess is that any such usage would mostly come from 
copy/pasting the sample code.

-Aaron


[dpdk-dev] [PATCH] eal: fix invalid memory read as reported by valgrind

2014-06-26 Thread Aaron Campbell
==29880== Invalid read of size 1
==29880==    at 0x56FF9A5: cpu_socket_id (eal_lcore.c:101)
==29880==    by 0x56FFAE9: rte_eal_cpu_init (eal_lcore.c:168)
==29880==    by 0x56F944A: rte_eal_init (eal.c:975)

The problem is that endptr points to memory allocated underneath the DIR
handle, which has already been freed.  So move the closedir() call lower.

Signed-off-by: Aaron Campbell 
---
 lib/librte_eal/linuxapp/eal/eal_lcore.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_lcore.c 
b/lib/librte_eal/linuxapp/eal/eal_lcore.c
index cc9b900..fd11142 100644
--- a/lib/librte_eal/linuxapp/eal/eal_lcore.c
+++ b/lib/librte_eal/linuxapp/eal/eal_lcore.c
@@ -77,7 +77,7 @@ cpu_socket_id(unsigned lcore_id)
const char node_prefix[] = "node";
const size_t prefix_len = sizeof(node_prefix) - 1;
char path[PATH_MAX];
-   DIR *d;
+   DIR *d = NULL;
unsigned long id = 0;
struct dirent *e;
char *endptr = NULL;
@@ -97,7 +97,6 @@ cpu_socket_id(unsigned lcore_id)
break;
}
}
-   closedir(d);
if (endptr == NULL || *endptr!='\0' || endptr == e->d_name+prefix_len) {
RTE_LOG(WARNING, EAL, "Cannot read numa node link "
"for lcore %u - using physical package id 
instead\n",
@@ -110,9 +109,12 @@ cpu_socket_id(unsigned lcore_id)
if (eal_parse_sysfs_value(path, &id) != 0)
goto err;
}
+   closedir(d);
return (unsigned)id;

 err:
+   if (d)
+   closedir(d);
RTE_LOG(ERR, EAL, "Error getting NUMA socket information from %s "
"for lcore %u - assuming NUMA socket 0\n", SYS_CPU_DIR, 
lcore_id);
return 0;
-- 
1.8.3.2
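For context, a minimal non-DPDK illustration of the hazard being fixed: pointers 
derived from readdir() results are only valid while the DIR stream is open, so 
dereferencing endptr after closedir() reads freed memory (the path and parsing 
below are only illustrative):

#include <dirent.h>
#include <stdlib.h>

static unsigned long
broken_read(void)
{
	DIR *d = opendir("/sys/devices/system/node");
	struct dirent *e;
	char *endptr = NULL;
	unsigned long id = 0;

	if (d == NULL)
		return 0;
	e = readdir(d);                  /* e points into memory owned by d */
	if (e != NULL)
		id = strtoul(e->d_name, &endptr, 0);
	closedir(d);                     /* frees the buffer behind e and endptr */
	if (endptr == NULL || *endptr != '\0')   /* invalid read after free */
		id = 0;
	return id;
}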



[dpdk-dev] [PATCH] eal: clear errno before calling strtoull() to parse base_virtaddr

2014-06-19 Thread Aaron Campbell
Yes, that was extremely bizarre.  I didn't check the whole project, but at 
least eal_parse_socket_mem() in the same file was already correct.  Pablo, you 
can add me to Acked-by, obviously. :-)
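For anyone auditing other call sites, the general pattern looks roughly like 
this (a generic sketch, not any specific DPDK function):

#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, only to illustrate the errno-reset pattern. */
static int
parse_hex_u64(const char *arg, uint64_t *out)
{
	char *end;
	uint64_t val;

	errno = 0;                      /* clear stale errno before strtoull() */
	val = strtoull(arg, &end, 16);
	if (errno != 0 || end == arg || *end != '\0')
		return -1;              /* overflow, empty string, or trailing junk */
	*out = val;
	return 0;
}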

-Aaron

On Jun 19, 2014, at 12:46 PM, Burakov, Anatoly wrote:

> Hi Aaron,
> 
> It seems that Pablo De Lara has beaten you to it by a few minutes :-) Are there 
> any other places this could potentially happen?
> 
>> Must reset errno to zero before calling strtoull(), else on success it could 
>> be
>> any arbitrary value from past errors.
>> 
>> Signed-off-by: Aaron Campbell 
>> ---
>> lib/librte_eal/linuxapp/eal/eal.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
>> b/lib/librte_eal/linuxapp/eal/eal.c
>> index 6994303..d204387 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -562,6 +562,7 @@ eal_parse_base_virtaddr(const char *arg)
>>  char *end;
>>  uint64_t addr;
>> 
>> +errno = 0;
>> addr = strtoull(arg, &end, 16);
>> 
>>  /* check for errors */
>> --
>> 1.8.3.2
> 



[dpdk-dev] [PATCH] eal: clear errno before calling strtoull() to parse base_virtaddr

2014-06-19 Thread Aaron Campbell
Must reset errno to zero before calling strtoull(), else on success it
could be any arbitrary value from past errors.

Signed-off-by: Aaron Campbell 
---
 lib/librte_eal/linuxapp/eal/eal.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 6994303..d204387 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -562,6 +562,7 @@ eal_parse_base_virtaddr(const char *arg)
char *end;
uint64_t addr;

+   errno = 0;
addr = strtoull(arg, &end, 16);

/* check for errors */
-- 
1.8.3.2