Scott, be sure to try running turbostat on both the old and new servers, as I suspect the ~50us wake latency of the C6 power state may be causing the drops.
The new kernels enable deeper sleep. You can also try a BIOS setting to disable the deep sleep states and leave only C1 enabled. There was a program called cpudmalatency.c or something that may be able to help you keep the system more awake.

-- Jesse Brandeburg

On Dec 19, 2013, at 2:57 PM, "Scott Silverman" <ssilver...@simplexinvestments.com> wrote:

> Alex,
>
> Thanks for the response, I'll attempt to reproduce with a consistent OS
> release and re-open the discussion at that time.
>
> Thanks,
>
> Scott Silverman
>
>
> On Thu, Dec 19, 2013 at 4:52 PM, Alexander Duyck <
> alexander.h.du...@intel.com> wrote:
>
>> On 12/19/2013 10:31 AM, Scott Silverman wrote:
>>> We have three generations of servers running nearly identical software.
>>> Each subscribes to a variety of multicast groups taking in, on average,
>>> 200-300Mbps of data.
>>>
>>> The oldest generation (2x Xeon X5670, SuperMicro 6016T-NTRF, Intel
>>> X520-DA2) has no issues handling all the incoming data. (zero
>>> rx_no_dma_resources)
>>>
>>> The middle generation (2x Xeon E5-2670, SuperMicro 6017R-WRF, Intel
>>> X520-DA2) and the newest generation (2x Xeon E5-2680v2, SuperMicro
>>> 6017R-WRF, Intel X520-DAs) both have issues handling the incoming data
>>> (indicated by an increasing rx_no_dma_resources counter).
>>>
>>> The oldest generation of servers is running CentOS5 on a newer kernel
>>> (3.4.41); the others are running CentOS6 on the exact same kernel
>>> (3.4.41).
>>>
>>> The oldest generation is using ixgbe 3.13.10, the middle generation
>>> 3.13.10, and the newest are on 3.18.7. All machines are using the
>>> set_irq_affinity script to spread queue interrupts across available
>>> cores. All machines are configured with C1 as the maximum C-state, and
>>> CPU clocks are all steady between 3-3.2GHz depending on the processor
>>> model.
>>>
>>> On the middle/newer boxes, lowering the number of RSS queues manually (i.e.
>>> RSS=8,8) seems to help reduce the amount of dropping, but it does not
>>> eliminate it.
>>>
>>> The ring buffer drops do not seem to correlate with data rates, either.
>>> It does not seem that it is an issue of keeping up. In addition, the
>>> boxes are not under particularly heavy load. The CPU usage is generally
>>> between 3-5% and rarely spikes much higher than 15%. The load average is
>>> generally around 2.
>>>
>>> I am at a loss for what else to try to diagnose and/or fix this. In my
>>> mind, the newer boxes should have no problem at all keeping up with the
>>> older ones.
>>>
>>> I've attached the output of ethtool -S, one from each generation of
>>> server.
>>>
>>> Thanks,
>>>
>>> Scott Silverman
>>
>> Scott,
>>
>> Have you tried running the CentOS5 w/ newer kernel on any of your newer
>> servers, or CentOS6 on one of the older ones? I ask because this would
>> seem to be one of the most significant differences between the servers
>> that are not dropping frames and those that are. I suspect you may have
>> something in the CentOS6 configuration that is responsible for the drops
>> that is not present in the CentOS5 configuration. We really need to
>> eliminate any OS-based issues before we can even hope to start chasing
>> this issue down into the driver and/or device configuration.
>>
>> Thanks,
>>
>> Alex

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired