Scott, be sure to try running turbostat on both the old and new servers, as I suspect the ~50us wake latency of the C6 power state may be causing the drops.
The new kernels enable deeper sleep. You can also try a BIOS setting to disable the deep sleep states and leave only C1 enabled. There was a program called cpudmalatency.c or something that may be able to help you keep the system more awake.

-- Jesse Brandeburg

On Dec 19, 2013, at 2:57 PM, "Scott Silverman" <ssilver...@simplexinvestments.com> wrote:

> Alex,
>
> Thanks for the response, I'll attempt to reproduce with a consistent OS
> release and re-open the discussion at that time.
>
> Thanks,
>
> Scott Silverman
>
>
> On Thu, Dec 19, 2013 at 4:52 PM, Alexander Duyck <
> alexander.h.du...@intel.com> wrote:
>
>> On 12/19/2013 10:31 AM, Scott Silverman wrote:
>>> We have three generations of servers running nearly identical software.
>>> Each subscribes to a variety of multicast groups taking in, on average,
>>> 200-300Mbps of data.
>>>
>>> The oldest generation (2x Xeon X5670, SuperMicro 6016T-NTRF, Intel
>>> X520-DA2) has no issues handling all the incoming data. (zero
>>> rx_no_dma_resources)
>>>
>>> The middle generation (2x Xeon E5-2670, SuperMicro 6017R-WRF, Intel
>>> X520-DA2) and the newest generation (2x Xeon E5-2680v2, SuperMicro
>>> 6017R-WRF, Intel X520-DAs) both have issues handling the incoming data
>>> (indicated by an increasing rx_no_dma_resources counter).
>>>
>>> The oldest generation of servers is running CentOS5 on a newer kernel
>>> (3.4.41); the others are running CentOS6 on the exact same kernel
>>> (3.4.41).
>>>
>>> The oldest generation is using ixgbe 3.13.10, the middle generation
>>> 3.13.10, and the newest are on 3.18.7. All machines are using the
>>> set_irq_affinity script to spread queue interrupts across available
>>> cores. All machines are configured with C1 as the maximum C-state, and
>>> CPU clocks are all steady between 3-3.2GHz depending on the processor
>>> model.
>>>
>>> On the middle/newer boxes, lowering the number of RSS queues manually (i.e.
>>> RSS=8,8) seems to help reduce the amount of dropping, but it does not
>>> eliminate it.
>>>
>>> The ring buffer drops do not seem to correlate with data rates, either.
>>> It does not seem that it is an issue of keeping up. In addition, the
>>> boxes are not under particularly heavy load. The CPU usage is generally
>>> between 3-5% and rarely spikes much higher than 15%. The load average is
>>> generally around 2.
>>>
>>> I am at a loss for what else to try to diagnose and/or fix this. In my
>>> mind, the newer boxes should have no problem at all keeping up with the
>>> older ones.
>>>
>>> I've attached the output of ethtool -S, one from each generation of
>>> server.
>>>
>>> Thanks,
>>>
>>> Scott Silverman
>>
>> Scott,
>>
>> Have you tried running the CentOS5 w/ newer kernel on any of your newer
>> servers, or CentOS6 on one of the older ones? I ask because this would
>> seem to be one of the most significant differences between the servers
>> that are not dropping frames and those that are. I suspect you may have
>> something in the CentOS6 configuration that is responsible for the drops
>> that is not present in the CentOS5 configuration. We really need to
>> eliminate any OS-based issues before we can even hope to start chasing
>> this issue down into the driver and/or device configuration.
>>
>> Thanks,
>>
>> Alex

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired