Hi Todd,

I don’t think it’s related to queues/settings in the OS per se. These machines 
also use a shared-mode PHY for BMC (IPMI) access, and when we get packet loss 
in the OS driver, we see packet loss on the BMC side as well.

What we’ve discovered is that running “ethtool -s eth0 autoneg on” fixes the 
issue on both sides. Autonegotiation *is* already enabled in the NIC before we 
do this, though, so it seems the “autoneg on” operation just restarts 
something in the PHY.

Weird.
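
A minimal Python sketch of how the check and workaround described above could 
be scripted. It is an illustration only, not something from the driver or from 
our setup: the function names are hypothetical, and the only ethtool 
invocations it relies on are the two mentioned in this thread ("ethtool -S" 
for counters and "ethtool -s <iface> autoneg on"). It assumes the counters of 
interest end in "err" or "errors", as in the snapshot quoted below.

```python
import re
import subprocess

# Assumption: error counters end in "err" or "errors",
# e.g. rx_errors, rx_length_errors, rx_queue_6_csum_err.
ERROR_COUNTER = re.compile(r"err(ors)?$")


def parse_ethtool_stats(text):
    """Turn `ethtool -S` output ("name: value" lines) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*([\w.-]+):\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def error_deltas(before, after):
    """Return the error counters that increased between two snapshots."""
    deltas = {}
    for name, value in after.items():
        increase = value - before.get(name, 0)
        if ERROR_COUNTER.search(name) and increase > 0:
            deltas[name] = increase
    return deltas


def read_stats(iface):
    """Snapshot the NIC counters by running `ethtool -S <iface>`."""
    out = subprocess.run(["ethtool", "-S", iface], check=True,
                         capture_output=True, text=True).stdout
    return parse_ethtool_stats(out)


def reapply_autoneg(iface):
    """Re-issue the workaround from this thread (needs root)."""
    subprocess.run(["ethtool", "-s", iface, "autoneg", "on"], check=True)
```

One possible use: take two read_stats("eth0") snapshots some seconds apart and 
call reapply_autoneg("eth0") only if error_deltas() returns a non-empty dict.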

Cheers,
--
Steffen Persvold
Chief Architect NumaChip, Numascale AS
Tel: +47 23 16 71 88  Fax: +47 23 16 71 80 Skype: spersvold

> On 19 Dec 2014, at 18:17, Fujinaka, Todd <todd.fujin...@intel.com> wrote:
> 
> Before you start, though, do the check for settings and number of queues 
> being used. The issue may be as simple as that, and that shouldn't take more 
> than a few ethtool commands.
> 
> Todd Fujinaka
> Software Application Engineer
> Networking Division (ND)
> Intel Corporation
> todd.fujin...@intel.com
> (503) 712-4565
> 
> -----Original Message-----
> From: Steffen Persvold [mailto:s...@numascale.com] 
> Sent: Friday, December 19, 2014 9:14 AM
> To: Fujinaka, Todd
> Cc: e1000-devel@lists.sourceforge.net; Daniel J Blueman
> Subject: Re: [E1000-devel] Sporadic packet loss observed with newer in-kernel 
> drivers (5.2.15-k)
> 
> Hi Todd,
> 
> Thanks for responding so quickly. It’s probably easier to bisect the changes 
> to igb between the 3.10 kernel in-tree version (5.0.3-k) and the 3.14 kernel 
> in-tree version (5.0.5-k), rather than diffing on out-of-tree 5.2.15 and 
> in-kernel 5.2.15-k (I tried, the changes are huge, mostly because out-of-tree 
> code has a lot of compatibility stuff in it naturally).
> 
> I’ll let you know. 
> 
> 
> Cheers,
> --
> Steffen Persvold
> Chief Architect NumaChip, Numascale AS
> Tel: +47 23 16 71 88  Fax: +47 23 16 71 80 Skype: spersvold
> 
>> On 19 Dec 2014, at 17:23, Fujinaka, Todd <todd.fujin...@intel.com> wrote:
>> 
>> The in-kernel and out-of-tree driver aren't exactly the same and there could 
>> be differences enforced by the community that create that difference. For 
>> example - and I'm just making this up - there could be a difference in the 
>> dropping or passing of packets with bad checksums.
>> 
>> More likely are differences in the default settings of the two drivers. You 
>> may want to check that first.
>> 
>> If you have a clearly reproducible use case, we can try looking into this, 
>> but we are a bit limited in the number of Opteron systems we have in-house.
>> 
>> Todd Fujinaka
>> Software Application Engineer
>> Networking Division (ND)
>> Intel Corporation
>> todd.fujin...@intel.com
>> (503) 712-4565
>> 
>> -----Original Message-----
>> From: Steffen Persvold [mailto:s...@numascale.com]
>> Sent: Thursday, December 18, 2014 10:36 PM
>> To: e1000-devel@lists.sourceforge.net
>> Cc: Daniel J Blueman
>> Subject: [E1000-devel] Sporadic packet loss observed with newer 
>> in-kernel drivers (5.2.15-k)
>> 
>> Hi,
>> 
>> We’re currently working with a cluster of SuperMicro H8QGL 
>> (http://www.supermicro.com/Aplus/motherboard/Opteron6000/SR56x0/H8QGL-iF.cfm)
>>  based systems which have two of the 82576 chips:
>> 
>> 02:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network 
>> Connection (rev 01)
>> 02:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network 
>> Connection (rev 01)
>> 
>> 
>> Consequently the kernel uses the igb network driver for this.
>> 
>> We have observed with kernels 3.14 and onwards that we sometimes get 
>> packet-loss (due to corrupted packets). 3.14 uses igb version 5.0.5-k :
>> 
>> [    0.000000] Linux version 3.14.27-numascale27+ (sp@build-ubuntu) (gcc 
>> version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #2 SMP Thu Dec 18 08:00:08 CET 2014
>> ...
>> [    6.338430] igb: Intel(R) Gigabit Ethernet Network Driver - version 
>> 5.0.5-k
>> [    6.345394] igb: Copyright (c) 2007-2013 Intel Corporation.
>> 
>> 
>> If we revert to a 3.10 kernel (3.10.63), which uses the 5.0.3-k igb 
>> driver, we see no packet loss:
>> 
>> [    0.000000] Linux version 3.10.63-numascale27+ (sp@build-ubuntu) (gcc 
>> version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #1 SMP Wed Dec 17 15:56:25 CET 2014
>> ...
>> [    6.749783] igb: Intel(R) Gigabit Ethernet Network Driver - version 
>> 5.0.3-k
>> [    6.756740] igb: Copyright (c) 2007-2013 Intel Corporation.
>> 
>> 
>> I have also tested the most recent kernel; 3.18.1 :
>> 
>> [    0.000000] Linux version 3.18.1-numascale27+ (sp@build-ubuntu) (gcc 
>> version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #1 SMP Thu Dec 18 08:36:03 CET 2014
>> ...
>> [    8.010000] igb: Intel(R) Gigabit Ethernet Network Driver - version 
>> 5.2.15-k
>> [    8.010000] igb: Copyright (c) 2007-2014 Intel Corporation.
>> 
>> Also in this version we observe packet loss/corrupted packets.
>> 
>> While in the failed state we observe with ethtool -S (snapshot taken on 3.14 
>> with igb-5.0.5-k) :
>> 
>>    rx_short_length_errors: 235
>>    rx_errors: 235
>>    rx_length_errors: 235
>>    rx_queue_6_csum_err: 256
>> 
>> 
>> Now to the interesting part :) If I download igb-5.2.15.tar.gz from the 
>> sourceforge site 
>> (http://sourceforge.net/projects/e1000/files/igb%20stable/5.2.15/igb-5.2.15.tar.gz/download),
>>  and build this for 3.18.1, the packet loss is gone. This doesn’t make 
>> sense at all, since 3.18.1 already has the 5.2.15 driver (albeit an 
>> in-kernel variant). The same holds if we use this driver version on the 
>> 3.14 kernel (replacing 5.0.5-k).
>> 
>> 
>> Any idea what might be causing this ? Any insight you might have would be 
>> highly appreciated.
>> 
>> 
>> Cheers,
>> --
>> Steffen Persvold
>> Chief Architect NumaChip, Numascale AS
>> Tel: +47 23 16 71 88  Fax: +47 23 16 71 80 Skype: spersvold
>> 
>> 
> 


_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit 
http://communities.intel.com/community/wired
