Hi Lars,

Just looking back over old emails, I noticed that in at least one of your messages the stats still showed tx_tcp_seg_good: 13136, which means that you still had TSO enabled. Can you make absolutely sure that ethtool -K ethX tso off is done on each 82541 interface?

The other thing that might be relevant is whether you have >= 4GB of RAM.

-----Original Message-----
From: Lars Ehrhardt [mailto:[email protected]]
Sent: Monday, March 22, 2010 4:15 PM
To: Ronciak, John
Cc: [email protected]
Subject: Re: [E1000-devel] Network stalls with e1000 driver and 82541 network chips

Dear John,

Ronciak, John wrote:
> Thanks, it's a bit hard to try and translate this into something we can
> understand. :-(

Let me know if there is anything specific you'd like to know and I'll try to translate those bits.

>> I am getting dropped packets on the 82572EI interfaces as well.
> That's not good. This means that the interrupts are not being
> serviced fast enough to keep up with the traffic. With 5 networking
> ports it doesn't surprise me. What kind of tests are you running to
> cause this? It's unclear if this system can withstand the traffic
> from these ports. Have you tried to run the test on a single port to
> see if the drops happen then as well? Try to see where the problem
> starts to happen. Are interrupts being shared between the devices?
> What OS are you running?

We are running Debian Lenny with different kernel versions. At the moment we are testing with the 2.6.30 bpo version.

Reading through the archives, I've tried the module options TxDescriptorStep=4 TxDescriptors=1024 with the e1000 module. This changes the behaviour. We no longer have tx unit hang messages in the log, but the link nevertheless goes down sporadically and comes back after some seconds.
There is no down message in the logs, just the up message:

syslog:Mar 22 17:34:09 gw kernel: [765361.074217] e1000: aur-mgt: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
syslog:Mar 22 22:28:21 gw kernel: [783013.698259] e1000: aur-mgt: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

Interrupts of the 82541 ports are shared with USB, interrupts of the 82572 ports are not shared. I think that the error rate corresponds to the load of the interfaces somehow. Interfaces with little traffic have a smaller value of rx_no_buffer_count and rx_missed_errors than interfaces with lots of traffic.

           CPU0       CPU1
  0:       2676          0   IO-APIC-edge      timer
  1:          2          0   IO-APIC-edge      i8042
  3:     896339          0   IO-APIC-edge      serial
  4:         11          0   IO-APIC-edge
  7:          0          0   IO-APIC-edge      parport0
  8:         56          0   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 14:          0          0   IO-APIC-edge      ide0
 16:      14766          0   IO-APIC-fasteoi   uhci_hcd:usb4, ath
 18:  193566354          0   IO-APIC-fasteoi   uhci_hcd:usb3, aur-mgt
 19:    1904270          0   IO-APIC-fasteoi   uhci_hcd:usb2, gst
 23:          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb5
 28:    6776261          0   PCI-MSI-edge      pbr-Q0
 29:          2          0   PCI-MSI-edge      pbr
 30:   66797611          0   PCI-MSI-edge      dmz-Q0
 31:        871          0   PCI-MSI-edge      dmz
 32:  218588672          0   PCI-MSI-edge      inet-Q0
 33:     475455          0   PCI-MSI-edge      inet
 35:    5566356          0   PCI-MSI-edge      ahci
NMI:          0          0   Non-maskable interrupts
LOC:   70790723   55485197   Local timer interrupts
SPU:          0          0   Spurious interrupts
RES:     303286     277829   Rescheduling interrupts
CAL:        127        293   Function call interrupts
TLB:     237854      61406   TLB shootdowns

Strange thing is we have five of those devices, two show this behavior, three don't. I might be able to dedicate a single device for further testing. Would it help to diagnose further if you had shell access to one of those devices?

Best regards,
Lars

------------------------------------------------------------------------------
Download Intel® Parallel Studio Eval
Try the new software tools for yourself.
Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
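
The TSO check John asks for at the top of this thread (tx_tcp_seg_good still counting implies TSO is still on) can be done by sampling the counter twice. Below is a minimal sketch in Python, assuming the usual "name: value" layout of ethtool -S ethX output. The function names and all sample numbers other than 13136 are made up for illustration. A counter that was already non-zero before TSO was switched off stays non-zero, so the sketch treats only growth between two samples as evidence.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output into a {name: int} dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and any non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def tso_still_active(before, after, counter="tx_tcp_seg_good"):
    """Return True if the TSO segment counter grew between two samples."""
    return after.get(counter, 0) > before.get(counter, 0)


# Hypothetical samples; only the 13136 value comes from the thread.
SAMPLE_BEFORE = """NIC statistics:
     rx_packets: 1000
     tx_tcp_seg_good: 13136
"""
SAMPLE_AFTER = """NIC statistics:
     rx_packets: 2000
     tx_tcp_seg_good: 13136
"""

if __name__ == "__main__":
    before = parse_ethtool_stats(SAMPLE_BEFORE)
    after = parse_ethtool_stats(SAMPLE_AFTER)
    print("TSO still active:", tso_still_active(before, after))
```

In practice the two texts would come from running ethtool -S on each 82541 interface some time apart while traffic is flowing, for example via subprocess.run(["ethtool", "-S", "eth0"], capture_output=True, text=True).stdout, where eth0 is a placeholder interface name.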

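
The interrupt table in Lars's message can be scanned mechanically for IRQ lines that are shared between devices, which is the question John raised. This is a rough sketch in Python, assuming the /proc/interrupts layout shown in the thread: one count per CPU, a single controller-type token, then comma-separated device names. Newer kernels print extra columns, so the parsing would need adapting there. The function name shared_irqs is hypothetical, and the sample is an excerpt of the rows quoted above.

```python
def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQs with more than one device."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # NMI, LOC, ... are summary rows, not device IRQs
        fields = rest.split()
        tail = fields[ncpu:]  # controller type followed by device names
        if len(tail) < 2:
            continue  # no device name listed on this line
        devices = [d.strip() for d in " ".join(tail[1:]).split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


# Excerpt of the table from the thread.
SAMPLE = """
           CPU0       CPU1
  4:         11          0   IO-APIC-edge
 16:      14766          0   IO-APIC-fasteoi   uhci_hcd:usb4, ath
 18:  193566354          0   IO-APIC-fasteoi   uhci_hcd:usb3, aur-mgt
 19:    1904270          0   IO-APIC-fasteoi   uhci_hcd:usb2, gst
 28:    6776261          0   PCI-MSI-edge      pbr-Q0
 32:  218588672          0   PCI-MSI-edge      inet-Q0
NMI:          0          0   Non-maskable interrupts
"""

if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(irq, devices)
```

On this excerpt the sketch reports IRQs 16, 18 and 19 as shared with a USB controller, while the MSI lines (pbr-Q0, inet-Q0) each have a single device, which matches Lars's description.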