On Tue, 09/02/2010 at 23:34 +0200, Покотиленко Костик wrote:

> >> Also if ACPI is having an effect on the issue one other thing you
> >> might try changing in the BIOS would be to disable all CPU C-states.
> >> The system will consume more power as a result, but the CPU also ends
> >> up usually being much more responsive as a result, and we have seen in
> >> the past that this can sometimes resolve performance issues.
> >
> > I'll turn those off:
> >
> > CPU C State=1               ;Options: 1=Enabled: 0=Disabled
> > C1E=1                       ;Options: 1=Enabled: 0=Disabled
> 
> Turned off "CPU C State" and "Spread spectrum", C1E turned off automatically.

With "CPU C State" and "Spread spectrum" turned off, after 47 hours I
got:

NETDEV WATCHDOG: eth1 (igb): transmit timed out
Modules linked in: ...
Call Trace:
...
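
To get exact counts of how often this happens, the watchdog events could
be pulled out of a saved kernel log. A minimal sketch in Python, assuming
the message format shown above; the function name find_watchdog_events is
hypothetical and not part of any existing tool:

```python
import re

# Matches lines like: "NETDEV WATCHDOG: eth1 (igb): transmit timed out"
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): transmit timed out"
)


def find_watchdog_events(lines):
    """Return (line_number, interface, driver) for each watchdog timeout line."""
    events = []
    for number, line in enumerate(lines, start=1):
        match = WATCHDOG_RE.search(line)
        if match:
            events.append((number, match.group("iface"), match.group("driver")))
    return events
```

It takes any iterable of lines, so it can be fed an open log file or
saved dmesg output.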

Let me summarize:

- None of the kernel (29, 30) and driver combinations solved the problem
- None of the BIOS options helped
- I've found that when a TX Unit Hang occurs on the 2 configured ports,
the loopback test also fails on the 2 unconfigured/unused ports
- When the NIC stops working, the rest of the system seems fine

So the problem is localized a bit, but its source is still not clear:
is it hardware or software related?

Also, the system is in use by ~300 customers, so more downtime than we
have already had is not desirable.

The server has 2 onboard NICs, with one of which we have had a similar
problem, and a PCI-e quad-port NIC.

We can still live with 2 NICs, so one option I see for further testing
is to go back to the onboard NICs and move the PCI-e quad-port NIC to
another server I support, then run a loopback (Port1<->Port2,
Port3<->Port4) stress test there. That server runs a 2.6.26 kernel,
though, and changing it is not an option.
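
For the loopback test, one way to judge each cabled pair would be to
compare the TX counter of one port with the RX counter of its peer before
and after a run. A rough sketch in Python, assuming the standard Linux
counters under /sys/class/net/<iface>/statistics; the interface names and
helper names are hypothetical, and it covers neither generating the
traffic nor making sure it really goes over the cable instead of being
delivered locally by the kernel:

```python
from pathlib import Path

# Cabled pairs for the loopback test; these interface names are hypothetical.
PAIRS = [("eth2", "eth3"), ("eth4", "eth5")]


def read_counter(iface, name, root="/sys/class/net"):
    """Read one counter (e.g. tx_packets) from the sysfs statistics directory."""
    return int(Path(root, iface, "statistics", name).read_text())


def lost_packets(tx_before, tx_after, rx_before, rx_after):
    """Packets sent on one port that were not counted as received on its peer."""
    return (tx_after - tx_before) - (rx_after - rx_before)
```

Any other traffic on the ports would skew the numbers, so this only
makes sense on otherwise idle interfaces.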

Let me know what you think and what other options there are for further
testing. I'm going to try 2.6.32 before moving the NIC to another server.
I did not do this before because there were issues backporting it to
Lenny.

-- 
Покотиленко Костик <[email protected]>


_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired
