Re: e1000: Detected Tx Unit Hang

2008-02-19 Thread Kok, Auke
Bernd Schubert wrote: > On Saturday 16 February 2008, Kok, Auke wrote: >> Bernd Schubert wrote: >>> Hello, >>> >>> I can't login to one of our servers and just got this in an ipmi sol >>> session: >>> >>> [18169.209181] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang >>> [18169.209183] Tx

Re: e1000: Detected Tx Unit Hang

2008-02-15 Thread Bernd Schubert
On Saturday 16 February 2008, Kok, Auke wrote: > Bernd Schubert wrote: > > Hello, > > > > I can't login to one of our servers and just got this in an ipmi sol > > session: > > > > [18169.209181] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang > > [18169.209183] Tx Queue <0> > >

Re: e1000: Detected Tx Unit Hang

2008-02-15 Thread Kok, Auke
Bernd Schubert wrote: > Hello, > > I can't login to one of our servers and just got this in an ipmi sol > session: > > [18169.209181] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang > [18169.209183] Tx Queue <0> > [18169.209184] TDH > [18169.209185] TDT

e1000: Detected Tx Unit Hang

2008-02-15 Thread Bernd Schubert
Hello, I can't login to one of our servers and just got this in an ipmi sol session: [18169.209181] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang [18169.209183] Tx Queue <0> [18169.209184] TDH [18169.209185] TDT [18169.209186] next_

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-21 Thread David Miller
From: Robert Olsson <[EMAIL PROTECTED]> Date: Mon, 21 Jan 2008 14:27:13 +0100 > Yes it works. e1000 tested for ~3 hours with very high load and > interface up/down every 5th sec. Without the patch the irqs get > disabled within a couple of seconds > > A resolute way of handling the

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-21 Thread Robert Olsson
David Miller writes: > Yes, this semaphore thing is highly problematic. In the most crucial > areas where network driver consistency matters the most for ease of > understanding and debugging, the Intel drivers choose to be different > :-( > > The way the napi_disable() logic breaks out f

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-20 Thread Badalian Vyacheslav
Hello. It works, thanks for resending it! Sorry, I understood that patch 53e52c729cc169db82a6105fac7a166e10c2ec36 ("[NET]: Make ->poll() breakout consistent in Intel ethernet drivers.") had a regression and rolled it back; I did not see your patch. Sorry again. Thanks! From: Badalian Vyacheslav <[EMAIL

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-20 Thread Andrey Rahmatullin
On Sun, Jan 20, 2008 at 01:20:11AM -0800, Brandeburg, Jesse wrote: > I continually get the > kernel: unregister_netdevice: waiting for eth2 to become free. Usage > count = 1 http://bugzilla.kernel.org/show_bug.cgi?id=9778 -- WBR, wRAR (ALT Linux Team)

RE: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-20 Thread Brandeburg, Jesse
David Miller wrote: > From: Robert Olsson <[EMAIL PROTECTED]> > Date: Fri, 18 Jan 2008 14:00:57 +0100 > >> I don't understand the idea with semaphore for enabling/disabling >> irq's either the overall logic must safer/better without it. > > They must have had code paths where they didn't know i

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-18 Thread David Miller
From: Robert Olsson <[EMAIL PROTECTED]> Date: Fri, 18 Jan 2008 14:00:57 +0100 > I don't understand the idea with semaphore for enabling/disabling > irq's either the overall logic must safer/better without it. They must have had code paths where they didn't know if IRQs were enabled or not al

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-18 Thread Robert Olsson
David Miller writes: > > eth0 e1000_irq_enable sem = 1 <- ifconfig eth0 down > > eth0 e1000_irq_disable sem = 2 > > > > **e1000_open <- ifconfig eth0 up > > eth0 e1000_irq_disable sem = 3 Dead. irq's can't be enabled > > e1000_irq_enable miss > > eth0 e1000_irq
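
The trace quoted above shows the counter climbing to 3 after an extra disable, after which a single enable can no longer unmask the interrupt. A toy model may make that easier to follow. This is only a sketch under stated assumptions, not the driver's code: the class `IrqGate` and the helper `replay` are hypothetical names, and the rule "unmask only when the counter drops back to zero" is an assumption about how such a counting gate behaves.

```python
class IrqGate:
    """Toy counting gate: every disable bumps the counter, and the
    interrupt is unmasked again only when an enable brings it to zero.
    (Hypothetical model, not taken from the e1000 source.)"""

    def __init__(self, count=0):
        self.count = count
        self.masked = count > 0

    def disable(self):
        self.count += 1
        self.masked = True
        return self.count

    def enable(self):
        self.count -= 1
        if self.count == 0:
            self.masked = False
        return self.count


def replay(ops, start=0):
    """Apply a sequence of 'enable'/'disable' calls and report the
    final (counter, masked) state."""
    gate = IrqGate(start)
    for op in ops:
        if op not in ("enable", "disable"):
            raise ValueError(f"unknown op: {op}")
        getattr(gate, op)()
    return gate.count, gate.masked


# Balanced calls leave the interrupt unmasked.
print(replay(["disable", "enable"]))
# Starting at 1 as in the trace, two disables and one enable leave the
# counter above zero, so the interrupt stays masked.
print(replay(["disable", "disable", "enable"], start=1))
```

In this model any path that calls disable twice for one enable leaves the gate stuck, which is the kind of imbalance the trace appears to show around ifconfig down/up.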

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-18 Thread David Miller
From: Robert Olsson <[EMAIL PROTECTED]> Date: Wed, 16 Jan 2008 18:07:38 +0100 > > eth0 e1000_irq_enable sem = 1 <- High netload > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 > eth0

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-17 Thread David Miller
From: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]> Date: Thu, 17 Jan 2008 07:40:07 -0200 > I'll update this machine today to 2.6.24-rc8-git + net-2.6 and try again > to reproduce. Thanks for the datapoints and testing. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the bo

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-17 Thread Arnaldo Carvalho de Melo
Em Thu, Jan 17, 2008 at 12:00:02AM -0800, David Miller escreveu: > From: Frans Pop <[EMAIL PROTECTED]> > Date: Thu, 17 Jan 2008 08:51:55 +0100 > > > On Thursday 17 January 2008, David Miller wrote: > > > From: "Brandeburg, Jesse" <[EMAIL PROTECTED]> > > > > > > > We spent Wednesday trying to repro

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-17 Thread David Miller
From: Frans Pop <[EMAIL PROTECTED]> Date: Thu, 17 Jan 2008 08:51:55 +0100 > On Thursday 17 January 2008, David Miller wrote: > > From: "Brandeburg, Jesse" <[EMAIL PROTECTED]> > > > > > We spent Wednesday trying to reproduce (without the patch) these issues > > > without much luck, and have applied

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread Frans Pop
On Thursday 17 January 2008, David Miller wrote: > From: "Brandeburg, Jesse" <[EMAIL PROTECTED]> > > > We spent Wednesday trying to reproduce (without the patch) these issues > > without much luck, and have applied the patch cleanly and will continue > > testing it. Given the simplicity of the cha

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread David Miller
From: "Brandeburg, Jesse" <[EMAIL PROTECTED]> Date: Wed, 16 Jan 2008 23:09:47 -0800 > We spent Wednesday trying to reproduce (without the patch) these issues > without much luck, and have applied the patch cleanly and will continue > testing it. Given the simplicity of the changes, and the commun

RE: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread Brandeburg, Jesse
David Miller wrote: > From: "Brandeburg, Jesse" <[EMAIL PROTECTED]> > Date: Tue, 15 Jan 2008 13:53:43 -0800 > >> The tx code has an "early exit" that tries to limit the amount of tx >> packets handled in a single poll loop and requires napi or interrupt >> rescheduling based on the return value fr

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread Robert Olsson
David Miller writes: > > On Wednesday 16 January 2008, David Miller wrote: > > > Ok, here is the patch I'll propose to fix this. The goal is to make > > > it as simple as possible without regressing the thing we were trying > > > to fix. > > > > Looks good to me. Tested with -rc8. > > T

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread David Miller
From: Badalian Vyacheslav <[EMAIL PROTECTED]> Date: Wed, 16 Jan 2008 12:02:28 +0300 > Also have regression after apply patch. BTW, if you are using the e1000e driver then this initial patch will not work. My more recent patch posting for this problem, will. I include it again below for you: [N

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread David Miller
From: Badalian Vyacheslav <[EMAIL PROTECTED]> Date: Wed, 16 Jan 2008 12:02:28 +0300 > applied to 2.6.24-rc7-git2 > Have messages > Also have regression after apply patch. > System may do above 800mbs traffic before patch. After its "exit polling > mode?" (4 CPU, 1 cpu get 100% si (process ksoftir

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread David Miller
From: Frans Pop <[EMAIL PROTECTED]> Date: Wed, 16 Jan 2008 09:56:08 +0100 > On Wednesday 16 January 2008, David Miller wrote: > > Ok, here is the patch I'll propose to fix this. The goal is to make > > it as simple as possible without regressing the thing we were trying > > to fix. > > Looks goo

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread Badalian Vyacheslav
applied to 2.6.24-rc7-git2 Have messages Also have regression after apply patch. System may do above 800mbs traffic before patch. After its "exit polling mode?" (4 CPU, 1 cpu get 100% si (process ksoftirqd/0), 3 CPU is IDLE) After patch system was go to "exit polling mode" at above 600mbs. Than

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-16 Thread Frans Pop
On Wednesday 16 January 2008, David Miller wrote: > Ok, here is the patch I'll propose to fix this. The goal is to make > it as simple as possible without regressing the thing we were trying > to fix. Looks good to me. Tested with -rc8. Cheers, FJP

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-15 Thread David Miller
From: "Brandeburg, Jesse" <[EMAIL PROTECTED]> Date: Tue, 15 Jan 2008 13:53:43 -0800 > The tx code has an "early exit" that tries to limit the amount of tx > packets handled in a single poll loop and requires napi or interrupt > rescheduling based on the return value from e1000_clean_tx_irq. That
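
The "early exit" described in the quoted text, where Tx cleanup is capped per poll pass and the return value decides whether polling must be rescheduled, can be sketched as a toy model. This is an illustration under assumptions, not the driver's logic: `clean_tx`, `poll_until_idle`, and the `limit` parameter are hypothetical names, and the meaning given to the return value (True once the ring is empty) is an assumption.

```python
from collections import deque


def clean_tx(ring, limit):
    """Reclaim at most `limit` completed descriptors from the ring.
    Returns True when nothing is left to clean, False when the pass
    stopped early because the limit was reached."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    cleaned = 0
    while ring and cleaned < limit:
        ring.popleft()
        cleaned += 1
    return not ring


def poll_until_idle(ring, limit):
    """Keep 'rescheduling' the poll while clean_tx reports leftover
    work; return how many passes were needed."""
    passes = 0
    while True:
        passes += 1
        if clean_tx(ring, limit):
            return passes


# Ten pending descriptors with a limit of four per pass.
print(poll_until_idle(deque(range(10)), 4))
```

The point of the model is that a caller which ignores a False return and stops polling would leave descriptors uncleaned until something else reschedules the work.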

RE: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-15 Thread Brandeburg, Jesse
[EMAIL PROTECTED] wrote: > Quoting Frans Pop <[EMAIL PROTECTED]>: >>> (Note this isn't the final correct patch we should apply. There is >>> no reason why this revert back to the older ->poll() logic here >>> should have any effect on the TX hang triggering...) >> >> s/no reason/no obvious reas

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-15 Thread slavon
Quoting Frans Pop <[EMAIL PROTECTED]>: On Tuesday 15 January 2008, David Miller wrote: From: Frans Pop <[EMAIL PROTECTED]> > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang Does this make the problem go away? Yes, it very much looks like that solves it. I ran with the patch fo

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-15 Thread Frans Pop
On Tuesday 15 January 2008, David Miller wrote: > From: Frans Pop <[EMAIL PROTECTED]> > > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang > > Does this make the problem go away? Yes, it very much looks like that solves it. I ran with the patch for 6 hours or so without any errors. I

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-14 Thread Frans Pop
Wow. That's fast! :-) On Tuesday 15 January 2008, David Miller wrote: > From: Frans Pop <[EMAIL PROTECTED]> > > > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang > > Does this make the problem go away? I'm compiling a kernel with the patch now. Will let you know the result. May tak

Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-14 Thread David Miller
From: Frans Pop <[EMAIL PROTECTED]> Date: Tue, 15 Jan 2008 06:25:10 +0100 > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang Does this make the problem go away? (Note this isn't the final correct patch we should apply. There is no reason why this revert back to the older ->poll()

[REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

2008-01-14 Thread Frans Pop
After compiling v2.6.24-rc7-163-g1a1b285 (x86_64) yesterday I suddenly see this error repeatedly: kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang kernel: Tx Queue <0> kernel: TDH kernel: TDT kernel: next_to_use kernel

Re: e1000 Detected Tx Unit Hang

2006-09-16 Thread Paul Aviles
- Original Message - From: "Jesse Brandeburg" <[EMAIL PROTECTED]> To: "Paul Aviles" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, September 05, 2006 12:09 PM Subject: Re: e1000 Detected Tx Unit Hang On 9/3/06, Paul Aviles <[EMAIL PROTECTED]> wrote: Hey Jes

Re: e1000 Detected Tx Unit Hang

2006-09-10 Thread Paul Aviles
Jesse, testing without NAPI, will see how it behaves. Paul Aviles - Original Message - From: "Jesse Brandeburg" <[EMAIL PROTECTED]> To: "Paul Aviles" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, September 05, 2006 12:09 PM Subject: Re: e1000 Detected Tx Uni

Re: e1000 Detected Tx Unit Hang

2006-09-05 Thread Paul Aviles
he system I can accommodate too, just let me know what time zone you are in to schedule it. Let me know. Regards, Paul Aviles - Original Message - From: "Jesse Brandeburg" <[EMAIL PROTECTED]> To: "Paul Aviles" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, September 0

Re: e1000 Detected Tx Unit Hang

2006-09-05 Thread Jesse Brandeburg
On 9/3/06, Paul Aviles <[EMAIL PROTECTED]> wrote: Hey Jesse, thanks for your reply. Here is the stuff on /procs. The weird no problem, part is that I have several other identical systems and only one is affected. Today I moved the hard drive to another similar system and I am not seeing the pr

Re: e1000 Detected Tx Unit Hang

2006-09-03 Thread Paul Aviles
NMI: 0 0 LOC: 7715839 7715838 ERR: 0 MIS: 0 - Original Message - From: "Jesse Brandeburg" <[EMAIL PROTECTED]> To: "Paul Aviles" <[EMAIL PROTECTED]> Cc: Sent: Sunday, September 03, 2006 1:45 PM Subject: Re: e1000 De

Re: e1000 Detected Tx Unit Hang

2006-09-03 Thread Jesse Brandeburg
On 9/2/06, Paul Aviles <[EMAIL PROTECTED]> wrote: I am getting "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang" using stock 2.6.17.11, 2.6.17.5 or 2.6.17.4 kernels on centos 4.3. The server is a Tyan GS12 ( 82541GI/PI and 82547GI) and is connected to a Netgear GS724T Gig switch. I can

e1000 Detected Tx Unit Hang

2006-09-02 Thread Paul Aviles
I am getting "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang" using stock 2.6.17.11, 2.6.17.5 or 2.6.17.4 kernels on centos 4.3. The server is a Tyan GS12 ( 82541GI/PI and 82547GI) and is connected to a Netgear GS724T Gig switch. I can easily reproduce the problem by trying to do a l

Re: e1000 Detected Tx Unit Hang

2006-09-01 Thread Auke Kok
Paul Aviles wrote: I am getting "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang" using stock 2.6.17.11, 2.6.17.5 or 2.6.17.4 kernels on centos 4.3. The server is a Tyan GS10 and is connected to a Netgear GS724T Gig switch. I can easily reproduce the problem by trying to do a large ftp