Oops at boot with 2.4.5

2001-06-27 Thread Manuel A. McLure

When I use an initrd, sometimes when warm-booting I get an "Unable to handle
kernel NULL pointer dereference"  OOPS just after the "Trying to unmount old
root ..." message. I ran gdb on vmlinux and got the following stack trace:

0xc0180516 :  mov    0x10(%eax),%eax
0xc0137c37 :  mov    %edi,0xc(%ebx)
0xc01861a0 :  add    $0xc,%esp
0xc01864d3 :  mov    %eax,%ebx
0xc0112949 :  pop    %ebp
0xc0132be1 <__refile_buffer+97>:  pop    %ecx
0xc018047a :  xor    %eax,%eax
0xc01347eb :  mov    $0x1,%eax
0xc0133062 :  test   %eax,%eax
0xc0123905 :  mov    $0x1,%eax
0xc01431af :  pop    %eax
0xc01447b0 :  pop    %ecx
0xc0137e46 :  add    $0xc,%esp
0xc0135edc :  add    $0xc,%esp
0xc0105000 :  push   %edi
0xc0117ba3 :  add    $0x10,%esp
0xc0105000 :  push   %edi
0xc01051e8 :  pop    %edx
0xc010520e :  call   0xc0111a60
0xc0105000 :  push   %edi
0xc01056c6 :  mov    $0x1,%eax
0xc0105200 :  push   %ebp

A reset at this point usually (but not always) succeeds in booting, and once
the machine succeeds in booting it is completely stable (for my admittedly
low load).

Hardware is an Athlon Tbird 900MHz (not overclocked) on an MSI K7T Turbo-R
motherboard. I've worked around this by building my SCSI driver into the
kernel and removing the need for an initrd.

Kernel is official 2.4.5 built with Athlon optimizations.
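
The raw addresses in the stack trace above only become useful once they are
mapped back to function names. A minimal sketch of that lookup, assuming a
System.map-style symbol table that matches the running kernel. Only the
__refile_buffer entry is derived from the trace (0xc0132be1 minus 97); the
other two entries are made-up placeholders, not symbols from this 2.4.5 build:

```python
import bisect

def load_symbols(system_map_text):
    """Parse System.map-style lines ("c0105000 T name") into a sorted list."""
    symbols = []
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def resolve(symbols, address):
    """Return "symbol+offset" for the closest symbol at or below address."""
    starts = [addr for addr, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None  # address lies below every known symbol
    start, name = symbols[index]
    return "%s+%d" % (name, address - start)

# Illustrative table: example_start and example_idle are hypothetical names.
SAMPLE_MAP = """\
c0105000 T example_start
c0105200 T example_idle
c0132b80 T __refile_buffer
"""

symbols = load_symbols(SAMPLE_MAP)
for addr in (0xc0105000, 0xc010520e, 0xc0132be1):
    print(hex(addr), resolve(symbols, addr))
```

With the real System.map from the build that produced the oops, the same
lookup would name every frame in the trace.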

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Zathras is used to being beast of burden to other peoples needs. Very sad
life. Probably have very sad death, but at least there is symmetry.
 



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



RE: Still IRQ routing problems with VIA

2001-04-10 Thread Manuel A. McLure


> > I do have an IRQ for my VGA since the instructions for my card (a Voodoo 5
> > 5500) specifically say an IRQ is needed.
>
> I wonder though... In my mind this is a driver not hardware issue.  If
> the XFree86 and/or Linux console driver do not use the IRQ, you need not
> have BIOS assign one.  If you are feeling dangerous, try turning the VGA
> IRQ assignment off in BIOS and see if things melt/explode/kick ass.

I'd do that if this weren't also my Windows 98 gaming machine - I'm supposing
that the Windows drivers do use the IRQ even if XFree86/Linux doesn't. I
dunno if Windows is smart enough to assign an IRQ even if the BIOS doesn't.
Anyway, things are working now (especially since the last tulip patches) and
I like it that way :-)

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



RE: Still IRQ routing problems with VIA

2001-04-10 Thread Manuel A. McLure

Jeff Garzik said...
> Changing '#undef DEBUG' to '#define DEBUG 1' in
> arch/i386/kernel/pci-i386.h is also very helpful.  Can you guys do so,
> and post the 'dmesg -s 16384' results to lkml?  This includes the same
> information as dump_pirq, as well as some additional information.

Here's my dmesg output - I tried with both PNP: Yes and PNP: No and the
dmesg outputs were exactly the same modulo a Hz or two in the processor
speed detection.
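
A comparison like this can be automated so that the clock-speed jitter does
not show up as a difference. A minimal sketch, assuming the two captures are
available as text; the sample strings and the pattern used to mask the MHz
figures are illustrative assumptions, not the contents of the attached files:

```python
import difflib
import re

# Mask numbers followed by "MHz" so run-to-run clock jitter is ignored.
# The exact shape of the speed lines is an assumption.
MHZ = re.compile(r"\d+(\.\d+)?\s*MHz")

def normalize(text):
    """Split a capture into lines with MHz values replaced by a placeholder."""
    return [MHZ.sub("<N> MHz", line) for line in text.splitlines()]

def dmesg_diff(first, second):
    """Return unified-diff lines between two captures, ignoring MHz values."""
    return list(difflib.unified_diff(normalize(first), normalize(second),
                                     "pnp_yes", "pnp_no", lineterm=""))

# Made-up sample captures for illustration only.
PNP_YES = "Detected 900.052 MHz processor.\nPCI: Found IRQ 11 for device 00:07.2\n"
PNP_NO = "Detected 900.049 MHz processor.\nPCI: Found IRQ 11 for device 00:07.2\n"

for line in dmesg_diff(PNP_YES, PNP_NO):
    print(line)
```

An empty result means the two boots logged the same messages apart from the
masked speed values.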

I do have an IRQ for my VGA since the instructions for my card (a Voodoo 5
5500) specifically say an IRQ is needed.

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."


 dmesg_pnp_yes.txt.gz


RE: Still IRQ routing problems with VIA

2001-04-10 Thread Manuel A. McLure

Axel Thimm said...
> On Tue, Apr 10, 2001 at 09:51:18AM -0700, Manuel A. McLure wrote:
> > I have the same motherboard with the same lspci output (i.e. I get the "pin
> > ?" part), but I don't see any problems running 2.4.3 or 2.4.3-ac[23]. I am
> > only using a trackball on my USB port - what problems are you seeing?
>
> Well, a part of the attached dmesg output yields:
>
> > PCI: Found IRQ 11 for device 00:07.2
> > IRQ routing conflict in pirq table for device 00:07.2
> > IRQ routing conflict in pirq table for device 00:07.3
> > PCI: The same IRQ used for device 00:0e.0
> > uhci.c: USB UHCI at I/O 0x9400, IRQ 5
>
> and later:
>
> > uhci: host controller process error. something bad happened
> > uhci: host controller halted. very bad
>
> 00:07.[2,3] are the usb devices. BIOS (and 2.2 kernels) had them at IRQ 5. 2.4
> somehow picks the irq of the ethernet adapter, irq 11, instead.
>
> At least usb is then unusable.
>
> As you say that you have the same board, what is the output of dump_pirq - are
> your link values in the set of {1,2,3,5} or are they continuous 1-4? Maybe you
> are lucky - or better say, I am having bad luck :(
> -- 
> [EMAIL PROTECTED]

I am getting IRQ routing conflict messages:

Apr  8 21:32:47 ulthar kernel: usb.c: registered new driver usbdevfs
Apr  8 21:32:47 ulthar kernel: usb.c: registered new driver hub
Apr  8 21:32:47 ulthar kernel: usb-uhci.c: $Revision: 1.251 $ time 18:28:42 Apr  6 2001
Apr  8 21:32:47 ulthar kernel: usb-uhci.c: High bandwidth mode enabled
Apr  8 21:32:47 ulthar kernel: PCI: Found IRQ 11 for device 00:07.2
Apr  8 21:32:47 ulthar kernel: IRQ routing conflict in pirq table for device 00:07.2
Apr  8 21:32:47 ulthar kernel: IRQ routing conflict in pirq table for device 00:07.3
Apr  8 21:32:47 ulthar kernel: PCI: The same IRQ used for device 00:0a.0
Apr  8 21:32:47 ulthar kernel: PCI: The same IRQ used for device 00:0e.0
Apr  8 21:32:47 ulthar kernel: usb-uhci.c: USB UHCI at I/O 0xa400, IRQ 9
Apr  8 21:32:47 ulthar kernel: usb-uhci.c: Detected 2 ports
Apr  8 21:32:47 ulthar kernel: usb.c: new USB bus registered, assigned bus number 1
Apr  8 21:32:47 ulthar kernel: hub.c: USB hub found
Apr  8 21:32:47 ulthar kernel: hub.c: 2 ports detected
Apr  8 21:32:47 ulthar kernel: PCI: Found IRQ 11 for device 00:07.3
Apr  8 21:32:47 ulthar kernel: IRQ routing conflict in pirq table for device 00:07.2
Apr  8 21:32:47 ulthar kernel: IRQ routing conflict in pirq table for device 00:07.3
Apr  8 21:32:47 ulthar kernel: PCI: The same IRQ used for device 00:0a.0
Apr  8 21:32:47 ulthar kernel: PCI: The same IRQ used for device 00:0e.0
Apr  8 21:32:47 ulthar kernel: usb-uhci.c: USB UHCI at I/O 0xa800, IRQ 9
Apr  8 21:32:47 ulthar kernel: usb-uhci.c: Detected 2 ports
Apr  8 21:32:47 ulthar kernel: usb.c: new USB bus registered, assigned bus number 2
Apr  8 21:32:47 ulthar kernel: hub.c: USB hub found
Apr  8 21:32:47 ulthar kernel: hub.c: 2 ports detected

However, I am not seeing any problems caused by this (though I do not use
USB very much, as I mentioned - only for a trackball). I also got the same
messages on my K7T Pro, which used the KT133 chipset, so I don't think this
is a KT133/KT133A issue.

I can't seem to find dump_pirq on my system (Red Hat 7) - I can run it if I
find it...
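
The routing messages quoted in this thread can be tallied mechanically,
which makes it easier to compare one board against another. A minimal
sketch, assuming each syslog message sits on a single line (wrapped lines
would need rejoining first). The regular expressions are written against the
three message formats quoted in this thread only and are not a general
kernel-log parser:

```python
import re

FOUND = re.compile(r"PCI: Found IRQ (\d+) for device (\S+)")
CONFLICT = re.compile(r"IRQ routing conflict in pirq table for device (\S+)")
SHARED = re.compile(r"PCI: The same IRQ used for device (\S+)")

def summarize(log_text):
    """Collect routed IRQs, conflicting devices and sharing devices from a log."""
    found, conflicts, shared = {}, set(), set()
    for line in log_text.splitlines():
        m = FOUND.search(line)
        if m:
            found[m.group(2)] = int(m.group(1))
            continue
        m = CONFLICT.search(line)
        if m:
            conflicts.add(m.group(1))
            continue
        m = SHARED.search(line)
        if m:
            shared.add(m.group(1))
    return found, conflicts, shared

# Excerpt of the log from this message, one message per line.
LOG = """\
Apr  8 21:32:47 ulthar kernel: PCI: Found IRQ 11 for device 00:07.2
Apr  8 21:32:47 ulthar kernel: IRQ routing conflict in pirq table for device 00:07.2
Apr  8 21:32:47 ulthar kernel: IRQ routing conflict in pirq table for device 00:07.3
Apr  8 21:32:47 ulthar kernel: PCI: The same IRQ used for device 00:0a.0
Apr  8 21:32:47 ulthar kernel: PCI: The same IRQ used for device 00:0e.0
"""

found, conflicts, shared = summarize(LOG)
print(found, sorted(conflicts), sorted(shared))
```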

Jeff Garzik said:
>Changing '#undef DEBUG' to '#define DEBUG 1' in
>arch/i386/kernel/pci-i386.h is also very helpful.  Can you guys do so,
>and post the 'dmesg -s 16384' results to lkml?  This includes the same
>information as dump_pirq, as well as some additional information.

I'll do that and get back to you - I'll have to physically be at my machine
to reset the BIOS to "PNP: Yes" so it won't be until I get home from work.

>Note that turning "Plug-n-Play OS" off in BIOS setup typically fixes
>many interrupt routing problems -- but Linux 2.4 should now have support
>for PNP OS:Yes.  Clearly there appear to be problems with that support
>on some Via hardware.
>
>Note that you should have "Plug-n-Play OS: Yes" when generating the
>requested 'dmesg' output.

This may be the difference - I always set "Plug-n-Play OS: No" on all my
machines. Linux works fine and it doesn't seem to hurt Windows 98 any.

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



RE: Still IRQ routing problems with VIA (was: VIA KT133 chipset PCI crazyness...)

2001-04-10 Thread Manuel A. McLure

Axel Thimm said...
> Several weeks ago there had been a thread on the pirq assignments of newer VIA
> and SiS chipsets ending with everybody happy.
>
> Everybody? Not everybody - there is a small village of chipsets resisting the
> advent of 2.4.x :(
>
> The system is a KT133A (MSI's K7T Turbo MS-6330 board)/Duron 700
> system. 2.4.x kernels have IRQ routing problems and USB failures (the latter
> will most probably be due to IRQ mismatches, I believe).
>
> 2.2 kernel = 2.2.17 RH-kernel
> 2.4 kernel = 2.4.3 kernel with 'yes ""|make config' (I also tried configuring
>  and -ac3 patches to no avail.)
>
> I attached dmesg, lspci -vvvxxx (under both 2.2 and 2.4), and dump_pirq (which
> is the same for both kernels)
>
> As far as I could follow the discussion back in January, a problem seems to be
> that different chipset vendors may arbitrarily map pirq to links ('A' vs 1
> etc.). On my board I see that there is a rather strange mapping. Maybe this
> confuses 2.4.3?
>
> Most prominent difference in the lspci -vvvxxx output (to me) is the interrupt
> with the unknown pin:
>
> > @@ -162,6 +162,7 @@
> >  00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
> >      Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> >      Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> > +    Interrupt: pin ? routed to IRQ 11
> >      Capabilities: [68] Power Management version 2
> >      Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> >      Status: D0 PME-Enable- DSel=0 DScale=0 PME-
>
> Maybe it is a KT133A != KT133 issue. Note that the analysis above is the best
> I can provide, which has nothing to do with a good analysis.
>
> Any help is much appreciated! My board wants to run 2.4.x!!!
>
> BTW kernel 2.2.x does not give any irq related messages in its logs. Does this
> mean that 2.2.x works well, or that the errors are just not displayed?
>
> Thanks, Axel.
> -- 
> [EMAIL PROTECTED]

I have the same motherboard with the same lspci output (i.e. I get the "pin
?" part), but I don't see any problems running 2.4.3 or 2.4.3-ac[23]. I am
only using a trackball on my USB port - what problems are you seeing?

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



RE: tulip (was RE: Kernel 2.4.3 fails to compile)

2001-04-09 Thread Manuel A. McLure

Jeff Garzik said:
[snip conversation about NETDEV WATCHDOG errors on ADMTek Comet tulip clone
card]
> 
> Ok, this should be fixed in the latest patches sent to Alan and Linus.

Testing with 2.4.3-ac3 and so far, so good. Thanks!

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



RE: tulip (was RE: Kernel 2.4.3 fails to compile)

2001-03-30 Thread Manuel A. McLure

Jeff Garzik wrote:
> On Fri, 30 Mar 2001, Manuel A. McLure wrote:
> > It looks like the tulip driver isn't as up-to-date as the one from
> > 2.4.2-ac20 - when is 2.4.3-ac1 due? :-) I got NETDEV WATCHDOG errors shortly
> > after rebooting with 2.4.3, although these were of the "slow/packet lossy"
> > type I got with 2.4.2-ac20 instead of the "network completely unusable" type
> > I got with 2.4.2-ac11 and earlier.
>
> I'm betting that the latest ac (ac28?) is broken for you, too.
>
> I had to revert the changes in 'ac' tulip -- they fixed Comet and 21041
> cards, but broke some others.  sigh.
>
> sigh.  More testing and debugging for Jeffro...  Comet (your chip, I
> am guessing?) should be fixed ASAP, it's pretty easy.  21041 is not as
> easy, but should be fixed quickly as well.

Yes, mine is a Comet - here's the exact detection message:

Mar 30 13:09:06 ulthar kernel: Linux Tulip driver version 0.9.14 (February 20, 2001)
Mar 30 13:09:06 ulthar kernel: PCI: Found IRQ 5 for device 00:0c.0
Mar 30 13:09:06 ulthar kernel: eth0: ADMtek Comet rev 17 at 0xb000, 00:20:78:0D:D2:E1, IRQ 5.

I must say that I really appreciate the effort that all of the kernel
developers put in...

Thanks,
--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



RE: Kernel 2.4.3 fails to compile

2001-03-30 Thread Manuel A. McLure

Jeff Garzik wrote:
> On Fri, 30 Mar 2001, Manuel A. McLure wrote:
> 
> > Jeff Garzik wrote:
> > > On Fri, 30 Mar 2001, Manuel A. McLure wrote:
> > > 
> > > > ...
> > > > gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -O2
> > > > -fomit-frame-pointer -fno-strict-aliasing -pipe -mpreferred-stack-boundary=2
> > > > -march=athlon  -DMODULE -DMODVERSIONS -include
> > > > /usr/src/linux/include/linux/modversions.h   -c -o buz.o buz.c
> > > > buz.c: In function `v4l_fbuffer_alloc':
> > > > buz.c:188: `KMALLOC_MAXSIZE' undeclared (first use in this function)
> > > > buz.c:188: (Each undeclared identifier is reported only once
> > > > buz.c:188: for each function it appears in.)
> > > 
> > > Easy solution -- just delete the entire test
> > > 
> > >   if (size > KMALLOC_MAXSIZE) {
> > >   ...
> > >   }
> > 
> > Thanks, I'll do that. It just seemed strange that the file was being
> > compiled in the first place when the config option was not set.
> 
> buz is built with CONFIG...ZORAN as well as CONFIG...BUZ.  I dunno if
> that's a bug or not...

Yeah - I figured that out. I found that there were many places where
KMALLOC_MAXSIZE was being used in buz.c so I removed CONFIG...ZORAN and the
kernel is working now.

It looks like the tulip driver isn't as up-to-date as the one from
2.4.2-ac20 - when is 2.4.3-ac1 due? :-) I got NETDEV WATCHDOG errors shortly
after rebooting with 2.4.3, although these were of the "slow/packet lossy"
type I got with 2.4.2-ac20 instead of the "network completely unusable" type
I got with 2.4.2-ac11 and earlier.

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



RE: Kernel 2.4.3 fails to compile

2001-03-30 Thread Manuel A. McLure

Jeff Garzik wrote:
> On Fri, 30 Mar 2001, Manuel A. McLure wrote:
> 
> > ...
> > gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -O2
> > -fomit-frame-pointer -fno-strict-aliasing -pipe -mpreferred-stack-boundary=2
> > -march=athlon  -DMODULE -DMODVERSIONS -include
> > /usr/src/linux/include/linux/modversions.h   -c -o buz.o buz.c
> > buz.c: In function `v4l_fbuffer_alloc':
> > buz.c:188: `KMALLOC_MAXSIZE' undeclared (first use in this function)
> > buz.c:188: (Each undeclared identifier is reported only once
> > buz.c:188: for each function it appears in.)
> 
> Easy solution -- just delete the entire test
> 
>   if (size > KMALLOC_MAXSIZE) {
>   ...
>   }

Thanks, I'll do that. It just seemed strange that the file was being
compiled in the first place when the config option was not set.

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



RE: tulip (was RE: Kernel 2.4.3 fails to compile)

2001-03-30 Thread Manuel A. McLure

Jeff Garzik wrote:
 On Fri, 30 Mar 2001, Manuel A. McLure wrote:
  It looks like the tulip driver isn't as up-to-date as the one from
  2.4.2-ac20 - when is 2.4.3-ac1 due? :-) I got NETDEV 
 WATCHDOG errors shortly
  after rebooting with 2.4.3, although these were of the 
 "slow/packet lossy"
  type I got with 2.4.2-ac20 instead of the "network 
 completely unusable" type
  I got with 2.4.2-ac11 and earlier.
 
 I'm betting that the latest ac (ac28?) is broken for you, too.
 
 I had to revert the changes in 'ac' tulip -- they fixed Comet 
 and 21041
 cards, but broke some others.  sigh.
 
 sigh.  More testing and debugging for Jeffro...  Comet (your chip, I
 am guessing?) should be fixed ASAP, it's pretty easy.  21041 is not as
 easy, but should be fixed quickly as well.

Yes, mine is a Comet - here's the exact detection message:

Mar 30 13:09:06 ulthar kernel: Linux Tulip driver version 0.9.14 (February 20, 2001)
Mar 30 13:09:06 ulthar kernel: PCI: Found IRQ 5 for device 00:0c.0
Mar 30 13:09:06 ulthar kernel: eth0: ADMtek Comet rev 17 at 0xb000, 00:20:78:0D:D2:E1, IRQ 5.
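
For anyone comparing detection messages across kernels, the eth0 line above can be split into its fields with a small script. This is only a sketch in Python: the field layout is inferred from this one sample line, not taken from the driver source, and the names DETECT_RE and parse_detection are made up for illustration.

```python
import re

# Pattern for the "ethN: <chip> rev <n> at <I/O base>, <MAC>, IRQ <n>." line
# seen in the log above. Assumption: the layout is inferred from that single
# sample, so other tulip variants may print something different.
DETECT_RE = re.compile(
    r"(?P<iface>eth\d+): (?P<chip>.+?) rev (?P<rev>\d+) "
    r"at (?P<io>0x[0-9a-fA-F]+), "
    r"(?P<mac>(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}), IRQ (?P<irq>\d+)\."
)


def parse_detection(line):
    """Return a dict of fields from a detection line, or None if no match."""
    match = DETECT_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Revision and IRQ are decimal numbers in the sample line.
    info["rev"] = int(info["rev"])
    info["irq"] = int(info["irq"])
    return info


# The detection line from the message, joined back onto one line.
sample = (
    "Mar 30 13:09:06 ulthar kernel: eth0: ADMtek Comet rev 17 at 0xb000, "
    "00:20:78:0D:D2:E1, IRQ 5."
)
print(parse_detection(sample))
```

Running it on detection lines from two different boots makes it easy to see whether the I/O base or IRQ assignment moved between kernels.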

I must say that I really appreciate the effort that all of the kernel
developers put in...

Thanks,
--
Manuel A. McLure - Unify Corp. Technical Support [EMAIL PROTECTED]
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



RE: NETDEV WATCHDOG: eth0: transmit timed out on LNE100TX 4.0, kernel 2.4.2-ac11 and earlier.

2001-03-20 Thread Manuel A. McLure


"Jeff Garzik" wrote:
> "Manuel A. McLure" wrote:
> >
> > System:
> > AMD Athlon Thunderbird 900MHz
> > MSI K7T Pro (VIA KT133 chipset)
> > Network card: Linksys LNE100TX Rev. 4.0 (tulip)
> > Kernel: 2.2.18 (with 0.92 Scyld drivers), 2.4.0, 2.4.1, 2.4.2, 2.4.2-ac11
> >
> > With all the above kernel revisions/drivers, my network card hangs at random
> > (sometimes within minutes, other times it takes days). To restart it I need
> > to do an ifdown/ifup cycle and it will work fine until the next hang. I
> > upgraded to 2.4.2-ac11 because of the documented tulip fixes, but after a
> > few days got this again. The error log shows:
>
> In Alan Cox terms, that's a long time ago :)
>
> Can you please try 2.4.2-ac20?  It includes fixes specifically for this
> problem.

I'd looked for changes in tulip between 2.4.2-ac11 and 2.4.2-ac20 and hadn't
seen any - that's why I hadn't updated. I gather that the change in question
is at a higher level?

Anyway, I've upgraded to 2.4.2-ac20 and now I still get the error messages:

Mar 20 14:35:52 ulthar kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar 20 14:35:52 ulthar kernel: eth0: Transmit timed out, status fc664010, CSR12 , resetting...

but instead of hanging completely the connection just gets extremely slow
and "bursty" as shown by the following fragment of ping output:

64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=8 ttl=255 time=130 usec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=9 ttl=255 time=358 usec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=10 ttl=255 time=6.000 sec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=4 ttl=255 time=12.001 sec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=12 ttl=255 time=1.000 sec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=13 ttl=255 time=368 usec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=14 ttl=255 time=361 usec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=15 ttl=255 time=395 usec

So the behavior is quite a bit better (at least I can telnet in to
ifdown/ifup) but still not OK. Once again, ifdown/ifup makes things work
fine until the problem starts again.
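
To put numbers on the "bursty" behavior, the ping replies can be scanned for round trips above some threshold. What follows is a rough Python sketch, not a polished tool: the unit names (usec, msec/ms, sec) and the 0.1 second threshold are assumptions based on the output above, and parse_replies and slow_replies are hypothetical helper names.

```python
import re

# Matches reply lines such as "... icmp_seq=10 ttl=255 time=6.000 sec".
# \s+ is used between fields so that replies wrapped across two lines by a
# mail client still match. The set of units is an assumption.
REPLY_RE = re.compile(
    r"icmp_seq=(?P<seq>\d+)\s+ttl=\d+\s+time=(?P<value>[\d.]+)\s*"
    r"(?P<unit>usec|msec|ms|sec)"
)

UNIT_TO_SECONDS = {"usec": 1e-6, "msec": 1e-3, "ms": 1e-3, "sec": 1.0}


def parse_replies(text):
    """Yield (icmp_seq, round-trip time in seconds) for each reply in text."""
    for match in REPLY_RE.finditer(text):
        seconds = float(match["value"]) * UNIT_TO_SECONDS[match["unit"]]
        yield int(match["seq"]), seconds


def slow_replies(text, threshold=0.1):
    """Return the replies whose round-trip time exceeds threshold seconds."""
    return [(seq, rtt) for seq, rtt in parse_replies(text) if rtt > threshold]


# A few of the replies from the ping fragment above.
sample = """\
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=9 ttl=255 time=358 usec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=10 ttl=255 time=6.000 sec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=4 ttl=255 time=12.001 sec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=12 ttl=255 time=1.000 sec
64 bytes from leng.internal.mclure.org (10.1.1.1): icmp_seq=13 ttl=255 time=368 usec
"""
print(slow_replies(sample))
```

On this sample it flags sequence numbers 10, 4 and 12, the three replies that took a second or more, while the sub-millisecond replies pass.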

Thanks!
--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."



NETDEV WATCHDOG: eth0: transmit timed out on LNE100TX 4.0, kernel 2.4.2-ac11 and earlier.

2001-03-20 Thread Manuel A. McLure

System:
AMD Athlon Thunderbird 900MHz
MSI K7T Pro (VIA KT133 chipset)
Network card: Linksys LNE100TX Rev. 4.0 (tulip)
Kernel: 2.2.18 (with 0.92 Scyld drivers), 2.4.0, 2.4.1, 2.4.2, 2.4.2-ac11

With all the above kernel revisions/drivers, my network card hangs at random
(sometimes within minutes, other times it takes days). To restart it I need
to do an ifdown/ifup cycle and it will work fine until the next hang. I
upgraded to 2.4.2-ac11 because of the documented tulip fixes, but after a
few days got this again. The error log shows:

Mar 16 18:37:00 ulthar kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar 16 18:37:00 ulthar kernel: eth0: Transmit timed out, status fc664010, CSR12 , resetting...

ad infinitum until I do ifdown/ifup. The status & CSR12 values are always
the same. My card is detected as follows by the kernel:

Mar 12 21:38:49 ulthar kernel: Linux Tulip driver version 0.9.14c (March 3, 2001)
Mar 12 21:38:49 ulthar kernel: PCI: Found IRQ 11 for device 00:0a.0
Mar 12 21:38:49 ulthar kernel: IRQ routing conflict in pirq table for device 00:07.2
Mar 12 21:38:49 ulthar kernel: IRQ routing conflict in pirq table for device 00:07.3
Mar 12 21:38:49 ulthar kernel: PCI: The same IRQ used for device 00:0e.0
Mar 12 21:38:49 ulthar kernel: eth0: ADMtek Comet rev 17 at 0xdc00, 00:20:78:0D:D2:E1, IRQ 11.

Any ideas on why this might be happening? 
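
One way to check that the status value really is the same on every timeout is to tally it from the log. The Python below is a minimal sketch along those lines. It assumes the two message formats shown above and nothing more; the function name summarize_timeouts is invented for this example.

```python
import re
from collections import Counter

# Patterns for the two messages quoted above. Assumption: both are written
# exactly as in the sample log lines.
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (?P<iface>\w+): transmit timed out")
STATUS_RE = re.compile(r"Transmit timed out, status (?P<status>[0-9a-fA-F]{8})")


def summarize_timeouts(lines):
    """Count watchdog timeouts per interface and tally the status words."""
    timeouts = Counter()
    statuses = Counter()
    for line in lines:
        watchdog = WATCHDOG_RE.search(line)
        if watchdog:
            timeouts[watchdog["iface"]] += 1
        status = STATUS_RE.search(line)
        if status:
            statuses[status["status"].lower()] += 1
    return timeouts, statuses


# The two log lines from the message; the second is cut off after the
# status word because the rest is not needed here.
sample = [
    "Mar 16 18:37:00 ulthar kernel: NETDEV WATCHDOG: eth0: transmit timed out",
    "Mar 16 18:37:00 ulthar kernel: eth0: Transmit timed out, status fc664010,",
]
timeouts, statuses = summarize_timeouts(sample)
print(timeouts, statuses)
```

Passing it an open log file instead of the sample list works the same way, since the function only iterates over lines. If the statuses counter ends up with a single key, the status word never changed between timeouts.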

--
Manuel A. McLure - Unify Corp. Technical Support <[EMAIL PROTECTED]>
Space Ghost: "Hey, what happened to the-?" Moltar: "It's out." SG: "What
about-?" M: "It's fixed." SG: "Eh, good. Good."





