Re: sky2 0.11 instability
Stephen Hemminger wrote:
> You might try adjusting the interrupt coalescing parameters with
>   ethtool -C eth0 ...
> But I can't give you hard guidelines as to what would make it better.
>
> I have a debug patch, but it needs work still.

I don't care whether that debug patch will freeze the box or perform
other random funnies. All the debugging printks I added to the driver
did not trigger and I'd try anything. So yes, I'm desperate.

Does the sk98lin driver have any code for such problems?

Regards,
Carl-Daniel
--
http://www.hailfinger.org/
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: sky2 0.11 instability
On Mon, 23 Jan 2006 20:57:10 +0100
Carl-Daniel Hailfinger [EMAIL PROTECTED] wrote:

> Stephen Hemminger wrote:
> > You might try adjusting the interrupt coalescing parameters with
> >   ethtool -C eth0 ...
> > But I can't give you hard guidelines as to what would make it better.
> >
> > I have a debug patch, but it needs work still.
>
> I don't care whether that debug patch will freeze the box or perform
> other random funnies. All the debugging printks I added to the driver
> did not trigger and I'd try anything. So yes, I'm desperate.
>
> Does the sk98lin driver have any code for such problems?

There are several differences that the sk98lin driver has.
 * It programs some parts of the chip differently. But most of those
   are wrong. I started copying it, but where it was wrong I didn't
   copy the mistakes.
 * Sk98lin does NAPI wrong. It has interrupts disabled and runs
   packets through soft irq twice.
 * Sk98lin does its own buggy rx checksum validation.
 * Sk98lin does not do VLAN.
 * Sk98lin programs PCI-Ex for 2K transfers, but that causes data
   corruption.

The one that is probably saving you with sk98lin is that it has a
watchdog routine that tries to work around all the possible driver
hangs. I prefer to find and fix these hangs, because a watchdog routine
like that just masks the problem and introduces a bunch of SMP race
conditions which the sk98lin author either didn't see or ignored.

--
Stephen Hemminger [EMAIL PROTECTED]
OSDL http://developer.osdl.org/~shemminger
Re: sky2 0.11 instability
Stephen Hemminger wrote:
> On Mon, 23 Jan 2006 20:57:10 +0100
> Carl-Daniel Hailfinger [EMAIL PROTECTED] wrote:
>
> > Stephen Hemminger wrote:
> > > You might try adjusting the interrupt coalescing parameters with
> > >   ethtool -C eth0 ...
> > > But I can't give you hard guidelines as to what would make it better.
> > >
> > > I have a debug patch, but it needs work still.
> >
> > I don't care whether that debug patch will freeze the box or perform
> > other random funnies. All the debugging printks I added to the driver
> > did not trigger and I'd try anything. So yes, I'm desperate.
> >
> > Does the sk98lin driver have any code for such problems?
>
> There are several differences that the sk98lin driver has.
>  * It programs some parts of the chip differently. But most of those
>    are wrong. I started copying it, but where it was wrong I didn't
>    copy the mistakes.
>  * Sk98lin does NAPI wrong. It has interrupts disabled and runs
>    packets through soft irq twice.
>  * Sk98lin does its own buggy rx checksum validation.
>  * Sk98lin does not do VLAN.
>  * Sk98lin programs PCI-Ex for 2K transfers, but that causes data
>    corruption.
>
> The one that is probably saving you with sk98lin is that it has a
> watchdog routine that tries to work around all the possible driver
> hangs. I prefer to find and fix these hangs, because a watchdog routine
> like that just masks the problem and introduces a bunch of SMP race
> conditions which the sk98lin author either didn't see or ignored.

Oh. Now that is news to me. Glad I didn't have an SMP machine with
the old driver.

There is a bug in ethtool support in sky2. Namely,
rx-frames{,-irq}=64 is wrapped to zero. And rx-usecs-irq is 20 no
matter what I set it to.

# ethtool -C bridgeint0 rx-frames 64 rx-frames-irq 64 rx-usecs 1 rx-usecs-irq 1 tx-usecs 1 tx-frames 64
# ethtool -c bridgeint0
Coalesce parameters for bridgeint0:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 1
rx-frames: 0
rx-usecs-irq: 20
rx-frames-irq: 0

tx-usecs: 1
tx-frames: 64
tx-usecs-irq: 0
tx-frames-irq: 0

Will continue investigating.

Regards,
Carl-Daniel
--
http://www.hailfinger.org/
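A readback mismatch like the one reported here (rx-frames requested as 64 but shown as 0, rx-usecs-irq requested as 1 but shown as 20) can be spotted mechanically by comparing the requested values with the `ethtool -c` output. The following is only a minimal sketch in Python: the helper names `parse_coalesce` and `coalesce_mismatches` are made up for illustration, and the sample text is the readback quoted in this message.

```python
from __future__ import annotations

# Readback copied from the message above (output of `ethtool -c bridgeint0`).
SAMPLE_READBACK = """\
Coalesce parameters for bridgeint0:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 1
rx-frames: 0
rx-usecs-irq: 20
rx-frames-irq: 0

tx-usecs: 1
tx-frames: 64
tx-usecs-irq: 0
tx-frames-irq: 0
"""

# Values passed to `ethtool -C bridgeint0 ...` in the message above.
REQUESTED = {
    "rx-frames": 64, "rx-frames-irq": 64,
    "rx-usecs": 1, "rx-usecs-irq": 1,
    "tx-usecs": 1, "tx-frames": 64,
}


def parse_coalesce(output: str) -> dict[str, int]:
    """Collect every 'name: <integer>' line; other lines are skipped."""
    values = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            values[name.strip()] = int(value)
    return values


def coalesce_mismatches(requested: dict[str, int],
                        reported: dict[str, int]) -> dict[str, tuple]:
    """Map each parameter whose readback differs to (requested, reported)."""
    return {
        name: (want, reported.get(name))
        for name, want in requested.items()
        if reported.get(name) != want
    }


if __name__ == "__main__":
    for name, (want, got) in coalesce_mismatches(
            REQUESTED, parse_coalesce(SAMPLE_READBACK)).items():
        print(f"{name}: requested {want}, driver reports {got}")
```

On the sample above this flags rx-frames, rx-frames-irq and rx-usecs-irq, which are exactly the three parameters the report complains about. It says nothing about why the driver returns those values.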
Re: sky2 0.11 instability
Carl-Daniel Hailfinger wrote:
> Stephen Hemminger wrote:
> > On Mon, 23 Jan 2006 20:57:10 +0100
> > Carl-Daniel Hailfinger [EMAIL PROTECTED] wrote:
> >
> > > Stephen Hemminger wrote:
> > > > You might try adjusting the interrupt coalescing parameters with
> > > >   ethtool -C eth0 ...
> > > > But I can't give you hard guidelines as to what would make it better.
> > > >
> > > > I have a debug patch, but it needs work still.
> > >
> > > I don't care whether that debug patch will freeze the box or perform
> > > other random funnies. All the debugging printks I added to the driver
> > > did not trigger and I'd try anything. So yes, I'm desperate.
> > >
> > > Does the sk98lin driver have any code for such problems?
> >
> > There are several differences that the sk98lin driver has.
> >  * It programs some parts of the chip differently. But most of those
> >    are wrong. I started copying it, but where it was wrong I didn't
> >    copy the mistakes.
> >  * Sk98lin does NAPI wrong. It has interrupts disabled and runs
> >    packets through soft irq twice.
> >  * Sk98lin does its own buggy rx checksum validation.
> >  * Sk98lin does not do VLAN.
> >  * Sk98lin programs PCI-Ex for 2K transfers, but that causes data
> >    corruption.
> >
> > The one that is probably saving you with sk98lin is that it has a
> > watchdog routine that tries to work around all the possible driver
> > hangs. I prefer to find and fix these hangs, because a watchdog routine
> > like that just masks the problem and introduces a bunch of SMP race
> > conditions which the sk98lin author either didn't see or ignored.
>
> Oh. Now that is news to me. Glad I didn't have an SMP machine with
> the old driver.
>
> There is a bug in ethtool support in sky2. Namely,
> rx-frames{,-irq}=64 is wrapped to zero. And rx-usecs-irq is 20 no
> matter what I set it to.

The following whitespace-damaged patch should help with the latter
problem.

--- a/drivers/net/sky2.c 2006-01-23 23:41:35.0 +0100
+++ b/drivers/net/sky2.c 2006-01-24 03:41:21.0 +0100
@@ -2843,7 +2843,7 @@
     if (ecmd->rx_coalesce_usecs_irq == 0)
         sky2_write8(hw, STAT_ISR_TIMER_CTRL, TIM_STOP);
     else {
-        sky2_write32(hw, STAT_TX_TIMER_INI,
+        sky2_write32(hw, STAT_ISR_TIMER_INI,
                  sky2_us2clk(hw, ecmd->rx_coalesce_usecs_irq));
         sky2_write8(hw, STAT_ISR_TIMER_CTRL, TIM_START);
     }

Despite all the problems I'm having with sky2, I want to thank you
for writing it. The driver is easily readable and I can at least
try to get it running. With sk98lin I'm just stuck due to coding
style and general obfuscation.

Yeah! I got the nic to reproducibly auto-recover. With the following
ethtool settings it would hang after a few minutes and not recover
until a rmmod/modprobe cycle. Now it comes back reliably.

# ethtool -C bridgeext0 rx-frames 63 rx-frames-irq 63 tx-frames 63 \
    rx-usecs 250 rx-usecs-irq 250 tx-usecs 250

Patch follows:

--- a/drivers/net/sky2.c 2006-01-23 23:41:35.0 +0100
+++ b/drivers/net/sky2.c 2006-01-24 04:59:38.0 +0100
@@ -1623,6 +1623,12 @@
     unsigned txq = txqaddr[sky2->port];
     u16 ridx;

+    //sky2_write8(hw, STAT_TX_TIMER_CTRL, TIM_STOP);
+    sky2_write8(hw, STAT_LEV_TIMER_CTRL, TIM_STOP);
+    //sky2_write8(hw, STAT_ISR_TIMER_CTRL, TIM_STOP);
+    //sky2_write8(hw, STAT_TX_TIMER_CTRL, TIM_START);
+    sky2_write8(hw, STAT_LEV_TIMER_CTRL, TIM_START);
+    //sky2_write8(hw, STAT_ISR_TIMER_CTRL, TIM_START);
     /* Maybe we just missed an status interrupt */
     spin_lock(&sky2->tx_lock);
     ridx = sky2_read16(hw,
@@ -1639,6 +1645,7 @@
     if (netif_msg_timer(sky2))
         printk(KERN_ERR PFX "%s: tx timeout\n", dev->name);

+#if 0
     sky2_write32(hw, Q_ADDR(txq, Q_CSR), BMU_STOP);
     sky2_write32(hw, Y2_QADDR(txq, PREF_UNIT_CTRL), PREF_UNIT_RST_SET);

@@ -1646,6 +1653,7 @@
     sky2_qset(hw, txq);
     sky2_prefetch_init(hw, txq, sky2->tx_le_map, TX_RING_SIZE - 1);
+#endif
 }

Properties of the patch above:
The device will fail after some time, enter the tx_timeout handler,
recover and continue.

Now if I could avoid entering the tx_timeout handler, I would be happy
because it triggers only after hanging for approx. 10 seconds.

Error log with my patch so far:

Jan 24 05:09:27 switch kernel: NETDEV WATCHDOG: bridgeint0: transmit timed out
Jan 24 05:09:27 switch kernel: sky2 bridgeint0: tx timeout
Jan 24 05:09:41 switch kernel: NETDEV WATCHDOG: bridgeext0: transmit timed out
Jan 24 05:09:41 switch kernel: sky2 bridgeext0: tx timeout
Jan 24 05:09:41 switch kernel: sky2 bridgeext0: rx error, status 0x7ffc0001 length 1312
Jan 24 05:11:12 switch kernel: NETDEV WATCHDOG: bridgeint0: transmit timed out
Jan 24 05:11:12 switch kernel: sky2 bridgeint0: tx timeout
Jan 24 05:11:12 switch kernel: sky2 bridgeint0: rx error, status 0x7ffc0001 length 592
Jan 24 05:11:42 switch kernel: NETDEV WATCHDOG: bridgeint0: transmit timed out
Jan 24 05:11:42 switch kernel: sky2 bridgeint0: tx timeout
Jan 24 05:11:42 switch kernel: sky2
Re: sky2 0.11 instability
Hi,

Carl-Daniel Hailfinger wrote:
> Carl-Daniel Hailfinger wrote:
> > Carl-Daniel Hailfinger wrote:
> > > after sending 259 GB and receiving 25 GB over my SysKonnect SK-9E21
> > > card (sky2 says it is a Yukon-EC (0xb6) rev 1), the card appears dead.
> > > Machine is an Athlon64 3200+ on an Asus A8N-SLI Deluxe board.
> >
> > I have now added a hard reset routine to the tx timeout path and
> > hope it won't kill my machine.
>
> Apologies for mangled whitespace, this is just a rough cut'n'paste.
>
> --- linux-2.6.15/drivers/net/sky2.c.orig 2006-01-21 16:00:15.0 +0100
> +++ linux-2.6.15/drivers/net/sky2.c 2006-01-21 14:08:28.0 +0100
> @@ -1565,6 +1565,7 @@ static int sky2_autoneg_done(struct sky2
>      return 0;
>  }
>
> +static int sky2_reset(struct sky2_hw *hw);
>  /*
>   * Interrupt from PHY are handled outside of interrupt context
>   * because accessing phy registers requires spin wait which might
> @@ -1639,6 +1640,7 @@ static void sky2_tx_timeout(struct net_d
>      if (netif_msg_timer(sky2))
>          printk(KERN_ERR PFX "%s: tx timeout\n", dev->name);
>
> +    if (0) {
>      sky2_write32(hw, Q_ADDR(txq, Q_CSR), BMU_STOP);
>      sky2_write32(hw, Y2_QADDR(txq, PREF_UNIT_CTRL), PREF_UNIT_RST_SET);
>
> @@ -1646,6 +1648,12 @@ static void sky2_tx_timeout(struct net_d
>      sky2_qset(hw, txq);
>      sky2_prefetch_init(hw, txq, sky2->tx_le_map, TX_RING_SIZE - 1);
> +    } else {
> +        printk(KERN_ERR PFX "%s: recovering the HARD way...\n", dev->name);
> +        sky2_down(dev);
> +        sky2_reset(hw);
> +        sky2_up(dev);
> +    }
>  }
>
> And every time the kernel throws this message, I run the following
> script:
>
> #!/bin/bash
> deadinterface=`dmesg|grep HARD|tail -1|sed 's/.*sky2 //;s/:.*//'`
> ip l s $deadinterface down
> ip l s $deadinterface up
>
> After that, everything continues to work until the next tx timeout
> happens, and then the script again saves the day.
>
> More results about the circumstances of this bug:
> It seems that it will only trigger under LOW load. As long as I keep
> the interface busy, it will have no problems at all.

OK, more info about the circumstances of the bug.

- happens with sky2 0.11 and 0.13
- with low load (100 kB/s) it triggers after 12 hours and then
  approx. every 50 minutes
- with medium load (100-1200 kB/s) it triggers after 30 minutes and
  then approx. every 70 minutes
- with high RX load (9-12 MB/s) it triggers every 8 hours
- with high TX load (9-12 MB/s) I can't get it to trigger
- with stock tx_timeout handler, it will stay dead and no interrupts
  are received from the nic once it hangs
  - simply taking the interface down and up again doesn't help
- with my modified tx_timeout handler, taking the interface down and
  up again after the timeout helps
- with stock tx_timeout handler, I have to unload and reload the module
  to fix up the card
- general pattern seems to be medium interrupt load -> instability
- ah yes, and this is a production machine at a slightly remote
  location. Silly me.

If you want me to test any patch, tell me. It can only get better.

Regards,
Carl-Daniel
--
http://www.hailfinger.org/
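The recovery script quoted above picks the interface name out of the last "recovering the HARD way" kernel message and then bounces that interface. A rough Python equivalent of the name extraction is sketched below. It assumes that PFX expands to "sky2 ", which the log lines elsewhere in this thread ("sky2 bridgeext0: tx timeout") suggest but which is not shown in the patch itself. The function names and the sample log are hypothetical, and the commands are only built here, not executed.

```python
import re

# Assumption: the patched printk produces lines of the form
#   "sky2 <interface>: recovering the HARD way..."
HARD_RESET = re.compile(r"sky2 (\S+): recovering the HARD way")


def last_dead_interface(log_text):
    """Return the interface named in the most recent hard-reset message."""
    names = HARD_RESET.findall(log_text)
    return names[-1] if names else None


def bounce_commands(interface):
    """Long form of `ip l s <if> down` followed by `ip l s <if> up`."""
    return [
        ["ip", "link", "set", interface, "down"],
        ["ip", "link", "set", interface, "up"],
    ]


if __name__ == "__main__":
    # Hypothetical dmesg excerpt, for illustration only.
    sample = (
        "NETDEV WATCHDOG: bridgeint0: transmit timed out\n"
        "sky2 bridgeint0: tx timeout\n"
        "sky2 bridgeint0: recovering the HARD way...\n"
        "NETDEV WATCHDOG: bridgeext0: transmit timed out\n"
        "sky2 bridgeext0: tx timeout\n"
        "sky2 bridgeext0: recovering the HARD way...\n"
    )
    iface = last_dead_interface(sample)
    for cmd in bounce_commands(iface):
        print(" ".join(cmd))
```

Matching the full message instead of grepping for "HARD" alone avoids picking up unrelated lines that happen to contain that word.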
Re: sky2 0.11 instability
You might try adjusting the interrupt coalescing parameters with
  ethtool -C eth0 ...
But I can't give you hard guidelines as to what would make it better.

I have a debug patch, but it needs work still.
Re: sky2 0.11 instability
Stephen Hemminger wrote:
> You might try adjusting the interrupt coalescing parameters with
>   ethtool -C eth0 ...
> But I can't give you hard guidelines as to what would make it better.
>
> I have a debug patch, but it needs work still.

ethtool -C bridgeint1 rx-frames 255 rx-frames-irq 255 rx-usecs 0 rx-usecs-irq 0 tx-usecs 0 tx-frames 255

always results in a hang after less than 2 minutes if the network
activity is not too high (about 100-600 packets/s).

So yes, I can trigger this sucker on demand and give you all the
debugging you need.

Do you have any idea what the out-of-tree sk98lin did differently?

Regards,
Carl-Daniel
--
http://www.hailfinger.org/
Re: sky2 0.11 instability
Carl-Daniel Hailfinger wrote:
> Stephen Hemminger wrote:
> > You might try adjusting the interrupt coalescing parameters with
> >   ethtool -C eth0 ...
> > But I can't give you hard guidelines as to what would make it better.
> >
> > I have a debug patch, but it needs work still.

After experimenting further, the following command will always hang
the card after 2-3 seconds:

ethtool -C bridgeint1 rx-frames 63 rx-frames-irq 63 rx-usecs 0 rx-usecs-irq 0 tx-usecs 0 tx-frames 63

Crude activity log (1 second interval) follows:

interrupts  RX packets  TX packets
# normal activity
18225503    1828622     2084564
18225914    1828932     2084939
18226422    1829361     2085422
18226875    1829694     2085832
18227286    1830012     2086183
18227622    1830270     2086465
18227963    1830541     2086738
18228340    1830827     2087057
18228710    1831107     2087382
18229091    1831390     2087694
18229467    1831677     2088002
18229835    1831954     2088338
# ethtool starts now
18230143    1832249     2088647
18230146    1832434     2088799
18230146    1832462     2088799
18230146    1832462     2088799
18230146    1832462     2088799
18230146    1832462     2088799
18230146    1832462     2088799
18230146    1832462     2088799
18230146    1832462     2088799
18230146    1832462     2088799
18230146    1832462     2088799
18230146    1832462     2088799
# the netdev watchdog triggers now

So yes, I can trigger this sucker on demand and give you all the
debugging you need.

Do you have any idea what the out-of-tree sk98lin v8.14.3.3 did
differently?

Regards,
Carl-Daniel
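The activity log above is easier to read as per-second deltas: the interrupt counter stops moving a second after ethtool runs, while the RX packet counter still advances once more. The sketch below, in Python, computes those deltas and reports where the interrupt counter froze. The function names and the `quiet_intervals` threshold are illustrative choices, and the samples are the tail of the log from this message.

```python
# (interrupts, RX packets, TX packets), one sample per second,
# taken from the end of the activity log above.
SAMPLES = [
    (18229467, 1831677, 2088002),  # normal activity
    (18229835, 1831954, 2088338),  # normal activity
    (18230143, 1832249, 2088647),  # ethtool starts now
    (18230146, 1832434, 2088799),
    (18230146, 1832462, 2088799),
    (18230146, 1832462, 2088799),
    (18230146, 1832462, 2088799),
    (18230146, 1832462, 2088799),
]


def deltas(samples):
    """Per-interval differences between consecutive counter samples."""
    return [
        tuple(cur - prev for prev, cur in zip(a, b))
        for a, b in zip(samples, samples[1:])
    ]


def first_stall(samples, quiet_intervals=3):
    """Index of the sample at which the interrupt counter reached its
    final value, once it has stayed unchanged for `quiet_intervals`
    consecutive intervals; None if no such run exists."""
    run = 0
    for i, (irq_delta, _rx, _tx) in enumerate(deltas(samples)):
        run = run + 1 if irq_delta == 0 else 0
        if run == quiet_intervals:
            return i - quiet_intervals + 1
    return None


if __name__ == "__main__":
    for d in deltas(SAMPLES):
        print("irq +%d  rx +%d  tx +%d" % d)
    print("interrupts frozen from sample", first_stall(SAMPLES))
```

On these samples the interrupt counter gains only 3 in the second after ethtool runs and nothing afterwards, while 28 more packets are still counted as received in the following second.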
Re: sky2 0.11 instability
Hi,

Carl-Daniel Hailfinger wrote:
> after sending 259 GB and receiving 25 GB over my SysKonnect SK-9E21
> card (sky2 says it is a Yukon-EC (0xb6) rev 1), the card appears dead.
> Machine is an Athlon64 3200+ on an Asus A8N-SLI Deluxe board.
>
> sky2 v0.11 addr 0xc900 irq 74 Yukon-EC (0xb6) rev 1
> sky2 eth3: addr 00:00:5a:70:30:fb
> [...]
> sky2 eth3: enabling interface
> [...]
> sky2 eth3: phy interrupt status 0x1c40 0x7d0c
> sky2 eth3: Link is up at 100 Mbps, full duplex, flow control both
> [...]
> NETDEV WATCHDOG: eth3: transmit timed out
> sky2 eth3: tx timeout
> NETDEV WATCHDOG: eth3: transmit timed out
> sky2 eth3: tx timeout
>
> switch:~ # ifconfig eth3
> eth3      Link encap:Ethernet  HWaddr 00:00:5A:70:30:FB
>           inet6 addr: fe80::200:5aff:fe70:30fb/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:130530358 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:209647800 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:25980735946 (24777.1 Mb)  TX bytes:259787058579 (247752.2 Mb)
>           Interrupt:74
>
> switch:~ # cat /proc/interrupts
>            CPU0
>   0:   11213627    IO-APIC-edge  timer
>   1:      24783    IO-APIC-edge  i8042
>   8:          0    IO-APIC-edge  rtc
>   9:          0   IO-APIC-level  acpi
>  15:     401558    IO-APIC-edge  ide1
>  50:  249384881   IO-APIC-level  eth0
>  58:  179123938   IO-APIC-level  sky2
>  66:          3   IO-APIC-level  sky2, ohci1394
>  74:   98956955   IO-APIC-level  sky2
>  82:      19952   IO-APIC-level  sky2
> 217:       1865   IO-APIC-level  libata, NVidia CK804
> 225:     263052   IO-APIC-level  libata, ehci_hcd:usb1
> NMI:      11098
> LOC:   11214113
> ERR:          0
> MIS:          0
>
> Not only will the card not transmit anymore, it also doesn't receive
> any packet at all. ethtool -r eth3 doesn't change anything, taking the
> interface down and up again also doesn't help. The interrupt count of
> interrupt 74 stays constant after failing.
>
> modprobe -r sky2; modprobe sky2 fixes the problem for me, so maybe
> resetting the card on TX timeouts will help.
>
> The same problem appeared much earlier for another card which shared
> interrupt 58 with an onboard card driven by skge. After disabling the
> skge driver and rebooting, that card has been stable so far.
>
> The card is connected to a 100 MBit switch.
>
> These problems didn't appear with sk98lin v8.14.3.3 (that driver did
> survive about 10 TB of traffic before I rebooted).
>
> Register dumps are available on request (too big for this list).
>
> I will now try sky2 0.13 and report back.

And it hit the other interface after 200 MB transferred...

NETDEV WATCHDOG: bridgeext0: transmit timed out
sky2 bridgeext0: tx timeout
NETDEV WATCHDOG: bridgeext0: transmit timed out
sky2 transmit interrupt missed? recovered

Although the driver claims to recover, it doesn't recover at all.
What debug level would be advisable? It is now running with
modprobe sky2 debug=2, but I can't see more than the messages above.

I have now added a hard reset routine to the tx timeout path and
hope it won't kill my machine.

Regards,
Carl-Daniel
--
http://www.hailfinger.org/
Re: sky2 0.11 instability
Carl-Daniel Hailfinger wrote:
> Hi,
>
> Carl-Daniel Hailfinger wrote:
> > after sending 259 GB and receiving 25 GB over my SysKonnect SK-9E21
> > card (sky2 says it is a Yukon-EC (0xb6) rev 1), the card appears dead.
> > Machine is an Athlon64 3200+ on an Asus A8N-SLI Deluxe board.
> >
> > sky2 v0.11 addr 0xc900 irq 74 Yukon-EC (0xb6) rev 1
> > sky2 eth3: addr 00:00:5a:70:30:fb
> > [...]
> > sky2 eth3: enabling interface
> > [...]
> > sky2 eth3: phy interrupt status 0x1c40 0x7d0c
> > sky2 eth3: Link is up at 100 Mbps, full duplex, flow control both
> > [...]
> > NETDEV WATCHDOG: eth3: transmit timed out
> > sky2 eth3: tx timeout
> > NETDEV WATCHDOG: eth3: transmit timed out
> > sky2 eth3: tx timeout
> >
> > switch:~ # ifconfig eth3
> > eth3      Link encap:Ethernet  HWaddr 00:00:5A:70:30:FB
> >           inet6 addr: fe80::200:5aff:fe70:30fb/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:130530358 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:209647800 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:25980735946 (24777.1 Mb)  TX bytes:259787058579 (247752.2 Mb)
> >           Interrupt:74
> >
> > switch:~ # cat /proc/interrupts
> >            CPU0
> >   0:   11213627    IO-APIC-edge  timer
> >   1:      24783    IO-APIC-edge  i8042
> >   8:          0    IO-APIC-edge  rtc
> >   9:          0   IO-APIC-level  acpi
> >  15:     401558    IO-APIC-edge  ide1
> >  50:  249384881   IO-APIC-level  eth0
> >  58:  179123938   IO-APIC-level  sky2
> >  66:          3   IO-APIC-level  sky2, ohci1394
> >  74:   98956955   IO-APIC-level  sky2
> >  82:      19952   IO-APIC-level  sky2
> > 217:       1865   IO-APIC-level  libata, NVidia CK804
> > 225:     263052   IO-APIC-level  libata, ehci_hcd:usb1
> > NMI:      11098
> > LOC:   11214113
> > ERR:          0
> > MIS:          0
> >
> > Not only will the card not transmit anymore, it also doesn't receive
> > any packet at all. ethtool -r eth3 doesn't change anything, taking the
> > interface down and up again also doesn't help. The interrupt count of
> > interrupt 74 stays constant after failing.
> >
> > modprobe -r sky2; modprobe sky2 fixes the problem for me, so maybe
> > resetting the card on TX timeouts will help.
> >
> > The same problem appeared much earlier for another card which shared
> > interrupt 58 with an onboard card driven by skge. After disabling the
> > skge driver and rebooting, that card has been stable so far.
> >
> > The card is connected to a 100 MBit switch.
> >
> > These problems didn't appear with sk98lin v8.14.3.3 (that driver did
> > survive about 10 TB of traffic before I rebooted).
> >
> > Register dumps are available on request (too big for this list).
> >
> > I will now try sky2 0.13 and report back.
>
> And it hit the other interface after 200 MB transferred...
>
> NETDEV WATCHDOG: bridgeext0: transmit timed out
> sky2 bridgeext0: tx timeout
> NETDEV WATCHDOG: bridgeext0: transmit timed out
> sky2 transmit interrupt missed? recovered
>
> Although the driver claims to recover, it doesn't recover at all.
> What debug level would be advisable? It is now running with
> modprobe sky2 debug=2, but I can't see more than the messages above.
>
> I have now added a hard reset routine to the tx timeout path and
> hope it won't kill my machine.

Apologies for mangled whitespace, this is just a rough cut'n'paste.

--- linux-2.6.15/drivers/net/sky2.c.orig 2006-01-21 16:00:15.0 +0100
+++ linux-2.6.15/drivers/net/sky2.c 2006-01-21 14:08:28.0 +0100
@@ -1565,6 +1565,7 @@ static int sky2_autoneg_done(struct sky2
     return 0;
 }

+static int sky2_reset(struct sky2_hw *hw);
 /*
  * Interrupt from PHY are handled outside of interrupt context
  * because accessing phy registers requires spin wait which might
@@ -1639,6 +1640,7 @@ static void sky2_tx_timeout(struct net_d
     if (netif_msg_timer(sky2))
         printk(KERN_ERR PFX "%s: tx timeout\n", dev->name);

+    if (0) {
     sky2_write32(hw, Q_ADDR(txq, Q_CSR), BMU_STOP);
     sky2_write32(hw, Y2_QADDR(txq, PREF_UNIT_CTRL), PREF_UNIT_RST_SET);

@@ -1646,6 +1648,12 @@ static void sky2_tx_timeout(struct net_d
     sky2_qset(hw, txq);
     sky2_prefetch_init(hw, txq, sky2->tx_le_map, TX_RING_SIZE - 1);
+    } else {
+        printk(KERN_ERR PFX "%s: recovering the HARD way...\n", dev->name);
+        sky2_down(dev);
+        sky2_reset(hw);
+        sky2_up(dev);
+    }
 }

And every time the kernel throws this message, I run the following
script:

#!/bin/bash
deadinterface=`dmesg|grep HARD|tail -1|sed 's/.*sky2 //;s/:.*//'`
ip l s $deadinterface down
ip l s $deadinterface up

After that, everything continues to work until the next tx timeout
happens, and then the script again saves the day.

More results about the circumstances of this bug:
It seems that it will only trigger under LOW load. As long as I keep
the interface busy, it will have no problems at all.

Regards,
Carl-Daniel
--
http://www.hailfinger.org/