Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-20 Thread Daniel Drake

Manfred Spraul wrote:

Autsch.
Yes, you are right. Sorry for that, I should have reread the patch once 
more.


No problem :)

I've been running v0.38 since my last mail. No problems at all.

Thanks for your continued work on the driver.

Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Manfred Spraul

Daniel Drake wrote:


So, you want this instead:

#define DEV_HAS_LARGEDESC   0x0004


Autsch.
Yes, you are right. Sorry for that, I should have reread the patch once 
more. I've fixed it on my website.


--
   Manfred


Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Daniel Drake

Daniel Drake wrote:
After applying the v0.38 patch, I can't get any network at all. DHCP 
fails to get an IP. v0.37 works fine.


Tracked it down. (sorry for linewraps)

+#define DEV_NEED_TIMERIRQ  0x0001  /* set the timer irq flag in the irq 
mask */
+#define DEV_NEED_LINKTIMER	0x0002	/* poll link settings. Relies on the timer 
irq */
+#define DEV_HAS_LARGEDESC	0x0003	/* device supports jumbo frames and needs 
packet format 2 */


My hardware is DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER (0x0001|0x0002 = 0x0003); 
by this logic it also matches DEV_HAS_LARGEDESC, which isn't true.


So, you want this instead:

#define DEV_HAS_LARGEDESC   0x0004

After making that change, all is working fine, but then again, I've never run 
into the hangs you are debugging. I'll follow up in a couple of days time to 
confirm I'm not getting any problems with the new code.
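
The overlap described above can be checked in isolation. This is only a minimal sketch, assuming the driver tests each flag with a bitwise AND against the per-device flags word (the excerpt does not show that code). The helpers has_flag() and is_single_bit() and the _V038/_FIXED suffixes are hypothetical names used for illustration; the numeric values are the ones quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as posted in the v0.38 patch. */
#define DEV_NEED_TIMERIRQ        0x0001u
#define DEV_NEED_LINKTIMER       0x0002u
#define DEV_HAS_LARGEDESC_V038   0x0003u  /* as posted: shares bits with both flags above */
#define DEV_HAS_LARGEDESC_FIXED  0x0004u  /* suggested fix: a bit of its own */

/* The posted value is exactly the OR of the two other flags. */
static_assert((DEV_NEED_TIMERIRQ | DEV_NEED_LINKTIMER) == DEV_HAS_LARGEDESC_V038,
	      "v0.38 LARGEDESC value collides with TIMERIRQ|LINKTIMER");

/* Hypothetical helper: true if any bit of 'flag' is set in 'driver_data'. */
static bool has_flag(uint32_t driver_data, uint32_t flag)
{
	return (driver_data & flag) != 0;
}

/* Hypothetical helper: a value is safe as a mask flag only if exactly one bit is set. */
static bool is_single_bit(uint32_t flag)
{
	return flag != 0 && (flag & (flag - 1)) == 0;
}
```

With driver_data = DEV_NEED_TIMERIRQ | DEV_NEED_LINKTIMER, has_flag() reports the v0.38 value as present and the 0x0004 value as absent, which matches the behaviour seen on the nForce2 board.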


Thanks,
Daniel


Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Daniel Drake

Hi,

Manfred Spraul wrote:
Attached is a patch that modifies the tx interrupt handling of the 
nForce nic. It's part of the attempts to figure out what causes the nic 
hangs (see bug 4552).
The change is experimental: It affects all nForce versions. I've tested 
it on my nForce 250-Gb.


Please test it. And especially: If you experience a nic hang, please send 
me the debug output. That's the block starting with


<<
NETDEV WATCHDOG: eth1: transmit timed out
eth1: Got tx_timeout. irq: 
eth1: Ring at  ...
<<


After applying the v0.38 patch, I can't get any network at all. DHCP fails to 
get an IP. v0.37 works fine.


I enabled debugging, and I get this failure for every packet being 
transmitted: ( i masked out part of my MAC addr with XX )


Jul 16 20:06:28 dsd eth0: nv_start_xmit: packet packet 3 queued for 
transmission.
Jul 16 20:06:28 dsd
Jul 16 20:06:28 dsd 000: ff ff ff ff ff ff 00 50 8d XX XX XX 08 00 45 00
Jul 16 20:06:28 dsd 010: 02 40 75 a0 00 00 40 11 03 0e 00 00 00 00 ff ff
Jul 16 20:06:28 dsd 020: ff ff 00 44 00 43 02 2c 13 0a 01 01 06 00 d2 76
Jul 16 20:06:28 dsd 030: bc 10 00 0a 00 00 00 00 00 00 00 00 00 00 00 00
Jul 16 20:06:28 dsd eth0: nv_nic_irq
Jul 16 20:06:28 dsd eth0: irq: 0008
Jul 16 20:06:28 dsd eth0: nv_tx_done: looking at packet 3, Flags 0x624d.
Jul 16 20:06:28 dsd eth0: received irq with events 0x8. Probably TX fail.
Jul 16 20:06:28 dsd eth0: irq: 
Jul 16 20:06:28 dsd eth0: nv_nic_irq completed
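
To read the "irq: 0008" line above against the v0.38 interrupt bits, a small decoder can help. This is a sketch, not driver code: the bit values are taken from the NVREG_IRQ_* defines in the v0.38 patch in this thread, except NVREG_IRQ_RX_ERROR, whose value (0x0001) is an assumption because the excerpt only shows its name. The table and decode_irq() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NVREG_IRQ_RX_ERROR   0x0001u  /* assumed value: not shown in the excerpt */
#define NVREG_IRQ_RX         0x0002u
#define NVREG_IRQ_RX_NOBUF   0x0004u
#define NVREG_IRQ_TX_ERR     0x0008u
#define NVREG_IRQ_TX_OK      0x0010u
#define NVREG_IRQ_TIMER      0x0020u
#define NVREG_IRQ_LINK       0x0040u
#define NVREG_IRQ_TX_ERROR   0x0080u
#define NVREG_IRQ_TX1        0x0100u
#define NVREG_IRQMASK_WANTED 0x00dfu

/* Under the assumption above, 0x00df is every bit up to 0x80 except TIMER. */
static_assert(NVREG_IRQMASK_WANTED ==
	      (NVREG_IRQ_RX_ERROR | NVREG_IRQ_RX | NVREG_IRQ_RX_NOBUF |
	       NVREG_IRQ_TX_ERR | NVREG_IRQ_TX_OK | NVREG_IRQ_LINK |
	       NVREG_IRQ_TX_ERROR),
	      "wanted mask does not match the listed bits");

struct irq_name {
	uint32_t bit;
	const char *name;
};

static const struct irq_name irq_names[] = {
	{ NVREG_IRQ_RX_ERROR, "RX_ERROR" }, { NVREG_IRQ_RX, "RX" },
	{ NVREG_IRQ_RX_NOBUF, "RX_NOBUF" }, { NVREG_IRQ_TX_ERR, "TX_ERR" },
	{ NVREG_IRQ_TX_OK, "TX_OK" },       { NVREG_IRQ_TIMER, "TIMER" },
	{ NVREG_IRQ_LINK, "LINK" },         { NVREG_IRQ_TX_ERROR, "TX_ERROR" },
	{ NVREG_IRQ_TX1, "TX1" },
};

/* Print the name of every known bit set in 'events'; return the unknown bits. */
static uint32_t decode_irq(uint32_t events, FILE *out)
{
	uint32_t unknown = events;

	for (size_t i = 0; i < sizeof irq_names / sizeof irq_names[0]; i++) {
		if (events & irq_names[i].bit) {
			fprintf(out, "%s ", irq_names[i].name);
			unknown &= ~irq_names[i].bit;
		}
	}
	fputc('\n', out);
	return unknown;
}
```

Calling decode_irq(0x0008, stdout) names only TX_ERR, consistent with the "Probably TX fail" message in the log.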

My hardware:

:00:04.0 Class 0200: 10de:0066 (rev a1)

:00:04.0 Ethernet controller: nVidia Corporation nForce2 Ethernet 
Controller (rev a1)

Subsystem: ABIT Computer Corp.: Unknown device 1c00
Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 17
Memory at e0087000 (32-bit, non-prefetchable) [size=4K]
I/O ports at b000 [size=8]
Capabilities: [44] Power Management version 2

Here's the start of the logs:


Jul 16 20:05:27 dsd forcedeth.c: Reverse Engineered nForce ethernet driver. 
Version 0.38.
Jul 16 20:05:27 dsd ACPI: PCI Interrupt :00:04.0[A] -> Link [APCH] -> GSI 
21 (level, high) -> IRQ 17

Jul 16 20:05:27 dsd PCI: Setting latency timer of device :00:04.0 to 64
Jul 16 20:05:27 dsd :00:04.0: resource 0 start e0087000 len 4096 flags 
0x0200.

Jul 16 20:05:27 dsd :00:04.0: MAC Address 00:50:8d:XX:XX:XX
Jul 16 20:05:27 dsd :00:04.0: link timer on.
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 2 at PHY 1: 0x0.
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 3 at PHY 1: 0x8201.
Jul 16 20:05:27 dsd :00:04.0: open: Found PHY :0020 at address 1.
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 4 at PHY 1: 0x1e1.
Jul 16 20:05:27 dsd eth%d: mii_rw wrote 0xde1 to reg 4 at PHY 1
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 1 at PHY 1: 0x786d.
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3100.
Jul 16 20:05:27 dsd eth%d: mii_rw wrote 0xb100 to reg 0 at PHY 1
Jul 16 20:05:28 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3000.
Jul 16 20:05:28 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3000.
Jul 16 20:05:28 dsd eth%d: mii_rw wrote 0x3200 to reg 0 at PHY 1
Jul 16 20:05:28 dsd eth0: forcedeth.c: subsystem: 0147b:1c00 bound to 
:00:04.0
Jul 16 20:05:28 dsd rc-scripts: Configuration not set for eth0 - assuming dhcp
Jul 16 20:05:28 dsd nv_open: begin
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 0 marked as Available
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 1 marked as Available
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 2 marked as Available

<snip>

Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 125 marked as Available
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 126 marked as Available
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 127 marked as Available
Jul 16 20:05:28 dsd eth0: nv_txrx_reset
Jul 16 20:05:28 dsd startup: got 0x0010.
Jul 16 20:05:28 dsd eth0: mii_rw read from reg 1 at PHY 1: 0x7849.
Jul 16 20:05:28 dsd eth0: mii_rw read from reg 1 at PHY 1: 0x7849.
Jul 16 20:05:28 dsd eth0: no link detected by phy - falling back to 10HD.
Jul 16 20:05:28 dsd eth0: nv_start_rx
Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8.
Jul 16 20:05:28 dsd eth0: nv_start_tx
Jul 16 20:05:28 dsd eth0: no link during initialization.
Jul 16 20:05:28 dsd eth0: nv_stop_rx
Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists.
Jul 16 20:05:28 dsd eth0: nv_start_rx
Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8.
Jul 16 20:05:28 dsd eth0: nv_stop_rx
Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists.
Jul 16 20:05:28 dsd eth0: nv_start_rx
Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8.
Jul 16 20:05:28 dsd eth0: nv_stop_rx
Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists.
Jul 16 20:05:28 dsd eth0: nv_start_rx
Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8.

Let me know if full logs would be useful (they are big, and it just shows a 
lot of interrupts, some packets being 

Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Manfred Spraul

Daniel Drake wrote:


Hi,

Manfred Spraul wrote:

Attached is a patch that modifies the tx interrupt handling of the 
nForce nic. It's part of the attempts to figure out what causes the 
nic hangs (see bug 4552).
The change is experimental: It affects all nForce versions. I've 
tested it on my nForce 250-Gb.



This patch doesn't apply to 2.6.13-rc3:

patching file drivers/net/forcedeth.c
Hunk #1 FAILED at 87.
Hunk #2 FAILED at 100.
Hunk #3 FAILED at 135.
Hunk #4 succeeded at 145 (offset -3 lines).
Hunk #5 succeeded at 295 (offset -3 lines).
Hunk #6 succeeded at 305 (offset -3 lines).
Hunk #7 succeeded at 995 (offset -20 lines).
Hunk #8 succeeded at 1502 (offset -87 lines).
Hunk #9 succeeded at 2112 (offset -133 lines).
Hunk #10 FAILED at 2221.
4 out of 10 hunks FAILED -- saving rejects to file 
drivers/net/forcedeth.c.rej


I think this is because 2.6.13-rc3 has forcedeth 0.35.

I can't find the patch for 0.35 --> 0.36. (Is this when the netdev 
archives were in limbo?)



Either that, or I just forgot to cc netdev.
I've uploaded all recent patches to
http://www.colorfullife.com/~manfred/Linux-kernel/forcedeth/

0.36 is attached.
--
   Manfred
--- 2.6/drivers/net/forcedeth.c 2005-06-28 22:51:26.0 +0200
+++ build-2.6/drivers/net/forcedeth.c   2005-06-28 22:51:40.0 +0200
@@ -85,6 +85,7 @@
  * 0.33: 16 May 2005: Support for MCP51 added.
  * 0.34: 18 Jun 2005: Add DEV_NEED_LINKTIMER to all nForce nics.
  * 0.35: 26 Jun 2005: Support for MCP55 added.
+ * 0.36: 28 Jun 2005: Add jumbo frame support.
  *
  * Known bugs:
  * We suspect that on some hardware no TX done interrupts are generated.
@@ -96,7 +97,7 @@
  * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few
  * superfluous timer interrupts from the nic.
  */
-#define FORCEDETH_VERSION  "0.35"
+#define FORCEDETH_VERSION  "0.36"
 #define DRV_NAME   "forcedeth"
 
 #include <linux/module.h>
@@ -379,9 +380,13 @@
 #define TX_LIMIT_START 62
 
 /* rx/tx mac addr + type + vlan + align + slack*/
-#define RX_NIC_BUFSIZE (ETH_DATA_LEN + 64)
-/* even more slack */
-#define RX_ALLOC_BUFSIZE   (ETH_DATA_LEN + 128)
+#define NV_RX_HEADERS  (64)
+/* even more slack. */
+#define NV_RX_ALLOC_PAD    (64)
+
+/* maximum mtu size */
+#define NV_PKTLIMIT_1  ETH_DATA_LEN    /* hard limit not known */
+#define NV_PKTLIMIT_2  9100    /* Actual limit according to NVidia: 9202 */
 
 #define OOM_REFILL (1+HZ/20)
 #define POLL_WAIT  (1+HZ/100)
@@ -473,6 +478,7 @@
struct sk_buff *rx_skbuff[RX_RING];
dma_addr_t rx_dma[RX_RING];
unsigned int rx_buf_sz;
+   unsigned int pkt_limit;
struct timer_list oom_kick;
struct timer_list nic_poll;
 
@@ -792,7 +798,7 @@
nr = refill_rx % RX_RING;
if (np->rx_skbuff[nr] == NULL) {
 
-   skb = dev_alloc_skb(RX_ALLOC_BUFSIZE);
+   skb = dev_alloc_skb(np->rx_buf_sz + NV_RX_ALLOC_PAD);
if (!skb)
break;
 
@@ -805,7 +811,7 @@
PCI_DMA_FROMDEVICE);
np->rx_ring[nr].PacketBuffer = cpu_to_le32(np->rx_dma[nr]);
wmb();
-   np->rx_ring[nr].FlagLen = cpu_to_le32(RX_NIC_BUFSIZE | NV_RX_AVAIL);
+   np->rx_ring[nr].FlagLen = cpu_to_le32(np->rx_buf_sz | NV_RX_AVAIL);
dprintk(KERN_DEBUG "%s: nv_alloc_rx: Packet %d marked as Available\n",
dev->name, refill_rx);
refill_rx++;
@@ -831,19 +837,31 @@
enable_irq(dev->irq);
 }
 
-static int nv_init_ring(struct net_device *dev)
+static void nv_init_rx(struct net_device *dev) 
 {
struct fe_priv *np = get_nvpriv(dev);
int i;
 
-   np->next_tx = np->nic_tx = 0;
-   for (i = 0; i < TX_RING; i++)
-   np->tx_ring[i].FlagLen = 0;
-
np->cur_rx = RX_RING;
np->refill_rx = 0;
for (i = 0; i < RX_RING; i++)
np->rx_ring[i].FlagLen = 0;
+}
+
+static void nv_init_tx(struct net_device *dev)
+{
+   struct fe_priv *np = get_nvpriv(dev);
+   int i;
+
+   np->next_tx = np->nic_tx = 0;
+   for (i = 0; i < TX_RING; i++)
+   np->tx_ring[i].FlagLen = 0;
+}
+
+static int nv_init_ring(struct net_device *dev)
+{
+   nv_init_tx(dev);
+   nv_init_rx(dev);
return nv_alloc_rx(dev);
 }
 
@@ -1207,15 +1225,82 @@
}
 }
 
+static void set_bufsize(struct net_device *dev)
+{
+   struct fe_priv *np = netdev_priv(dev);
+
+   if (dev->mtu <= ETH_DATA_LEN)
+   np->rx_buf_sz = ETH_DATA_LEN + NV_RX_HEADERS;
+   else
+   np->rx_buf_sz = dev->mtu + NV_RX_HEADERS;
+}
+
 /*
  * nv_change_mtu: dev->change_mtu function
  * Called with dev_base_lock held for read.
  */
 static int nv_change_mtu(struct net_device *dev, int new_mtu)
 {
-   if 
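
The buffer sizing that set_bufsize() introduces in the 0.36 patch above can be tried outside the kernel. This is a minimal userspace sketch: nv_rx_buf_sz() and nv_rx_alloc_sz() are hypothetical stand-ins that take the MTU as a plain argument, and ETH_DATA_LEN is assumed to be 1500 as in the kernel headers.

```c
#include <assert.h>

#define ETH_DATA_LEN     1500  /* assumed: standard Ethernet payload size */
#define NV_RX_HEADERS    64    /* rx/tx mac addr + type + vlan + align + slack */
#define NV_RX_ALLOC_PAD  64    /* extra slack added when allocating the skb */

/* At the default MTU the new sizes equal the old RX_ALLOC_BUFSIZE (ETH_DATA_LEN + 128). */
static_assert(ETH_DATA_LEN + NV_RX_HEADERS + NV_RX_ALLOC_PAD == ETH_DATA_LEN + 128,
	      "default allocation size changed");

/* Size written into the rx descriptor: never smaller than a standard frame. */
static unsigned int nv_rx_buf_sz(unsigned int mtu)
{
	if (mtu <= ETH_DATA_LEN)
		return ETH_DATA_LEN + NV_RX_HEADERS;
	return mtu + NV_RX_HEADERS;
}

/* Size passed to the allocator: descriptor size plus padding. */
static unsigned int nv_rx_alloc_sz(unsigned int mtu)
{
	return nv_rx_buf_sz(mtu) + NV_RX_ALLOC_PAD;
}
```

For the NV_PKTLIMIT_2 value of 9100 this gives a 9164-byte descriptor size, while any MTU at or below 1500 keeps the original 1564 bytes.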

Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Daniel Drake

Hi,

Manfred Spraul wrote:
Attached is a patch that modifies the tx interrupt handling of the 
nForce nic. It's part of the attempts to figure out what causes the nic 
hangs (see bug 4552).
The change is experimental: It affects all nForce versions. I've tested 
it on my nForce 250-Gb.


This patch doesn't apply to 2.6.13-rc3:

patching file drivers/net/forcedeth.c
Hunk #1 FAILED at 87.
Hunk #2 FAILED at 100.
Hunk #3 FAILED at 135.
Hunk #4 succeeded at 145 (offset -3 lines).
Hunk #5 succeeded at 295 (offset -3 lines).
Hunk #6 succeeded at 305 (offset -3 lines).
Hunk #7 succeeded at 995 (offset -20 lines).
Hunk #8 succeeded at 1502 (offset -87 lines).
Hunk #9 succeeded at 2112 (offset -133 lines).
Hunk #10 FAILED at 2221.
4 out of 10 hunks FAILED -- saving rejects to file drivers/net/forcedeth.c.rej

I think this is because 2.6.13-rc3 has forcedeth 0.35.

I can't find the patch for 0.35 --> 0.36. (Is this when the netdev archives 
were in limbo?)


I found the patch for 0.36 --> 0.37 here : 
http://marc.theaimsgroup.com/?l=linux-netdev&m=112101962422678&w=2


Are the earlier changes a prerequisite, or can I just fix the TX handler 
rejects manually?


Thanks,
Daniel


[PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Manfred Spraul

[If you receive the mail twice - sorry. I forgot to attach the actual patch]
Hi,

Attached is a patch that modifies the tx interrupt handling of the 
nForce nic. It's part of the attempts to figure out what causes the nic 
hangs (see bug 4552).
The change is experimental: It affects all nForce versions. I've tested 
it on my nForce 250-Gb.


Please test it. And especially: If you experience a nic hang, please send 
me the debug output. That's the block starting with


<<
NETDEV WATCHDOG: eth1: transmit timed out
eth1: Got tx_timeout. irq: 
eth1: Ring at  ...
<<

Thanks,
   Manfred
--- 2.6/drivers/net/forcedeth.c 2005-07-16 13:10:30.0 +0200
+++ build-2.6/drivers/net/forcedeth.c   2005-07-16 15:58:03.0 +0200
@@ -87,6 +87,8 @@
  * 0.35: 26 Jun 2005: Support for MCP55 added.
  * 0.36: 28 Jun 2005: Add jumbo frame support.
  * 0.37: 10 Jul 2005: Additional ethtool support, cleanup of pci id list
+ * 0.38: 16 Jul 2005: tx irq rewrite: Use global flags instead of
+ *                    per-packet flags.
  *
  * Known bugs:
  * We suspect that on some hardware no TX done interrupts are generated.
@@ -98,7 +100,7 @@
  * DEV_NEED_TIMERIRQ will not harm you on sane hardware, only generating a few
  * superfluous timer interrupts from the nic.
  */
-#define FORCEDETH_VERSION  "0.36"
+#define FORCEDETH_VERSION  "0.38"
 #define DRV_NAME   "forcedeth"
 
 #include <linux/module.h>
@@ -133,12 +135,9 @@
  * Hardware access:
  */
 
-#define DEV_NEED_LASTPACKET1   0x0001  /* set LASTPACKET1 in tx flags */
-#define DEV_IRQMASK_1  0x0002  /* use NVREG_IRQMASK_WANTED_1 for irq mask */
-#define DEV_IRQMASK_2  0x0004  /* use NVREG_IRQMASK_WANTED_2 for irq mask */
-#define DEV_NEED_TIMERIRQ  0x0008  /* set the timer irq flag in the irq mask */
-#define DEV_NEED_LINKTIMER 0x0010  /* poll link settings. Relies on the timer irq */
-#define DEV_HAS_LARGEDESC  0x0020  /* device supports jumbo frames and needs packet format 2 */
+#define DEV_NEED_TIMERIRQ  0x0001  /* set the timer irq flag in the irq mask */
+#define DEV_NEED_LINKTIMER 0x0002  /* poll link settings. Relies on the timer irq */
+#define DEV_HAS_LARGEDESC  0x0003  /* device supports jumbo frames and needs packet format 2 */
 
 enum {
NvRegIrqStatus = 0x000,
@@ -149,13 +148,16 @@
 #define NVREG_IRQ_RX   0x0002
 #define NVREG_IRQ_RX_NOBUF 0x0004
 #define NVREG_IRQ_TX_ERR   0x0008
-#define NVREG_IRQ_TX2  0x0010
+#define NVREG_IRQ_TX_OK    0x0010
 #define NVREG_IRQ_TIMER    0x0020
 #define NVREG_IRQ_LINK 0x0040
+#define NVREG_IRQ_TX_ERROR 0x0080
 #define NVREG_IRQ_TX1  0x0100
-#define NVREG_IRQMASK_WANTED_1 0x005f
-#define NVREG_IRQMASK_WANTED_2 0x0147
-#define NVREG_IRQ_UNKNOWN  (~(NVREG_IRQ_RX_ERROR|NVREG_IRQ_RX|NVREG_IRQ_RX_NOBUF|NVREG_IRQ_TX_ERR|NVREG_IRQ_TX2|NVREG_IRQ_TIMER|NVREG_IRQ_LINK|NVREG_IRQ_TX1))
+#define NVREG_IRQMASK_WANTED   0x00df
+
+#define NVREG_IRQ_UNKNOWN  (~(NVREG_IRQ_RX_ERROR|NVREG_IRQ_RX|NVREG_IRQ_RX_NOBUF|NVREG_IRQ_TX_ERR| \
+   NVREG_IRQ_TX_OK|NVREG_IRQ_TIMER|NVREG_IRQ_LINK|NVREG_IRQ_TX_ERROR| \
+   NVREG_IRQ_TX1))
 
NvRegUnknownSetupReg6 = 0x008,
 #define NVREG_UNKSETUP6_VAL    3
@@ -296,7 +298,7 @@
 
 #define NV_TX_LASTPACKET   (1<<16)
 #define NV_TX_RETRYERROR   (1<<19)
-#define NV_TX_LASTPACKET1  (1<<24)
+#define NV_TX_FORCED_INTERRUPT (1<<24)
 #define NV_TX_DEFERRED (1<<26)
 #define NV_TX_CARRIERLOST  (1<<27)
 #define NV_TX_LATECOLLISION    (1<<28)
@@ -306,7 +308,7 @@
 
 #define NV_TX2_LASTPACKET  (1<<29)
 #define NV_TX2_RETRYERROR  (1<<18)
-#define NV_TX2_LASTPACKET1 (1<<23)
+#define NV_TX2_FORCED_INTERRUPT    (1<<30)
 #define NV_TX2_DEFERRED    (1<<25)
 #define NV_TX2_CARRIERLOST (1<<26)
 #define NV_TX2_LATECOLLISION   (1<<27)
@@ -1013,9 +1015,39 @@
struct fe_priv *np = get_nvpriv(dev);
u8 __iomem *base = get_hwbase(dev);
 
-   dprintk(KERN_DEBUG "%s: Got tx_timeout. irq: %08x\n", dev->name,
+   printk(KERN_INFO "%s: Got tx_timeout. irq: %08x\n", dev->name,
readl(base + NvRegIrqStatus) & NVREG_IRQSTAT_MASK);
 
+   {
+   int i;
+
+   printk(KERN_INFO "%s: Ring at %lx: next %d nic %d\n",
+   dev->name, (unsigned long)np->ring_addr,
+   np->next_tx, np->nic_tx);
+   printk(KERN_INFO "%s: Dumping tx registers\n", dev->name);
+   for (i=0;i<0x400;i+= 32) {
+   printk(KERN_INFO "%3x: %08x %08x %08x %08x %08x %08x %08x %08x\n",
+   i,
+   readl(base + i + 0), 
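
The register dump added to the timeout handler is cut off in this copy of the patch. The sketch below only illustrates the output layout implied by the format string (eight 32-bit words per line, 0x400 bytes in total). It is not the driver code: mock_readl() and dump_regs() are hypothetical, the registers are a plain array instead of MMIO, and it assumes the remaining readl() arguments continue at offsets 8 through 28.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define REG_WINDOW_BYTES 0x400u  /* loop bound used in the patch */
#define REGS_PER_LINE    8u      /* eight %08x fields in the format string */

/* Hypothetical stand-in for readl(base + off): reads from an in-memory array. */
static uint32_t mock_readl(const uint32_t *regs, unsigned int byte_off)
{
	assert(byte_off % 4 == 0 && byte_off < REG_WINDOW_BYTES);
	return regs[byte_off / 4];
}

/* Dump the whole window, 32 bytes per line; returns the number of lines written. */
static unsigned int dump_regs(const uint32_t *regs, FILE *out)
{
	unsigned int lines = 0;

	for (unsigned int i = 0; i < REG_WINDOW_BYTES; i += 4 * REGS_PER_LINE) {
		fprintf(out, "%3x:", i);
		for (unsigned int j = 0; j < REGS_PER_LINE; j++)
			fprintf(out, " %08x", (unsigned int)mock_readl(regs, i + 4 * j));
		fputc('\n', out);
		lines++;
	}
	return lines;
}
```

A 0x400-byte window printed this way produces 32 lines, each starting with the hexadecimal byte offset of its first word.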

[PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Manfred Spraul

Hi,

Attached is a patch that modifies the tx interrupt handling of the 
nForce nic. It's part of the attempts to figure out what causes the nic 
hangs (see bug 4552).
The change is experimental: It affects all nForce versions. I've tested 
it on my nForce 250-Gb.


Please test it. And especially: If you experience a nic hang, please send 
me the debug output. That's the block starting with


<<
NETDEV WATCHDOG: eth1: transmit timed out
eth1: Got tx_timeout. irq: 
eth1: Ring at  ...
<<

Thanks,
   Manfred


Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Daniel Drake

Hi,

Manfred Spraul wrote:
Attached is a patch that modifies the tx interrupt handling of the 
nForce nic. It's part of the attempts to figure out what causes the nic 
hangs (see bug 4552).
The change is experimental: It affects all nForce versions. I've tested 
it on my nForce 250-Gb.


Please test it. And especially: If you experince a nic hang, please send 
me the debug output. That's the block starting with



NETDEV WATCHDOG: eth1: transmit timed out
eth1: Got tx_timeout. irq: 
eth1: Ring at  ...



After applying the v0.38 patch, I can't get any network at all. DHCP fails to 
get an IP. v0.37 works fine.


I enabled debugging, and I get this failure for every packet being 
transmitted: ( i masked out part of my MAC addr with XX )


Jul 16 20:06:28 dsd eth0: nv_start_xmit: packet packet 3 queued for 
transmission.
Jul 16 20:06:28 dsd
Jul 16 20:06:28 dsd 000: ff ff ff ff ff ff 00 50 8d XX XX XX 08 00 45 00
Jul 16 20:06:28 dsd 010: 02 40 75 a0 00 00 40 11 03 0e 00 00 00 00 ff ff
Jul 16 20:06:28 dsd 020: ff ff 00 44 00 43 02 2c 13 0a 01 01 06 00 d2 76
Jul 16 20:06:28 dsd 030: bc 10 00 0a 00 00 00 00 00 00 00 00 00 00 00 00
Jul 16 20:06:28 dsd eth0: nv_nic_irq
Jul 16 20:06:28 dsd eth0: irq: 0008
Jul 16 20:06:28 dsd eth0: nv_tx_done: looking at packet 3, Flags 0x624d.
Jul 16 20:06:28 dsd eth0: received irq with events 0x8. Probably TX fail.
Jul 16 20:06:28 dsd eth0: irq: 
Jul 16 20:06:28 dsd eth0: nv_nic_irq completed

My hardware:

:00:04.0 Class 0200: 10de:0066 (rev a1)

:00:04.0 Ethernet controller: nVidia Corporation nForce2 Ethernet 
Controller (rev a1)

Subsystem: ABIT Computer Corp.: Unknown device 1c00
Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 17
Memory at e0087000 (32-bit, non-prefetchable) [size=4K]
I/O ports at b000 [size=8]
Capabilities: [44] Power Management version 2

Here's the start of the logs:


Jul 16 20:05:27 dsd forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.38.
Jul 16 20:05:27 dsd ACPI: PCI Interrupt :00:04.0[A] -> Link [APCH] -> GSI 21 (level, high) -> IRQ 17

Jul 16 20:05:27 dsd PCI: Setting latency timer of device :00:04.0 to 64
Jul 16 20:05:27 dsd :00:04.0: resource 0 start e0087000 len 4096 flags 0x0200.

Jul 16 20:05:27 dsd :00:04.0: MAC Address 00:50:8d:XX:XX:XX
Jul 16 20:05:27 dsd :00:04.0: link timer on.
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 2 at PHY 1: 0x0.
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 3 at PHY 1: 0x8201.
Jul 16 20:05:27 dsd :00:04.0: open: Found PHY :0020 at address 1.
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 4 at PHY 1: 0x1e1.
Jul 16 20:05:27 dsd eth%d: mii_rw wrote 0xde1 to reg 4 at PHY 1
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 1 at PHY 1: 0x786d.
Jul 16 20:05:27 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3100.
Jul 16 20:05:27 dsd eth%d: mii_rw wrote 0xb100 to reg 0 at PHY 1
Jul 16 20:05:28 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3000.
Jul 16 20:05:28 dsd eth%d: mii_rw read from reg 0 at PHY 1: 0x3000.
Jul 16 20:05:28 dsd eth%d: mii_rw wrote 0x3200 to reg 0 at PHY 1
Jul 16 20:05:28 dsd eth0: forcedeth.c: subsystem: 0147b:1c00 bound to :00:04.0
Jul 16 20:05:28 dsd rc-scripts: Configuration not set for eth0 - assuming dhcp
Jul 16 20:05:28 dsd nv_open: begin
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 0 marked as Available
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 1 marked as Available
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 2 marked as Available

[snip]

Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 125 marked as Available
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 126 marked as Available
Jul 16 20:05:28 dsd eth0: nv_alloc_rx: Packet 127 marked as Available
Jul 16 20:05:28 dsd eth0: nv_txrx_reset
Jul 16 20:05:28 dsd startup: got 0x0010.
Jul 16 20:05:28 dsd eth0: mii_rw read from reg 1 at PHY 1: 0x7849.
Jul 16 20:05:28 dsd eth0: mii_rw read from reg 1 at PHY 1: 0x7849.
Jul 16 20:05:28 dsd eth0: no link detected by phy - falling back to 10HD.
Jul 16 20:05:28 dsd eth0: nv_start_rx
Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8.
Jul 16 20:05:28 dsd eth0: nv_start_tx
Jul 16 20:05:28 dsd eth0: no link during initialization.
Jul 16 20:05:28 dsd eth0: nv_stop_rx
Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists.
Jul 16 20:05:28 dsd eth0: nv_start_rx
Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8.
Jul 16 20:05:28 dsd eth0: nv_stop_rx
Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists.
Jul 16 20:05:28 dsd eth0: nv_start_rx
Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8.
Jul 16 20:05:28 dsd eth0: nv_stop_rx
Jul 16 20:05:28 dsd eth0: reconfiguration for multicast lists.
Jul 16 20:05:28 dsd eth0: nv_start_rx
Jul 16 20:05:28 dsd eth0: nv_start_rx to duplex 0, speed 0x000103e8.
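
[Editorial note: the "mii_rw read from reg 1" values in the log are readings of the standard MII Basic Mode Status Register. A minimal sketch of how the two values seen above decode follows. The bit masks are the standard MII ones; the helper names are hypothetical and this is not the driver's own code.]

```c
#include <assert.h>
#include <stdint.h>

/* Standard MII register 1 (BMSR) bits. */
#define BMSR_LSTATUS		0x0004	/* link status */
#define BMSR_ANEGCOMPLETE	0x0020	/* auto-negotiation complete */

/* Return 1 if the BMSR value reports an established link. */
static int bmsr_link_up(uint16_t bmsr)
{
	return (bmsr & BMSR_LSTATUS) != 0;
}

/* Return 1 if the BMSR value reports finished auto-negotiation. */
static int bmsr_aneg_done(uint16_t bmsr)
{
	return (bmsr & BMSR_ANEGCOMPLETE) != 0;
}

/* Check the two readings from the log. */
static void bmsr_log_check(void)
{
	/* 0x786d, read at probe time: link up, negotiation complete. */
	assert(bmsr_link_up(0x786d) && bmsr_aneg_done(0x786d));
	/* 0x7849, read in nv_open: no link, negotiation not complete. */
	assert(!bmsr_link_up(0x7849) && !bmsr_aneg_done(0x7849));
}
```

[So 0x7849 matches the "no link detected by phy - falling back to 10HD" message. The link status bit latches low, which is one likely reason the register is read twice in a row.]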

Let me know if full logs would be useful (they are big, and it just shows a 
lot of interrupts, some packets being 

Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Daniel Drake

Daniel Drake wrote:
After applying the v0.38 patch, I can't get any network at all. DHCP 
fails to get an IP. v0.37 works fine.


Tracked it down. (sorry for linewraps)

+#define DEV_NEED_TIMERIRQ  0x0001  /* set the timer irq flag in the irq 
mask */
+#define DEV_NEED_LINKTIMER	0x0002	/* poll link settings. Relies on the timer 
irq */
+#define DEV_HAS_LARGEDESC	0x0003	/* device supports jumbo frames and needs 
packet format 2 */


My hardware is DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER (0x0001|0x0002 = 0x0003). 
However, by this logic, it'll also match DEV_HAS_LARGEDESC, which isn't true.


So, you want this instead:

#define DEV_HAS_LARGEDESC   0x0004

After making that change, all is working fine, but then again, I've never run 
into the hangs you are debugging. I'll follow up in a couple of days' time to 
confirm I'm not getting any problems with the new code.
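
[Editorial note: a minimal sketch of the overlap described above, assuming the driver tests feature flags with a plain bitwise AND. The _V038 and _FIXED suffixes and the has_flag helper are hypothetical names used only for illustration.]

```c
#include <assert.h>

#define DEV_NEED_TIMERIRQ		0x0001
#define DEV_NEED_LINKTIMER		0x0002
#define DEV_HAS_LARGEDESC_V038		0x0003	/* value posted in the v0.38 patch */
#define DEV_HAS_LARGEDESC_FIXED		0x0004	/* value proposed in this mail */

/* Assumed style of feature test: non-zero AND means the flag is set. */
static int has_flag(unsigned int flags, unsigned int flag)
{
	return (flags & flag) != 0;
}

/* Show the false positive and that the corrected value avoids it. */
static void flag_overlap_check(void)
{
	unsigned int flags = DEV_NEED_TIMERIRQ | DEV_NEED_LINKTIMER;

	/* 0x0003 shares both bits with the two NEED flags. */
	assert(has_flag(flags, DEV_HAS_LARGEDESC_V038));
	/* 0x0004 is its own bit, so the test no longer fires. */
	assert(!has_flag(flags, DEV_HAS_LARGEDESC_FIXED));
}
```

[Each flag needs its own power of two (0x0001, 0x0002, 0x0004, ...) so that no combination of other flags can look like it.]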


Thanks,
Daniel


Re: [PATCH] forcedeth: TX handler changes (experimental)

2005-07-16 Thread Manfred Spraul

Daniel Drake wrote:


So, you want this instead:

#define DEV_HAS_LARGEDESC   0x0004


Autsch.
Yes, you are right. Sorry for that, I should have reread the patch once 
more. I've fixed it on my website.


--
   Manfred