Re: Re: Strange connection slowdown on pcnet32
On Mon, Feb 19, 2007 at 06:59:16PM -0500, Lennart Sorensen wrote:
> I am also noticing the receive error count going up, and the source is
> this code:
>
>     if (status & 0x01)          /* Only count a general error at the */
>         lp->stats.rx_errors++;  /* end of a packet. */
>
> It appears this means I am receiving a frame marked with "End Of Packet"
> but without "Start of Packet".  I have no idea how that happens, but it
> shouldn't be able to make the driver and MAC stop processing the receive
> ring.

Well, the packets actually have both start and end marked, but also have
overflow marked, so it seems the CPU simply isn't keeping up (it is taking
about 100% of the CPU to push through 6500KB/s).  Certainly
CONFIG_X86_OOSTORE makes a major difference, although I am still not sure
why.

Simply skipping ahead one or two receive descriptors when the current one
is marked as owned by the MAC but the one a few ahead is owned by the CPU
allows it to continue receiving when it happens.  I really want to find
out why it happens though, although I am not sure how to go about doing
that.

--
Len Sorensen
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
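The skip-ahead workaround described in this message can be sketched in
isolation. This is not the pcnet32 change itself, only a user-space
illustration under stated assumptions: `struct rx_desc`,
`rx_find_skip_target()` and the `max_skip` limit are hypothetical names,
the ring size is assumed to be a power of two (as the driver's
`rx_mod_mask` suggests), and 0x8000 is taken to be the MAC ownership bit,
as in the log output.

```c
#include <assert.h>
#include <stdint.h>

#define RX_OWN 0x8000u  /* assumed: descriptor is owned by the MAC */

/* Hypothetical, simplified receive descriptor: only the status word. */
struct rx_desc {
    uint16_t status;
};

/*
 * Return the index of the descriptor the receive loop should look at.
 * If the head is CPU-owned, that is the head itself. If the head is
 * still MAC-owned, look at most max_skip entries ahead for one that the
 * CPU owns. Returns -1 when nothing within reach is ready.
 * ring_size must be a power of two so that masking wraps the index.
 */
int rx_find_skip_target(const struct rx_desc *ring, unsigned ring_size,
                        unsigned head, unsigned max_skip)
{
    unsigned mask = ring_size - 1;

    assert(ring_size != 0 && (ring_size & mask) == 0);

    if (!(ring[head & mask].status & RX_OWN))
        return (int)(head & mask);      /* head is ready, nothing to skip */

    for (unsigned i = 1; i <= max_skip; i++) {
        unsigned idx = (head + i) & mask;

        if (!(ring[idx].status & RX_OWN))
            return (int)idx;            /* ready entry behind a stuck head */
    }
    return -1;                          /* ring looks genuinely empty */
}
```

With entry 0 owned by the MAC and entries 1 and 2 owned by the CPU,
`rx_find_skip_target(ring, 32, 0, 2)` returns 1. A skip like this only
hides the symptom; it does not explain why the head descriptor still
reads as MAC-owned.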
Re: Re: Strange connection slowdown on pcnet32
On Mon, Feb 19, 2007 at 06:45:48PM -0500, Lennart Sorensen wrote:
> It seems the problem actually occurs when the receive descriptor ring
> is full.  This seems to generate one (or sometimes more) descriptors in
> the ring which claim to be owned by the MAC, but at the head of the
> receive ring as far as the driver is concerned.  I see some note in the
> driver about an SP3G chipset sometimes causing this.  How would one
> identify this and clear such descriptors out of the way?  Getting stuck
> until the next time the MAC gets around to the descriptor and overwrites
> it is not good, since it causes delays, and out of order packets.

I am also noticing the receive error count going up, and the source is
this code:

    if (status & 0x01)          /* Only count a general error at the */
        lp->stats.rx_errors++;  /* end of a packet. */

It appears this means I am receiving a frame marked with "End Of Packet"
but without "Start of Packet".  I have no idea how that happens, but it
shouldn't be able to make the driver and MAC stop processing the receive
ring.

--
Len Sorensen
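For reference, a small decoder for the high byte of the receive status
word. The bit names are an assumption, going by the 2.6-era driver source
(which appears to test `status >> 8` against 0x03, i.e. start and end of
packet both set) and the usual LANCE/PCnet descriptor layout; check them
against the chip's data sheet before relying on them.
`struct rx_error_counts` and `rx_account_status()` are made-up names that
mirror the counters in `lp->stats`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit names for the 16-bit receive status word. */
enum {
    RX_OWN  = 0x8000,   /* descriptor owned by the MAC */
    RX_ERR  = 0x4000,   /* summary error bit */
    RX_FRAM = 0x2000,   /* framing error */
    RX_OFLO = 0x1000,   /* FIFO overflow */
    RX_CRC  = 0x0800,   /* CRC error */
    RX_BUFF = 0x0400,   /* buffer error */
    RX_STP  = 0x0200,   /* start of packet */
    RX_ENP  = 0x0100    /* end of packet */
};

/* Hypothetical stand-in for the error counters kept in lp->stats. */
struct rx_error_counts {
    unsigned rx_errors;
    unsigned rx_frame_errors;
    unsigned rx_over_errors;
    unsigned rx_crc_errors;
    unsigned rx_fifo_errors;
};

/*
 * Account for one CPU-owned descriptor. Returns true for a clean frame
 * (high byte is exactly STP|ENP); otherwise bumps the matching counters
 * and returns false.
 */
bool rx_account_status(uint16_t status, struct rx_error_counts *c)
{
    unsigned hi = status >> 8;

    assert(c != NULL);

    if (hi == ((RX_STP | RX_ENP) >> 8))
        return true;

    if (hi & (RX_ENP >> 8))     /* the "status & 0x01" test quoted above */
        c->rx_errors++;
    if (hi & (RX_FRAM >> 8))
        c->rx_frame_errors++;
    if (hi & (RX_OFLO >> 8))
        c->rx_over_errors++;
    if (hi & (RX_CRC >> 8))
        c->rx_crc_errors++;
    if (hi & (RX_BUFF >> 8))
        c->rx_fifo_errors++;
    return false;
}
```

Under these assumptions the `& 0x01` test only says that End Of Packet is
set on a frame that already failed the clean-frame check; it does not by
itself say that Start of Packet is missing. A frame with STP, ENP and
OFLO all set would increment both rx_errors and rx_over_errors.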
Re: Re: Strange connection slowdown on pcnet32
On Mon, Feb 19, 2007 at 05:29:20PM -0500, Lennart Sorensen wrote:
> I just noticed, it seems almost all these problems occur right at the
> start of transfers when the tcp window size is still being worked out
> for the connection speed, and I am seeing the error count go up in
> ifconfig for the port when it happens too.  Is it possible for an error
> to get flagged in a receive descriptor without the owner bit being
> updated?

It seems the problem actually occurs when the receive descriptor ring is
full.  This seems to generate one (or sometimes more) descriptors in the
ring which claim to be owned by the MAC, but at the head of the receive
ring as far as the driver is concerned.  I see some note in the driver
about an SP3G chipset sometimes causing this.  How would one identify
this and clear such descriptors out of the way?  Getting stuck until the
next time the MAC gets around to the descriptor and overwrites it is not
good, since it causes delays, and out of order packets.

--
Len Sorensen
Re: Re: Strange connection slowdown on pcnet32
On Mon, Feb 19, 2007 at 03:11:36PM -0500, Lennart Sorensen wrote:
> I have been poking at things with firescope to see if the MAC is
> actually writing to system memory or not.
>
> The entry that it gets stuck on is _always_ entry 0 in the rx_ring.
> There do not appear to be any exceptions to this.
>
> Here is my firescope (slightly modified for this purpose) dump of the
> rx_ring of eth1:
>
> Descriptor:Address: /--base---\ /buf\ /sta\ /-message-\ /reserved-\
>           :       : |         | |len| |tus| |  length | |         |
> RXdesc[00]:6694000: 12 18 5f 05 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[01]:6694010: 12 78 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[02]:6694020: 12 a0 52 06 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[03]:6694030: 12 f8 c2 04 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[04]:6694040: 12 70 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[05]:6694050: 12 e8 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[06]:6694060: 12 e0 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[07]:6694070: 12 e8 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[08]:6694080: 12 e0 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[09]:6694090: 12 d8 d1 05 fa f9 40 03 46 00 00 00 00 00 00 00
> RXdesc[10]:66940a0: 12 d0 d1 05 fa f9 40 03 4e 00 00 00 00 00 00 00
> RXdesc[11]:66940b0: 12 d8 02 05 fa f9 10 03 40 00 00 00 00 00 00 00
> RXdesc[12]:66940c0: 12 d0 02 05 fa f9 40 03 46 00 00 00 00 00 00 00
> RXdesc[13]:66940d0: 12 38 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[14]:66940e0: 12 30 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[15]:66940f0: 12 78 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[16]:6694100: 12 a0 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[17]:6694110: 12 b0 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[18]:6694120: 12 b8 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[19]:6694130: 12 70 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[20]:6694140: 12 f8 56 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[21]:6694150: 12 c8 29 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[22]:6694160: 12 20 03 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[23]:6694170: 12 60 4c 05 fa f9 00 80 87 05 00 00 00 00 00 00
> RXdesc[24]:6694180: 12 98 53 05 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[25]:6694190: 12 b0 cc 04 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[26]:66941a0: 12 a8 3f 05 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[27]:66941b0: 12 58 e8 04 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[28]:66941c0: 12 b0 4d 06 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[29]:66941d0: 12 38 ef 04 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[30]:66941e0: 12 98 1f 05 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[31]:66941f0: 12 28 f1 04 fa f9 00 80 40 00 00 00 00 00 00 00
>
> I only ever see entry 0 as status 0080 (0x8000 which is owned by mac),
> and this is while the driver is checking entry 0 every time it tries to
> check for any waiting packets.
>
> Running tcpdump while pinging gives the interesting result that some
> packets are arriving out of order, making it seem like the driver is
> processing the packets out of order.  Perhaps the driver is wrong to be
> looking at entry 0, and should be looking at entry 1 and is hence stuck
> until the whole receive ring has been filled again?
>
> 15:06:04.112812 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 1
> 15:06:05.119799 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 2
> 15:06:05.120159 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 2
> 15:06:05.127045 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 1
> 15:06:06.119862 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 3
> 15:06:07.119921 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 4
> 15:06:08.119994 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 5
> 15:06:08.426400 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 3
> 15:06:08.427915 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 4
> 15:06:08.429033 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 5
> 15:06:09.120053 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 6
> 15:06:10.120109 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 7
> 15:06:10.705332 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 6
> 15:06:10.707258 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 7
> 15:06:11.120175 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 8
> 15:06:12.120233 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 9
> 15:06:13.120297 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 10
> 15:06:14.120359 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 11
> 15:06:14.120737 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 11
> 15:06:14.127064 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 8
> 15:06:14.127700 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 9
> 15:06:14.128268 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo
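To make the dump easier to read, here is a sketch that decodes one
16-byte row into its fields. It follows the column header in the dump
(4-byte base, 2-byte buffer length, 2-byte status, 4-byte message length,
4 reserved bytes) and assumes little-endian byte order, which is how the
bytes "00 80" become 0x8000. The names `struct rx_desc_fields`,
`get_le16()`, `get_le32()` and `rx_desc_decode()` are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one receive descriptor row from the dump. */
struct rx_desc_fields {
    uint32_t base;        /* buffer address */
    int      buf_length;  /* signed: stored as a negative byte count */
    uint16_t status;      /* 0x8000 set means owned by the MAC */
    uint32_t msg_length;  /* length field as stored in the descriptor */
};

/* Read a little-endian 16-bit value from two bytes. */
uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a little-endian 32-bit value from four bytes. */
uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Split one raw 16-byte descriptor into its fields. */
struct rx_desc_fields rx_desc_decode(const uint8_t raw[16])
{
    struct rx_desc_fields f;
    uint16_t len;

    assert(raw != NULL);

    len = get_le16(raw + 4);
    f.base = get_le32(raw);
    /* Sign-extend the 16-bit two's complement length by hand. */
    f.buf_length = len >= 0x8000 ? (int)len - 0x10000 : (int)len;
    f.status = get_le16(raw + 6);
    f.msg_length = get_le32(raw + 8);
    return f;
}
```

Decoding RXdesc[00] this way gives base 0x055f1812, buffer length -1542,
status 0x8000 and message length 64. The value -1542 is what
`2 - PKT_BUF_SZ` would yield if PKT_BUF_SZ is 1544.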
Re: Re: Strange connection slowdown on pcnet32
On Fri, Feb 16, 2007 at 04:01:57PM -0500, Lennart Sorensen wrote:
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: pcnet32_poll: pcnet32_rx() got 16 packets
> eth1: base: 0x05215812 status: 0310 next->status: 0310
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: netif_receive_skb(skb)
> eth1: pcnet32_poll: pcnet32_rx() got 16 packets
> eth1: base: 0x04c51812 status: 8000 next->status: 0310
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x6f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0310
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0310
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0310
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
>
> So somehow it ends up that when it reads the status of the descriptor at
> address 0x04c51812, it sees the status as 0x8000 (which means owned by
> the MAC I believe), even though the next descriptor in the ring has a
> sensible status, indicating that the descriptor is ready to be handled
> by the driver.  Since the descriptor isn't ready, we exit without
> handling anything and NAPI reschedules us the next time we get an
> interrupt, and after some random number of tries, we finally see the
> right status and handle the packet, along with a bunch of other packets
> waiting in the descriptor ring.  Then we seem to hit the exact same
> descriptor address again, with the same problem in the status we read,
> and again we are stuck for a while, until finally we see the right
> status, and another pile of packets get handled, and we again hit the
> same descriptor address and get stuck.

I have been poking at things with firescope to see if the MAC is
actually writing to system memory or not.

The entry that it gets stuck on is _always_ entry 0 in the rx_ring.
There do not appear to be any exceptions to this.

Here is my firescope (slightly modified for this purpose) dump of the
rx_ring of eth1:

Descriptor:Address: /--base---\ /buf\ /sta\ /-message-\ /reserved-\
          :       : |         | |len| |tus| |  length | |         |
RXdesc[00]:6694000: 12 18 5f 05 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[01]:6694010: 12 78 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[02]:6694020: 12 a0 52 06 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[03]:6694030: 12 f8 c2 04 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[04]:6694040: 12 70 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[05]:6694050: 12 e8 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[06]:6694060: 12 e0 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[07]:6694070: 12 e8 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[08]:6694080: 12 e0 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
RXdesc[09]:6694090: 12 d8 d1 05 fa f9 40 03 46 00 00 00 00 00 00 00
RXdesc[10]:66940a0: 12 d0 d1 05 fa f9 40 03 4e 00 00 00 00 00 00 00
RXdesc[11]:66940b0: 12 d8 02 05 fa f9 10 03 40 00 00 00 00 00 00 00
RXdesc[12]:66940c0: 12 d0 02 05 fa f9 40 03 46 00 00 00 00 00 00 00
RXdesc[13]:66940d0: 12 38 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[14]:66940e0: 12 30 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[15]:66940f0: 12 78 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[16]:6694100: 12 a0 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[17]:6694110: 12 b0 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[18]:6694120: 12 b8 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[19]:6694130: 12 70 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[20]:6694140: 12 f8 56 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[21]:6694150: 12 c8 29 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[22]:6694160: 12 20 03 05 fa f9 00 80 ee 05 00 00 00 00 00 00
RXdesc[23]:6694170: 12 60 4c 05 fa f9 00 80 87 05 00 00 00 00 00 00
RXdesc[24]:6694180: 12 98 53 05 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[25]:6694190: 12 b0 cc 04 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[26]:66941a0: 12 a8 3f 05 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[27]:66941b0: 12 58 e8 04 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[28]:66941c0: 12 b0 4d 06 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[29]:66941d0: 12 38 ef 04 fa f9 00 80 40 00 00 00 00 00 00 00
RXdesc[30]:66941e0: 12 98 1f 05 fa f9 00 80 40 00
Re: MediaGX/GeodeGX1 requires X86_OOSTORE. (Was: Re: Strange connection slowdown on pcnet32)
On Fri, Feb 16, 2007 at 05:48:24PM -0500, Lennart Sorensen wrote:
> Well so far it really looks like enabling OOSTORE on the Geode
> SC1200/GX1 really does make a difference.  A bit of searching seems to
> indicate the person that originally submitted the patch that enabled
> load/store reordering on the MediaGX/Geode thought it might need
> OOSTORE, but was convinced by others it didn't.  Looks like it really
> does need it.  The failure that occurred before within a few seconds of
> starting a large transfer no longer occurs, and all I did was enable
> CONFIG_X86_OOSTORE, and recompile pcnet32.ko and load the new module on
> the running system.  Moving back to the pcnet32.ko built without OOSTORE
> enabled hits the failure again within seconds, until ifconfig eth1
> down/up reinitializes its descriptor ring, after which it survives
> another bit of transfer and then fails again.

Well, forcing load/store serialization on the CPU doesn't help, and
disabling memory bypass doesn't help.  Enabling X86_OOSTORE does help.
What a stupid CPU design.

So far nothing has managed to fix the __memcpy_toio in the jsm driver
getting data out of order when sending on an Exar PCI UART chip.  Only
calling memcpy with one byte at a time seems to work there.  It works
fine on every other CPU of course.  What else am I going to discover is
wrong with this CPU?

--
Len Sorensen
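The ordering that matters in the receive path is that a descriptor's
other fields are written before the status word hands it back to the
MAC, and that the status word is read before the rest of the descriptor
is trusted. The sketch below expresses that with portable C11 fences as
a stand-in for the kernel's wmb()/rmb(). It is a user-space
illustration, not kernel code, and C11 fences say nothing about DMA
visibility on real hardware. It appears that i386 kernels of this era
compile wmb() to a plain compiler barrier unless CONFIG_X86_OOSTORE is
set, which would be consistent with the observation above, but treat
that as an assumption. `rx_desc_recycle()` and `rx_desc_ready()` are
hypothetical helpers, and a PKT_BUF_SZ of 1544 is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define PKT_BUF_SZ 1544     /* assumed buffer size */
#define RX_OWN     0x8000u  /* assumed: descriptor is owned by the MAC */

/* Hypothetical, simplified descriptor: just the two fields of interest. */
struct rx_desc {
    uint16_t         buf_length;
    _Atomic uint16_t status;
};

/* Give a processed descriptor back to the MAC. */
void rx_desc_recycle(struct rx_desc *d)
{
    assert(d != NULL);

    d->buf_length = (uint16_t)(2 - PKT_BUF_SZ);
    /* Stand-in for wmb(): the length must be visible before the owner bit. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&d->status, RX_OWN, memory_order_relaxed);
}

/* Return 1 if the CPU owns the descriptor and may read its other fields. */
int rx_desc_ready(struct rx_desc *d)
{
    uint16_t s;

    assert(d != NULL);

    s = atomic_load_explicit(&d->status, memory_order_relaxed);
    if (s & RX_OWN)
        return 0;
    /* Stand-in for rmb(): read the status before the rest of the entry. */
    atomic_thread_fence(memory_order_acquire);
    return 1;
}
```

If the barrier degenerates to a compiler-only barrier on a CPU that
reorders stores, the owner bit could in principle become visible before
the other fields, which is the kind of failure a real store fence is
meant to rule out.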
Re: Re: Strange connection slowdown on pcnet32
On Fri, Feb 16, 2007 at 04:01:57PM -0500, Lennart Sorensen wrote:
> It seems whenever it gets stuck, it is always the same descriptor it is
> stuck on.  Here is my current log:
>
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
> eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> eth1: base: 0x04c51812 status: 8000 next->status: 0340
> eth1: pcnet32_poll: pcnet3
Re: Re: Strange connection slowdown on pcnet32
On Fri, Feb 16, 2007 at 03:23:00PM -0500, Lennart Sorensen wrote: > So I have determined that when the port gets "stuck/slow" it is hitting > this problem: > > (in pcnet32_rx): > while (quota > npackets && (short)le16_to_cpu(rxp->status) >= 0) { > if (netif_msg_intr(lp)) printk(KERN_DEBUG "%s: pcnet32_rx > npackets %d\n", dev->name, npackets); > pcnet32_rx_entry(dev, lp, rxp, entry); > npackets += 1; > /* > * The docs say that the buffer length isn't touched, but > Andrew > * Boyd of QNX reports that some revs of the 79C965 clear it. > */ > rxp->buf_length = le16_to_cpu(2 - PKT_BUF_SZ); > wmb(); /* Make sure owner changes after others are visible */ > rxp->status = le16_to_cpu(0x8000); > entry = (++lp->cur_rx) & lp->rx_mod_mask; > rxp = &lp->rx_ring[entry]; > } > > Unfortunately rxp->status reads as 0x8000 for a long time, and then > eventually changes to 0x0310 at which point the receive happens. Until > that happens, the poll is called about once per second and each time > returns that 0 packets were received but that more packets are waiting. > > I can't figure out why it would get a status of 0x8000 which means that > the MAC hasn't changed the ownership flag on the packet yet, even though > it generated a receive interrupt multiple seconds ago. Could it be some > caching issue that makes the cpu not realize that the memory has in fact > been changed by DMA? Any way to force a cache update for a memory > location? > > The CPU is a Geode SC1200 (Geode GX1 + Companion in one). So far I have > seen __memcpy from system ram to device memory get data out of order, so > I have no reason to believe the cpu doesn't have more stupid bugs > related to doing I/O. It seems whenever it gets stuck, it is always the same descripter it is stuck on. Here is my current log: eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x. eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00. 
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x.
eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
eth1: base: 0x04c51812 status: 8000 next->status: 0340
eth1: pcnet32_poll: pcnet32_rx() got 0 packets
eth1: interrupt
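Every pass in the log above shows the same pattern: the descriptor at the head of the ring reads 8000 (still owned by the MAC) while the next one reads 0340 (already handed back to the CPU). A diagnostic could generalize that two-entry debug print into a short forward scan. The following is only a user-space sketch of such a check, not code from the driver: `RING_SIZE`, `OWN_BIT` and `first_ready_offset()` are hypothetical names, and the ring is modelled as a plain array of 16-bit status words.

```c
#include <stdint.h>

#define RING_SIZE 8u                  /* hypothetical ring size, power of two */
#define RING_MASK (RING_SIZE - 1u)
#define OWN_BIT   0x8000u             /* set while the MAC owns the descriptor */

/*
 * Look at the descriptor at `head` and up to `max_ahead` descriptors after
 * it (wrapping around the ring). Returns the offset of the first entry the
 * CPU owns: 0 means the head itself is ready, a positive value means the
 * head is still MAC-owned although a later entry is ready (the pattern in
 * the log), and -1 means nothing in the window is ready.
 */
static int first_ready_offset(const uint16_t *status, unsigned head,
                              unsigned max_ahead)
{
    for (unsigned i = 0; i <= max_ahead && i < RING_SIZE; i++) {
        if ((status[(head + i) & RING_MASK] & OWN_BIT) == 0)
            return (int)i;
    }
    return -1;
}
```

For the situation logged here (head 8000, next 0340) the function returns 1. A positive result only reports that the ring looks out of order from the CPU's point of view; it does not say whether the MAC, the bus, or the CPU's view of memory is at fault.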
Re: Re: Strange connection slowdown on pcnet32
On Fri, Feb 16, 2007 at 12:21:10PM -0500, Lennart Sorensen wrote:
> On Fri, Feb 16, 2007 at 10:21:24AM -0600, [EMAIL PROTECTED] wrote:
> > Are there any messages in the log about timeouts, or anything else from the
> > driver? When it gets in this state, can you communicate with another
> > system, and does it have the same slow behavior?
>
> Nope no timeouts or messages. As far as the system looks, cpu and ram and
> logs show nothing unusual. Just very slow reception on the ethernet port
> going towards the server providing the data for the transfer. Messages do
> get through eventually, but very very late (when a ping reply arrives at
> the port and takes 5 to 10 seconds to make it to the network stack, then
> something isn't right, at least when there is no other traffic waiting).
>
> I did have NAPI in the driver even in 2.6.8 (I was adding that at the
> time). I am now testing with 2.6.8 without NAPI (so no mask/unmask of
> receive interrupts taking place), and so far it has run for over an hour
> without failing, although that doesn't prove it won't, just that it has
> lasted longer.
>
> I think I will try compiling 2.6.18 again with NAPI disabled on the
> pcnet32 and see what that does. There is a chance that something in the
> NAPI implementation is breaking the chip's receive somehow although I
> can't currently imagine what it could be or how.

So I have determined that when the port gets "stuck/slow" it is hitting
this problem:

(in pcnet32_rx):
    while (quota > npackets && (short)le16_to_cpu(rxp->status) >= 0) {
        if (netif_msg_intr(lp))
            printk(KERN_DEBUG "%s: pcnet32_rx npackets %d\n",
                   dev->name, npackets);
        pcnet32_rx_entry(dev, lp, rxp, entry);
        npackets += 1;
        /*
         * The docs say that the buffer length isn't touched, but Andrew
         * Boyd of QNX reports that some revs of the 79C965 clear it.
         */
        rxp->buf_length = le16_to_cpu(2 - PKT_BUF_SZ);
        wmb(); /* Make sure owner changes after others are visible */
        rxp->status = le16_to_cpu(0x8000);
        entry = (++lp->cur_rx) & lp->rx_mod_mask;
        rxp = &lp->rx_ring[entry];
    }

Unfortunately rxp->status reads as 0x8000 for a long time, and then
eventually changes to 0x0310 at which point the receive happens. Until
that happens, the poll is called about once per second and each time
returns that 0 packets were received but that more packets are waiting.

I can't figure out why it would get a status of 0x8000 which means that
the MAC hasn't changed the ownership flag on the packet yet, even though
it generated a receive interrupt multiple seconds ago. Could it be some
caching issue that makes the cpu not realize that the memory has in fact
been changed by DMA? Any way to force a cache update for a memory
location?

The CPU is a Geode SC1200 (Geode GX1 + Companion in one). So far I have
seen __memcpy from system ram to device memory get data out of order, so
I have no reason to believe the cpu doesn't have more stupid bugs
related to doing I/O.

--
Len Sorensen
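The loop condition above relies on a sign trick: the ownership flag is the top bit of the 16-bit status word, so casting the word to a signed short makes "owned by the MAC" equivalent to "negative". The sketch below restates that test in user-space C and applies it to the values seen in this thread. The macro and function names are made up for illustration, and the STP/ENP bit positions (0x0200 and 0x0100) are an assumption based on the usual PCnet receive descriptor layout rather than something stated in the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names, not driver identifiers. */
#define RX_OWN 0x8000u  /* descriptor currently belongs to the MAC */
#define RX_STP 0x0200u  /* assumed: start of packet */
#define RX_ENP 0x0100u  /* assumed: end of packet */

/*
 * Same test as `(short)le16_to_cpu(rxp->status) >= 0`: with bit 15 set the
 * word is negative when viewed as a signed 16-bit value. (The conversion is
 * implementation-defined in ISO C but behaves this way on two's-complement
 * targets.)
 */
static bool rx_owned_by_cpu(uint16_t status)
{
    return (int16_t)status >= 0;
}

/*
 * True when the CPU owns the descriptor and the upper byte holds exactly
 * STP|ENP, i.e. a whole frame in one buffer with no other upper-byte flag.
 */
static bool rx_complete_frame(uint16_t status)
{
    return rx_owned_by_cpu(status) &&
           (status >> 8) == ((RX_STP | RX_ENP) >> 8);
}
```

Under these assumptions 0x8000 fails the ownership test, which is why the loop body never runs for the stuck descriptor, while 0x0310 and 0x0340 both pass it and decode as complete single-buffer frames.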
Re: Re: Strange connection slowdown on pcnet32
On Fri, Feb 16, 2007 at 10:21:24AM -0600, [EMAIL PROTECTED] wrote:
> Are there any messages in the log about timeouts, or anything else from the
> driver? When it gets in this state, can you communicate with another system,
> and does it have the same slow behavior?

Nope, no timeouts or messages. As far as the system looks, cpu and ram and
logs show nothing unusual. Just very slow reception on the ethernet port
going towards the server providing the data for the transfer. Messages do
get through eventually, but very very late (when a ping reply arrives at
the port and takes 5 to 10 seconds to make it to the network stack, then
something isn't right, at least when there is no other traffic waiting).

I did have NAPI in the driver even in 2.6.8 (I was adding that at the
time). I am now testing with 2.6.8 without NAPI (so no mask/unmask of
receive interrupts taking place), and so far it has run for over an hour
without failing, although that doesn't prove it won't, just that it has
lasted longer.

I think I will try compiling 2.6.18 again with NAPI disabled on the
pcnet32 and see what that does. There is a chance that something in the
NAPI implementation is breaking the chip's receive somehow although I
can't currently imagine what it could be or how.

--
Len Sorensen
Re: Re: Strange connection slowdown on pcnet32
Are there any messages in the log about timeouts, or anything else from the
driver? When it gets in this state, can you communicate with another system,
and does it have the same slow behavior?

Looks like my mailer is munging white spaces.

On Fri, Feb 16, 2007 at 09:35:54AM -0500, Lennart Sorensen wrote:
> I have run some tests using 2.6.8 now, and so far it hasn't failed.
>
> Still investigating...

And 5 minutes later 2.6.8 failed the same way too. Maybe I will go back
to 2.4 and check.

--
Len Sorensen
Re: Strange connection slowdown on pcnet32
On Fri, Feb 16, 2007 at 09:35:54AM -0500, Lennart Sorensen wrote:
> I have run some tests using 2.6.8 now, and so far it hasn't failed.
>
> Still investigating...

And 5 minutes later 2.6.8 failed the same way too. Maybe I will go back
to 2.4 and check.

--
Len Sorensen
Re: Strange connection slowdown on pcnet32
On Thu, Feb 15, 2007 at 05:50:30PM -0500, Lennart Sorensen wrote:
> I have encountered a strange behaviour with the pcnet32.
>
> I am transferring data from a server to a client routing it through my
> router. The router has 2 ethernet ports, both of which are amd 972
> chips (pcnet32). The transfer has so far been either http or ftp (both
> see the same problem). I transfer lots of data, and after a while (I
> have seen anywhere from 200 to 700MB or so) the speed suddenly drops to
> less than 1KB/s. If I ping from the router to the server, the ping
> requests go out normally (seen by tcpdump on the server) every second,
> but on the router the replies are not seen by the kernel for multiple
> seconds. Sometimes I will see 3 ping replies together, sometimes 5 or
> even 10. The turn around times will show 10500, 9500, 8500, ..., 500ms
> for the packets received in a batch. ifconfig on the router shows the
> packet receive counts showing up in lumps, just as ping does, and
> tcpdump on the interface on the router.
>
> Doing ifconfig down and up on the port connecting to the server makes
> the problem clear and it can handle another pile of data before the
> problem reappears.
>
> The CPU on the router is not fast enough to ensure there won't ever be
> dropped packets at 100Mbps. When I force the port to the server to
> 10Mbps I have no problems at all.
>
> Replacing the port to the server with an rtl8139 doesn't show any
> problems at 100Mbps, although the transfer rate drops from 6500KBps to
> 4000KBps compared to using the pcnet32.
>
> Kernel used so far is 2.6.16 and 2.6.18.
>
> I have a tulip card I intend to try with as well just to see if it
> affects anything other than the pcnet32.
>
> Does anyone have any hints as to what part of the code to look at for
> changes made by doing ifconfig eth1 down; ifconfig eth1 up? Any ideas
> as to what could make the reception of packets suddenly get very very
> slow?
>
> On one pass where I was running tcpdump on the router, I saw a wrap of
> the sequence number right before the problem occurred, but that has not
> been the case every time as far as I can tell, so I am not sure if that
> is related to the problem at all.

I have run some tests using 2.6.8 now, and so far it hasn't failed.

Still investigating...

--
Len Sorensen
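The turnaround times quoted above (10500, 9500, 8500, ..., 500 ms) are what one would expect if the replies reach the port on time but are only handed to the network stack together, in one batch. The arithmetic can be sketched as follows; `batch_rtts()` is a hypothetical helper, and the model assumes one request per fixed interval and a single delivery instant for the whole batch.

```c
#include <stddef.h>

/*
 * Apparent round-trip times for n ping requests sent interval_ms apart
 * whose replies are all delivered at the same moment. rtt_ms[0] belongs to
 * the oldest request; newest_rtt_ms is the time measured for the most
 * recent one.
 */
static void batch_rtts(long newest_rtt_ms, long interval_ms, size_t n,
                       long *rtt_ms)
{
    for (size_t i = 0; i < n; i++)
        rtt_ms[i] = newest_rtt_ms + (long)(n - 1 - i) * interval_ms;
}
```

With a one-second interval, a newest time of 500 ms and 11 replies in the batch, this yields 10500, 9500, ..., 500 ms. The steps of exactly one second match the reported numbers, which is consistent with the replies being held at the receive ring rather than delayed on the wire.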