On Mon, 26 Mar 2007 21:24:06 -0600 Rob Sims <[EMAIL PROTECTED]> wrote:
> On Fri, Mar 16, 2007 at 02:16:48PM -0700, Stephen Hemminger wrote: > > On Fri, 16 Mar 2007 14:36:45 -0600 > > Rob Sims <[EMAIL PROTECTED]> wrote: > > > > Are there some debug hooks that can be activated? My sky2 stops > > > responding (very light load) about twice a day. The netdev watchdog > > > notices after a while and is able to reactivate the interface: > > > > Mar 15 13:28:12 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out > > > Mar 15 13:28:12 btd kernel: sky2 eth0: tx timeout > > > Mar 15 13:28:12 btd kernel: sky2 eth0: transmit ring 458 .. 435 > > > report=458 done=458 > > > Mar 15 13:28:12 btd kernel: sky2 eth0: disabling interface > > > Mar 15 13:28:12 btd kernel: sky2 eth0: enabling interface > > > Mar 15 13:28:12 btd kernel: sky2 eth0: ram buffer 48K > > > Mar 15 13:28:15 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full > > > duplex, flow control both > > > > Use ethtool -S to if there are any pause frames, etc. See if frames are > > still making it into PHY statistics but not being received. > > > > Use ethtool -d to dump registers. Need current version of ethtool with > > decode logic. > > > > Then look for things like is Ram buffer read/write pointer changing? > > > > Is GMAC stuck in pause: > > > > Normal is: > > GMAC 1 > > Status 0x5010 (see GM_GPSR_XXX in sky2.h) > > Control 0x1800 > > > > Stuck is > > GMAC 1 > > Status 0x5810 (or 0x5A10) > > First, here's the described hang in action, on the Core2 Duo on a 1Gb > hub: > GMAC 1 Status/Control remains at 0x5010/0x1800 until module is removed. > Read/write buffer pointers are changing. Full ethtool output in > http://www.robsims.com/sky2.netmon.log.gz > > This machine was also having major throughput problems - 17 kB/s. > Rebooting brought it to ~ 20 MB/s. Booting into a kernel with the > proprietary sk98lin kernel module showed ~ 80MB/s. Finally, returning > to sky2 gave 117 MB/s. Tests run using netcat, dd, /dev/zero, and > /dev/null, transmitting from the problem box to an e1000 via a Netgear > GS108. No hangs were observed during the "load test." The vendor driver does not do hardware flow control correctly. It ignores Tx pause frames. Were you using Jumbo MTU? > I also had a hang on a Pentium 4 w/sky2, 100Mb/s hub. I neglected to > try removing and re-inserting the module before rebooting. > GMAC 1 > Status 0xF004 This is [100mbit] [Full duplex] [Tx flow control disabled] [link up] [Rx flow control disabled] > Control 0x1800 [TX enabled] [Rx enabled] You might have over run the hub and it wedged. Try doing: ethtool -r eth0 that forces a down/up -- Stephen Hemminger <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/