Re: sky2 silicon bugs and workarounds...
On Mon, 2 Jul 2007 14:37:06 +0100 Daniel J Blueman [EMAIL PROTECTED] wrote:

> Hi Stephen,
>
> When the sky2 driver initialises, it sets the ISR timer register
> (STAT_ISR_TIMER_INI) to 125 * 20 = 2500, whereas the vendor sk98lin
> driver sets it to 400, irrespective of the clock speed of the NIC
> processor. I guess you found more performance/stability from this
> value...?
>
> I've checked through the errata workarounds common to my rev-1 and
> rev-2 Yukon-EC chips... The HWF_WA_DEV_4167 oversize receive hang
> workaround checks for, and can reset, (as I understand it) the bus
> master unit of the NIC (in CheckRxPath) from a periodic timer, when it
> finds that no progress is being made.

My best guess at what that is handling is the chip bug that causes the receiver to hang if a packet larger than the receive DMA buffer is received. The sky2 driver doesn't need this because it allocates a slightly larger buffer than necessary and truncates the oversize packet. This works because the hardware has a truncation register that was probably designed for use when packet sniffing.

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
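The approach Stephen describes can be modelled in plain C as follows. This is only a sketch of the idea, not sky2 code: `rx_ring_model`, `rx_ring_init`, `rx_deliver`, `RX_BUF_SLACK`, and the `trunc_threshold` field are hypothetical names standing in for the driver's buffer allocation and the hardware truncation register.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical slack added on top of the largest expected frame. */
#define RX_BUF_SLACK 8

struct rx_ring_model {
    size_t max_frame;       /* largest frame the stack expects */
    size_t buf_size;        /* size of the allocated receive buffer */
    size_t trunc_threshold; /* models the hardware truncation register */
};

/* Size the buffer slightly larger than needed and truncate at that size. */
static void rx_ring_init(struct rx_ring_model *r, size_t max_frame)
{
    r->max_frame = max_frame;
    r->buf_size = max_frame + RX_BUF_SLACK;
    r->trunc_threshold = r->buf_size;
}

/*
 * Model of the hardware delivering one frame. Returns the number of bytes
 * written into buf. *oversize is set when the frame was longer than
 * max_frame, so the driver can drop it; the buffer is never overrun.
 */
static size_t rx_deliver(const struct rx_ring_model *r, unsigned char *buf,
                         const unsigned char *frame, size_t frame_len,
                         int *oversize)
{
    size_t n = frame_len;

    assert(r->trunc_threshold <= r->buf_size);
    if (n > r->trunc_threshold)
        n = r->trunc_threshold;   /* the hardware cuts the frame short */
    memcpy(buf, frame, n);
    *oversize = frame_len > r->max_frame;
    return n;
}
```

Because the buffer is a little larger than the largest legal frame, any frame that fills past `max_frame` can be recognised as oversize and discarded, without a periodic timer having to reset anything.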
sky2 silicon bugs and workarounds...
Hi Stephen,

When the sky2 driver initialises, it sets the ISR timer register (STAT_ISR_TIMER_INI) to 125 * 20 = 2500, whereas the vendor sk98lin driver sets it to 400, irrespective of the clock speed of the NIC processor. I guess you found more performance/stability from this value...?

I've checked through the errata workarounds common to my rev-1 and rev-2 Yukon-EC chips... The HWF_WA_DEV_4167 oversize receive hang workaround checks for, and can reset, (as I understand it) the bus master unit of the NIC (in CheckRxPath) from a periodic timer, when it finds that no progress is being made.

With the issues we see, could they be detected earlier, by noticing that the stats counters are not being incremented, and handled by resetting just the bus-master unit, rather than kicking the whole chip after a far longer period?

If it is a silicon bug, it looks like we can simply acknowledge it and build a better framework to detect the chip's PCI interface locking up, and perhaps kick it in a smarter way...

Daniel
-- 
Daniel J Blueman
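To compare the two timer values it helps to convert them to time. A minimal sketch, assuming the 125 in `125 * 20` is the core clock in MHz and the 20 is an interval in microseconds (an interpretation of the arithmetic above, not something stated in this thread); `us_to_clk` and `clk_to_ns` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an interval in microseconds to timer ticks at clock_mhz. */
static uint32_t us_to_clk(uint32_t clock_mhz, uint32_t us)
{
    assert(clock_mhz != 0);
    return clock_mhz * us;
}

/* Convert timer ticks back to nanoseconds at clock_mhz. */
static uint32_t clk_to_ns(uint32_t clock_mhz, uint32_t ticks)
{
    assert(clock_mhz != 0);
    return (uint32_t)(((uint64_t)ticks * 1000u) / clock_mhz);
}
```

Under that assumption, `us_to_clk(125, 20)` gives 2500 ticks, while a fixed 400 ticks corresponds to `clk_to_ns(125, 400)` = 3200 ns at 125 MHz and to a different interval at any other clock speed.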
Re: sky2 silicon bugs and workarounds...
On 02/07/07, Stephen Hemminger [EMAIL PROTECTED] wrote:
> On Mon, 2 Jul 2007 14:37:06 +0100 Daniel J Blueman [EMAIL PROTECTED] wrote:
>> When the sky2 driver initialises, it sets the ISR timer register
>> (STAT_ISR_TIMER_INI) to 125 * 20 = 2500, whereas the vendor sk98lin
>> driver sets it to 400, irrespective of the clock speed of the NIC
>> processor. I guess you found more performance/stability from this
>> value...?
>
> Not really, it was just a rough guess to try and get more frames per
> irq under DoS load. Haven't fine tuned those values.
>
>> I've checked through the errata workarounds common to my rev-1 and
>> rev-2 Yukon-EC chips... The HWF_WA_DEV_4167 oversize receive hang
>> workaround checks for, and can reset, (as I understand it) the bus
>> master unit of the NIC (in CheckRxPath) from a periodic timer, when it
>> finds that no progress is being made.
>
> Where did you get those errata? I keep having to do reverse
> engineering guesswork with the vendor driver.

http://www.syskonnect.de/e_en/products/adapters/pcie_server/sk-9exx/software/linux/driver/install_v10.0.4.3.tar.bz2

from sk98lin.tar.bz2 inside

--- defined in ./common/h/skgeinit.h

    /*-RMV- DWORD 1: Deviations */
    #define HWF_WA_DEV_53           0x1100UL      /*-RMV- 5.3 (Tx Done LSOv2 rep)*/
    #define HWF_WA_DEV_LIM_IPV6_RSS 0x1080UL      /*-RMV- IPV6 RSS limitted */
    #define HWF_WA_DEV_4217         0x1040UL      /*-RMV- 4.217 (PCI-E blockage) */
    #define HWF_WA_DEV_4200         0x1020UL      /*-RMV- 4.200 (D3 Blue Screen)*/
    #define HWF_WA_DEV_4185CS       0x1010UL      /*-RMV- 4.185 (ECU 100 CS cal)*/
    #define HWF_WA_DEV_4185         0x1008UL      /*-RMV- 4.185 (ECU Tx h check)*/
    #define HWF_WA_DEV_4167         0x1004UL      /*-RMV- 4.167 (Rx OvSize Hang)*/
    #define HWF_WA_DEV_4152         0x1002UL      /*-RMV- 4.152 (RSS issue) */
    #define HWF_WA_DEV_4115         0x1001UL      /*-RMV- 4.115 (Rx MAC FIFO) */
    #define HWF_WA_DEV_4109         0x10008000UL  /*-RMV- 4.109 (BIU hang) */
    #define HWF_WA_DEV_483          0x10004000UL  /*-RMV- 4.83 (Rx TCP wrong) */
    #define HWF_WA_DEV_479          0x10002000UL  /*-RMV- 4.79 (Rx BMU hang II) */
    #define HWF_WA_DEV_472          0x10001000UL  /*-RMV- 4.72 (GPHY2 MDC clk) */
    #define HWF_WA_DEV_463          0x1800UL      /*-RMV- 4.63 (Rx BMU hang I) */
    #define HWF_WA_DEV_427          0x1400UL      /*-RMV- 4.27 (Tx Done Rep) */
    #define HWF_WA_DEV_42           0x1200UL      /*-RMV- 4.2 (pref unit burst) */
    #define HWF_WA_DEV_46           0x1100UL      /*-RMV- 4.6 (CPU crash II) */
    #define HWF_WA_DEV_43_418       0x1080UL      /*-RMV- 4.3 4.18 (PCI unexp */
                                                  /*-RMV- complStat BMU deadl) */
    #define HWF_WA_DEV_420          0x1040UL      /*-RMV- 4.20 (Status BMU ov) */
    #define HWF_WA_DEV_423          0x1020UL      /*-RMV- 4.23 (TCP Segm Hang) */
    #define HWF_WA_DEV_424          0x1010UL      /*-RMV- 4.24 (MAC reg overwr) */
    #define HWF_WA_DEV_425          0x1008UL      /*-RMV- 4.25 (Magic packet */
                                                  /*-RMV- with odd offset) */
    #define HWF_WA_DEV_428          0x1004UL      /*-RMV- 4.28 (Poll-U BigEndi)*/
    #define HWF_WA_FIFO_FLUSH_YLA0  0x1002UL      /*-RMV- dis Rx GMAC FIFO Flush*/
                                                  /*-RMV- for Yu-L Rev. A0 only */
    #define HWF_WA_COMA_MODE        0x1001UL      /*-RMV- Coma Mode WA req */

--- common/skgeinit.c:SkGeSetUpSupFeatures()

    case CHIP_ID_YUKON_EC:
        pAC->GIni.HwF.Features[HW_DEV_LIST] =
            HWF_WA_DEV_42 | HWF_WA_DEV_46 | HWF_WA_DEV_43_418 | ...
    case CHIP_ID_YUKON_FE:
        pAC->GIni.HwF.Features[HW_DEV_LIST] =
            HWF_WA_DEV_427 | HWF_WA_DEV_4109 | HWF_WA_DEV_4152 | HWF_WA_DEV_4167;
        break;
    case CHIP_ID_YUKON_XL:
        ... etc

It's worthwhile looking at 2.6/skge.c:CheckRxPath() and its call-site from the timer handler.

Thanks,
  Daniel
-- 
Daniel J Blueman
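The per-chip workaround list above is an OR of flag bits that the driver can test at run time. A minimal sketch of that pattern, with placeholder flag values and hypothetical names (`enum chip`, `chip_workarounds`, `needs_workaround`); only the Yukon-FE combination is taken from the excerpt, since the lists for the other chips are elided there.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for illustration; the vendor header defines its own. */
#define WA_DEV_427   (1u << 0)  /* 4.27  (Tx Done Rep) */
#define WA_DEV_4109  (1u << 1)  /* 4.109 (BIU hang) */
#define WA_DEV_4152  (1u << 2)  /* 4.152 (RSS issue) */
#define WA_DEV_4167  (1u << 3)  /* 4.167 (Rx OvSize Hang) */
#define WA_DEV_42    (1u << 4)  /* 4.2   (pref unit burst) */

/* Hypothetical chip identifiers; CHIP_OTHER stands for the elided cases. */
enum chip { CHIP_FE, CHIP_OTHER };

/* Return the set of workarounds enabled for a chip. */
static uint32_t chip_workarounds(enum chip id)
{
    switch (id) {
    case CHIP_FE:
        return WA_DEV_427 | WA_DEV_4109 | WA_DEV_4152 | WA_DEV_4167;
    default:
        return 0;   /* lists for other chips are not shown in the excerpt */
    }
}

/* Test whether one specific workaround is enabled in a feature mask. */
static int needs_workaround(uint32_t features, uint32_t flag)
{
    assert(flag != 0 && (flag & (flag - 1)) == 0);  /* exactly one bit set */
    return (features & flag) != 0;
}
```

Code such as a receive-path check would then be guarded by something like `needs_workaround(features, WA_DEV_4167)`, so chips without that erratum skip it entirely.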
Re: sky2 silicon bugs and workarounds...
On Mon, 2 Jul 2007 14:37:06 +0100 Daniel J Blueman [EMAIL PROTECTED] wrote:

> Hi Stephen,
>
> When the sky2 driver initialises, it sets the ISR timer register
> (STAT_ISR_TIMER_INI) to 125 * 20 = 2500, whereas the vendor sk98lin
> driver sets it to 400, irrespective of the clock speed of the NIC
> processor. I guess you found more performance/stability from this
> value...?
>
> I've checked through the errata workarounds common to my rev-1 and
> rev-2 Yukon-EC chips... The HWF_WA_DEV_4167 oversize receive hang
> workaround checks for, and can reset, (as I understand it) the bus
> master unit of the NIC (in CheckRxPath) from a periodic timer, when it
> finds that no progress is being made.

This code in the vendor driver is not acceptable. It causes the device to continually reset itself in the idle state. The sk98lin driver cannot tell the difference between no packets arriving and a hung receiver!

-- 
Stephen Hemminger [EMAIL PROTECTED]
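One way to avoid that false positive is to require evidence that frames actually arrived before declaring a hang. A minimal sketch, assuming the hardware exposes one counter for frames seen by the MAC and another for frames completed by DMA; both counters, `rx_watchdog`, `rx_watchdog_check`, and `HANG_TICKS` are hypothetical and not taken from either driver.

```c
#include <assert.h>
#include <stdint.h>

#define HANG_TICKS 2  /* consecutive stalled checks before acting */

struct rx_watchdog {
    uint32_t last_dma_rx;  /* frames DMA had completed at the last check */
    unsigned stalled;      /* consecutive checks with backlog and no progress */
};

/*
 * Called from a periodic timer. Returns 1 when the receive path looks hung:
 * the MAC has seen frames that DMA has not completed, and DMA has made no
 * progress for HANG_TICKS consecutive checks. An idle link, where the two
 * counters are equal, never triggers.
 */
static int rx_watchdog_check(struct rx_watchdog *w,
                             uint32_t mac_rx, uint32_t dma_rx)
{
    int backlog, progress;

    assert(w != 0);
    backlog = mac_rx != dma_rx;            /* frames are waiting */
    progress = dma_rx != w->last_dma_rx;   /* DMA moved since last check */

    if (backlog && !progress)
        w->stalled++;
    else
        w->stalled = 0;

    w->last_dma_rx = dma_rx;
    return w->stalled >= HANG_TICKS;
}
```

With this check, "no packets arriving" leaves both counters unchanged and equal, so nothing is reset, while "packets arriving but not delivered" is flagged after two timer periods.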