Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-22 Thread Andrew Hall


If there's anything I can do to help isolate this issue please let me 
know.




Looking through the logs I'm also intermittently seeing some of this:

sky2 eth1: rx error, status 0x7ffc0001 length 844

I'm not sure if it's related to the TX issue but I thought I'd mention it.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-22 Thread Stephen Hemminger
On Fri, 22 Sep 2006 18:40:00 +1000
Andrew Hall [EMAIL PROTECTED] wrote:

 
  If there's anything I can do to help isolate this issue please let me 
  know.
 
 
 Looking through the logs I'm also intermittently seeing some of this:
 
 sky2 eth1: rx error, status 0x7ffc0001 length 844

That means the driver is getting receive fifo overflows. You must
not be using receive flow control.

The driver takes no special action when this occurs. It isn't clear (even from
the documentation), what is supposed to be done.

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-21 Thread Andrew Hall

Unfortunately I spoke too soon.. :-(

The driver no longer appears to fail under load but still fails randomly 
with relatively light load:


Sep 22 07:04:38 localhost syslog.info -- MARK --
Sep 22 07:24:38 localhost syslog.info -- MARK --
Sep 22 07:31:52 localhost user.info kernel: NETDEV WATCHDOG: eth1: transmit 
timed out

Sep 22 07:31:52 localhost user.err kernel: sky2 eth1: tx timeout
Sep 22 07:31:52 localhost user.debug kernel: sky2 eth1: transmit ring 27 .. 
498 report=27 done=27

Sep 22 07:31:52 localhost user.info kernel: sky2 hardware hung? flushing
Sep 22 07:39:27 localhost user.info kernel: NETDEV WATCHDOG: eth1: transmit 
timed out

Sep 22 07:39:27 localhost user.err kernel: sky2 eth1: tx timeout
Sep 22 07:39:27 localhost user.debug kernel: sky2 eth1: transmit ring 498 .. 
457 report=27 done=27

Sep 22 07:39:27 localhost user.info kernel: sky2 status report lost?
Sep 22 07:40:07 localhost user.info kernel: NETDEV WATCHDOG: eth1: transmit 
timed out

Sep 22 07:40:07 localhost user.err kernel: sky2 eth1: tx timeout
Sep 22 07:40:07 localhost user.debug kernel: sky2 eth1: transmit ring 27 .. 
498 report=27 done=27

Sep 22 07:40:07 localhost user.info kernel: sky2 hardware hung? flushing

The driver version that I am using is 1.6.1 with the tx pause bug fix 
manually applied. This is compiled against 2.6.17.13 stock.


This fault (since releasing this version into production) has now occured in 
3 sites since upgrading them yesterday. The previous version of kernel that 
was being used was 2.6.12 (based on the old sk98lin driver) which didn't 
manifest any of these problems.


If there's anything I can do to help isolate this issue please let me know.

- Original Message - 
From: Stephen Hemminger [EMAIL PROTECTED]

To: Andrew Hall [EMAIL PROTECTED]
Cc: netdev@vger.kernel.org
Sent: Wednesday, September 20, 2006 6:31 AM
Subject: Re: [PATCH 1/3] sky2: tx pause bug fix



On Mon, 11 Sep 2006 11:35:11 +1000
Andrew Hall [EMAIL PROTECTED] wrote:


Stephen,

After some serious testing, this patch seems to fix the lockup issue
completely. I manually applied these changes against the 2.6.17.13 
release.


Any more problems? 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-21 Thread Larry Finger

Andrew Hall wrote:

Unfortunately I spoke too soon.. :-(

The driver no longer appears to fail under load but still fails randomly 
with relatively light load:




I don't know if this will help, but NETDEV watchdog timeouts in the bcm43xx driver were fixed by 
changing a netif_stop_queue call into a netif_tx_disable. Why wrapping the stop_queue inside 
netif_tx_lock_bh/netif_tx_unlock_bh pairs works is beyond me, but ...


Larry
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-19 Thread Stephen Hemminger
On Mon, 11 Sep 2006 11:35:11 +1000
Andrew Hall [EMAIL PROTECTED] wrote:

 Stephen,
 
 After some serious testing, this patch seems to fix the lockup issue 
 completely. I manually applied these changes against the 2.6.17.13 release.

Any more problems?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-19 Thread Andrew Hall

Not that I've been able to uncover.

Thanks for fixing this..

- Original Message - 
From: Stephen Hemminger [EMAIL PROTECTED]

To: Andrew Hall [EMAIL PROTECTED]
Cc: netdev@vger.kernel.org
Sent: Wednesday, September 20, 2006 6:31 AM
Subject: Re: [PATCH 1/3] sky2: tx pause bug fix



On Mon, 11 Sep 2006 11:35:11 +1000
Andrew Hall [EMAIL PROTECTED] wrote:


Stephen,

After some serious testing, this patch seems to fix the lockup issue
completely. I manually applied these changes against the 2.6.17.13 
release.


Any more problems?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-13 Thread Jeff Garzik
since I didn't hear back on my question from yesterday, and since 2.6.18 
is very close, I applied this series to netdev#upstream


Jeff



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-11 Thread Jeff Garzik

Stephen Hemminger wrote:

Fix problems with transmit pause frames. The driver was telling the
GMAC to flush (not process) pause frames. Manually disabling pause wasn't
working because of problems in the setup.

This maybe the cause of the lockup under load.
http://bugzilla.kernel.org/show_bug.cgi?id=6839

Patch against netdev-2.6 git tree

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


Is this for #upstream (2.6.19) or #upstream-fixes (2.6.18-rc)?

It looks OK for #upstream, but too intrusive for #upstream-fixes.

Jeff



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] sky2: tx pause bug fix

2006-09-10 Thread Andrew Hall

Stephen,

After some serious testing, this patch seems to fix the lockup issue 
completely. I manually applied these changes against the 2.6.17.13 release.


- Original Message - 
From: Stephen Hemminger [EMAIL PROTECTED]

To: Jeff Garzik [EMAIL PROTECTED]
Cc: netdev@vger.kernel.org
Sent: Thursday, September 07, 2006 5:44 AM
Subject: [PATCH 1/3] sky2: tx pause bug fix



Fix problems with transmit pause frames. The driver was telling the
GMAC to flush (not process) pause frames. Manually disabling pause wasn't
working because of problems in the setup.

This maybe the cause of the lockup under load.
http://bugzilla.kernel.org/show_bug.cgi?id=6839

Patch against netdev-2.6 git tree

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---

drivers/net/sky2.c |  123 
++--

drivers/net/sky2.h |2 -
2 files changed, 43 insertions(+), 82 deletions(-)

1419c8ab49f8fd56bad1ac0d3dcbaf830cd5a5d6
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 7ce0663..a3a4ab2 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -289,7 +289,7 @@ static void sky2_gmac_reset(struct sky2_
static void sky2_phy_init(struct sky2_hw *hw, unsigned port)
{
 struct sky2_port *sky2 = netdev_priv(hw-dev[port]);
- u16 ctrl, ct1000, adv, pg, ledctrl, ledover;
+ u16 ctrl, ct1000, adv, pg, ledctrl, ledover, reg;

 if (sky2-autoneg == AUTONEG_ENABLE 
 !(hw-chip_id == CHIP_ID_YUKON_XL || hw-chip_id == 
CHIP_ID_YUKON_EC_U)) {

@@ -358,6 +358,7 @@ static void sky2_phy_init(struct sky2_hw
 ctrl = 0;
 ct1000 = 0;
 adv = PHY_AN_CSMA;
+ reg = 0;

 if (sky2-autoneg == AUTONEG_ENABLE) {
 if (hw-copper) {
@@ -390,21 +391,46 @@ static void sky2_phy_init(struct sky2_hw
 /* forced speed/duplex settings */
 ct1000 = PHY_M_1000C_MSE;

- if (sky2-duplex == DUPLEX_FULL)
- ctrl |= PHY_CT_DUP_MD;
+ /* Disable auto update for duplex flow control and speed */
+ reg |= GM_GPCR_AU_ALL_DIS;

 switch (sky2-speed) {
 case SPEED_1000:
 ctrl |= PHY_CT_SP1000;
+ reg |= GM_GPCR_SPEED_1000;
 break;
 case SPEED_100:
 ctrl |= PHY_CT_SP100;
+ reg |= GM_GPCR_SPEED_100;
 break;
 }

+ if (sky2-duplex == DUPLEX_FULL) {
+ reg |= GM_GPCR_DUP_FULL;
+ ctrl |= PHY_CT_DUP_MD;
+ } else if (sky2-speed != SPEED_1000  hw-chip_id != 
CHIP_ID_YUKON_EC_U) {

+ /* Turn off flow control for 10/100mbps */
+ sky2-rx_pause = 0;
+ sky2-tx_pause = 0;
+ }
+
+ if (!sky2-rx_pause)
+ reg |= GM_GPCR_FC_RX_DIS;
+
+ if (!sky2-tx_pause)
+ reg |= GM_GPCR_FC_TX_DIS;
+
+ /* Forward pause packets to GMAC? */
+ if (sky2-tx_pause || sky2-rx_pause)
+ sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_ON);
+ else
+ sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
+
 ctrl |= PHY_CT_RESET;
 }

+ gma_write16(hw, port, GM_GP_CTRL, reg);
+
 if (hw-chip_id != CHIP_ID_YUKON_FE)
 gm_phy_write(hw, port, PHY_MARV_1000T_CTRL, ct1000);

@@ -508,6 +534,7 @@ static void sky2_phy_init(struct sky2_hw
 gm_phy_write(hw, port, PHY_MARV_LED_OVER, ledover);

 }
+
 /* Enable phy interrupt on auto-negotiation complete (or link up) */
 if (sky2-autoneg == AUTONEG_ENABLE)
 gm_phy_write(hw, port, PHY_MARV_INT_MASK, PHY_M_IS_AN_COMPL);
@@ -570,49 +597,11 @@ static void sky2_mac_init(struct sky2_hw
 gm_phy_read(hw, 1, PHY_MARV_INT_MASK) != 0);
 }

- if (sky2-autoneg == AUTONEG_DISABLE) {
- reg = gma_read16(hw, port, GM_GP_CTRL);
- reg |= GM_GPCR_AU_ALL_DIS;
- gma_write16(hw, port, GM_GP_CTRL, reg);
- gma_read16(hw, port, GM_GP_CTRL);
-
- switch (sky2-speed) {
- case SPEED_1000:
- reg = ~GM_GPCR_SPEED_100;
- reg |= GM_GPCR_SPEED_1000;
- break;
- case SPEED_100:
- reg = ~GM_GPCR_SPEED_1000;
- reg |= GM_GPCR_SPEED_100;
- break;
- case SPEED_10:
- reg = ~(GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100);
- break;
- }
-
- if (sky2-duplex == DUPLEX_FULL)
- reg |= GM_GPCR_DUP_FULL;
-
- /* turn off pause in 10/100mbps half duplex */
- else if (sky2-speed != SPEED_1000 
- hw-chip_id != CHIP_ID_YUKON_EC_U)
- sky2-tx_pause = sky2-rx_pause = 0;
- } else
- reg = GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100 | GM_GPCR_DUP_FULL;
-
- if (!sky2-tx_pause  !sky2-rx_pause) {
- sky2_write32(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
- reg |=
- GM_GPCR_FC_TX_DIS | GM_GPCR_FC_RX_DIS | GM_GPCR_AU_FCT_DIS;
- } else if (sky2-tx_pause  !sky2-rx_pause) {
- /* disable Rx flow-control */
- reg |= GM_GPCR_FC_RX_DIS | GM_GPCR_AU_FCT_DIS;
- }
-
- gma_write16(hw, port, GM_GP_CTRL, reg);
-
 sky2_read16(hw, SK_REG(port, GMAC_IRQ_SRC));

+ /* Enable Transmit FIFO Underrun */
+ sky2_write8(hw, SK_REG(port, GMAC_IRQ_MSK), GMAC_DEF_MSK);
+
 spin_lock_bh(sky2-phy_lock);
 sky2_phy_init(hw, port);
 spin_unlock_bh(sky2-phy_lock);
@@ -1529,40 +1518,10 @@ static void sky2_link_up(struct sky2_por
 unsigned port = sky2-port;
 u16 reg;

- /* Enable Transmit FIFO Underrun */
- sky2_write8(hw, SK_REG(port, GMAC_IRQ_MSK), GMAC_DEF_MSK);
-
- reg = gma_read16(hw, port, GM_GP_CTRL);
- if (sky2-autoneg == AUTONEG_DISABLE) {
- reg |= GM_GPCR_AU_ALL_DIS;
-
- /* Is write/read necessary?  Copied from sky2_mac_init */
- gma_write16

[PATCH 1/3] sky2: tx pause bug fix

2006-09-06 Thread Stephen Hemminger
Fix problems with transmit pause frames. The driver was telling the
GMAC to flush (not process) pause frames. Manually disabling pause wasn't
working because of problems in the setup.

This maybe the cause of the lockup under load.
http://bugzilla.kernel.org/show_bug.cgi?id=6839

Patch against netdev-2.6 git tree

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---

 drivers/net/sky2.c |  123 ++--
 drivers/net/sky2.h |2 -
 2 files changed, 43 insertions(+), 82 deletions(-)

1419c8ab49f8fd56bad1ac0d3dcbaf830cd5a5d6
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 7ce0663..a3a4ab2 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -289,7 +289,7 @@ static void sky2_gmac_reset(struct sky2_
 static void sky2_phy_init(struct sky2_hw *hw, unsigned port)
 {
struct sky2_port *sky2 = netdev_priv(hw-dev[port]);
-   u16 ctrl, ct1000, adv, pg, ledctrl, ledover;
+   u16 ctrl, ct1000, adv, pg, ledctrl, ledover, reg;
 
if (sky2-autoneg == AUTONEG_ENABLE 
!(hw-chip_id == CHIP_ID_YUKON_XL || hw-chip_id == 
CHIP_ID_YUKON_EC_U)) {
@@ -358,6 +358,7 @@ static void sky2_phy_init(struct sky2_hw
ctrl = 0;
ct1000 = 0;
adv = PHY_AN_CSMA;
+   reg = 0;
 
if (sky2-autoneg == AUTONEG_ENABLE) {
if (hw-copper) {
@@ -390,21 +391,46 @@ static void sky2_phy_init(struct sky2_hw
/* forced speed/duplex settings */
ct1000 = PHY_M_1000C_MSE;
 
-   if (sky2-duplex == DUPLEX_FULL)
-   ctrl |= PHY_CT_DUP_MD;
+   /* Disable auto update for duplex flow control and speed */
+   reg |= GM_GPCR_AU_ALL_DIS;
 
switch (sky2-speed) {
case SPEED_1000:
ctrl |= PHY_CT_SP1000;
+   reg |= GM_GPCR_SPEED_1000;
break;
case SPEED_100:
ctrl |= PHY_CT_SP100;
+   reg |= GM_GPCR_SPEED_100;
break;
}
 
+   if (sky2-duplex == DUPLEX_FULL) {
+   reg |= GM_GPCR_DUP_FULL;
+   ctrl |= PHY_CT_DUP_MD;
+   } else if (sky2-speed != SPEED_1000  hw-chip_id != 
CHIP_ID_YUKON_EC_U) {
+   /* Turn off flow control for 10/100mbps */
+   sky2-rx_pause = 0;
+   sky2-tx_pause = 0;
+   }
+
+   if (!sky2-rx_pause)
+   reg |= GM_GPCR_FC_RX_DIS;
+
+   if (!sky2-tx_pause)
+   reg |= GM_GPCR_FC_TX_DIS;
+
+   /* Forward pause packets to GMAC? */
+   if (sky2-tx_pause || sky2-rx_pause)
+   sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_ON);
+   else
+   sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
+
ctrl |= PHY_CT_RESET;
}
 
+   gma_write16(hw, port, GM_GP_CTRL, reg);
+
if (hw-chip_id != CHIP_ID_YUKON_FE)
gm_phy_write(hw, port, PHY_MARV_1000T_CTRL, ct1000);
 
@@ -508,6 +534,7 @@ static void sky2_phy_init(struct sky2_hw
gm_phy_write(hw, port, PHY_MARV_LED_OVER, ledover);
 
}
+
/* Enable phy interrupt on auto-negotiation complete (or link up) */
if (sky2-autoneg == AUTONEG_ENABLE)
gm_phy_write(hw, port, PHY_MARV_INT_MASK, PHY_M_IS_AN_COMPL);
@@ -570,49 +597,11 @@ static void sky2_mac_init(struct sky2_hw
 gm_phy_read(hw, 1, PHY_MARV_INT_MASK) != 0);
}
 
-   if (sky2-autoneg == AUTONEG_DISABLE) {
-   reg = gma_read16(hw, port, GM_GP_CTRL);
-   reg |= GM_GPCR_AU_ALL_DIS;
-   gma_write16(hw, port, GM_GP_CTRL, reg);
-   gma_read16(hw, port, GM_GP_CTRL);
-
-   switch (sky2-speed) {
-   case SPEED_1000:
-   reg = ~GM_GPCR_SPEED_100;
-   reg |= GM_GPCR_SPEED_1000;
-   break;
-   case SPEED_100:
-   reg = ~GM_GPCR_SPEED_1000;
-   reg |= GM_GPCR_SPEED_100;
-   break;
-   case SPEED_10:
-   reg = ~(GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100);
-   break;
-   }
-
-   if (sky2-duplex == DUPLEX_FULL)
-   reg |= GM_GPCR_DUP_FULL;
-
-   /* turn off pause in 10/100mbps half duplex */
-   else if (sky2-speed != SPEED_1000 
-hw-chip_id != CHIP_ID_YUKON_EC_U)
-   sky2-tx_pause = sky2-rx_pause = 0;
-   } else
-   reg = GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100 | GM_GPCR_DUP_FULL;
-
-   if (!sky2-tx_pause  !sky2-rx_pause) {
-   sky2_write32(hw,