from:"Stephen Hemminger"

Re: sky2 hangs

2007-02-01 Thread Stephen Hemminger

On Thu, 1 Feb 2007 19:55:32 +0100
Thomas Glanzmann [EMAIL PROTECTED] wrote:

 Hello,
 I have a sky2 network card in my Intel Mac mini. It stops working under
 heavy network load, such as watching a DivX over http/sshfs. However, if I
 remove the driver module and load it again, it works, and even the TCP
 connection doesn't get shut down. I automated this procedure using
 a userland watchdog which basically does the same thing and which I wrote
 myself, because the traditional watchdog wasn't that reliable
 and produced a lot of false positives:
 
 * Check every ten seconds whether my default router is pingable (3
   pings, at least one has to come back).
   If it isn't, I call the fix_network script (the script is called
   only once after a ping gets lost; to run it again, at least one
   ping has to arrive first).
 
 (mini) [~] cat /usr/local/sbin/fix_network
 #!/bin/bash
 
 export PATH=/bin:/usr/bin:/usr/sbin:/sbin
 
 rmmod sky2
 modprobe sky2
 ifdown eth0
 ifup eth0
 
 If after that no ping is received from the default
 router for another 90 seconds I tell init to reboot and
 stop feeding the kernel software watchdog.
 
 * My watchdog also checks if sshd process is running. If it is
   down for more than 100 seconds it reboots the machine, too.
 
 Jan 27 22:35:35 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
 Jan 27 22:35:35 mini watchdog-tg[4146]: Running fix_network script.
 Jan 27 22:38:46 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
 Jan 27 22:38:46 mini watchdog-tg[4146]: Running fix_network script.
 Jan 27 22:44:17 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
 Jan 27 22:44:17 mini watchdog-tg[4146]: Running fix_network script.
 Jan 29 12:00:13 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
 Jan 29 12:00:13 mini watchdog-tg[4146]: Running fix_network script.
 Jan 29 19:18:59 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
 Jan 29 19:18:59 mini watchdog-tg[4146]: Running fix_network script.
 Jan 31 15:56:29 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
 Jan 31 15:56:29 mini watchdog-tg[4146]: Running fix_network script.
 Feb  1 08:56:57 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
 Feb  1 08:56:57 mini watchdog-tg[4146]: Running fix_network script.
 
 I have a question about this: I wonder why the Linux kernel (no longer?)
 increments the use counter of an Ethernet driver (I saw it on sky2 and
 e1000) when the interface is up, running and configured. I can unload
 the sky2 driver without doing an 'ifconfig eth0 down' beforehand. Could
 someone provide me with background on this?

It was intentional in 2.6 to allow interfaces to be hot-removed.
Remember with Internet protocols there is no hard binding (normally)
between address and device and connections should not go down
if link fails.

 
 With that everything works. If someone is interested in my userland
 watchdog, just send me an e-mail.

Hopefully, it won't be necessary for long.
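
The watchdog policy quoted above can be read as a small state machine. The following is only a sketch of that policy as described in the mail, not Thomas's actual program: the names wd_state, wd_step and the action enum are hypothetical, and only the 90 second threshold and the "run the fix once per outage" rule come from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the 90 s threshold comes from the quoted description. */
#define WD_REBOOT_AFTER_SECS 90

enum wd_action { WD_NONE, WD_RUN_FIX, WD_REBOOT };

struct wd_state {
	bool fix_ran;            /* fix script already run for this outage */
	unsigned secs_since_fix; /* time without a ping since the fix ran */
};

/* Feed the result of one poll (one round of pings) into the state machine. */
enum wd_action wd_step(struct wd_state *st, bool ping_ok, unsigned elapsed_secs)
{
	assert(st != NULL);

	if (ping_ok) {
		/* A ping came back: re-arm the fix script for the next outage. */
		st->fix_ran = false;
		st->secs_since_fix = 0;
		return WD_NONE;
	}
	if (!st->fix_ran) {
		/* First lost ping of this outage: reload the driver once. */
		st->fix_ran = true;
		st->secs_since_fix = 0;
		return WD_RUN_FIX;
	}
	/* Fix already tried: count how long the router stays unreachable. */
	st->secs_since_fix += elapsed_secs;
	return st->secs_since_fix >= WD_REBOOT_AFTER_SECS ? WD_REBOOT : WD_NONE;
}
```

A caller would invoke wd_step() once per polling interval (ten seconds in the mail) and map WD_RUN_FIX to the fix_network script and WD_REBOOT to the reboot path.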
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: sky2 hangs

2007-02-01 Thread Stephen Hemminger

On Thu, 1 Feb 2007 19:55:32 +0100
Thomas Glanzmann [EMAIL PROTECTED] wrote:

 Hello,
 I have a sky2 network card in my Intel Mac mini. It stops working under
 heavy network load, such as watching a DivX over http/sshfs.

Is this heavy Tx load (i.e. you are watching a movie served from the Mac mini) or
Rx load (you are watching a movie on the Mac mini)?

Re: sky2 hangs

2007-02-01 Thread Stephen Hemminger

I can reproduce the problem now (on mac mini). Interestingly it seems to whack
the whole ethernet switch when it happens.
 
 - a previously suggested fix - passing idle=poll to the kernel - did not
 work for me in the end

It is not an MSI or IRQ problem. It is a phy problem (see below).

 - the locks I have happen very periodically (somewhere around every 22-28
 hours), as if the chip would die after a given amount of data transferred;
 I know this looks stupid but I thought I might mention it
 - I have about 1Mbit/s of (incoming) traffic on this interface: with
 short, very high peaks, as there is a MySQL server on the other end,
 receiving about 100 queries per second
 - unloading the sky2 module totally freezes the computer for me

If you do:
ethtool -r eth0
it causes a PHY reset (renegotiation) and clears the problem.


-- 
Stephen Hemminger [EMAIL PROTECTED]

[PATCH 1/4] skge: handle zero address at open

2007-02-02 Thread Stephen Hemminger

Some motherboards are broken and have no address set. Failing at probe time
prevents the device from ever being used (for example, to download a fixed BIOS).
Instead, warn at probe time and check again when the device is brought up. That
way the address can be set before the device is used.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- skge.orig/drivers/net/skge.c
+++ skge/drivers/net/skge.c
@@ -2373,6 +2373,9 @@ static int skge_up(struct net_device *de
size_t rx_size, tx_size;
int err;
 
+   if (!is_valid_ether_addr(dev->dev_addr))
+   return -EINVAL;
+
if (netif_msg_ifup(skge))
printk(KERN_INFO PFX "%s: enabling interface\n", dev->name);
 
@@ -3567,11 +3570,10 @@ static int __devinit skge_probe(struct p
if (!dev)
goto err_out_led_off;
 
+   /* Some motherboards are broken and has zero in ROM. */
if (!is_valid_ether_addr(dev->dev_addr)) {
-   printk(KERN_ERR PFX "%s: bad (zero?) ethernet address in rom\n",
+   printk(KERN_WARNING PFX "%s: bad (zero?) ethernet address in rom\n",
   pci_name(pdev));
-   err = -EIO;
-   goto err_out_free_netdev;
}
 
err = register_netdev(dev);

-- 
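
For readers unfamiliar with the check used in this patch: is_valid_ether_addr() in the kernel rejects the all-zero address and any multicast (including broadcast) address. The following is a userspace sketch of that rule with a made-up function name, meant only to illustrate what "bad (zero?) ethernet address" covers; it is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* True if addr is usable as a station address:
 * not all zero, and the multicast bit of the first octet is clear. */
bool mac_is_valid_sketch(const uint8_t addr[MAC_LEN])
{
	static const uint8_t zero[MAC_LEN];

	assert(addr != NULL);
	if (addr[0] & 0x01)
		return false;	/* multicast or broadcast */
	return memcmp(addr, zero, MAC_LEN) != 0;
}
```

With this patch, an address that fails the check only produces a warning at probe time, and bringing the interface up returns -EINVAL until a valid address has been set.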


[PATCH 0/4] skge: update

2007-02-02 Thread Stephen Hemminger

Several enhancements: WOL now works, use dev_printk macros
and allow handling broken hardware better.

-- 


[PATCH 4/4] skge: version 1.10

2007-02-02 Thread Stephen Hemminger

Mark this as 1.10 because WOL now works

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- skge.orig/drivers/net/skge.c
+++ skge/drivers/net/skge.c
@@ -42,7 +42,7 @@
 #include "skge.h"
 
 #define DRV_NAME	"skge"
-#define DRV_VERSION	"1.9"
+#define DRV_VERSION	"1.10"
 #define PFX		DRV_NAME " "
 
 #define DEFAULT_TX_RING_SIZE   128

-- 


[PATCH 3/4] skge: WOL support

2007-02-02 Thread Stephen Hemminger

Add WOL support for Yukon chipsets in skge device.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/skge.c |  158 +
 drivers/net/skge.h |2 
 2 files changed, 125 insertions(+), 35 deletions(-)

--- skge.orig/drivers/net/skge.c
+++ skge/drivers/net/skge.c
@@ -132,18 +132,93 @@ static void skge_get_regs(struct net_dev
 }
 
 /* Wake on Lan only supported on Yukon chips with rev 1 or above */
-static int wol_supported(const struct skge_hw *hw)
+static u32 wol_supported(const struct skge_hw *hw)
 {
-   return !((hw->chip_id == CHIP_ID_GENESIS ||
- (hw->chip_id == CHIP_ID_YUKON && hw->chip_rev == 0)));
+   if (hw->chip_id == CHIP_ID_YUKON && hw->chip_rev != 0)
+   return WAKE_MAGIC | WAKE_PHY;
+   else
+   return 0;
+}
+
+static u32 pci_wake_enabled(struct pci_dev *dev)
+{
+   int pm = pci_find_capability(dev, PCI_CAP_ID_PM);
+   u16 value;
+
+   /* If device doesn't support PM Capabilities, but request is to disable
+* wake events, it's a nop; otherwise fail */
+   if (!pm)
+   return 0;
+
+   pci_read_config_word(dev, pm + PCI_PM_PMC, &value);
+
+   value &= PCI_PM_CAP_PME_MASK;
+   value >>= ffs(PCI_PM_CAP_PME_MASK) - 1;   /* First bit of mask */
+
+   return value != 0;
+}
+
+static void skge_wol_init(struct skge_port *skge)
+{
+   struct skge_hw *hw = skge->hw;
+   int port = skge->port;
+   enum pause_control save_mode;
+   u32 ctrl;
+
+   /* Bring hardware out of reset */
+   skge_write16(hw, B0_CTST, CS_RST_CLR);
+   skge_write16(hw, SK_REG(port, GMAC_LINK_CTRL), GMLC_RST_CLR);
+
+   skge_write8(hw, SK_REG(port, GPHY_CTRL), GPC_RST_CLR);
+   skge_write8(hw, SK_REG(port, GMAC_CTRL), GMC_RST_CLR);
+
+   /* Force to 10/100 skge_reset will re-enable on resume   */
+   save_mode = skge->flow_control;
+   skge->flow_control = FLOW_MODE_SYMMETRIC;
+
+   ctrl = skge->advertising;
+   skge->advertising &= ~(ADVERTISED_1000baseT_Half|ADVERTISED_1000baseT_Full);
+
+   skge_phy_reset(skge);
+
+   skge->flow_control = save_mode;
+   skge->advertising = ctrl;
+
+   /* Set GMAC to no flow control and auto update for speed/duplex */
+   gma_write16(hw, port, GM_GP_CTRL,
+   GM_GPCR_FC_TX_DIS|GM_GPCR_TX_ENA|GM_GPCR_RX_ENA|
+   GM_GPCR_DUP_FULL|GM_GPCR_FC_RX_DIS|GM_GPCR_AU_FCT_DIS);
+
+   /* Set WOL address */
+   memcpy_toio(hw->regs + WOL_REGS(port, WOL_MAC_ADDR),
+   skge->netdev->dev_addr, ETH_ALEN);
+
+   /* Turn on appropriate WOL control bits */
+   skge_write16(hw, WOL_REGS(port, WOL_CTRL_STAT), WOL_CTL_CLEAR_RESULT);
+   ctrl = 0;
+   if (skge->wol & WAKE_PHY)
+   ctrl |= WOL_CTL_ENA_PME_ON_LINK_CHG|WOL_CTL_ENA_LINK_CHG_UNIT;
+   else
+   ctrl |= WOL_CTL_DIS_PME_ON_LINK_CHG|WOL_CTL_DIS_LINK_CHG_UNIT;
+
+   if (skge->wol & WAKE_MAGIC)
+   ctrl |= WOL_CTL_ENA_PME_ON_MAGIC_PKT|WOL_CTL_ENA_MAGIC_PKT_UNIT;
+   else
+   ctrl |= WOL_CTL_DIS_PME_ON_MAGIC_PKT|WOL_CTL_DIS_MAGIC_PKT_UNIT;;
+
+   ctrl |= WOL_CTL_DIS_PME_ON_PATTERN|WOL_CTL_DIS_PATTERN_UNIT;
+   skge_write16(hw, WOL_REGS(port, WOL_CTRL_STAT), ctrl);
+
+   /* block receiver */
+   skge_write8(hw, SK_REG(port, RX_GMF_CTRL_T), GMF_RST_SET);
 }
 
 static void skge_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
 {
struct skge_port *skge = netdev_priv(dev);
 
-   wol->supported = wol_supported(skge->hw) ? WAKE_MAGIC : 0;
-   wol->wolopts = skge->wol ? WAKE_MAGIC : 0;
+   wol->supported = wol_supported(skge->hw);
+   wol->wolopts = skge->wol;
 }
 
 static int skge_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
@@ -151,23 +226,12 @@ static int skge_set_wol(struct net_devic
struct skge_port *skge = netdev_priv(dev);
struct skge_hw *hw = skge->hw;
 
-   if (wol->wolopts != WAKE_MAGIC && wol->wolopts != 0)
-   return -EOPNOTSUPP;
-
-   if (wol->wolopts == WAKE_MAGIC && !wol_supported(hw))
+   if (wol->wolopts & ~wol_supported(hw))
return -EOPNOTSUPP;
 
-   skge->wol = wol->wolopts == WAKE_MAGIC;
-
-   if (skge->wol) {
-   memcpy_toio(hw->regs + WOL_MAC_ADDR, dev->dev_addr, ETH_ALEN);
-
-   skge_write16(hw, WOL_CTRL_STAT,
-WOL_CTL_ENA_PME_ON_MAGIC_PKT |
-WOL_CTL_ENA_MAGIC_PKT_UNIT);
-   } else
-   skge_write16(hw, WOL_CTRL_STAT, WOL_CTL_DEFAULT);
-
+   skge->wol = wol->wolopts;
+   if (!netif_running(dev))
+   skge_wol_init(skge);
return 0;
 }
 
@@ -3456,6 +3520,7 @@ static struct net_device *skge_devinit(s
skge->duplex = -1;
skge->speed = -1;
skge->advertising = skge_supported_modes(hw);
+   skge->wol = pci_wake_enabled(hw->pdev) ? wol_supported(hw) : 0;
 
hw
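
The key change in this patch is that wol_supported() now returns a bitmask of wake-up modes instead of a yes/no answer, so skge_set_wol() can reject any requested bit that the chip does not offer. A minimal userspace sketch of that validation follows. The SK_ names, the sk_chip enum and both function names are hypothetical stand-ins; only the bit values of the two wake flags match WAKE_PHY and WAKE_MAGIC from linux/ethtool.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit values as WAKE_PHY and WAKE_MAGIC in <linux/ethtool.h>. */
#define SK_WAKE_PHY   (1u << 0)
#define SK_WAKE_MAGIC (1u << 5)

/* Hypothetical stand-ins for the driver's chip identifiers. */
enum sk_chip { SK_CHIP_GENESIS, SK_CHIP_YUKON };

/* Wake-up modes offered by a chip: only Yukon rev 1 or above has any. */
uint32_t sk_wol_supported(enum sk_chip chip, unsigned rev)
{
	if (chip == SK_CHIP_YUKON && rev != 0)
		return SK_WAKE_MAGIC | SK_WAKE_PHY;
	return 0;
}

/* Accept the request only if every requested bit is supported. */
int sk_set_wol(uint32_t *wol, uint32_t requested, uint32_t supported)
{
	assert(wol != NULL);
	if (requested & ~supported)
		return -EOPNOTSUPP;
	*wol = requested;
	return 0;
}
```

Note that a request of 0 (disable wake-up) passes the mask test on every chip, which the old pair of equality checks had to handle as a special case.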

[PATCH 2/4] skge: use dev_printk

2007-02-02 Thread Stephen Hemminger

Use dev_printk related macros for PCI related errors and warnings

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- skge.orig/drivers/net/skge.c
+++ skge/drivers/net/skge.c
@@ -2395,7 +2395,7 @@ static int skge_up(struct net_device *de
BUG_ON(skge->dma & 7);
 
if ((u64)skge->dma >> 32 != ((u64) skge->dma + skge->mem_size) >> 32) {
-   printk(KERN_ERR PFX "pci_alloc_consistent region crosses 4G boundary\n");
+   dev_err(&hw->pdev->dev, "pci_alloc_consistent region crosses 4G boundary\n");
err = -EINVAL;
goto free_pci_mem;
}
@@ -3004,6 +3004,7 @@ static void skge_mac_intr(struct skge_hw
 /* Handle device specific framing and timeout interrupts */
 static void skge_error_irq(struct skge_hw *hw)
 {
+   struct pci_dev *pdev = hw->pdev;
u32 hwstatus = skge_read32(hw, B0_HWE_ISRC);
 
if (hw->chip_id == CHIP_ID_GENESIS) {
@@ -3019,12 +3020,12 @@ static void skge_error_irq(struct skge_h
}
 
if (hwstatus & IS_RAM_RD_PAR) {
-   printk(KERN_ERR PFX "Ram read data parity error\n");
+   dev_err(&pdev->dev, "Ram read data parity error\n");
skge_write16(hw, B3_RI_CTRL, RI_CLR_RD_PERR);
}
 
if (hwstatus & IS_RAM_WR_PAR) {
-   printk(KERN_ERR PFX "Ram write data parity error\n");
+   dev_err(&pdev->dev, "Ram write data parity error\n");
skge_write16(hw, B3_RI_CTRL, RI_CLR_WR_PERR);
}
 
@@ -3035,38 +3036,38 @@ static void skge_error_irq(struct skge_h
skge_mac_parity(hw, 1);
 
if (hwstatus & IS_R1_PAR_ERR) {
-   printk(KERN_ERR PFX "%s: receive queue parity error\n",
-  hw->dev[0]->name);
+   dev_err(&pdev->dev, "%s: receive queue parity error\n",
+   hw->dev[0]->name);
skge_write32(hw, B0_R1_CSR, CSR_IRQ_CL_P);
}
 
if (hwstatus & IS_R2_PAR_ERR) {
-   printk(KERN_ERR PFX "%s: receive queue parity error\n",
-  hw->dev[1]->name);
+   dev_err(&pdev->dev, "%s: receive queue parity error\n",
+   hw->dev[1]->name);
skge_write32(hw, B0_R2_CSR, CSR_IRQ_CL_P);
}
 
if (hwstatus & (IS_IRQ_MST_ERR|IS_IRQ_STAT)) {
u16 pci_status, pci_cmd;
 
-   pci_read_config_word(hw->pdev, PCI_COMMAND, &pci_cmd);
-   pci_read_config_word(hw->pdev, PCI_STATUS, &pci_status);
+   pci_read_config_word(pdev, PCI_COMMAND, &pci_cmd);
+   pci_read_config_word(pdev, PCI_STATUS, &pci_status);
 
-   printk(KERN_ERR PFX "%s: PCI error cmd=%#x status=%#x\n",
-  pci_name(hw->pdev), pci_cmd, pci_status);
+   dev_err(&pdev->dev, "PCI error cmd=%#x status=%#x\n",
+   pci_cmd, pci_status);
 
/* Write the error bits back to clear them. */
pci_status &= PCI_STATUS_ERROR_BITS;
skge_write8(hw, B2_TST_CTRL1, TST_CFG_WRITE_ON);
-   pci_write_config_word(hw->pdev, PCI_COMMAND,
+   pci_write_config_word(pdev, PCI_COMMAND,
  pci_cmd | PCI_COMMAND_SERR | PCI_COMMAND_PARITY);
-   pci_write_config_word(hw->pdev, PCI_STATUS, pci_status);
+   pci_write_config_word(pdev, PCI_STATUS, pci_status);
skge_write8(hw, B2_TST_CTRL1, TST_CFG_WRITE_OFF);
 
/* if error still set then just ignore it */
hwstatus = skge_read32(hw, B0_HWE_ISRC);
if (hwstatus & IS_IRQ_STAT) {
-   printk(KERN_INFO PFX "unable to clear error (so ignoring them)\n");
+   dev_warn(&hw->pdev->dev, "unable to clear error (so ignoring them)\n");
hw->intr_mask &= ~IS_HW_ERR;
}
}
@@ -3280,8 +3281,8 @@ static int skge_reset(struct skge_hw *hw
hw->phy_addr = PHY_ADDR_BCOM;
break;
default:
-   printk(KERN_ERR PFX "%s: unsupported phy type 0x%x\n",
-  pci_name(hw->pdev), hw->phy_type);
+   dev_err(&hw->pdev->dev, "unsupported phy type 0x%x\n",
+  hw->phy_type);
return -EOPNOTSUPP;
}
break;
@@ -3296,8 +3297,8 @@ static int skge_reset(struct skge_hw *hw
break;
 
default:
-   printk(KERN_ERR PFX "%s: unsupported chip type 0x%x\n",
-  pci_name(hw->pdev), hw->chip_id);
+   dev_err(&hw->pdev->dev, "unsupported chip type 0x%x\n",
+  hw->chip_id);
return -EOPNOTSUPP;
}
 
@@ -3337,7 +3338,7 @@ static int skge_reset(struct skge_hw *hw
/* avoid boards with stuck Hardware error bits */
if ((skge_read32(hw, B0_ISRC) & IS_HW_ERR

[RFT] sky2 auto negotiation PHY errata

2007-02-02 Thread Stephen Hemminger

This patch applies the Marvell errata workaround before auto negotiation
(from drivers/net/phy/marvell.c).  The Yukon II chips have an internal
version of the same PHY, so perhaps this errata is necessary for them
as well.

For test only, but it may fix some of the hangs. It seems to fix
the PHY lockups I saw yesterday on Mac Mini.

---
 drivers/net/sky2.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 822dd0b..4f04ffa 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -387,6 +387,14 @@ static void sky2_phy_init(struct sky2_hw
 
if (sky2->autoneg == AUTONEG_ENABLE) {
if (sky2_is_copper(hw)) {
+   /* Errata setup */
+   gm_phy_write(hw, port, PHY_MARV_PAGE_ADDR, 0x1f);
+   gm_phy_write(hw, port, PHY_MARV_PAGE_DATA, 0x200c);
+   gm_phy_write(hw, port, PHY_MARV_PAGE_ADDR, 5);
+   gm_phy_write(hw, port, PHY_MARV_PAGE_DATA, 0);
+   gm_phy_write(hw, port, PHY_MARV_PAGE_DATA, 0x100);
+
+
if (sky2->advertising & ADVERTISED_1000baseT_Full)
ct1000 |= PHY_M_1000C_AFD;
if (sky2->advertising & ADVERTISED_1000baseT_Half)
-- 
1.4.1


[PATCH] sky2: flow control off

2007-02-02 Thread Stephen Hemminger

Turn flow control off for sky2. When flow control is on, the transmitter
may get randomly stuck. Perhaps there is hardware problem, but until
Marvell provides errata information for workaround, it should default to off.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
---
 drivers/net/sky2.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 822dd0b..a31dea5 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -3263,7 +3263,7 @@ #endif
 
/* Auto speed and flow control */
sky2->autoneg = AUTONEG_ENABLE;
-   sky2->flow_mode = FC_BOTH;
+   sky2->flow_mode = FC_NONE;
 
sky2->duplex = -1;
sky2->speed = -1;
-- 
1.4.1


Re: [TCP]: Fix truesize underflow

2006-04-19 Thread Stephen Hemminger

On Tue, 18 Apr 2006 22:32:04 +1000
Herbert Xu [EMAIL PROTECTED] wrote:

 Hi Dave:
 
 You're absolutely right about there being a problem with the TSO packet
 trimming code.  The cause of this lies in the tcp_fragment() function.
 
 When we allocate a fragment for a completely non-linear packet the
 truesize is calculated for a payload length of zero.  This means that
 truesize could in fact be less than the real payload length.
 
 When that happens the TSO packet trimming can cause truesize to become
 negative.  This in turn can cause sk_forward_alloc to be -n * PAGE_SIZE
 which would trigger the warning.
 
 I've copied the code you used in tso_fragment which should work here.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]
 
 Everyone who's having the sk_forward_alloc warning problem should give
 this patch a go to see if it cures things.
 
 Just in case this still doesn't fix it, could everyone please also verify
 whether disabling SMP has any effect on reproducing this?
 
 Thanks,

Please put this in the next -stable load...

Re: I/OAT: Call for discussion

2006-04-19 Thread Stephen Hemminger

On Wed, 19 Apr 2006 09:39:37 -0700
Grover, Andrew [EMAIL PROTECTED] wrote:

 Over the past few months, we (the Intel networking group) have been
 working hard, often off-list, to get the I/OAT patches we've posted here
 merged into the mainline kernel branch, as well as Red Hat and SuSE.
 We've had some success, but not what's really important: getting it into
 the mainline kernel releases.

Vendor kernel support has little or no bearing on eventual inclusion.

 Of course some of this can be blamed on how a corporate culture
 approaches the open source community when it thinks it has something
 that gives it a competitive advantage in the marketplace. If we acted
 like jerks, it's just because we think we have something good here! :) 
 
 But seriously, I know we've had longer turnaround times in releases and
 replying to comments than people have liked. All we can say is sorry, we
 really have been doing our best. People were kind enough to review our
 patches and suggest over 50 improvements, we have fixed the patches
 accordingly, and we really do appreciate it.
 
 So OK assume we have a nice pretty patchset. Why should it go in? Since
 we have an NDA with Red Hat we've been trying to convince DaveM and Red
 Hat of I/OAT's merits off-list, but this kind of change needs a more
 public airing of all its pros and cons.

Off-list lobbying usually has a negative impact.

 We have posted all the performance data we have gathered so far on the
 linux-net wiki: http://linux-net.osdl.org/index.php/I/OAT , and listed
 the overall concerns that have been expressed in private. I'm hoping you
 will look at the data, re-examine the patches, and then we can talk
 about the technical issues here on the list, getting down to the
 specifics, so we can hash it out in public and settle on the right path
 to take.

The biggest barrier at this point seems to be hardware availability.
People generally don't care unless they use or are going to get that hardware.
Also the big benchmark data, although interesting, is usually only
interesting to vendors.

You probably will have to suffer out of tree for a while until the hardware
becomes more available. When the hardware is more common, then the
implementation details will be sorted out. Also after the 2+ years of getting
TSO to work right, maybe the developers are a little gun shy at this point.

Re: Netpoll checksum issue

2006-04-19 Thread Stephen Hemminger

The changes to how hardware receive checksums are handled broke
the netpoll checksum code (for CHECKSUM_HW).  Since this is not at
all performance critical, try this patch. It changes the code to always
do a normal software checksum.

--- linux-2.6.orig/net/core/netpoll.c   2006-03-22 09:30:56.0 -0800
+++ linux-2.6/net/core/netpoll.c2006-04-19 10:30:13.0 -0700
@@ -102,20 +102,11 @@
 static int checksum_udp(struct sk_buff *skb, struct udphdr *uh,
 unsigned short ulen, u32 saddr, u32 daddr)
 {
-   unsigned int psum;
-
if (uh->check == 0 || skb->ip_summed == CHECKSUM_UNNECESSARY)
return 0;
 
-   psum = csum_tcpudp_nofold(saddr, daddr, ulen, IPPROTO_UDP, 0);
-
-   if (skb->ip_summed == CHECKSUM_HW &&
-   !(u16)csum_fold(csum_add(psum, skb->csum)))
-   return 0;
-
-   skb->csum = psum;
-
-   return __skb_checksum_complete(skb);
+   skb->csum = csum_tcpudp_nofold(saddr, daddr, ulen, IPPROTO_UDP, 0);
+   return (u16) csum_fold(skb_checksum(skb, 0, skb->len, skb->csum));
 }
 
 /*
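
What the patched function computes is the plain software UDP checksum: a 16-bit one's-complement sum over the IPv4 pseudo-header plus the UDP header and payload, folded and complemented, where a result of 0 means the packet is good. The following is a self-contained userspace sketch of that calculation, not the kernel helpers themselves. All function names are made up, and addresses are assumed to be passed as host-order 32-bit values for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROTO_UDP 17	/* IP protocol number for UDP */

/* Sum of the IPv4 pseudo-header: both addresses, protocol, UDP length. */
uint32_t pseudo_sum(uint32_t saddr, uint32_t daddr, uint16_t ulen)
{
	return (saddr >> 16) + (saddr & 0xffff) +
	       (daddr >> 16) + (daddr & 0xffff) + PROTO_UDP + ulen;
}

/* Add the big-endian 16-bit words of buf to a running sum. */
uint32_t sum_words(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] << 8 | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;	/* pad odd byte with zero */
	return sum;
}

/* Fold the carries back in and take the one's complement. */
uint16_t fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Returns 0 if the datagram is acceptable, nonzero on a checksum mismatch.
 * udp points at the UDP header; ulen covers header plus payload. */
uint16_t udp_csum_check(const uint8_t *udp, uint16_t ulen,
			uint32_t saddr, uint32_t daddr)
{
	assert(udp != NULL && ulen >= 8);
	if (udp[6] == 0 && udp[7] == 0)
		return 0;	/* sender did not compute a checksum */
	return fold(sum_words(udp, ulen, pseudo_sum(saddr, daddr, ulen)));
}
```

The return convention (0 means good) mirrors checksum_udp() in the patch above.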

Re: TOE info page

2006-04-19 Thread Stephen Hemminger

On Wed, 19 Apr 2006 19:22:14 -0400
Jeff Garzik [EMAIL PROTECTED] wrote:

 
 I created a TOE (TCP Offload Engine) info page for Linux, on the 
 linux-net wiki:
 
   http://linux-net.osdl.org/index.php/TOE
 
 As soon as I can find a wiki admin, it will get added to the main page. 
   I don't seem to have such access.
 
   Jeff

I am the main administrator. I updated the front page, and added a couple
more stubs for NAPI, TSO, UFO.

[PATCH] bridge: allow full size vlan tagged packets to be bridged

2006-04-20 Thread Stephen Hemminger

The Ethernet bridge code silently drops packets when forwarding a packet
that is too large for the destination interface (as per 802.1d). But it
should allow for VLAN tagged frames.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- bridge.orig/net/bridge/br_forward.c 2006-04-10 16:17:51.0 -0700
+++ bridge/net/bridge/br_forward.c  2006-04-19 13:50:42.0 -0700
@@ -16,6 +16,7 @@
 #include <linux/kernel.h>
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
+#include <linux/if_vlan.h>
 #include <linux/netfilter_bridge.h>
 #include "br_private.h"
 
@@ -29,10 +30,15 @@
return 1;
 }
 
+static inline unsigned packet_length(const struct sk_buff *skb)
+{
+   return skb->len - (skb->protocol == htons(ETH_P_8021Q) ? VLAN_HLEN : 0);
+}
+
 int br_dev_queue_push_xmit(struct sk_buff *skb)
 {
/* drop mtu oversized packets except tso */
-   if (skb->len > skb->dev->mtu && !skb_shinfo(skb)->tso_size)
+   if (packet_length(skb) > skb->dev->mtu && !skb_shinfo(skb)->tso_size)
kfree_skb(skb);
else {
 #ifdef CONFIG_BRIDGE_NETFILTER
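
To make the effect of packet_length() concrete: a full-size frame carrying a 4-byte 802.1Q tag is 1504 bytes long on a 1500-byte MTU link and used to be dropped. A minimal userspace sketch of the drop decision follows. The BR_ constants carry the standard values of ETH_P_8021Q and VLAN_HLEN, the function names are made up, and the ethertype is taken in host byte order here, unlike skb->protocol in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BR_ETH_P_8021Q 0x8100u	/* 802.1Q VLAN ethertype */
#define BR_VLAN_HLEN   4u	/* bytes added by one VLAN tag */

/* Length that counts against the MTU: the VLAN tag is not charged. */
unsigned br_packet_length(unsigned len, uint16_t ethertype)
{
	if (ethertype == BR_ETH_P_8021Q) {
		assert(len >= BR_VLAN_HLEN);
		return len - BR_VLAN_HLEN;
	}
	return len;
}

/* True if the bridge would drop the packet as oversized.
 * TSO packets (tso_size != 0) are exempt, as in the patch. */
bool br_drops(unsigned len, uint16_t ethertype, unsigned mtu, unsigned tso_size)
{
	return br_packet_length(len, ethertype) > mtu && tso_size == 0;
}
```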

Re: Congestion Avoidance Monitoring Tools

2006-04-21 Thread Stephen Hemminger

On Thu, 20 Apr 2006 22:26:14 -0700
Piet Delaney [EMAIL PROTECTED] wrote:

 I'm upgrading our 2.6.12 kernel to 2.6.13, which includes significant
 congestion avoidance code additions and changes. I was wondering if
 there are any tools folks can recommend for testing the kernel to make
 sure the congestion avoidance code is operating correctly. For 
 example the displaying of the congestion window as a function of time
 while undergoing convergence. For causing congestion I could modify 
 a kernel to discard packets once in a while on a lab gateway and hit 
 it with iperf. HP's netperf looks interesting. 
 
 Any suggestions?
 
 
 -piet
 

2.6.13 still had lots of problems, things didn't really get working
right till 2.6.15 or later. Especially with TSO.

I have a tool that uses kprobes; see
http://developer.osdl.org/shemminger/prototypes/tcpprobe.tar.gz
I try to keep it up to date with the current kernel and build process; I last
used it on 2.6.16.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Hotplug race on name change

2006-04-21 Thread Stephen Hemminger

This:
 Without that patch, there is a race when registering network interfaces 
 and renaming it with udev rules, because initially the address in 
 sysfs doesn't contain useful data. See 
 http://marc.theaimsgroup.com/?t=11446033892r=1w=2
 
 Breaking the recommended way of assigning persistent network interface 
 names is, IMHO, a bug serious enough to be fixed in -stable.
 
 Signed-off-by: Alexander E. Patrakov [EMAIL PROTECTED]
 
 ---
 
 --- linux-2.6.16.5/net/core/dev.c
 +++ linux-2.6.16.5/net/core/dev.c
 @@ -2932,11 +2932,11 @@
  
   switch(dev->reg_state) {
   case NETREG_REGISTERING:
 + dev->reg_state = NETREG_REGISTERED;
   err = netdev_register_sysfs(dev);
   if (err)
   printk(KERN_ERR "%s: failed sysfs registration (%d)\n",
  dev->name, err);
 - dev->reg_state = NETREG_REGISTERED;
   break;
  
   case NETREG_UNREGISTERING:

Introduces new races in netdev_register_sysfs if the name changes, because
netdev_register_sysfs runs without RTNL at this point. So if some application
gets in and changes the device name while netdev_register_sysfs is running, then
the class_dev->class_id would end up not matching the netdevice->name.

Not a big issue, since hotplug doesn't get run until the device is registered.
Ideally, it would be possible to create the groups in the class device before it
is registered, but that won't work with the existing class device interface.

I am working on a patch to extend class_device to allow the creation of groups
to be atomic (like the attributes are).

Fw: [Bug 6421] New: kernel 2.6.10-2.6.16 on alpha: arch/alpha/kernel/io.c, iowrite16_rep() BUG_ON((unsigned long)src & 0x1) triggered

2006-04-21 Thread Stephen Hemminger

Looks like PIO at unaligned addresses doesn't work on alpha...

Begin forwarded message:

Date: Fri, 21 Apr 2006 02:35:45 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bug 6421] New: kernel 2.6.10-2.6.16 on alpha: arch/alpha/kernel/io.c, iowrite16_rep() BUG_ON((unsigned long)src & 0x1) triggered

http://bugzilla.kernel.org/show_bug.cgi?id=6421

   Summary: kernel 2.6.10-2.6.16 on alpha: arch/alpha/kernel/io.c,
iowrite16_rep() BUG_ON((unsigned long)src & 0x1)
triggered
Kernel Version: 2.6.16
Status: NEW
  Severity: blocking
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]

Most recent kernel where this bug did not occur: 2.6.10 (i think so)

Distribution: self build (RH9, FC4/5 on alpha mix) 
Hardware Environment: alpha sx164pc board with ne2000 ISA card.
Software Environment:
Problem Description:

Since 2.6.10 the BUG_ON() in arch/alpha/kernel/io.c iowrite16_rep()
is triggered randomly, but reproducibly, on the Linux alpha platform with an
ne2000/8390 ISA network card.

ne.c:v1.10 9/23/94 Donald Becker ([EMAIL PROTECTED])
Last modified Nov 1, 2000 by Paul Gortmaker
NE*000 ethercard probe at 0x320: 00 80 29 63 4a f6
eth%d: NE2000 found at 0x320, using IRQ 5.

I've traced this back to ei_start_xmit() in drivers/net/8390.c.
I added a workaround in ei_start_xmit(), because otherwise the system
is no longer usable:
  if ( (unsigned long) skb->data & 0x1)
{
printk(KERN_WARNING "ei_start_xmit(): skb->data unaligned %p align to %p length %i\n",
skb->data, scratch, send_length);
if ( send_length >= 128 ) goto normal;
memset( scratch, 0, 128);
memcpy( scratch, skb->data, send_length < 128 ? send_length : 128);
ei_block_output(dev, send_length, scratch, output_page);
}
else
{
normal:
ei_block_output(dev, send_length, skb->data, output_page);
}
And the output in the kernel ring buffer is:
dmesg | grep xmit
ei_start_xmit(): skb->data unaligned fc0019be55d5 align to fc001ef37620 length 60
ei_start_xmit(): skb->data unaligned fc0019be55d5 align to fc001ef37620 length 60
ei_start_xmit(): skb->data unaligned fc0008ceb735 align to fc00019dbb18 length 60
ei_start_xmit(): skb->data unaligned fc000ea787c5 align to fc001f683540 length 60
ei_start_xmit(): skb->data unaligned fc000e864fe9 align to fc000b737350 length 60
ei_start_xmit(): skb->data unaligned fc0008cd883f align to fc000b737350 length 60
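
The workaround in the report is a bounce buffer: when the source pointer is odd, the data is first copied into an even-aligned scratch area so that 16-bit port writes never see an unaligned address. A minimal userspace sketch of that idea follows. The function name is made up, and unlike the reporter's code (which falls back to the unaligned buffer for frames of 128 bytes or more) this sketch simply reports failure when the scratch area is too small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return a pointer that is safe for 16-bit accesses and holds the same
 * len bytes as data. If data is already even-aligned it is returned
 * unchanged; otherwise the bytes are copied into scratch.
 * Returns NULL if data is odd-aligned and does not fit into scratch. */
const void *even_aligned_source(const void *data, size_t len,
				void *scratch, size_t scratch_len)
{
	assert(data != NULL && scratch != NULL);
	assert(((uintptr_t)scratch & 0x1) == 0);	/* scratch must be even */

	if (((uintptr_t)data & 0x1) == 0)
		return data;		/* already aligned, no copy needed */
	if (len > scratch_len)
		return NULL;		/* caller must handle this case */
	memcpy(scratch, data, len);
	return scratch;
}
```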

So why does the skb->data address become unaligned while the system is running?

Steps to reproduce:
Every time. I think with higher system load the BUG is triggered earlier.

Here are some stack traces, with BUG_ON/WARN_ON added by me in more places to
track down the problem:
Kernel bug at net/ipv4/ip_output.c:297
cc1(2841): Kernel Bug 1
pc = [fc62a92c]  ra = [fc6407c8]  ps = Not tainted
pc is at ip_queue_xmit+0x59c/0x690
ra is at tcp_transmit_skb+0x588/0xbb0
v0 = fbc4  t0 = 0cfa4195  t1 = fc785d18
t2 = fc00014e26e0  t3 = 0030  t4 = 0030
t5 = fc000cfa41a9  t6 = 0002  t7 = fc00062d8000
a0 = fc001c14ef40  a1 =   a2 = 1400
a3 = 0600  a4 =   a5 = 
t8 =   t9 = fc000cfa41a9  t10= 0001
t11= 0200  pv = fc62a390  at = 
gp = fc7f1600  sp = fc00062db6e0
Trace:
[fc6407c8] tcp_transmit_skb+0x588/0xbb0
[fc6469ac] tcp_v4_send_check+0x11c/0x150
[fc6406cc] tcp_transmit_skb+0x48c/0xbb0
[fc641c48] tcp_retransmit_skb+0x128/0x7d0
[fc6436ec] tcp_xmit_retransmit_queue+0x1bc/0x3c0
[fc5f5618] sk_reset_timer+0x28/0x60
[fc63b8a0] tcp_ack+0x16e0/0x1e80
[fc63ec58] tcp_rcv_state_process+0x6c8/0x13f0
[fc649078] tcp_v4_do_rcv+0x128/0x480
[fc64a068] tcp_v4_rcv+0xc98/0xcb0
[fc62556c] ip_local_deliver+0x1ac/0x400
[fc625110] ip_rcv+0x480/0x730
[fc601794] netif_receive_skb+0x174/0x300
[fc6019ec] process_backlog+0xcc/0x1b0
[fc600394] net_rx_action+0xb4/0x1a0
[fc330cb0] __do_softirq+0x90/0x130
[fc3162e8] handle_IRQ_event+0x48/0x110
[fc330db4] do_softirq+0x64/0x70
[fc316c14] handle_irq+0x124/0x1b0
[fc31ff28] isa_device_interrupt+0x28/0x40
[fc31fe38] pyxis_device_interrupt+0x68/0x130
[fc317328] do_entInt+0x118/0x190
[fc3112d0] ret_from_sys_call+0x0/0x10
[fc393b90] __link_path_walk+0x900/0x10a0
[fc4ff080] memcmp+0x0/0x70
[fc393b90] __link_path_walk+0x900/0x10a0
[fc3943c0] link_path_walk+0x90/0x1b0
[fc4ff080] memcmp+0x0/0x70
[fc393b90] __link_path_walk+0x900/0x10a0
[fc3943c0] link_path_walk+0x90/0x1b0
[fc3956d8]

[PATCH 1/2] class device: add attribute_group creation

2006-04-21 Thread Stephen Hemminger

Extend the support for attribute groups in class_devices to allow groups
to be created as part of the registration process. This allows network devices
to avoid a race between registration and creating groups.

Note that unlike attributes, which are a property of the class object, the groups
are a property of the class_device object. This is done because there are
different types of network devices (wireless, for example).
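
The create-all-or-roll-back loop that this patch adds can be tried outside the kernel. What follows is only a userspace sketch: `struct demo_group`, `demo_create_group()`, `demo_remove_group()` and `demo_add_groups()` are hypothetical stand-ins for `struct attribute_group`, `sysfs_create_group()`, `sysfs_remove_group()` and `class_device_add_groups()`, not kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct attribute_group; not the kernel type. */
struct demo_group {
	const char *name;
	int fail;	/* non-zero: creating this group fails */
	int created;	/* set while the group exists */
};

/* Hypothetical stand-in for sysfs_create_group(). */
int demo_create_group(struct demo_group *g)
{
	if (g->fail)
		return -1;
	g->created = 1;
	return 0;
}

/* Hypothetical stand-in for sysfs_remove_group(). */
void demo_remove_group(struct demo_group *g)
{
	g->created = 0;
}

/* Create every group in a NULL-terminated array, or none of them. */
int demo_add_groups(struct demo_group **groups)
{
	int i;
	int error = 0;

	if (!groups)
		return 0;

	for (i = 0; groups[i]; i++) {
		error = demo_create_group(groups[i]);
		if (error) {
			/* Undo the groups created before the failing one. */
			while (--i >= 0)
				demo_remove_group(groups[i]);
			break;
		}
	}
	return error;
}
```

The point of the design is that a caller sees either all groups present or none, so there is no partially populated state to clean up later.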

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2-2.6.17.orig/drivers/base/class.c	2006-04-21 12:19:26.0 -0700
+++ sky2-2.6.17/drivers/base/class.c	2006-04-21 12:21:21.0 -0700
@@ -456,6 +456,35 @@
}
 }
 
+static int class_device_add_groups(struct class_device * cd)
+{
+	int i;
+	int error = 0;
+
+	if (cd->groups) {
+		for (i = 0; cd->groups[i]; i++) {
+			error = sysfs_create_group(&cd->kobj, cd->groups[i]);
+			if (error) {
+				while (--i >= 0)
+					sysfs_remove_group(&cd->kobj, cd->groups[i]);
+				goto out;
+			}
+		}
+	}
+out:
+	return error;
+}
+
+static void class_device_remove_groups(struct class_device * cd)
+{
+	int i;
+	if (cd->groups) {
+		for (i = 0; cd->groups[i]; i++) {
+			sysfs_remove_group(&cd->kobj, cd->groups[i]);
+		}
+	}
+}
+
 static ssize_t show_dev(struct class_device *class_dev, char *buf)
 {
 	return print_dev_t(buf, class_dev->devt);
@@ -559,6 +588,8 @@
 				  class_name);
 	}
 
+	class_device_add_groups(class_dev);
+
 	kobject_uevent(&class_dev->kobj, KOBJ_ADD);
 
 	/* notify any interfaces this device is now here */
@@ -672,6 +703,7 @@
 	if (class_dev->devt_attr)
 		class_device_remove_file(class_dev, class_dev->devt_attr);
 	class_device_remove_attrs(class_dev);
+	class_device_remove_groups(class_dev);
 
 	kobject_uevent(&class_dev->kobj, KOBJ_REMOVE);
 	kobject_del(&class_dev->kobj);
--- sky2-2.6.17.orig/include/linux/device.h	2006-04-21 12:19:26.0 -0700
+++ sky2-2.6.17/include/linux/device.h	2006-04-21 12:19:36.0 -0700
@@ -200,6 +200,7 @@
  * @node: for internal use by the driver core only.
  * @kobj: for internal use by the driver core only.
  * @devt_attr: for internal use by the driver core only.
+ * @groups: optional additional groups to be created
  * @dev: if set, a symlink to the struct device is created in the sysfs
  * directory for this struct class device.
  * @class_data: pointer to whatever you want to store here for this struct
@@ -228,6 +229,7 @@
 	struct device		* dev;		/* not necessary, but nice to have */
 	void			* class_data;	/* class-specific data */
 	struct class_device	*parent;	/* parent of this child device, if there is one */
+	struct attribute_group  ** groups;	/* optional groups */
 
void(*release)(struct class_device *dev);
int (*uevent)(struct class_device *dev, char **envp,
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/2] netdev: create attribute_groups with class_device_add

2006-04-21 Thread Stephen Hemminger

Atomically create attributes when class device is added. This avoids the
race between registering class_device (which generates hotplug event),
and the creation of attribute groups.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2-2.6.17.orig/net/core/dev.c 2006-04-21 12:20:58.0 -0700
+++ sky2-2.6.17/net/core/dev.c  2006-04-21 12:21:45.0 -0700
@@ -3043,11 +3043,11 @@
 
 	switch(dev->reg_state) {
 	case NETREG_REGISTERING:
-		dev->reg_state = NETREG_REGISTERED;
 		err = netdev_register_sysfs(dev);
 		if (err)
 			printk(KERN_ERR "%s: failed sysfs registration (%d)\n",
 			       dev->name, err);
+		dev->reg_state = NETREG_REGISTERED;
break;
 
case NETREG_UNREGISTERING:
--- sky2-2.6.17.orig/net/core/net-sysfs.c	2006-04-21 12:20:58.0 -0700
+++ sky2-2.6.17/net/core/net-sysfs.c	2006-04-21 12:21:45.0 -0700
@@ -29,7 +29,7 @@
 
 static inline int dev_isalive(const struct net_device *dev) 
 {
-	return dev->reg_state == NETREG_REGISTERED;
+	return dev->reg_state <= NETREG_REGISTERED;
 }
 
 /* use same locking rules as GIF* ioctl's */
@@ -445,58 +445,33 @@
 
 void netdev_unregister_sysfs(struct net_device * net)
 {
-	struct class_device * class_dev = &(net->class_dev);
-
-	if (net->get_stats)
-		sysfs_remove_group(&class_dev->kobj, &netstat_group);
-
-#ifdef WIRELESS_EXT
-	if (net->get_wireless_stats || (net->wireless_handlers &&
-			net->wireless_handlers->get_wireless_stats))
-		sysfs_remove_group(&class_dev->kobj, &wireless_group);
-#endif
-	class_device_del(class_dev);
-
+	class_device_del(&(net->class_dev));
 }
 
 /* Create sysfs entries for network device. */
 int netdev_register_sysfs(struct net_device *net)
 {
 	struct class_device *class_dev = &(net->class_dev);
-	int ret;
+	struct attribute_group **groups = net->sysfs_groups;
 
+	class_device_initialize(class_dev);
 	class_dev->class = &net_class;
 	class_dev->class_data = net;
+	class_dev->groups = groups;
 
+	BUILD_BUG_ON(BUS_ID_SIZE < IFNAMSIZ);
 	strlcpy(class_dev->class_id, net->name, BUS_ID_SIZE);
-	if ((ret = class_device_register(class_dev)))
-		goto out;
 
-	if (net->get_stats &&
-	    (ret = sysfs_create_group(&class_dev->kobj, &netstat_group)))
-		goto out_unreg; 
+	if (net->get_stats)
+		*groups++ = &netstat_group;
 
 #ifdef WIRELESS_EXT
-	if (net->get_wireless_stats || (net->wireless_handlers &&
-			net->wireless_handlers->get_wireless_stats)) {
-		ret = sysfs_create_group(&class_dev->kobj, &wireless_group);
-		if (ret)
-			goto out_cleanup;
-	}
-	return 0;
-out_cleanup:
-	if (net->get_stats)
-		sysfs_remove_group(&class_dev->kobj, &netstat_group);
-#else
-	return 0;
+	if (net->get_wireless_stats
+	    || (net->wireless_handlers && net->wireless_handlers->get_wireless_stats))
+		*groups++ = &wireless_group;
 #endif
 
-out_unreg:
-	printk(KERN_WARNING "%s: sysfs attribute registration failed %d\n",
-	       net->name, ret);
-   class_device_unregister(class_dev);
-out:
-   return ret;
+   return class_device_add(class_dev);
 }
 
 int netdev_sysfs_init(void)
--- sky2-2.6.17.orig/include/linux/netdevice.h	2006-04-21 12:20:58.0 -0700
+++ sky2-2.6.17/include/linux/netdevice.h	2006-04-21 12:21:45.0 -0700
@@ -506,6 +506,8 @@
 
 	/* class/net/name entry */
 	struct class_device	class_dev;
+	/* space for optional statistics and wireless sysfs groups */
+	struct attribute_group  *sysfs_groups[3];
 };
 
 #define	NETDEV_ALIGN		32

[RFC] netdev sysfs failure handling

2006-04-21 Thread Stephen Hemminger

In case of sysfs failure, don't let device be brought up.
It can be cleared by unregister_netdevice so module can be unloaded
normally.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2-2.6.17.orig/net/core/dev.c 2006-04-21 12:21:45.0 -0700
+++ sky2-2.6.17/net/core/dev.c  2006-04-21 12:46:48.0 -0700
@@ -3043,10 +3043,17 @@
 
 	switch(dev->reg_state) {
 	case NETREG_REGISTERING:
+		/* Can't do proper error handling here because
+		 * this is a delayed call after register_netdevice
+		 * so no way to tell device driver what is wrong.
+		 */
 		err = netdev_register_sysfs(dev);
-		if (err)
+		if (err) {
 			printk(KERN_ERR "%s: failed sysfs registration (%d)\n",
 			       dev->name, err);
+			/* Don't let device be brought up */
+			clear_bit(__LINK_STATE_PRESENT, &dev->state);
+		}
 		dev->reg_state = NETREG_REGISTERED;
 		break;
 

Re: [PATCH 9/10] d80211: rename master interface

2006-04-21 Thread Stephen Hemminger

On Fri, 21 Apr 2006 22:53:29 +0200 (CEST)
Jiri Benc [EMAIL PROTECTED] wrote:

 Rename master interface to wmasterX to better reflect its purpose.
 
 Signed-off-by: Jiri Benc [EMAIL PROTECTED]
 
 ---
 
  net/d80211/ieee80211.c   |2 +-
  net/d80211/ieee80211_i.h |2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)
 
 784f203467e4421aa0ecac34cb1647f4bdfe51be
 diff --git a/net/d80211/ieee80211.c b/net/d80211/ieee80211.c
 index 31f979c..1fd13dd 100644
 --- a/net/d80211/ieee80211.c
 +++ b/net/d80211/ieee80211.c
 @@ -4144,7 +4144,7 @@ struct net_device *ieee80211_alloc_hw(si
   ((char *) local + ((sizeof(struct ieee80211_local) + 3) & ~3));
  
   ether_setup(mdev);
 - memcpy(mdev->name, "wlan%d", 7);
 + memcpy(mdev->name, "wmaster%d", 10);

Why not use strlcpy or strncpy, with sizeof(mdev->name) or IFNAMSIZ
rather than a hard-coded 10?
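
A sketch of what a size-bounded copy looks like, written for userspace so it can be compiled and run. Everything named `demo_*` or `DEMO_*` is hypothetical; `DEMO_IFNAMSIZ` is assumed to mirror the kernel's IFNAMSIZ, and `snprintf()` stands in for `strlcpy()`, which not every C library provides.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16	/* assumed to mirror the kernel's IFNAMSIZ */

/* Hypothetical stand-in for the name field of struct net_device. */
struct demo_netdev {
	char name[DEMO_IFNAMSIZ];
};

/*
 * Copy a name template into dev->name, bounded by the destination size
 * instead of a hand-counted length. The result is always NUL-terminated.
 * Returns 0 on success, -1 if src did not fit and was truncated.
 */
int demo_set_name(struct demo_netdev *dev, const char *src)
{
	int n = snprintf(dev->name, sizeof(dev->name), "%s", src);

	return (n < 0 || (size_t)n >= sizeof(dev->name)) ? -1 : 0;
}
```

With the bound taken from `sizeof(dev->name)`, renaming the template (say from "wlan%d" to "wmaster%d") needs no recount of the string length.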

  
   local->dev_index = -1;
   local->mdev = mdev;
 diff --git a/net/d80211/ieee80211_i.h b/net/d80211/ieee80211_i.h
 index ea1d9ab..3580d1e 100644
 --- a/net/d80211/ieee80211_i.h
 +++ b/net/d80211/ieee80211_i.h
 @@ -318,7 +318,7 @@ #define IEEE80211_SUB_IF_TO_DEV(sub_if) 
  struct ieee80211_local {
   struct ieee80211_hw *hw;
   void *hw_priv;
 - struct net_device *mdev; /* wlan# - master 802.11 device */
 + struct net_device *mdev; /* wmaster# - master 802.11 device */
   int open_count;
   int monitors;
   struct ieee80211_conf conf;

Re: [patch 2/5] s2io driver updates

2006-04-24 Thread Stephen Hemminger

On Sat, 22 Apr 2006 11:28:02 +0200
Francois Romieu [EMAIL PROTECTED] wrote:

 Ananda Raju [EMAIL PROTECTED] :
 [...]
  Signed-off-by: Ananda Raju [EMAIL PROTECTED]
  ---
  diff -upNr perf_fixes/drivers/net/s2io.c 
  dmesg_param_fixes/drivers/net/s2io.c
  --- perf_fixes/drivers/net/s2io.c   2006-04-13 08:02:56.0 -0700
  +++ dmesg_param_fixes/drivers/net/s2io.c2006-04-13 09:08:22.0 
  -0700
 [...]
  @@ -4626,6 +4633,45 @@ static int write_eeprom(nic_t * sp, int 
  return ret;
   }
   
  +static void s2io_vpd_read(nic_t *nic)
  +{
  +   u8 vpd_data[256],data;
 
 You may consider removing vpd_data from the stack and kmallocing it.
 

Since the lsvpd tool already does this in user space, why add more kernel code
to do it?  Adding more code just to print prettier console logs is bogus.

Re: [PATCH 02/11] ixgb: Fix the use of dprintk rather than printk

2006-04-24 Thread Stephen Hemminger

On Sat, 22 Apr 2006 11:03:01 +0200
Francois Romieu [EMAIL PROTECTED] wrote:

 Jeff Kirsher [EMAIL PROTECTED] :
 [...]
  diff --git a/drivers/net/ixgb/ixgb.h b/drivers/net/ixgb/ixgb.h
  index c83271b..a696c33 100644
  --- a/drivers/net/ixgb/ixgb.h
  +++ b/drivers/net/ixgb/ixgb.h
 [...]
  @@ -192,6 +197,7 @@ struct ixgb_adapter {
   
  /* structs defined in ixgb_hw.h */
  struct ixgb_hw hw;
  +   u16 msg_enable;
  struct ixgb_hw_stats stats;
   #ifdef CONFIG_PCI_MSI
  boolean_t have_msi;
  diff --git a/drivers/net/ixgb/ixgb_ethtool.c 
  b/drivers/net/ixgb/ixgb_ethtool.c
  index d38ade5..e8d83de 100644
  --- a/drivers/net/ixgb/ixgb_ethtool.c
  +++ b/drivers/net/ixgb/ixgb_ethtool.c
  @@ -251,6 +251,20 @@ ixgb_set_tso(struct net_device *netdev, 
   } 
   #endif /* NETIF_F_TSO */
   
  +static uint32_t
  +ixgb_get_msglevel(struct net_device *netdev)
  +{
  +   struct ixgb_adapter *adapter = netdev->priv;
  +   return adapter->msg_enable;
  +}
  +
  +static void
  +ixgb_set_msglevel(struct net_device *netdev, uint32_t data)
  +{
  +   struct ixgb_adapter *adapter = netdev->priv;
  +   adapter->msg_enable = data;
  +}
  +
 
 Minor nits:
 - you may consider removing the u{8/16/32} in drivers/net/ixgb
   for consistency sake in a different patch (there is a strong
   majority of uint_something in the driver).


All the uint32_t should be removed.  Kernel style is u32.

Re: Congestion Avoidance Monitoring Tools

2006-04-24 Thread Stephen Hemminger

On Mon, 24 Apr 2006 00:52:35 +0200
Hagen Paul Pfeifer [EMAIL PROTECTED] wrote:

 * Stephen Hemminger | 2006-04-21 08:19:17 [-0700]:
 
 2.6.13 still had lots of problems, things didn't really get working
 right till 2.6.15 or later. Especially with TSO.
 
 --verbose?
 
 I have a tool using kprobe's see 
 http://developer.osdl.org/shemminger/prototypes/tcpprobe.tar.gz
 I try to keep it up to date with current kernel and build process, last used 
 it
 on 2.6.16.
 
 wget http://developer.osdl.org/shemminger/prototypes/tcpprobe.tar.gz
 
 Ended with following error code: ;-)
 
 00:32:48 ERROR 403: Forbidden.
 
 HGN
 

Fixed

Re: [patch 2/5] s2io driver updates

2006-04-24 Thread Stephen Hemminger

On Mon, 24 Apr 2006 10:39:52 -0700
Ananda Raju [EMAIL PROTECTED] wrote:

 Hi, 
 
 Currently the only way we can differentiate between copper CX4 transponder
 adapters from optical transponder adapters is by reading the product name
 string in vpd. 

That makes sense, though the VPD can often be messed up by OEMs. Probably
not a big issue with this driver.

[PATCH] netdev: hotplug napi race cleanup

2006-04-24 Thread Stephen Hemminger

This follows after the earlier two patches.

Change the initialization of the class device portion of the net device
to be done earlier, so that any races before registration completes are
harmless.  Add a mutex to avoid changes to netdevice during the
class device registration. 

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- linux-2.6.orig/net/core/dev.c   2006-04-24 10:31:15.0 -0700
+++ linux-2.6/net/core/dev.c2006-04-24 10:31:16.0 -0700
@@ -203,10 +203,12 @@
 
 #ifdef CONFIG_SYSFS
 extern int netdev_sysfs_init(void);
-extern int netdev_register_sysfs(struct net_device *);
-extern void netdev_unregister_sysfs(struct net_device *);
+extern void netdev_init_classdev(struct net_device *);
+#define netdev_register_sysfs(dev)	class_device_add(&(dev->class_dev))
+#define	netdev_unregister_sysfs(dev)	class_device_del(&(dev->class_dev))
 #else
 #define netdev_sysfs_init()	(0)
+#define netdev_init_classdev(dev)	do { } while(0)
 #define netdev_register_sysfs(dev)	(0)
 #define	netdev_unregister_sysfs(dev)	do { } while(0)
 #endif
@@ -2870,6 +2872,8 @@
 
 	set_bit(__LINK_STATE_PRESENT, &dev->state);
 
+	netdev_init_classdev(dev);
+
 	dev->next = NULL;
 	dev_init_scheduler(dev);
 	write_lock_bh(&dev_base_lock);
@@ -3047,7 +3051,10 @@
 		 * this is a delayed call after register_netdevice
 		 * so no way to tell device driver what is wrong.
 		 */
+		rtnl_lock();
 		err = netdev_register_sysfs(dev);
+		__rtnl_unlock();
+
 		if (err) {
 			printk(KERN_ERR "%s: failed sysfs registration (%d)\n",
 			       dev->name, err);
--- sky2-2.6.17.orig/net/core/net-sysfs.c	2006-04-24 10:31:14.0 -0700
+++ sky2-2.6.17/net/core/net-sysfs.c	2006-04-24 10:31:16.0 -0700
@@ -443,13 +443,8 @@
 #endif
 };
 
-void netdev_unregister_sysfs(struct net_device * net)
-{
-	class_device_del(&(net->class_dev));
-}
-
-/* Create sysfs entries for network device. */
-int netdev_register_sysfs(struct net_device *net)
+/* Setup class device */
+void netdev_init_classdev(struct net_device *net)
 {
 	struct class_device *class_dev = &(net->class_dev);
 	struct attribute_group **groups = net->sysfs_groups;
@@ -470,8 +465,6 @@
 	    || (net->wireless_handlers && net->wireless_handlers->get_wireless_stats))
 		*groups++ = &wireless_group;
 #endif
-
-   return class_device_add(class_dev);
 }
 
 int netdev_sysfs_init(void)

Re: is it a backwards compatability catch-22?

2006-04-24 Thread Stephen Hemminger

On Mon, 24 Apr 2006 16:47:34 -0700
Rick Jones [EMAIL PROTECTED] wrote:

 I might be out to lunch, certainly it happens often enough :)  I've 
 spent the afternoon trying to stop my NIC names from being random on 
 each boot.  To that end, I've been doing udev rules based on an example 
 I found at http://www.debianhelp.co.uk/udev.htm  In this case I'm 
 running a Debian 2.6.15-1 kernel.
 
 It seems that the SYSFS{address} key looks for a case-sensitive match on the 
 address (MAC) of the interface in rules like these:
 
 lumber:~# cat /etc/udev/rules.d/010_netinterfaces.rules
 KERNEL="eth*",SYSFS{address}=="00:30:6e:4c:27:3c", NAME="eth0"
 KERNEL="eth*",SYSFS{address}=="00:30:6e:4c:27:3d", NAME="eth1"
 KERNEL="eth*",SYSFS{address}=="00:12:79:9e:0e:d2", NAME="eth2"
 KERNEL="eth*",SYSFS{address}=="00:12:79:9e:0e:d3", NAME="eth3"
 KERNEL="eth*",SYSFS{address}=="00:0c:fc:00:08:71", NAME="eth4"
 
 it seems to want lower-case hex because that is what comes out of SYSFS. (?)
 
 Of course, ifconfig -a gives HW addresses in uppercase hex:
 
 lumber:~# ifconfig -a | grep HW
 eth0  Link encap:Ethernet  HWaddr 00:30:6E:4C:27:3C
 eth1  Link encap:Ethernet  HWaddr 00:30:6E:4C:27:3D
 eth2  Link encap:Ethernet  HWaddr 00:12:79:9E:0E:D2
 eth3  Link encap:Ethernet  HWaddr 00:12:79:9E:0E:D3
 eth4  Link encap:Ethernet  HWaddr 00:0C:FC:00:08:71
 
 and some of the dmesg stuff - notably e100:
 
 lumber:~# dmesg | grep eth
 e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
 eth1: Tigon3 [partno(BCM95700A6) rev 0105 PHY(5701)] (PCI:66MHz:64-bit) 
 10/100/1000BaseT Ethernet 00:30:6e:4c:27:3d
 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] 
 TSOcap[0]
 eth1: dma_rwctrl[76ff2d0f]
 e1000: eth2: e1000_probe: Intel(R) PRO/1000 Network Connection
 e100: eth3: e100_probe: addr 0x8002, irq 57, MAC addr 00:30:6E:4C:27:3C
 eth4: Neterion Xframe I 10GbE adapter (rev 4), Version Version 2.0.9.3, 
 Intr type INTA
 e100: eth0: e100_watchdog: link up, 100Mbps, half-duplex
 
 While it isn't a showstopper it does become a bit inconvenient to have 
 to downshift the MAC when taking it from ifconfig to use in the udev 
 rules.  Any chance the two can agree on one or the other?  Or is each 
 locked in a backwards compatability embrace?
 
 rick jones
 
 and of course, arp matches ifconfig:
 
 lumber:~# arp -an
 ? (15.4.89.87) at 00:12:79:94:F8:24 [ether] on eth0
 ? (15.4.88.1) at 00:00:0C:07:AC:00 [ether] on eth0
 
 not that arp in and of itself matters in this situation.

Don't use the auto-assigned names eth0, eth1, eth2.
The udev stuff runs after the device has already chosen its default name.
It has to: it's part of the hotplug infrastructure, and we don't want
to depend on usermode to define the name.  Just choose some other
convention, eth_0 or something like that.
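
As for the case mismatch between ifconfig output and sysfs that Rick describes, the "downshift" step is easy to script. A minimal userspace sketch follows; `demo_mac_to_lower()` is a hypothetical helper, not part of udev or any existing tool.

```c
#include <ctype.h>

/*
 * Lower-case a MAC address string in place, e.g. to turn the upper-case
 * form printed by ifconfig into the lower-case form that sysfs reports.
 * Digits and ':' separators are left unchanged.
 */
void demo_mac_to_lower(char *mac)
{
	for (; *mac; mac++)
		*mac = (char)tolower((unsigned char)*mac);
}
```

Passing a writable buffer holding "00:30:6E:4C:27:3C" leaves it as "00:30:6e:4c:27:3c", which matches the form used in the rules quoted above.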



-- 
Stephen Hemminger [EMAIL PROTECTED]
OSDL http://developer.osdl.org/~shemminger

Re: [PATCH]: suspicious unlikely usage in tcp_transmit_skb()

2006-04-25 Thread Stephen Hemminger

On Mon, 24 Apr 2006 16:25:39 -0700
Hua Zhong [EMAIL PROTECTED] wrote:

 Hi,
 
 I am developing a profiling tool to check if likely/unlikely usages are wise. 
 I find that the following one is always a miss:
 
   # Hit# miss Function:[EMAIL PROTECTED]
 ! 0 50505 tcp_transmit_skb():net/ipv4/[EMAIL PROTECTED]
 
 There is a chance that my tool is buggy, but I just want to confirm with you 
 whether this does look suspicious and what your opinion is.
 
 Signed-off-by: Hua Zhong [EMAIL PROTECTED]
 
 diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
 index a28ae59..743016b 100644
 --- a/net/ipv4/tcp_output.c
 +++ b/net/ipv4/tcp_output.c
 @@ -465,7 +465,7 @@ #define SYSCTL_FLAG_SACK0x4
 TCP_INC_STATS(TCP_MIB_OUTSEGS);
  
 err = icsk->icsk_af_ops->queue_xmit(skb, 0);
 -   if (unlikely(err <= 0))
 +   if (likely(err <= 0))
 return err;
  
 tcp_enter_cwr(sk);

How about just taking off the likely/unlikely in this case?
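
For reference, a userspace sketch of the three variants under discussion. The macros are written here in the `__builtin_expect` form, which assumes GCC or Clang; `demo_cwr_events` and the `demo_xmit_*` functions are hypothetical stand-ins for the congestion path that follows the early return in the quoted hunk. The hint only steers code layout and branch prediction, so all three return the same values.

```c
/* Userspace copies of the branch-hint macros (GCC/Clang builtin). */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Hypothetical counter standing in for the slow path after the check. */
int demo_cwr_events;

/* Original form: the early return is hinted as the rare case. */
int demo_xmit_unlikely(int err)
{
	if (unlikely(err <= 0))
		return err;
	demo_cwr_events++;
	return err;
}

/* Proposed form: the early return is hinted as the common case. */
int demo_xmit_likely(int err)
{
	if (likely(err <= 0))
		return err;
	demo_cwr_events++;
	return err;
}

/* No hint at all: let the compiler and CPU decide. */
int demo_xmit_plain(int err)
{
	if (err <= 0)
		return err;
	demo_cwr_events++;
	return err;
}
```

Because behaviour is identical, the choice between them is purely about which path is hot for the expected traffic.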

[PATCH 2/5] sky2: add fake idle irq timer

2006-04-25 Thread Stephen Hemminger

Add a fake NAPI schedule once a second. This is an attempt to work around
broken configurations with edge-triggered interrupts.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2-2.6.17.orig/drivers/net/sky2.c 2006-04-25 10:48:47.0 -0700
+++ sky2-2.6.17/drivers/net/sky2.c  2006-04-25 10:53:32.0 -0700
@@ -2086,6 +2086,20 @@
}
 }
 
+/* If idle then force a fake soft NAPI poll once a second
+ * to work around cases where sharing an edge triggered interrupt.
+ */
+static void sky2_idle(unsigned long arg)
+{
+   struct net_device *dev = (struct net_device *) arg;
+
+   local_irq_disable();
+   if (__netif_rx_schedule_prep(dev))
+   __netif_rx_schedule(dev);
+   local_irq_enable();
+}
+
+
 static int sky2_poll(struct net_device *dev0, int *budget)
 {
 	struct sky2_hw *hw = ((struct sky2_port *) netdev_priv(dev0))->hw;
@@ -2134,6 +2148,8 @@
 		sky2_write32(hw, STAT_CTRL, SC_STAT_CLR_IRQ);
 	}
 
+	mod_timer(&hw->idle_timer, jiffies + HZ);
+
 	local_irq_disable();
 	__netif_rx_complete(dev0);
 
@@ -3288,6 +3304,8 @@
 
 	sky2_write32(hw, B0_IMSK, Y2_IS_BASE);
 
+	setup_timer(&hw->idle_timer, sky2_idle, (unsigned long) dev);
+
 	pci_set_drvdata(pdev, hw);
 
 	return 0;
@@ -3323,13 +3341,15 @@
 	if (!hw)
 		return;
 
+	del_timer_sync(&hw->idle_timer);
+
+	sky2_write32(hw, B0_IMSK, 0);
 	dev0 = hw->dev[0];
 	dev1 = hw->dev[1];
if (dev1)
unregister_netdev(dev1);
unregister_netdev(dev0);
 
-   sky2_write32(hw, B0_IMSK, 0);
sky2_set_power_state(hw, PCI_D3hot);
sky2_write16(hw, B0_Y2LED, LED_STAT_OFF);
sky2_write8(hw, B0_CTST, CS_RST_SET);
--- sky2-2.6.17.orig/drivers/net/sky2.h 2006-04-25 10:48:42.0 -0700
+++ sky2-2.6.17/drivers/net/sky2.h  2006-04-25 10:51:33.0 -0700
@@ -1880,6 +1880,8 @@
struct sky2_status_le *st_le;
u32  st_idx;
dma_addr_t   st_dma;
+
+	struct timer_list    idle_timer;
 	int		     msi_detected;
 	wait_queue_head_t    msi_wait;
 };

--


[PATCH 3/5] sky2: use ALIGN() macro

2006-04-25 Thread Stephen Hemminger

The ALIGN() macro in kernel.h does the same math that the
sky2 driver was using for padding.
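
The equivalence can be checked in userspace. In the sketch below, `DEMO_ALIGN` is a local copy of the usual power-of-two rounding expression and `DEMO_ROUNDUP` is the driver's old macro; the `demo_*` helpers are hypothetical names, and 14 and 4 are the values of ETH_HLEN and VLAN_HLEN. The two forms agree only when the alignment is a power of two, which holds for both uses in this patch.

```c
/* Local copy of the power-of-two round-up; 'a' must be a power of two. */
#define DEMO_ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

/* The driver's old generic round-up; works for any positive 'y'. */
#define DEMO_ROUNDUP(x, y)	((((x) + ((y) - 1)) / (y)) * (y))

#define DEMO_ETH_HLEN	14	/* Ethernet header length */
#define DEMO_VLAN_HLEN	4	/* VLAN tag length */

/* Padding needed to align address p, written the old open-coded way. */
unsigned long demo_old_pad(unsigned long p, unsigned long align)
{
	return ((p + align - 1) & ~(align - 1)) - p;
}

/* The same padding expressed through the ALIGN-style macro. */
unsigned long demo_new_pad(unsigned long p, unsigned long align)
{
	return DEMO_ALIGN(p, align) - p;
}

/* Receive buffer size with the old roundup() macro. */
unsigned demo_old_buf_size(int mtu)
{
	return DEMO_ROUNDUP(mtu + DEMO_ETH_HLEN + DEMO_VLAN_HLEN, 8) + 8;
}

/* Receive buffer size with the ALIGN-style macro. */
unsigned demo_new_buf_size(int mtu)
{
	return DEMO_ALIGN(mtu + DEMO_ETH_HLEN + DEMO_VLAN_HLEN, 8) + 8;
}
```

For an MTU of 1500, both buffer-size functions give 1528: 1518 rounded up to 1520, plus 8.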

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2-2.6.17.orig/drivers/net/sky2.c 2006-04-25 10:47:03.0 -0700
+++ sky2-2.6.17/drivers/net/sky2.c  2006-04-25 10:47:28.0 -0700
@@ -925,8 +925,7 @@
skb = alloc_skb(size + RX_SKB_ALIGN, gfp_mask);
if (likely(skb)) {
 		unsigned long p = (unsigned long) skb->data;
-		skb_reserve(skb,
-			((p + RX_SKB_ALIGN - 1) & ~(RX_SKB_ALIGN - 1)) - p);
+		skb_reserve(skb, ALIGN(p, RX_SKB_ALIGN) - p);
 	}
 
 	return skb;
@@ -1686,13 +1685,12 @@
 }
 
 
-#define roundup(x, y)   ((((x)+((y)-1))/(y))*(y))
 /* Want receive buffer size to be multiple of 64 bits
  * and incl room for vlan and truncation
  */
 static inline unsigned sky2_buf_size(int mtu)
 {
-   return roundup(mtu + ETH_HLEN + VLAN_HLEN, 8) + 8;
+   return ALIGN(mtu + ETH_HLEN + VLAN_HLEN, 8) + 8;
 }
 
 static int sky2_change_mtu(struct net_device *dev, int new_mtu)

--


[PATCH 5/5] sky2: version 1.2

2006-04-25 Thread Stephen Hemminger

Update to version 1.2

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2-2.6.17.orig/drivers/net/sky2.c 2006-04-25 10:54:57.0 -0700
+++ sky2-2.6.17/drivers/net/sky2.c  2006-04-25 10:55:51.0 -0700
@@ -51,7 +51,7 @@
 #include "sky2.h"
 
 #define DRV_NAME		"sky2"
-#define DRV_VERSION		"1.1"
+#define DRV_VERSION		"1.2"
 #define PFX			DRV_NAME " "
 
 /*

--


[PATCH 0/5] sky2: version 1.2

2006-04-25 Thread Stephen Hemminger

Update to sky2 driver. Mostly fixes to try and handle users
stuck with edge-triggered interrupts. Also, some minor cleanups.

Patches apply onto 1.1 version in 2.6.17-rc2

--


[PATCH 1/5] sky2: reschedule if irq still pending

2006-04-25 Thread Stephen Hemminger

This is a workaround for the case of edge-triggered IRQs. Several users
seem to have broken configurations sharing edge-triggered IRQs. To avoid
losing IRQs, reschedule if more work arrives.

The changes to netdevice.h extract the part that puts the device
back on the poll list into a separate inline function.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2-2.6.17.orig/drivers/net/sky2.c 2006-04-25 10:48:44.0 -0700
+++ sky2-2.6.17/drivers/net/sky2.c  2006-04-25 10:48:47.0 -0700
@@ -2093,6 +2093,7 @@
int work_done = 0;
u32 status = sky2_read32(hw, B0_Y2_SP_EISR);
 
+ restart_poll:
 	if (unlikely(status & ~Y2_IS_STAT_BMU)) {
 		if (status & Y2_IS_HW_ERR)
 			sky2_hw_intr(hw);
@@ -2123,7 +2124,7 @@
 	}
 
 	if (status & Y2_IS_STAT_BMU) {
-		work_done = sky2_status_intr(hw, work_limit);
+		work_done += sky2_status_intr(hw, work_limit - work_done);
 		*budget -= work_done;
 		dev0->quota -= work_done;
 
@@ -2133,9 +2134,22 @@
sky2_write32(hw, STAT_CTRL, SC_STAT_CLR_IRQ);
}
 
-   netif_rx_complete(dev0);
+   local_irq_disable();
+   __netif_rx_complete(dev0);
 
status = sky2_read32(hw, B0_Y2_SP_LISR);
+
+   if (unlikely(status)) {
+   /* More work pending, try and keep going */
+   if (__netif_rx_schedule_prep(dev0)) {
+   __netif_rx_reschedule(dev0, work_done);
+   status = sky2_read32(hw, B0_Y2_SP_EISR);
+   local_irq_enable();
+   goto restart_poll;
+   }
+   }
+
+   local_irq_enable();
return 0;
 }
 
@@ -2153,8 +2167,6 @@
 	prefetch(&hw->st_le[hw->st_idx]);
 	if (likely(__netif_rx_schedule_prep(dev0)))
 		__netif_rx_schedule(dev0);
-	else
-		printk(KERN_DEBUG PFX "irq race detected\n");
 
return IRQ_HANDLED;
 }
--- sky2-2.6.17.orig/include/linux/netdevice.h	2006-04-25 10:48:44.0 -0700
+++ sky2-2.6.17/include/linux/netdevice.h	2006-04-25 10:48:47.0 -0700
@@ -829,19 +829,21 @@
__netif_rx_schedule(dev);
 }
 
-/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete().
- * Do not inline this?
- */
+
+static inline void  __netif_rx_reschedule(struct net_device *dev, int undo)
+{
+	dev->quota += undo;
+	list_add_tail(&dev->poll_list, &__get_cpu_var(softnet_data).poll_list);
+	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+}
+
+/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete(). */
 static inline int netif_rx_reschedule(struct net_device *dev, int undo)
 {
 	if (netif_rx_schedule_prep(dev)) {
 		unsigned long flags;
-
-		dev->quota += undo;
-
 		local_irq_save(flags);
-		list_add_tail(&dev->poll_list, &__get_cpu_var(softnet_data).poll_list);
-		__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+		__netif_rx_reschedule(dev, undo);
local_irq_restore(flags);
return 1;
}

--


Re: [PATCH 2/5] sky2: add fake idle irq timer

2006-04-25 Thread Stephen Hemminger

On Tue, 25 Apr 2006 23:23:29 +0200
Francois Romieu [EMAIL PROTECTED] wrote:

 Stephen Hemminger [EMAIL PROTECTED] :
 [...]
  --- sky2-2.6.17.orig/drivers/net/sky2.c 2006-04-25 10:48:47.0 
  -0700
  +++ sky2-2.6.17/drivers/net/sky2.c  2006-04-25 10:53:32.0 -0700
  @@ -2086,6 +2086,20 @@
  }
   }
   
  +/* If idle then force a fake soft NAPI poll once a second
  + * to work around cases where sharing an edge triggered interrupt.
  + */
  +static void sky2_idle(unsigned long arg)
  +{
  +   struct net_device *dev = (struct net_device *) arg;
  +
  +   local_irq_disable();
  +   if (__netif_rx_schedule_prep(dev))
  +   __netif_rx_schedule(dev);
  +   local_irq_enable();
  +}
  +
  +
   static int sky2_poll(struct net_device *dev0, int *budget)
   {
  struct sky2_hw *hw = ((struct sky2_port *) netdev_priv(dev0))->hw;
  @@ -2134,6 +2148,8 @@
  sky2_write32(hw, STAT_CTRL, SC_STAT_CLR_IRQ);
  }
   
  +   mod_timer(&hw->idle_timer, jiffies + HZ);
  +
  local_irq_disable();
  __netif_rx_complete(dev0);
 
 
 Any objection against moving mod_timer() from sky2_poll() to sky2_idle()
 so as to keep poll() path unmodified ?
 

If traffic is moving, then I want the timer to keep getting rescheduled
farther out.
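
The effect of re-arming from the poll path can be shown with a small userspace simulation. This is only a sketch: `struct demo_idle_timer`, `demo_poll()`, `demo_tick()` and `DEMO_HZ` are hypothetical stand-ins for the timer, `sky2_poll()`, the timer expiry and HZ, and counter wraparound is ignored.

```c
#define DEMO_HZ 1000	/* assumed ticks per second for the simulation */

/* Hypothetical stand-in for the idle timer kept in struct sky2_hw. */
struct demo_idle_timer {
	unsigned long expires;	/* tick at which the fake poll is due */
	int fired;		/* how many fake polls were forced */
};

/* Poll path: every real poll pushes the deadline one second out. */
void demo_poll(struct demo_idle_timer *t, unsigned long now)
{
	t->expires = now + DEMO_HZ;
}

/*
 * Clock tick: once the deadline passes with no poll in between, force
 * a fake poll. That poll runs the normal poll path, which re-arms the
 * timer, so an idle device is polled about once a second.
 */
void demo_tick(struct demo_idle_timer *t, unsigned long now)
{
	if (now >= t->expires) {
		t->fired++;
		demo_poll(t, now);
	}
}
```

While polls keep arriving more often than once a second the timer never fires; after traffic stops it fires once per second. Arming the timer only from the idle handler would make it fire every second regardless of traffic.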

Re: [PATCH]: suspicious unlikely usage in tcp_transmit_skb()

2006-04-25 Thread Stephen Hemminger

On Tue, 25 Apr 2006 14:46:49 -0700 (PDT)
David S. Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Tue, 25 Apr 2006 10:01:49 -0700

 # Hit# miss Function:[EMAIL PROTECTED]
   ! 0 50505 tcp_transmit_skb():net/ipv4/[EMAIL PROTECTED]
  ...
  How about just taking off the likely/unlikely in this case.

 Why remove it when we'll now get a 50505 to 0 hit rate?

Depends on the data stream, but I guess if we are seeing high loss
we really don't care about the CPU branch prediction.

Re: [PATCH 2/5] sky2: add fake idle irq timer

2006-04-25 Thread Stephen Hemminger

On Wed, 26 Apr 2006 00:39:00 +0200
Francois Romieu [EMAIL PROTECTED] wrote:

 Stephen Hemminger [EMAIL PROTECTED] :
 [...]
   Any objection against moving mod_timer() from sky2_poll() to sky2_idle()
   so as to keep poll() path unmodified ?
   
  
  If traffic is moving, then I want the timer to keep getting rescheduled
  farther out.
 
 If my version of the driver is not stale, the timer will not be
 rescheduled when work_done >= work_limit.

I am trying to work around possible lost IRQs, not netdev scheduler
screw-ups.  If work_done >= work_limit, then the poll routine returns 1 and
will be called back later anyway.


Re: sky2 driver problems in 2.6.17-rc2-git6 (was: Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1)

2006-04-26 Thread Stephen Hemminger

On Tue, 25 Apr 2006 17:06:25 -0700
Guenther Thomsen [EMAIL PROTECTED] wrote:

 On Monday 17 April 2006 11:18, Stephen Hemminger wrote:
  I don't know what you are doing different, but my 2 port SysKonnect
  card is working fine.  Running SMP AMD64 and 2.6.17 latest.
 
  Showing full speed on both ports.
 I missed that e-mail, sorry.
 
 I just gave it another try, this time with 2.6.16.11 . One port works 
 fine (so far, I just did very limited testing with ttcp). The second port 
 does negotiate IP address via DHCP, but the packgages it receives 
 seem to be garbled:
 
 --8--
0x:   6175 6469 7428 3131 3435 3939 3430  ..audit(11459940
 0x0010:  3031 2e39 3738 3a33 3829 3a20 7573 6572  01.978:38):.user
 0x0020:  2070 6964 3d33 3230 3920 7569 643d   .pid=3209.uid=
 12:56:23.725090 00:00:00:00:00:00  30:6e:6d:00:00:00 null I (s=32,r=55,P) 
 len=42
 12:56:24.603274 00:00:21:00:00:00  00:00:00:00:00:00 null disc/C len=43
 12:56:26.619326 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 12:56:28.635346 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 12:56:29.734046 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 12:56:29.865239 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 12:56:30.651371 00:00:00:00:00:00  a6:00:00:00:4d:04, ethertype Unknown 
 (0xe20c), length 60:
 0x:   6175 6469 7428 3131 3435 3939 3436  ..audit(11459946
 0x0010:  3031 2e33 3639 3a34 3729 3a20 7573 6572  01.369:47):.user
 0x0020:  2070 6964 3d33 3239 3820 7569 643d   .pid=3298.uid=
 12:56:30.916718 00:00:f0:71:61:00  28:37:03:5b:3a:00 null I (s=16,r=0,C) 
 len=42
 12:56:30.923558 00:00:21:00:00:00  00:00:00:00:00:00 null rnr (r=55,C) len=42
 12:56:32.667413 00:00:d0:2e:30:42  10:60:61:00:00:00, ethertype Unknown 
 (0x572b), length 60:
 0x:   d675 0d00   0200    ...u
 0x0010:           
 0x0020:       1300    ..
 12:56:33.296384 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 12:56:33.303222 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 [..]
 13:00:44.340062 00:00:00:00:00:00  5f:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:44.672350 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:44.868724 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:45.340123 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:46.340173 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:46.688433 IP truncated-ip - 1454 bytes missing! 192.168.65.66.40313  
 192.168.65.65.5001: . 1426488980:1426490428(1448) ack 1790562292 win 1460 
 nop,nop,timestamp[|tcp]
 13:00:48.704431 00:00:21:00:00:00  00:00:00:00:00:00 null I (s=17,r=18,C) 
 len=42
 13:00:48.886426 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:50.720463 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:52.736496 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:54.752522 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:54.927556 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 13:00:54.934394 00:00:00:00:00:00  00:00:00:00:00:00 null I (s=0,r=0,C) 
 len=42
 --8--
 On a different host connected to the same switch, traffic looks more like:
 --8--
 2:01:49.388992 IP 192.168.64.1.ntp  255.255.255.255.ntp: NTPv3, Broadcast, 
 length 48
 12:01:50.176550 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 
 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
 12:01:51.235034 arp reply 192.168.64.32 is-at 00:0a:49:00:5e:8a
 12:01:51.241857 arp reply 192.168.64.33 is-at 00:0a:49:00:5e:8b
 12:01:51.891193 00:00:01:02:c8:58  45:c0:00:1c:00:20, ethertype Unknown 
 (0xe000), length 60:
 0x:  0001 1164 ee9b       ...d
 0x0010:        2f6b 8c87  /k..
 0x0020:           ..
 12:01:52.192552 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 
 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
 12:01:52.801392 arp reply 192.168.64.34 is-at 00:0a:49:00:5e:8c
 12:01:52.808240 arp reply 192.168.64.35 is-at 00:0a:49:00:5e:8d
 12:01:54.208495 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 
 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
 12:01:56.224453 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 
 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
 12:01:58.240464 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 
 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
 12:02:00.029320 arp reply 192.168.64.39 is-at 00:0a:49:00:5e:ff
 12:02:00.256420 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 
 8000.00:a0:d1:e1:b4

[RFC] bridge: partial rtnetlink hooks

2006-04-26 Thread Stephen Hemminger

This is the start of adding rtnetlink support to the bridge code.
So far it only supports accessing the list of links and notifying
about link changes. It is just a prototype to get early feedback; don't
use it to build your own masterpiece yet.

--- bridge-2.6.orig/net/bridge/Makefile
+++ bridge-2.6/net/bridge/Makefile
@@ -6,7 +6,7 @@ obj-$(CONFIG_BRIDGE) += bridge.o
 
 bridge-y   := br.o br_device.o br_fdb.o br_forward.o br_if.o br_input.o \
br_ioctl.o br_notify.o br_stp.o br_stp_bpdu.o \
-   br_stp_if.o br_stp_timer.o
+   br_stp_if.o br_stp_timer.o br_netlink.o
 
 bridge-$(CONFIG_SYSFS) += br_sysfs_if.o br_sysfs_br.o
 
--- bridge-2.6.orig/net/bridge/br.c
+++ bridge-2.6/net/bridge/br.c
@@ -30,17 +30,20 @@ static struct llc_sap *br_stp_sap;
 
 static int __init br_init(void)
 {
+	int err = -EADDRINUSE;
+
 	br_stp_sap = llc_sap_open(LLC_SAP_BSPAN, br_stp_rcv);
 	if (!br_stp_sap) {
 		printk(KERN_ERR "bridge: can't register sap for STP\n");
-		return -EBUSY;
+		goto out;
 	}
 
 	br_fdb_init();
 
 #ifdef CONFIG_BRIDGE_NETFILTER
-	if (br_netfilter_init())
-		return 1;
+	err = br_netfilter_init();
+	if (err)
+		goto unregister_sap;
 #endif
 	brioctl_set(br_ioctl_deviceless_stub);
 	br_handle_frame_hook = br_handle_frame;
@@ -50,13 +53,23 @@ static int __init br_init(void)
 
 	register_netdevice_notifier(&br_device_notifier);
 
+	br_netlink_init();
+
 	return 0;
+#ifdef CONFIG_BRIDGE_NETFILTER
+ unregister_sap:
+	llc_sap_close(br_stp_sap);
+#endif
+ out:
+	return err;
 }
 
 static void __exit br_deinit(void)
 {
 	llc_sap_close(br_stp_sap);
 
+	br_netlink_exit();
+
 #ifdef CONFIG_BRIDGE_NETFILTER
 	br_netfilter_fini();
 #endif
--- /dev/null
+++ bridge-2.6/net/bridge/br_netlink.c
@@ -0,0 +1,135 @@
+/*
+ * Bridge netlink control interface
+ *
+ * Authors:
+ * Stephen Hemminger   [EMAIL PROTECTED]
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/rtnetlink.h>
+#include "br_private.h"
+
+static int br_fill_ifinfo(struct sk_buff *skb, const struct net_bridge_port *port,
+			  u32 pid, u32 seq, int event, unsigned int flags)
+{
+	const struct net_bridge *br = port->br;
+	const struct net_device *dev = port->dev;
+	struct ifinfomsg *r;
+	struct nlmsghdr *nlh;
+	unsigned char *b = skb->tail;
+	u32 mtu = dev->mtu;
+	u8 operstate = netif_running(dev) ? dev->operstate : IF_OPER_DOWN;
+
+	printk(KERN_DEBUG "bridge fill %s %s\n", dev->name, br->dev->name);
+
+	nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*r), flags);
+	r = NLMSG_DATA(nlh);
+	r->ifi_family = AF_BRIDGE;
+	r->__ifi_pad = 0;
+	r->ifi_type = dev->type;
+	r->ifi_index = dev->ifindex;
+	r->ifi_flags = dev_get_flags(dev);
+	r->ifi_change = 0;
+
+	RTA_PUT(skb, IFLA_IFNAME, strlen(dev->name)+1, dev->name);
+
+	RTA_PUT(skb, IFLA_MASTER, sizeof(int), &br->dev->ifindex);
+
+	if (dev->addr_len)
+		RTA_PUT(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr);
+
+	RTA_PUT(skb, IFLA_MTU, sizeof(mtu), &mtu);
+	if (dev->ifindex != dev->iflink)
+		RTA_PUT(skb, IFLA_LINK, sizeof(int), &dev->iflink);
+
+
+	RTA_PUT(skb, IFLA_OPERSTATE, sizeof(operstate), &operstate);
+
+	if (event == RTM_NEWLINK) {
+		struct brifinfo portstate = {
+			.state = port->state,
+			.cost  = port->path_cost,
+		};
+		RTA_PUT(skb, IFLA_PROTINFO, sizeof(portstate), &portstate);
+	}
+
+	nlh->nlmsg_len = skb->tail - b;
+
+	return skb->len;
+
+nlmsg_failure:
+rtattr_failure:
+
+	skb_trim(skb, b - skb->data);
+	return -1;
+}
+
+
+void br_ifinfo_notify(int event, struct net_bridge_port *port)
+{
+	struct sk_buff *skb;
+
+	printk(KERN_DEBUG "bridge notify event=%d\n", event);
+	skb = alloc_skb(NLMSG_SPACE(sizeof(struct ifinfomsg) + 128),
+			GFP_ATOMIC);
+	if (!skb) {
+		netlink_set_err(rtnl, 0, RTNLGRP_BRIDGE_IFINFO, ENOBUFS);
+		return;
+	}
+	if (br_fill_ifinfo(skb, port, current->pid, 0, event, 0) < 0) {
+		kfree_skb(skb);
+		netlink_set_err(rtnl, 0, RTNLGRP_BRIDGE_IFINFO, EINVAL);
+		return;
+	}
+	NETLINK_CB(skb).dst_group = RTNLGRP_IPV6_IFINFO;
+	netlink_broadcast(rtnl, skb, 0, RTNLGRP_BRIDGE_IFINFO, GFP_ATOMIC);
+}
+
+static int br_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb

Re: tune back idle cwnd closing?

2006-04-26 Thread Stephen Hemminger

On Wed, 26 Apr 2006 15:16:18 -0700
Rick Jones [EMAIL PROTECTED] wrote:

  When your bursty application is not sending, other flows can take up
  the pipe space you are not using, and you must reprobe to figure that
  out.
 
 If the restarted connection does normal slow-start, one of two things
 will happen, yes?  Either it will grow its cwnd to >= the receiver's
 window, or it will have to stop before then because it triggered a
 packet loss.
 
 In the first case, seems it would have been just as good to let the 
 connection burst.
 
 In the second case, is the effect on other connections really any better 
 than if the connection just started-up from where it was before?
 
 BTW, is the RFC 2681?  I looked that one up on ietf.org and the RFC by 
 that number was a different beast entirely - at least at a very quick 
 glance.
 
 rick jones

http://www.faqs.org/rfcs/rfc2861.html

   Long periods when the sender is application-limited can lead to the
   invalidation of the congestion window.  During periods when the TCP
   sender is network-limited, the value of the congestion window is
   repeatedly revalidated by the successful transmission of a window
   of data without loss.  When the TCP sender is network-limited, there
   is an incoming stream of acknowledgements that clocks out new data,
   giving concrete evidence of recent available bandwidth in the
   network.  In contrast, during periods when the TCP sender is
   application-limited, the estimate of available capacity represented
   by the congestion window may become steadily less accurate over time.
   In particular, capacity that had once been used by the network-
   limited connection might now be used by other traffic.
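
As a rough illustration of what that means in practice, the sketch below
follows the idle-period rule from RFC 2861's pseudocode as I read it: remember
3/4 of the old window in ssthresh, then halve cwnd once per RTO of idle time,
never going below one MSS. It is not the Linux implementation; struct cwv_state
and cwv_after_idle() are hypothetical names and all values are in bytes.

```c
#include <stdint.h>

/* Hypothetical per-connection state; names follow the RFC's pseudocode. */
struct cwv_state {
	uint32_t cwnd;		/* congestion window, bytes */
	uint32_t ssthresh;	/* slow-start threshold, bytes */
};

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/*
 * Decay the window after `idle` time units without sending.
 * `rto` uses the same time unit; `rcv_wnd` is the receiver's maximum
 * advertised window and `mss` the floor for cwnd.
 */
void cwv_after_idle(struct cwv_state *s, uint32_t idle, uint32_t rto,
		    uint32_t rcv_wnd, uint32_t mss)
{
	uint32_t n;

	if (rto == 0 || idle < rto)
		return;		/* not idle long enough: window still valid */

	/* Keep a memory of the old window so slow-start can recover quickly. */
	s->ssthresh = max_u32(s->ssthresh,
			      (uint32_t)((uint64_t)3 * s->cwnd / 4));

	for (n = idle / rto; n > 0; n--) {
		uint32_t win = min_u32(s->cwnd, rcv_wnd);

		s->cwnd = max_u32(win / 2, mss);
		if (s->cwnd == mss)
			break;	/* floor reached, further halving changes nothing */
	}
}
```

So a connection that sat idle for three RTOs with a 16000 byte window restarts
from 2000 bytes, but with ssthresh raised so it can slow-start back quickly.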

Re: [PATCH 1/3] Rough VJ Channel Implementation - vj_core.patch

2006-04-28 Thread Stephen Hemminger

On Fri, 28 Apr 2006 10:02:10 -0700
Caitlin Bestler [EMAIL PROTECTED] wrote:

 Evgeniy Polyakov wrote:
  On Fri, Apr 28, 2006 at 08:59:19AM -0700, Caitlin Bestler
  ([EMAIL PROTECTED]) wrote:
  Btw, how is it supposed to work without header split capabale
  hardware?
  
  Hardware that can classify packets is obviously capable of doing
  header data separation, but that does not mean that it has to do so.
  
  If the host wants header data separation it's real value is that when
  packets arrive in order that fewer distinct copies are required to
  move the data to the user buffer (because separated data can be
  placed back-to-back in a data-only ring). But that's an
  optimization, it's not needed to make the idea worth doing, or even
  necessarily in the first implementation.
  
  If there is dataflow, not flow of packets or flow of data
  with holes, it could be possible to modify recv() to just
  return the right pointer, so in theory userspace
  modifications would be minimal.
  With copy in place it completely does not differ from current
  design with copy_to_user() being used since memcpy() is just
  slightly faster than copy*user().
 
 If the app is really ready to use a modified interface we might as well
 just give them a QP/CQ interface. But I suppose receive by pointer
 interfaces don't really stretch the sockets interface all that badly.
 The key is that you have to decide how the buffer is released,
 is it the next call? Or a separate call? Does releasing buffer
 N+2 release buffers N and N+1? What you want to avoid 
 is having to keep a scoreboard of which buffers have been
 released.


Please just use the existing AIO interface.  We don't need another
interface. Each added interface increases the exposed bug surface
geometrically, which means that every new interface requires
testing and fixing bugs in every possible usage.



 But in context, header/data separation would allow in order
 packets to have the data be placed back to back, which 
 could allow a single recv to report the payload of multiple
 successive TCP segments. So the benefit of header/data
 separation remains the same, and I still say it's a optimization
 that should not be made a requirement. The benefits of vj_channels
 exist even without them. When the packet classifier runs on the
 host, header/data separation would not be free. I want to enable
 hardware offloads, not make the kernel bend over backwards
 to emulate how hardware would work. I'm just hoping that we
 can agree to let hardware do its work without being forced to
 work the same way the kernel does (i.e., running down a long
 list of arbitrary packet filter rules on a per packet basis).
 
 

[PATCH] netem: fix loss

2006-04-28 Thread Stephen Hemminger

The following one-line fix is needed to make the loss function of
netem work right when doing loss on the local host.
Otherwise, higher layers just recover.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- linux-2.6.orig/net/sched/sch_netem.c
+++ linux-2.6/net/sched/sch_netem.c
@@ -167,7 +167,7 @@ static int netem_enqueue(struct sk_buff 
 	if (count == 0) {
 		sch->qstats.drops++;
 		kfree_skb(skb);
-		return NET_XMIT_DROP;
+		return NET_XMIT_BYPASS;
 	}
 
 	/*
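
A minimal sketch of why the verdict matters, under an assumption the patch
itself does not spell out: that the device transmit path reports a bypass
verdict to the local sender as success, while a drop verdict is passed through
and lets the sender recover at once. The enum and all three functions are
hypothetical stand-ins, not kernel APIs.

```c
/* Hypothetical verdict codes, named after the kernel's NET_XMIT_* values. */
enum xmit_verdict { XMIT_SUCCESS, XMIT_DROP, XMIT_BYPASS };

/*
 * Qdisc stage: a packet picked for emulated loss is counted and thrown
 * away, and `loss_verdict` is what the qdisc tells its caller about it.
 */
enum xmit_verdict lossy_enqueue(int lose_packet,
				enum xmit_verdict loss_verdict,
				unsigned int *drops)
{
	if (lose_packet) {
		(*drops)++;
		return loss_verdict;
	}
	return XMIT_SUCCESS;
}

/* Transmit stage (assumed behaviour): bypass is reported upward as success. */
enum xmit_verdict queue_xmit(int lose_packet, enum xmit_verdict loss_verdict,
			     unsigned int *drops)
{
	enum xmit_verdict v = lossy_enqueue(lose_packet, loss_verdict, drops);

	return v == XMIT_BYPASS ? XMIT_SUCCESS : v;
}

/* Local sender: returns 1 if it learned of the loss and can resend at once. */
int sender_sees_drop(enum xmit_verdict v)
{
	return v != XMIT_SUCCESS;
}
```

With the drop verdict the local stack knows immediately and hides the emulated
loss; with the bypass verdict the packet just vanishes, as it would on a wire.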

Re: [LARTC] how to change classful netem loss probability?

2006-04-28 Thread Stephen Hemminger

Loss was broken, patch sent.

The following works now:

# tc qdisc add dev eth1 root handle 1:0 netem loss 20%

# tc qdisc add dev eth1 parent 1:1 handle 10: tbf \
  rate 256kbit buffer 1600 limit 3000
# ping -f -c 1000 shell

1000 packets transmitted, 781 received, 21% packet loss, time 3214ms
rtt min/avg/max/mdev = 0.187/0.398/3.763/0.730 ms, ipg/ewma 3.217/0.538 ms

# tc qdisc chang dev eth1 handle 1: netem loss 1%
# ping -f -c 1000 shell

1000 packets transmitted, 990 received, 1% packet loss, time 2922ms
rtt min/avg/max/mdev = 0.187/2.739/3.298/0.789 ms, ipg/ewma 2.924/2.084 ms



Re: [PATCH 1/3] Rough VJ Channel Implementation - vj_core.patch

2006-04-28 Thread Stephen Hemminger

On Fri, 28 Apr 2006 21:29:32 +0400
Evgeniy Polyakov [EMAIL PROTECTED] wrote:

 On Fri, Apr 28, 2006 at 10:18:33AM -0700, Stephen Hemminger ([EMAIL 
 PROTECTED]) wrote:
  Please just use existing AIO interface.  We don't need another
  interface. The number of interfaces increases the exposed bug
  surface geometrically.  Which means for each new interface, it
  means testing and fixing bugs in every possible usage.
 
 Networking AIO? Like [1] :)
 That would be really good.
 
 1. http://tservice.net.ru/~s0mbre/old/?section=projects&item=naio
 

The existing infrastructure is there in the syscall layer, it just
isn't really AIO for sockets. That naio project has two problems. First,
it requires driver changes, and he is doing it on the stupidest
of hardware; optimizing an 8139too is foolish. Second, introducing
kevents seems unnecessary, and it hasn't been accepted in the mainline.

The existing Linux AIO model seems sufficient:
http://lse.sourceforge.net/io/aio.html

There is work to put true POSIX AIO on top of this.





Re: [PATCH 1/3] Rough VJ Channel Implementation - vj_core.patch

2006-04-28 Thread Stephen Hemminger

On Fri, 28 Apr 2006 12:16:36 -0700 (PDT)
David S. Miller [EMAIL PROTECTED] wrote:

 From: Evgeniy Polyakov [EMAIL PROTECTED]
 Date: Fri, 28 Apr 2006 21:55:39 +0400

  On Fri, Apr 28, 2006 at 10:41:18AM -0700, Stephen Hemminger ([EMAIL 
  PROTECTED]) wrote:
   Second, introducing
   kevents, seems unnecessary and hasn't been accepted in the mainline.

  kevent was never sent to lkml@ although it showed over 40% win over epoll 
  for
  test web server. Sending it to lkml@ is just jumping into ... not into
  technical world, so I posted it first here, but without much attention
  though.

 Frankly I found kevents to be a very strong idea.

But there is this huge semantic overload of kevent, poll, epoll, aio,
regular sendmsg/recv, POSIX aio, etc.

Perhaps a clean break with the socket interface is needed. Otherwise, there
are nasty complications with applications that mix old socket calls and the
new interface on the same connection.

Re: [PATCH 001/100] TCP congestion module: add TCP-LP supporting for 2.6.16

2006-05-02 Thread Stephen Hemminger

On Mon, 1 May 2006 18:05:52 +0800
Wong Edison [EMAIL PROTECTED] wrote:

 TCP Low Priority is a distributed algorithm whose goal is to utilize only
 the excess network bandwidth as compared to the ``fair share`` of
 bandwidth as targeted by TCP. Available from:
   http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf
 
 See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
 Our group made the following changes to
 the original TCP-LP implementation:
   o We use newReno in most core CA handling. Only add some checking
     within cong_avoid.
   o Error correcting in remote HZ, therefore remote HZ will be kept
     on checking and updating.
   o Handling calculation of One-Way-Delay (OWD) within rtt_sample, since
     OWD has a similar meaning as RTT. Also correct the buggy formula.
   o Handle reaction for Early Congestion Indication (ECI) within
     pkts_acked, as mentioned within pseudo code.
   o OWD is handled in relative format, where local time stamp will be in
     tcp_time_stamp format.
 
 Port from 2.4.19 to 2.6.16 as module by:
   Wong Hoi Sing Edison [EMAIL PROTECTED]
   Hung Hing Lun [EMAIL PROTECTED]
 
 Signed-off-by: Wong Hoi Sing Edison [EMAIL PROTECTED]
 

Is this all of it?  Your subject line says there are 99 more pieces.
That seems huge.

Re: [PATCH 001/100] TCP congestion module: add TCP-LP supporting for 2.6.16

2006-05-02 Thread Stephen Hemminger


 +/**
 + * struct lp
 + * @flag: TCP-LP state flag
 + * @sowd: smoothed OWD << 3
 + * @owd_min: min OWD
 + * @owd_max: max OWD
 + * @owd_max_rsv: reserved max OWD
 + * @RHZ: estimated remote HZ
 + * @remote_ref_time: remote reference time
 + * @local_ref_time: local reference time
 + * @last_drop: time for last active drop
 + * @inference: current inference
 + *
 + * TCP-LP's private struct.
 + * We get the idea from original TCP-LP implementation where only left those we
 + * found are really useful.
 + */
 +struct lp {
 + u32 flag;
 + u32 sowd;
 + u32 owd_min;
 + u32 owd_max;
 + u32 owd_max_rsv;
 + u32 RHZ;
 + u32 remote_ref_time;
 + u32 local_ref_time;
 + u32 last_drop;
 + u32 inference;
 +};

It is best to keep structure element names lower case.
s/RHZ/rhz/

or use remote_hz


Re: VJ Channel API - driver level (PATCH)

2006-05-03 Thread Stephen Hemminger

On Wed, 3 May 2006 11:12:15 -0700
Caitlin Bestler [EMAIL PROTECTED] wrote:

 Evgeniy Polyakov wrote:
  On Wed, May 03, 2006 at 08:56:23AM -0700, Caitlin Bestler
  ([EMAIL PROTECTED]) wrote:
  I'd expect high end NIC ASICs to implement rx steering based upon
  some sort of hash (for load balancing), as well as explicit 1:1
  steering between a sw channel and a hw channel. Both options for
  channel configuration are present in the driver interface.
  If netfilter assists can be done in hardware, I agree the driver
  interface will need to add support for these - otherwise, netfilter
  processing will stay above the driver.
  
  
  
  Even if the hardware cannot fully implement netfilter rules there is
  still value in having an interface that documents exactly how much
  filtering a given piece of hardware can do.
  There is no point in having the kernel repeat packet classifications
  that have already been done by the NIC.
  
  Please do not suppose that vj channel must rely on
  underlaying hardware.
  New interface MUST work better or at least not worse than
  existing skb queueing for majority of users, and I doubt
  users with netfilter capable hardware are there.
  It is only some hint to the SW, not rules, that hardware can provide.
  The best would be ipv4/ipv6 hashing, and I think it is enough.
 
 I agree. I was just stating that *if* there is direct hardware 
 support then the software should be enabled to skip 
 redundant checks. What I'm suggesting is really the
 equivalent of knowing whether the hardware generates
 or checks CRCs and TCP checksums. Don't mandate
 the feature, just have the option to avoid redundant work.
 

Also, like multicast filtering, you need to allow for the partial
match case. If hardware can do some of the work, it helps.
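
A small sketch of the partial match idea, modelled on how multicast hash
filters are commonly split between NIC and host: the "hardware" stage is a
cheap hash bitmap that may let false positives through, and software only does
the exact comparison on what survives. Everything here is hypothetical,
including the toy hash (a real NIC would typically use CRC bits) and the names
mc_join(), mc_hw_pass() and mc_accept().

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN		6
#define MAX_GROUPS	8

/* Hypothetical filter state: bitmap for the NIC, exact list for software. */
struct mc_filter {
	uint64_t hw_hash;
	uint8_t groups[MAX_GROUPS][MAC_LEN];
	int ngroups;
};

/* Toy 6-bit hash for illustration only: low bits of the last address byte. */
unsigned int mc_hash(const uint8_t *addr)
{
	return addr[MAC_LEN - 1] & 63;
}

/* Join a group: record the exact address and set its hash bucket. */
int mc_join(struct mc_filter *f, const uint8_t *addr)
{
	if (f->ngroups >= MAX_GROUPS)
		return -1;
	memcpy(f->groups[f->ngroups++], addr, MAC_LEN);
	f->hw_hash |= UINT64_C(1) << mc_hash(addr);
	return 0;
}

/* "Hardware" stage: never rejects a joined group, may pass strangers. */
bool mc_hw_pass(const struct mc_filter *f, const uint8_t *addr)
{
	return (f->hw_hash >> mc_hash(addr)) & 1;
}

/* "Software" stage: exact match, run only on frames hardware let through. */
bool mc_accept(const struct mc_filter *f, const uint8_t *addr)
{
	int i;

	if (!mc_hw_pass(f, addr))
		return false;
	for (i = 0; i < f->ngroups; i++)
		if (memcmp(f->groups[i], addr, MAC_LEN) == 0)
			return true;
	return false;
}
```

The hardware does not have to be exact to be useful: every frame it rejects is
one the host never has to classify.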

sky2 1.3-rc1

2006-05-03 Thread Stephen Hemminger

Here is a new version that addresses some of the outstanding bugs.
* There was a race in receive processing that would cause a hang
* Some more support for Yukon Ultra found in dual-core Centrino
  laptops (I want one of these).

It does not fix the problems with dual port cards corrupting receive
data (and possibly memory).

http://developer.osdl.org/shemminger/prototypes/sky2-1.3-rc1.tar.bz2

If this works for most people, I'll post as separate patches for 2.6.17
tomorrow.

Re: netlink+ARP+CLIP == broken,

2006-05-03 Thread Stephen Hemminger

On Wed, 03 May 2006 22:32:39 +0100
Simon Kelley [EMAIL PROTECTED] wrote:

 Both net/ipv4/arp.c and net/atm/clip.c create neighbour tables with
 family == AF_INET. For most purposes this is fine, since the two modules
  each hold a pointer to their table and pass it into the neigh_* functions.
 
 A problem arises in neigh_add, which is called by the rtnetlink code and
 which iterates through all the neighbour tables looking for the first
 one with the correct family. Since there are two different tables with
 family == AF_INET, sometimes it picks the wrong one.
 
 This leads to the situation where sending an RTM_NEWNEIGH message via
 netlink can generate an ignored and useless entry in the clip table,
 whilst not affecting the other entry in the ARP table, both entries
 being for the same IP.
 
 Viz:
 sid:~# ip neigh
 192.168.3.40 dev eth0 lladdr 52:54:00:12:34:59 REACHABLE
 192.168.3.40 dev eth0  FAILED
 
 
 It's not immediately obvious how to fix this in a conceptually clean
 manner: neighbour tables are not associated with single netdevices, and
 they don't carry an address-type field. Given a {IP,lladdr,device}
 triple, it's easy to determine if the device is ether-like or CLIP, but
 then the update call would have to go via the ARP and CLIP modules,
 instead of direct to the neighbour module in an address independent way.
 New address types would need further additions to the netlink/neighbour
 code.
 
 OTOH there are several obvious hacks that will fix the immediate
 problem. I'm happy to provide a patch implementing one if that's desired.
 
 Looking again, I think this is also a security hole, since the CLIP code
 keeps a whole struct including pointers in the neighbour table entry
 where ARP has the MAC address. So this might provide a way to poke
 arbitrary pointers into the kernel via RTM_NEWNEIGH. Only for root, though.


This was fixed in 2.6.16.6 and current 2.6.17

[PATCH] bridge: keep track of received multicast packets

2006-05-04 Thread Stephen Hemminger

It makes sense to add this simple statistic to keep track of received
multicast packets.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- bridge.orig/net/bridge/br_input.c	2006-04-21 14:28:55.0 -0700
+++ bridge/net/bridge/br_input.c	2006-05-04 16:07:24.0 -0700
@@ -66,6 +66,7 @@
 	}
 
 	if (is_multicast_ether_addr(dest)) {
+		br->statistics.multicast++;
 		br_flood_forward(br, skb, !passedup);
 		if (!passedup)
 			br_pass_frame_up(br, skb);

Re: [Netem] where i can find this netem patch?

2006-05-05 Thread Stephen Hemminger

On Fri, 05 May 2006 11:08:23 -0400
George Nychis [EMAIL PROTECTED] wrote:

 Hi,
 
 I need help finding this patch that Stephen made.
 
 He sent me a patch, but i do not think its related to the patch that
 solved this problem.  I will include the patch he did forward to me at
 the bottom.

 
 However, here is the problem; I even tried his misspelling of change :)

 thorium-ini 15849-tests # tc qdisc add dev ath0 root handle 1:0 netem
 drop 0%
 thorium-ini 15849-tests # tc qdisc add dev ath0 parent 1:1 handle 10:
 xcp capacity 54Mbit limit 500
 thorium-ini 15849-tests # tc qdisc change dev ath0 root handle 1:0 netem
 drop 1%
 RTNETLINK answers: Invalid argument


The problem is that you are giving handle 1:0, so the change request was
going to xcp, and xcp doesn't understand the netem rtnetlink message.

You want to do:
# tc qdisc change dev ath0 root netem drop 1%


Re: sky2 1.3-rc1

2006-05-05 Thread Stephen Hemminger

On Fri, 5 May 2006 19:35:09 +0200
Thomas Glanzmann [EMAIL PROTECTED] wrote:

 Hello,
 
 http://developer.osdl.org/shemminger/prototypes/sky2-1.3-rc1.tar.bz2
 
  v0.15:
  64 bytes from 10.0.0.138: icmp_seq=2 ttl=64 time=0.467 ms
 
  v1.3-rc1:
  64 bytes from 10.0.0.138: icmp_seq=4 ttl=64 time=32.9 ms
 
 I can't confirm this. For me it is just perfect:
 
 64 bytes from 89.106.66.1: icmp_seq=1 ttl=64 time=0.278 ms
 
 :04:00.0 Ethernet controller: Marvell Technology Group Ltd.: Unknown 
 device 4361 (rev 17)
 
 Thomas

What is happening is that if there is a misconfiguration and IRQ routing
is messed up (i.e. edge triggered), the driver will degenerate to polling
every 100ms. If your system is this misconfigured, then ACPI or the BIOS
needs to be fixed, and the driver really only needs to work well enough to
get the bug report out ;-)

The older driver kept rewhacking the transmit IRQ status timer,
so it would give a bogus transmit status interrupt, and that was masking
issues.


Re: sky2 1.3-rc1

2006-05-05 Thread Stephen Hemminger

On Fri, 05 May 2006 19:42:27 +0100
Daniel Drake [EMAIL PROTECTED] wrote:

 Stephen Hemminger wrote:
  What is happening is that if there is a misconfiguration and irq routing
  is messed up (ie edge trigged). The driver will degenerate to polling every 
  100ms.
  If your system is this misconfigured, then ACPI or the BIOS needs to be 
  fixed
  and the driver really only needs to work well enough to get the bug report 
  out ;-)
 
 Ok, thanks for the explanation.
 
 Can you give any hints as to how we can classify this misconfiguration? 
 Barry's system has a level triggered IRQ assigned to sky2, and that IRQ 
 is not shared:
 
 http://bugs.gentoo.org/show_bug.cgi?id=132056#c3
 
 I'm just looking for something I can take to the ACPI developers, other 
 than its broken because Stephen said so ;)

Try loading the module with the idle_timeout=0 parameter.  In that case there
will be no polling timer.  If it just hangs, then the problem is a missed
interrupt.

You could use this patch to see if you are getting IRQs:

--- sky2.orig/drivers/net/sky2.c
+++ sky2/drivers/net/sky2.c
@@ -2125,6 +2125,9 @@ static int sky2_poll(struct net_device *
 	int work_done = 0;
 	u32 status = sky2_read32(hw, B0_Y2_SP_EISR);
 
+	if (netif_msg_intr((struct sky2_port *) netdev_priv(dev0)))
+		printk(KERN_DEBUG PFX "poll status %#x\n", status);
+
 	if (status & Y2_IS_HW_ERR)
 		sky2_hw_intr(hw);
 
@@ -2183,6 +2186,9 @@ static irqreturn_t sky2_intr(int irq, vo
 	if (status == 0 || status == ~0)
 		return IRQ_NONE;
 
+	if (netif_msg_intr((struct sky2_port *) netdev_priv(dev0)))
+		printk(KERN_DEBUG PFX "irq status %#x\n", status);
+
 	prefetch(&hw->st_le[hw->st_idx]);
 	if (likely(__netif_rx_schedule_prep(dev0)))
 		__netif_rx_schedule(dev0);

Re: Please pull upstream-fixes branch of wireless-2.6

2006-05-05 Thread Stephen Hemminger


Linus Torvalds wrote:

On Fri, 5 May 2006, Andrew Morton wrote:
  

On Fri, 5 May 2006 21:06:18 -0400
John W. Linville [EMAIL PROTECTED] wrote:


These are fixes intended for 2.6.17...thanks!
  

Jeff is offline for a couple of weeks.   Please prepare a pull for Linus.



Actually, while Jeff is off, Steve Hemminger is supposed to be the network 
driver overlord (All bow down before the mighty Shemminger), so please 
do synchronize with him.


Of course, that might be just Steve taking a look and telling me yeah, 
please pull directly from John.


Linus
  

I had a bunch ready for monday...

Re: [PATCH] netdev: hotplug napi race cleanup

2006-05-08 Thread Stephen Hemminger

On Sat, 06 May 2006 18:09:47 -0700 (PDT)
David S. Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Mon, 24 Apr 2006 15:23:41 -0700

  This follows after the earlier two patches.

  Change the initialization of the class device portion of the net device
  to be done earlier, so that any races before registration completes are
  harmless.  Add a mutex to avoid changes to netdevice during the
  class device registration. 

  Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

 I'm not going to apply this patch and instead request that we think
 about why this problem exists in the first place.

 This patch is even stronger evidence that doing the sysfs registry in
 the todo list processing is wrong.  If you can legally do this while
 holding the rtnl semaphore, you can just as equally do it inside of
 register_netdevice() which is where it truly belongs.

 Then you can handle errors properly, unwind the state, and return the
 error to the caller instead of just losing the error and leaving the
 device in a half-registered state.

The issue is are there network devices that can't sleep during
register_netdevice?

Re: 2.4 kern: want to print TCP cwnd with every packet

2006-05-08 Thread Stephen Hemminger

On Sat, 6 May 2006 10:19:16 -0400 (EDT)
George P Nychis [EMAIL PROTECTED] wrote:

 Hi,
 
 I'd like to print the TCP cwnd for the sender, with every packet before it is 
 sent out.  This way i could plot the sender window over time to show TCP's 
 behavior in certain conditions.
 
 I see in tcp_input.c several places where i could print the current window, 
 but i'd have to add code in multiple places.  I was wondering if there is any 
 1 place, right before a packet is sent out, that i could printk() tp->snd_cwnd
 
 Thanks!
 George
 

Look at http://developer.osdl.org/shemminger/prototypes/tcpprobe.tar.gz

Re: [PATCH] netdev: hotplug napi race cleanup

2006-05-08 Thread Stephen Hemminger

On Mon, 08 May 2006 11:37:31 -0700 (PDT)
David S. Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Mon, 8 May 2006 09:54:58 -0700

  The issue is are there network devices that can't sleep during
  register_netdevice?

 Oh right, I forgot about that.

We could do something like this in register_netdevice()

	if (in_atomic() || irqs_disabled())
		net_set_todo(dev);
	else {
		dev->reg_state = NETREG_REGISTERED;
		ret = netdev_register_sysfs(dev);
		if (ret) {
			...
		}
	}

It seems a bit grotty, and might cause pain later.

[patch 8/9] Add more support for the Yukon Ultra chip found in dual core Centrino laptops.

2006-05-08 Thread Stephen Hemminger

The newest Yukon Ultra chipsets require more special tweaks.
They seem to be like the Yukon XL chipsets. This code is transliterated
from the latest SysKonnect driver; I don't have any Ultra hardware.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2.orig/drivers/net/sky2.c
+++ sky2/drivers/net/sky2.c
@@ -304,7 +304,8 @@ static void sky2_phy_init(struct sky2_hw
 	struct sky2_port *sky2 = netdev_priv(hw->dev[port]);
 	u16 ctrl, ct1000, adv, pg, ledctrl, ledover;
 
-	if (sky2->autoneg == AUTONEG_ENABLE && hw->chip_id != CHIP_ID_YUKON_XL) {
+	if (sky2->autoneg == AUTONEG_ENABLE &&
+	    (hw->chip_id != CHIP_ID_YUKON_XL || hw->chip_id == CHIP_ID_YUKON_EC_U)) {
 		u16 ectrl = gm_phy_read(hw, port, PHY_MARV_EXT_CTRL);
 
 		ectrl &= ~(PHY_M_EC_M_DSC_MSK | PHY_M_EC_S_DSC_MSK |
@@ -332,7 +333,7 @@ static void sky2_phy_init(struct sky2_hw
 			ctrl |= PHY_M_PC_MDI_XMODE(PHY_M_PC_ENA_AUTO);
 
 			if (sky2->autoneg == AUTONEG_ENABLE &&
-			    hw->chip_id == CHIP_ID_YUKON_XL) {
+			    (hw->chip_id == CHIP_ID_YUKON_XL || hw->chip_id == CHIP_ID_YUKON_EC_U)) {
 				ctrl &= ~PHY_M_PC_DSC_MSK;
 				ctrl |= PHY_M_PC_DSC(2) | PHY_M_PC_DOWN_S_ENA;
 			}
@@ -448,10 +449,11 @@ static void sky2_phy_init(struct sky2_hw
 		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 3);
 
 		/* set LED Function Control register */
-		gm_phy_write(hw, port, PHY_MARV_PHY_CTRL, (PHY_M_LEDC_LOS_CTRL(1) |	/* LINK/ACT */
-							   PHY_M_LEDC_INIT_CTRL(7) |	/* 10 Mbps */
-							   PHY_M_LEDC_STA1_CTRL(7) |	/* 100 Mbps */
-							   PHY_M_LEDC_STA0_CTRL(7)));	/* 1000 Mbps */
+		gm_phy_write(hw, port, PHY_MARV_PHY_CTRL,
+			     (PHY_M_LEDC_LOS_CTRL(1) |	/* LINK/ACT */
+			      PHY_M_LEDC_INIT_CTRL(7) |	/* 10 Mbps */
+			      PHY_M_LEDC_STA1_CTRL(7) |	/* 100 Mbps */
+			      PHY_M_LEDC_STA0_CTRL(7)));	/* 1000 Mbps */
 
 		/* set Polarity Control register */
 		gm_phy_write(hw, port, PHY_MARV_PHY_STAT,
@@ -465,6 +467,25 @@ static void sky2_phy_init(struct sky2_hw
 		/* restore page register */
 		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, pg);
 		break;
+	case CHIP_ID_YUKON_EC_U:
+		pg = gm_phy_read(hw, port, PHY_MARV_EXT_ADR);
+
+		/* select page 3 to access LED control register */
+		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 3);
+
+		/* set LED Function Control register */
+		gm_phy_write(hw, port, PHY_MARV_PHY_CTRL,
+			     (PHY_M_LEDC_LOS_CTRL(1) |	/* LINK/ACT */
+			      PHY_M_LEDC_INIT_CTRL(8) |	/* 10 Mbps */
+			      PHY_M_LEDC_STA1_CTRL(7) |	/* 100 Mbps */
+			      PHY_M_LEDC_STA0_CTRL(7)));	/* 1000 Mbps */
+
+		/* set Blink Rate in LED Timer Control Register */
+		gm_phy_write(hw, port, PHY_MARV_INT_MASK,
+			     ledctrl | PHY_M_LED_BLINK_RT(BLINK_84MS));
+		/* restore page register */
+		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, pg);
+		break;
 
 	default:
 		/* set Tx LED (LED_TX) to blink mode on Rx OR Tx activity */
@@ -473,19 +494,21 @@ static void sky2_phy_init(struct sky2_hw
 		ledover |= PHY_M_LED_MO_RX(MO_LED_OFF);
 	}
 
-	if (hw->chip_id == CHIP_ID_YUKON_EC_U && hw->chip_rev >= 2) {
+	if (hw->chip_id == CHIP_ID_YUKON_EC_U && hw->chip_rev == CHIP_REV_YU_EC_A1) {
 		/* apply fixes in PHY AFE */
-		gm_phy_write(hw, port, 22, 255);
+		pg = gm_phy_read(hw, port, PHY_MARV_EXT_ADR);
+		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 255);
+
 		/* increase differential signal amplitude in 10BASE-T */
-		gm_phy_write(hw, port, 24, 0xaa99);
-		gm_phy_write(hw, port, 23, 0x2011);
+		gm_phy_write(hw, port, 0x18, 0xaa99);
+		gm_phy_write(hw, port, 0x17, 0x2011);
 
 		/* fix for IEEE A/B Symmetry failure in 1000BASE-T */
-		gm_phy_write(hw, port, 24, 0xa204);
-		gm_phy_write(hw, port, 23, 0x2002);
+		gm_phy_write(hw, port, 0x18, 0xa204);
+		gm_phy_write(hw, port, 0x17, 0x2002);
 
 		/* set page register to 0 */
-		gm_phy_write(hw, port, 22, 0);
+		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, pg);
 	} else {
 		gm_phy_write(hw, port, PHY_MARV_LED_CTRL, ledctrl);
 
@@ -559,6 +582,11 @@

[patch 6/9] sky2: dont write status ring

2006-05-08 Thread Stephen Hemminger

It is more efficient not to write the status ring from the
processor and just read the active portion.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2.orig/drivers/net/sky2.c	2006-05-02 09:49:38.0 -0700
+++ sky2/drivers/net/sky2.c	2006-05-02 09:49:42.0 -0700
@@ -1865,35 +1865,28 @@
 static int sky2_status_intr(struct sky2_hw *hw, int to_do)
 {
 	int work_done = 0;
+	u16 hwidx = sky2_read16(hw, STAT_PUT_IDX);
 
 	rmb();
 
-	for(;;) {
+	while (hw->st_idx != hwidx) {
 		struct sky2_status_le *le  = hw->st_le + hw->st_idx;
 		struct net_device *dev;
 		struct sky2_port *sky2;
 		struct sk_buff *skb;
 		u32 status;
 		u16 length;
-		u8  link, opcode;
-
-		opcode = le->opcode;
-		if (!opcode)
-			break;
-		opcode &= ~HW_OWNER;
 
 		hw->st_idx = RING_NEXT(hw->st_idx, STATUS_RING_SIZE);
-		le->opcode = 0;
 
-		link = le->link;
-		BUG_ON(link >= 2);
-		dev = hw->dev[link];
+		BUG_ON(le->link >= 2);
+		dev = hw->dev[le->link];
 
 		sky2 = netdev_priv(dev);
 		length = le->length;
 		status = le->status;
 
-		switch (opcode) {
+		switch (le->opcode & ~HW_OWNER) {
 		case OP_RXSTAT:
 			skb = sky2_receive(sky2, length, status);
 			if (!skb)
@@ -1944,8 +1937,8 @@
 		default:
 			if (net_ratelimit())
 				printk(KERN_WARNING PFX
-				       "unknown status opcode 0x%x\n", opcode);
-			break;
+				       "unknown status opcode 0x%x\n", le->opcode);
+			goto exit_loop;
 		}
 	}
 

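The idea in this patch, a consumer that takes one snapshot of the producer's put index and walks the ring up to it without writing anything back into the entries, can be sketched in plain C. This is only an illustrative model and not driver code: `struct demo_ring`, `demo_ring_drain()` and `DEMO_RING_SIZE` are hypothetical names, and `put_idx` merely stands in for the value read from `STAT_PUT_IDX`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring size; must be a power of two for the mask to work. */
#define DEMO_RING_SIZE	8u
#define RING_NEXT(x, s)	(((x) + 1) & ((s) - 1))

struct demo_ring {
	uint8_t entries[DEMO_RING_SIZE];	/* written by the producer only */
	uint16_t get_idx;			/* next entry the consumer reads */
	uint16_t put_idx;			/* stand-in for STAT_PUT_IDX */
};

/* Consume every entry between get_idx and a snapshot of put_idx.
 * The consumer only reads entries; it never clears them.
 * Adds each consumed entry to *sum and returns the number consumed. */
static inline int demo_ring_drain(struct demo_ring *r, unsigned *sum)
{
	uint16_t hwidx = r->put_idx;	/* read the producer index once */
	int work_done = 0;

	/* The mask-based wrap is only valid for power-of-two sizes. */
	assert((DEMO_RING_SIZE & (DEMO_RING_SIZE - 1)) == 0);

	while (r->get_idx != hwidx) {
		*sum += r->entries[r->get_idx];
		r->get_idx = RING_NEXT(r->get_idx, DEMO_RING_SIZE);
		work_done++;
	}
	return work_done;
}
```

With `get_idx` at 6 and `put_idx` at 2 the loop visits slots 6, 7, 0 and 1 and then stops, so a second call with an unchanged `put_idx` does no work.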

[patch 5/9] sky2: edge triggered workaround enhancement

2006-05-08 Thread Stephen Hemminger

Need to make the edge-triggered workaround timer faster to get marginally
better performance. The test_and_set_bit in schedule_prep() acts as a barrier
already. Make it a module parameter so that laptop users who are concerned
about power can set it to 0, and users stuck with a broken BIOS
can turn the driver into pure polling.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2.orig/drivers/net/sky2.c	2006-05-02 09:49:37.0 -0700
+++ sky2/drivers/net/sky2.c	2006-05-02 09:49:38.0 -0700
@@ -98,6 +98,10 @@
 module_param(disable_msi, int, 0);
 MODULE_PARM_DESC(disable_msi, "Disable Message Signaled Interrupt (MSI)");
 
+static int idle_timeout = 100;
+module_param(idle_timeout, int, 0);
+MODULE_PARM_DESC(idle_timeout, "Idle timeout workaround for lost interrupts (ms)");
+
 static const struct pci_device_id sky2_id_table[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9000) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9E00) },
@@ -2092,12 +2096,13 @@
  */
 static void sky2_idle(unsigned long arg)
 {
-	struct net_device *dev = (struct net_device *) arg;
+	struct sky2_hw *hw = (struct sky2_hw *) arg;
+	struct net_device *dev = hw->dev[0];
 
-	local_irq_disable();
 	if (__netif_rx_schedule_prep(dev))
 		__netif_rx_schedule(dev);
-	local_irq_enable();
+
+	mod_timer(&hw->idle_timer, jiffies + msecs_to_jiffies(idle_timeout));
 }
 
 
@@ -2145,8 +2150,6 @@
 	if (work_done >= work_limit)
 		return 1;
 
-	mod_timer(&hw->idle_timer, jiffies + HZ);
-
 	netif_rx_complete(dev0);
 
 	status = sky2_read32(hw, B0_Y2_SP_LISR);
@@ -2167,8 +2170,6 @@
 	prefetch(&hw->st_le[hw->st_idx]);
 	if (likely(__netif_rx_schedule_prep(dev0)))
 		__netif_rx_schedule(dev0);
-	else
-		printk(KERN_DEBUG PFX "irq race detected\n");
 
 	return IRQ_HANDLED;
 }
@@ -3290,7 +3291,10 @@
 
 	sky2_write32(hw, B0_IMSK, Y2_IS_BASE);
 
-	setup_timer(&hw->idle_timer, sky2_idle, (unsigned long) dev);
+	setup_timer(&hw->idle_timer, sky2_idle, (unsigned long) hw);
+	if (idle_timeout > 0)
+		mod_timer(&hw->idle_timer,
+			  jiffies + msecs_to_jiffies(idle_timeout));
 
 	pci_set_drvdata(pdev, hw);
 


[patch 1/9] sky2: backout NAPI reschedule

2006-05-08 Thread Stephen Hemminger

This is a backout of earlier patch.

The whole rescheduling hack was a bad idea. It doesn't really solve
the problem and it makes the code more complicated for no good reason.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2.orig/drivers/net/sky2.c
+++ sky2/drivers/net/sky2.c
@@ -2105,7 +2105,6 @@ static int sky2_poll(struct net_device *
 	int work_done = 0;
 	u32 status = sky2_read32(hw, B0_Y2_SP_EISR);
 
- restart_poll:
 	if (unlikely(status & ~Y2_IS_STAT_BMU)) {
 		if (status & Y2_IS_HW_ERR)
 			sky2_hw_intr(hw);
@@ -2136,7 +2135,7 @@ static int sky2_poll(struct net_device *
 	}
 
 	if (status & Y2_IS_STAT_BMU) {
-		work_done += sky2_status_intr(hw, work_limit - work_done);
+		work_done = sky2_status_intr(hw, work_limit);
 		*budget -= work_done;
 		dev0->quota -= work_done;
 
@@ -2148,22 +2147,9 @@ static int sky2_poll(struct net_device *
 
 	mod_timer(&hw->idle_timer, jiffies + HZ);
 
-	local_irq_disable();
-	__netif_rx_complete(dev0);
+	netif_rx_complete(dev0);
 
 	status = sky2_read32(hw, B0_Y2_SP_LISR);
-
-	if (unlikely(status)) {
-		/* More work pending, try and keep going */
-		if (__netif_rx_schedule_prep(dev0)) {
-			__netif_rx_reschedule(dev0, work_done);
-			status = sky2_read32(hw, B0_Y2_SP_EISR);
-			local_irq_enable();
-			goto restart_poll;
-		}
-	}
-
-	local_irq_enable();
 	return 0;
 }
 
@@ -2181,6 +2167,8 @@ static irqreturn_t sky2_intr(int irq, vo
 	prefetch(&hw->st_le[hw->st_idx]);
 	if (likely(__netif_rx_schedule_prep(dev0)))
 		__netif_rx_schedule(dev0);
+	else
+		printk(KERN_DEBUG PFX "irq race detected\n");
 
 	return IRQ_HANDLED;
 }
--- sky2.orig/include/linux/netdevice.h
+++ sky2/include/linux/netdevice.h
@@ -829,21 +829,19 @@ static inline void netif_rx_schedule(str
 		__netif_rx_schedule(dev);
 }
 
-
-static inline void  __netif_rx_reschedule(struct net_device *dev, int undo)
-{
-	dev->quota += undo;
-	list_add_tail(&dev->poll_list, &__get_cpu_var(softnet_data).poll_list);
-	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
-}
-
-/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete(). */
+/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete().
+ * Do not inline this?
+ */
 static inline int netif_rx_reschedule(struct net_device *dev, int undo)
 {
 	if (netif_rx_schedule_prep(dev)) {
 		unsigned long flags;
+
+		dev->quota += undo;
+
 		local_irq_save(flags);
-		__netif_rx_reschedule(dev, undo);
+		list_add_tail(&dev->poll_list, &__get_cpu_var(softnet_data).poll_list);
+		__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 		local_irq_restore(flags);
 		return 1;
 	}


[patch 0/9] sky2 update

2006-05-08 Thread Stephen Hemminger

Bug fixes for sky2 driver:
 * fix NAPI related race that caused hangs
 * possible fixes for Yukon Ultra PHY support
 * performance improvement of ring management
 * fix race with irq on module removal


[patch 4/9] sky2: use mask instead of modulo operation

2006-05-08 Thread Stephen Hemminger

Gcc isn't smart enough to know that it can do a modulo
operation with power of 2 constant by doing a mask.
So add macro to do it for us.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2.orig/drivers/net/sky2.c
+++ sky2/drivers/net/sky2.c
@@ -79,6 +79,8 @@
 #define NAPI_WEIGHT		64
 #define PHY_RETRIES		1000
 
+#define RING_NEXT(x,s)	(((x)+1) & ((s)-1))
+
 static const u32 default_msg =
     NETIF_MSG_DRV | NETIF_MSG_PROBE | NETIF_MSG_LINK
     | NETIF_MSG_TIMER | NETIF_MSG_TX_ERR | NETIF_MSG_RX_ERR
@@ -719,7 +721,7 @@ static inline struct sky2_tx_le *get_tx_
 {
 	struct sky2_tx_le *le = sky2->tx_le + sky2->tx_prod;
 
-	sky2->tx_prod = (sky2->tx_prod + 1) % TX_RING_SIZE;
+	sky2->tx_prod = RING_NEXT(sky2->tx_prod, TX_RING_SIZE);
 	return le;
 }
 
@@ -735,7 +737,7 @@ static inline void sky2_put_idx(struct s
 static inline struct sky2_rx_le *sky2_next_rx(struct sky2_port *sky2)
 {
 	struct sky2_rx_le *le = sky2->rx_le + sky2->rx_put;
-	sky2->rx_put = (sky2->rx_put + 1) % RX_LE_SIZE;
+	sky2->rx_put = RING_NEXT(sky2->rx_put, RX_LE_SIZE);
 	return le;
 }
 
@@ -1078,7 +1080,7 @@ err_out:
 /* Modular subtraction in ring */
 static inline int tx_dist(unsigned tail, unsigned head)
 {
-	return (head - tail) % TX_RING_SIZE;
+	return (head - tail) & (TX_RING_SIZE - 1);
 }
 
 /* Number of list elements available for next tx */
@@ -1255,7 +1257,7 @@ static int sky2_xmit_frame(struct sk_buf
 		le->opcode = OP_BUFFER | HW_OWNER;
 
 		fre = sky2->tx_ring
-		    + ((re - sky2->tx_ring) + i + 1) % TX_RING_SIZE;
+		    + RING_NEXT((re - sky2->tx_ring) + i, TX_RING_SIZE);
 		pci_unmap_addr_set(fre, mapaddr, mapping);
 	}
 
@@ -1315,7 +1317,7 @@ static void sky2_tx_complete(struct sky2
 
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			struct tx_ring_info *fre;
-			fre = sky2->tx_ring + (put + i + 1) % TX_RING_SIZE;
+			fre = sky2->tx_ring + RING_NEXT(put + i, TX_RING_SIZE);
 			pci_unmap_page(pdev, pci_unmap_addr(fre, mapaddr),
 				       skb_shinfo(skb)->frags[i].size,
 				       PCI_DMA_TODEVICE);
@@ -1876,7 +1878,7 @@ static int sky2_status_intr(struct sky2_
 			break;
 		opcode &= ~HW_OWNER;
 
-		hw->st_idx = (hw->st_idx + 1) % STATUS_RING_SIZE;
+		hw->st_idx = RING_NEXT(hw->st_idx, STATUS_RING_SIZE);
 		le->opcode = 0;
 
 		link = le->link;

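As a quick check of why the mask is a safe replacement, the sketch below compares the `RING_NEXT` form with the modulo form for every index of a small ring, and shows the masked "modular subtraction" used by `tx_dist()`. It assumes the ring size is a power of two, which is the only case where the two forms agree; `DEMO_RING_SIZE`, `ring_dist()` and `ring_next_matches_modulo()` are names made up for this illustration.

```c
#include <assert.h>

/* Same shape as the RING_NEXT macro in the patch: advance an index by
 * one and wrap it, assuming the ring size is a power of two. */
#define RING_NEXT(x, s)	(((x) + 1) & ((s) - 1))

/* Hypothetical ring size for illustration only; must be a power of two. */
#define DEMO_RING_SIZE	8u

/* Modular distance from tail to head, as in tx_dist() after the patch.
 * Unsigned wraparound makes (head - tail) correct modulo 2^32, and the
 * mask reduces it modulo the ring size. */
static inline unsigned ring_dist(unsigned tail, unsigned head)
{
	return (head - tail) & (DEMO_RING_SIZE - 1);
}

/* Returns 1 if the mask form agrees with the modulo form for every index. */
static inline int ring_next_matches_modulo(void)
{
	unsigned i;

	/* The mask trick is only valid for power-of-two sizes. */
	assert((DEMO_RING_SIZE & (DEMO_RING_SIZE - 1)) == 0);

	for (i = 0; i < DEMO_RING_SIZE; i++)
		if (RING_NEXT(i, DEMO_RING_SIZE) != (i + 1) % DEMO_RING_SIZE)
			return 0;
	return 1;
}
```

For a size that is not a power of two, say 6, the mask `(s)-1` is 5 (binary 101) and indexes would skip slots, which is why the macro cannot be used as a general modulo.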

[patch 2/9] sky2: status irq hang fix

2006-05-08 Thread Stephen Hemminger

The status interrupt flag should be cleared before processing,
not afterwards to avoid race. Need to process in poll routine
even if no new interrupt status. This is a normal occurrence when
more than 64 frames (NAPI weight) are processed in one poll routine.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2.orig/drivers/net/sky2.c	2006-05-02 09:42:18.0 -0700
+++ sky2/drivers/net/sky2.c	2006-05-02 09:46:39.0 -0700
@@ -2105,45 +2105,42 @@
 	int work_done = 0;
 	u32 status = sky2_read32(hw, B0_Y2_SP_EISR);
 
-	if (unlikely(status & ~Y2_IS_STAT_BMU)) {
-		if (status & Y2_IS_HW_ERR)
-			sky2_hw_intr(hw);
+	if (status & Y2_IS_HW_ERR)
+		sky2_hw_intr(hw);
 
-		if (status & Y2_IS_IRQ_PHY1)
-			sky2_phy_intr(hw, 0);
+	if (status & Y2_IS_IRQ_PHY1)
+		sky2_phy_intr(hw, 0);
 
-		if (status & Y2_IS_IRQ_PHY2)
-			sky2_phy_intr(hw, 1);
+	if (status & Y2_IS_IRQ_PHY2)
+		sky2_phy_intr(hw, 1);
 
-		if (status & Y2_IS_IRQ_MAC1)
-			sky2_mac_intr(hw, 0);
+	if (status & Y2_IS_IRQ_MAC1)
+		sky2_mac_intr(hw, 0);
 
-		if (status & Y2_IS_IRQ_MAC2)
-			sky2_mac_intr(hw, 1);
+	if (status & Y2_IS_IRQ_MAC2)
+		sky2_mac_intr(hw, 1);
 
-		if (status & Y2_IS_CHK_RX1)
-			sky2_descriptor_error(hw, 0, "receive", Y2_IS_CHK_RX1);
+	if (status & Y2_IS_CHK_RX1)
+		sky2_descriptor_error(hw, 0, "receive", Y2_IS_CHK_RX1);
 
-		if (status & Y2_IS_CHK_RX2)
-			sky2_descriptor_error(hw, 1, "receive", Y2_IS_CHK_RX2);
+	if (status & Y2_IS_CHK_RX2)
+		sky2_descriptor_error(hw, 1, "receive", Y2_IS_CHK_RX2);
 
-		if (status & Y2_IS_CHK_TXA1)
-			sky2_descriptor_error(hw, 0, "transmit", Y2_IS_CHK_TXA1);
+	if (status & Y2_IS_CHK_TXA1)
+		sky2_descriptor_error(hw, 0, "transmit", Y2_IS_CHK_TXA1);
 
-		if (status & Y2_IS_CHK_TXA2)
-			sky2_descriptor_error(hw, 1, "transmit", Y2_IS_CHK_TXA2);
-	}
+	if (status & Y2_IS_CHK_TXA2)
+		sky2_descriptor_error(hw, 1, "transmit", Y2_IS_CHK_TXA2);
 
-	if (status & Y2_IS_STAT_BMU) {
-		work_done = sky2_status_intr(hw, work_limit);
-		*budget -= work_done;
-		dev0->quota -= work_done;
+	if (status & Y2_IS_STAT_BMU)
+		sky2_write32(hw, STAT_CTRL, SC_STAT_CLR_IRQ);
 
-		if (work_done >= work_limit)
-			return 1;
+	work_done = sky2_status_intr(hw, work_limit);
+	*budget -= work_done;
+	dev0->quota -= work_done;
 
-		sky2_write32(hw, STAT_CTRL, SC_STAT_CLR_IRQ);
-	}
+	if (work_done >= work_limit)
+		return 1;
 
 	mod_timer(&hw->idle_timer, jiffies + HZ);
 

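The ordering argument in the description (clear the status interrupt flag first, then process) can be illustrated with a toy model. This is a simplified sketch under the assumption that the flag is latched when the device posts an entry and that a poll only happens while the flag is set; every name below is hypothetical and none of it is taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a device with a latched "status" interrupt flag.
 * All names here are hypothetical; this is not driver code. */
struct demo_dev {
	int pending;	/* entries posted but not yet processed */
	bool irq_flag;	/* latched when an entry is posted */
};

/* The device posts one entry and latches its interrupt flag. */
static inline void demo_post(struct demo_dev *d)
{
	d->pending++;
	d->irq_flag = true;
}

/* Process everything that is visible right now. */
static inline void demo_process(struct demo_dev *d)
{
	d->pending = 0;
}

/* Old order: process, then clear. An entry that arrives in between
 * (modelled by late_arrival) has its flag wiped without being handled. */
static inline void poll_process_then_clear(struct demo_dev *d, bool late_arrival)
{
	demo_process(d);
	if (late_arrival)
		demo_post(d);
	d->irq_flag = false;
}

/* New order: clear, then process. A late entry latches the flag again,
 * so the next poll will still see it. */
static inline void poll_clear_then_process(struct demo_dev *d, bool late_arrival)
{
	d->irq_flag = false;
	demo_process(d);
	if (late_arrival)
		demo_post(d);
}

/* True when work is waiting but nothing will trigger another poll. */
static inline bool demo_is_stuck(const struct demo_dev *d)
{
	assert(d->pending >= 0);
	return d->pending > 0 && !d->irq_flag;
}
```

In this model, an entry that lands between processing and clearing leaves the old order with pending work and no flag, which is the hang the patch title refers to; with the new order the same late entry leaves the flag set.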

[patch 3/9] sky2: tx ring index mask fix

2006-05-08 Thread Stephen Hemminger

Mask for transmit ring status was picking up bits from the
unused sync ring.  They were always zero, so far...
Also, make sure to remind self not to make tx ring too big.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2.orig/drivers/net/sky2.c
+++ sky2/drivers/net/sky2.c
@@ -1927,7 +1927,8 @@ static int sky2_status_intr(struct sky2_
 
 		case OP_TXINDEXLE:
 			/* TX index reports status for both ports */
-			sky2_tx_done(hw->dev[0], status & 0xffff);
+			BUILD_BUG_ON(TX_RING_SIZE > 0x1000);
+			sky2_tx_done(hw->dev[0], status & 0xfff);
 			if (hw->dev[1])
 				sky2_tx_done(hw->dev[1],
 				     ((status >> 24) & 0xff)

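The combination of a 12-bit mask and a compile-time size check can be sketched outside the kernel as follows. The value 512 for `TX_RING_SIZE` is an assumption for illustration, C11 `_Static_assert` stands in for the kernel's `BUILD_BUG_ON()`, and `tx_index_port0()` is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ring size for this sketch; must not exceed 0x1000 entries. */
#define TX_RING_SIZE	512

/* Stand-in for BUILD_BUG_ON(TX_RING_SIZE > 0x1000): refuse to build if
 * a ring index could need more than the 12 bits that the mask keeps. */
_Static_assert(TX_RING_SIZE <= 0x1000, "tx ring index must fit in 12 bits");

/* Extract the port 0 transmit index from a status word, keeping only
 * the low 12 bits as the patched code does. */
static inline uint16_t tx_index_port0(uint32_t status)
{
	return (uint16_t)(status & 0xfff);
}

/* Bits above the low 12 (0xf000 here) must not leak into the index. */
static inline void tx_index_selfcheck(void)
{
	assert(tx_index_port0(0x0000f123u) == 0x123);
}
```

Raising `TX_RING_SIZE` above 0x1000 in this sketch makes the build fail, which is the "remind self" effect the description mentions.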

[patch 9/9] sky2: version 1.3

2006-05-08 Thread Stephen Hemminger

Update version number, to track changes.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2.orig/drivers/net/sky2.c
+++ sky2/drivers/net/sky2.c
@@ -51,7 +51,7 @@
 #include "sky2.h"
 
 #define DRV_NAME		"sky2"
-#define DRV_VERSION		"1.2"
+#define DRV_VERSION		"1.3"
 #define PFX			DRV_NAME " "
 
 /*


iproute2 git repository

2006-05-08 Thread Stephen Hemminger

I moved iproute2 out of CVS. New home is:
git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git

Will keep CVS tree up to date until the next release; after that it will
rest in peace.

Please pull updated network drivers

2006-05-08 Thread Stephen Hemminger

These fixes are for 2.6.17, please excuse my git learning curve.
I have about had it for today.

git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/netdev-2.6.git upstream

Daniel Drake:
  softmac: don't reassociate if user asked for deauthentication
  softmac: make non-operational after being stopped

David Woodhouse:
  bcm43xx: Fix access to non-existent PHY registers

Herbert Valerio Riedel:
  au1000_eth.c: use ether_crc() from linux/crc32.h

Jean Delvare:
  ieee80211: Fix A band channel count (resent)

Jens Osterkamp:
  spidernet: introduce new setting
  spidernet: enable support for bcm5461 ethernet phy

Michael Buesch:
  bcm43xx: fix iwmode crash when down
  bcm43xx: Fix array overrun in bcm43xx_geo_init

Sergei Shtylyov:
  Fix RTL8019AS init for Toshiba RBTX49xx boards

Stefano Brivio:
  bcm43xx: check for valid MAC address in SPROM

Stephen Hemminger:
  sky2: backout NAPI reschedule
  sky2: status irq hang fix
  sky2: tx ring index mask fix
  sky2: use mask instead of modulo operation
  sky2: edge triggered workaround enhancement
  sky2: dont write status ring
  sky2: synchronize irq on remove
  Add more support for the Yukon Ultra chip found in dual core Centrino laptops.
  sky2: version 1.3
  Merge branch 'upstream-fixes' of git://git.kernel.org/.../linville/wireless-2.6

Re: tcp compound

2006-05-09 Thread Stephen Hemminger

On Tue, 9 May 2006 19:39:43 +0200
Angelo P. Castellani [EMAIL PROTECTED] wrote:

 I resend the file because I've sent an old (quite identical) copy

Moved discussion over to netdev mailing list..

Could you export symbols in tcp_vegas (and change config dependencies) to
allow code reuse rather than having to copy/paste everything from vegas?


Re: iproute2 git repository

2006-05-09 Thread Stephen Hemminger

On Tue, 09 May 2006 21:51:44 +1000
Herbert Xu [EMAIL PROTECTED] wrote:

 Stephen Hemminger [EMAIL PROTECTED] wrote:
  I moved iproute2 out of CVS. New home is:
 git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git
 
 Thanks Stephen.
 
 BTW, how come there is a checked out tree sitting in that git directory?

fixed.  stupid git

Re: [RFC] netdev sysfs failure handling

2006-05-09 Thread Stephen Hemminger

Something like this would handle errors better, but introduce possible
problems for drivers that call register_netdevice with irq's disabled.
There was some comment about racing with linkwatch, but don't see how
that could happen during creation.  

For 2.6.18?

--- bridge.orig/include/linux/netdevice.h	2006-05-09 11:17:08.0 -0700
+++ bridge/include/linux/netdevice.h	2006-05-09 11:18:52.0 -0700
@@ -433,8 +433,7 @@
 
 	/* register/unregister state machine */
 	enum { NETREG_UNINITIALIZED=0,
-	       NETREG_REGISTERING,	/* called register_netdevice */
-	       NETREG_REGISTERED,	/* completed register todo */
+	       NETREG_REGISTERED,	/* completed register_netdevice */
 	       NETREG_UNREGISTERING,	/* called unregister_netdevice */
 	       NETREG_UNREGISTERED,	/* completed unregister todo */
 	       NETREG_RELEASED,		/* called free_netdev */
--- bridge.orig/net/core/dev.c	2006-05-09 11:17:09.0 -0700
+++ bridge/net/core/dev.c	2006-05-09 11:37:18.0 -0700
@@ -2777,6 +2777,8 @@
 	BUG_ON(dev_boot_phase);
 	ASSERT_RTNL();
 
+	might_sleep();
+
 	/* When net_device's are persistent, this will be fatal. */
 	BUG_ON(dev->reg_state != NETREG_UNINITIALIZED);
 
@@ -2863,6 +2865,11 @@
 	if (!dev->rebuild_header)
 		dev->rebuild_header = default_rebuild_header;
 
+	ret = netdev_register_sysfs(dev);
+	if (ret)
+		goto out_err;
+	dev->reg_state = NETREG_REGISTERED;
+
 	/*
 	 *	Default initial state at registry is that the
 	 *	device is present.
@@ -2878,14 +2885,11 @@
 	hlist_add_head(&dev->name_hlist, head);
 	hlist_add_head(&dev->index_hlist, dev_index_hash(dev->ifindex));
 	dev_hold(dev);
-	dev->reg_state = NETREG_REGISTERING;
 	write_unlock_bh(&dev_base_lock);
 
 	/* Notify protocols, that a new device appeared. */
 	blocking_notifier_call_chain(&netdev_chain, NETDEV_REGISTER, dev);
 
-	/* Finish registration after unlock */
-	net_set_todo(dev);
 	ret = 0;
 
 out:
@@ -3008,7 +3012,7 @@
  *
  * We are invoked by rtnl_unlock() after it drops the semaphore.
  * This allows us to deal with problems:
- * 1) We can create/delete sysfs objects which invoke hotplug
+ * 1) We can delete sysfs objects which invoke hotplug
  *    without deadlocking with linkwatch via keventd.
  * 2) Since we run with the RTNL semaphore not held, we can sleep
  *    safely in order to wait for the netdev refcnt to drop to zero.
@@ -3017,8 +3021,6 @@
 void netdev_run_todo(void)
 {
 	struct list_head list = LIST_HEAD_INIT(list);
-	int err;
-
 
 	/* Need to guard against multiple cpu's getting out of order. */
 	mutex_lock(&net_todo_run_mutex);
@@ -3041,40 +3043,29 @@
 			= list_entry(list.next, struct net_device, todo_list);
 		list_del(&dev->todo_list);
 
-		switch(dev->reg_state) {
-		case NETREG_REGISTERING:
-			err = netdev_register_sysfs(dev);
-			if (err)
-				printk(KERN_ERR "%s: failed sysfs registration (%d)\n",
-				       dev->name, err);
-			dev->reg_state = NETREG_REGISTERED;
-			break;
-
-		case NETREG_UNREGISTERING:
-			netdev_unregister_sysfs(dev);
-			dev->reg_state = NETREG_UNREGISTERED;
-
-			netdev_wait_allrefs(dev);
-
-			/* paranoia */
-			BUG_ON(atomic_read(&dev->refcnt));
-			BUG_TRAP(!dev->ip_ptr);
-			BUG_TRAP(!dev->ip6_ptr);
-			BUG_TRAP(!dev->dn_ptr);
-
-
-			/* It must be the very last action, 
-			 * after this 'dev' may point to freed up memory.
-			 */
-			if (dev->destructor)
-				dev->destructor(dev);
-			break;
-
-		default:
+		if (unlikely(dev->reg_state != NETREG_UNREGISTERING)) {
 			printk(KERN_ERR "network todo '%s' but state %d\n",
 			       dev->name, dev->reg_state);
-			break;
+			dump_stack();
+			continue;
 		}
+
+		netdev_unregister_sysfs(dev);
+		dev->reg_state = NETREG_UNREGISTERED;
+
+		netdev_wait_allrefs(dev);
+
+		/* paranoia */
+		BUG_ON(atomic_read(&dev->refcnt));
+		BUG_TRAP(!dev->ip_ptr);
+		BUG_TRAP(!dev->ip6_ptr);
+		BUG_TRAP(!dev->dn_ptr);
+
+		/* It must be the very last action,
+		 * after this 'dev' may point to freed up memory.
+		 */
+

Re: [RFC PATCH 34/35] Add the Xen virtual network device driver.

2006-05-09 Thread Stephen Hemminger

The stuff in /proc could easily just be added attributes to the class_device
kobject of the net device (and then show up in sysfs).


 +
 +#define GRANT_INVALID_REF	0
 +
 +#define NET_TX_RING_SIZE __RING_SIZE((struct netif_tx_sring *)0, PAGE_SIZE)
 +#define NET_RX_RING_SIZE __RING_SIZE((struct netif_rx_sring *)0, PAGE_SIZE)
 +
 +static inline void init_skb_shinfo(struct sk_buff *skb)
 +{
 +	atomic_set(&(skb_shinfo(skb)->dataref), 1);
 +	skb_shinfo(skb)->nr_frags = 0;
 +	skb_shinfo(skb)->frag_list = NULL;
 +}
 +

Could you use existing sk_buff_head instead of inventing your
own skb queue?

 +struct netfront_info
 +{
 + struct list_head list;
 + struct net_device *netdev;
 +
 + struct net_device_stats stats;
 + unsigned int tx_full;
 +
 + struct netif_tx_front_ring tx;
 + struct netif_rx_front_ring rx;
 +
 + spinlock_t   tx_lock;
 + spinlock_t   rx_lock;
 +
 + unsigned int handle;
 + unsigned int evtchn, irq;
 +
 + /* What is the status of our connection to the remote backend? */
 +#define BEST_CLOSED       0
 +#define BEST_DISCONNECTED 1
 +#define BEST_CONNECTED    2
 + unsigned int backend_state;
 +
 + /* Is this interface open or closed (down or up)? */
 +#define UST_CLOSED        0
 +#define UST_OPEN          1
 + unsigned int user_state;
 +
 + /* Receive-ring batched refills. */
 +#define RX_MIN_TARGET 8
 +#define RX_DFL_MIN_TARGET 64
 +#define RX_MAX_TARGET NET_RX_RING_SIZE
 + int rx_min_target, rx_max_target, rx_target;
 + struct sk_buff_head rx_batch;
 +
 + struct timer_list rx_refill_timer;
 +
 + /*
 +  * {tx,rx}_skbs store outstanding skbuffs. The first entry in each
 +  * array is an index into a chain of free entries.
 +  */
 + struct sk_buff *tx_skbs[NET_TX_RING_SIZE+1];
 + struct sk_buff *rx_skbs[NET_RX_RING_SIZE+1];
 +
 + grant_ref_t gref_tx_head;
 + grant_ref_t grant_tx_ref[NET_TX_RING_SIZE + 1];
 + grant_ref_t gref_rx_head;
 + grant_ref_t grant_rx_ref[NET_TX_RING_SIZE + 1];
 +
 + struct xenbus_device *xbdev;
 + int tx_ring_ref;
 + int rx_ring_ref;
 + u8 mac[ETH_ALEN];

Isn't mac address already stored in dev->dev_addr and/or dev->perm_addr?


Re: [RFC PATCH 34/35] Add the Xen virtual network device driver.

2006-05-09 Thread Stephen Hemminger

 +static int setup_device(struct xenbus_device *dev, struct netfront_info *info)
 +{
 +	struct netif_tx_sring *txs;
 +	struct netif_rx_sring *rxs;
 +	int err;
 +	struct net_device *netdev = info->netdev;
 +
 +	info->tx_ring_ref = GRANT_INVALID_REF;
 +	info->rx_ring_ref = GRANT_INVALID_REF;
 +	info->rx.sring = NULL;
 +	info->tx.sring = NULL;
 +	info->irq = 0;
 +
 +	txs = (struct netif_tx_sring *)get_zeroed_page(GFP_KERNEL);
 +	if (!txs) {
 +		err = -ENOMEM;
 +		xenbus_dev_fatal(dev, err, "allocating tx ring page");
 +		goto fail;
 +	}
 +	rxs = (struct netif_rx_sring *)get_zeroed_page(GFP_KERNEL);
 +	if (!rxs) {
 +		err = -ENOMEM;
 +		xenbus_dev_fatal(dev, err, "allocating rx ring page");
 +		free_page((unsigned long)txs);
 +		goto fail;
 +	}
 +	info->backend_state = BEST_DISCONNECTED;
 +
 +	SHARED_RING_INIT(txs);
 +	FRONT_RING_INIT(&info->tx, txs, PAGE_SIZE);
 +
 +	SHARED_RING_INIT(rxs);
 +	FRONT_RING_INIT(&info->rx, rxs, PAGE_SIZE);
 +
 +	err = xenbus_grant_ring(dev, virt_to_mfn(txs));
 +	if (err < 0)
 +		goto fail;
 +	info->tx_ring_ref = err;
 +
 +	err = xenbus_grant_ring(dev, virt_to_mfn(rxs));
 +	if (err < 0)
 +		goto fail;
 +	info->rx_ring_ref = err;
 +
 +	err = xenbus_alloc_evtchn(dev, &info->evtchn);
 +	if (err)
 +		goto fail;
 +
 +	memcpy(netdev->dev_addr, info->mac, ETH_ALEN);
 +	network_connect(netdev);
 +	info->irq = bind_evtchn_to_irqhandler(
 +		info->evtchn, netif_int, SA_SAMPLE_RANDOM, netdev->name,
 

This doesn't look like a real random entropy source. Packets
arriving from another domain are easily timed.

Re: [RFC] netdev sysfs failure handling

2006-05-09 Thread Stephen Hemminger

On Tue, 09 May 2006 14:05:01 -0700 (PDT)
David S. Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Tue, 9 May 2006 12:01:07 -0700

  Something like this would handle errors better, but introduce possible
  problems for drivers that call register_netdevice with irq's disabled.
  There was some comment about racing with linkwatch, but don't see how
  that could happen during creation.  

  For 2.6.18?

 I've been thinking about this a bit more.

 How can anyone be using this with IRQ's disabled if we have
 an ASSERT_RTNL() there?

Agreed, especially since rtnl is now a real mutex.  The case, that
I was worried about:
	rtnl_lock()
	spin_lock_irq(&mylock);
	x = register_netdevice();
	...

Doesn't show up in any current code, even for the pseudo devices
and funny virtualized interfaces.

[PATCH] sky2: ifdown kills irq mask

2006-05-09 Thread Stephen Hemminger

Bringing down a port also masks off the status and other IRQs
needed for the device to function, due to missing parentheses.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2.orig/drivers/net/sky2.c
+++ sky2/drivers/net/sky2.c
@@ -128,6 +128,7 @@ MODULE_DEVICE_TABLE(pci, sky2_id_table);
 /* Avoid conditionals by using array */
 static const unsigned txqaddr[] = { Q_XA1, Q_XA2 };
 static const unsigned rxqaddr[] = { Q_R1, Q_R2 };
+static const u32 portirq_msk[] = { Y2_IS_PORT_1, Y2_IS_PORT_2 };
 
 /* This driver supports yukon2 chipset only */
 static const char *yukon2_name[] = {
@@ -1084,7 +1085,7 @@ static int sky2_up(struct net_device *de
 
/* Enable interrupts from phy/mac for port */
imask = sky2_read32(hw, B0_IMSK);
-   imask |= (port == 0) ? Y2_IS_PORT_1 : Y2_IS_PORT_2;
+   imask |= portirq_msk[port];
sky2_write32(hw, B0_IMSK, imask);
 
return 0;
@@ -1435,7 +1436,7 @@ static int sky2_down(struct net_device *
 
/* Disable port IRQ */
imask = sky2_read32(hw, B0_IMSK);
-   imask &= ~(sky2->port == 0) ? Y2_IS_PORT_1 : Y2_IS_PORT_2;
+   imask &= ~portirq_msk[port];
sky2_write32(hw, B0_IMSK, imask);
 
/* turn off LED's */
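
The precedence problem can be reproduced in user space. This is a minimal sketch, taking the removed line to be `imask &= ~(sky2->port == 0) ? Y2_IS_PORT_1 : Y2_IS_PORT_2;`. The bit values below are made-up stand-ins for the real port masks, which this mail does not show, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed stand-ins for Y2_IS_PORT_1 / Y2_IS_PORT_2; not the real values. */
#define PORT_1_BITS 0x0000001fu
#define PORT_2_BITS 0x00001f00u

static const uint32_t portirq_msk[] = { PORT_1_BITS, PORT_2_BITS };

/* Old expression: '~' binds tighter than '?:', so the condition is
 * ~(port == 0), which is -1 or -2 and therefore always true.
 * The statement becomes imask &= PORT_1_BITS, clearing every other bit. */
static uint32_t mask_down_buggy(uint32_t imask, unsigned port)
{
	imask &= ~(port == 0) ? PORT_1_BITS : PORT_2_BITS;
	return imask;
}

/* Fixed expression: clear only the bits that belong to this port. */
static uint32_t mask_down_fixed(uint32_t imask, unsigned port)
{
	imask &= ~portirq_msk[port];
	return imask;
}
```

With an interrupt mask that has status bits set outside both port fields, the buggy version keeps only the port 1 bits regardless of which port goes down, while the fixed version leaves everything but the selected port untouched.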

Re: [RFC] netdev sysfs failure handling

2006-05-09 Thread Stephen Hemminger

On Tue, 09 May 2006 15:43:22 -0700 (PDT)
David S. Miller [EMAIL PROTECTED] wrote:

 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Tue, 9 May 2006 14:40:49 -0700

  Agreed, especially since rtnl is now a real mutex.  The case that
  I was worried about:
	rtnl_lock();
	spin_lock_irq(&mylock);
	x = register_netdevice();
	...

  Doesn't show up in any current code, even for the pseudo devices
  and funny virtualized interfaces.

 Right, therefore I think we should put something like your patch in
 there now perhaps.

 The case where we really needed the todo list is unregister, so that
 we can safely wait for all references to the net device to go away.

 I still wonder about those mentioned hotplug races wrt. linkwatch
 in the comment above netdev_run_todo().

 Linkwatch is such a nuisance because it combines asynchronous link
 state change processing with keventd and RTNL locking.  It sleeps
 waiting for __LINK_STATE_SCHED to clear with the RTNL held (via
 dev_deactivate()).  But then again dev_close() code paths do this
 too, so the dev_deactivate() bit should be OK.

 Linkwatch, after doing the dev_activate(), emits a NETDEV_CHANGE
 notifier on netdev_chain and also sends out an RTM_NETLINK
 message.  This is for the case where IFF_UP is set.

 Until we release the RTNL semaphore, during netdev register, nobody
 can go in and inspect the state of a net device.  So doing the sysfs
 node creation in register_netdevice() should be OK as far as I can
 tell.

 Can anyone find a problem with this?

Also, by getting the netdevice fully in sysfs under RTNL,
we are safe from races with the hotplug uevent that occurs.

Right now, it might be possible on SMP for the hotplug to happen
after register_netdevice, but before the device shows up
in sysfs.

[PATCH 0/2] register_netdevice and sysfs changes

2006-05-10 Thread Stephen Hemminger

This is a signed-off version of yesterday's fix, plus the bridge
code no longer needs to be so tricky.
--


[PATCH 2/2] bridge: do sysfs registration inside rtnl

2006-05-10 Thread Stephen Hemminger

Now that netdevice sysfs registration is done as part of register_netdevice,
the bridge code no longer has to be tricky when adding its kobjects to bridges.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- bridge.orig/net/bridge/br_if.c  2006-05-04 16:22:29.0 -0700
+++ bridge/net/bridge/br_if.c   2006-05-09 11:27:16.0 -0700
@@ -308,26 +308,19 @@
if (ret)
goto err2;
 
-   /* network device kobject is not setup until
-* after rtnl_unlock does it's hotplug magic.
-* so hold reference to avoid race.
-*/
-   dev_hold(dev);
-   rtnl_unlock();
-
ret = br_sysfs_addbr(dev);
-   dev_put(dev);
-
-   if (ret) 
-   unregister_netdev(dev);
- out:
-   return ret;
+   if (ret)
+   goto err3;
+   rtnl_unlock();
+   return 0;
 
+ err3:
+   unregister_netdev(dev);
  err2:
free_netdev(dev);
  err1:
rtnl_unlock();
-   goto out;
+   return ret;
 }
 
 int br_del_bridge(const char *name)

--


[PATCH 1/2] netdev: do sysfs registration as part of register_netdevice

2006-05-10 Thread Stephen Hemminger

The last step of netdevice registration was being done by a delayed
call, but because it was delayed, it was impossible to return any error
code if the class_device registration failed.

Side effects:
 * one state in registration process is unnecessary.
 * register_netdevice can sleep inside class_device registration/hotplug
 * code in netdev_run_todo only does unregistration so it is simpler.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- bridge.orig/include/linux/netdevice.h   2006-05-09 11:17:08.0 
-0700
+++ bridge/include/linux/netdevice.h2006-05-09 11:18:52.0 -0700
@@ -433,8 +433,7 @@
 
/* register/unregister state machine */
enum { NETREG_UNINITIALIZED=0,
-  NETREG_REGISTERING,  /* called register_netdevice */
-  NETREG_REGISTERED,   /* completed register todo */
+  NETREG_REGISTERED,   /* completed register_netdevice */
   NETREG_UNREGISTERING,/* called unregister_netdevice */
   NETREG_UNREGISTERED, /* completed unregister todo */
   NETREG_RELEASED, /* called free_netdev */
--- bridge.orig/net/core/dev.c  2006-05-09 11:17:09.0 -0700
+++ bridge/net/core/dev.c   2006-05-09 11:37:18.0 -0700
@@ -2777,6 +2777,8 @@
 	BUG_ON(dev_boot_phase);
 	ASSERT_RTNL();
 
+	might_sleep();
+
 	/* When net_device's are persistent, this will be fatal. */
 	BUG_ON(dev->reg_state != NETREG_UNINITIALIZED);
 
@@ -2863,6 +2865,11 @@
 	if (!dev->rebuild_header)
 		dev->rebuild_header = default_rebuild_header;
 
+	ret = netdev_register_sysfs(dev);
+	if (ret)
+		goto out_err;
+	dev->reg_state = NETREG_REGISTERED;
+
 	/*
 	 *	Default initial state at registry is that the
 	 *	device is present.
@@ -2878,14 +2885,11 @@
 	hlist_add_head(&dev->name_hlist, head);
 	hlist_add_head(&dev->index_hlist, dev_index_hash(dev->ifindex));
 	dev_hold(dev);
-	dev->reg_state = NETREG_REGISTERING;
 	write_unlock_bh(&dev_base_lock);
 
 	/* Notify protocols, that a new device appeared. */
 	blocking_notifier_call_chain(&netdev_chain, NETDEV_REGISTER, dev);
 
-	/* Finish registration after unlock */
-	net_set_todo(dev);
 	ret = 0;
 
 out:
@@ -3008,7 +3012,7 @@
  *
  * We are invoked by rtnl_unlock() after it drops the semaphore.
  * This allows us to deal with problems:
- * 1) We can create/delete sysfs objects which invoke hotplug
+ * 1) We can delete sysfs objects which invoke hotplug
  *without deadlocking with linkwatch via keventd.
  * 2) Since we run with the RTNL semaphore not held, we can sleep
  *safely in order to wait for the netdev refcnt to drop to zero.
@@ -3017,8 +3021,6 @@
 void netdev_run_todo(void)
 {
struct list_head list = LIST_HEAD_INIT(list);
-   int err;
-
 
/* Need to guard against multiple cpu's getting out of order. */
 	mutex_lock(&net_todo_run_mutex);
@@ -3041,40 +3043,29 @@
 			= list_entry(list.next, struct net_device, todo_list);
 		list_del(&dev->todo_list);
 
-		switch(dev->reg_state) {
-		case NETREG_REGISTERING:
-			err = netdev_register_sysfs(dev);
-			if (err)
-				printk(KERN_ERR "%s: failed sysfs registration (%d)\n",
-				       dev->name, err);
-			dev->reg_state = NETREG_REGISTERED;
-			break;
-
-		case NETREG_UNREGISTERING:
-			netdev_unregister_sysfs(dev);
-			dev->reg_state = NETREG_UNREGISTERED;
-
-			netdev_wait_allrefs(dev);
-
-			/* paranoia */
-			BUG_ON(atomic_read(&dev->refcnt));
-			BUG_TRAP(!dev->ip_ptr);
-			BUG_TRAP(!dev->ip6_ptr);
-			BUG_TRAP(!dev->dn_ptr);
-
-
-			/* It must be the very last action,
-			 * after this 'dev' may point to freed up memory.
-			 */
-			if (dev->destructor)
-				dev->destructor(dev);
-			break;
-
-		default:
+		if (unlikely(dev->reg_state != NETREG_UNREGISTERING)) {
 			printk(KERN_ERR "network todo '%s' but state %d\n",
 			       dev->name, dev->reg_state);
-			break;
+			dump_stack();
+			continue;
 		}
+
+		netdev_unregister_sysfs(dev);
+		dev->reg_state = NETREG_UNREGISTERED;
+
+		netdev_wait_allrefs(dev);
+
+		/* paranoia */
+		BUG_ON(atomic_read(&dev->refcnt));
+		BUG_TRAP(!dev->ip_ptr);
+		BUG_TRAP(!dev->ip6_ptr

please pull upstream branch of netdev-2.6

2006-05-10 Thread Stephen Hemminger

The following changes since commit 6810b548b25114607e0814612d84125abccc0a4f:
  Andi Kleen:
x86_64: Move ondemand timer into own work queue

are found in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/netdev-2.6.git upstream

Francois Romieu:
  dl2k: use DMA_48BIT_MASK constant

Herbert Valerio Riedel:
  phy: mdiobus_register(): initialize all phy_map entries

James Cameron:
  sis900: phy for FoxCon motherboard

Stephen Hemminger:
  sky2: ifdown kills irq mask

 drivers/net/dl2k.c  |   12 ++--
 drivers/net/phy/mdio_bus.c  |4 +++-
 drivers/net/sis900.c|1 +
 drivers/net/sky2.c  |5 +++--
 include/linux/dma-mapping.h |1 +
 5 files changed, 14 insertions(+), 9 deletions(-)

Re: [PATCH 3/6] myri10ge - Driver header files

2006-05-10 Thread Stephen Hemminger

On Wed, 10 May 2006 23:36:18 +0200
Brice Goglin [EMAIL PROTECTED] wrote:

 [PATCH 3/6] myri10ge - Driver header files
 
 myri10ge driver header files.
 myri10ge_mcp.h is the generic header, while myri10ge_mcp_gen_header.h
 is automatically generated from our firmware image.

Then clean it up after the auto generation.
Auto generated code still gets maintained by humans.

 Signed-off-by: Brice Goglin [EMAIL PROTECTED]
 Signed-off-by: Andrew J. Gallatin [EMAIL PROTECTED]
 
  myri10ge_mcp.h|  233 
 ++
  myri10ge_mcp_gen_header.h |   73 ++
  2 files changed, 306 insertions(+)
 
 --- /dev/null 2006-04-21 00:45:09.06443 -0700
 +++ linux-mm/drivers/net/myri10ge/myri10ge_mcp.h  2006-04-21 
 08:20:59.0 -0700
 @@ -0,0 +1,233 @@
 +#ifndef _myri10ge_mcp_h
 +#define _myri10ge_mcp_h
 +
 +#define MYRI10GE_MCP_MAJOR   1
 +#define MYRI10GE_MCP_MINOR   4
 +

Major/Minor for what? You don't have a character device.

 +#ifdef MYRI10GE_MCP
 +typedef signed char          int8_t;
 +typedef signed short        int16_t;
 +typedef signed int          int32_t;
 +typedef signed long long    int64_t;
 +typedef unsigned char       uint8_t;
 +typedef unsigned short     uint16_t;
 +typedef unsigned int       uint32_t;
 +typedef unsigned long long uint64_t;
 +#endif

Use u8 u16 u32


 +/* 8 Bytes */
 +typedef struct
 +{
 +  uint32_t high;
 +  uint32_t low;
 +} mcp_dma_addr_t;

Run this through scripts/Lindent and get indentation right

 +/* 16 Bytes */
 +typedef struct
 +{
 +  uint16_t checksum;
 +  uint16_t length;
 +} mcp_slot_t;
 +
 +/* 64 Bytes */
 +typedef struct
 +{
 +  uint32_t cmd;
 +  uint32_t data0;	/* will be low portion if data > 32 bits */
 +  /* 8 */
 +  uint32_t data1;	/* will be high portion if data > 32 bits */
 +  uint32_t data2;	/* currently unused.. */
 +  /* 16 */
 +  mcp_dma_addr_t response_addr;
 +  /* 24 */
 +  uint8_t pad[40];
 +} mcp_cmd_t;
 +
 +/* 8 Bytes */
 +typedef struct
 +{
 +  uint32_t data;
 +  uint32_t result;
 +} mcp_cmd_response_t;
 +
 +
 +
 +/* 
 +   flags used in mcp_kreq_ether_send_t:
 +
 +   The SMALL flag is only needed in the first segment. It is raised
 +   for packets that are total less or equal 512 bytes.
 +
 +   The CKSUM flag must be set in all segments.
 +
 +   The PADDED flags is set if the packet needs to be padded, and it
 +   must be set for all segments.
 +
 +   The  MYRI10GE_MCP_ETHER_FLAGS_ALIGN_ODD must be set if the cumulative
 +   length of all previous segments was odd.
 +*/
 +
 +
 +#define MYRI10GE_MCP_ETHER_FLAGS_SMALL      0x1
 +#define MYRI10GE_MCP_ETHER_FLAGS_TSO_HDR    0x1
 +#define MYRI10GE_MCP_ETHER_FLAGS_FIRST      0x2
 +#define MYRI10GE_MCP_ETHER_FLAGS_ALIGN_ODD  0x4
 +#define MYRI10GE_MCP_ETHER_FLAGS_CKSUM      0x8
 +#define MYRI10GE_MCP_ETHER_FLAGS_TSO_LAST   0x8
 +#define MYRI10GE_MCP_ETHER_FLAGS_NO_TSO     0x10
 +#define MYRI10GE_MCP_ETHER_FLAGS_TSO_CHOP   0x10
 +#define MYRI10GE_MCP_ETHER_FLAGS_TSO_PLD    0x20
 +
 +#define MYRI10GE_MCP_ETHER_SEND_SMALL_SIZE  1520
 +#define MYRI10GE_MCP_ETHER_MAX_MTU  9400
 +
 +typedef union mcp_pso_or_cumlen
 +{
 +  uint16_t pseudo_hdr_offset;
 +  uint16_t cum_len;
 +} mcp_pso_or_cumlen_t;
 +
 +#define  MYRI10GE_MCP_ETHER_MAX_SEND_DESC 12
 +#define MYRI10GE_MCP_ETHER_PAD   2
 +
 +/* 16 Bytes */
 +typedef struct
 +{
 +  uint32_t addr_high;
 +  uint32_t addr_low;
 +  uint16_t pseudo_hdr_offset;
 +  uint16_t length;
 +  uint8_t  pad;
 +  uint8_t  rdma_count;
 +  uint8_t  cksum_offset; /* where to start computing cksum */
 +  uint8_t  flags;/* as defined above */
 +} mcp_kreq_ether_send_t;
 +
 +/* 8 Bytes */
 +typedef struct
 +{
 +  uint32_t addr_high;
 +  uint32_t addr_low;
 +} mcp_kreq_ether_recv_t;
 +
 +
 +/* Commands */
 +
 +#define MYRI10GE_MCP_CMD_OFFSET 0xf8
 +
 +typedef enum {
 +  MYRI10GE_MCP_CMD_NONE = 0,
 +  /* Reset the mcp, it is left in a safe state, waiting
 + for the driver to set all its parameters */
 +  MYRI10GE_MCP_CMD_RESET,
 +
 +  /* get the version number of the current firmware..
 + (may be available in the eeprom strings..? */
 +  MYRI10GE_MCP_GET_MCP_VERSION,
 +
 +
 +  /* Parameters which must be set by the driver before it can
 + issue MYRI10GE_MCP_CMD_ETHERNET_UP. They persist until the next
 + MYRI10GE_MCP_CMD_RESET is issued */
 +
 +  MYRI10GE_MCP_CMD_SET_INTRQ_DMA,
 +  MYRI10GE_MCP_CMD_SET_BIG_BUFFER_SIZE,  /* in bytes, power of 2 */
 +  MYRI10GE_MCP_CMD_SET_SMALL_BUFFER_SIZE,/* in bytes */
 +  
 +
 +  /* Parameters which refer to lanai SRAM addresses where the 
 + driver must issue PIO writes for various things */
 +
 +  MYRI10GE_MCP_CMD_GET_SEND_OFFSET,
 +  MYRI10GE_MCP_CMD_GET_SMALL_RX_OFFSET,
 +  MYRI10GE_MCP_CMD_GET_BIG_RX_OFFSET,
 +  MYRI10GE_MCP_CMD_GET_IRQ_ACK_OFFSET,
 +  MYRI10GE_MCP_CMD_GET_IRQ_DEASSERT_OFFSET,
 +
 +  /* Parameters which refer to rings stored on the MCP,
 + and whose size is controlled by the mcp */
 +
 +

Re: [PATCH 4/6] myri10ge - First half of the driver

2006-05-10 Thread Stephen Hemminger

On Wed, 10 May 2006 14:40:22 -0700 (PDT)
Brice Goglin [EMAIL PROTECTED] wrote:

 [PATCH 4/6] myri10ge - First half of the driver
 
 The first half of the myri10ge driver core.
 

Splitting it in half might help with email size restrictions, but it breaks
'git bisect' for future users, who expect every kernel to be buildable.

Re: [PATCH 5/6] myri10ge - Second half of the driver

2006-05-10 Thread Stephen Hemminger

On Wed, 10 May 2006 14:42:41 -0700 (PDT)
Brice Goglin [EMAIL PROTECTED] wrote:

 [PATCH 5/6] myri10ge - Second half of the driver
 
 The second half of the myri10ge driver core.
 
 Signed-off-by: Brice Goglin [EMAIL PROTECTED]
 Signed-off-by: Andrew J. Gallatin [EMAIL PROTECTED]
 
  myri10ge.c | 1540 
 +
  1 file changed, 1540 insertions(+)
 
 --- linux/drivers/net/myri10ge/myri10ge.c.old 2006-05-09 23:00:54.0 
 +0200
 +++ linux/drivers/net/myri10ge/myri10ge.c 2006-05-09 23:00:54.0 
 +0200
 @@ -1481,3 +1481,1543 @@ static struct ethtool_ops myri10ge_ethto
   .get_stats_count= myri10ge_get_stats_count,
   .get_ethtool_stats  = myri10ge_get_ethtool_stats
  };
 +
 +static int
 +myri10ge_open(struct net_device *dev)

It is preferred to put function declarations on one line.

static int myri10ge_open(struct net_device *dev)



 +{
 + struct myri10ge_priv *mgp;
 + size_t bytes;
 + myri10ge_cmd_t cmd;
 + int tx_ring_size, rx_ring_size;
 + int tx_ring_entries, rx_ring_entries;
 + int i, status, big_pow2;
 +
 + mgp = dev->priv;

use netdev_priv(dev)

 +
 + if (mgp->running != MYRI10GE_ETH_STOPPED)
 +	return -EBUSY;
 +
 + mgp->running = MYRI10GE_ETH_STARTING;
 + status = myri10ge_reset(mgp);

 + /* If the user sets an obscenely small MTU, adjust the small
 +  * bytes down to nearly nothing */
 + if (mgp->small_bytes >= (dev->mtu + ETH_HLEN))
 +	mgp->small_bytes = 64;

You should enforce mtu >= 68 in your driver (see eth_change_mtu)
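
The check being asked for is small. Below is a user-space sketch of the kind of range test meant here. The helper name check_mtu and the max_mtu parameter are hypothetical; 68 is the datagram size every IPv4 module must be able to forward without fragmentation (RFC 791).

```c
#include <errno.h>

#define MIN_MTU 68	/* minimum IPv4 MTU, per RFC 791 */

/* Hypothetical helper: return 0 if new_mtu is acceptable, -EINVAL otherwise.
 * max_mtu is whatever the hardware can handle. */
static int check_mtu(int new_mtu, int max_mtu)
{
	if (new_mtu < MIN_MTU || new_mtu > max_mtu)
		return -EINVAL;
	return 0;
}
```

With a check like this in the change_mtu path, the open routine never sees an "obscenely small" MTU and the special case there can go away.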


 +static int
 +myri10ge_close(struct net_device *dev)
 +{
 + struct myri10ge_priv *mgp;
 + struct sk_buff *skb;
 + myri10ge_tx_buf_t *tx;
 + int status, i, old_down_cnt, len, idx;
 + myri10ge_cmd_t cmd;
 +
 + mgp = dev->priv;
 +
 + if (mgp->running != MYRI10GE_ETH_RUNNING)
 +	return 0;
 +
 + if (mgp->tx.req_bytes == NULL)
 +	return 0;
 +
 + del_timer_sync(&mgp->watchdog_timer);
 + mgp->running = MYRI10GE_ETH_STOPPING;
 + if (myri10ge_napi)
 +	netif_poll_disable(mgp->dev);
 + netif_carrier_off(dev);
 + netif_stop_queue(dev);
 + old_down_cnt = mgp->down_cnt;
 + mb();
 + status = myri10ge_send_cmd(mgp, MYRI10GE_MCP_CMD_ETHERNET_DOWN, &cmd);
 + if (status) {
 +	printk(KERN_ERR "myri10ge: %s: Couldn't bring down link\n",
 +	       dev->name);
 + }
 + set_current_state (TASK_UNINTERRUPTIBLE);
 + if (old_down_cnt == mgp->down_cnt)
 +	schedule_timeout(HZ);
 + set_current_state(TASK_RUNNING);
 + if (old_down_cnt == mgp->down_cnt) {
 +	printk(KERN_ERR "myri10ge: %s never got down irq\n",
 +	       dev->name);
 + }

Better to use a wait_queue and wait_event()

 
 +#ifdef NETIF_F_TSO
 +static inline unsigned long
 +myri10ge_tcpend(struct sk_buff *skb)
 +{
 + struct iphdr *ip;
 + int iphlen, tcplen;
 + struct tcphdr *tcp;
 +
 + ip = (struct iphdr *) ((char *) skb->data + 14);
 + iphlen = ip->ihl << 2;
 + tcp = (struct tcphdr *) ((char *) ip + iphlen);
 + tcplen = tcp->doff << 2;
 + return (tcplen + iphlen + 14);
 +}
 +#endif

The information you want is already in skb->nh.iph and skb->h.th,
and it works with VLANs. Your code doesn't.
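
To see why the fixed offset of 14 breaks: an 802.1Q tag inserts four bytes between the source MAC address and the real EtherType, so the IP header of a tagged frame starts at byte 18. Below is a user-space sketch over a raw frame buffer. The helper name l3_offset is hypothetical and it handles a single tag only; in the driver itself the stack's header pointers already carry this information.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN	14	/* untagged Ethernet header */
#define VLAN_HLEN	4	/* one 802.1Q tag */
#define ETH_P_8021Q	0x8100	/* EtherType announcing a VLAN tag */

/* Hypothetical helper: offset of the network header in a raw frame,
 * or 0 if the frame is too short. Handles a single VLAN tag only. */
static size_t l3_offset(const uint8_t *frame, size_t len)
{
	uint16_t ethertype;

	if (len < ETH_HLEN)
		return 0;
	/* EtherType sits in bytes 12-13, in network byte order. */
	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ethertype != ETH_P_8021Q)
		return ETH_HLEN;
	if (len < ETH_HLEN + VLAN_HLEN)
		return 0;
	return ETH_HLEN + VLAN_HLEN;
}
```

Code that hard-codes 14 reads the tag and the inner EtherType as if they were the start of the IP header whenever the frame is tagged.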

 +
 +static inline void
 +myri10ge_csum_fixup(struct sk_buff *skb, int cksum_offset,
 + int pseudo_hdr_offset)
 +{
 + int csum;
 + uint16_t *csum_ptr;
 +
 +
 + csum = skb_checksum(skb, cksum_offset,
 +		     skb->len - cksum_offset, 0);
 + csum_ptr = (uint16_t *) (skb->h.raw + skb->csum);
 + if (!pskb_may_pull(skb, pseudo_hdr_offset)) {
 +	printk(KERN_ERR "myri10ge: can't pull skb %d\n",
 +	       pseudo_hdr_offset);
 +	return;
 + }
 + *csum_ptr = csum_fold(csum);
 + /* need to fixup IPv4 UDP packets according to RFC768 */
 + if (unlikely(*csum_ptr == 0 &&
 +	      skb->protocol == htons(ETH_P_IP) &&
 +	      skb->nh.iph->protocol == IPPROTO_UDP)) {
 +	*csum_ptr = 0xffff;
 + }
 +}

Use skb_checksum_help() instead of this code...

 +
 +/*
 + * Transmit a packet.  We need to split the packet so that a single
 + * segment does not cross myri10ge->tx.boundary, so this makes segment
 + * counting tricky.  So rather than try to count segments up front, we
 + * just give up if there are too few segments to hold a reasonably
 + * fragmented packet currently available.  If we run
 + * out of segments while preparing a packet for DMA, we just linearize
 + * it and try again.
 + */
 +
 +static int
 +myri10ge_xmit(struct sk_buff *skb, struct net_device *dev)
 +{
 + struct myri10ge_priv *mgp = dev->priv;
 + mcp_kreq_ether_send_t *req;
 + myri10ge_tx_buf_t *tx = &mgp->tx;
 + struct skb_frag_struct *frag;
 + dma_addr_t bus;
 + uint32_t low, high_swapped;
 +

[PATCH] bonding: fix sparse warnings

2006-05-10 Thread Stephen Hemminger

Fix warning from sparse in bonding code about incorrect type in assignment

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- orig/drivers/net/bonding/bond_main.c	2006-05-04 16:22:10.0 -0700
+++ new/drivers/net/bonding/bond_main.c	2006-05-10 16:04:38.0 -0700
@@ -629,7 +629,7 @@
 	ioctl = slave_dev->do_ioctl;
 	strncpy(ifr.ifr_name, slave_dev->name, IFNAMSIZ);
 	etool.cmd = ETHTOOL_GSET;
-	ifr.ifr_data = (char*)&etool;
+	ifr.ifr_data = (void __user *) &etool;
 	if (!ioctl || (IOCTL(slave_dev, &ifr, SIOCETHTOOL) < 0)) {
 		return -1;
 	}
@@ -726,7 +726,7 @@
 	if (ioctl) {
 		strncpy(ifr.ifr_name, slave_dev->name, IFNAMSIZ);
 		etool.cmd = ETHTOOL_GLINK;
-		ifr.ifr_data = (char*)&etool;
+		ifr.ifr_data = (void __user *) &etool;
 		if (IOCTL(slave_dev, &ifr, SIOCETHTOOL) == 0) {
 			if (etool.data == 1) {
 				return BMSR_LSTATUS;

Re: [PATCH] bonding: fix sparse warnings

2006-05-10 Thread Stephen Hemminger

On Thu, 11 May 2006 00:22:03 +0100
Al Viro [EMAIL PROTECTED] wrote:

 On Wed, May 10, 2006 at 04:14:05PM -0700, Stephen Hemminger wrote:
  Fix warning from sparse in bonding code about incorrect type in assignment
 
 *snerk*
 
 Only if you are building without -Wcast-to-as.  It _is_ incorrect type in
 assignment.  And the real fix is to expand the call, killing set_fs()
 in there.

More like this (in br_if.c)?

	struct ethtool_cmd ecmd = { ETHTOOL_GSET };
	struct ifreq ifr;
	mm_segment_t old_fs;
	int err;

	strncpy(ifr.ifr_name, dev->name, IFNAMSIZ);
	ifr.ifr_data = (void __user *) &ecmd;

	old_fs = get_fs();
	set_fs(KERNEL_DS);
	err = dev_ethtool(&ifr);
	set_fs(old_fs);

	if (!err)
		...

Re: [RFC PATCH 34/35] Add the Xen virtual network device driver.

2006-05-11 Thread Stephen Hemminger

On Thu, 11 May 2006 11:47:52 +0200
Andi Kleen [EMAIL PROTECTED] wrote:

 On Thursday 11 May 2006 09:49, Keir Fraser wrote:
  On 11 May 2006, at 01:33, Herbert Xu wrote:
   But if sampling virtual events for randomness is really unsafe (is it
   really?) then native guests in Xen would also get bad random numbers
   and this would need to be somehow addressed.
  
   Good point.  I wonder what VMWare does in this situation.
 
  Well, there's not much they can do except maybe jitter interrupt
  delivery. I doubt they do that though.
 
  The original complaint in our case was that we take entropy from
  interrupts caused by other local VMs, as well as external sources.
  There was a feeling that the former was more predictable and could form
  the basis of an attack. I have to say I'm unconvinced: I don't really
  see that it's significantly easier to inject precisely-timed interrupts
  into a local VM. Certainly not to better than +/- a few microseconds.
  As long as you add cycle-counter info to the entropy pool, the least
  significant bits of that will always be noise.
 
 I think I agree - e.g. i would expect the virtual interrupts to have
 enough jitter too. Maybe it would be good if someone could
 run a few statistics on the resulting numbers?
 
 Ok the randomness added doesn't consist only of the least significant
 bits. Currently it adds jiffies+full 32bit cycle count.  I guess if it was
 a real problem the code could be changed to leave out the jiffies and 
 only add maybe a 8 bit word from the low bits. But that would only
 help for the para case because the algorithm for native guests
 cannot be changed.
 
2. An entropy front/back is tricky -- how do we decide how much
  entropy to pull from domain0? How much should domain0 be prepared to
  give other domains? How easy is it to DoS domain0 by draining its
  entropy pool? Yuk.
 
 I claim (without having read any code) that in theory you need to have solved 
 that problem already in the vTPM @)
 

The base question under all this is: how good does an entropy source have
to be? And then: what guarantees do we make about the entropy inputs used
by /dev/random? If we can resolve those, then the virtual environment
answer should fall out.

This is an area where the security tin-foil hat types take over, and it
gets really hard to make a good enough argument. People have built up an
expectation that /dev/random has really strong entropy, good enough to
generate long-term keys etc.

[PATCH] expose simplified skb_checksum_recalc

2006-05-11 Thread Stephen Hemminger

Many users of skb_checksum_help() are just using it to recalculate
outbound checksum, so why not expose the interface in a more useful
way. Suggested by Ingo Oeser.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- linux-2.6.orig/include/linux/skbuff.h   2006-04-27 11:12:53.0 
-0700
+++ linux-2.6/include/linux/skbuff.h2006-05-11 11:17:39.0 -0700
@@ -1343,6 +1343,24 @@
__skb_checksum_complete(skb);
 }
 
+extern int skb_checksum_recalc(struct sk_buff *skb);
+/**
+ * skb_checksum_help - recalculate checksum of packet
+ * @skb: packet to process
+ * @inward: direction of flow, zero is receiving
+ *
+ * Invalidate hardware checksum when packet is to be mangled on
+ * receive and complete checksum manually on outgoing path.
+ */
+static inline int skb_checksum_help(struct sk_buff *skb, int inward)
+{
+	if (inward) {
+		skb->ip_summed = CHECKSUM_NONE;
+		return 0;
+	}
+	return skb_checksum_recalc(skb);
+}
+
 #ifdef CONFIG_NETFILTER
 static inline void nf_conntrack_put(struct nf_conntrack *nfct)
 {
--- sky2.orig/net/core/dev.c2006-05-10 10:17:51.0 -0700
+++ sky2/net/core/dev.c 2006-05-11 11:22:27.0 -0700
@@ -1144,39 +1144,6 @@
 EXPORT_SYMBOL(netif_device_attach);
 
 
-/*
- * Invalidate hardware checksum when packet is to be mangled, and
- * complete checksum manually on outgoing path.
- */
-int skb_checksum_help(struct sk_buff *skb, int inward)
-{
-	unsigned int csum;
-	int ret = 0, offset = skb->h.raw - skb->data;
-
-	if (inward) {
-		skb->ip_summed = CHECKSUM_NONE;
-		goto out;
-	}
-
-	if (skb_cloned(skb)) {
-		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
-		if (ret)
-			goto out;
-	}
-
-	BUG_ON(offset > (int)skb->len);
-	csum = skb_checksum(skb, offset, skb->len-offset, 0);
-
-	offset = skb->tail - skb->h.raw;
-	BUG_ON(offset <= 0);
-	BUG_ON(skb->csum + 2 > offset);
-
-	*(u16*)(skb->h.raw + skb->csum) = csum_fold(csum);
-	skb->ip_summed = CHECKSUM_NONE;
-out:
-	return ret;
-}
-
 /* Take action when hardware reception checksum errors are detected. */
 #ifdef CONFIG_BUG
 void netdev_rx_csum_fault(struct net_device *dev)
@@ -3403,7 +3370,6 @@
 EXPORT_SYMBOL(register_gifconf);
 EXPORT_SYMBOL(register_netdevice);
 EXPORT_SYMBOL(register_netdevice_notifier);
-EXPORT_SYMBOL(skb_checksum_help);
 EXPORT_SYMBOL(synchronize_net);
 EXPORT_SYMBOL(unregister_netdevice);
 EXPORT_SYMBOL(unregister_netdevice_notifier);
--- sky2.orig/net/core/skbuff.c 2006-04-27 11:12:54.0 -0700
+++ sky2/net/core/skbuff.c  2006-05-11 11:23:13.0 -0700
@@ -1334,6 +1334,36 @@
 }
 
 /**
+ * skb_checksum_recalc - force software checksum
+ * @skb: skb to process
+ * Force complete checksum, this is used to force a software checksum
+ * on the outgoing path.
+ */
+int skb_checksum_recalc(struct sk_buff *skb)
+{
+	unsigned int csum;
+	int ret = 0, offset = skb->h.raw - skb->data;
+
+	if (skb_cloned(skb)) {
+		ret = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
+		if (ret)
+			goto out;
+	}
+
+	BUG_ON(offset > (int)skb->len);
+	csum = skb_checksum(skb, offset, skb->len-offset, 0);
+
+	offset = skb->tail - skb->h.raw;
+	BUG_ON(offset <= 0);
+	BUG_ON(skb->csum + 2 > offset);
+
+	*(u16*)(skb->h.raw + skb->csum) = csum_fold(csum);
+	skb->ip_summed = CHECKSUM_NONE;
+out:
+	return ret;
+}
+
+/**
  * skb_dequeue - remove from the head of the queue
  * @list: list to dequeue from
  *
@@ -1854,6 +1884,7 @@
 EXPORT_SYMBOL(pskb_copy);
 EXPORT_SYMBOL(pskb_expand_head);
 EXPORT_SYMBOL(skb_checksum);
+EXPORT_SYMBOL(skb_checksum_recalc);
 EXPORT_SYMBOL(skb_clone);
 EXPORT_SYMBOL(skb_clone_fraglist);
 EXPORT_SYMBOL(skb_copy);
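
For readers following the checksum arithmetic: both the removed and the added function finish by folding the 32-bit running sum into 16 bits with csum_fold(). Below is a user-space sketch of that fold, written from the usual ones'-complement definition rather than copied from any architecture's implementation. The name fold16 is hypothetical.

```c
#include <stdint.h>

/* Fold a 32-bit ones'-complement sum into 16 bits and invert it. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);	/* add the carries back in */
	sum = (sum & 0xffff) + (sum >> 16);	/* second pass absorbs any new carry */
	return (uint16_t)~sum;
}
```

The second pass matters when the first addition itself overflows 16 bits, for example with an input of 0x0001ffff.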

[PATCH] sky2: prevent dual port receiver problems

2006-05-11 Thread Stephen Hemminger

When both ports are receiving simultaneously, the receive logic gets confused
and may pass up a packet before it is full. This causes hangs, and IP will see
lots of garbage packets. There is even the potential for data corruption if 
a later arriving packet DMA's into freed memory. 

It looks like a hardware bug because status arrives for a packet but no
data is there. Until this bug is worked out, block the user from bringing
up both ports at once.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- sky2.orig/drivers/net/sky2.c
+++ sky2/drivers/net/sky2.c
@@ -1020,8 +1020,19 @@ static int sky2_up(struct net_device *de
 	struct sky2_hw *hw = sky2->hw;
 	unsigned port = sky2->port;
 	u32 ramsize, rxspace, imask;
-	int err = -ENOMEM;
+	int err;
+	struct net_device *otherdev = hw->dev[sky2->port^1];
 
+	/* Block bringing up both ports at the same time on a dual port card.
+	 * There is an unfixed bug where receiver gets confused and picks up
+	 * packets out of order. Until this is fixed, prevent data corruption.
+	 */
+	if (otherdev && netif_running(otherdev)) {
+		printk(KERN_INFO PFX "dual port support is disabled.\n");
+		return -EBUSY;
+	}
+
+	err = -ENOMEM;
 	if (netif_msg_ifup(sky2))
 		printk(KERN_INFO PFX "%s: enabling interface\n", dev->name);
 

Re: skge driver oops

2006-05-12 Thread Stephen Hemminger

On Fri, 12 May 2006 11:36:24 +1000
David Arnold [EMAIL PROTECTED] wrote:

 i've been getting semi-regular lockups on my machine over 2.6.16
 series.  I recently attached a serial console in an attempt to capture
 an OOPS.
 
 i got one yesterday.  it's copied manually from the console, but
 hopefully the values are all accurate.  there was more that had scrolled
 off screen above this too (sorry).
 
 oops, lspci, uname -a, .config and dmesg below.
 
 any suggestions for further debugging would be great,
 
 thanks,

Could you retest with the v1.5 version that is in 2.6.17-rc3?

Re: ixp2000: handle enp2611s with two gigabit ports

2006-05-15 Thread Stephen Hemminger

On Thu, 27 Apr 2006 00:24:11 +0200
Lennert Buytenhek [EMAIL PROTECTED] wrote:

 The ixp2000 driver for the enp2611 was developed on a board with
 three gigabit ports, but some enp2611 models only have two ports
 (and only one onboard PM3386.)  The current driver assumes there
 are always three ports and so it doesn't work on the two-port
 version of the board at all.
 
 This patch adds a bit of logic to the enp2611 driver to limit the
 number of ports to 2 if the second PM3386 isn't detected.
 
 Signed-off-by: Lennert Buytenhek [EMAIL PROTECTED]

This patch got mangled; that is probably why Jeff didn't apply it before
he left. I had to fix it manually.

patching file drivers/net/ixp2000/enp2611.c
patch:  malformed patch at line 106:
module_init(enp2611_init_module);

In this part...

@@ -236,8 +240,10 @@
del_timer_sync(&link_check_timer);
 
ixpdev_deinit();
-   for (i = 0; i < 3; i++)
-   free_netdev(nds[i]);
+   for (i = 0; i < 3; i++) {
+   if (nds[i] != NULL)
free_netdev(nds[i]);
+   }
 }
 
 module_init(enp2611_init_module);

[PATCH] net_sched: potential jiffy wrap bug in dev_watchdog

2006-05-15 Thread Stephen Hemminger

There is a potential jiffy wraparound bug in the transmit watchdog
that is easily avoided by using time_after().

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- linux-2.6.orig/net/sched/sch_generic.c
+++ linux-2.6/net/sched/sch_generic.c
@@ -193,8 +193,10 @@ static void dev_watchdog(unsigned long a
 		    netif_running(dev) &&
 		    netif_carrier_ok(dev)) {
 			if (netif_queue_stopped(dev) &&
-			    (jiffies - dev->trans_start) > dev->watchdog_timeo) {
-				printk(KERN_INFO "NETDEV WATCHDOG: %s: transmit timed out\n", dev->name);
+			    time_after(jiffies, dev->trans_start + dev->watchdog_timeo)) {
+
+				printk(KERN_INFO "NETDEV WATCHDOG: %s: transmit timed out\n",
+				       dev->name);
 				dev->tx_timeout(dev);
 			}
 			if (!mod_timer(&dev->watchdog_timer, jiffies + dev->watchdog_timeo))
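
For readers unfamiliar with the idiom, here is a minimal userspace sketch of
the wrap-safe comparison that time_after() performs. The names tick_t,
tick_after(), elapsed_naive() and elapsed_safe() are made up for this
illustration and the sample values are arbitrary; the kernel's own macro
lives in linux/jiffies.h and additionally type-checks its arguments.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned long tick_t;	/* stand-in for the type of jiffies */

/* Wrap-safe "a is later than b": look at the sign of the difference. */
static int tick_after(tick_t a, tick_t b)
{
	return (long)(b - a) < 0;
}

/* Old style of test: compare the unsigned difference with the timeout. */
static int elapsed_naive(tick_t now, tick_t start, tick_t timeout)
{
	return (now - start) > timeout;
}

/* New style of test: is "now" past the deadline start + timeout? */
static int elapsed_safe(tick_t now, tick_t start, tick_t timeout)
{
	return tick_after(now, start + timeout);
}
```

Both forms agree in the ordinary case, and elapsed_safe() keeps working when
the deadline crosses the counter wrap (for example start = ULONG_MAX - 10,
timeout = 50). One case where they disagree is a start stamp that is
slightly ahead of the sampled "now": the unsigned difference becomes huge
and elapsed_naive(1000, 1001, 50) reports a timeout, while
elapsed_safe(1000, 1001, 50) does not.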

[PATCH 1/2] skge: bad checksums on big-endian platforms

2006-05-15 Thread Stephen Hemminger

The skge driver always causes bad checksums on big-endian.
The checksum in the receive control block was being swapped
when it doesn't need to be.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- skge-2.6.orig/drivers/net/skge.c
+++ skge-2.6/drivers/net/skge.c
@@ -2717,8 +2717,7 @@ static int skge_poll(struct net_device *
 		if (control & BMU_OWN)
 			break;
 
-		skb = skge_rx_get(skge, e, control, rd->status,
-				  le16_to_cpu(rd->csum2));
+		skb = skge_rx_get(skge, e, control, rd->status, rd->csum2);
 		if (likely(skb)) {
 			dev->last_rx = jiffies;
 			netif_receive_skb(skb);
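
A small userspace sketch of why an unneeded le16_to_cpu() only shows up on
big-endian machines: on a little-endian CPU the conversion leaves the value
alone, on a big-endian CPU it swaps the two bytes. model_le16_to_cpu() and
its big_endian_cpu flag are hypothetical stand-ins used here to show both
behaviours on one host; they are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Unconditional 16-bit byte swap. */
static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Model of le16_to_cpu(): identity on a little-endian CPU,
 * byte swap on a big-endian one.
 */
static uint16_t model_le16_to_cpu(uint16_t v, int big_endian_cpu)
{
	return big_endian_cpu ? swab16(v) : v;
}
```

If the field is already in the form the caller needs, as the patch
description says of the receive checksum, then model_le16_to_cpu(0x1234, 0)
still yields 0x1234, which hides the bug on little-endian hosts, while
model_le16_to_cpu(0x1234, 1) yields 0x3412 and the checksum no longer
verifies.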

[PATCH 2/2] skge: don't allow transmit ring to be too small

2006-05-15 Thread Stephen Hemminger

The driver will get stuck (permanent transmit timeout), if the transmit
ring size is set too small.  It needs to have enough ring elements to
hold one maximum size transmit.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- skge-2.6.orig/drivers/net/skge.c
+++ skge-2.6/drivers/net/skge.c
@@ -402,7 +402,7 @@ static int skge_set_ring_param(struct ne
 	int err;
 
 	if (p->rx_pending == 0 || p->rx_pending > MAX_RX_RING_SIZE ||
-	    p->tx_pending == 0 || p->tx_pending > MAX_TX_RING_SIZE)
+	    p->tx_pending < MAX_SKB_FRAGS+1 || p->tx_pending > MAX_TX_RING_SIZE)
 		return -EINVAL;
 
 	skge->rx_ring.count = p->rx_pending;
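
The new bounds check can be tried out in isolation with the sketch below.
check_ring_param() is a hypothetical helper, and the three constants are
placeholder values assumed for illustration only; the real ones come from
the kernel headers and skge.c. The lower bound of MAX_SKB_FRAGS + 1
presumably covers one descriptor for the linear part of a packet plus one
per page fragment.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder values, assumed for this example only. */
#define MAX_SKB_FRAGS		18
#define MAX_RX_RING_SIZE	4096
#define MAX_TX_RING_SIZE	1024

/* Same test as the patched skge_set_ring_param(), on plain integers. */
static int check_ring_param(unsigned int rx_pending, unsigned int tx_pending)
{
	if (rx_pending == 0 || rx_pending > MAX_RX_RING_SIZE ||
	    tx_pending < MAX_SKB_FRAGS + 1 || tx_pending > MAX_TX_RING_SIZE)
		return -EINVAL;
	return 0;
}
```

With these placeholder numbers a transmit ring of 18 entries is rejected
and one of 19 is accepted, whereas the old test accepted anything from 1
upwards.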

Re: send(), sendmsg(), sendto() not thread-safe

2006-05-15 Thread Stephen Hemminger

On Mon, 15 May 2006 16:17:48 -0700
Rick Jones [EMAIL PROTECTED] wrote:

 David S. Miller wrote:
  From: Mark A Smith [EMAIL PROTECTED]
  Date: Mon, 15 May 2006 14:39:06 -0700

 I discovered that in some cases, send(), sendmsg(), and sendto() are not
 thread-safe. Although the man page for these functions does not specify
 whether these functions are supposed to be thread-safe, my reading of the
 POSIX/SUSv3 specification tells me that they should be. I traced the
 problem to tcp_sendmsg(). I was very curious about this issue, so I wrote
 up a small page to describe in more detail my findings. You can find it at:
 http://www.almaden.ibm.com/cs/people/marksmith/sendmsg.html .

 # ./sendmsgclient localhost
 ERROR! We should have all 0! We don't!
 buff[16384]=1
 buff[16385]=1
 buff[16386]=1
 buff[16387]=1
 buff[16388]=1
 buff[16389]=1
 buff[16390]=1
 buff[16391]=1
 buff[16392]=1
 buff[16393]=1
 That's 10/32768 bad bytes
 # uname -a
 HP-UX tarry B.11.23 U ia64 2397028692 unlimited-user license

 Given that the URL above asserts that HP-UX claims atomicity, either 
 there is a bug in the UX stack, or perhaps the test?  I took a quick 
 look at the HP-UX 11iv2 (aka 11.23) manpage for sendmsg and didn't see 
 anything about atomicity there - on which manpage(s) or docs was the 
 assertion of HP-UX atomicity made?

 I presume this is only for blocking sockets?  I cannot at least off 
 the top of my head see how a stack could offer it on non-blocking sockets.

The test seems to be based on sending a big message. In this case,
on non-blocking sockets, the send call will return partial status. The
return from the system call will be less than the number of bytes requested.
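
As an illustration of the partial return described above, here is a minimal
sketch of a caller that keeps sending until the whole buffer is written.
send_all() is a hypothetical helper name, and waiting with poll() is just
one way to handle EAGAIN; it is not taken from the test program under
discussion.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send all len bytes on a (possibly non-blocking) stream socket.
 * Returns 0 on success, -1 on error with errno set.
 */
static int send_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = send(fd, p, len, MSG_NOSIGNAL);

		if (n >= 0) {
			/* Possibly a partial send: advance and retry. */
			p += n;
			len -= (size_t)n;
			continue;
		}
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			struct pollfd pfd = { .fd = fd, .events = POLLOUT };

			/* Wait until the socket buffer has room again. */
			if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
				return -1;
			continue;
		}
		return -1;
	}
	return 0;
}
```

A loop like this does nothing for atomicity between threads: two threads
calling it on the same socket can still interleave their data, so callers
that need whole messages kept together would have to add their own lock.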

  And frankly, BSD defines BSD socket semantics here not some wording in
  the POSIX standards.

 Have BSD socket semantics ever been updated/clarified in any
 quasi-official manner since the popular presence of threads?  Or
 are/were Posix/Xopen filling a gap?

 rick jones

Re: [PATCH] tcpdump may trace some outbound packets twice.

2006-05-15 Thread Stephen Hemminger

On Mon, 15 May 2006 16:11:05 -0700 (PDT)
Ranjit Manomohan [EMAIL PROTECTED] wrote:

 On Mon, 15 May 2006, David S. Miller wrote:

  From: Ranjit Manomohan [EMAIL PROTECTED]
  Date: Mon, 15 May 2006 14:19:06 -0700 (PDT)

   Here's a new version which does a copy instead of the clone to avoid
   the double cloning issue.

  I still very much dislike this patch because it is creating
  1 more clone per packet than is actually necessary and that
  is very expensive.

  dev_queue_xmit_nit() is going to clone whatever SKB you send into
  there, so better to just bump the reference count (with skb_get())
  instead of cloning or copying.

 I was a bit apprehensive about just incrementing the refcnt but that works 
 too. Attached is the modified version.

 -Thanks,
 Ranjit

 --- linux-2.6/net/sched/sch_generic.c 2006-05-10 12:34:52.0 -0700
 +++ linux/net/sched/sch_generic.c 2006-05-15 15:48:03.0 -0700
 @@ -136,8 +136,12 @@

   if (!netif_queue_stopped(dev)) {
   int ret;
 + struct sk_buff *skbc = NULL;
 + /* Increment the reference count on the skb so
 +  * that we can use it after a successful xmit.
 +  */
   if (netdev_nit)
 - dev_queue_xmit_nit(skb, dev);
 + skbc = skb_get(skb);

skbc = netdev_nit ? skb_get(skb) : NULL;

   ret = dev->hard_start_xmit(skb, dev);
   if (ret == NETDEV_TX_OK) {
 @@ -145,9 +149,20 @@
   dev->xmit_lock_owner = -1;
   spin_unlock(&dev->xmit_lock);
   }
 + if (skbc) {
 + /* transmit succeeded, 
 +  * trace the buffer. */
 + dev_queue_xmit_nit(skbc,dev);
 + kfree_skb(skbc);
 + }
   spin_lock(&dev->queue_lock);
   return -1;
   }
 +
 + /* Call free in case we incremented refcnt */
 + if (skbc)
 + kfree_skb(skbc);

kfree_skb(NULL) is legal so the conditional here is unneeded.

But the increased calls to kfree_skb(NULL) would probably bring the
unlikely() hordes descending on kfree_skb, so maybe:

if (unlikely(netdev_nit))
	kfree_skb(skbc);
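
The reference counting pattern being discussed can be modelled in userspace
as follows. This is only an analogy under stated assumptions: struct buf,
buf_alloc(), buf_get(), buf_put() and xmit_and_trace() are hypothetical
stand-ins for the skb, skb_get(), kfree_skb() and the transmit path, and
the frees counter exists purely so the behaviour can be observed.

```c
#include <assert.h>
#include <stdlib.h>

struct buf {
	int users;	/* reference count */
};

static int frees;	/* number of real frees, for demonstration */

static struct buf *buf_alloc(void)
{
	struct buf *b = malloc(sizeof(*b));

	if (b)
		b->users = 1;
	return b;
}

/* Take an extra reference instead of cloning or copying. */
static struct buf *buf_get(struct buf *b)
{
	b->users++;
	return b;
}

/* Drop a reference; a NULL pointer is accepted and ignored. */
static void buf_put(struct buf *b)
{
	if (!b)
		return;
	if (--b->users == 0) {
		free(b);
		frees++;
	}
}

/* Model of the transmit path; "tracing" plays the role of netdev_nit. */
static void xmit_and_trace(struct buf *b, int tracing)
{
	struct buf *held = tracing ? buf_get(b) : NULL;

	buf_put(b);	/* the "driver" consumes its reference */
	/* a packet tap would look at "held" here */
	buf_put(held);	/* safe without a NULL check */
}
```

In both the traced and the untraced case the buffer is freed exactly once,
and the final buf_put() needs no conditional because the NULL case is
handled inside it.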


Re: [PATCH] Enabling standard compliant behaviour in the Linux TCP implementation

2006-05-16 Thread Stephen Hemminger

On Tue, 16 May 2006 16:24:22 +0200
Angelo P. Castellani [EMAIL PROTECTED] wrote:

 Hi all,
 I'm a student doing a thesis about TCP performance over high BDP links
 and so about congestion control in TCP.
 
 To do this work I've built a testbed using the latest Linux release (2.6.16).
 
 Anyway I've come across the fact that the Linux TCP implementation isn't
 fully standard compliant.
 
 Even if the choices made to differ from the standards have been
 well thought out, I think that it should be possible to disable these
 Linuxisms.
 
 Surely this can help all the people using Linux to evaluate a
 standard environment.
 
 Moreover it permits comparing the pros & cons of the Linux
 implementation against the standard one.
 
 So I've disabled the first two Linux-specific mechanisms I've found:
 - rate halving
 - dynamic reordering metric (dynamic DupThresh)
 
 These're disabled as long as net.ipv4.tcp_standard_compliant=1 (default: 0).
 
 However I don't exclude that there are more non-standard details, so I
 hope that somebody can point out some more differences between Linux and
 the RFCs.
 
 Moreover NewReno is implemented in the Impatient variant (resets the
 retransmit timer only on the first partial ack), with
 net.ipv4.tcp_slow_but_steady=1 (default: 0) you can enable the
 Slow-but-Steady variant (resets the retransmit timer every partial
 ack).
 
 Hoping that this can be useful, I attach the patch.
 
 Regards,
 Angelo P. Castellani


Read Linus's comments on standards. We make software for users, not for
academic use.
http://kerneltrap.org/node/5725

If we added this then paranoid users would set it.

The Reno thing seems okay, if the default is the same as the original
behavior, but it makes one more test case to try.

Re: ifIndex allocation

2006-05-16 Thread Stephen Hemminger

On Tue, 16 May 2006 08:11:01 +0200
Sven Schnelle [EMAIL PROTECTED] wrote:

 Hi List,

Redirecting to netdev

 
 investigating a problem with an snmp software for linux, i was wondering
 why the kernel allocates a new ifindex Number, even if the old one is still
 available. For example, if i unload a network driver module, and reload
 it, it has a different ifindex.

Because when you reload the driver it is effectively a completely new
object.  ifindex is basically an object id. Ifindices act as soft
references so user space can know about a particular network device
even if the name changes or other operations happen.  The reference is soft
because the device can disappear. If the application wants to know about
device removal it can catch the netlink event.
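
A rough sketch of what catching that event can look like from user space,
using the rtnetlink link multicast group. open_link_monitor() and
first_removed_ifindex() are hypothetical helper names; the constants and
macros are the ones from linux/netlink.h and linux/rtnetlink.h. Error
handling is kept to a minimum.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Open a routing netlink socket subscribed to link notifications.
 * Returns the descriptor, or -1 on error.
 */
static int open_link_monitor(void)
{
	struct sockaddr_nl sa;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	memset(&sa, 0, sizeof(sa));
	sa.nl_family = AF_NETLINK;
	sa.nl_groups = RTMGRP_LINK;
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Scan one received datagram. Returns the ifindex of the first
 * removed device, or 0 if it contains no RTM_DELLINK message.
 */
static int first_removed_ifindex(void *buf, int len)
{
	struct nlmsghdr *nh;

	for (nh = buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
		struct ifinfomsg *ifi;

		if (nh->nlmsg_type != RTM_DELLINK)
			continue;
		if (nh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifi)))
			continue;	/* too short to carry ifinfomsg */
		ifi = NLMSG_DATA(nh);
		return ifi->ifi_index;
	}
	return 0;
}
```

An application would recv() on the descriptor from open_link_monitor() in
its event loop, pass each datagram to first_removed_ifindex(), and drop any
cached state for the returned index.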


 Looking at the function dev_new_index (line 2620 in net/core/dev.c)
 there is a line 'static int ifindex'. Is there any special reason why
 this variable is static, and the list is not traversed from the
 beginning, so that the first free ifindex will be used?
 
 
 Best regards,
 
 Sven.

It is static because no other code should be looking at it.
It doesn't retraverse from the start because it doesn't want to reuse
an earlier index and confuse an application with a soft reference.
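
The allocation policy described above can be modelled in a few lines of
userspace C. This is a sketch of the idea, not the kernel's
dev_new_index(); the fixed-size in_use table and the dev_register() and
dev_unregister() helpers are assumptions made so the example is
self-contained.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_DEVS 8

static int in_use[MAX_DEVS];	/* ifindex of each live device, 0 = empty */

static bool index_in_use(int ifindex)
{
	for (int i = 0; i < MAX_DEVS; i++)
		if (in_use[i] == ifindex)
			return true;
	return false;
}

/*
 * Hand out the next index after the last one issued, skipping any
 * still held by a live device. Lower numbers that became free are
 * not reused until the counter wraps.
 */
static int new_index(void)
{
	static int last;	/* private to this function, kept across calls */

	for (;;) {
		last = (last == INT_MAX) ? 1 : last + 1;
		if (!index_in_use(last))
			return last;
	}
}

/* Returns the new device's ifindex, or -1 if the table is full. */
static int dev_register(void)
{
	for (int i = 0; i < MAX_DEVS; i++) {
		if (in_use[i] == 0) {
			in_use[i] = new_index();
			return in_use[i];
		}
	}
	return -1;
}

static void dev_unregister(int ifindex)
{
	for (int i = 0; i < MAX_DEVS; i++)
		if (in_use[i] == ifindex)
			in_use[i] = 0;
}
```

Registering two devices, removing the first and registering a third yields
indices 1, 2 and 3: index 1 stays unused, so an application still holding
it as a soft reference cannot mistake the new device for the old one.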

Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-05-16 Thread Stephen Hemminger

Could you try the 2.6.17-rc4 version with this patch? It turns out the board
seems to give out-of-order status responses.

Ignore the vendor sk98lin driver; when I try the stock version it spends its
life resetting itself because it sets up the PCI bus wrong. If I fix that, it
spends its time getting confused because it can't handle intermixed status
reports properly (checksum et al. is per port, not per board).


 drivers/net/sky2.c |   28 +++++++++++++++++++++-------
 1 files changed, 21 insertions(+), 7 deletions(-)

792547bc5e8e4f7d5a1070a168056f429635c254
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index ffd267f..11e7914 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1020,8 +1020,27 @@ static int sky2_up(struct net_device *de
 	struct sky2_hw *hw = sky2->hw;
 	unsigned port = sky2->port;
 	u32 ramsize, rxspace, imask;
-	int err = -ENOMEM;
+	int cap, err;
+	struct net_device *otherdev = hw->dev[sky2->port^1];
 
+	/*
+	 * Reduce split transactions (and turn off) rx checksums to
+	 * prevent problems with dual ports.
+	 */
+	if (otherdev && netif_running(otherdev) &&
+	    (cap = pci_find_capability(hw->pdev, PCI_CAP_ID_PCIX))) {
+		struct sky2_port *osky2 = netdev_priv(otherdev);
+		u16 cmd;
+
+		cmd = sky2_pci_read16(hw, cap + PCI_X_CMD);
+		cmd &= ~PCI_X_CMD_MAX_SPLIT;
+		sky2_pci_write16(hw, cap + PCI_X_CMD, cmd);
+
+		sky2->rx_csum = 0;
+		osky2->rx_csum = 0;
+	}
+
+	err = -ENOMEM;
 	if (netif_msg_ifup(sky2))
 		printk(KERN_INFO PFX "%s: enabling interface\n", dev->name);
 
@@ -3067,12 +3086,7 @@ static __devinit struct net_device *sky2
 	sky2->duplex = -1;
 	sky2->speed = -1;
 	sky2->advertising = sky2_supported_modes(hw);
-
-	/* Receive checksum disabled for Yukon XL
-	 * because of observed problems with incorrect
-	 * values when multiple packets are received in one interrupt
-	 */
-	sky2->rx_csum = (hw->chip_id != CHIP_ID_YUKON_XL);
+	sky2->rx_csum = 1;
 
 	spin_lock_init(&sky2->phy_lock);
 	sky2->tx_pending = TX_DEF_PENDING;
-- 
1.2.4


Re: skge driver oops

2006-05-16 Thread Stephen Hemminger

On Fri, 12 May 2006 11:36:24 +1000
David Arnold [EMAIL PROTECTED] wrote:

 i've been getting semi-regular lockups on my machine over 2.6.16
 series.  I recently attached a serial console in an attempt to capture
 an OOPS.
 
 i got one yesterday.  it's copied manually from the console, but
 hopefully the values are all accurate.  there was more that had scrolled
 off screen above this too (sorry).
 
 oops, lspci, uname -a, .config and dmesg below.
 
 any suggestions for further debugging would be great,
 
 thanks,

I tried reproducing this and can't seem to cause it.
Are you running anything special that could influence this?
  bridging, VLANs, bonding, netfilter, queueing disciplines,
  tc filters, ...


What is the output of /proc/interrupts? Perhaps the devices don't like
sharing an IRQ.