date:20070717

[PATCH 1/2] Fix error checking in Vitesse IRQ config

2007-07-17 Thread Andy Fleming

phy_read() returns a negative number if there's an error, but the
error-checking code in the Vitesse driver's config_intr function
triggers if phy_read() returns non-zero.  Correct that.

Signed-off-by: Andy Fleming <[EMAIL PROTECTED]>
---
I made a really stupid mistake in the 4 patches I sent out, earlier.  I
thought those patches had been tested, but they hadn't been.  This one
corrects a tiny error in the patch, and they have now been tested.  As before
this change can be pulled from:

http://opensource.freescale.com/pub/scm/linux-2.6-85xx.git netdev

Really, REALLY sorry about that.  I have been given a paper bag of appropriate
size and shape to fit over my head.

 drivers/net/phy/vitesse.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/phy/vitesse.c b/drivers/net/phy/vitesse.c
index 6a53856..8874497 100644
--- a/drivers/net/phy/vitesse.c
+++ b/drivers/net/phy/vitesse.c
@@ -109,7 +109,7 @@ static int vsc824x_config_intr(struct phy_device *phydev)
 */
err = phy_read(phydev, MII_VSC8244_ISTAT);
 
-   if (err)
+   if (err < 0)
return err;
 
err = phy_write(phydev, MII_VSC8244_IMASK, 0);
-- 
1.5.0.2.230.gfbe3d-dirty
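
For readers following along, the difference between the two checks in the patch above can be sketched in plain userspace C. This is only an illustration: mock_phy_read() and the two config_intr_*() helpers are hypothetical stand-ins, not the driver's code. The sketch assumes only the convention stated in the commit message, that a negative return means an error and anything else is a register value.

```c
#include <errno.h>

/* Hypothetical stand-in for phy_read(): returns the 16-bit register
 * value on success, or a negative errno when the bus access fails. */
static int mock_phy_read(int fail, int reg_value)
{
	if (fail)
		return -EIO;
	return reg_value & 0xffff;
}

/* Old check: any non-zero register value is mistaken for an error. */
static int config_intr_buggy(int fail, int reg_value)
{
	int err = mock_phy_read(fail, reg_value);

	if (err)
		return err;	/* a set status bit ends up here too */
	return 0;
}

/* Corrected check: only negative values are treated as errors. */
static int config_intr_fixed(int fail, int reg_value)
{
	int err = mock_phy_read(fail, reg_value);

	if (err < 0)
		return err;
	return 0;
}
```

With a register value such as 0x8000, the old check bails out early and returns the register contents as if they were an error code, while the corrected check carries on.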

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Realtek RTL8111B serious performance issues

2007-07-17 Thread john




Hi,

I originally sent this email to the linux-net list before realizing it
probably belonged on the netdev list.

I just subscribed to this list, so I apologize if this is a known issue.  I
did try looking through the archives, and did not see it there either.

We just put together a new "app server" based on a P35 chipset motherboard,
4 gigabytes of RAM, Q6600 processor, and integrated Realtek RTL8111B gigabit
NIC.  When we SSH or RSH into this machine, and try to run any X application
(emacs, firefox) the application's graphics are drawn *extremely* slowly.
It can take 10 seconds from the time an emacs window pops up until it is
done drawing all of its icons.

Firefox is even worse.  Loading pages is painful.  The "spinning dots", in the
upper right-hand corner, never actually spin.  It takes a long time for a
page to be displayed, and when it is drawn, it is all-at-once.  Scrolling a
page up/down is extremely jerky.

We are currently running kernel 2.6.22.1, but I have also tried going back
to 2.6.20.x without any change in behavior.

The NIC driver is loaded as:

kernel: eth0: RTL8168b/8111b at 0xc264, 00:1a:4d:43:db:d4, IRQ 17

I tried going to Realtek's site to see if there was a newer driver, but the
only driver there seems to be for older kernels.

I finally put an old Linksys 10/100 PCI NIC in the system, and that has
SOLVED the problem.  We would prefer using the integrated NIC, however.


04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI 
Express Gigabit Ethernet controller (rev 01)
 Subsystem: Giga-byte Technology Unknown device e000
 Flags: bus master, fast devsel, latency 0, IRQ 17
 I/O ports at c000 [size=256]
 Memory at f800 (64-bit, non-prefetchable) [size=4K]
 [virtual] Expansion ROM at fb20 [disabled] [size=64K]
 Capabilities: [40] Power Management version 2
 Capabilities: [48] Vital Product Data
 Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ 
Queue=0/1 Enable-
 Capabilities: [60] Express Endpoint IRQ 0
 Capabilities: [84] Vendor Specific Information
 Capabilities: [100] Advanced Error Reporting
 Capabilities: [12c] Virtual Channel
 Capabilities: [148] Device Serial Number 68-81-ec-10-00-00-00-25
 Capabilities: [154] Power Budgeting

Anyone have any suggestions for solving this problem?

Thanks,

John


--

| |
+--+  ==  |  John Patrick Poet Blue Sky Tours
|  |  |  Director of Systems Development   10832 Prospect Ave., N.E.
| +---+  [EMAIL PROTECTED] Albuquerque, N.M. 87112
| |  Ph. 505 293 9462  Fx. 505 293 6902

tc filter add ... fw ... action drop

2007-07-17 Thread Abhijit Menon-Sen

Hi.

Is it a bug that:

  # tc filter add dev eth0 parent 1: protocol ip prio 0 handle 0xfff
      fw police rate 1 burst 1 mpu 0 mtu 1 action drop
                                           ^^^^^^^^^^^
creates a filter that looks like:

  # tc filter ls dev eth0
  filter parent 1: protocol ip pref 49152 fw
  filter parent 1: protocol ip pref 49152 fw handle 0xfff police 0x1
      rate 0bit burst 0b mtu 1b action reclassify
                                ^^^^^^^^^^^^^^^^^
      ref -543190236 bind 4

(which reclassifies and thus lets 0xfff-marked packets through).

I'm pretty sure this used to work under 2.4.x (though I no longer have a
2.4 box to test with), but it hasn't worked on any of the 2.6.x kernels
I've tried (with both iproute2-ss060323 and 070710).

I haven't been able to find anything that suggests this change is
intentional. If it's not immediately obvious to anyone what the
problem is, I could try to track it down.

-- ams

Re: Please pull 'upstream-davem' branch of wireless-2.6

2007-07-17 Thread David Miller

From: "John W. Linville" <[EMAIL PROTECTED]>
Date: Tue, 17 Jul 2007 22:16:07 -0400

> A few more for 2.6.23...individual patches available here:
> 
>   
> http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/upstream-davem

What about this warning which I reported to you right after the last
merge?  Did this get fixed?

net/mac80211/ieee80211.c:4989: warning: comparison of distinct pointer types 
lacks a cast

Please fix that up first, then I'll pull from your tree.

Re: [PATCH] fix wrong argument of tc35815_read_plat_dev_addr()

2007-07-17 Thread Atsushi Nemoto

On Wed, 18 Jul 2007 11:13:42 +0900, Yoichi Yuasa <[EMAIL PROTECTED]> wrote:
> Fix wrong argument of tc35815_read_plat_dev_addr()

Oh my fault!  Thanks!

Acked-by: Atsushi Nemoto <[EMAIL PROTECTED]>

[PATCH 3/3] [net/core] move __dev_addr_discard adjacent to dev_addr_discard for readability

2007-07-17 Thread Denis Cheng

Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>
---
 net/core/dev.c |   28 ++--
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 17c9cbd..6357f54 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2715,20 +2715,6 @@ int __dev_addr_add(struct dev_addr_list **list, int 
*count,
return 0;
 }
 
-static void __dev_addr_discard(struct dev_addr_list **list)
-{
-   struct dev_addr_list *tmp;
-
-   while (*list != NULL) {
-   tmp = *list;
-   *list = tmp->next;
-   if (tmp->da_users > tmp->da_gusers)
-   printk("__dev_addr_discard: address leakage! "
-  "da_users=%d\n", tmp->da_users);
-   kfree(tmp);
-   }
-}
-
 /**
  * dev_unicast_delete  - Release secondary unicast address.
  * @dev: device
@@ -2777,6 +2763,20 @@ int dev_unicast_add(struct net_device *dev, void *addr, 
int alen)
 }
 EXPORT_SYMBOL(dev_unicast_add);
 
+static void __dev_addr_discard(struct dev_addr_list **list)
+{
+   struct dev_addr_list *tmp;
+
+   while (*list != NULL) {
+   tmp = *list;
+   *list = tmp->next;
+   if (tmp->da_users > tmp->da_gusers)
+   printk("__dev_addr_discard: address leakage! "
+  "da_users=%d\n", tmp->da_users);
+   kfree(tmp);
+   }
+}
+
 static void dev_addr_discard(struct net_device *dev)
 {
netif_tx_lock_bh(dev);
-- 
1.5.2.2


[PATCH 1/3] [net/core] move dev_mc_discard from dev_mcast.c to dev.c

2007-07-17 Thread Denis Cheng

Because dev_mc_discard is only called by unregister_netdevice,
moving it into dev.c lets this function become static
and removes its declaration from netdevice.h.

In addition, __dev_addr_discard is only called by
dev_mc_discard and dev_unicast_discard, so keeping these two functions
in one C file lets __dev_addr_discard become static as well
and removes its declaration from netdevice.h.

Furthermore, the sequential calls to dev_unicast_discard and then
dev_mc_discard in unregister_netdevice follow the same pattern
(netif_tx_lock_bh / __dev_addr_discard / netif_tx_unlock_bh);
they should be merged into one to avoid acquiring and releasing
dev->_xmit_lock twice.  That is done in my following patch.

Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>
---
 include/linux/netdevice.h |2 --
 net/core/dev.c|   14 +-
 net/core/dev_mcast.c  |   12 
 3 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index da7a13c..9820ca1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1098,10 +1098,8 @@ extern int   dev_mc_delete(struct net_device 
*dev, void *addr, int alen, int all
 extern int dev_mc_add(struct net_device *dev, void *addr, int 
alen, int newonly);
 extern int dev_mc_sync(struct net_device *to, struct net_device 
*from);
 extern voiddev_mc_unsync(struct net_device *to, struct net_device 
*from);
-extern voiddev_mc_discard(struct net_device *dev);
 extern int __dev_addr_delete(struct dev_addr_list **list, int 
*count, void *addr, int alen, int all);
 extern int __dev_addr_add(struct dev_addr_list **list, int *count, 
void *addr, int alen, int newonly);
-extern void__dev_addr_discard(struct dev_addr_list **list);
 extern voiddev_set_promiscuity(struct net_device *dev, int inc);
 extern voiddev_set_allmulti(struct net_device *dev, int inc);
 extern voidnetdev_state_change(struct net_device *dev);
diff --git a/net/core/dev.c b/net/core/dev.c
index 13a0d9f..3ba63aa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2715,7 +2715,7 @@ int __dev_addr_add(struct dev_addr_list **list, int 
*count,
return 0;
 }
 
-void __dev_addr_discard(struct dev_addr_list **list)
+static void __dev_addr_discard(struct dev_addr_list **list)
 {
struct dev_addr_list *tmp;
 
@@ -2785,6 +2785,18 @@ static void dev_unicast_discard(struct net_device *dev)
netif_tx_unlock_bh(dev);
 }
 
+/*
+ * Discard multicast list when a device is downed
+ */
+
+static void dev_mc_discard(struct net_device *dev)
+{
+   netif_tx_lock_bh(dev);
+   __dev_addr_discard(&dev->mc_list);
+   dev->mc_count = 0;
+   netif_tx_unlock_bh(dev);
+}
+
 unsigned dev_get_flags(const struct net_device *dev)
 {
unsigned flags;
diff --git a/net/core/dev_mcast.c b/net/core/dev_mcast.c
index 235a2a8..99aece1 100644
--- a/net/core/dev_mcast.c
+++ b/net/core/dev_mcast.c
@@ -177,18 +177,6 @@ void dev_mc_unsync(struct net_device *to, struct 
net_device *from)
 }
 EXPORT_SYMBOL(dev_mc_unsync);
 
-/*
- * Discard multicast list when a device is downed
- */
-
-void dev_mc_discard(struct net_device *dev)
-{
-   netif_tx_lock_bh(dev);
-   __dev_addr_discard(&dev->mc_list);
-   dev->mc_count = 0;
-   netif_tx_unlock_bh(dev);
-}
-
 #ifdef CONFIG_PROC_FS
 static void *dev_mc_seq_start(struct seq_file *seq, loff_t *pos)
 {
-- 
1.5.2.2


[PATCH 2/3] [net/core] merge dev_unicast_discard and dev_mc_discard into one

2007-07-17 Thread Denis Cheng

These two functions can share a single dev->_xmit_lock critical section.

Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>
---
 net/core/dev.c |   16 
 1 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 3ba63aa..17c9cbd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2777,23 +2777,16 @@ int dev_unicast_add(struct net_device *dev, void *addr, 
int alen)
 }
 EXPORT_SYMBOL(dev_unicast_add);
 
-static void dev_unicast_discard(struct net_device *dev)
+static void dev_addr_discard(struct net_device *dev)
 {
netif_tx_lock_bh(dev);
+
__dev_addr_discard(&dev->uc_list);
dev->uc_count = 0;
-   netif_tx_unlock_bh(dev);
-}
 
-/*
- * Discard multicast list when a device is downed
- */
-
-static void dev_mc_discard(struct net_device *dev)
-{
-   netif_tx_lock_bh(dev);
__dev_addr_discard(&dev->mc_list);
dev->mc_count = 0;
+
netif_tx_unlock_bh(dev);
 }
 
@@ -3751,8 +3744,7 @@ void unregister_netdevice(struct net_device *dev)
/*
 *  Flush the unicast and multicast chains
 */
-   dev_unicast_discard(dev);
-   dev_mc_discard(dev);
+   dev_addr_discard(dev);
 
if (dev->uninit)
dev->uninit(dev);
-- 
1.5.2.2
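
The effect of the merge above can be sketched in userspace C. Everything here is hypothetical scaffolding (mock_dev, mock_lock(), list_push() and so on stand in for net_device, netif_tx_lock_bh() and the real address lists). The point is only to show both lists being emptied under one lock acquisition instead of two.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct dev_addr_list. */
struct addr_node {
	struct addr_node *next;
};

/* Hypothetical stand-in for the relevant parts of struct net_device. */
struct mock_dev {
	struct addr_node *uc_list;
	struct addr_node *mc_list;
	int uc_count;
	int mc_count;
	int lock_acquisitions;	/* how many times the "lock" was taken */
};

static void mock_lock(struct mock_dev *dev)   { dev->lock_acquisitions++; }
static void mock_unlock(struct mock_dev *dev) { (void)dev; }

/* Helper for the example: push one node onto the front of a list. */
static int list_push(struct addr_node **list)
{
	struct addr_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->next = *list;
	*list = n;
	return 0;
}

/* Same walk-and-free loop as __dev_addr_discard, minus the leak check. */
static void list_discard(struct addr_node **list)
{
	while (*list != NULL) {
		struct addr_node *tmp = *list;

		*list = tmp->next;
		free(tmp);
	}
}

/* Merged version: one critical section covers both lists. */
static void discard_all(struct mock_dev *dev)
{
	mock_lock(dev);

	list_discard(&dev->uc_list);
	dev->uc_count = 0;

	list_discard(&dev->mc_list);
	dev->mc_count = 0;

	mock_unlock(dev);
}
```

Calling two separate discard helpers would bump lock_acquisitions twice; the merged helper does it once, which is the saving the patch is after.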


Please pull 'upstream-davem' branch of wireless-2.6

2007-07-17 Thread John W. Linville

A few more for 2.6.23...individual patches available here:

  
http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/upstream-davem

Thanks!

John
---

The following changes since commit 4ad1366376bfef32ec0ffa12d1faa483d6f330bd:
  NeilBrown (1):
md: change bitmap_unplug and others to void functions

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git 
upstream-davem

Daniel Drake (1):
  mac80211: regulatory domain cleanup

Johannes Berg (2):
  mac80211: use debugfs_rename
  mac80211: regdomain.c needs to include ieee80211_i.h

 net/mac80211/Makefile  |1 +
 net/mac80211/debugfs_netdev.c  |9 ++-
 net/mac80211/ieee80211.c   |3 +-
 net/mac80211/ieee80211_i.h |5 +-
 net/mac80211/ieee80211_ioctl.c |  133 -
 net/mac80211/regdomain.c   |  158 
 6 files changed, 171 insertions(+), 138 deletions(-)
 create mode 100644 net/mac80211/regdomain.c

diff --git a/net/mac80211/Makefile b/net/mac80211/Makefile
index e9738da..a9c2d07 100644
--- a/net/mac80211/Makefile
+++ b/net/mac80211/Makefile
@@ -13,6 +13,7 @@ mac80211-objs := \
ieee80211_iface.o \
ieee80211_rate.o \
michael.o \
+   regdomain.o \
tkip.o \
aes_ccm.o \
wme.o \
diff --git a/net/mac80211/debugfs_netdev.c b/net/mac80211/debugfs_netdev.c
index a3e01d7..799a920 100644
--- a/net/mac80211/debugfs_netdev.c
+++ b/net/mac80211/debugfs_netdev.c
@@ -397,6 +397,8 @@ static int netdev_notify(struct notifier_block * nb,
 void *ndev)
 {
struct net_device *dev = ndev;
+   struct dentry *dir;
+   struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev);
char buf[10+IFNAMSIZ];
 
if (state != NETDEV_CHANGENAME)
@@ -408,10 +410,11 @@ static int netdev_notify(struct notifier_block * nb,
if (dev->ieee80211_ptr->wiphy->privid != mac80211_wiphy_privid)
return 0;
 
-   /* TODO
sprintf(buf, "netdev:%s", dev->name);
-   debugfs_rename(IEEE80211_DEV_TO_SUB_IF(dev)->debugfsdir, buf);
-   */
+   dir = sdata->debugfsdir;
+   if (!debugfs_rename(dir->d_parent, dir, dir->d_parent, buf))
+   printk(KERN_ERR "mac80211: debugfs: failed to rename debugfs "
+  "dir to %s\n", buf);
 
return 0;
 }
diff --git a/net/mac80211/ieee80211.c b/net/mac80211/ieee80211.c
index 2ddf4ef..6c63dcf 100644
--- a/net/mac80211/ieee80211.c
+++ b/net/mac80211/ieee80211.c
@@ -5095,7 +5095,7 @@ int ieee80211_register_hwmode(struct ieee80211_hw *hw,
}
 
if (!(hw->flags & IEEE80211_HW_DEFAULT_REG_DOMAIN_CONFIGURED))
-   ieee80211_init_client(local->mdev);
+   ieee80211_set_default_regdomain(mode);
 
return 0;
 }
@@ -5246,6 +5246,7 @@ static int __init ieee80211_init(void)
}
 
ieee80211_debugfs_netdev_init();
+   ieee80211_regdomain_init();
 
return 0;
 }
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 055a2a9..6f7bae7 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -759,7 +759,6 @@ void ieee80211_update_default_wep_only(struct 
ieee80211_local *local);
 /* ieee80211_ioctl.c */
 int ieee80211_set_compression(struct ieee80211_local *local,
  struct net_device *dev, struct sta_info *sta);
-int ieee80211_init_client(struct net_device *dev);
 int ieee80211_set_channel(struct ieee80211_local *local, int channel, int 
freq);
 /* ieee80211_sta.c */
 void ieee80211_sta_timer(unsigned long data);
@@ -798,6 +797,10 @@ void ieee80211_if_sdata_init(struct ieee80211_sub_if_data 
*sdata);
 int ieee80211_if_add_mgmt(struct ieee80211_local *local);
 void ieee80211_if_del_mgmt(struct ieee80211_local *local);
 
+/* regdomain.c */
+void ieee80211_regdomain_init(void);
+void ieee80211_set_default_regdomain(struct ieee80211_hw_mode *mode);
+
 /* for wiphy privid */
 extern void *mac80211_wiphy_privid;
 
diff --git a/net/mac80211/ieee80211_ioctl.c b/net/mac80211/ieee80211_ioctl.c
index 5918dd0..d0e1ab5 100644
--- a/net/mac80211/ieee80211_ioctl.c
+++ b/net/mac80211/ieee80211_ioctl.c
@@ -27,20 +27,6 @@
 #include "aes_ccm.h"
 #include "debugfs_key.h"
 
-static int ieee80211_regdom = 0x10; /* FCC */
-module_param(ieee80211_regdom, int, 0444);
-MODULE_PARM_DESC(ieee80211_regdom, "IEEE 802.11 regulatory domain; 64=MKK");
-
-/*
- * If firmware is upgraded by the vendor, additional channels can be used based
- * on the new Japanese regulatory rules. This is indicated by setting
- * ieee80211_japan_5ghz module parameter to one when loading the 80211 kernel
- * module.
- */
-static int ieee80211_japan_5ghz /* = 0 */;
-module_param(ieee80211_japan_5ghz, int, 0444);
-MODULE_PARM_DESC(ieee80211_japan_5ghz, "Vendor-updated firmware for 5 GHz");
-
 static void ieee80211_set_hw_encryption(struct n

Please pull 'upstream-jgarzik' branch of wireless-2.6

2007-07-17 Thread John W. Linville

A few more for 2.6.23...

Thanks!

John

---

The following changes since commit 4ad1366376bfef32ec0ffa12d1faa483d6f330bd:
  NeilBrown (1):
md: change bitmap_unplug and others to void functions

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git 
upstream-jgarzik

Daniel Drake (1):
  zd1211rw: Add ID for Siemens Gigaset USB Stick 54

Jean Tourrilhes (1):
  softmac: Channel is listed twice in scan output

Masakazu Mokuno (1):
  zd1211rw: Add ID for Planex GW-US54GXS

Zhu Yi (4):
  ipw2100: Fix `iwpriv set_power` error
  Fix ipw2200 set wrong power parameter causing firmware error
  ipw2200: Fix ipw_isr() comments error on shared IRQ
  Update version ipw2200 stamp to 1.2.2

 drivers/net/wireless/ipw2100.c |6 +++---
 drivers/net/wireless/ipw2200.c |   18 --
 drivers/net/wireless/zd1211rw/zd_usb.c |2 ++
 net/ieee80211/ieee80211_wx.c   |7 ++-
 4 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/drivers/net/wireless/ipw2100.c b/drivers/net/wireless/ipw2100.c
index 072ede7..8990585 100644
--- a/drivers/net/wireless/ipw2100.c
+++ b/drivers/net/wireless/ipw2100.c
@@ -7868,10 +7868,10 @@ static int ipw2100_wx_set_powermode(struct net_device 
*dev,
goto done;
}
 
-   if ((mode < 1) || (mode > POWER_MODES))
+   if ((mode < 0) || (mode > POWER_MODES))
mode = IPW_POWER_AUTO;
 
-   if (priv->power_mode != mode)
+   if (IPW_POWER_LEVEL(priv->power_mode) != mode)
err = ipw2100_set_power_mode(priv, mode);
   done:
mutex_unlock(&priv->action_mutex);
@@ -7902,7 +7902,7 @@ static int ipw2100_wx_get_powermode(struct net_device 
*dev,
break;
case IPW_POWER_AUTO:
snprintf(extra, MAX_POWER_STRING,
-"Power save level: %d (Auto)", 0);
+"Power save level: %d (Auto)", level);
break;
default:
timeout = timeout_duration[level - 1] / 1000;
diff --git a/drivers/net/wireless/ipw2200.c b/drivers/net/wireless/ipw2200.c
index aa32a97..61497c4 100644
--- a/drivers/net/wireless/ipw2200.c
+++ b/drivers/net/wireless/ipw2200.c
@@ -70,7 +70,7 @@
 #define VQ
 #endif
 
-#define IPW2200_VERSION "1.2.0" VK VD VM VP VR VQ
+#define IPW2200_VERSION "1.2.2" VK VD VM VP VR VQ
 #define DRV_DESCRIPTION"Intel(R) PRO/Wireless 2200/2915 Network Driver"
 #define DRV_COPYRIGHT  "Copyright(c) 2003-2006 Intel Corporation"
 #define DRV_VERSION IPW2200_VERSION
@@ -2506,7 +2506,7 @@ static int ipw_send_power_mode(struct ipw_priv *priv, u32 
mode)
break;
}
 
-   param = cpu_to_le32(mode);
+   param = cpu_to_le32(param);
return ipw_send_cmd_pdu(priv, IPW_CMD_POWER_MODE, sizeof(param),
&param);
 }
@@ -9568,6 +9568,7 @@ static int ipw_wx_set_power(struct net_device *dev,
priv->power_mode = IPW_POWER_ENABLED | IPW_POWER_BATTERY;
else
priv->power_mode = IPW_POWER_ENABLED | priv->power_mode;
+
err = ipw_send_power_mode(priv, IPW_POWER_LEVEL(priv->power_mode));
if (err) {
IPW_DEBUG_WX("failed setting power mode.\n");
@@ -9604,22 +9605,19 @@ static int ipw_wx_set_powermode(struct net_device *dev,
struct ipw_priv *priv = ieee80211_priv(dev);
int mode = *(int *)extra;
int err;
+
mutex_lock(&priv->mutex);
-   if ((mode < 1) || (mode > IPW_POWER_LIMIT)) {
+   if ((mode < 1) || (mode > IPW_POWER_LIMIT))
mode = IPW_POWER_AC;
-   priv->power_mode = mode;
-   } else {
-   priv->power_mode = IPW_POWER_ENABLED | mode;
-   }
 
-   if (priv->power_mode != mode) {
+   if (IPW_POWER_LEVEL(priv->power_mode) != mode) {
err = ipw_send_power_mode(priv, mode);
-
if (err) {
IPW_DEBUG_WX("failed setting power mode.\n");
mutex_unlock(&priv->mutex);
return err;
}
+   priv->power_mode = IPW_POWER_ENABLED | mode;
}
mutex_unlock(&priv->mutex);
return 0;
@@ -10555,7 +10553,7 @@ static irqreturn_t ipw_isr(int irq, void *data)
spin_lock(&priv->irq_lock);
 
if (!(priv->status & STATUS_INT_ENABLED)) {
-   /* Shared IRQ */
+   /* IRQ is disabled */
goto none;
}
 
diff --git a/drivers/net/wireless/zd1211rw/zd_usb.c 
b/drivers/net/wireless/zd1211rw/zd_usb.c
index 28d41a2..a9c339e 100644
--- a/drivers/net/wireless/zd1211rw/zd_usb.c
+++ b/drivers/net/wireless/zd1211rw/zd_usb.c
@@ -72,6 +72,8 @@ static struct usb_device_id usb_ids[] = {
{ USB_DEVICE(0x0586, 0x3413), .driver_info = DEVICE_ZD1211B },
{ US

[PATCH] fix wrong argument of tc35815_read_plat_dev_addr()

2007-07-17 Thread Yoichi Yuasa

Fix wrong argument of tc35815_read_plat_dev_addr()

Signed-off-by: Yoichi Yuasa <[EMAIL PROTECTED]>

diff -pruN -X generic/Documentation/dontdiff generic-orig/drivers/net/tc35815.c 
generic/drivers/net/tc35815.c
--- generic-orig/drivers/net/tc35815.c  2007-07-18 10:45:56.542655750 +0900
+++ generic/drivers/net/tc35815.c   2007-07-18 10:41:42.230762250 +0900
@@ -626,7 +626,7 @@ static int __devinit tc35815_read_plat_d
return -ENODEV;
 }
 #else
-static int __devinit tc35815_read_plat_dev_addr(struct device *dev)
+static int __devinit tc35815_read_plat_dev_addr(struct net_device *dev)
 {
return -ENODEV;
 }

Re: [patch 37/44] xen: add virtual network device driver

2007-07-17 Thread Jeremy Fitzhardinge

Rusty Russell wrote:
> The default function points to the internal stats...
>   

Right you are.

J

Re: [patch 37/44] xen: add virtual network device driver

2007-07-17 Thread Rusty Russell

On Tue, 2007-07-17 at 07:28 -0700, Jeremy Fitzhardinge wrote:
> Stephen Hemminger wrote:
> >> +struct netfront_info {
> >> +  struct list_head list;
> >> +  struct net_device *netdev;
> >> +
> >> +  struct net_device_stats stats;
> >> 
> >
> > There is now a net_device_stats element inside net_device on
> > 2.6.21 or later.
> >   
> 
> Ah, OK.  Should I just do a s/stats/netdev->stats/?  Is there a generic
> get_stats routine as well?

The default function points to the internal stats...

Cheers,
Rusty.
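
A rough userspace sketch of the arrangement discussed above, for anyone unfamiliar with it. The mock_* types and functions are hypothetical and only mirror the shape of the idea: the statistics live inside the device structure, a default get_stats hook returns them, and the driver's private struct no longer needs its own copy.

```c
/* Hypothetical stand-in for struct net_device_stats. */
struct mock_stats {
	unsigned long rx_packets;
	unsigned long tx_packets;
};

/* Hypothetical stand-in for struct net_device with embedded stats. */
struct mock_netdev {
	struct mock_stats stats;
	struct mock_stats *(*get_stats)(struct mock_netdev *dev);
};

/* Default hook: simply hand back the stats embedded in the device. */
static struct mock_stats *default_get_stats(struct mock_netdev *dev)
{
	return &dev->stats;
}

static void mock_netdev_init(struct mock_netdev *dev)
{
	dev->stats.rx_packets = 0;
	dev->stats.tx_packets = 0;
	dev->get_stats = default_get_stats;
}

/* Driver-private info no longer carries a separate stats member. */
struct mock_netfront_info {
	struct mock_netdev *netdev;
};

/* Receive path updates the device's own counters. */
static void mock_rx(struct mock_netfront_info *np)
{
	np->netdev->stats.rx_packets++;
}
```

In this shape the driver does not have to install a get_stats routine of its own unless it wants different behaviour.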



Re: IPSec freeze

2007-07-17 Thread Patrick McHardy

Beschorner Daniel wrote:
>>I fixed it myself. Daniel, can you please test this patch?
> 
> 
> Many thanks Patrick!!!
> I tested it and found it working!

Thanks for testing.

> No more crashes with IPComp and smaller PMTUs.
> But the "pmtu discovery on SA ESP/..." messages don't disappear.

That's probably a different issue. Please post the output of
"ip -x xfrm state" (obfuscate keys if you care ..).

Re: IPSec freeze

2007-07-17 Thread Beschorner Daniel

> >>> I managed to reproduce a crash with ipcomp, will try to 
> fix it later.
> >>>   
> >> Yes, I can confirm this.
> >> After disabling IPComp the crashes went away.
> >> 
> > The crash happens in xfrm_bundle_ok when walking the bundle upwards
> > following xfrm_dst->u.next. The loop should be stopped when
> > xfrm_dst->u.next == first (the topmost xfrm_dst), but it points to
> > NULL instead. I'm pretty sure the attached patch is responsible,
> > it breaks XFRM's assumption that dst->next and xfrm_dst->u.next are
> > the same pointer and xfrm_dst now shares the next pointer with
> > rcu_head.next in struct dst_entry.
> >
> > Eric, could you look into this please?
> 
> I fixed it myself. Daniel, can you please test this patch?

Many thanks Patrick!!!
I tested it and found it working!

No more crashes with IPComp and smaller PMTUs.
But the "pmtu discovery on SA ESP/..." messages don't disappear.

Daniel

Re: Socket Buffers and Memory Managment

2007-07-17 Thread David Miller

From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Tue, 17 Jul 2007 20:41:29 +0100

> Sounds like sucky hardware...

Although expect more of the same in the future, not less.

Re: Socket Buffers and Memory Managment

2007-07-17 Thread Stephen Hemminger

On Tue, 17 Jul 2007 10:20:58 -0700 (PDT)
vinay ravuri <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> I am fairly new to linux socket buffers and have the
> following questions!
> 
> I am working with a custom ethernet MAC that does not
> allow me to specify a particular memory location for
> the h/w to DMA the packet into (Rx side).  Instead, it
> has a pool of fixed size buffers with some h/w
> specific headers around each buffer that are managed
> by h/w and will pick a free buffer and DMA the packet.

Sounds like sucky hardware...
You need to copy to a newly allocated skb, see 8139too.c

>  It appears dev_alloc_skb() actually allocates the
> physical memory and doesn't allow the user to specify
> the skb.data to something specific to what I want
> which is a problem for me.  First is my assumption
> correct that I cannot pick an arbitrary skb.data
> location in struct sk_buff?  I want to avoid copying
> the dma'ed data into a new socket buffer as it is
> expensive.  Are there any ways around this problem?

You could play tricks with skb frags but it would be fragile
and not worth the trouble. The problem is that the receive
skb can stay in the system for a really long time (until the application
reads the data) so your fixed size buffer pool in hardware
would get exhausted.

> Also, if the h/w gives me a single packet in multiple
> locations (i.e. non-contiguous chunks of memory), can
> socket buffers handle chains of buffers?  I am looking
> for a facility like mbuf's in netbsd where one can
> chain multiple buffers together to construct a
> single packet.

Yes, an skb frag list could be used for that, but you don't
want to do that; see above. Copy the data into a new
skb and reserve any necessary bytes so the IP header is
aligned.  I.e. if using an ethernet header (14 bytes), do
skb_reserve(skb, 2) before copying the data.
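
The arithmetic behind the 2-byte reserve can be sketched in userspace C. This is not skb code: mock_pkt and mock_pkt_copy() are hypothetical, and malloc() stands in for the skb allocation on the assumption that it returns a buffer aligned to at least 4 bytes. With 2 bytes of headroom, the 14-byte ethernet header ends at offset 16, so the IP header that follows starts on a 4-byte boundary.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_HDR_LEN	14	/* ethernet header length in bytes */
#define IP_ALIGN_PAD	2	/* headroom reserved before the frame */

/* Hypothetical stand-in for an skb: head is the allocation, data the frame. */
struct mock_pkt {
	unsigned char *head;
	unsigned char *data;
	size_t len;
};

/* Copy a frame out of a (hardware-owned) buffer into fresh memory,
 * leaving IP_ALIGN_PAD bytes of headroom in front of it. */
static int mock_pkt_copy(struct mock_pkt *pkt, const unsigned char *hw_buf,
			 size_t len)
{
	pkt->head = malloc(len + IP_ALIGN_PAD);
	if (!pkt->head)
		return -1;
	pkt->data = pkt->head + IP_ALIGN_PAD;	/* like skb_reserve(skb, 2) */
	memcpy(pkt->data, hw_buf, len);
	pkt->len = len;
	return 0;
}

/* Distance of the IP header from the previous 4-byte boundary. */
static size_t ip_header_misalignment(const struct mock_pkt *pkt)
{
	return (size_t)((uintptr_t)(pkt->data + ETH_HDR_LEN) % 4);
}
```

Once the copy is done the hardware buffer can go straight back to the pool, which is the reason for copying in the first place.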

Re: patch pci-quirk_e100_interrupt-called-too-early.patch added to gregkh-2.6 tree

2007-07-17 Thread Kok, Auke


[EMAIL PROTECTED] wrote:

This is a note to let you know that I've just added the patch titled

 Subject: [PATCH] PCI: quirk_e100_interrupt() called too early

to my gregkh-2.6 tree.  Its filename is

 pci-quirk_e100_interrupt-called-too-early.patch

This tree can be found at 
http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/



From [EMAIL PROTECTED] Tue Jul  3 02:03:55 2007
From: Marian Balakowicz <[EMAIL PROTECTED]>
Date: Tue, 03 Jul 2007 11:03:18 +0200
Subject: [PATCH] PCI: quirk_e100_interrupt() called too early
To: "Kok, Auke" <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], NetDev 

Message-ID: <[EMAIL PROTECTED]>


quirk_e100_interrupt() is called after PCI controller is initialized
and before PCI bus enumeration is performed. On some powerpc platforms
which modify PCI controller configuration and set different MEM and IO
windows than those set by firmware quirk_e100_interrupt() is causing
kernel panic as it tries to read from device BAR0 offsets which at this
time points to an invalid PCI window (set by firmware).

This patch delays the quirk_e100_interrupt() to pci_fixup_final phase,
which happens after bus enumeration and before PCI enable and
device driver initialization.

Signed-off-by: Marian Balakowicz <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/pci/quirks.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1485,7 +1485,7 @@ static void __devinit quirk_e100_interru
 
 	iounmap(csr);

 }
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, quirk_e100_interrupt);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, quirk_e100_interrupt);
 
 static void __devinit fixup_rev1_53c810(struct pci_dev* dev)

 {


Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are

pci/pci-quirk_e100_interrupt-called-too-early.patch



Yes, that's OK. Please note that I asked the person who originally reported the 
problem to make sure that this patch doesn't break anything, but he was still 
too busy to test until now.


I assume (from what I know of the PCI subsystem now) that this is a safe patch, 
so feel free to add:


Acked-by: Auke Kok <[EMAIL PROTECTED]>

Cheers,


Auke

Socket Buffers and Memory Managment

2007-07-17 Thread vinay ravuri

Hi,

I am fairly new to linux socket buffers and have the
following questions!

I am working with a custom ethernet MAC that does not
allow me to specify a particular memory location for
the h/w to DMA the packet into (Rx side).  Instead, it
has a pool of fixed size buffers with some h/w
specific headers around each buffer that are managed
by h/w and will pick a free buffer and DMA the packet.
 It appears dev_alloc_skb() actually allocates the
physical memory and doesn't allow the user to specify
the skb.data to something specific to what I want
which is a problem for me.  First is my assumption
correct that I cannot pick an arbitrary skb.data
location in struct sk_buff?  I want to avoid copying
the dma'ed data into a new socket buffer as it is
expensive.  Are there any ways around this problem?

Also, if the h/w gives me a single packet in multiple
locations (i.e. non-contiguous chunks of memory), can
socket buffers handle chains of buffers?  I am looking
for a facility like mbuf's in netbsd where one can
chain multiple buffers together to make construct a
single packet.

Please e-mail me responses to [EMAIL PROTECTED]

Thanks,
Vinay
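
On the second question, one way to picture a packet held in several
non-contiguous chunks is a linked list of (pointer, length) pieces that is
copied into one buffer only when contiguous bytes are really needed. The
following is a minimal userspace sketch under that assumption; `struct frag`,
`pkt_len()` and `pkt_linearize()` are made-up names for illustration and are
not kernel API. In the kernel itself, `struct sk_buff` can reference extra
page fragments through `skb_shinfo(skb)->frags[]` and further buffers through
`frag_list`, which is the closest analogue to an mbuf chain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One non-contiguous chunk of a packet (hypothetical, userspace only). */
struct frag {
	const unsigned char *data;
	size_t len;
	struct frag *next;	/* next chunk of the same packet, or NULL */
};

/* Total packet length: the sum of all chunk lengths. */
static size_t pkt_len(const struct frag *f)
{
	size_t total = 0;

	for (; f != NULL; f = f->next)
		total += f->len;
	return total;
}

/*
 * Copy the chain into one contiguous buffer. Returns the number of
 * bytes written, or 0 if the destination is too small.
 */
static size_t pkt_linearize(const struct frag *f, unsigned char *dst, size_t cap)
{
	size_t off = 0;

	assert(dst != NULL);
	if (pkt_len(f) > cap)
		return 0;
	for (; f != NULL; f = f->next) {
		memcpy(dst + off, f->data, f->len);
		off += f->len;
	}
	return off;
}
```

A driver-like caller would build the chain from the hardware's buffer
descriptors and reserve `pkt_linearize()` for the slow path, since the copy
is exactly the cost the question wants to avoid.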


 


patch pci-quirk_e100_interrupt-called-too-early.patch added to gregkh-2.6 tree

2007-07-17 Thread gregkh


This is a note to let you know that I've just added the patch titled

 Subject: [PATCH] PCI: quirk_e100_interrupt() called too early

to my gregkh-2.6 tree.  Its filename is

 pci-quirk_e100_interrupt-called-too-early.patch

This tree can be found at 
http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


>From [EMAIL PROTECTED] Tue Jul  3 02:03:55 2007
From: Marian Balakowicz <[EMAIL PROTECTED]>
Date: Tue, 03 Jul 2007 11:03:18 +0200
Subject: [PATCH] PCI: quirk_e100_interrupt() called too early
To: "Kok, Auke" <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], NetDev 

Message-ID: <[EMAIL PROTECTED]>


quirk_e100_interrupt() is called after the PCI controller is initialized
and before PCI bus enumeration is performed. On some powerpc platforms
which modify the PCI controller configuration and set different MEM and IO
windows than those set by firmware, quirk_e100_interrupt() causes a
kernel panic, as it tries to read from device BAR0 offsets which at this
time point to an invalid PCI window (set by firmware).

This patch delays quirk_e100_interrupt() to the pci_fixup_final phase,
which happens after bus enumeration and before PCI enable and
device driver initialization.

Signed-off-by: Marian Balakowicz <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/pci/quirks.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1485,7 +1485,7 @@ static void __devinit quirk_e100_interru
 
iounmap(csr);
 }
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, quirk_e100_interrupt);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, quirk_e100_interrupt);
 
 static void __devinit fixup_rev1_53c810(struct pci_dev* dev)
 {


Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are

pci/pci-quirk_e100_interrupt-called-too-early.patch
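
The ordering problem described in the changelog above can be modelled in a
few lines of userspace C. This is only a sketch: `struct model_dev`,
`run_pass()` and `boot()` are hypothetical, and the single
`bar_window_valid` flag stands in for the platform reprogramming its MEM and
IO windows during enumeration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of two fixup passes. */
enum fixup_pass { PASS_EARLY, PASS_FINAL };

struct model_dev {
	int bar_window_valid;	/* set once usable windows have been assigned */
	int quirk_ran;
	int faulted;		/* stands in for the panic described above */
};

struct fixup {
	enum fixup_pass pass;
	void (*hook)(struct model_dev *);
};

/* Quirk that touches device memory through BAR0. */
static void model_quirk(struct model_dev *dev)
{
	assert(dev != NULL);
	if (!dev->bar_window_valid)
		dev->faulted = 1;
	dev->quirk_ran = 1;
}

/* Run every table entry registered for the given pass. */
static void run_pass(const struct fixup *tbl, size_t n,
		     enum fixup_pass pass, struct model_dev *dev)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].pass == pass)
			tbl[i].hook(dev);
}

/* Boot order in this model: early fixups, enumeration, final fixups. */
static struct model_dev boot(enum fixup_pass quirk_pass)
{
	struct model_dev dev = { 0, 0, 0 };
	struct fixup tbl[] = { { quirk_pass, model_quirk } };

	run_pass(tbl, 1, PASS_EARLY, &dev);
	dev.bar_window_valid = 1;	/* enumeration assigns usable windows */
	run_pass(tbl, 1, PASS_FINAL, &dev);
	return dev;
}
```

In this model, registering the quirk for the early pass sets `faulted`,
while registering it for the final pass runs it without a fault.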

Re: IPSec freeze

2007-07-17 Thread Patrick McHardy


Patrick McHardy wrote:
> Beschorner Daniel wrote:
>>> I managed to reproduce a crash with ipcomp, will try to fix it later.
>>
>> Yes, I can confirm this.
>> After disabling IPComp the crashes went away.
>
> The crash happens in xfrm_bundle_ok when walking the bundle upwards
> following xfrm_dst->u.next. The loop should be stopped when
> xfrm_dst->u.next == first (the topmost xfrm_dst), but it points to
> NULL instead. I'm pretty sure the attached patch is responsible,
> it breaks XFRM's assumption that dst->next and xfrm_dst->u.next are
> the same pointer and xfrm_dst now shares the next pointer with
> rcu_head.next in struct dst_entry.
>
> Eric, could you look into this please?

I fixed it myself. Daniel, can you please test this patch?



[XFRM]: Fix crash introduced by struct dst_entry reordering

XFRM expects xfrm_dst->u.next to be same pointer as dst->next, which
was broken by the dst_entry reordering in commit 1e19e02c~, causing
an oops in xfrm_bundle_ok when walking the bundle upwards.

Kill xfrm_dst->u.next and change the only user to use dst->next instead.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 20c2fee8cc562817f11752e1d87350d5994fa098
tree f42318b847e962aa637136e94722a688c23a
parent 308ac1b6249226730b70fcf7c13a289c27ce2bf3
author Patrick McHardy <[EMAIL PROTECTED]> Tue, 17 Jul 2007 18:11:29 +0200
committer Patrick McHardy <[EMAIL PROTECTED]> Tue, 17 Jul 2007 18:11:29 +0200

 include/net/xfrm.h |1 -
 net/xfrm/xfrm_policy.c |2 +-
 2 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ae959e9..a5f80bf 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -585,7 +585,6 @@ static inline int xfrm_sec_ctx_match(struct xfrm_sec_ctx *s1, struct xfrm_sec_ct
 struct xfrm_dst
 {
union {
-   struct xfrm_dst *next;
struct dst_entrydst;
struct rtable   rt;
struct rt6_info rt6;
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 157bfbd..b48f06f 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2141,7 +2141,7 @@ int xfrm_bundle_ok(struct xfrm_policy *pol, struct xfrm_dst *first,
if (last == first)
break;
 
-   last = last->u.next;
+   last = (struct xfrm_dst *)last->u.dst.next;
last->child_mtu_cached = mtu;
}
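
The aliasing assumption behind this crash can be reproduced outside the
kernel. The sketch below uses hypothetical, much simplified layouts
(`entry_old`, `entry_new`, `wrap_old` and `wrap_new` are not the real
`dst_entry` or `xfrm_dst` definitions): a union member named `next` only
aliases the embedded structure's `next` while that field is the first
member, so walking the chain through the embedded member, as the patch does
with `last->u.dst.next`, stays correct after a reordering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts; field names are illustrative only. */
struct entry_old {
	struct entry_old *next;		/* first member */
	int mtu;
};

struct entry_new {
	void *rcu_next;			/* something else now comes first */
	void (*rcu_func)(void *);
	struct entry_new *next;		/* no longer at offset 0 */
	int mtu;
};

/* Wrapper that aliases "next" with whatever sits at offset 0. */
union wrap_old {
	union wrap_old *next;
	struct entry_old dst;
};

union wrap_new {
	union wrap_new *next;		/* now overlaps rcu_next, not dst.next */
	struct entry_new dst;
};

/* Layout facts the alias silently depended on. */
static void check_layouts(void)
{
	assert(offsetof(struct entry_old, next) == 0);
	assert(offsetof(struct entry_new, next) != 0);
}

/* Walk the chain through the embedded member instead of the alias. */
static int chain_length(union wrap_new *w)
{
	int n = 0;

	while (w != NULL) {
		n++;
		w = (union wrap_new *)w->dst.next;
	}
	return n;
}
```

With the reordered layout, reading the union's own `next` returns whatever
was stored in the first field, which matches the NULL pointer reported in
the analysis above.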

Re: [PATCH net-2.6.22-rc7] xfrm beet interfamily support

2007-07-17 Thread Patrick McHardy

Joakim Koskela wrote:
> On Monday 16 July 2007 21:47:40 Patrick McHardy wrote:
> 
>>>diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
>>>index 44ef208..8db7910 100644
>>>--- a/net/ipv4/xfrm4_output.c
>>>+++ b/net/ipv4/xfrm4_output.c
>>>@@ -53,7 +53,8 @@ static int xfrm4_output_one(struct sk_buff *skb)
>>> goto error_nolock;
>>> }
>>>
>>>-if (x->props.mode == XFRM_MODE_TUNNEL) {
>>>+if (x->props.mode == XFRM_MODE_TUNNEL ||
>>>+x->props.mode == XFRM_MODE_BEET) {
>>> err = xfrm4_tunnel_check_size(skb);
>>
>>Its not a real tunnel and all packets are generated locally, why
>>does it need to send ICMPs?
> 
> 
> Guess not. I'll have to still trace through, but can probably be removed.


Just FYI: it does make a difference with netfilter since packets
may be NATed to match a policy, but that's a more general problem
that also affects transport mode and should be dealt with within
netfilter, possibly by propagating PMTU values amongst dst_entries.


Re: [patch 2/3] netlink: allow removing multicast groups

2007-07-17 Thread Patrick McHardy

Johannes Berg wrote:
> +static void netlink_update_socket_mc(struct netlink_sock *nlk,
> +  unsigned int group,
> +  int is_new)
> +{
> + int old, new = !!is_new, subscriptions;
> +
> + netlink_table_grab();


Having the caller lock the table would save lots of atomic operations
in the case of netlink_clear_multicast_users.

> + old = test_bit(group - 1, nlk->groups);
> + subscriptions = nlk->subscriptions - old + new;
> + if (new)
> + __set_bit(group - 1, nlk->groups);
> + else
> + __clear_bit(group - 1, nlk->groups);
> + netlink_update_subscriptions(&nlk->sk, subscriptions);
> + netlink_update_listeners(&nlk->sk);
> + netlink_table_ungrab();
> +}
> +

> +void netlink_clear_multicast_users(int unit, unsigned int group)

Same as in the last patch, passing the kernel socket would be nicer IMO.

> +{
> + struct sock *sk;
> + struct hlist_node *node;
> +
> + read_lock(&nl_table_lock);

Won't this deadlock? netlink_table_grab takes a write-lock.
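
The locking concern raised here can be illustrated with a toy,
single-threaded model of a reader/writer lock. Everything below is
hypothetical (`toy_rwlock`, `update_one_locked()`, `clear_group_all()`); it
only shows that a write acquisition cannot succeed while the same path still
holds the read side, and what the caller-locks variant suggested earlier in
this review might look like.

```c
#include <assert.h>

/* Toy, single-threaded model of a reader/writer lock (not kernel code). */
struct toy_rwlock {
	int readers;
	int writer;
};

static void toy_read_lock(struct toy_rwlock *l)
{
	assert(!l->writer);
	l->readers++;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/* Returns 1 on success, 0 where a real write lock would wait forever. */
static int toy_write_trylock(struct toy_rwlock *l)
{
	if (l->readers > 0 || l->writer)
		return 0;
	l->writer = 1;
	return 1;
}

static void toy_write_unlock(struct toy_rwlock *l)
{
	assert(l->writer);
	l->writer = 0;
}

/* Helper that expects the caller to already hold the write lock. */
static void update_one_locked(struct toy_rwlock *l, unsigned long *groups,
			      unsigned int group)
{
	assert(l->writer);
	*groups &= ~(1UL << (group - 1));
}

/* Caller-locks pattern: one write lock around the whole loop. */
static int clear_group_all(struct toy_rwlock *l, unsigned long *socks,
			   int n, unsigned int group)
{
	if (!toy_write_trylock(l))
		return 0;
	for (int i = 0; i < n; i++)
		update_one_locked(l, &socks[i], group);
	toy_write_unlock(l);
	return 1;
}
```

Taking the write side once in the caller also avoids one lock round trip
per socket, which is the saving mentioned in the first comment.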

> +
> + sk_for_each_bound(sk, node, &nl_table[unit].mc_list)
> + netlink_update_socket_mc(nlk_sk(sk), group, 0);
> +
> + read_unlock(&nl_table_lock);
> +}
> +EXPORT_SYMBOL(netlink_clear_multicast_users);
> +
>  void netlink_set_nonroot(int protocol, unsigned int flags)
>  {
>   if ((unsigned int)protocol < MAX_LINKS)
> 


Re: [patch 1/3] netlink: allocate group bitmaps dynamically

2007-07-17 Thread Patrick McHardy

Johannes Berg wrote:
> Allow changing the number of groups for a netlink family
> after it has been created, use RCU to protect the listeners
> bitmap keeping netlink_has_listeners() lock-free.
> 
> Signed-off-by: Johannes Berg <[EMAIL PROTECTED]>
> 
> ---
>  include/linux/netlink.h  |1 
>  net/netlink/af_netlink.c |   86 
> +++
>  2 files changed, 66 insertions(+), 21 deletions(-)
> 
> --- wireless-dev.orig/net/netlink/af_netlink.c2007-07-17 
> 14:05:30.210964463 +0200
> +++ wireless-dev/net/netlink/af_netlink.c 2007-07-17 14:05:30.720964463 
> +0200
> -static int netlink_alloc_groups(struct sock *sk)
> +static int netlink_realloc_groups(struct sock *sk)
>  {
>   struct netlink_sock *nlk = nlk_sk(sk);
>   unsigned int groups;
> + unsigned long *new_groups;
>   int err = 0;
>  
>   netlink_lock_table();


This is actually a bug in the current code I think, netlink_lock_table
is a reader lock.

>   groups = nl_table[sk->sk_protocol].groups;
>   if (!nl_table[sk->sk_protocol].registered)
>   err = -ENOENT;
> - netlink_unlock_table();
>  
>   if (err)
> - return err;
> + goto out_unlock;
>  
> - nlk->groups = kzalloc(NLGRPSZ(groups), GFP_KERNEL);
> - if (nlk->groups == NULL)
> - return -ENOMEM;
> + if (nlk->ngroups >= groups)
> + goto out_unlock;
> +
> + new_groups = krealloc(nlk->groups, NLGRPSZ(groups), GFP_KERNEL);
> + if (new_groups == NULL) {
> + err = -ENOMEM;
> + goto out_unlock;
> + }
> + memset((char*)new_groups + NLGRPSZ(nlk->ngroups), 0,
> +NLGRPSZ(groups) - NLGRPSZ(nlk->ngroups));
> +
> + nlk->groups = new_groups;
>   nlk->ngroups = groups;
> - return 0;
> + out_unlock:
> + netlink_unlock_table();
> + return err;
>  }


> +int netlink_change_ngroups(int unit, unsigned int groups)


I think it would be more consistent to pass the kernel socket
instead of the unit.

> +{
> + unsigned long *listeners, old = NULL;
> + int err = 0;
> +
> + netlink_table_grab();
> + if (NLGRPSZ(nl_table[unit].groups) < NLGRPSZ(groups)) {
> + listeners = kzalloc(NLGRPSZ(groups), GFP_ATOMIC);
> + if (!listeners) {
> + err = -ENOMEM;
> + goto out_ungrab;
> + }
> + old = nl_table[unit].listeners;
> + memcpy(listeners, old, NLGRPSZ(nl_table[unit].groups));
> + rcu_assign_pointer(nl_table[unit].listeners, listeners);
> + }
> + nl_table[unit].groups = groups;


This might set the group count to a value < 32. I don't expect it matters,
but when I changed the old code to support > 32 groups I enforced
a minimum of 32 so anything outside the kernel multicasting on them
would still work (even though it's a really stupid idea). So for
consistency this should probably also use a minimum of 32.
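
A small userspace sketch of the sizing logic discussed in this review,
assuming the same rounding as the quoted NLGRPSZ() macro: the bitmap size is
rounded up to whole `unsigned long` words, the group count is clamped to a
minimum of 32, and a grown bitmap keeps its old bits while the new tail is
zeroed. `ALIGN_UP`, `GRPSZ`, `clamp_groups()` and `grow_bitmap()` are
illustrative names, and `realloc()` stands in for `krealloc()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * 8)
/* Userspace stand-ins for ALIGN() and NLGRPSZ() from the quoted patch. */
#define ALIGN_UP(x, a)	(((x) + (a) - 1) / (a) * (a))
#define GRPSZ(x)	(ALIGN_UP((size_t)(x), BITS_PER_ULONG) / 8)

/* Enforce the minimum of 32 groups suggested in the review. */
static unsigned int clamp_groups(unsigned int groups)
{
	return groups < 32 ? 32 : groups;
}

/*
 * Grow a bitmap from old_groups to new_groups bits, keeping the old bits
 * and zeroing the added tail. Returns NULL on allocation failure, in
 * which case the old bitmap is left untouched.
 */
static unsigned long *grow_bitmap(unsigned long *old, unsigned int old_groups,
				  unsigned int new_groups)
{
	size_t old_sz, new_sz;
	unsigned long *p;

	assert(old != NULL || old_groups == 0);
	old_sz = old ? GRPSZ(old_groups) : 0;
	new_sz = GRPSZ(clamp_groups(new_groups));
	if (new_sz <= old_sz)
		return old;	/* already large enough */
	p = realloc(old, new_sz);
	if (p == NULL)
		return NULL;
	memset((char *)p + old_sz, 0, new_sz - old_sz);
	return p;
}
```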


Re: [PATCH net-2.6.22-rc7] xfrm beet interfamily support

2007-07-17 Thread Joakim Koskela

On Monday 16 July 2007 21:47:40 Patrick McHardy wrote:
>
> I lost interest here, but the reintroduced bugs make me think that
> some old version was simply rediffed without even checking the
> output and the state initialization also seems to need a bit more work.
>

Thanks for reviewing the code, really appreciate it (whoa, would have been a 
lot of problems [re-]introduced)! And yes, you're right - it seemed at the 
time easier to just convert the old code to run in the new kernel as it's 
been working fine for us. Quickly scanned the existing (non-interfamily) beet 
implementation, but I guess not thoroughly enough. Anyway, merged back the 
latest non-interfamily versions and rolling with those now. Should have a 
fixed version ready soon..

Some other comments:

> Joakim Koskela wrote:
> > diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
> > index fa1902d..7a39f4c 100644
> > --- a/net/ipv4/xfrm4_input.c
> > +++ b/net/ipv4/xfrm4_input.c
> > @@ -108,7 +108,8 @@ int xfrm4_rcv_encap(struct sk_buff *skb, __u16
> > encap_type) if (x->mode->input(x, skb))
> > goto drop;
> >
> > -   if (x->props.mode == XFRM_MODE_TUNNEL) {
> > +   if (x->props.mode == XFRM_MODE_TUNNEL ||
> > +   x->props.mode == XFRM_MODE_BEET) {
> > decaps = 1;
> > break;
> > }
>
> I was under the impression that one of the main points of BEET is that
> it offers tunnel semantics but does only transport mode processing.
> Its necessary for inter-family tunnels, but shouldn't this be avoided
> for normal use?
>

Yes, this is actually quite a nice improvement to the interfamily processing I 
(at least) haven't thought of before. Tested it & works fine (ipv4-ipv4).

>
> > diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
> > index 44ef208..8db7910 100644
> > --- a/net/ipv4/xfrm4_output.c
> > +++ b/net/ipv4/xfrm4_output.c
> > @@ -53,7 +53,8 @@ static int xfrm4_output_one(struct sk_buff *skb)
> > goto error_nolock;
> > }
> >
> > -   if (x->props.mode == XFRM_MODE_TUNNEL) {
> > +   if (x->props.mode == XFRM_MODE_TUNNEL ||
> > +   x->props.mode == XFRM_MODE_BEET) {
> > err = xfrm4_tunnel_check_size(skb);
>
> Its not a real tunnel and all packets are generated locally, why
> does it need to send ICMPs?

Guess not. I'll have to still trace through, but can probably be removed.

> > +   if (xfrm[i]->props.mode != XFRM_MODE_TRANSPORT) {
> > +   encap_family = xfrm[i]->props.family;
> > +   if (encap_family == AF_INET) {
> > +   remote.in = (struct in_addr *)
> > +   &xfrm[i]->id.daddr.a4;
> > +   local.in  = (struct in_addr *)
> > +   &xfrm[i]->props.saddr.a4;
> > +   } else if (encap_family == AF_INET6) {
> > +   remote.in6 = (struct in6_addr *)
> > +   xfrm[i]->id.daddr.a6;
> > +   local.in6 = (struct in6_addr *)
> > +   xfrm[i]->props.saddr.a6;
> > +   }
>
> No ifdefs here?

Thanks for noticing!

> >  static int ipip_init_state(struct xfrm_state *x)
> >  {
> > -   if (x->props.mode != XFRM_MODE_TUNNEL)
> > +   if (x->props.mode != XFRM_MODE_TUNNEL ||
> > +   x->props.mode != XFRM_MODE_BEET)
> > return -EINVAL;
>
> Looks like a bug fix that should be seperated.
>

Probably. This has been there for a while, don't know what's the story behind 
it, have to check..

br, j
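
Read literally, the condition in the quoted ipip_init_state() hunk is true
for every mode, because no value can equal both constants at once, so the
function would always return -EINVAL. The sketch below shows the difference
between `||` and `&&` here; the `demo_mode` enum is a stand-in for the
kernel's XFRM_MODE_* constants, and whether `&&` is what the final patch
should use is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative mode values, not the kernel's XFRM_MODE_* definitions. */
enum demo_mode { DEMO_TRANSPORT, DEMO_TUNNEL, DEMO_BEET };

/* Condition as written in the quoted hunk: true for every mode. */
static int init_state_or(enum demo_mode mode)
{
	if (mode != DEMO_TUNNEL || mode != DEMO_BEET)
		return -EINVAL;
	return 0;
}

/* Rejects only modes that are neither TUNNEL nor BEET. */
static int init_state_and(enum demo_mode mode)
{
	if (mode != DEMO_TUNNEL && mode != DEMO_BEET)
		return -EINVAL;
	return 0;
}
```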

[PATCH 4/4] ibmveth: Add ethtool driver stats hooks

2007-07-17 Thread Brian King


Add ethtool hooks to ibmveth to retrieve driver statistics.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
---

 linux-2.6-bjking1/drivers/net/ibmveth.c |   53 +++-
 1 file changed, 52 insertions(+), 1 deletion(-)

diff -puN drivers/net/ibmveth.c~ibmveth_ethtool_driver_stats drivers/net/ibmveth.c
--- linux-2.6/drivers/net/ibmveth.c~ibmveth_ethtool_driver_stats 2007-07-12 09:39:23.0 -0500
+++ linux-2.6-bjking1/drivers/net/ibmveth.c 2007-07-12 09:39:23.0 -0500
@@ -115,6 +115,28 @@ MODULE_VERSION(ibmveth_driver_version);
 module_param_named(csum_offload, ibmveth_csum_offload, uint, 0);
 MODULE_PARM_DESC(csum_offload, "Checksum offload (0/1). Default: 1");
 
+struct ibmveth_stat {
+   char name[ETH_GSTRING_LEN];
+   int offset;
+};
+
+#define IBMVETH_STAT_OFF(stat) offsetof(struct ibmveth_adapter, stat)
+#define IBMVETH_GET_STAT(a, off) *((u64 *)(((unsigned long)(a)) + off))
+
+struct ibmveth_stat ibmveth_stats[] = {
+   { "replenish_task_cycles", IBMVETH_STAT_OFF(replenish_task_cycles) },
+   { "replenish_no_mem", IBMVETH_STAT_OFF(replenish_no_mem) },
+   { "replenish_add_buff_failure", IBMVETH_STAT_OFF(replenish_add_buff_failure) },
+   { "replenish_add_buff_success", IBMVETH_STAT_OFF(replenish_add_buff_success) },
+   { "rx_invalid_buffer", IBMVETH_STAT_OFF(rx_invalid_buffer) },
+   { "rx_no_buffer", IBMVETH_STAT_OFF(rx_no_buffer) },
+   { "tx_multidesc_send", IBMVETH_STAT_OFF(tx_multidesc_send) },
+   { "tx_linearized", IBMVETH_STAT_OFF(tx_linearized) },
+   { "tx_linearize_failed", IBMVETH_STAT_OFF(tx_linearize_failed) },
+   { "tx_map_failed", IBMVETH_STAT_OFF(tx_map_failed) },
+   { "tx_send_failed", IBMVETH_STAT_OFF(tx_send_failed) }
+};
+
 /* simple methods of getting data from the current rxq entry */
 static inline int ibmveth_rxq_pending_buffer(struct ibmveth_adapter *adapter)
 {
@@ -756,6 +778,32 @@ static u32 ibmveth_get_rx_csum(struct ne
return adapter->rx_csum;
 }
 
+static void ibmveth_get_strings(struct net_device *dev, u32 stringset, u8 *data)
+{
+   int i;
+
+   if (stringset != ETH_SS_STATS)
+   return;
+
+   for (i = 0; i < ARRAY_SIZE(ibmveth_stats); i++, data += ETH_GSTRING_LEN)
+   memcpy(data, ibmveth_stats[i].name, ETH_GSTRING_LEN);
+}
+
+static int ibmveth_get_stats_count(struct net_device *dev)
+{
+   return ARRAY_SIZE(ibmveth_stats);
+}
+
+static void ibmveth_get_ethtool_stats(struct net_device *dev,
+ struct ethtool_stats *stats, u64 *data)
+{
+   int i;
+   struct ibmveth_adapter *adapter = dev->priv;
+
+   for (i = 0; i < ARRAY_SIZE(ibmveth_stats); i++)
+   data[i] = IBMVETH_GET_STAT(adapter, ibmveth_stats[i].offset);
+}
+
 static const struct ethtool_ops netdev_ethtool_ops = {
.get_drvinfo= netdev_get_drvinfo,
.get_settings   = netdev_get_settings,
@@ -766,7 +814,10 @@ static const struct ethtool_ops netdev_e
.get_rx_csum= ibmveth_get_rx_csum,
.set_rx_csum= ibmveth_set_rx_csum,
.get_tso= ethtool_op_get_tso,
-   .get_ufo= ethtool_op_get_ufo
+   .get_ufo= ethtool_op_get_ufo,
+   .get_strings= ibmveth_get_strings,
+   .get_stats_count= ibmveth_get_stats_count,
+   .get_ethtool_stats  = ibmveth_get_ethtool_stats
 };
 
 static int ibmveth_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
diff -puN drivers/net/ibmveth.h~ibmveth_ethtool_driver_stats drivers/net/ibmveth.h
_
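
The name/offset table used by this patch is a common pattern for exporting
counters. A minimal userspace sketch, assuming a hypothetical `demo_adapter`
structure with 64-bit counters (`DEMO_GSTRING_LEN`, `DEMO_STAT` and the
helper names are invented for the example and are not part of the driver):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_GSTRING_LEN 32	/* assumed stand-in for ETH_GSTRING_LEN */

/* Hypothetical adapter with a few 64-bit counters. */
struct demo_adapter {
	int unrelated;
	uint64_t rx_no_buffer;
	uint64_t tx_map_failed;
	uint64_t tx_send_failed;
};

struct demo_stat {
	char name[DEMO_GSTRING_LEN];
	size_t offset;
};

/* Build one table entry from a field name: its string and its offset. */
#define DEMO_STAT(field) { #field, offsetof(struct demo_adapter, field) }
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const struct demo_stat demo_stats[] = {
	DEMO_STAT(rx_no_buffer),
	DEMO_STAT(tx_map_failed),
	DEMO_STAT(tx_send_failed),
};

/* Read one counter by byte offset from the start of the adapter. */
static uint64_t demo_get_stat(const struct demo_adapter *a, size_t off)
{
	uint64_t v;

	memcpy(&v, (const char *)a + off, sizeof(v));
	return v;
}

/* Fill data[] in table order, as a stats hook of this kind would. */
static size_t demo_collect(const struct demo_adapter *a, uint64_t *data)
{
	size_t i;

	assert(a != NULL && data != NULL);
	for (i = 0; i < DEMO_ARRAY_SIZE(demo_stats); i++)
		data[i] = demo_get_stat(a, demo_stats[i].offset);
	return i;
}
```

Because names and values are both produced by walking the same table, the
two stay in the same order, and adding a counter is a one-line change.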

[PATCH 1/4] ibmveth: Enable TCP checksum offload

2007-07-17 Thread Brian King


This patchset enables TCP checksum offload support for IPV4
on ibmveth. This completely eliminates the generation and checking of
the checksum for packets that are completely virtual and never
touch a physical network. A simple TCP_STREAM netperf run on
a virtual network with maximum mtu set yielded a ~30% increase
in throughput. This feature is enabled by default on systems that
support it, but can be disabled with a module option.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
---

 linux-2.6-bjking1/drivers/net/ibmveth.c |   58 
 linux-2.6-bjking1/drivers/net/ibmveth.h |   41 +-
 2 files changed, 97 insertions(+), 2 deletions(-)

diff -puN drivers/net/ibmveth.c~ibmveth_csum_offload drivers/net/ibmveth.c
--- linux-2.6/drivers/net/ibmveth.c~ibmveth_csum_offload 2007-07-12 08:27:47.0 -0500
+++ linux-2.6-bjking1/drivers/net/ibmveth.c 2007-07-12 09:35:55.0 -0500
@@ -47,6 +47,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -103,12 +105,15 @@ static struct proc_dir_entry *ibmveth_pr
 
 static const char ibmveth_driver_name[] = "ibmveth";
 static const char ibmveth_driver_string[] = "IBM i/pSeries Virtual Ethernet Driver";
+static unsigned int ibmveth_csum_offload = 1;
 #define ibmveth_driver_version "1.03"
 
 MODULE_AUTHOR("Santiago Leon <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION("IBM i/pSeries Virtual Ethernet Driver");
 MODULE_LICENSE("GPL");
 MODULE_VERSION(ibmveth_driver_version);
+module_param_named(csum_offload, ibmveth_csum_offload, uint, 0);
+MODULE_PARM_DESC(csum_offload, "Checksum offload (0/1). Default: 1");
 
 /* simple methods of getting data from the current rxq entry */
 static inline int ibmveth_rxq_pending_buffer(struct ibmveth_adapter *adapter)
@@ -131,6 +136,11 @@ static inline int ibmveth_rxq_frame_leng
return (adapter->rx_queue.queue_addr[adapter->rx_queue.index].length);
 }
 
+static inline int ibmveth_rxq_csum_good(struct ibmveth_adapter *adapter)
+{
+   return (adapter->rx_queue.queue_addr[adapter->rx_queue.index].csum_good);
+}
+
 /* setup the initial settings for a buffer pool */
 static void ibmveth_init_buffer_pool(struct ibmveth_buff_pool *pool, u32 pool_index, u32 pool_size, u32 buff_size, u32 pool_active)
 {
@@ -684,6 +694,24 @@ static int ibmveth_start_xmit(struct sk_
desc[0].fields.length, DMA_TO_DEVICE);
desc[0].fields.valid   = 1;
 
+   if (skb->ip_summed == CHECKSUM_PARTIAL &&
+   ip_hdr(skb)->protocol != IPPROTO_TCP && skb_checksum_help(skb)) {
+   ibmveth_error_printk("tx: failed to checksum packet\n");
+   tx_dropped++;
+   goto out;
+   }
+
+   if (skb->ip_summed == CHECKSUM_PARTIAL) {
+   unsigned char *buf = skb_transport_header(skb) + skb->csum_offset;
+
+   desc[0].fields.no_csum = 1;
+   desc[0].fields.csum_good = 1;
+
+   /* Need to zero out the checksum */
+   buf[0] = 0;
+   buf[1] = 0;
+   }
+
if(dma_mapping_error(desc[0].fields.address)) {
ibmveth_error_printk("tx: unable to map initial fragment\n");
tx_map_failed++;
@@ -702,6 +730,10 @@ static int ibmveth_start_xmit(struct sk_
frag->size, DMA_TO_DEVICE);
desc[curfrag+1].fields.length = frag->size;
desc[curfrag+1].fields.valid  = 1;
+   if (skb->ip_summed == CHECKSUM_PARTIAL) {
+   desc[curfrag+1].fields.no_csum = 1;
+   desc[curfrag+1].fields.csum_good = 1;
+   }
 
if(dma_mapping_error(desc[curfrag+1].fields.address)) {
ibmveth_error_printk("tx: unable to map fragment %d\n", curfrag);
@@ -792,7 +824,11 @@ static int ibmveth_poll(struct net_devic
} else {
int length = ibmveth_rxq_frame_length(adapter);
int offset = ibmveth_rxq_frame_offset(adapter);
+   int csum_good = ibmveth_rxq_csum_good(adapter);
+
skb = ibmveth_rxq_get_buffer(adapter);
+   if (csum_good)
+   skb->ip_summed = CHECKSUM_UNNECESSARY;
 
ibmveth_rxq_harvest_buffer(adapter);
 
@@ -962,8 +998,10 @@ static void ibmveth_poll_controller(stru
 static int __devinit ibmveth_probe(struct vio_dev *dev, const struct vio_device_id *id)
 {
int rc, i;
+   long ret;
struct net_device *netdev;
struct ibmveth_adapter *adapter = NULL;
+   union ibmveth_illan_attributes set_attr, ret_attr;
 
unsigned char *mac_addr_p;
unsigned int *mcastFilterSize_p;
@@ -1058,6 +1096,26 @@ static int __devinit ibmveth_probe(struc
 
ibmveth_debug_printk("registe
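
For context, the per-packet work that checksum offload avoids is a 16-bit
one's-complement sum over the data. The sketch below is a plain C version in
the style of RFC 1071, not the kernel's optimized routine and not code from
this patch; `inet_csum()` is a hypothetical helper, and for a real TCP
segment the sum would also cover the pseudo-header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 16-bit one's-complement checksum over a byte buffer (RFC 1071 style).
 * Returns the value to store in the checksum field, in host order.
 * Intended for packet-sized buffers; the 32-bit accumulator is not
 * protected against overflow on very large inputs.
 */
static uint16_t inet_csum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	assert(buf != NULL || len == 0);
	while (len > 1) {
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}
	if (len == 1)
		sum += (uint32_t)buf[0] << 8;	/* pad odd trailing byte with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* fold carries back in */
	return (uint16_t)~sum;
}
```

Every byte of the segment is touched once on send and once on receive, which
is why skipping the computation on a purely virtual network can show up as a
throughput gain like the one quoted in the patch description.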

[PATCH 2/4] ibmveth: Implement ethtool hooks to enable/disable checksum offload

2007-07-17 Thread Brian King


This patch adds the appropriate ethtool hooks to allow for enabling/disabling
of hypervisor assisted checksum offload for TCP.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
---

 linux-2.6-bjking1/drivers/net/ibmveth.c |  120 +++-
 linux-2.6-bjking1/drivers/net/ibmveth.h |1 
 2 files changed, 119 insertions(+), 2 deletions(-)

diff -puN drivers/net/ibmveth.c~ibmveth_csum_offload_ethtool drivers/net/ibmveth.c
--- linux-2.6/drivers/net/ibmveth.c~ibmveth_csum_offload_ethtool 2007-07-12 09:36:01.0 -0500
+++ linux-2.6-bjking1/drivers/net/ibmveth.c 2007-07-12 09:41:15.0 -0500
@@ -644,12 +644,127 @@ static u32 netdev_get_link(struct net_de
return 1;
 }
 
+static void ibmveth_set_rx_csum_flags(struct net_device *dev, u32 data)
+{
+   struct ibmveth_adapter *adapter = dev->priv;
+
+   if (data)
+   adapter->rx_csum = 1;
+   else {
+   adapter->rx_csum = 0;
+   dev->features &= ~NETIF_F_IP_CSUM;
+   }
+}
+
+static void ibmveth_set_tx_csum_flags(struct net_device *dev, u32 data)
+{
+   struct ibmveth_adapter *adapter = dev->priv;
+
+   if (data) {
+   dev->features |= NETIF_F_IP_CSUM;
+   adapter->rx_csum = 1;
+   } else
+   dev->features &= ~NETIF_F_IP_CSUM;
+}
+
+static int ibmveth_set_csum_offload(struct net_device *dev, u32 data,
+   void (*done) (struct net_device *, u32))
+{
+   struct ibmveth_adapter *adapter = dev->priv;
+   union ibmveth_illan_attributes set_attr, clr_attr, ret_attr;
+   long ret;
+   int rc1 = 0, rc2 = 0;
+   int restart = 0;
+
+   if (netif_running(dev)) {
+   restart = 1;
+   adapter->pool_config = 1;
+   ibmveth_close(dev);
+   adapter->pool_config = 0;
+   }
+
+   set_attr.desc = 0;
+   clr_attr.desc = 0;
+
+   if (data)
+   set_attr.fields.tcp_csum_offload_ipv4 = 1;
+   else
+   clr_attr.fields.tcp_csum_offload_ipv4 = 1;
+
+   ret = h_illan_attributes(adapter->vdev->unit_address, 0, 0, &ret_attr.desc);
+
+   if (ret == H_SUCCESS && !ret_attr.fields.active_trunk &&
+   !ret_attr.fields.trunk_priority &&
+   ret_attr.fields.csum_offload_padded_pkt_support) {
+   ret = h_illan_attributes(adapter->vdev->unit_address, clr_attr.desc,
+set_attr.desc, &ret_attr.desc);
+
+   if (ret != H_SUCCESS) {
+   rc1 = -EIO;
+   ibmveth_error_printk("unable to change checksum offload settings."
+" %d rc=%ld\n", data, ret);
+
+   ret = h_illan_attributes(adapter->vdev->unit_address,
+set_attr.desc, clr_attr.desc, &ret_attr.desc);
+   } else
+   done(dev, data);
+   } else {
+   rc1 = -EIO;
+   ibmveth_error_printk("unable to change checksum offload settings."
+" %d rc=%ld ret_attr=%lx\n", data, ret, ret_attr.desc);
+   }
+   }
+
+   if (restart)
+   rc2 = ibmveth_open(dev);
+
+   return rc1 ? rc1 : rc2;
+}
+
+static int ibmveth_set_rx_csum(struct net_device *dev, u32 data)
+{
+   struct ibmveth_adapter *adapter = dev->priv;
+
+   if (data && adapter->rx_csum)
+   return 0;
+   if (!data && !adapter->rx_csum)
+   return 0;
+
+   return ibmveth_set_csum_offload(dev, data, ibmveth_set_rx_csum_flags);
+}
+
+static int ibmveth_set_tx_csum(struct net_device *dev, u32 data)
+{
+   struct ibmveth_adapter *adapter = dev->priv;
+   int rc = 0;
+
+   if (data && (dev->features & NETIF_F_IP_CSUM))
+   return 0;
+   if (!data && !(dev->features & NETIF_F_IP_CSUM))
+   return 0;
+
+   if (data && !adapter->rx_csum)
+   rc = ibmveth_set_csum_offload(dev, data, ibmveth_set_tx_csum_flags);
+   else
+   ibmveth_set_tx_csum_flags(dev, data);
+
+   return rc;
+}
+
+static u32 ibmveth_get_rx_csum(struct net_device *dev)
+{
+   struct ibmveth_adapter *adapter = dev->priv;
+   return adapter->rx_csum;
+}
+
 static const struct ethtool_ops netdev_ethtool_ops = {
.get_drvinfo= netdev_get_drvinfo,
.get_settings   = netdev_get_settings,
.get_link   = netdev_get_link,
.get_sg = ethtool_op_get_sg,
.get_tx_csum= ethtool_op_get_tx_csum,
+   .set_tx_csum= ibmveth_set_tx_csum,
+   .get_rx_csum= ibmveth_get_rx_csum,
+   .set_rx_csum= ibmveth_set_rx_csum
 };
 
 static int ibmveth_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
@@ -1108,9 +1223,10 @@ static int __devinit ibmveth_probe(struc

[PATCH 3/4] ibmveth: Add ethtool TSO handlers

2007-07-17 Thread Brian King


Add handlers for get_tso and get_ufo to prevent errors being printed
by ethtool.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
---

 linux-2.6-bjking1/drivers/net/ibmveth.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff -puN drivers/net/ibmveth.c~ibmveth_ethtool_get_tso drivers/net/ibmveth.c
--- linux-2.6/drivers/net/ibmveth.c~ibmveth_ethtool_get_tso 2007-07-12 09:39:20.0 -0500
+++ linux-2.6-bjking1/drivers/net/ibmveth.c 2007-07-12 09:39:20.0 -0500
@@ -764,7 +764,9 @@ static const struct ethtool_ops netdev_e
.get_tx_csum= ethtool_op_get_tx_csum,
.set_tx_csum= ibmveth_set_tx_csum,
.get_rx_csum= ibmveth_get_rx_csum,
-   .set_rx_csum= ibmveth_set_rx_csum
+   .set_rx_csum= ibmveth_set_rx_csum,
+   .get_tso= ethtool_op_get_tso,
+   .get_ufo= ethtool_op_get_ufo
 };
 
 static int ibmveth_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
_

Re: [2.6 patch] ipt_iprange.h must #include

2007-07-17 Thread Patrick McHardy

Adrian Bunk wrote:
> ipt_iprange.h must #include  since it uses __be32.


Applied, thanks.

Re: [patch 37/44] xen: add virtual network device driver

2007-07-17 Thread Jeremy Fitzhardinge

Stephen Hemminger wrote:
>> +struct netfront_info {
>> +struct list_head list;
>> +struct net_device *netdev;
>> +
>> +struct net_device_stats stats;
>> 
>
> There is now a net_device_stats element inside net_device on
> 2.6.21 or later.
>   

Ah, OK.  Should I just do a s/stats/netdev->stats/?  Is there a generic
get_stats routine as well?

>> +
>> +struct xen_netif_tx_front_ring tx;
>> +struct xen_netif_rx_front_ring rx;
>> +
>> +spinlock_t   tx_lock;
>> +spinlock_t   rx_lock;
>> 
>
> It might be a performance advantage to reorder/align these
> structure elements to put transmit hot elements together, and
> put tx and rx on different cache lines?
>   

Oh, right.  I'd been meaning to look at that layout more closely.

Thanks,
J
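
The layout suggestion above can be sketched with C11 alignment specifiers.
This is an illustration only: `struct demo_info` and its fields are
hypothetical, the 64-byte line size is an assumption, and kernel code would
normally use annotations such as `____cacheline_aligned_in_smp` instead of
`_Alignas`.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHELINE 64	/* assumed line size; varies by CPU */

/* Hypothetical layout: transmit-hot and receive-hot fields on separate lines. */
struct demo_info {
	/* transmit side */
	_Alignas(DEMO_CACHELINE) int tx_lock;
	unsigned int tx_prod;
	unsigned int tx_cons;

	/* receive side, starting on its own cache line */
	_Alignas(DEMO_CACHELINE) int rx_lock;
	unsigned int rx_prod;
	unsigned int rx_cons;
};

/* Index of the cache line that a given struct offset falls into. */
static size_t demo_line_of(size_t offset)
{
	return offset / DEMO_CACHELINE;
}
```

The cost is padding: in this sketch each group occupies a full line even
though it only uses twelve bytes, so the trade-off is worth measuring before
applying it to a structure that is allocated many times.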

RE: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2

2007-07-17 Thread Veeraiyan, Ayyappan

On 7/10/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
Veeraiyan, Ayyappan wrote:
> On 7/10/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
>> [EMAIL PROTECTED] wrote:
>>

> I will post the performance numbers later today..

Sorry for not responding earlier. We ran into a couple of issues, such as
setup problems and false alarms.

Anyway, here are the numbers:

Recv    Send    Send                          Utilization      Service Demand
Socket  Socket  Message  Elapsed              Send    Recv     Send    Recv
Size    Size    Size     Time     Throughput  local   remote   local   remote
Bytes   Bytes   Bytes    sec      10^6bits/s  % s     % s      us/KB   us/KB

87380   65536   128      60       2261.34     13.82   4.25     4.006   1.233
87380   65536   256      60       3332.51     14.19   5.67     2.79    1.115
87380   65536   512      60.01    4262.24     14.38   6.9      2.21    1.062
87380   65536   1024     60       4659.18     14.4    7.39     2.026   1.039
87380   65536   2048     60.01    6177.87     14.36   14.99    1.524   1.59
87380   65536   4096     60.01    9410.29     11.58   14.6     0.807   1.017
87380   65536   8192     60.01    9324.62     11.13   14.33    0.782   1.007
87380   65536   16384    60.01    9371.35     11.07   14.28    0.774   0.999
87380   65536   32768    60.02    9385.81     10.83   14.27    0.756   0.997
87380   65536   65536    60.01    9363.5      10.73   14.26    0.751   0.998

TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to n0417
(10.0.4.17) port 0 AF_INET : cpu bind

Recv    Send    Send                          Utilization      Service Demand
Socket  Socket  Message  Elapsed              Send    Recv     Send    Recv
Size    Size    Size     Time     Throughput  local   remote   local   remote
bytes   bytes   bytes    secs.    10^6bits/s  % S     % S      us/KB   us/KB

87380   65536   65536    60.02    9399.61     2.22    14.53    0.155   1.013
87380   65536   65536    60.02    9348.01     2.46    14.39    0.173   1.009
87380   65536   65536    60.02    9403.36     2.26    14.37    0.158   1.001
87380   65536   65536    60.01    9332.22     2.23    14.51    0.157   1.019

Bidirectional test.
87380   65536   65536    60.01    7809.57     28.66   30.02    2.405   2.519   TX
87380   65536   65536    60.01    7592.90     28.66   30.02    2.474   2.591   RX
--
87380   65536   65536    60.01    7629.73     28.32   29.64    2.433   2.546   RX
87380   65536   65536    60.01    7926.99     28.32   29.64    2.342   2.450   TX

Single netperf stream between 2 quad-core Xeon based boxes. Tested on
2.6.20 and 2.6.22 kernels. Driver uses NAPI and LRO.

To summarize, we are seeing line rate with NAPI (single Rx queue),
and Rx CPU utilization is around 14%. In back-to-back scenarios, NAPI
(combined with LRO) performs clearly better. In multiple-client
scenarios, non-NAPI with multiple Rx queues performs better. I am
continuing to benchmark and will submit a patch to pick one this
week.

But going forward, if NAPI supports multiple Rx queues natively, I
believe that would perform much better in most cases.

Also, did you get a chance to review the driver take #2? I would like to
implement the review comments (if any) as early as possible and submit
another version.

Thanks...

Ayyappan


[patch 1/3] netlink: allocate group bitmaps dynamically

2007-07-17 Thread Johannes Berg

Allow changing the number of groups for a netlink family
after it has been created. Use RCU to protect the listeners
bitmap, keeping netlink_has_listeners() lock-free.
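
For readers who want to try the grow-and-zero idea outside the kernel, here is a minimal userspace sketch. It is not the patch's code: `struct group_set`, `group_set_grow()`, `GRP_BYTES` and `GRP_LONGS` are hypothetical names, and `realloc()` stands in for `krealloc()`. It only illustrates rounding a group count up to whole unsigned longs and clearing the newly added tail.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
/* Bytes needed for x group bits, rounded up to whole unsigned longs
 * (same idea as NLGRPSZ in the patch). */
#define GRP_BYTES(x) \
	((((x) + BITS_PER_LONG - 1) / BITS_PER_LONG) * sizeof(unsigned long))
/* Number of unsigned longs needed (same idea as NLGRPLONGS). */
#define GRP_LONGS(x) (GRP_BYTES(x) / sizeof(unsigned long))

/* Hypothetical stand-in for the per-socket membership state. */
struct group_set {
	unsigned long *bits;
	unsigned int ngroups;
};

/* Grow the bitmap so it can hold at least `groups` bits; never shrinks.
 * Returns 0 on success, -1 if allocation fails (old bitmap stays valid). */
static int group_set_grow(struct group_set *gs, unsigned int groups)
{
	size_t old_bytes = GRP_BYTES(gs->ngroups);
	size_t new_bytes = GRP_BYTES(groups);
	unsigned long *nb;

	if (gs->ngroups >= groups)
		return 0;

	nb = realloc(gs->bits, new_bytes);
	if (nb == NULL)
		return -1;

	/* realloc() does not zero the new tail, so clear it explicitly. */
	memset((char *)nb + old_bytes, 0, new_bytes - old_bytes);

	gs->bits = nb;
	gs->ngroups = groups;
	return 0;
}
```

Starting from `{ NULL, 0 }`, the first call behaves like a zeroing allocation, and later calls keep the existing membership bits intact.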

Signed-off-by: Johannes Berg <[EMAIL PROTECTED]>

---
 include/linux/netlink.h  |1 
 net/netlink/af_netlink.c |   86 +++
 2 files changed, 66 insertions(+), 21 deletions(-)

--- wireless-dev.orig/net/netlink/af_netlink.c  2007-07-17 14:05:30.210964463 +0200
+++ wireless-dev/net/netlink/af_netlink.c   2007-07-17 14:05:30.720964463 +0200
@@ -62,6 +62,7 @@
 #include 
 
 #define NLGRPSZ(x) (ALIGN(x, sizeof(unsigned long) * 8) / 8)
+#define NLGRPLONGS(x)  (NLGRPSZ(x)/sizeof(unsigned long))
 
 struct netlink_sock {
/* struct sock has to be the first member of netlink_sock */
@@ -314,10 +315,12 @@ netlink_update_listeners(struct sock *sk
unsigned long mask;
unsigned int i;
 
-   for (i = 0; i < NLGRPSZ(tbl->groups)/sizeof(unsigned long); i++) {
+   for (i = 0; i < NLGRPLONGS(tbl->groups); i++) {
mask = 0;
-   sk_for_each_bound(sk, node, &tbl->mc_list)
-   mask |= nlk_sk(sk)->groups[i];
+   sk_for_each_bound(sk, node, &tbl->mc_list) {
+   if (i < NLGRPLONGS(nlk_sk(sk)->ngroups))
+   mask |= nlk_sk(sk)->groups[i];
+   }
tbl->listeners[i] = mask;
}
/* this function is only called with the netlink table "grabbed", which
@@ -555,26 +558,37 @@ netlink_update_subscriptions(struct sock
nlk->subscriptions = subscriptions;
 }
 
-static int netlink_alloc_groups(struct sock *sk)
+static int netlink_realloc_groups(struct sock *sk)
 {
struct netlink_sock *nlk = nlk_sk(sk);
unsigned int groups;
+   unsigned long *new_groups;
int err = 0;
 
netlink_lock_table();
groups = nl_table[sk->sk_protocol].groups;
if (!nl_table[sk->sk_protocol].registered)
err = -ENOENT;
-   netlink_unlock_table();
 
if (err)
-   return err;
+   goto out_unlock;
 
-   nlk->groups = kzalloc(NLGRPSZ(groups), GFP_KERNEL);
-   if (nlk->groups == NULL)
-   return -ENOMEM;
+   if (nlk->ngroups >= groups)
+   goto out_unlock;
+
+   new_groups = krealloc(nlk->groups, NLGRPSZ(groups), GFP_KERNEL);
+   if (new_groups == NULL) {
+   err = -ENOMEM;
+   goto out_unlock;
+   }
+   memset((char*)new_groups + NLGRPSZ(nlk->ngroups), 0,
+  NLGRPSZ(groups) - NLGRPSZ(nlk->ngroups));
+
+   nlk->groups = new_groups;
nlk->ngroups = groups;
-   return 0;
+ out_unlock:
+   netlink_unlock_table();
+   return err;
 }
 
 static int netlink_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
@@ -591,11 +605,9 @@ static int netlink_bind(struct socket *s
if (nladdr->nl_groups) {
if (!netlink_capable(sock, NL_NONROOT_RECV))
return -EPERM;
-   if (nlk->groups == NULL) {
-   err = netlink_alloc_groups(sk);
-   if (err)
-   return err;
-   }
+   err = netlink_realloc_groups(sk);
+   if (err)
+   return err;
}
 
if (nlk->pid) {
@@ -839,10 +851,18 @@ retry:
 int netlink_has_listeners(struct sock *sk, unsigned int group)
 {
int res = 0;
+   unsigned long *listeners;
 
BUG_ON(!(nlk_sk(sk)->flags & NETLINK_KERNEL_SOCKET));
+
+   rcu_read_lock();
+   listeners = rcu_dereference(nl_table[sk->sk_protocol].listeners);
+
if (group - 1 < nl_table[sk->sk_protocol].groups)
-   res = test_bit(group - 1, nl_table[sk->sk_protocol].listeners);
+   res = test_bit(group - 1, listeners);
+
+   rcu_read_unlock();
+
return res;
 }
 EXPORT_SYMBOL_GPL(netlink_has_listeners);
@@ -1037,11 +1057,9 @@ static int netlink_setsockopt(struct soc
 
if (!netlink_capable(sock, NL_NONROOT_RECV))
return -EPERM;
-   if (nlk->groups == NULL) {
-   err = netlink_alloc_groups(sk);
-   if (err)
-   return err;
-   }
+   err = netlink_realloc_groups(sk);
+   if (err)
+   return err;
if (!val || val - 1 >= nlk->ngroups)
return -EINVAL;
netlink_table_grab();
@@ -1328,6 +1346,32 @@ out_sock_release:
return NULL;
 }
 
+int netlink_change_ngroups(int unit, unsigned int groups)
+{
+   unsigned long *listeners, *old = NULL;
+   int err = 0;
+
+   netlink_table_grab();
+   if (NLGRPSZ(nl_table[unit].groups) < NLGRPSZ(groups)) {
+   listeners = kzalloc(NLGRPSZ(groups), GFP_ATOMI

[patch 2/3] netlink: allow removing multicast groups

2007-07-17 Thread Johannes Berg

Allow kicking listeners out of a multicast group when necessary
(for example if that group is going to be removed.)
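
The bookkeeping that this patch factors out (test the old bit, adjust the subscription count, then set or clear the bit) can be sketched in plain C as below. This is an illustration only: `struct member` and `member_update()` are hypothetical names, the bitmap is fixed at two words, and no locking is shown, whereas the kernel code runs with the netlink table grabbed.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical per-socket state: group N lives in bit N - 1. */
struct member {
	unsigned long groups[2];
	unsigned int subscriptions;
};

/* Join (is_new != 0) or leave (is_new == 0) a multicast group and keep
 * the subscription count consistent with the bitmap. */
static void member_update(struct member *m, unsigned int group, int is_new)
{
	unsigned int bit = group - 1;
	unsigned long mask, *word;
	int was_set, now_set = !!is_new;

	assert(group >= 1 && bit < 2 * BITS_PER_LONG);

	mask = 1UL << (bit % BITS_PER_LONG);
	word = &m->groups[bit / BITS_PER_LONG];
	was_set = (*word & mask) != 0;

	/* Joining twice or leaving twice must not change the count. */
	m->subscriptions = m->subscriptions - was_set + now_set;

	if (now_set)
		*word |= mask;
	else
		*word &= ~mask;
}
```

Kicking every listener out of a group then amounts to calling the update with `is_new == 0` for each bound socket, which is what the new helper in the patch does.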

Signed-off-by: Johannes Berg <[EMAIL PROTECTED]>

---
 include/linux/netlink.h  |1 +
 net/netlink/af_netlink.c |   47 ++-
 2 files changed, 35 insertions(+), 13 deletions(-)

--- wireless-dev.orig/include/linux/netlink.h   2007-07-17 14:05:30.720964463 +0200
+++ wireless-dev/include/linux/netlink.h    2007-07-17 14:05:31.250964463 +0200
@@ -162,6 +162,7 @@ extern struct sock *netlink_kernel_creat
  struct mutex *cb_mutex,
  struct module *module);
 extern int netlink_change_ngroups(int unit, unsigned int groups);
+extern void netlink_clear_multicast_users(int unit, unsigned int group);
 extern void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err);
 extern int netlink_has_listeners(struct sock *sk, unsigned int group);
 extern int netlink_unicast(struct sock *ssk, struct sk_buff *skb, __u32 pid, int nonblock);
--- wireless-dev.orig/net/netlink/af_netlink.c  2007-07-17 14:05:30.720964463 +0200
+++ wireless-dev/net/netlink/af_netlink.c   2007-07-17 14:05:31.250964463 +0200
@@ -1027,6 +1027,24 @@ void netlink_set_err(struct sock *ssk, u
read_unlock(&nl_table_lock);
 }
 
+static void netlink_update_socket_mc(struct netlink_sock *nlk,
+unsigned int group,
+int is_new)
+{
+   int old, new = !!is_new, subscriptions;
+
+   netlink_table_grab();
+   old = test_bit(group - 1, nlk->groups);
+   subscriptions = nlk->subscriptions - old + new;
+   if (new)
+   __set_bit(group - 1, nlk->groups);
+   else
+   __clear_bit(group - 1, nlk->groups);
+   netlink_update_subscriptions(&nlk->sk, subscriptions);
+   netlink_update_listeners(&nlk->sk);
+   netlink_table_ungrab();
+}
+
 static int netlink_setsockopt(struct socket *sock, int level, int optname,
  char __user *optval, int optlen)
 {
@@ -1052,9 +1070,6 @@ static int netlink_setsockopt(struct soc
break;
case NETLINK_ADD_MEMBERSHIP:
case NETLINK_DROP_MEMBERSHIP: {
-   unsigned int subscriptions;
-   int old, new = optname == NETLINK_ADD_MEMBERSHIP ? 1 : 0;
-
if (!netlink_capable(sock, NL_NONROOT_RECV))
return -EPERM;
err = netlink_realloc_groups(sk);
@@ -1062,16 +1077,8 @@ static int netlink_setsockopt(struct soc
return err;
if (!val || val - 1 >= nlk->ngroups)
return -EINVAL;
-   netlink_table_grab();
-   old = test_bit(val - 1, nlk->groups);
-   subscriptions = nlk->subscriptions - old + new;
-   if (new)
-   __set_bit(val - 1, nlk->groups);
-   else
-   __clear_bit(val - 1, nlk->groups);
-   netlink_update_subscriptions(sk, subscriptions);
-   netlink_update_listeners(sk);
-   netlink_table_ungrab();
+   netlink_update_socket_mc(nlk, val,
+optname == NETLINK_ADD_MEMBERSHIP);
err = 0;
break;
}
@@ -1372,6 +1379,20 @@ int netlink_change_ngroups(int unit, uns
 }
 EXPORT_SYMBOL(netlink_change_ngroups);
 
+void netlink_clear_multicast_users(int unit, unsigned int group)
+{
+   struct sock *sk;
+   struct hlist_node *node;
+
+   read_lock(&nl_table_lock);
+
+   sk_for_each_bound(sk, node, &nl_table[unit].mc_list)
+   netlink_update_socket_mc(nlk_sk(sk), group, 0);
+
+   read_unlock(&nl_table_lock);
+}
+EXPORT_SYMBOL(netlink_clear_multicast_users);
+
 void netlink_set_nonroot(int protocol, unsigned int flags)
 {
if ((unsigned int)protocol < MAX_LINKS)


[patch 0/3] dynamic generic netlink multicast

2007-07-17 Thread Johannes Berg

This patch series adds dynamic generic netlink multicast groups.
The ACPI people are open to rebasing their patches on top of this.
Both Jamal and Patrick gave it a quick review (Patrick said he'll
review it again when he gets around) and we've discussed the API
for a while ending up with this. Please apply.

johannes


[patch 3/3] generic netlink: dynamic multicast groups

2007-07-17 Thread Johannes Berg

Introduce API to dynamically register and unregister multicast groups.
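
As a rough userspace illustration of the ID allocation scheme in this patch (a bitmap of group IDs in use, with IDs 0 and 1 pre-reserved and the map grown by one word when it fills up), consider the sketch below. All names here (`struct id_map`, `id_map_init()`, `id_map_alloc()`, `id_map_release()`) are hypothetical, a linear scan replaces `find_first_zero_bit()`, and there is no locking.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical allocator state: one bit per ID, set means "in use". */
struct id_map {
	unsigned long *words;
	size_t nwords;
};

static int id_map_init(struct id_map *m)
{
	m->words = calloc(1, sizeof(unsigned long));
	if (m->words == NULL)
		return -1;
	m->nwords = 1;
	m->words[0] = 0x3;	/* IDs 0 and 1 are reserved */
	return 0;
}

/* Returns the lowest free ID, growing the map by one word if needed,
 * or -1 if memory runs out. */
static long id_map_alloc(struct id_map *m)
{
	size_t nbits = m->nwords * BITS_PER_LONG;
	unsigned long *nw;
	size_t id;

	for (id = 0; id < nbits; id++) {
		unsigned long mask = 1UL << (id % BITS_PER_LONG);

		if (!(m->words[id / BITS_PER_LONG] & mask)) {
			m->words[id / BITS_PER_LONG] |= mask;
			return (long)id;
		}
	}

	/* Map is full: add one word and hand out its first bit. */
	nw = realloc(m->words, (m->nwords + 1) * sizeof(unsigned long));
	if (nw == NULL)		/* check the new pointer, not the old one */
		return -1;
	nw[m->nwords] = 1UL;
	m->words = nw;
	m->nwords++;
	return (long)nbits;
}

static void id_map_release(struct id_map *m, size_t id)
{
	assert(id < m->nwords * BITS_PER_LONG);
	m->words[id / BITS_PER_LONG] &= ~(1UL << (id % BITS_PER_LONG));
}
```

The first ID handed out is 2, released IDs are reused before the map grows, and growth only happens once every bit in the existing words is taken.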

Signed-off-by: Johannes Berg <[EMAIL PROTECTED]>
---
 include/linux/genetlink.h |   13 +++
 include/net/genetlink.h   |   22 +
 net/netlink/genetlink.c   |  190 --
 3 files changed, 218 insertions(+), 7 deletions(-)

--- wireless-dev.orig/include/net/genetlink.h   2007-07-17 14:05:13.760964463 +0200
+++ wireless-dev/include/net/genetlink.h    2007-07-17 14:05:31.780964463 +0200
@@ -5,6 +5,22 @@
 #include 
 
 /**
+ * struct genl_multicast_group - generic netlink multicast group
+ * @name: name of the multicast group, names are per-family
+ * @id: multicast group ID, assigned by the core, to use with
+ *  genlmsg_multicast().
+ * @list: list entry for linking
+ * @family: pointer to family, need not be set before registering
+ */
+struct genl_multicast_group
+{
+   struct genl_family  *family;/* private */
+   struct list_headlist;   /* private */
+   charname[GENL_NAMSIZ];
+   u32 id;
+};
+
+/**
  * struct genl_family - generic netlink family
  * @id: protocol family idenfitier
  * @hdrsize: length of user specific header in bytes
@@ -14,6 +30,7 @@
  * @attrbuf: buffer to store parsed attributes
  * @ops_list: list of all assigned operations
  * @family_list: family list
+ * @mcast_groups: multicast groups list
  */
 struct genl_family
 {
@@ -25,6 +42,7 @@ struct genl_family
struct nlattr **attrbuf;/* private */
struct list_headops_list;   /* private */
struct list_headfamily_list;/* private */
+   struct list_headmcast_groups;   /* private */
 };
 
 /**
@@ -73,6 +91,10 @@ extern int genl_register_family(struct g
 extern int genl_unregister_family(struct genl_family *family);
 extern int genl_register_ops(struct genl_family *, struct genl_ops *ops);
 extern int genl_unregister_ops(struct genl_family *, struct genl_ops *ops);
+extern int genl_register_mc_group(struct genl_family *family,
+ struct genl_multicast_group *grp);
+extern void genl_unregister_mc_group(struct genl_family *family,
+struct genl_multicast_group *grp);
 
 extern struct sock *genl_sock;
 
--- wireless-dev.orig/net/netlink/genetlink.c   2007-07-17 14:05:13.860964463 +0200
+++ wireless-dev/net/netlink/genetlink.c    2007-07-17 14:05:31.780964463 +0200
@@ -3,6 +3,7 @@
  *
  * Authors:Jamal Hadi Salim
  * Thomas Graf <[EMAIL PROTECTED]>
+ * Johannes Berg <[EMAIL PROTECTED]>
  */
 
 #include 
@@ -13,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -42,6 +44,17 @@ static void genl_unlock(void)
 #define GENL_FAM_TAB_MASK  (GENL_FAM_TAB_SIZE - 1)
 
 static struct list_head family_ht[GENL_FAM_TAB_SIZE];
+/*
+ * Bitmap of multicast groups that are currently in use.
+ *
+ * To avoid an allocation at boot of just one unsigned long,
+ * declare it global instead.
+ * Bit 0 is marked as already used since group 0 is invalid,
+ * bit 1 is marked as already used since group 1 is the controller group.
+ */
+static unsigned long mc_group_start = 0x3;
+static unsigned long *mc_groups = &mc_group_start;
+static unsigned long mc_groups_longs = 1;
 
 static int genl_ctrl_event(int event, void *data);
 
@@ -116,6 +129,77 @@ static inline u16 genl_generate_id(void)
return id_gen_idx;
 }
 
+int genl_register_mc_group(struct genl_family *family,
+  struct genl_multicast_group *grp)
+{
+   int id = find_first_zero_bit(mc_groups,
+mc_groups_longs * BITS_PER_LONG);
+   unsigned long *new_groups;
+   size_t nlen = (mc_groups_longs + 1) * sizeof(unsigned long);
+   int err;
+
+   genl_lock();
+
+   if (id >= mc_groups_longs * BITS_PER_LONG) {
+   if (mc_groups == &mc_group_start) {
+   new_groups = kzalloc(nlen, GFP_KERNEL);
+   if (!new_groups) {
+   err = -ENOMEM;
+   goto out;
+   }
+   mc_groups = new_groups;
+   *mc_groups = mc_group_start;
+   } else {
+   new_groups = krealloc(mc_groups, nlen, GFP_KERNEL);
+   if (!new_groups) {
+   err = -ENOMEM;
+   goto out;
+   }
+   mc_groups = new_groups;
+   mc_groups[mc_groups_longs] = 0;
+   }
+   mc_groups_longs++;
+   }
+
+   err = netlink_change_ngroups(NETLINK_GENERIC,
+mc_groups_longs * BITS_PER_LONG);
+   if (err)
+   goto out;
+
+

[PATCH] negative groups in netlink_setsockopt

2007-07-17 Thread Johannes Berg

Reading netlink_setsockopt, it's not immediately clear why there isn't a
bug when you pass in negative numbers. The reason is that the >=
comparison is really unsigned, although 'val' is signed, because
nlk->ngroups is unsigned. Make 'val' unsigned too.
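
The conversion described above can be checked with a small standalone C sketch. The function names are hypothetical. The explicit cast in the signed variant spells out what C's usual arithmetic conversions do implicitly when an int is compared with an unsigned int; it is written out here only to keep the example free of sign-compare warnings.

```c
#include <assert.h>

/* Range check as it behaves with a signed 'val': val - 1 is computed as
 * int, then converted to unsigned for the comparison, so a negative
 * value turns into a very large one and is rejected. */
static int group_rejected_signed(int val, unsigned int ngroups)
{
	return !val || (unsigned int)(val - 1) >= ngroups;
}

/* Same check with 'val' declared unsigned, as the patch does; the
 * result is identical but the intent is visible in the types. */
static int group_rejected_unsigned(unsigned int val, unsigned int ngroups)
{
	return !val || val - 1 >= ngroups;
}
```

Both variants accept 1 through ngroups and reject 0, ngroups + 1, and anything that started out negative.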

Signed-off-by: Johannes Berg <[EMAIL PROTECTED]>

---
 net/netlink/af_netlink.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- wireless-dev.orig/net/netlink/af_netlink.c  2007-07-17 14:05:14.580964463 +0200
+++ wireless-dev/net/netlink/af_netlink.c   2007-07-17 14:05:30.210964463 +0200
@@ -1012,7 +1012,8 @@ static int netlink_setsockopt(struct soc
 {
struct sock *sk = sock->sk;
struct netlink_sock *nlk = nlk_sk(sk);
-   int val = 0, err;
+   unsigned int val = 0;
+   int err;
 
if (level != SOL_NETLINK)
return -ENOPROTOOPT;



[PPPOL2TP 2/2]: Reset meta-data in xmit function

2007-07-17 Thread Patrick McHardy

[PPPOL2TP]: Reset meta-data in xmit function

Reset netfilter data and IP CB, fix dst_entry leak.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 308ac1b6249226730b70fcf7c13a289c27ce2bf3
tree ea05987add4c9423af023e4bc9ca773ab70568c3
parent 86394ab99d7a4532cf23f8d456aecfa6e3085dfd
author Patrick McHardy <[EMAIL PROTECTED]> Tue, 17 Jul 2007 14:16:51 +0200
committer Patrick McHardy <[EMAIL PROTECTED]> Tue, 17 Jul 2007 14:16:51 +0200

 drivers/net/pppol2tp.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/pppol2tp.c b/drivers/net/pppol2tp.c
index 856610f..f871760 100644
--- a/drivers/net/pppol2tp.c
+++ b/drivers/net/pppol2tp.c
@@ -1049,7 +1049,13 @@ static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
 		printk("\n");
 	}
 
+	memset(&(IPCB(skb2)->opt), 0, sizeof(IPCB(skb2)->opt));
+	IPCB(skb2)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED |
+			   IPSKB_REROUTED);
+	nf_reset(skb2);
+
 	/* Get routing info from the tunnel socket */
+	dst_release(skb2->dst);
 	skb2->dst = sk_dst_get(sk_tun);
 
 	/* Queue the packet to IP for output */

[PPPOL2TP 1/2]: Fix use-after-free

2007-07-17 Thread Patrick McHardy

[PPPOL2TP]: Fix use-after-free

Don't use skb->len after passing it to ip_queue_xmit.
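
The pattern behind this fix, reading what you need from a buffer before handing ownership to a function that may free it, can be shown in a self-contained userspace sketch. `struct buf`, `consume()` and `send_and_count()` are hypothetical stand-ins for `struct sk_buff`, `ip_queue_xmit()` and the driver's send path; they are not kernel APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet buffer. */
struct buf {
	size_t len;
};

struct stats {
	unsigned long tx_packets, tx_bytes, tx_errors;
};

static struct buf *buf_new(size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (b != NULL)
		b->len = len;
	return b;
}

/* Takes ownership of b and frees it, as a transmit path may do. */
static int consume(struct buf *b)
{
	free(b);
	return 0;
}

static int send_and_count(struct buf *b, struct stats *st)
{
	size_t len = b->len;	/* read before ownership is handed over */
	int rc = consume(b);	/* b must not be touched after this call */

	if (rc >= 0) {
		st->tx_packets++;
		st->tx_bytes += len;	/* uses the saved copy, not b->len */
	} else {
		st->tx_errors++;
	}
	return rc;
}
```

Reading `b->len` after `consume()` would be the same class of bug that the patch removes.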

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 86394ab99d7a4532cf23f8d456aecfa6e3085dfd
tree 704cfbb8d9c06f79c21a54f189608db1d1b06915
parent 2e27afb300b56d83bb03fbfa68852b9c1e2920c6
author Patrick McHardy <[EMAIL PROTECTED]> Tue, 17 Jul 2007 14:11:37 +0200
committer Patrick McHardy <[EMAIL PROTECTED]> Tue, 17 Jul 2007 14:11:37 +0200

 drivers/net/pppol2tp.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/pppol2tp.c b/drivers/net/pppol2tp.c
index 5891a0f..856610f 100644
--- a/drivers/net/pppol2tp.c
+++ b/drivers/net/pppol2tp.c
@@ -824,6 +824,7 @@ static int pppol2tp_sendmsg(struct kiocb *iocb, struct socket *sock, struct msgh
 	struct pppol2tp_session *session;
 	struct pppol2tp_tunnel *tunnel;
 	struct udphdr *uh;
+	unsigned int len;
 
 	error = -ENOTCONN;
 	if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED))
@@ -912,14 +913,15 @@ static int pppol2tp_sendmsg(struct kiocb *iocb, struct socket *sock, struct msgh
 	}
 
 	/* Queue the packet to IP for output */
+	len = skb->len;
 	error = ip_queue_xmit(skb, 1);
 
 	/* Update stats */
 	if (error >= 0) {
 		tunnel->stats.tx_packets++;
-		tunnel->stats.tx_bytes += skb->len;
+		tunnel->stats.tx_bytes += len;
 		session->stats.tx_packets++;
-		session->stats.tx_bytes += skb->len;
+		session->stats.tx_bytes += len;
 	} else {
 		tunnel->stats.tx_errors++;
 		session->stats.tx_errors++;
@@ -958,6 +960,7 @@ static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
 	__wsum csum = 0;
 	struct sk_buff *skb2 = NULL;
 	struct udphdr *uh;
+	unsigned int len;
 
 	if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED))
 		goto abort;
@@ -1050,14 +1053,15 @@ static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
 	skb2->dst = sk_dst_get(sk_tun);
 
 	/* Queue the packet to IP for output */
+	len = skb2->len;
 	rc = ip_queue_xmit(skb2, 1);
 
 	/* Update stats */
 	if (rc >= 0) {
 		tunnel->stats.tx_packets++;
-		tunnel->stats.tx_bytes += skb2->len;
+		tunnel->stats.tx_bytes += len;
 		session->stats.tx_packets++;
-		session->stats.tx_bytes += skb2->len;
+		session->stats.tx_bytes += len;
 	} else {
 		tunnel->stats.tx_errors++;
 		session->stats.tx_errors++;

Re: [NET]: gen_estimator deadlock fix

2007-07-17 Thread Jarek Poplawski

On Tue, Jul 17, 2007 at 02:01:48PM +0200, Patrick McHardy wrote:
...

Thanks,
Jarek P.

Re: [NET]: gen_estimator deadlock fix

2007-07-17 Thread Patrick McHardy

Jarek Poplawski wrote:
> This patch looks fine, but while checking for this lock I've found
> another strange thing: for actions tcfc_stats_lock is used here, which
> is equivalent to tcfc_lock; so, in gen_kill_estimator we get this lock
> sometimes after dev->queue_lock; this order is also possible during
> tc_classify if actions are used; on the other hand act_mirred calls
> dev_queue_xmit under this lock, so dev->queue_lock is taken in another
> order. I hope it's with different devs, and there is no real deadlock
> possible, but this all is a bit queer...

It *should* be a different device, but AFAIK nothing enforces this.
There are quite a few possible deadlocks with TC actions. In the mid-term,
things like the mirred action need to be redesigned to inject packets
from a different context.

> I don't know actions enough, but it seems, if it's possible that they
> are always run only from tc_classify, with dev->queue_lock, maybe it
> would be simpler to use this lock for actions' stats with
> gen_estimator too.

The same action can be shared between devices, so they need separate
locks.

Re: [NET]: gen_estimator deadlock fix

2007-07-17 Thread Jarek Poplawski

On Mon, Jul 16, 2007 at 08:45:05PM +0300, Ranko Zivojnovic wrote:
...
> [NET] gen_estimator deadlock fix
> 
> -Fixes ABBA deadlock noted by Patrick McHardy <[EMAIL PROTECTED]>:
> 
> > There is at least one ABBA deadlock, est_timer() does:
> > read_lock(&est_lock)
> > spin_lock(e->stats_lock) (which is dev->queue_lock)
> >
> > and qdisc_destroy calls htb_destroy under dev->queue_lock, which
> > calls htb_destroy_class, then gen_kill_estimator and this
> > write_locks est_lock.
> 
> To fix the ABBA deadlock the rate estimators are now kept on an rcu list.
> 
> -The est_lock changes the use from protecting the list to protecting
> the update to the 'bstat' pointer in order to avoid NULL dereferencing.

This patch looks fine, but while checking for this lock I've found
another strange thing: for actions tcfc_stats_lock is used here, which
is equivalent to tcfc_lock; so, in gen_kill_estimator we get this lock
sometimes after dev->queue_lock; this order is also possible during
tc_classify if actions are used; on the other hand act_mirred calls
dev_queue_xmit under this lock, so dev->queue_lock is taken in another
order. I hope it's with different devs, and there is no real deadlock
possible, but this all is a bit queer...

I don't know actions enough, but it seems, if it's possible that they
are always run only from tc_classify, with dev->queue_lock, maybe it
would be simpler to use this lock for actions' stats with
gen_estimator too. And if gen_kill_estimator is sometimes run with
dev->queue_lock, maybe doing this always would make this locking
really understandable and less prone to such inversions (plus
est_lock could be forgotten)?
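
To make the lock-order inversion discussed in this thread concrete, here is a toy order tracker in plain C. It is a much-simplified illustration of the idea behind lock-order checking and not kernel code: `struct order_tracker` and its functions are hypothetical, locks are just small integer IDs, and nothing is actually locked.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOCKS 8

struct order_tracker {
	/* before[a][b] != 0 means lock a was held while b was taken */
	unsigned char before[MAX_LOCKS][MAX_LOCKS];
	int held[MAX_LOCKS];
	int nheld;
};

static void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/* Record that lock `id` is being taken. Returns 1 if this acquisition
 * reverses an order seen earlier (a potential ABBA deadlock), else 0. */
static int tracker_acquire(struct order_tracker *t, int id)
{
	int i, inversion = 0;

	assert(id >= 0 && id < MAX_LOCKS && t->nheld < MAX_LOCKS);

	for (i = 0; i < t->nheld; i++) {
		int h = t->held[i];

		if (t->before[id][h])
			inversion = 1;	/* id-then-h was seen before */
		t->before[h][id] = 1;	/* remember h-then-id */
	}
	t->held[t->nheld++] = id;
	return inversion;
}

static void tracker_release(struct order_tracker *t, int id)
{
	int i;

	for (i = 0; i < t->nheld; i++) {
		if (t->held[i] == id) {
			t->held[i] = t->held[--t->nheld];
			return;
		}
	}
}
```

Feeding it the two paths quoted above (est_lock then the queue lock in one, the queue lock then est_lock in the other) flags the second path's inner acquisition as an inversion.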

Regards,
Jarek P.

Re: [RFC 0/3] netlink/generic netlink multicast group rework

2007-07-17 Thread Johannes Berg

On Mon, 2007-07-16 at 15:42 +0200, Patrick McHardy wrote:

> Sorry, I'm behind with my plans. I'll try to get to it ASAP,
> but I don't want to stand in your way until then.

Don't want to hurry you or anything, just thought maybe it had dropped
off.

>  So here's
> a preliminary ACK based on my first swift review, in case
> I notice something later I can just send a patch on top.

Just let me know if you find any problems, I can fix them as well. For
now I'll resend for inclusion then I guess and see that the ACPI guys
get a ping to update their code.

johannes


Re: Linux 2.6.22: had to reboot after OOM

2007-07-17 Thread Andrew Morton

On Sun, 15 Jul 2007 15:03:21 +0300 Sami Farin <[EMAIL PROTECTED]> wrote:

> After I got this error [1], system got real slow, like 386 having 32 MB of RAM
> and swapping constantly.
> My system is P4 SMP with 1GB of RAM.
> 
> I got this same behavior with 2.6.19, too, but then I used GNU cp v6.9
> which had micro-optimization which did not bother doing read() for regular
> files, like /proc/vmstat , instead, it generated 0-byte destination file
> (I noticed that only after rebooting).
> So I do not have useful debug info for 2.6.19.
> 
> And now my version of cp does not have that micro-optimization.
> 
> I also attached vmstat and slabinfo , in case somebody can figure out
> something out of them.
> I also now use --probes=5 for ipset nethash... uses less mem but is slower.

I don't know why your system went bad like that.

> procs ---memory-- ---swap-- -io --system-- -cpu--
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
>  0  2 762376 748392  0  17104   1185265   6453 10  3 85  3  0
>  0  3 762376 748664  0  17144  456   12   90824  237  3044  2  3 26 70  0
>  0  2 762376 749544  0  16492  464  284  1440   288  248  3633  2  5 14 79  0
>  0  3 762376 749916  0  16392  4320  136816  239  3360  1  2  5 93  0
>  0  3 762376 751108  0  15740  512  432  1656   460  214  3751  1  3 28 69  0
>  0  7 762376 751740  0  15512  532  104  1920   131  237  4014  2  4  5 90  0
>  0 10 762436 755064  0  15652  200 3172   992  3264  274  5086  1  3  1 96  0
>  0 11 762504 757208  0  15156  144  836   624   926  204  2602  1  2  0 98  0
>  0 13 762680 766384  0  12740   36 728496  7308  252  8643  0  5  0 95  0
>  0 14 762772 771312  0  11696   12 3908   132  3912  253  5153  0  3  0 97  0
>  0 13 763044 778536  0  10356  184 8088   572  8088  249  6889  1  3  0 97  0
>  0 15 763092 781380  0  102840 2136   532  2160  176  2387  1  2  0 98  0
>  0 16 763360 785428  0  10008   60 4884  1312  4985  281  4798  0  5  0 96  0
>  0 19 763636 791192  0   8548   24 6264   476  6268  219  6732  0  3  0 97  0
>  0 18 763824 799184  0   62400 750052  7504  217  7080  1  6  0 94  0
>  0 18 763872 802196  0   58160 2628   552  2640  194  3851  0  4  0 96  0
>  0 28 764048 805072  0   5168   92 1760  1180  1772  185  2703  0  5  0 94  0
>  0 30 764048 806032  0   5256   36  164   364   234  126  1448  0  1  0 99  0
>  0 31 764048 807728  0   4732   32   409284  152  1824  0  3  0 97  0
>  0 35 764056 809372  0   4332 1100  532  9084   804  194  1986  0  7  0 94  0
>  0 74 764012 839544  0   8100 16040  566012 4216 46853  0  8  0 92  0
> 
> procs ---memory-- ---swap-- -io --system-- -cpu--
>  r  b   swpd   free  inact active   si   so    bi    bo   in    cs us sy id wa st
>  0 14 762504 757108   4004  90724   1185265   6453 10  3 85  3  0
>  0 13 762676 764608   8768  78084   44 6324   288  6370  251  7697  0  4  0 95  0
>  0 14 762772 770832  12404  68328   12 4868   128  4880  243  6161  0  4  0 96  0
>  0 13 763044 778536  24248  48736  180 7908   548  7912  296  7208  0  3  0 97  0
>  0 16 763092 781380  21196  490044 2312   560  2336  185  2524  1  1  0 98  0
>  0 16 763360 784584  40040  27048   60 4088  1312  4189  267  3964  0  4  0 96  0
>  0 19 763636 791192  36552  23804   24 7064   392  7068  227  7600  0  4  0 96  0
>  0 18 763824 798896  36928  155680 7352   136  7356  219  6902  0  5  0 94  0
>  0 19 763872 802088  41372   78760 2776   548  2788  195  3917  0  4  0 96  0
>  0 25 764048 804636  39896   6932   56 1760   920  1772  170  2602  0  4  0 95  0
>  0 30 764048 806032  39308   6272   72  164   628   234  131  1579  0  2  0 98  0
>  0 30 764048 807500  38316   5768   32   408468  149  1630  0  2  0 98  0
>  1 34 764056 808476  37900   5444   20  192   660   242  184  1883  0  6  0 94  0
>  1 37 764056 811148  38772   1740   20  100   388   104  168  2409  0  5  0 95  0
>  0 38 764056 813056  34164   453600   12447  227  2590  0  6  0 94  0
>  0 38 764056 815500  32924   3308   400   608 5  228  2448  0  4  0 96  0
> 10 37 764056 817764  30404   38000   16   26020  323  3795  0 10  0 90  0
>  1 37 764056 819548  28552   37440   20   14856  207  2275  0  6  0 94  0
>  1 38 764056 820828  27004   4028   208   49621  216  1703  0  3  0 97  0
>  0 71 764056 820064  28228   3508  6000  1424 4  140   826  0  2  0 99  0
>  0 73 764028 822568  25928   3404   68   36   33248  205  2223  0  5  0 95  0
> 
> 
> 
> As a comparison, these are stats after booting, with similar workload:
> 
> procs ---memory-- ---swap-- -io --system-- 
>

Re: [patch 37/44] xen: add virtual network device driver

2007-07-17 Thread Stephen Hemminger


> +struct netfront_info {
> + struct list_head list;
> + struct net_device *netdev;
> +
> + struct net_device_stats stats;

There is now a net_device_stats element inside net_device on
2.6.21 or later.

> +
> + struct xen_netif_tx_front_ring tx;
> + struct xen_netif_rx_front_ring rx;
> +
> + spinlock_t   tx_lock;
> + spinlock_t   rx_lock;

It might be a performance advantage to reorder/align these
structure elements to put transmit hot elements together, and
put tx and rx on different cache lines?
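
A layout along the lines suggested here could look like the following userspace sketch. It assumes a 64-byte cache line and uses C11 `alignas`; the struct and field names are hypothetical and much smaller than the real netfront_info. Kernel code would more likely use the `____cacheline_aligned_in_smp` annotation instead.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size; the real value is CPU-specific */

/* Hypothetical hot per-direction state: ring indices plus their lock. */
struct ring_state {
	unsigned int prod;
	unsigned int cons;
	int lock;
};

struct nic_info {
	/* Cold, rarely written fields go first. */
	void *netdev;
	unsigned int evtchn;

	/* Transmit-hot fields grouped on their own cache line. */
	alignas(CACHE_LINE) struct ring_state tx;

	/* Receive-hot fields start on a different cache line, so a CPU
	 * working on tx does not bounce the line used for rx. */
	alignas(CACHE_LINE) struct ring_state rx;
};
```

The trade-off is padding: each aligned member rounds the structure up, so this is usually only worth doing for fields that different CPUs really do write concurrently.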

> + unsigned int evtchn;
> +
> + /* Receive-ring batched refills. */
> +#define RX_MIN_TARGET 8
> +#define RX_DFL_MIN_TARGET 64
> +#define RX_MAX_TARGET min_t(int, NET_RX_RING_SIZE, 256)
> + unsigned rx_min_target, rx_max_target, rx_target;
> + struct sk_buff_head rx_batch;
> +
> + struct timer_list rx_refill_timer;
> +
> + /*
> +  * {tx,rx}_skbs store outstanding skbuffs. Free tx_skb entries
> +  * are linked from tx_skb_freelist through skb_entry.link.
> +  *
> +  *  NB. Freelist index entries are always going to be less than
> +  *  PAGE_OFFSET, whereas pointers to skbs will always be equal or
> +  *  greater than PAGE_OFFSET: we use this property to distinguish
> +  *  them.
> +  */
> + union skb_entry {
> + struct sk_buff *skb;
> + unsigned link;
> + } tx_skbs[NET_TX_RING_SIZE];
> + grant_ref_t gref_tx_head;
> + grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
> + unsigned tx_skb_freelist;
> +
> + struct sk_buff *rx_skbs[NET_RX_RING_SIZE];
> + grant_ref_t gref_rx_head;
> + grant_ref_t grant_rx_ref[NET_RX_RING_SIZE];
> +
> + struct xenbus_device *xbdev;
> + int tx_ring_ref;
> + int rx_ring_ref;
> +
> + unsigned long rx_pfn_array[NET_RX_RING_SIZE];
> + struct multicall_entry rx_mcl[NET_RX_RING_SIZE+1];
> + struct mmu_update rx_mmu[NET_RX_RING_SIZE];
> +};

Re: [git patches] net driver updates

2007-07-17 Thread maximilian attems

On Mon, Jul 16, 2007 at 06:57:21PM -0400, Jeff Garzik wrote:
> Minor fixes and cleanups, and a wireless pull from Linville.
> 
> Please pull from 'upstream-linus' branch of
> master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git 
> upstream-linus
> 

Did you get the dgrs removal patch?
The mail didn't reach netdev because it was too large,
but you were on cc.

Shall I rebase?

--
maks
