Re: [PATCH] Fix race condition about network device name allocation

2007-06-13 Thread Dan Aloni
On Wed, Jun 13, 2007 at 09:36:31AM -0700, Stephen Hemminger wrote:
> On Wed, 13 Jun 2007 12:45:21 +0300
> Dan Aloni <[EMAIL PROTECTED]> wrote:
> 
> > On Mon, May 14, 2007 at 08:58:40AM -0700, Stephen Hemminger wrote:
> > > Kenji Kaneshige found this race between device removal and
> > > registration.  On unregister it is possible for the old device to
> > > exist, because sysfs file is still open.  A new device with 'eth%d'
> > > will select the same name, but sysfs kobject register will fial.
> > > 
> > > The following changes the shutdown order slightly. It hold a removes the 
> > > sysfs
> > > entries earlier (on unregister_netdevice), but holds a kobject reference.
> > > Then when todo runs the actual last put free happens.
> > > 
> > > Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
> > 
> > That patch breaks the bonding driver. After reverting it I avoid this crash:
> > 
>[..]
> > 
> 
> I assume this happens when bonded slave device is removed?

Yes, it's just a simple removal via sysfs.

> Which kernel version?

2.6.21.5

-- 
Dan Aloni
XIV LTD, http://www.xivstorage.com
da-x (at) monatomic.org, dan (at) xiv.co.il
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix race condition about network device name allocation

2007-06-13 Thread Jay Vosburgh

The following patch (based on a patch from Stephen Hemminger
<[EMAIL PROTECTED]>) removes use after free conditions in
the unregister path for the bonding master.  Without this patch, an
operation of the form "echo -bond0 > /sys/class/net/bonding_masters"
would trigger a NULL pointer dereference in sysfs.  I was not able to
induce the failure with the non-sysfs code path, but for consistency I
updated that code as well.

I also did some testing of the bonding /proc file being open
while the bond is being deleted, and didn't see any problems there.

Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 223517d..6287ffb 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4345,8 +4345,8 @@ static void bond_free_all(void)
bond_mc_list_destroy(bond);
/* Release the bonded slaves */
bond_release_all(bond_dev);
-   unregister_netdevice(bond_dev);
bond_deinit(bond_dev);
+   unregister_netdevice(bond_dev);
}
 
 #ifdef CONFIG_PROC_FS
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index a122baa..60cccf2 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -164,9 +164,9 @@ static ssize_t bonding_store_bonds(struct class *cls, const 
char *buffer, size_t
printk(KERN_INFO DRV_NAME
": %s is being deleted...\n",
bond->dev->name);
-   unregister_netdevice(bond->dev);
bond_deinit(bond->dev);
bond_destroy_sysfs_entry(bond);
+   unregister_netdevice(bond->dev);
rtnl_unlock();
goto out;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix race condition about network device name allocation

2007-06-13 Thread Stephen Hemminger
Bonding refers to device after unregistering. This has always been
a dangerous thing. The following UNTESTED should fix the problem.

--- a/drivers/net/bonding/bond_sysfs.c  2007-06-13 15:48:37.0 -0700
+++ b/drivers/net/bonding/bond_sysfs.c  2007-06-13 15:49:17.0 -0700
@@ -164,9 +164,10 @@ static ssize_t bonding_store_bonds(struc
printk(KERN_INFO DRV_NAME
": %s is being deleted...\n",
bond->dev->name);
-   unregister_netdevice(bond->dev);
+
bond_deinit(bond->dev);
bond_destroy_sysfs_entry(bond);
+   unregister_netdevice(bond->dev);
rtnl_unlock();
goto out;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix race condition about network device name allocation

2007-06-13 Thread Stephen Hemminger
On Wed, 13 Jun 2007 12:45:21 +0300
Dan Aloni <[EMAIL PROTECTED]> wrote:

> On Mon, May 14, 2007 at 08:58:40AM -0700, Stephen Hemminger wrote:
> > Kenji Kaneshige found this race between device removal and
> > registration.  On unregister it is possible for the old device to
> > exist, because sysfs file is still open.  A new device with 'eth%d'
> > will select the same name, but sysfs kobject register will fial.
> > 
> > The following changes the shutdown order slightly. It hold a removes the 
> > sysfs
> > entries earlier (on unregister_netdevice), but holds a kobject reference.
> > Then when todo runs the actual last put free happens.
> > 
> > Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
> 
> That patch breaks the bonding driver. After reverting it I avoid this crash:
> 
> <6>[1115260.637351] Ethernet Channel Bonding Driver: v3.1.2 (January 20, 2007)
> <6>[1115260.637358] bonding: MII link monitoring set to 100 ms
> <6>[1115260.637767] bonding: bond0 is being deleted...
> <1>[1115260.695812] Unable to handle kernel NULL pointer dereference at 
> 0020 RIP: 
> <1>[1115260.701504]  [] __lookup_hash+0x22/0x150
> <6>[1115260.709754] PGD 798bf067 PUD 798c4067 PMD 0 
> <0>[1115260.714394] Oops:  [1] SMP 
> [1]kdb> 
> [1]kdb> btc
> btc: cpu status: Currently on cpu 1
> Available cpus: 0(I), 1
> Stack traceback for pid 0
> 0x8058542000  10   I  0x80585710  swapper
> rspripFunction (args)
> 0x8066feb0 0x8046783f thread_return+0x5e
> 0x8066fec0 0x80469df9 _spin_unlock_irq+0x9
> 0x8066ff28 0x80209176 mwait_idle+0x46
> 0x8066ff50 0x802088e2 enter_idle+0x22
> 0x8066ff60 0x802090bc cpu_idle+0x5c
> 0x8066ff80 0x80207146 rest_init+0x26
> 0x8066ff90 0x80678c1a start_kernel+0x2ea
> 0x8066ffc0 0x8067815f _sinittext+0x15f
> Stack traceback for pid 66
> 0x81006a411850   661  11   R  0x81006a411b40 
> *platform_node
> rspripFunction (args)
> 0x81006a46dd98 0x802805e2 __lookup_hash+0x22
> 0x81006a46de00 0x80280cba lookup_one_len+0x9a
> 0x81006a46de20 0x802be671 sysfs_remove_group+0x31
> 0x81006a46de50 0x8800fe4a [bonding]bond_destroy_sysfs_entry+0x1a
> 0x81006a46de60 0x88011974 [bonding]bonding_store_bonds+0x214
> 0x81006a46deb0 0x8037c9d4 class_attr_store+0x24
> [1]more> 
> 0x81006a46dec0 0x802bbe30 sysfs_write_file+0x100
> 0x81006a46df10 0x80277d7e vfs_write+0xbe
> 0x81006a46df40 0x80278400 sys_write+0x50
> 0x81006a46df80 0x80209e6e system_call+0x7e
> 

I assume this happens when bonded slave device is removed?
Which kernel version?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix race condition about network device name allocation

2007-06-13 Thread Dan Aloni
On Mon, May 14, 2007 at 08:58:40AM -0700, Stephen Hemminger wrote:
> Kenji Kaneshige found this race between device removal and
> registration.  On unregister it is possible for the old device to
> exist, because sysfs file is still open.  A new device with 'eth%d'
> will select the same name, but sysfs kobject register will fial.
> 
> The following changes the shutdown order slightly. It hold a removes the sysfs
> entries earlier (on unregister_netdevice), but holds a kobject reference.
> Then when todo runs the actual last put free happens.
> 
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

That patch breaks the bonding driver. After reverting it I avoid this crash:

<6>[1115260.637351] Ethernet Channel Bonding Driver: v3.1.2 (January 20, 2007)
<6>[1115260.637358] bonding: MII link monitoring set to 100 ms
<6>[1115260.637767] bonding: bond0 is being deleted...
<1>[1115260.695812] Unable to handle kernel NULL pointer dereference at 
0020 RIP: 
<1>[1115260.701504]  [] __lookup_hash+0x22/0x150
<6>[1115260.709754] PGD 798bf067 PUD 798c4067 PMD 0 
<0>[1115260.714394] Oops:  [1] SMP 
[1]kdb> 
[1]kdb> btc
btc: cpu status: Currently on cpu 1
Available cpus: 0(I), 1
Stack traceback for pid 0
0x8058542000  10   I  0x80585710  swapper
rspripFunction (args)
0x8066feb0 0x8046783f thread_return+0x5e
0x8066fec0 0x80469df9 _spin_unlock_irq+0x9
0x8066ff28 0x80209176 mwait_idle+0x46
0x8066ff50 0x802088e2 enter_idle+0x22
0x8066ff60 0x802090bc cpu_idle+0x5c
0x8066ff80 0x80207146 rest_init+0x26
0x8066ff90 0x80678c1a start_kernel+0x2ea
0x8066ffc0 0x8067815f _sinittext+0x15f
Stack traceback for pid 66
0x81006a411850   661  11   R  0x81006a411b40 
*platform_node
rspripFunction (args)
0x81006a46dd98 0x802805e2 __lookup_hash+0x22
0x81006a46de00 0x80280cba lookup_one_len+0x9a
0x81006a46de20 0x802be671 sysfs_remove_group+0x31
0x81006a46de50 0x8800fe4a [bonding]bond_destroy_sysfs_entry+0x1a
0x81006a46de60 0x88011974 [bonding]bonding_store_bonds+0x214
0x81006a46deb0 0x8037c9d4 class_attr_store+0x24
[1]more> 
0x81006a46dec0 0x802bbe30 sysfs_write_file+0x100
0x81006a46df10 0x80277d7e vfs_write+0xbe
0x81006a46df40 0x80278400 sys_write+0x50
0x81006a46df80 0x80209e6e system_call+0x7e

-- 
Dan Aloni
XIV LTD, http://www.xivstorage.com
da-x (at) monatomic.org, dan (at) xiv.co.il
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Fix race condition about network device name allocation

2007-05-14 Thread Stephen Hemminger
Kenji Kaneshige found this race between device removal and
registration.  On unregister it is possible for the old device to
exist, because sysfs file is still open.  A new device with 'eth%d'
will select the same name, but sysfs kobject register will fial.

The following changes the shutdown order slightly. It hold a removes the sysfs
entries earlier (on unregister_netdevice), but holds a kobject reference.
Then when todo runs the actual last put free happens.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

---
 net/core/dev.c   |   10 ++
 net/core/net-sysfs.c |8 +++-
 2 files changed, 13 insertions(+), 5 deletions(-)

--- 2.6.21-rc1.orig/net/core/dev.c  2007-05-11 11:02:55.0 -0700
+++ 2.6.21-rc1/net/core/dev.c   2007-05-14 08:44:52.0 -0700
@@ -3245,7 +3245,6 @@ void netdev_run_todo(void)
continue;
}
 
-   netdev_unregister_sysfs(dev);
dev->reg_state = NETREG_UNREGISTERED;
 
netdev_wait_allrefs(dev);
@@ -3256,11 +3255,11 @@ void netdev_run_todo(void)
BUG_TRAP(!dev->ip6_ptr);
BUG_TRAP(!dev->dn_ptr);
 
-   /* It must be the very last action,
-* after this 'dev' may point to freed up memory.
-*/
if (dev->destructor)
dev->destructor(dev);
+
+   /* Free network device */
+   kobject_put(&dev->dev.kobj);
}
 
 out:
@@ -3411,6 +3410,9 @@ void unregister_netdevice(struct net_dev
/* Notifier chain MUST detach us from master device. */
BUG_TRAP(!dev->master);
 
+   /* Remove entries from sysfs */
+   netdev_unregister_sysfs(dev);
+
/* Finish processing unregister after unlock */
net_set_todo(dev);
 
--- 2.6.21-rc1.orig/net/core/net-sysfs.c2007-05-11 11:02:55.0 
-0700
+++ 2.6.21-rc1/net/core/net-sysfs.c 2007-05-14 08:44:52.0 -0700
@@ -456,9 +456,15 @@ static struct class net_class = {
 #endif
 };
 
+/* Delete sysfs entries but hold kobject reference until after all
+ * netdev references are gone.
+ */
 void netdev_unregister_sysfs(struct net_device * net)
 {
-   device_del(&(net->dev));
+   struct device *dev = &(net->dev);
+
+   kobject_get(&dev->kobj);
+   device_del(dev);
 }
 
 /* Create sysfs entries for network device. */
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH] Fix race condition about network device name allocation

2007-05-14 Thread Stephen Hemminger
On Mon, 14 May 2007 10:33:01 +0900
Kenji Kaneshige <[EMAIL PROTECTED]> wrote:

> Hi Stephen,
> 
> Thank you for comments. I'll try your patch.
> 
> I have one concern about your patch, though I don't know very much 
> about netdev codes.
> 
> @@ -3411,6 +3410,9 @@ void unregister_netdevice(struct net_dev
> > /* Notifier chain MUST detach us from master device. */
> > BUG_TRAP(!dev->master);
> >  
> > +   /* Remove entries from sysfs */
> > +   netdev_unregister_sysfs(dev);
> > +
> > /* Finish processing unregister after unlock */
> > net_set_todo(dev);
> >  
> 
> With your patch, the netdev_unregister_sysfs() will be called
> earlier than now. Does it have no side effect? I'm worrying 
> about if there are somebody that depend on sysfs entry until
> net_run_todo() is called.


The unregister_sysfs() removes the entry in /sys/class/net and then
decrements the reference count. When ref count goes to zero, the kobject_release
callback gets called and which does kfree(). 

Changing the shutdown order makes the /sys/class/net entry disappear
sooner. In order to prevent the free from happening until after all the 
dev_put()
calls have happened, the reference count for the kobject needs to be increased.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG][PATCH] Fix race condition about network device name allocation

2007-05-14 Thread Kenji Kaneshige
Hi,

> How about the following (untested), instead. It hold a removes the sysfs
> entries earlier (on unregister_netdevice), but holds a kobject reference.
> Then when todo runs the actual last put free happens.

I've confirmed the problem is fixed by Stephen's patch on my reproduction
environment.

Thanks,
Kenji Kaneshige


2007-05-11 (金) の 09:25 -0700 に Stephen Hemminger さんは書きました:
> On Fri, 11 May 2007 14:40:45 +0900
> Kenji Kaneshige <[EMAIL PROTECTED]> wrote:
> 
> > Hi,
> > 
> > I encountered the following error when I was hot-plugging network card
> > using pci hotplug driver.
> > 
> > kobject_add failed for eth8 with -EEXIST, don't try to register things with 
> > the same name in the same directory.
> > 
> > Call Trace:
> >  [] show_stack+0x40/0xa0
> > sp=e000652cfbf0 bsp=e000652c9450
> >  [] dump_stack+0x30/0x60
> > sp=e000652cfdc0 bsp=e000652c9438
> >  [] kobject_shadow_add+0x3b0/0x440
> > sp=e000652cfdc0 bsp=e000652c93f0
> >  [] kobject_add+0x30/0x60
> > sp=e000652cfdc0 bsp=e000652c93d0
> >  [] device_add+0x160/0xf40
> > sp=e000652cfdc0 bsp=e000652c9358
> >  [] netdev_register_sysfs+0xc0/0xe0
> > sp=e000652cfdc0 bsp=e000652c9328
> >  [] register_netdevice+0x600/0x7e0
> > sp=e000652cfdc0 bsp=e000652c92e8
> >  [] register_netdev+0x70/0xc0
> > sp=e000652cfdd0 bsp=e000652c92c0
> >  [] e1000_probe+0x1710/0x1a00 [e1000]
> > sp=e000652cfdd0 bsp=e000652c9258
> >  [] pci_device_probe+0xc0/0x120
> > sp=e000652cfde0 bsp=e000652c9218
> >  [] really_probe+0x1c0/0x300
> > sp=e000652cfde0 bsp=e000652c91c8
> >  [] driver_probe_device+0x1a0/0x1c0
> > sp=e000652cfde0 bsp=e000652c9198
> >  [] __device_attach+0x30/0x60
> > sp=e000652cfde0 bsp=e000652c9170
> >  [] bus_for_each_drv+0x80/0x120
> > sp=e000652cfde0 bsp=e000652c9138
> >  [] device_attach+0xa0/0x100
> > sp=e000652cfe00 bsp=e000652c9110
> >  [] bus_attach_device+0x60/0xe0
> > sp=e000652cfe00 bsp=e000652c90e0
> >  [] device_add+0x950/0xf40
> > sp=e000652cfe00 bsp=e000652c9068
> >  [] pci_bus_add_device+0x20/0x100
> > sp=e000652cfe00 bsp=e000652c9040
> >  [] pci_bus_add_devices+0x50/0x320
> > sp=e000652cfe00 bsp=e000652c9010
> >  [] shpchp_configure_device+0x460/0x4a0 [shpchp]
> > sp=e000652cfe00 bsp=e000652c8fc0
> >  [] board_added+0x720/0x8c0 [shpchp]
> > sp=e000652cfe00 bsp=e000652c8f80
> >  [] shpchp_enable_slot+0x920/0xa40 [shpchp]
> > sp=e000652cfe10 bsp=e000652c8f30
> >  [] shpchp_sysfs_enable_slot+0x120/0x220 [shpchp]
> > sp=e000652cfe20 bsp=e000652c8ef8
> >  [] enable_slot+0x90/0xc0 [shpchp]
> > sp=e000652cfe20 bsp=e000652c8ed0
> >  [] power_write_file+0x1e0/0x2a0 [pci_hotplug]
> > sp=e000652cfe20 bsp=e000652c8ea0
> >  [] hotplug_slot_attr_store+0x60/0xa0 [pci_hotplug]
> > sp=e000652cfe20 bsp=e000652c8e68
> >  [] sysfs_write_file+0x270/0x300
> > sp=e000652cfe20 bsp=e000652c8e18
> >  [] vfs_write+0x1b0/0x300
> > sp=e000652cfe20 bsp=e000652c8dc0
> >  [] sys_write+0x70/0xe0
> > sp=e000652cfe20 bsp=e000652c8d48
> >  [] ia64_ret_from_syscall+0x0/0x20
> > sp=e000652cfe30 bsp=e000652c8d48
> >  [] __kernel_syscall_via_break+0x0/0x20
> > sp=e000652d bsp=e000652c8d48
> > 
> > This error is caused by the race condition between dev_alloc_name()
> > and unregistering net device. The dev_alloc_name() function checks
> > dev_base list to find a suitable id. The net device is removed from
> > the dev_base list when net device is unregistered. Those are done with
> > rtnl lock held, so there is no problem. However, removing sysfs entry
> > is delayed until netdev_run_todo() which is outside rtnl lock. Because
> > of this, dev_alloc_name() can create a device name which is duplicated
> > with the existing sysfs entry.
> > 
> > The following patch fixes this by changing dev_alloc_name() function
> > to check not only dev_base list but also todo list

Re: [BUG][PATCH] Fix race condition about network device name allocation

2007-05-13 Thread Kenji Kaneshige
Hi Stephen,

Thank you for comments. I'll try your patch.

I have one concern about your patch, though I don't know very much 
about netdev codes.

@@ -3411,6 +3410,9 @@ void unregister_netdevice(struct net_dev
>   /* Notifier chain MUST detach us from master device. */
>   BUG_TRAP(!dev->master);
>  
> + /* Remove entries from sysfs */
> + netdev_unregister_sysfs(dev);
> +
>   /* Finish processing unregister after unlock */
>   net_set_todo(dev);
>  

With your patch, the netdev_unregister_sysfs() will be called
earlier than now. Does it have no side effect? I'm worrying 
about if there are somebody that depend on sysfs entry until
net_run_todo() is called.

Anyway, I'll try your patch.

Thanks,
Kenji Kaneshige


2007-05-11 (金) の 09:25 -0700 に Stephen Hemminger さんは書きました:
> On Fri, 11 May 2007 14:40:45 +0900
> Kenji Kaneshige <[EMAIL PROTECTED]> wrote:
> 
> > Hi,
> > 
> > I encountered the following error when I was hot-plugging network card
> > using pci hotplug driver.
> > 
> > kobject_add failed for eth8 with -EEXIST, don't try to register things with 
> > the same name in the same directory.
> > 
> > Call Trace:
> >  [] show_stack+0x40/0xa0
> > sp=e000652cfbf0 bsp=e000652c9450
> >  [] dump_stack+0x30/0x60
> > sp=e000652cfdc0 bsp=e000652c9438
> >  [] kobject_shadow_add+0x3b0/0x440
> > sp=e000652cfdc0 bsp=e000652c93f0
> >  [] kobject_add+0x30/0x60
> > sp=e000652cfdc0 bsp=e000652c93d0
> >  [] device_add+0x160/0xf40
> > sp=e000652cfdc0 bsp=e000652c9358
> >  [] netdev_register_sysfs+0xc0/0xe0
> > sp=e000652cfdc0 bsp=e000652c9328
> >  [] register_netdevice+0x600/0x7e0
> > sp=e000652cfdc0 bsp=e000652c92e8
> >  [] register_netdev+0x70/0xc0
> > sp=e000652cfdd0 bsp=e000652c92c0
> >  [] e1000_probe+0x1710/0x1a00 [e1000]
> > sp=e000652cfdd0 bsp=e000652c9258
> >  [] pci_device_probe+0xc0/0x120
> > sp=e000652cfde0 bsp=e000652c9218
> >  [] really_probe+0x1c0/0x300
> > sp=e000652cfde0 bsp=e000652c91c8
> >  [] driver_probe_device+0x1a0/0x1c0
> > sp=e000652cfde0 bsp=e000652c9198
> >  [] __device_attach+0x30/0x60
> > sp=e000652cfde0 bsp=e000652c9170
> >  [] bus_for_each_drv+0x80/0x120
> > sp=e000652cfde0 bsp=e000652c9138
> >  [] device_attach+0xa0/0x100
> > sp=e000652cfe00 bsp=e000652c9110
> >  [] bus_attach_device+0x60/0xe0
> > sp=e000652cfe00 bsp=e000652c90e0
> >  [] device_add+0x950/0xf40
> > sp=e000652cfe00 bsp=e000652c9068
> >  [] pci_bus_add_device+0x20/0x100
> > sp=e000652cfe00 bsp=e000652c9040
> >  [] pci_bus_add_devices+0x50/0x320
> > sp=e000652cfe00 bsp=e000652c9010
> >  [] shpchp_configure_device+0x460/0x4a0 [shpchp]
> > sp=e000652cfe00 bsp=e000652c8fc0
> >  [] board_added+0x720/0x8c0 [shpchp]
> > sp=e000652cfe00 bsp=e000652c8f80
> >  [] shpchp_enable_slot+0x920/0xa40 [shpchp]
> > sp=e000652cfe10 bsp=e000652c8f30
> >  [] shpchp_sysfs_enable_slot+0x120/0x220 [shpchp]
> > sp=e000652cfe20 bsp=e000652c8ef8
> >  [] enable_slot+0x90/0xc0 [shpchp]
> > sp=e000652cfe20 bsp=e000652c8ed0
> >  [] power_write_file+0x1e0/0x2a0 [pci_hotplug]
> > sp=e000652cfe20 bsp=e000652c8ea0
> >  [] hotplug_slot_attr_store+0x60/0xa0 [pci_hotplug]
> > sp=e000652cfe20 bsp=e000652c8e68
> >  [] sysfs_write_file+0x270/0x300
> > sp=e000652cfe20 bsp=e000652c8e18
> >  [] vfs_write+0x1b0/0x300
> > sp=e000652cfe20 bsp=e000652c8dc0
> >  [] sys_write+0x70/0xe0
> > sp=e000652cfe20 bsp=e000652c8d48
> >  [] ia64_ret_from_syscall+0x0/0x20
> > sp=e000652cfe30 bsp=e000652c8d48
> >  [] __kernel_syscall_via_break+0x0/0x20
> > sp=e000652d bsp=e000652c8d48
> > 
> > This error is caused by the race condition between dev_alloc_name()
> > and unregistering net device. The dev_alloc_name() function checks
> > dev_base list to find a suitable id. The net device is removed from
> > the dev_base list when net device is unre

Re: [BUG][PATCH] Fix race condition about network device name allocation

2007-05-11 Thread Stephen Hemminger
On Fri, 11 May 2007 14:40:45 +0900
Kenji Kaneshige <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> I encountered the following error when I was hot-plugging network card
> using pci hotplug driver.
> 
> kobject_add failed for eth8 with -EEXIST, don't try to register things with 
> the same name in the same directory.
> 
> Call Trace:
>  [] show_stack+0x40/0xa0
> sp=e000652cfbf0 bsp=e000652c9450
>  [] dump_stack+0x30/0x60
> sp=e000652cfdc0 bsp=e000652c9438
>  [] kobject_shadow_add+0x3b0/0x440
> sp=e000652cfdc0 bsp=e000652c93f0
>  [] kobject_add+0x30/0x60
> sp=e000652cfdc0 bsp=e000652c93d0
>  [] device_add+0x160/0xf40
> sp=e000652cfdc0 bsp=e000652c9358
>  [] netdev_register_sysfs+0xc0/0xe0
> sp=e000652cfdc0 bsp=e000652c9328
>  [] register_netdevice+0x600/0x7e0
> sp=e000652cfdc0 bsp=e000652c92e8
>  [] register_netdev+0x70/0xc0
> sp=e000652cfdd0 bsp=e000652c92c0
>  [] e1000_probe+0x1710/0x1a00 [e1000]
> sp=e000652cfdd0 bsp=e000652c9258
>  [] pci_device_probe+0xc0/0x120
> sp=e000652cfde0 bsp=e000652c9218
>  [] really_probe+0x1c0/0x300
> sp=e000652cfde0 bsp=e000652c91c8
>  [] driver_probe_device+0x1a0/0x1c0
> sp=e000652cfde0 bsp=e000652c9198
>  [] __device_attach+0x30/0x60
> sp=e000652cfde0 bsp=e000652c9170
>  [] bus_for_each_drv+0x80/0x120
> sp=e000652cfde0 bsp=e000652c9138
>  [] device_attach+0xa0/0x100
> sp=e000652cfe00 bsp=e000652c9110
>  [] bus_attach_device+0x60/0xe0
> sp=e000652cfe00 bsp=e000652c90e0
>  [] device_add+0x950/0xf40
> sp=e000652cfe00 bsp=e000652c9068
>  [] pci_bus_add_device+0x20/0x100
> sp=e000652cfe00 bsp=e000652c9040
>  [] pci_bus_add_devices+0x50/0x320
> sp=e000652cfe00 bsp=e000652c9010
>  [] shpchp_configure_device+0x460/0x4a0 [shpchp]
> sp=e000652cfe00 bsp=e000652c8fc0
>  [] board_added+0x720/0x8c0 [shpchp]
> sp=e000652cfe00 bsp=e000652c8f80
>  [] shpchp_enable_slot+0x920/0xa40 [shpchp]
> sp=e000652cfe10 bsp=e000652c8f30
>  [] shpchp_sysfs_enable_slot+0x120/0x220 [shpchp]
> sp=e000652cfe20 bsp=e000652c8ef8
>  [] enable_slot+0x90/0xc0 [shpchp]
> sp=e000652cfe20 bsp=e000652c8ed0
>  [] power_write_file+0x1e0/0x2a0 [pci_hotplug]
> sp=e000652cfe20 bsp=e000652c8ea0
>  [] hotplug_slot_attr_store+0x60/0xa0 [pci_hotplug]
> sp=e000652cfe20 bsp=e000652c8e68
>  [] sysfs_write_file+0x270/0x300
> sp=e000652cfe20 bsp=e000652c8e18
>  [] vfs_write+0x1b0/0x300
> sp=e000652cfe20 bsp=e000652c8dc0
>  [] sys_write+0x70/0xe0
> sp=e000652cfe20 bsp=e000652c8d48
>  [] ia64_ret_from_syscall+0x0/0x20
> sp=e000652cfe30 bsp=e000652c8d48
>  [] __kernel_syscall_via_break+0x0/0x20
> sp=e000652d bsp=e000652c8d48
> 
> This error is caused by the race condition between dev_alloc_name()
> and unregistering net device. The dev_alloc_name() function checks
> dev_base list to find a suitable id. The net device is removed from
> the dev_base list when net device is unregistered. Those are done with
> rtnl lock held, so there is no problem. However, removing sysfs entry
> is delayed until netdev_run_todo() which is outside rtnl lock. Because
> of this, dev_alloc_name() can create a device name which is duplicated
> with the existing sysfs entry.
> 
> The following patch fixes this by changing dev_alloc_name() function
> to check not only dev_base list but also todo list and snapshot list
> to find a suitable id. If some devices exists on the todo list or
> snapshot list, sysfs entries corresponding to them are not deleted
> yet.
> 
> Thanks,
> Kenji Kaneshige

How about the following (untested), instead. It hold a removes the sysfs
entries earlier (on unregister_netdevice), but holds a kobject reference.
Then when todo runs the actual last put free happens.

---
 net/core/dev.c   |   10 ++
 net/core/net-sysfs.c |8 +++-
 2 files changed, 13 insertions(+), 5 deletions(-)

--- net-2.6.orig/net/core/d

Re: [BUG][PATCH] Fix race condition about network device name allocation

2007-05-11 Thread Stephen Hemminger
On Fri, 11 May 2007 14:40:45 +0900
Kenji Kaneshige <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> I encountered the following error when I was hot-plugging network card
> using pci hotplug driver.
> 
> kobject_add failed for eth8 with -EEXIST, don't try to register things with 
> the same name in the same directory.
> 
> Call Trace:
>  [] show_stack+0x40/0xa0
> sp=e000652cfbf0 bsp=e000652c9450
>  [] dump_stack+0x30/0x60
> sp=e000652cfdc0 bsp=e000652c9438
>  [] kobject_shadow_add+0x3b0/0x440
> sp=e000652cfdc0 bsp=e000652c93f0
>  [] kobject_add+0x30/0x60
> sp=e000652cfdc0 bsp=e000652c93d0
>  [] device_add+0x160/0xf40
> sp=e000652cfdc0 bsp=e000652c9358
>  [] netdev_register_sysfs+0xc0/0xe0
> sp=e000652cfdc0 bsp=e000652c9328
>  [] register_netdevice+0x600/0x7e0
> sp=e000652cfdc0 bsp=e000652c92e8
>  [] register_netdev+0x70/0xc0
> sp=e000652cfdd0 bsp=e000652c92c0
>  [] e1000_probe+0x1710/0x1a00 [e1000]
> sp=e000652cfdd0 bsp=e000652c9258
>  [] pci_device_probe+0xc0/0x120
> sp=e000652cfde0 bsp=e000652c9218
>  [] really_probe+0x1c0/0x300
> sp=e000652cfde0 bsp=e000652c91c8
>  [] driver_probe_device+0x1a0/0x1c0
> sp=e000652cfde0 bsp=e000652c9198
>  [] __device_attach+0x30/0x60
> sp=e000652cfde0 bsp=e000652c9170
>  [] bus_for_each_drv+0x80/0x120
> sp=e000652cfde0 bsp=e000652c9138
>  [] device_attach+0xa0/0x100
> sp=e000652cfe00 bsp=e000652c9110
>  [] bus_attach_device+0x60/0xe0
> sp=e000652cfe00 bsp=e000652c90e0
>  [] device_add+0x950/0xf40
> sp=e000652cfe00 bsp=e000652c9068
>  [] pci_bus_add_device+0x20/0x100
> sp=e000652cfe00 bsp=e000652c9040
>  [] pci_bus_add_devices+0x50/0x320
> sp=e000652cfe00 bsp=e000652c9010
>  [] shpchp_configure_device+0x460/0x4a0 [shpchp]
> sp=e000652cfe00 bsp=e000652c8fc0
>  [] board_added+0x720/0x8c0 [shpchp]
> sp=e000652cfe00 bsp=e000652c8f80
>  [] shpchp_enable_slot+0x920/0xa40 [shpchp]
> sp=e000652cfe10 bsp=e000652c8f30
>  [] shpchp_sysfs_enable_slot+0x120/0x220 [shpchp]
> sp=e000652cfe20 bsp=e000652c8ef8
>  [] enable_slot+0x90/0xc0 [shpchp]
> sp=e000652cfe20 bsp=e000652c8ed0
>  [] power_write_file+0x1e0/0x2a0 [pci_hotplug]
> sp=e000652cfe20 bsp=e000652c8ea0
>  [] hotplug_slot_attr_store+0x60/0xa0 [pci_hotplug]
> sp=e000652cfe20 bsp=e000652c8e68
>  [] sysfs_write_file+0x270/0x300
> sp=e000652cfe20 bsp=e000652c8e18
>  [] vfs_write+0x1b0/0x300
> sp=e000652cfe20 bsp=e000652c8dc0
>  [] sys_write+0x70/0xe0
> sp=e000652cfe20 bsp=e000652c8d48
>  [] ia64_ret_from_syscall+0x0/0x20
> sp=e000652cfe30 bsp=e000652c8d48
>  [] __kernel_syscall_via_break+0x0/0x20
> sp=e000652d bsp=e000652c8d48
> 
> This error is caused by the race condition between dev_alloc_name()
> and unregistering net device. The dev_alloc_name() function checks
> dev_base list to find a suitable id. The net device is removed from
> the dev_base list when net device is unregistered. Those are done with
> rtnl lock held, so there is no problem. However, removing sysfs entry
> is delayed until netdev_run_todo() which is outside rtnl lock. Because
> of this, dev_alloc_name() can create a device name which is duplicated
> with the existing sysfs entry.
> 
> The following patch fixes this by changing dev_alloc_name() function
> to check not only dev_base list but also todo list and snapshot list
> to find a suitable id. If some devices exists on the todo list or
> snapshot list, sysfs entries corresponding to them are not deleted
> yet.
> 
> Thanks,
> Kenji Kaneshige
> 
> ---
> Fix the race condition between dev_alloc_name() and net device
> unregistration.
> 
> dev_alloc_name checks `dev_base' list to find a suitable ID. On the
> other hand, net devices are removed from that list on net device
> unregistration. Both of them are done with holding rtnl lock. However,
> removing sysfs entry is delayed until netdev_run_todo(), 

[BUG][PATCH] Fix race condition about network device name allocation

2007-05-10 Thread Kenji Kaneshige
Hi,

I encountered the following error when I was hot-plugging network card
using pci hotplug driver.

kobject_add failed for eth8 with -EEXIST, don't try to register things with the 
same name in the same directory.

Call Trace:
 [] show_stack+0x40/0xa0
sp=e000652cfbf0 bsp=e000652c9450
 [] dump_stack+0x30/0x60
sp=e000652cfdc0 bsp=e000652c9438
 [] kobject_shadow_add+0x3b0/0x440
sp=e000652cfdc0 bsp=e000652c93f0
 [] kobject_add+0x30/0x60
sp=e000652cfdc0 bsp=e000652c93d0
 [] device_add+0x160/0xf40
sp=e000652cfdc0 bsp=e000652c9358
 [] netdev_register_sysfs+0xc0/0xe0
sp=e000652cfdc0 bsp=e000652c9328
 [] register_netdevice+0x600/0x7e0
sp=e000652cfdc0 bsp=e000652c92e8
 [] register_netdev+0x70/0xc0
sp=e000652cfdd0 bsp=e000652c92c0
 [] e1000_probe+0x1710/0x1a00 [e1000]
sp=e000652cfdd0 bsp=e000652c9258
 [] pci_device_probe+0xc0/0x120
sp=e000652cfde0 bsp=e000652c9218
 [] really_probe+0x1c0/0x300
sp=e000652cfde0 bsp=e000652c91c8
 [] driver_probe_device+0x1a0/0x1c0
sp=e000652cfde0 bsp=e000652c9198
 [] __device_attach+0x30/0x60
sp=e000652cfde0 bsp=e000652c9170
 [] bus_for_each_drv+0x80/0x120
sp=e000652cfde0 bsp=e000652c9138
 [] device_attach+0xa0/0x100
sp=e000652cfe00 bsp=e000652c9110
 [] bus_attach_device+0x60/0xe0
sp=e000652cfe00 bsp=e000652c90e0
 [] device_add+0x950/0xf40
sp=e000652cfe00 bsp=e000652c9068
 [] pci_bus_add_device+0x20/0x100
sp=e000652cfe00 bsp=e000652c9040
 [] pci_bus_add_devices+0x50/0x320
sp=e000652cfe00 bsp=e000652c9010
 [] shpchp_configure_device+0x460/0x4a0 [shpchp]
sp=e000652cfe00 bsp=e000652c8fc0
 [] board_added+0x720/0x8c0 [shpchp]
sp=e000652cfe00 bsp=e000652c8f80
 [] shpchp_enable_slot+0x920/0xa40 [shpchp]
sp=e000652cfe10 bsp=e000652c8f30
 [] shpchp_sysfs_enable_slot+0x120/0x220 [shpchp]
sp=e000652cfe20 bsp=e000652c8ef8
 [] enable_slot+0x90/0xc0 [shpchp]
sp=e000652cfe20 bsp=e000652c8ed0
 [] power_write_file+0x1e0/0x2a0 [pci_hotplug]
sp=e000652cfe20 bsp=e000652c8ea0
 [] hotplug_slot_attr_store+0x60/0xa0 [pci_hotplug]
sp=e000652cfe20 bsp=e000652c8e68
 [] sysfs_write_file+0x270/0x300
sp=e000652cfe20 bsp=e000652c8e18
 [] vfs_write+0x1b0/0x300
sp=e000652cfe20 bsp=e000652c8dc0
 [] sys_write+0x70/0xe0
sp=e000652cfe20 bsp=e000652c8d48
 [] ia64_ret_from_syscall+0x0/0x20
sp=e000652cfe30 bsp=e000652c8d48
 [] __kernel_syscall_via_break+0x0/0x20
sp=e000652d bsp=e000652c8d48

This error is caused by the race condition between dev_alloc_name()
and unregistering net device. The dev_alloc_name() function checks
dev_base list to find a suitable id. The net device is removed from
the dev_base list when net device is unregistered. Those are done with
rtnl lock held, so there is no problem. However, removing sysfs entry
is delayed until netdev_run_todo() which is outside rtnl lock. Because
of this, dev_alloc_name() can create a device name which is duplicated
with the existing sysfs entry.

The following patch fixes this by changing dev_alloc_name() function
to check not only dev_base list but also todo list and snapshot list
to find a suitable id. If some devices exists on the todo list or
snapshot list, sysfs entries corresponding to them are not deleted
yet.

Thanks,
Kenji Kaneshige

---
Fix the race condition between dev_alloc_name() and net device
unregistration.

dev_alloc_name checks `dev_base' list to find a suitable ID. On the
other hand, net devices are removed from that list on net device
unregistration. Both of them are done with holding rtnl lock. However,
removing sysfs entry is delayed until netdev_run_todo(), which doesn't
acquire rtnl lock. Therefore, on dev_alloc_name(), device name could
conflict with the device, which is unregistered, but its sysfs entry
still exist.

To fix this bug, change dev_alloc_name to check not only `dev_base'
list but also todo list and snapshot list