Re: Unable to halt or reboot due to - unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1

2006-09-06 Thread Ben Greear

Jesper Juhl wrote:

> Ok, I've done some more testing and it seems, unfortunately, that I
> can't trigger the problem reliably. I guess I was just lucky with my
> first few reboots.
> It now seems that uptime and/or amount of data that has flowed over
> the vlan interface impacts the probability of hitting the problem.


Back when I was chasing the neighbor table leak, I wrote a patch to
catch ref-count leaks for net devices.  It was against 2.6.13 or so,
but if nothing else is helping, it might be worth dusting off.

I put what I believe was the last iteration of that patch here:

http://www.candelatech.com/oss/rfcnt.patch
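
For readers unfamiliar with this kind of debugging aid, the general idea
of catching a ref-count leak can be modelled in user space. The sketch
below is not the contents of rfcnt.patch and is not kernel code. It only
assumes that every reference taken on a device is tagged with its owner,
so that whoever still holds one can be listed later. The class
TrackedDevice, its methods, and the owner tags "route cache" and
"neighbor entry" are hypothetical names chosen for illustration.

```python
from collections import Counter


class TrackedDevice:
    """User-space model of a net device whose references are tagged by owner."""

    def __init__(self, name):
        self.name = name
        self.holders = Counter()  # owner tag -> outstanding references

    def hold(self, owner):
        # Analogue of taking a reference (dev_hold() in the kernel).
        self.holders[owner] += 1

    def put(self, owner):
        # Analogue of dropping a reference (dev_put() in the kernel).
        if self.holders[owner] <= 0:
            raise RuntimeError(f"{owner} dropped a reference it never took")
        self.holders[owner] -= 1

    @property
    def refcnt(self):
        # Total number of outstanding references across all owners.
        return sum(self.holders.values())

    def leaks(self):
        # Owners that still hold references; these would block unregistration.
        return {owner: n for owner, n in self.holders.items() if n > 0}


dev = TrackedDevice("eth0.20")
dev.hold("route cache")      # hypothetical owner tag
dev.hold("neighbor entry")   # hypothetical owner tag
dev.put("route cache")
print(dev.refcnt, dev.leaks())
```

With per-owner bookkeeping like this, a "Usage count = 1" report can be
turned into the name of the subsystem that forgot to drop its reference.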

Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Unable to halt or reboot due to - unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1

2006-09-01 Thread Jesper Juhl

On 31/08/06, Jesper Juhl [EMAIL PROTECTED] wrote:

> On 31/08/06, Ben Greear [EMAIL PROTECTED] wrote:
> > Jesper Juhl wrote:
> > > Hi,
> > >
> > > I've got a small problem with 2.6.18-rc5-git2.
> > >
> > > I've got a vlan setup on eth0.20, eth0 does not have an IP.
> > >
> > > When I attempt to reboot or halt the machine I get the following
> > > message from the loop in net/core/dev.c::netdev_wait_allrefs() where
> > > it waits for the ref-count to drop to zero.
> > > Unfortunately the ref-count stays at 1 forever and the server never
> > > gets any further.
> > >
> > >  unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1
> > >
> > > I googled a bit and found that people have had similar problems in the
> > > past and could work around them by shutting down the vlan interface
> > > before the 'lo' interface. I tried that and indeed, it works.
> > >
> > > Any idea how we can get this fixed?
> >
> > This is usually a ref-count leak somewhere.  Used to be IPv6 had
> > issues..then there were some neighbor leaks...but these were fixed as
> > far as I know.
>
> Using IPv4 here.
>
> > Can you reproduce this on older kernels?
>
> I've not actively tried, but I do have several servers running various
> older kernel releases with similar vlan setups and I'm not aware of
> any problems with those. Only this new box that I'm using for testing
> new kernels (currently) shows the problem, and I've only tried 2.6.8
> and 2.6.18-rc5-git2 on the box so far (2.6.8 doesn't have the
> problem).


I've just encountered the problem on a different server with an
identical vlan setup. That server is running 2.6.13.4

--
Jesper Juhl [EMAIL PROTECTED]
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: Unable to halt or reboot due to - unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1

2006-09-01 Thread Herbert Xu
Jesper Juhl [EMAIL PROTECTED] wrote:

> I've just encountered the problem on a different server with an
> identical vlan setup. That server is running 2.6.13.4

Do you have a simple recipe to reproduce this? Ideally it'd be a
script that anyone can execute in a freshly booted system that
exhibits the problem.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: Unable to halt or reboot due to - unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1

2006-09-01 Thread Jesper Juhl

On 01/09/06, Herbert Xu [EMAIL PROTECTED] wrote:

> Jesper Juhl [EMAIL PROTECTED] wrote:
> >
> > I've just encountered the problem on a different server with an
> > identical vlan setup. That server is running 2.6.13.4
>
> Do you have a simple recipe to reproduce this? Ideally it'd be a
> script that anyone can execute in a freshly booted system that
> exhibits the problem.


Well, the first server I saw this on only had a base install of Debian
stable on it. I replaced the kernel, configured the vlan interface in
/etc/network/interfaces, typed 'reboot', and it failed. It seems to
fail reliably on every reboot.

--
Jesper Juhl [EMAIL PROTECTED]
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: Unable to halt or reboot due to - unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1

2006-09-01 Thread Jesper Juhl

On 01/09/06, Jesper Juhl [EMAIL PROTECTED] wrote:

> On 01/09/06, Herbert Xu [EMAIL PROTECTED] wrote:
> > Jesper Juhl [EMAIL PROTECTED] wrote:
> > >
> > > I've just encountered the problem on a different server with an
> > > identical vlan setup. That server is running 2.6.13.4
> >
> > Do you have a simple recipe to reproduce this? Ideally it'd be a
> > script that anyone can execute in a freshly booted system that
> > exhibits the problem.
>
> Well, the first server I saw this on only had a base install of debian
> stable on it, then I replaced the kernel, configured the vlan
> interface in /etc/network/interfaces typed 'reboot' and it failed -
> and it seems to fail reliably on reboot every time.


Ok, I've done some more testing and it seems, unfortunately, that I
can't trigger the problem reliably. I guess I was just lucky with my
first few reboots.
It now seems that uptime and/or amount of data that has flowed over
the vlan interface impacts the probability of hitting the problem.

--
Jesper Juhl [EMAIL PROTECTED]
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: Unable to halt or reboot due to - unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1

2006-08-31 Thread Ben Greear

Jesper Juhl wrote:

> Hi,
>
> I've got a small problem with 2.6.18-rc5-git2.
>
> I've got a vlan setup on eth0.20, eth0 does not have an IP.
>
> When I attempt to reboot or halt the machine I get the following
> message from the loop in net/core/dev.c::netdev_wait_allrefs() where
> it waits for the ref-count to drop to zero.
> Unfortunately the ref-count stays at 1 forever and the server never
> gets any further.
>
>  unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1
>
> I googled a bit and found that people have had similar problems in the
> past and could work around them by shutting down the vlan interface
> before the 'lo' interface. I tried that and indeed, it works.
>
> Any idea how we can get this fixed?


This is usually a ref-count leak somewhere.  IPv6 used to have
issues, and then there were some neighbor leaks, but as far as I
know those were fixed.
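
To illustrate why a single leaked reference is enough to stall
shutdown, the wait described above can be modelled in user space. This
is a simplified sketch, not the kernel's netdev_wait_allrefs(): the
function wait_allrefs, its get_refcnt callback, and the max_polls limit
are assumptions added here so that the example terminates. Only the
message text is taken from the report.

```python
def wait_allrefs(name, get_refcnt, max_polls=5):
    """Poll until the reference count reaches zero, or give up after max_polls.

    As reported above, the real loop keeps waiting for as long as the
    count is non-zero; max_polls exists only so this sketch terminates.
    """
    messages = []
    for _ in range(max_polls):
        refcnt = get_refcnt()
        if refcnt == 0:
            return True, messages
        messages.append(
            f"unregister_netdevice: waiting for {name} to become free. "
            f"Usage count = {refcnt}"
        )
    return False, messages


# A leaked reference never goes away, so the wait never succeeds.
freed, log = wait_allrefs("eth0.20", lambda: 1)
print(freed, len(log))
print(log[0])

# A reference that is eventually dropped lets the wait finish.
counts = iter([1, 1, 0])
freed_ok, log_ok = wait_allrefs("eth0.20", lambda: next(counts))
print(freed_ok, len(log_ok))
```

The second call shows the healthy case: once the last holder drops its
reference, the wait returns and unregistration can proceed.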


Can you reproduce this on older kernels?

Ben

--
Ben Greear [EMAIL PROTECTED]
Candela Technologies Inc  http://www.candelatech.com



Re: Unable to halt or reboot due to - unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1

2006-08-31 Thread Jesper Juhl

On 31/08/06, Ben Greear [EMAIL PROTECTED] wrote:

> Jesper Juhl wrote:
> > Hi,
> >
> > I've got a small problem with 2.6.18-rc5-git2.
> >
> > I've got a vlan setup on eth0.20, eth0 does not have an IP.
> >
> > When I attempt to reboot or halt the machine I get the following
> > message from the loop in net/core/dev.c::netdev_wait_allrefs() where
> > it waits for the ref-count to drop to zero.
> > Unfortunately the ref-count stays at 1 forever and the server never
> > gets any further.
> >
> >  unregister_netdevice: waiting for eth0.20 to become free. Usage count = 1
> >
> > I googled a bit and found that people have had similar problems in the
> > past and could work around them by shutting down the vlan interface
> > before the 'lo' interface. I tried that and indeed, it works.
> >
> > Any idea how we can get this fixed?
>
> This is usually a ref-count leak somewhere.  Used to be IPv6 had
> issues..then there were some neighbor leaks...but these were fixed as
> far as I know.

Using IPv4 here.

> Can you reproduce this on older kernels?


I've not actively tried, but I do have several servers running various
older kernel releases with similar vlan setups and I'm not aware of
any problems with those. Only this new box that I'm using for testing
new kernels (currently) shows the problem, and I've only tried 2.6.8
and 2.6.18-rc5-git2 on the box so far (2.6.8 doesn't have the
problem).

--
Jesper Juhl [EMAIL PROTECTED]
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html