Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
In article <[EMAIL PROTECTED]> (at Thu, 02 Feb 2006 20:09:44 -0800 (PST)), "David S. Miller" <[EMAIL PROTECTED]> says: > From: YOSHIFUJI Hideaki <[EMAIL PROTECTED]> > Date: Fri, 03 Feb 2006 11:32:13 +0900 (JST) > > > BTW, David, would you mind sending your addrconf patches to me? > > I'd like to look into it deeply. > > I used to have clone, but I happened to remove that tree... > > I intended to review my patches in small pieces, one by one, > and then send them to you for review. > > Maybe I can do this over the weekend, is that OK with you? Sure, and that is what I'd like to do. --yoshfuji - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
From: YOSHIFUJI Hideaki <[EMAIL PROTECTED]> Date: Fri, 03 Feb 2006 11:32:13 +0900 (JST) > BTW, David, would you mind sending your addrconf patches to me? > I'd like to look into it deeply. > I used to have clone, but I happened to remove that tree... I intended to review my patches in small pieces, one by one, and then send them to you for review. Maybe I can do this over the weekend, is that OK with you? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
In article <[EMAIL PROTECTED]> (at Fri, 3 Feb 2006 12:48:49 +1100), Herbert Xu <[EMAIL PROTECTED]> says: > On Fri, Feb 03, 2006 at 10:31:58AM +0900, YOSHIFUJI Hideaki / ?$B5HF#1QL@ > wrote: > > > > We SHALL do autoconf when we "up" an ipv6-capable device. > > It is the IPv6. > > I don't think the word "SHALL" stops us from implementing it in > user-space... It is because my poor English...sigh... Specification says an implementation do autoconf when we bringing up an ipv6-capable device. It is the nature and requirement of IPv6. BTW, David, would you mind sending your addrconf patches to me? I'd like to look into it deeply. I used to have clone, but I happened to remove that tree... Thanks. --yoshfuji - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
On Fri, Feb 03, 2006 at 10:31:58AM +0900, YOSHIFUJI Hideaki / ?$B5HF#1QL@ wrote: > > We SHALL do autoconf when we "up" an ipv6-capable device. > It is the IPv6. I don't think the word "SHALL" stops us from implementing it in user-space... Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
In article <[EMAIL PROTECTED]> (at Thu, 2 Feb 2006 23:42:25 +1100), Herbert Xu <[EMAIL PROTECTED]> says: > On Thu, Feb 02, 2006 at 05:37:22AM -0700, Eric W. Biederman wrote: > > > > > Yes you are right. The locking/refcounting in addrconf.c is such > > > a mess. I've asked a number of times before as to why most of > > > this can't be done in user-space instead. There is nothing performance > > > critical here, and the system must be able to deal with a device with > > > no IPv6 addresses anyway (think of the case when the device was up before > > > ipv6.ko was loaded). > > > > A lot of the latter case is handled by the replay of netdevice events > > when you register a netdevice notifier. > > Yes. What I meant is that it is normal to have a period of time during > which a device has no IPv6 addresses attached. Doing addrconf in the > kernel means that we can guarantee that as soon as a device appears we > slap on an IPv6 address. My point is that we need to cope with devices > without IPv6 addresses anyway. We SHALL do autoconf when we "up" an ipv6-capable device. It is the IPv6. I agree that, in SOME cases, some people want to disable ipv6 on some of their interfaces. --yoshfuji - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
From: Herbert Xu <[EMAIL PROTECTED]> Date: Thu, 2 Feb 2006 22:25:54 +1100 > [IPV6]: Don't hold extra ref count in ipv6_ifa_notify > > Currently the logic in ipv6_ifa_notify is to hold an extra reference > count for addrconf dst's that get added to the routing table. Thus, > when addrconf dst entries are taken out of the routing table, we need > to drop that dst. However, addrconf dst entries may be removed from > the routing table by means other than __ipv6_ifa_notify. > > So we're faced with the choice of either fixing up all places where > addrconf dst entries are removed, or dropping the extra reference count > altogether. > > I chose the latter because the ifp itself always holds a dst reference > count of 1 while it's alive. This is dropped just before we kfree the > ifp object. Therefore we know that in __ipv6_ifa_notify we will always > hold that count. > > This bug was found by Eric W. Biederman. > > Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> Applied, thanks Herbert. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
On Thu, Feb 02, 2006 at 05:37:22AM -0700, Eric W. Biederman wrote: > > > Yes you are right. The locking/refcounting in addrconf.c is such > > a mess. I've asked a number of times before as to why most of > > this can't be done in user-space instead. There is nothing performance > > critical here, and the system must be able to deal with a device with > > no IPv6 addresses anyway (think of the case when the device was up before > > ipv6.ko was loaded). > > A lot of the latter case is handled by the replay of netdevice events > when you register a netdevice notifier. Yes. What I meant is that it is normal to have a period of time during which a device has no IPv6 addresses attached. Doing addrconf in the kernel means that we can guarantee that as soon as a device appears we slap on an IPv6 address. My point is that we need to cope with devices without IPv6 addresses anyway. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
Herbert Xu <[EMAIL PROTECTED]> writes: > On Fri, Jan 27, 2006 at 01:00:49AM -0700, Eric W. Biederman wrote: >> >> However I do know I have correctly found the leak. > > Yes you are right. The locking/refcounting in addrconf.c is such > a mess. I've asked a number of times before as to why most of > this can't be done in user-space instead. There is nothing performance > critical here, and the system must be able to deal with a device with > no IPv6 addresses anyway (think of the case when the device was up before > ipv6.ko was loaded). A lot of the latter case is handled by the replay of netdevice events when you register a netdevice notifier. Eric - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
On Fri, Jan 27, 2006 at 01:00:49AM -0700, Eric W. Biederman wrote: > > However I do know I have correctly found the leak. Yes you are right. The locking/refcounting in addrconf.c is such a mess. I've asked a number of times before as to why most of this can't be done in user-space instead. There is nothing performance critical here, and the system must be able to deal with a device with no IPv6 addresses anyway (think of the case when the device was up before ipv6.ko was loaded). I'm yet to hear a compelling reason. Anyway, that's enough ranting from me and here is a patch for your bug. [IPV6]: Don't hold extra ref count in ipv6_ifa_notify Currently the logic in ipv6_ifa_notify is to hold an extra reference count for addrconf dst's that get added to the routing table. Thus, when addrconf dst entries are taken out of the routing table, we need to drop that dst. However, addrconf dst entries may be removed from the routing table by means other than __ipv6_ifa_notify. So we're faced with the choice of either fixing up all places where addrconf dst entries are removed, or dropping the extra reference count altogether. I chose the latter because the ifp itself always holds a dst reference count of 1 while it's alive. This is dropped just before we kfree the ifp object. Therefore we know that in __ipv6_ifa_notify we will always hold that count. This bug was found by Eric W. Biederman. Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3321,9 +3321,7 @@ static void __ipv6_ifa_notify(int event, switch (event) { case RTM_NEWADDR: - dst_hold(&ifp->rt->u.dst); - if (ip6_ins_rt(ifp->rt, NULL, NULL, NULL)) - dst_release(&ifp->rt->u.dst); + ip6_ins_rt(ifp->rt, NULL, NULL, NULL); if (ifp->idev->cnf.forwarding) addrconf_join_anycast(ifp); break; @@ -3334,8 +3332,6 @@ static void __ipv6_ifa_notify(int event, dst_hold(&ifp->rt->u.dst); if (ip6_del_rt(ifp->rt, NULL, NULL, NULL)) dst_free(&ifp->rt->u.dst); - else - dst_release(&ifp->rt->u.dst); break; } }
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
Herbert Xu <[EMAIL PROTECTED]> writes: > On Wed, Jan 25, 2006 at 08:12:02PM +, Eric W. Biederman wrote: >> >> Unfortunately because we have already call rt6_ifdown() the route is >> not found in the routing table, the dst_free does not decrement the >> count and is therefore unable to free the dst entry because the count >> is still elevated. > > If rt6_ifdown has already kicked the route out, then the dst ref count > should be zero. Even if someone is still holding onto it dst_free will > attach the entry to the GC list which means that it'll be freed when it > eventually does hit zero. > > The ref count held by ipv6_ifa_notify is dropped by ip6_del_rt in case > of an error. That probably explains the dst_free in there, but we may also need a dst_release as well. The problem is that struct ifa holds a reference. When we call ipv6_ifa_notify that reference is decremented, if and only if the route is in the routing table. ip6_del_rt does unconditionally decrement the reference count however we just called dst_hold (which incremented the reference count) just prior to calling ip6_del_rt. So it does not remove the reference from the ifa. ip6_del_rt can't do anything else because the route is not in the routing table. I don't know if my fix is correct, and fixing ipv6_ifa_notify is quite possibly better but I haven't been through all of the paths that call it to know what needs to happen there. However I do know I have correctly found the leak. Eric - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
On Wed, Jan 25, 2006 at 08:12:02PM +, Eric W. Biederman wrote: > > Unfortunately because we have already call rt6_ifdown() the route is > not found in the routing table, the dst_free does not decrement the > count and is therefore unable to free the dst entry because the count > is still elevated. If rt6_ifdown has already kicked the route out, then the dst ref count should be zero. Even if someone is still holding onto it dst_free will attach the entry to the GC list which means that it'll be freed when it eventually does hit zero. The ref count held by ipv6_ifa_notify is dropped by ip6_del_rt in case of an error. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ipv6: addrconf_ifdown fix dst refcounting.
In article <[EMAIL PROTECTED]> (at Wed, 25 Jan 2006 13:12:02 -0700), [EMAIL PROTECTED] (Eric W. Biederman) says: > By bringing down the routes before we bring down the addresses we leak > the dst cache entries held by the addresses. : > Fix this by simply moving rt6_ifdown where we flush the routes after > we have removed all of the addresses from the interface. Do you see exact leakage here? Well, because it is trying to make sure not to accept packets anymore on that device by flushing routing table, rt6_ifdown(dev) should be called before ceasing addresses. --yoshfuji - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ipv6: addrconf_ifdown fix dst refcounting.
By bringing down the routes before we bring down the addresses we leak the dst cache entries held by the addresses. For address on an interface there is an associated input route. When we bring that address down we call ipv6_ifa_notify(RTM_DELADDR, ifa). ipv6_ifa_notify will then call ip6_del_rt and release or free the route depending on whether the route was found in the routing table or not. I'm not certain about the general case but in the specific case where we are downing an interface we should always have found the route in the routing table and the reference count should decremented with dst_release, this is what when we remove an address from an interface. Unfortunately because we have already call rt6_ifdown() the route is not found in the routing table, the dst_free does not decrement the count and is therefore unable to free the dst entry because the count is still elevated. Fix this by simply moving rt6_ifdown where we flush the routes after we have removed all of the addresses from the interface. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> --- net/ipv6/addrconf.c |7 ++- 1 files changed, 6 insertions(+), 1 deletions(-) 91b01e79ff7a4cc6b901af76e81e15361e25c067 diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index d328d59..7461b90 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -2278,7 +2278,6 @@ static int addrconf_ifdown(struct net_de if (dev == &loopback_dev && how == 1) how = 0; - rt6_ifdown(dev); neigh_ifdown(&nd_tbl, dev); idev = __in6_dev_get(dev); @@ -2370,6 +2369,12 @@ static int addrconf_ifdown(struct net_de idev->tstamp = jiffies; inet6_ifinfo_notify(RTM_DELLINK, idev); + /* Step 6: Flush any remaining routes for this device. +* This must be done after all of the addresses are taken +* down or we leak dst entries. +*/ + rt6_ifdown(dev); + /* Shot the device (if unregistered) */ if (how == 1) { -- 1.1.4.g7830 - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html