Re: Multiple end-points behind same NAT
Hi, although I'm not a kernel guru I think I've got something to say to this. I am wondering if 26sec supports NAT-Traversal for multiple endpoints behind the same NAT. In looking at xfrm_tmpl it's not obvious to me that it's supported, ... You are looking at the rignt place indeed. Just to make you sure, there is really no space to store the port infomation of the tunnel endpoints in the xfrm_tmpl structure. The structure xfrm_state (a kernel structuture for holding SA's) is a bit different story though. Although the port information is not stored directly in the structure either, there is the encap member pointing to a xfrm_encap_tmpl structure which is used to hold the required information. The consequences of this are: 1) The IKE dameon (or the key manager as it is called in the kernel context) can't get the full infomation from the kernel required to be a successful initiator in the case of multiple peers behind the same NAT. (Though you might be able to get it working with a single peer behind the NAT if you configure the port forwarding at the NAT box carefuly.) 2) If there was an IKE daemon which could be told the required port information by some other means then directly by the kernel it should be possible to make it work despite the deficiencies of the kernel. I don't know if there is any IKE daemon capable of this, but I'm sure racoon can't do that. 3) It is possible to get this working the other way around: If the boxes behind the NAT were the initiators then it should work just fine at least if tunnel mode was used. There are some problems with the transport mode but even that can be made to work for certain scenarios. Regards, Michal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
AF_KEY extended xfrm_state selector handling
Hello In an effort to configure an L2TP/IPsec server on Linux capable of supporting multiple clients behind a single NAT device I ran into difficulties with pf_key protocol implementation not being able to exploit all the information passed to it as a SADB_EXT_ADDRESS_PROXY info. Perhaps as the original source suggested (/* Nobody uses this, but we try. */) this info has never been used before. So I propose the included patch. It changes the following: 1) the amount of information that is stored into the struct xfrm_state's selector from the SADB_EXT_ADDRESS_PROXY info received from userspace in the pfkey_msg2xfrm_state() function The information stored now is: - the address (stored as the selector's source address, including family) - the prefix length (stored as the selector's source address prefix length) - the protocol (stored as selector's protocol) - the port (stored as selector's source port) Previously only the address and the prefix length were stored. 2) the conditions under which the SADB_EXT_ADDRESS_PROXY info is included while converting a struct xfrm_state into a pf_key message in the pfkey_xfrm_state2msg() function The conditions now are: - selector' protcol family is non-zero (ie. the selector is defined) and ( - selector's prtocol is non-zero (ie. the protocol is specified) or - selector's source port is non-zero (ie. the port is specified) or - selector's source address is different from xfrm_state source address ) Further the case when selector's address is of a different family from the xfrm_state address is now handled. 3) the way how port information is obtained from struct sadb_address The port information extraction is now part of the pfkey_sadb_addr2xfrm_addr() function wich handles that in a protocol family safe manner, instead of using ((struct sockaddr_in *)x)->sin_port construct irrespective of the protocol family NOTES: - This patch should cause no problems, since as the original source says nobody uses that info. - I've also created a patch for racoon (ipsec-tools) to actually pass that info. Eventually I've been able to establish L2TP/IPSec connections from multiple clients behind the same NAT to the same L2TP/IPSec linux 2.6 based server. (The procedure required a manual insertion of certain SPD entries during the connection establishment but this will hopefuly be handled by the L2TP daemon automatically soon.) Here comes the patch (it is against 2.6.17.11): Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]> diff -Naur linux-2.6.17.11.orig/net/key/af_key.c linux-2.6.17.11/net/key/af_key.c --- linux-2.6.17.11.orig/net/key/af_key.c 2006-08-23 23:16:33.0 +0200 +++ linux-2.6.17.11/net/key/af_key.c2006-10-18 16:53:48.0 +0200 @@ -552,19 +552,28 @@ } static int pfkey_sadb_addr2xfrm_addr(struct sadb_address *addr, -xfrm_address_t *xaddr) +xfrm_address_t *xaddr, __u16 *port) { switch (((struct sockaddr*)(addr + 1))->sa_family) { case AF_INET: - xaddr->a4 = - ((struct sockaddr_in *)(addr + 1))->sin_addr.s_addr; + { + struct sockaddr_in *in = (struct sockaddr_in *)(addr + 1); + + xaddr->a4 = in->sin_addr.s_addr; + if (port) + *port = in->sin_port; return AF_INET; + } #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) case AF_INET6: - memcpy(xaddr->a6, - &((struct sockaddr_in6 *)(addr + 1))->sin6_addr, - sizeof(struct in6_addr)); + { + struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)(addr + 1); + + memcpy(xaddr->a6, &in6->sin6_addr, sizeof(struct in6_addr)); + if (port) + *port = in6->sin6_port; return AF_INET6; + } #endif default: return 0; @@ -651,6 +660,7 @@ int encrypt_key_size = 0; int sockaddr_size; struct xfrm_encap_tmpl *natt = NULL; + int proxy_size; /* address family check */ sockaddr_size = pfkey_sockaddr_size(x->props.family); @@ -674,14 +684,25 @@ /* identity & sensitivity */ - if ((x->props.family == AF_INET && -x->sel.saddr.a4 != x->props.saddr.a4) -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - || (x->props.family == AF_INET6 && - memcmp (x->sel.saddr.a6, x->props.saddr.a6, sizeof (struct in6_addr))) -#endif - ) - size += sizeof(struct sadb_address) + sockaddr_size; + if (x->sel.family != 0 && +
multicast group memberships purge on interface delete
Hello there, I've got the following question/suggestion: The situation today: When an interface is deleted and there happen to have been some multicast groups joined on it only the interface's list of multicast meberships is deleted. The sockets through which the groups were joined and more importantly their associated multicast membership lists are left untouched. This makes it difficult for the function that handles leaving multicast groups on a socket to decide what to do with groups that were joined on such an interface (that no longer exists). The present implementation is a kind of a "best guess" (and nothnig better can probably be done about that). It may even fail to leave an affected group (group that was joined on a deleted interface) completely and thus block a slot in the sockets's multicast mebership list which size is purposely limited. My question/suggestion: Would it feasible to drop the relevant entries from sockets' multicast membership lists on the interface delete? Yes, I do realize it would require to walk through a number of sockets to see if there is any multicast entry for the interface in question to delete. But this could be optimized by maintaining a list of sockets that have a multicast group joined on the interface (and keep a pointer to this list in the device structure). This would ease the job of the function handling leaving multicast groups, made its beahaviour more "deterministic" and possible errors reported by it more meaningful/reliable. Notes: - The suggested approach is reportedly taken by other OSes (notably NetBSD). The fact that linux doesn't behave the same poses a problem for cross platform software for the behaviour of different systems is different in one more detail. - The suggested "list of sockets that have a multicast group joined on the interface" could also probably be of some help when maintaining the per interface multicast source filter list or per-interface multicast reception state as per RFC 3376 (IGMPv3) section 3.2. Thanks Michal Ruzicka - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Possible leak of multicast source filter sctructure #4
+-DLS I agree, but for the time before all applications are fixed ... (BTW from the IPv6 part of your patch it seems that the multicast group membership management is done solely by the interface index there ... good idea! :-) ) > > Suppose the following situatuion: > > > > 1) create pppX interface: > > IP: A.B.C.D > > > > 2) join a multicast group by address: A.B.C.D > > ASSUMPTION: no other interface with the same IP address exists; > > the result should be that the group is joined on the pppX interface > > > > 3) create pppY interface > > IP: A.B.C.D (Yes the same IP, not unusual on ppp links.) > > > > 4) delete the pppX interace > > > > 5) attempt to leave the multicast group by address: A.B.C.D > > > > 6) * if your algorthim is used -EADDRNOTAVAIL is returned > >* if mine is used the multicast group is left on the pppX interface > > Michal PS: During the writing of this post I realized a bug in my code: The second ip_mc_find_dev() lookup on lines 42-43 should be done irrespective of the prior value of in_dev. And came up with an enhancement to what I had previously done in ip_mc_find_dev(): In case an interface is found by its index the imr_address is no longer cleared rather it is set to: INADDR_NONE. This should result in somewhat smarter behaviour in case of leaving a group by its multicast address alone. Here is the corrected version of the patch: Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]> --- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:50:46.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-18 13:04:09.0 +0200 @@ -1369,13 +1369,15 @@ struct flowi fl = { .nl_u = { .ip4_u = { .daddr = imr->imr_multiaddr.s_addr } } }; struct rtable *rt; - struct net_device *dev = NULL; - struct in_device *idev = NULL; + struct net_device *dev; if (imr->imr_ifindex) { - idev = inetdev_by_index(imr->imr_ifindex); - if (idev) + struct in_device *idev = inetdev_by_index(imr->imr_ifindex); + + if (idev) { + imr->imr_address.s_addr = INADDR_NONE; __in_dev_put(idev); + } return idev; } if (imr->imr_address.s_addr) { @@ -1383,17 +1385,16 @@ if (!dev) return NULL; dev_put(dev); - } - - if (!dev && !ip_route_output_key(&rt, &fl)) { + } else if (!ip_route_output_key(&rt, &fl)) { dev = rt->u.dst.dev; ip_rt_put(rt); - } - if (dev) { - imr->imr_ifindex = dev->ifindex; - idev = __in_dev_get_rtnl(dev); - } - return idev; + if (!dev) + return NULL; + } else + return NULL; + + imr->imr_ifindex = dev->ifindex; + return __in_dev_get_rtnl(dev); } /* @@ -1798,27 +1799,77 @@ u32 ifindex; rtnl_lock(); - in_dev = ip_mc_find_dev(imr); - if (!in_dev) { - rtnl_unlock(); - return -ENODEV; - } ifindex = imr->imr_ifindex; - for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { - if (iml->multi.imr_multiaddr.s_addr == group && - iml->multi.imr_ifindex == ifindex) { - (void) ip_mc_leave_src(sk, iml, in_dev); - - *imlp = iml->next; - - ip_mc_dec_group(in_dev, group); - rtnl_unlock(); - sock_kfree_s(sk, iml, sizeof(*iml)); - return 0; + in_dev = ip_mc_find_dev(imr); + if (ifindex != 0) { + /* leave by interface index */ + for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { + if (iml->multi.imr_multiaddr.s_addr != group) + continue; + + if (iml->multi.imr_ifindex == ifindex) + goto leave; + } + } else { + /* leave by address / multicast group route */ + struct ip_mc_socklist **cimlp = NULL; + u32 address = imr->imr_address.s_addr; + + ifindex = imr->imr_ifindex; + for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { + if (iml->multi.imr_multiaddr.s_addr != group) + continue; + + if (iml->multi.imr_ifindex == ifindex) + /* dire
Re: Possible leak of multicast source filter sctructure
Hello David, > Michal, > I believe the patch I submitted yesterday fixes this > problem, and in a simpler way. Somehow I've missed that e-mail (It did not appear on the thread's list at MARC archives). Sorry for that. > > +-DLS I've reviewed your patch (the IPv4 part of it) and I think I can say that our soulutions are actually quite similar. But I can come up with one difference that I'd like know your opinion on. Suppose the following situatuion: 1) create pppX interface: IP: A.B.C.D 2) join a multicast group by address: A.B.C.D ASSUMPTION: no other interface with the same IP address exists; the result should be that the group is joined on the pppX interface 3) create pppY interface IP: A.B.C.D (Yes the same IP, not unusual on ppp links.) 4) delete the pppX interace 5) attempt to leave the multicast group by address: A.B.C.D 6) * if your algorthim is used -EADDRNOTAVAIL is returned * if mine is used the multicast group is left on the pppX interface Surely this won't be a common problem but I think it's worth pointing out here. Michal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Possible leak of multicast source filter sctructure #3a
The same patch as in previous e-mail with a few typos in comments corrected: Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]> --- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:50:46.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-16 16:53:08.0 +0200 @@ -1369,13 +1369,15 @@ struct flowi fl = { .nl_u = { .ip4_u = { .daddr = imr->imr_multiaddr.s_addr } } }; struct rtable *rt; - struct net_device *dev = NULL; - struct in_device *idev = NULL; + struct net_device *dev; if (imr->imr_ifindex) { - idev = inetdev_by_index(imr->imr_ifindex); - if (idev) + struct in_device *idev = inetdev_by_index(imr->imr_ifindex); + + if (idev) { + imr->imr_address.s_addr = 0; __in_dev_put(idev); + } return idev; } if (imr->imr_address.s_addr) { @@ -1383,17 +1385,16 @@ if (!dev) return NULL; dev_put(dev); - } - - if (!dev && !ip_route_output_key(&rt, &fl)) { + } else if (!ip_route_output_key(&rt, &fl)) { dev = rt->u.dst.dev; ip_rt_put(rt); - } - if (dev) { - imr->imr_ifindex = dev->ifindex; - idev = __in_dev_get_rtnl(dev); - } - return idev; + if (!dev) + return NULL; + } else + return NULL; + + imr->imr_ifindex = dev->ifindex; + return __in_dev_get_rtnl(dev); } /* @@ -1798,27 +1799,79 @@ u32 ifindex; rtnl_lock(); - in_dev = ip_mc_find_dev(imr); - if (!in_dev) { - rtnl_unlock(); - return -ENODEV; - } ifindex = imr->imr_ifindex; - for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { - if (iml->multi.imr_multiaddr.s_addr == group && - iml->multi.imr_ifindex == ifindex) { - (void) ip_mc_leave_src(sk, iml, in_dev); - - *imlp = iml->next; - - ip_mc_dec_group(in_dev, group); - rtnl_unlock(); - sock_kfree_s(sk, iml, sizeof(*iml)); - return 0; + in_dev = ip_mc_find_dev(imr); + if (ifindex != 0) { + /* leave by interface index */ + for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { + if (iml->multi.imr_multiaddr.s_addr != group) + continue; + + if (iml->multi.imr_ifindex == ifindex) + goto leave; + } + } else { + /* leave by address / multicast group route */ + struct ip_mc_socklist **cimlp = NULL; + u32 address = imr->imr_address.s_addr; + + ifindex = imr->imr_ifindex; + for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { + if (iml->multi.imr_multiaddr.s_addr != group) + continue; + + if (iml->multi.imr_ifindex == ifindex) + /* direct match +* NOTE: We do not have to test for in_dev != NULL +* since we know that ifindex was zero before call +* to ip_mc_find_dev() but is non-zero now (as +* it equals to an interface index which is never +* zero). The ip_mc_find_dev() function modifies +* the ifindex only if it finds an interface +* (in wich case it returns non-NULL). Thus the +* in_dev must be non-NULL. +*/ + goto leave; + + if (cimlp == NULL && iml->multi.imr_address.s_addr == address) + cimlp = imlp; + } + + if (cimlp != NULL) { + /* We have found at least one candidate interface +* for leaving by address but not a direct match. +* Since there is no way to tell what interface the user +* wnated to leave the multicast group on we are going +* to leave it on the first candidate interface found. +*/ + iml = *(imlp = cimlp); + + if (in_dev != NULL) { + /* If we have found a
Re: Possible leak of multicast source filter sctructure #3
Hi > I'm not sure the second one is quite right. The case of concern > is where an interface is deleted. If you joined (or left) the group by > address and then deleted the interface, then you wouldn't match the > index (which wouldn't be set) so the leave wouldn't work, still. That's right I havent thought of this case. > Also, if you passed a completely bogus ifindex, it should return > ENODEV, but with the patch it would return EADDRNOTAVAIL it appears. The question is what is "completely bogus ifindex" in this case? An interface that does not exist any more but happen to be on the sockets multicast list shouldn't be. > So, I think the second patch needs some more work. I'll look at > it some more and see if I can suggest something better. > > > +-DLS I've tried to implement something more complete but especially in the case of leaving a group by address it is still just a best effort and not something absolutely perfect. I've started with streamlining the ip_mc_find_dev() function with one little change in its behavior: clearing the imr_address member of the ip_mreqn request structure in case an interface is found by an index. This should be no problem since this member is not used in this case and may contain a random value. So I clear it to get rid of this randomness since this value might now be used in ip_mc_leave_group() Well and now the changes in the ip_mc_leave_group(): I've splitted it into two different cases: 1) leave by an interface index 2) leave by an interface address / muticast address In the first case I search for a match by the interface index specified in the leave request. If a match is found I leave the group on the interface irrespective of its existence. In the second case I do a similar search (but this time using the interface index found in ip_mc_find_dev()) while also checking for a match by the interface address. If no match is found by the interface index and there is a match (or more) by the address I leave the group on the interface corresponding to the first match by the address. This certainly could produce weird results but such results could be produced by the original algorithm as well with the additional problem that there was no way to leave a group on a deleted interface. And here is the patch: Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]> --- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:50:46.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-16 15:06:18.0 +0200 @@ -1369,13 +1369,15 @@ struct flowi fl = { .nl_u = { .ip4_u = { .daddr = imr->imr_multiaddr.s_addr } } }; struct rtable *rt; - struct net_device *dev = NULL; - struct in_device *idev = NULL; + struct net_device *dev; if (imr->imr_ifindex) { - idev = inetdev_by_index(imr->imr_ifindex); - if (idev) + struct in_device *idev = inetdev_by_index(imr->imr_ifindex); + + if (idev) { + imr->imr_address.s_addr = 0; __in_dev_put(idev); + } return idev; } if (imr->imr_address.s_addr) { @@ -1383,17 +1385,16 @@ if (!dev) return NULL; dev_put(dev); - } - - if (!dev && !ip_route_output_key(&rt, &fl)) { + } else if (!ip_route_output_key(&rt, &fl)) { dev = rt->u.dst.dev; ip_rt_put(rt); - } - if (dev) { - imr->imr_ifindex = dev->ifindex; - idev = __in_dev_get_rtnl(dev); - } - return idev; + if (!dev) + return NULL; + } else + return NULL; + + imr->imr_ifindex = dev->ifindex; + return __in_dev_get_rtnl(dev); } /* @@ -1798,27 +1799,79 @@ u32 ifindex; rtnl_lock(); - in_dev = ip_mc_find_dev(imr); - if (!in_dev) { - rtnl_unlock(); - return -ENODEV; - } ifindex = imr->imr_ifindex; - for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { - if (iml->multi.imr_multiaddr.s_addr == group && - iml->multi.imr_ifindex == ifindex) { - (void) ip_mc_leave_src(sk, iml, in_dev); - - *imlp = iml->next; - - ip_mc_dec_group(in_dev, group); - rtnl_unlock(); - sock_kfree_s(sk, iml, sizeof(*iml)); - return 0; + in_dev = ip_mc_find_dev(imr); + if (ifindex != 0) { + /* leave by interface index */ +
Re: Possible leak of multicast source filter sctructure
Hi, ... > (Meanwhile, Michal, can I get a Signed-off-by: line from you for these > patches? Thanks a lot.) no problem :-) There is a leak of a socket's multicast source filter list structure on closing a socket with a multicast source filter set on an interface that does not exist any more. This patch fixes it: Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]> --- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:45:56.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-11 11:51:56.0 +0200 @@ -2202,13 +2202,13 @@ struct in_device *in_dev; inet->mc_list = iml->next; - if ((in_dev = inetdev_by_index(iml->multi.imr_ifindex)) != NULL) { - (void) ip_mc_leave_src(sk, iml, in_dev); + in_dev = inetdev_by_index(iml->multi.imr_ifindex); + (void) ip_mc_leave_src(sk, iml, in_dev); + if (in_dev != NULL) { ip_mc_dec_group(in_dev, iml->multi.imr_multiaddr.s_addr); in_dev_put(in_dev); } sock_kfree_s(sk, iml, sizeof(*iml)); - } rtnl_unlock(); } Due to a bug in IP_DROP_MEMBERSHIP setsockopt handling code it is not possible to leave a multicast group joined on an interface that does not exist any more. This patch makes leaving a multicast group possible even in that case: Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]> --- linux-2.6.17.8/net/ipv4/igmp.c.leak 2006-08-11 11:50:46.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-11 11:52:33.0 +0200 @@ -1799,19 +1799,15 @@ rtnl_lock(); in_dev = ip_mc_find_dev(imr); - if (!in_dev) { - rtnl_unlock(); - return -ENODEV; - } ifindex = imr->imr_ifindex; for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { if (iml->multi.imr_multiaddr.s_addr == group && iml->multi.imr_ifindex == ifindex) { - (void) ip_mc_leave_src(sk, iml, in_dev); - *imlp = iml->next; - ip_mc_dec_group(in_dev, group); + (void) ip_mc_leave_src(sk, iml, in_dev); + if (in_dev != NULL) + ip_mc_dec_group(in_dev, group); rtnl_unlock(); sock_kfree_s(sk, iml, sizeof(*iml)); return 0; Michal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Possible leak of multicast source filter sctructure
> Michal, > This looks correct, but I think a better way to do it is: > > in_dev = inetdev_by_index(...) > (void) ip_mc_leave_src() > if (in_dev) { > ip_mc_dec_group() > in_dev_put() > } > > That way, sflist internal details aren't visible at this > level, and ip_mc_leave_src() collapses to the sock_kfree_s() > when in_dev is NULL. You are absolutely right, I just failed to notice that -ENODEV return value from ip_mc_del_src()/ip_mc_leave_src() is ignored. Here comes the patch: --- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:45:56.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-11 11:51:56.0 +0200 @@ -2202,13 +2202,13 @@ struct in_device *in_dev; inet->mc_list = iml->next; - if ((in_dev = inetdev_by_index(iml->multi.imr_ifindex)) != NULL) { - (void) ip_mc_leave_src(sk, iml, in_dev); + in_dev = inetdev_by_index(iml->multi.imr_ifindex); + (void) ip_mc_leave_src(sk, iml, in_dev); + if (in_dev != NULL) { ip_mc_dec_group(in_dev, iml->multi.imr_multiaddr.s_addr); in_dev_put(in_dev); } sock_kfree_s(sk, iml, sizeof(*iml)); - } rtnl_unlock(); } > Also, ip_mc_leave_group() has the same issue; looks > like it just needs the "if (in_dev)" removed before the call to > ip_mc_leave_src(). In fact it is a slightly different issue, there is no leak in this function. Rather the function completely fails to leave a multicast group joined on an interface that does not exist any more. Actually this is how I discovered the bug as I was tracking down a problem with ripd daemon of routing software quagga which failed to join a multicast group (with -ENOBUFS) on an interface after there were several (20 to be precise which corresponds to the default value [IP_MAX_MEMBERSHIPS] of sysctl_igmp_max_memberships) interfaces added/removed. Here comes a patch for that: --- linux-2.6.17.8/net/ipv4/igmp.c.leak 2006-08-11 11:50:46.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-11 11:52:33.0 +0200 @@ -1799,19 +1799,15 @@ rtnl_lock(); in_dev = ip_mc_find_dev(imr); - if (!in_dev) { - rtnl_unlock(); - return -ENODEV; - } ifindex = imr->imr_ifindex; for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) { if (iml->multi.imr_multiaddr.s_addr == group && iml->multi.imr_ifindex == ifindex) { - (void) ip_mc_leave_src(sk, iml, in_dev); - *imlp = iml->next; - ip_mc_dec_group(in_dev, group); + (void) ip_mc_leave_src(sk, iml, in_dev); + if (in_dev != NULL) + ip_mc_dec_group(in_dev, group); rtnl_unlock(); sock_kfree_s(sk, iml, sizeof(*iml)); return 0; > +-DLS > Michal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Possible leak of multicast source filter sctructure #2
Took some time but this time the inlined patch should be OK. Hi all! It seems to me that there is a leak of struct ip_sf_socklist in the ip_mc_drop_socket function (in net/ipv4/igmp.c) which is called on socket close. This patch corrects it: diff -Naur linux-2.6.17.8.orig/net/ipv4/igmp.c linux-2.6.17.8/net/ipv4/igmp.c --- linux-2.6.17.8.orig/net/ipv4/igmp.c 2006-08-07 06:18:54.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-10 10:38:04.0 +0200 @@ -2206,9 +2206,10 @@ (void) ip_mc_leave_src(sk, iml, in_dev); ip_mc_dec_group(in_dev, iml->multi.imr_multiaddr.s_addr); in_dev_put(in_dev); - } - sock_kfree_s(sk, iml, sizeof(*iml)); + } else if (iml->sflist != NULL) + sock_kfree_s(sk, iml->sflist, IP_SFLSIZE(iml->sflist->sl_max)); + sock_kfree_s(sk, iml, sizeof(*iml)); } rtnl_unlock(); } The leak only happens if there are some multicast source filters set on a socket wich are bound to an interface that does not exist any more, as in the following scenario: 1. create a temporary interface (say GRE tunnel) 2. create a socket 3. join a multicast group and set a source filter on the temporary interface via MCAST_JOIN_SOURCE_GROUP setsockopt call 4. destroy the temporary interface 5. close the socket This sequence of things eventually leads to a call of ip_mc_drop_socket function, which fails to free the soucre filter structure ip_sf_socklist pointed to from members of socket's multicast addresses list. This structure is normally freed in ip_mc_leave_src function but this function is not called in this scenario because the interface that the multicast group is joined on does not exist any more. Thanks Michal Ruzicka - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Possible leak of multicast source filter sctructure
Hi all! It seems to me that there is a leak of struct ip_sf_socklist in the ip_mc_drop_socket function (in net/ipv4/igmp.c) which is called on socket close. This patch corrects it: diff -Naur linux-2.6.17.8.orig/net/ipv4/igmp.c linux-2.6.17.8/net/ipv4/igmp.c --- linux-2.6.17.8.orig/net/ipv4/igmp.c 2006-08-07 06:18:54.0 +0200 +++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-10 10:38:04.0 +0200 @@ -2206,9 +2206,10 @@ (void) ip_mc_leave_src(sk, iml, in_dev); ip_mc_dec_group(in_dev, iml->multi.imr_multiaddr.s_addr); in_dev_put(in_dev); - } - sock_kfree_s(sk, iml, sizeof(*iml)); + } else if (iml->sflist != NULL) + sock_kfree_s(sk, iml->sflist, IP_SFLSIZE(iml->sflist->sl_max)); + sock_kfree_s(sk, iml, sizeof(*iml)); } rtnl_unlock(); } The leak only happens if there are some multicast source filters set on a socket wich are bound to an interface that does not exist any more, as in the following scenario: 1. create a temporary interface (say GRE tunnel) 3. join a multicast group an set a source filter on the temporary interface via MCAST_JOIN_SOURCE_GROUP setsockopt call 4. destroy the temporary interface 5. close the socket This sequence of things eventually leads to a call of ip_mc_drop_socket function, which fails to free the soucre filter structure ip_sf_socklist pointed to from members of socket's multicast addresses list. This structure is normally freed in ip_mc_leave_src function but this function is not called in this scenario because the interface that the multicast group is joined on does not exist any more. Thanks Michal Ruzicka linux-2.6.17.8-mc_sf_leak.patch Description: Binary data