Re: Multiple end-points behind same NAT

2006-12-02 Thread Michal Ruzicka

Hi,

although I'm not a kernel guru I think I've got something to say to this.



I am wondering if 26sec supports NAT-Traversal for multiple
endpoints behind the same NAT. In looking at xfrm_tmpl it's
not obvious to me that it's supported, ...


You are looking at the rignt place indeed. Just to make you sure, there is 
really no space to store the port infomation of the tunnel endpoints in the 
xfrm_tmpl structure.
The structure xfrm_state (a kernel structuture for holding SA's) is a bit 
different story though. Although the port information is not stored directly 
in the structure either, there is the encap member pointing to a 
xfrm_encap_tmpl structure which is used to hold the required information.


The consequences of this are:
1) The IKE dameon (or the key manager as it is called in the kernel context) 
can't get the full infomation from the kernel required to be a successful 
initiator in the case of  multiple peers behind the same NAT. (Though you 
might be able to get it working with a single peer behind the NAT if you 
configure the port forwarding at the NAT box carefuly.)


2) If there was an IKE daemon which could be told the required port 
information by some other means then directly by the kernel it should be 
possible to make it work despite the deficiencies of the kernel. I don't 
know if there is any IKE daemon capable of this, but I'm sure racoon can't 
do that.


3) It is possible to get this working the other way around: If the boxes 
behind the NAT were the initiators then it should work just fine at least if 
tunnel mode was used. There are some problems with the transport mode but 
even that can be made to work for certain scenarios.


Regards,
Michal 


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


AF_KEY extended xfrm_state selector handling

2006-10-19 Thread Michal Ruzicka
Hello

In an effort to configure an L2TP/IPsec server on Linux capable of supporting
multiple clients behind a single NAT device I ran into difficulties with pf_key
protocol implementation not being able to exploit all the information
passed to it as a SADB_EXT_ADDRESS_PROXY info. Perhaps as the original source
suggested (/* Nobody uses this, but we try. */) this info has never been used
before.

So I propose the included patch. It changes the following:
1) the amount of information that is stored into the struct xfrm_state's
   selector from the SADB_EXT_ADDRESS_PROXY info received from userspace
   in the pfkey_msg2xfrm_state() function
   The information stored now is:
- the address (stored as the selector's source address, including family)
- the prefix length (stored as the selector's source address prefix length)
- the protocol (stored as selector's protocol)
- the port (stored as selector's source port)
   Previously only the address and the prefix length were stored.

2) the conditions under which the SADB_EXT_ADDRESS_PROXY info
   is included while converting a struct xfrm_state into a pf_key message
   in the pfkey_xfrm_state2msg() function
   The conditions now are:
- selector' protcol family is non-zero (ie. the selector is defined)
and
(
  - selector's prtocol is non-zero (ie. the protocol is specified)
  or
  - selector's source port is non-zero (ie. the port is specified)
  or
  - selector's source address is different from xfrm_state source address
)
   Further the case when selector's address is of a different family from
   the xfrm_state address is now handled.

3) the way how port information is obtained from struct sadb_address
   The port information extraction is now part of
   the pfkey_sadb_addr2xfrm_addr() function wich handles that in a protocol
   family safe manner, instead of using ((struct sockaddr_in *)x)->sin_port
   construct irrespective of the protocol family

NOTES:
  - This patch should cause no problems, since as the original source says
nobody uses that info.
  - I've also created a patch for racoon (ipsec-tools) to actually pass
that info. Eventually I've been able to establish L2TP/IPSec connections
from multiple clients behind the same NAT to the same L2TP/IPSec linux 2.6
based server.
(The procedure required a manual insertion of certain SPD entries during
the connection establishment but this will hopefuly be handled by
the L2TP daemon automatically soon.)

Here comes the patch (it is against 2.6.17.11):
Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]>

diff -Naur linux-2.6.17.11.orig/net/key/af_key.c 
linux-2.6.17.11/net/key/af_key.c
--- linux-2.6.17.11.orig/net/key/af_key.c   2006-08-23 23:16:33.0 
+0200
+++ linux-2.6.17.11/net/key/af_key.c2006-10-18 16:53:48.0 +0200
@@ -552,19 +552,28 @@
 }
 
 static int pfkey_sadb_addr2xfrm_addr(struct sadb_address *addr,
-xfrm_address_t *xaddr)
+xfrm_address_t *xaddr, __u16 *port)
 {
switch (((struct sockaddr*)(addr + 1))->sa_family) {
case AF_INET:
-   xaddr->a4 = 
-   ((struct sockaddr_in *)(addr + 1))->sin_addr.s_addr;
+   {
+   struct sockaddr_in *in = (struct sockaddr_in *)(addr + 1);
+
+   xaddr->a4 = in->sin_addr.s_addr;
+   if (port)
+   *port = in->sin_port;
return AF_INET;
+   }
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
case AF_INET6:
-   memcpy(xaddr->a6, 
-  &((struct sockaddr_in6 *)(addr + 1))->sin6_addr,
-  sizeof(struct in6_addr));
+   {
+   struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)(addr + 1);
+
+   memcpy(xaddr->a6, &in6->sin6_addr, sizeof(struct in6_addr));
+   if (port)
+   *port = in6->sin6_port;
return AF_INET6;
+   }
 #endif
default:
return 0;
@@ -651,6 +660,7 @@
int encrypt_key_size = 0;
int sockaddr_size;
struct xfrm_encap_tmpl *natt = NULL;
+   int proxy_size;
 
/* address family check */
sockaddr_size = pfkey_sockaddr_size(x->props.family);
@@ -674,14 +684,25 @@
 
/* identity & sensitivity */
 
-   if ((x->props.family == AF_INET &&
-x->sel.saddr.a4 != x->props.saddr.a4)
-#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
-   || (x->props.family == AF_INET6 &&
-   memcmp (x->sel.saddr.a6, x->props.saddr.a6, sizeof (struct 
in6_addr)))
-#endif
-   )
-   size += sizeof(struct sadb_address) + sockaddr_size;
+   if (x->sel.family != 0 &&
+ 

multicast group memberships purge on interface delete

2006-08-23 Thread Michal Ruzicka

Hello there,
I've got the following question/suggestion:

The situation today:
When an interface is deleted and there happen to have been some multicast 
groups joined on it only
the interface's list of multicast meberships is deleted. The sockets through 
which the groups
were joined and more importantly their associated multicast membership lists 
are left untouched.
This makes it difficult for the function that handles leaving multicast 
groups on a socket to decide
what to do with groups that were joined on such an interface (that no longer 
exists). The present
implementation is a kind of  a "best guess" (and nothnig better can probably 
be done about that).
It may even fail to leave an affected group (group that was joined on a 
deleted interface) completely
and thus block a slot in the sockets's multicast mebership list which size 
is purposely limited.


My question/suggestion:
Would it feasible to drop the relevant entries from sockets' multicast 
membership lists on the interface
delete? Yes, I do realize it would require to walk through a number of 
sockets to see if there is any
multicast entry for the interface in question to delete. But this could be 
optimized by maintaining a list
of sockets that have a multicast group joined on the interface (and keep a 
pointer to this list in the
device structure). This would ease the job of the function handling leaving 
multicast groups, made
its beahaviour more "deterministic" and possible errors reported by it more 
meaningful/reliable.


Notes:
- The suggested approach is reportedly taken by other OSes (notably NetBSD). 
The fact
that linux doesn't behave the same poses a problem for cross platform 
software for the behaviour

of different systems is different in one more detail.
- The suggested "list of sockets that have a multicast group joined on the 
interface" could also
probably be of some help when maintaining the per interface multicast source 
filter list or
per-interface multicast reception state as per RFC 3376 (IGMPv3) section 
3.2.


Thanks
Michal Ruzicka 


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible leak of multicast source filter sctructure #4

2006-08-18 Thread Michal Ruzicka
 +-DLS

I agree, but for the time before all applications are fixed ...
(BTW from the IPv6 part of your patch it seems that the multicast group
membership management is done solely by the interface index there
... good idea! :-) )

> > Suppose the following situatuion:
> > 
> > 1) create pppX interface:
> >   IP: A.B.C.D
> > 
> > 2) join a multicast group by address: A.B.C.D
> >   ASSUMPTION: no other interface with the same IP address exists;
> >   the result should be that the group is joined on the pppX interface
> > 
> > 3) create pppY interface
> >   IP: A.B.C.D (Yes the same IP, not unusual on ppp links.)
> > 
> > 4) delete the pppX interace
> > 
> > 5) attempt to leave the multicast group by address: A.B.C.D
> > 
> > 6) * if your algorthim is used -EADDRNOTAVAIL is returned
> >* if mine is used the multicast group is left on the pppX interface
> > 

Michal

PS: During the writing of this post I realized a bug in my code:
The second ip_mc_find_dev() lookup on lines 42-43 should be done irrespective
of the prior value of in_dev.
And came up with an enhancement to what I had previously done in
ip_mc_find_dev():
In case an interface is found by its index the imr_address is no longer
cleared rather it is set to: INADDR_NONE.
This should result in somewhat smarter behaviour in case of leaving a group
by its multicast address alone.
Here is the corrected version of the patch:

Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]>

--- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:50:46.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c  2006-08-18 13:04:09.0 +0200
@@ -1369,13 +1369,15 @@
struct flowi fl = { .nl_u = { .ip4_u =
  { .daddr = imr->imr_multiaddr.s_addr } } 
};
struct rtable *rt;
-   struct net_device *dev = NULL;
-   struct in_device *idev = NULL;
+   struct net_device *dev;
 
if (imr->imr_ifindex) {
-   idev = inetdev_by_index(imr->imr_ifindex);
-   if (idev)
+   struct in_device *idev = inetdev_by_index(imr->imr_ifindex);
+
+   if (idev) {
+   imr->imr_address.s_addr = INADDR_NONE;
__in_dev_put(idev);
+   }
return idev;
}
if (imr->imr_address.s_addr) {
@@ -1383,17 +1385,16 @@
if (!dev)
return NULL;
dev_put(dev);
-   }
-
-   if (!dev && !ip_route_output_key(&rt, &fl)) {
+   } else if (!ip_route_output_key(&rt, &fl)) {
dev = rt->u.dst.dev;
ip_rt_put(rt);
-   }
-   if (dev) {
-   imr->imr_ifindex = dev->ifindex;
-   idev = __in_dev_get_rtnl(dev);
-   }
-   return idev;
+   if (!dev)
+   return NULL;
+   } else
+   return NULL;
+
+   imr->imr_ifindex = dev->ifindex;
+   return __in_dev_get_rtnl(dev);
 }
 
 /*
@@ -1798,27 +1799,77 @@
u32 ifindex;
 
rtnl_lock();
-   in_dev = ip_mc_find_dev(imr);
-   if (!in_dev) {
-   rtnl_unlock();
-   return -ENODEV;
-   }
ifindex = imr->imr_ifindex;
-   for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) {
-   if (iml->multi.imr_multiaddr.s_addr == group &&
-   iml->multi.imr_ifindex == ifindex) {
-   (void) ip_mc_leave_src(sk, iml, in_dev);
-
-   *imlp = iml->next;
-
-   ip_mc_dec_group(in_dev, group);
-   rtnl_unlock();
-   sock_kfree_s(sk, iml, sizeof(*iml));
-   return 0;
+   in_dev = ip_mc_find_dev(imr);
+   if (ifindex != 0) {
+   /* leave by interface index */
+   for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = 
&iml->next) {
+   if (iml->multi.imr_multiaddr.s_addr != group)
+   continue;
+
+   if (iml->multi.imr_ifindex == ifindex)
+   goto leave;
+   }
+   } else {
+   /* leave by address / multicast group route */
+   struct ip_mc_socklist **cimlp = NULL;
+   u32 address = imr->imr_address.s_addr;
+
+   ifindex = imr->imr_ifindex;
+   for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = 
&iml->next) {
+   if (iml->multi.imr_multiaddr.s_addr != group)
+   continue;
+
+   if (iml->multi.imr_ifindex == ifindex)
+   /* dire

Re: Possible leak of multicast source filter sctructure

2006-08-17 Thread Michal Ruzicka
Hello David,

> Michal,
> I believe the patch I submitted yesterday fixes this
> problem, and in a simpler way.

Somehow I've missed that e-mail (It did not appear on the thread's list at MARC
archives). Sorry for that.

> 
> +-DLS

I've reviewed your patch (the IPv4 part of it) and I think I can say that
our soulutions are actually quite similar.
But I can come up with one difference that I'd like know your opinion on.

Suppose the following situatuion:

1) create pppX interface:
  IP: A.B.C.D

2) join a multicast group by address: A.B.C.D
  ASSUMPTION: no other interface with the same IP address exists;
  the result should be that the group is joined on the pppX interface

3) create pppY interface
  IP: A.B.C.D (Yes the same IP, not unusual on ppp links.)

4) delete the pppX interace

5) attempt to leave the multicast group by address: A.B.C.D

6) * if your algorthim is used -EADDRNOTAVAIL is returned
   * if mine is used the multicast group is left on the pppX interface

Surely this won't be a common problem but I think it's worth pointing out
here.

Michal
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible leak of multicast source filter sctructure #3a

2006-08-16 Thread Michal Ruzicka
The same patch as in previous e-mail with a few typos in comments corrected:

Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]>

--- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:50:46.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c  2006-08-16 16:53:08.0 +0200
@@ -1369,13 +1369,15 @@
struct flowi fl = { .nl_u = { .ip4_u =
  { .daddr = imr->imr_multiaddr.s_addr } } 
};
struct rtable *rt;
-   struct net_device *dev = NULL;
-   struct in_device *idev = NULL;
+   struct net_device *dev;
 
if (imr->imr_ifindex) {
-   idev = inetdev_by_index(imr->imr_ifindex);
-   if (idev)
+   struct in_device *idev = inetdev_by_index(imr->imr_ifindex);
+
+   if (idev) {
+   imr->imr_address.s_addr = 0;
__in_dev_put(idev);
+   }
return idev;
}
if (imr->imr_address.s_addr) {
@@ -1383,17 +1385,16 @@
if (!dev)
return NULL;
dev_put(dev);
-   }
-
-   if (!dev && !ip_route_output_key(&rt, &fl)) {
+   } else if (!ip_route_output_key(&rt, &fl)) {
dev = rt->u.dst.dev;
ip_rt_put(rt);
-   }
-   if (dev) {
-   imr->imr_ifindex = dev->ifindex;
-   idev = __in_dev_get_rtnl(dev);
-   }
-   return idev;
+   if (!dev)
+   return NULL;
+   } else
+   return NULL;
+
+   imr->imr_ifindex = dev->ifindex;
+   return __in_dev_get_rtnl(dev);
 }
 
 /*
@@ -1798,27 +1799,79 @@
u32 ifindex;
 
rtnl_lock();
-   in_dev = ip_mc_find_dev(imr);
-   if (!in_dev) {
-   rtnl_unlock();
-   return -ENODEV;
-   }
ifindex = imr->imr_ifindex;
-   for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) {
-   if (iml->multi.imr_multiaddr.s_addr == group &&
-   iml->multi.imr_ifindex == ifindex) {
-   (void) ip_mc_leave_src(sk, iml, in_dev);
-
-   *imlp = iml->next;
-
-   ip_mc_dec_group(in_dev, group);
-   rtnl_unlock();
-   sock_kfree_s(sk, iml, sizeof(*iml));
-   return 0;
+   in_dev = ip_mc_find_dev(imr);
+   if (ifindex != 0) {
+   /* leave by interface index */
+   for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = 
&iml->next) {
+   if (iml->multi.imr_multiaddr.s_addr != group)
+   continue;
+
+   if (iml->multi.imr_ifindex == ifindex)
+   goto leave;
+   }
+   } else {
+   /* leave by address / multicast group route */
+   struct ip_mc_socklist **cimlp = NULL;
+   u32 address = imr->imr_address.s_addr;
+
+   ifindex = imr->imr_ifindex;
+   for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = 
&iml->next) {
+   if (iml->multi.imr_multiaddr.s_addr != group)
+   continue;
+
+   if (iml->multi.imr_ifindex == ifindex)
+   /* direct match
+* NOTE: We do not have to test for in_dev != 
NULL
+* since we know that ifindex was zero before 
call
+* to ip_mc_find_dev() but is non-zero now (as
+* it equals to an interface index which is 
never
+* zero). The ip_mc_find_dev() function modifies
+* the ifindex only if it finds an interface
+* (in wich case it returns non-NULL). Thus the
+* in_dev must be non-NULL.
+*/
+   goto leave;
+
+   if (cimlp == NULL && iml->multi.imr_address.s_addr == 
address)
+   cimlp = imlp;
+   }
+
+   if (cimlp != NULL) {
+   /* We have found at least one candidate interface
+* for leaving by address but not a direct match.
+* Since there is no way to tell what interface the user
+* wnated to leave the multicast group on we are going
+* to leave it on the first candidate interface found.
+*/
+   iml = *(imlp = cimlp);
+
+   if (in_dev != NULL) {
+   /* If we have found a

Re: Possible leak of multicast source filter sctructure #3

2006-08-16 Thread Michal Ruzicka
Hi
> I'm not sure the second one is quite right. The case of concern
> is where an interface is deleted. If you joined (or left) the group by
> address and then deleted the interface, then you wouldn't match the
> index (which wouldn't be set) so the leave wouldn't work, still.

That's right I havent thought of this case.

> Also, if you passed a completely bogus ifindex, it should return
> ENODEV, but with the patch it would return EADDRNOTAVAIL it appears.

The question is what is "completely bogus ifindex" in this case?
An interface that does not exist any more but happen to be on the sockets
multicast list shouldn't be.

> So, I think the second patch needs some more work. I'll look at
> it some more and see if I can suggest something better.
> 
> 
> +-DLS

I've tried to implement something more complete but especially in the case
of leaving a group by address it is still just a best effort and not something
absolutely perfect.

I've started with streamlining the ip_mc_find_dev() function with one little
change in its behavior: clearing the imr_address member of the ip_mreqn
request structure in case an interface is found by an index.
This should be no problem since this member is not used in this case and may
contain a random value. So I clear it to get rid of this randomness since
this value might now be used in ip_mc_leave_group()

Well and now the changes in the ip_mc_leave_group():
I've splitted it into two different cases:
 1) leave by an interface index
 2) leave by an interface address / muticast address

In the first case I search for a match by the interface index specified
in the leave request. If a match is found I leave the group on the
interface irrespective of its existence.

In the second case I do a similar search (but this time using the interface
index found in ip_mc_find_dev()) while also checking for a match by
the interface address.
If no match is found by the interface index and there is a match (or more)
by the address I leave the group on the interface corresponding to the first
match by the address.
This certainly could produce weird results but such results could be
produced by the original algorithm as well with the additional problem
that there was no way to leave a group on a deleted interface.

And here is the patch:

Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]>

--- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:50:46.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c  2006-08-16 15:06:18.0 +0200
@@ -1369,13 +1369,15 @@
struct flowi fl = { .nl_u = { .ip4_u =
  { .daddr = imr->imr_multiaddr.s_addr } } 
};
struct rtable *rt;
-   struct net_device *dev = NULL;
-   struct in_device *idev = NULL;
+   struct net_device *dev;
 
if (imr->imr_ifindex) {
-   idev = inetdev_by_index(imr->imr_ifindex);
-   if (idev)
+   struct in_device *idev = inetdev_by_index(imr->imr_ifindex);
+
+   if (idev) {
+   imr->imr_address.s_addr = 0;
__in_dev_put(idev);
+   }
return idev;
}
if (imr->imr_address.s_addr) {
@@ -1383,17 +1385,16 @@
if (!dev)
return NULL;
dev_put(dev);
-   }
-
-   if (!dev && !ip_route_output_key(&rt, &fl)) {
+   } else if (!ip_route_output_key(&rt, &fl)) {
dev = rt->u.dst.dev;
ip_rt_put(rt);
-   }
-   if (dev) {
-   imr->imr_ifindex = dev->ifindex;
-   idev = __in_dev_get_rtnl(dev);
-   }
-   return idev;
+   if (!dev)
+   return NULL;
+   } else
+   return NULL;
+
+   imr->imr_ifindex = dev->ifindex;
+   return __in_dev_get_rtnl(dev);
 }
 
 /*
@@ -1798,27 +1799,79 @@
u32 ifindex;
 
rtnl_lock();
-   in_dev = ip_mc_find_dev(imr);
-   if (!in_dev) {
-   rtnl_unlock();
-   return -ENODEV;
-   }
ifindex = imr->imr_ifindex;
-   for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) {
-   if (iml->multi.imr_multiaddr.s_addr == group &&
-   iml->multi.imr_ifindex == ifindex) {
-   (void) ip_mc_leave_src(sk, iml, in_dev);
-
-   *imlp = iml->next;
-
-   ip_mc_dec_group(in_dev, group);
-   rtnl_unlock();
-   sock_kfree_s(sk, iml, sizeof(*iml));
-   return 0;
+   in_dev = ip_mc_find_dev(imr);
+   if (ifindex != 0) {
+   /* leave by interface index */
+

Re: Possible leak of multicast source filter sctructure

2006-08-14 Thread Michal Ruzicka
Hi,

 ...
> (Meanwhile, Michal, can I get a Signed-off-by: line from you for these
>  patches?  Thanks a lot.)

no problem :-)

There is a leak of a socket's multicast source filter list structure
on closing a socket with a multicast source filter set on an interface
that does not exist any more.
This patch fixes it:

Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]>

--- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:45:56.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c  2006-08-11 11:51:56.0 +0200
@@ -2202,13 +2202,13 @@
struct in_device *in_dev;
inet->mc_list = iml->next;
 
-   if ((in_dev = inetdev_by_index(iml->multi.imr_ifindex)) != 
NULL) {
-   (void) ip_mc_leave_src(sk, iml, in_dev);
+   in_dev = inetdev_by_index(iml->multi.imr_ifindex);
+   (void) ip_mc_leave_src(sk, iml, in_dev);
+   if (in_dev != NULL) {
ip_mc_dec_group(in_dev, 
iml->multi.imr_multiaddr.s_addr);
in_dev_put(in_dev);
}
sock_kfree_s(sk, iml, sizeof(*iml));
-
}
rtnl_unlock();
 }


Due to a bug in IP_DROP_MEMBERSHIP setsockopt handling code it is not
possible to leave a multicast group joined on an interface that
does not exist any more.
This patch makes leaving a multicast group possible even in that case:

Signed-off-by: Michal Ruzicka <[EMAIL PROTECTED]>

--- linux-2.6.17.8/net/ipv4/igmp.c.leak 2006-08-11 11:50:46.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c  2006-08-11 11:52:33.0 +0200
@@ -1799,19 +1799,15 @@
 
rtnl_lock();
in_dev = ip_mc_find_dev(imr);
-   if (!in_dev) {
-   rtnl_unlock();
-   return -ENODEV;
-   }
ifindex = imr->imr_ifindex;
for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) {
if (iml->multi.imr_multiaddr.s_addr == group &&
iml->multi.imr_ifindex == ifindex) {
-   (void) ip_mc_leave_src(sk, iml, in_dev);
-
*imlp = iml->next;
 
-   ip_mc_dec_group(in_dev, group);
+   (void) ip_mc_leave_src(sk, iml, in_dev);
+   if (in_dev != NULL)
+   ip_mc_dec_group(in_dev, group);
rtnl_unlock();
sock_kfree_s(sk, iml, sizeof(*iml));
return 0;


Michal
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible leak of multicast source filter sctructure

2006-08-11 Thread Michal Ruzicka
> Michal,
> This looks correct, but I think a better way to do it is:
> 
> in_dev = inetdev_by_index(...)
> (void) ip_mc_leave_src()
> if (in_dev) {
> ip_mc_dec_group()
> in_dev_put()
> }
> 
> That way, sflist internal details aren't visible at this
> level, and ip_mc_leave_src() collapses to the sock_kfree_s()
> when in_dev is NULL.

You are absolutely right, I just failed to notice that -ENODEV return value
from ip_mc_del_src()/ip_mc_leave_src() is ignored.
Here comes the patch:

--- linux-2.6.17.8/net/ipv4/igmp.c.orig 2006-08-11 11:45:56.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c  2006-08-11 11:51:56.0 +0200
@@ -2202,13 +2202,13 @@
struct in_device *in_dev;
inet->mc_list = iml->next;
 
-   if ((in_dev = inetdev_by_index(iml->multi.imr_ifindex)) != 
NULL) {
-   (void) ip_mc_leave_src(sk, iml, in_dev);
+   in_dev = inetdev_by_index(iml->multi.imr_ifindex);
+   (void) ip_mc_leave_src(sk, iml, in_dev);
+   if (in_dev != NULL) {
ip_mc_dec_group(in_dev, 
iml->multi.imr_multiaddr.s_addr);
in_dev_put(in_dev);
}
sock_kfree_s(sk, iml, sizeof(*iml));
-
}
rtnl_unlock();
 }


> Also, ip_mc_leave_group() has the same issue; looks
> like it just needs the "if (in_dev)" removed before the call to
> ip_mc_leave_src().

In fact it is a slightly different issue, there is no leak in this
function. Rather the function completely fails to leave a multicast
group joined on an interface that does not exist any more. Actually
this is how I discovered the bug as I was tracking down a problem
with ripd daemon of routing software quagga which failed to join
a multicast group (with -ENOBUFS) on an interface after there were
several (20 to be precise which corresponds to the default value
[IP_MAX_MEMBERSHIPS] of sysctl_igmp_max_memberships) interfaces
added/removed.
Here comes a patch for that:

--- linux-2.6.17.8/net/ipv4/igmp.c.leak 2006-08-11 11:50:46.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c  2006-08-11 11:52:33.0 +0200
@@ -1799,19 +1799,15 @@
 
rtnl_lock();
in_dev = ip_mc_find_dev(imr);
-   if (!in_dev) {
-   rtnl_unlock();
-   return -ENODEV;
-   }
ifindex = imr->imr_ifindex;
for (imlp = &inet->mc_list; (iml = *imlp) != NULL; imlp = &iml->next) {
if (iml->multi.imr_multiaddr.s_addr == group &&
iml->multi.imr_ifindex == ifindex) {
-   (void) ip_mc_leave_src(sk, iml, in_dev);
-
*imlp = iml->next;
 
-   ip_mc_dec_group(in_dev, group);
+   (void) ip_mc_leave_src(sk, iml, in_dev);
+   if (in_dev != NULL)
+   ip_mc_dec_group(in_dev, group);
rtnl_unlock();
sock_kfree_s(sk, iml, sizeof(*iml));
return 0;


> +-DLS
> 

Michal
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Possible leak of multicast source filter sctructure #2

2006-08-10 Thread Michal Ruzicka
Took some time but this time the inlined patch should be OK.

Hi all!
It seems to me that there is a leak of struct ip_sf_socklist in the 
ip_mc_drop_socket function (in net/ipv4/igmp.c) which is called on socket 
close.

This patch corrects it:

diff -Naur linux-2.6.17.8.orig/net/ipv4/igmp.c linux-2.6.17.8/net/ipv4/igmp.c
--- linux-2.6.17.8.orig/net/ipv4/igmp.c 2006-08-07 06:18:54.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c  2006-08-10 10:38:04.0 +0200
@@ -2206,9 +2206,10 @@
(void) ip_mc_leave_src(sk, iml, in_dev);
ip_mc_dec_group(in_dev, 
iml->multi.imr_multiaddr.s_addr);
in_dev_put(in_dev);
-   }
-   sock_kfree_s(sk, iml, sizeof(*iml));
+   } else if (iml->sflist != NULL)
+   sock_kfree_s(sk, iml->sflist, 
IP_SFLSIZE(iml->sflist->sl_max));
 
+   sock_kfree_s(sk, iml, sizeof(*iml));
}
rtnl_unlock();
 }


The leak only happens if there are some multicast source filters set on a 
socket wich are bound to an interface that does not exist any more, as in 
the following scenario:
1. create a temporary interface (say GRE tunnel)
2. create a socket
3. join a multicast group and set a source filter on the temporary interface 
via MCAST_JOIN_SOURCE_GROUP setsockopt call
4. destroy the temporary interface
5. close the socket

This sequence of things eventually leads to a call of ip_mc_drop_socket 
function, which fails to free the soucre filter structure ip_sf_socklist 
pointed to from members of socket's multicast addresses list. This structure 
is normally freed in ip_mc_leave_src function but this function is not 
called in this scenario because the interface that the multicast group is 
joined on does not exist any more.

Thanks
Michal Ruzicka 

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Possible leak of multicast source filter sctructure

2006-08-10 Thread Michal Ruzicka

Hi all!
It seems to me that there is a leak of struct ip_sf_socklist in the 
ip_mc_drop_socket function (in net/ipv4/igmp.c) which is called on socket 
close.


This patch corrects it:

diff -Naur linux-2.6.17.8.orig/net/ipv4/igmp.c 
linux-2.6.17.8/net/ipv4/igmp.c

--- linux-2.6.17.8.orig/net/ipv4/igmp.c 2006-08-07 06:18:54.0 +0200
+++ linux-2.6.17.8/net/ipv4/igmp.c 2006-08-10 10:38:04.0 +0200
@@ -2206,9 +2206,10 @@
   (void) ip_mc_leave_src(sk, iml, in_dev);
   ip_mc_dec_group(in_dev, iml->multi.imr_multiaddr.s_addr);
   in_dev_put(in_dev);
-  }
-  sock_kfree_s(sk, iml, sizeof(*iml));
+  } else if (iml->sflist != NULL)
+   sock_kfree_s(sk, iml->sflist, IP_SFLSIZE(iml->sflist->sl_max));

+  sock_kfree_s(sk, iml, sizeof(*iml));
 }
 rtnl_unlock();
}

The leak only happens if there are some multicast source filters set on a 
socket wich are bound to an interface that does not exist any more, as in 
the following scenario:

1. create a temporary interface (say GRE tunnel)
3. join a multicast group an set a source filter on the temporary interface 
via MCAST_JOIN_SOURCE_GROUP setsockopt call

4. destroy the temporary interface
5. close the socket

This sequence of things eventually leads to a call of ip_mc_drop_socket 
function, which fails to free the soucre filter structure ip_sf_socklist 
pointed to from members of socket's multicast addresses list. This structure 
is normally freed in ip_mc_leave_src function but this function is not 
called in this scenario because the interface that the multicast group is 
joined on does not exist any more.


Thanks
Michal Ruzicka 


linux-2.6.17.8-mc_sf_leak.patch
Description: Binary data