Re: [PATCH] IPv6: Fix NULL dereference in ipv6_del_addr()
YOSHIFUJI Hideaki wrote: In article [EMAIL PROTECTED] (at Fri, 17 Nov 2006 15:26:28 +0200), Ville Nuorvala [EMAIL PROTECTED] says: - dst_release(rt-u.dst); + if (rt) + dst_release(rt-u.dst); } I disagree. This does NOT fix any bugs. (void *)rt-u.dst is ever equal to (void*)rt, and dst_release() checks if the argument is NULL. As the check is unnecessary you probably want to clean up the other places where rt is checked before rt-u.dst is passed, as well ;-) This is done at least in addrconf.c, ndisc.c and route.c... Seriously though, you are probably right about the pointer being equal to NULL in this case, but does the C language actually guarantee that the pointer to the structure and its first element are equal, or is it implementation dependent? I don't have my KR here, so I can't check. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/6] IPv6: Fix infinite loop if no matching IPv6 tunnel found
David Miller wrote: From: Ville Nuorvala [EMAIL PROTECTED] Date: Thu, 02 Nov 2006 16:22:07 +0200 Ok, I'll resubmit a patch doesn't send an ICMPv6 error message. Is this coming soon? I'd like to integrate this patch set into net-2.6.20 if I can. No, it was a false alarm as Herbert's patch already addressed the problem. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/6] IPv6: Fix infinite loop if no matching IPv6 tunnel found
YOSHIFUJI Hideaki wrote: In article [EMAIL PROTECTED] (at Thu, 02 Nov 2006 15:16:23 +0200), Ville Nuorvala [EMAIL PROTECTED] says: On 11/02/06 14:59, YOSHIFUJI Hideaki wrote: In article [EMAIL PROTECTED] (at Thu, 02 Nov 2006 13:39:19 +0200), Ville Nuorvala [EMAIL PROTECTED] says: read_unlock(ip6ip6_lock); - return 1; - + icmpv6_send(skb, ICMPV6_DEST_UNREACH, + ICMPV6_ADDR_UNREACH, 0, skb-dev); discard: I'd argue this. We probably should not send back any ICMPv6 packets to the original sender in this case to avoid DoS. Sorry, I don't follow you. I don't see the DoS scenario here (after we apply the patch, that is ;-). Well, leaving aside whether sending icmpv6 is good thing (*), the code for sending icmpv6 was moved from ip6_tunnel.c to tunnel6.c by commit-id 50fba2aa7cefa6b0e1768cb350c9e69042320c03 by Herbert. The ip6_tunnel.c change that Herbert made does not seem consistent with ipip.c change. To fix your issue the appropriate change is just fall through to discard section, as we're doing for ipip.c. Ah, I hadn't noticed Herbert's patch. It actually appears to fix the problem I was trying to fix here. AFAIK Tero experienced the infinite loop on a 2.6.16 kernel. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/6] IPv6: Fix SIOCCHGTUNNEL bug in IPv6 tunnels
From f9812eb0349f44f6374910ce38524f0c6f7ce8f4 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala [EMAIL PROTECTED] Date: Thu, 2 Nov 2006 12:40:38 +0200 Subject: [PATCH 1/6] IPv6: Fix SIOCCHGTUNNEL bug in IPv6 tunnels A logic bug in tunnel lookup could result in duplicate tunnels when changing an existing device. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/ip6_tunnel.c | 111 1 files changed, 46 insertions(+), 65 deletions(-) diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 84d7ebd..e61458c 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -215,11 +215,10 @@ ip6ip6_tnl_unlink(struct ip6_tnl *t) * Create tunnel matching given parameters. * * Return: - * 0 on success + * created tunnel or NULL **/ -static int -ip6_tnl_create(struct ip6_tnl_parm *p, struct ip6_tnl **pt) +static struct ip6_tnl *ip6_tnl_create(struct ip6_tnl_parm *p) { struct net_device *dev; struct ip6_tnl *t; @@ -236,11 +235,11 @@ ip6_tnl_create(struct ip6_tnl_parm *p, s break; } if (i == IP6_TNL_MAX) - return -ENOBUFS; + goto failed; } dev = alloc_netdev(sizeof (*t), name, ip6ip6_tnl_dev_setup); if (dev == NULL) - return -ENOMEM; + goto failed; t = netdev_priv(dev); dev-init = ip6ip6_tnl_dev_init; @@ -248,13 +247,13 @@ ip6_tnl_create(struct ip6_tnl_parm *p, s if ((err = register_netdevice(dev)) 0) { free_netdev(dev); - return err; + goto failed; } dev_hold(dev); - ip6ip6_tnl_link(t); - *pt = t; - return 0; + return t; +failed: + return NULL; } /** @@ -268,32 +267,23 @@ ip6_tnl_create(struct ip6_tnl_parm *p, s * tunnel device is created and registered for use. * * Return: - * 0 if tunnel located or created, - * -EINVAL if parameters incorrect, - * -ENODEV if no matching tunnel available + * matching tunnel or NULL **/ -static int -ip6ip6_tnl_locate(struct ip6_tnl_parm *p, struct ip6_tnl **pt, int create) +static struct ip6_tnl *ip6ip6_tnl_locate(struct ip6_tnl_parm *p, int create) { struct in6_addr *remote = p-raddr; struct in6_addr *local = p-laddr; struct ip6_tnl *t; - if (p-proto != IPPROTO_IPV6) - return -EINVAL; - for (t = *ip6ip6_bucket(p); t; t = t-next) { if (ipv6_addr_equal(local, t-parms.laddr) - ipv6_addr_equal(remote, t-parms.raddr)) { - *pt = t; - return (create ? -EEXIST : 0); - } + ipv6_addr_equal(remote, t-parms.raddr)) + return t; } if (!create) - return -ENODEV; - - return ip6_tnl_create(p, pt); + return NULL; + return ip6_tnl_create(p); } /** @@ -919,26 +909,20 @@ static int ip6ip6_tnl_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) { int err = 0; - int create; struct ip6_tnl_parm p; struct ip6_tnl *t = NULL; switch (cmd) { case SIOCGETTUNNEL: if (dev == ip6ip6_fb_tnl_dev) { - if (copy_from_user(p, - ifr-ifr_ifru.ifru_data, - sizeof (p))) { + if (copy_from_user(p, ifr-ifr_ifru.ifru_data, sizeof (p))) { err = -EFAULT; break; } - if ((err = ip6ip6_tnl_locate(p, t, 0)) == -ENODEV) - t = netdev_priv(dev); - else if (err) - break; - } else + t = ip6ip6_tnl_locate(p, 0); + } + if (t == NULL) t = netdev_priv(dev); - memcpy(p, t-parms, sizeof (p)); if (copy_to_user(ifr-ifr_ifru.ifru_data, p, sizeof (p))) { err = -EFAULT; @@ -947,35 +931,36 @@ ip6ip6_tnl_ioctl(struct net_device *dev, case SIOCADDTUNNEL: case SIOCCHGTUNNEL: err = -EPERM; - create = (cmd == SIOCADDTUNNEL); if (!capable(CAP_NET_ADMIN)) break; - if (copy_from_user(p, ifr-ifr_ifru.ifru_data, sizeof (p))) { - err = -EFAULT; + err = -EFAULT; + if (copy_from_user(p, ifr-ifr_ifru.ifru_data, sizeof (p))) break; - } - if (!create dev != ip6ip6_fb_tnl_dev) { - t = netdev_priv(dev); - } - if (!t (err = ip6ip6_tnl_locate
[PATCH 0/6] IPv6: IPv6 tunnel fixes
Hello, the following incremental patches contain several fixes to ip6_tunnel.c. They are in chronological order, so the most critical one is unfortunately last. Many thanks to Tero Kauppinen for reporting that one! The patches are sent as attachments as my mailer has at least previously messed up inlined patches. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/6] IPv6: Do mandatory IPv6 tunnel endpoint checks in realtime
From e60ed41fb11793d16ffcbfd56b889e0bbf04ea88 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala [EMAIL PROTECTED] Date: Thu, 2 Nov 2006 12:43:50 +0200 Subject: [PATCH 2/6] IPv6: Do mandatory IPv6 tunnel endpoint checks in realtime Doing the mandatory tunnel endpoint checks when the tunnel is set up isn't enough as interfaces can go up or down and addresses can be added or deleted after this. The checks need to be done realtime when the tunnel is processing a packet. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/ip6_tunnel.c | 96 +++- 1 files changed, 62 insertions(+), 34 deletions(-) diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index e61458c..2467b04 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -494,6 +494,27 @@ static inline void ip6ip6_ecn_decapsulat if (INET_ECN_is_ce(ipv6_get_dsfield(outer_iph))) IP6_ECN_set_ce(inner_iph); } +static inline int ip6_tnl_rcv_ctl(struct ip6_tnl *t) +{ + struct ip6_tnl_parm *p = t-parms; + int ret = 0; + + if (p-flags IP6_TNL_F_CAP_RCV) { + struct net_device *ldev = NULL; + + if (p-link) + ldev = dev_get_by_index(p-link); + + if ((ipv6_addr_is_multicast(p-laddr) || +likely(ipv6_chk_addr(p-laddr, ldev, 0))) + likely(!ipv6_chk_addr(p-raddr, NULL, 0))) + ret = 1; + + if (ldev) + dev_put(ldev); + } + return ret; +} /** * ip6ip6_rcv - decapsulate IPv6 packet and retransmit it locally @@ -518,7 +539,7 @@ ip6ip6_rcv(struct sk_buff *skb) goto discard; } - if (!(t-parms.flags IP6_TNL_F_CAP_RCV)) { + if (!ip6_tnl_rcv_ctl(t)) { t-stat.rx_dropped++; read_unlock(ip6ip6_lock); goto discard; @@ -596,6 +617,34 @@ ip6ip6_tnl_addr_conflict(struct ip6_tnl return ipv6_addr_equal(t-parms.raddr, hdr-saddr); } +static inline int ip6_tnl_xmit_ctl(struct ip6_tnl *t) +{ + struct ip6_tnl_parm *p = t-parms; + int ret = 0; + + if (p-flags IP6_TNL_F_CAP_XMIT) { + struct net_device *ldev = NULL; + + if (p-link) + ldev = dev_get_by_index(p-link); + + if (unlikely(!ipv6_chk_addr(p-laddr, ldev, 0))) + printk(KERN_WARNING + %s xmit: Local address not yet configured!\n, + p-name); + else if (!ipv6_addr_is_multicast(p-raddr) +unlikely(ipv6_chk_addr(p-raddr, NULL, 0))) + printk(KERN_WARNING + %s xmit: Routing loop! + Remote address found on this node!\n, + p-name); + else + ret = 1; + if (ldev) + dev_put(ldev); + } + return ret; +} /** * ip6ip6_tnl_xmit - encapsulate packet and send * @skb: the outgoing socket buffer @@ -633,10 +682,9 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, str goto tx_err; } if (skb-protocol != htons(ETH_P_IPV6) || - !(t-parms.flags IP6_TNL_F_CAP_XMIT) || - ip6ip6_tnl_addr_conflict(t, ipv6h)) { + !ip6_tnl_xmit_ctl(t) || ip6ip6_tnl_addr_conflict(t, ipv6h)) goto tx_err; - } + if ((offset = parse_tlv_tnl_enc_lim(skb, skb-nh.raw)) 0) { struct ipv6_tlv_tnl_enc_lim *tel; tel = (struct ipv6_tlv_tnl_enc_lim *) skb-nh.raw[offset]; @@ -767,39 +815,19 @@ tx_err: static void ip6_tnl_set_cap(struct ip6_tnl *t) { struct ip6_tnl_parm *p = t-parms; - struct in6_addr *laddr = p-laddr; - struct in6_addr *raddr = p-raddr; - int ltype = ipv6_addr_type(laddr); - int rtype = ipv6_addr_type(raddr); + int ltype = ipv6_addr_type(p-laddr); + int rtype = ipv6_addr_type(p-raddr); p-flags = ~(IP6_TNL_F_CAP_XMIT|IP6_TNL_F_CAP_RCV); - if (ltype != IPV6_ADDR_ANY rtype != IPV6_ADDR_ANY - ((ltype|rtype) -(IPV6_ADDR_UNICAST| - IPV6_ADDR_LOOPBACK|IPV6_ADDR_LINKLOCAL| - IPV6_ADDR_MAPPED|IPV6_ADDR_RESERVED)) == IPV6_ADDR_UNICAST) { - struct net_device *ldev = NULL; - int l_ok = 1; - int r_ok = 1; - - if (p-link) - ldev = dev_get_by_index(p-link); - - if (ltypeIPV6_ADDR_UNICAST !ipv6_chk_addr(laddr, ldev, 0)) - l_ok = 0; - - if (rtypeIPV6_ADDR_UNICAST ipv6_chk_addr(raddr, NULL, 0)) - r_ok = 0; - - if (l_ok r_ok
[PATCH 3/6] IPv6: Allow link-local tunnel endpoints
From de1ad3df949303a1454ae0cdee0a599c0fc95e45 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala [EMAIL PROTECTED] Date: Thu, 2 Nov 2006 12:45:08 +0200 Subject: [PATCH 3/6] IPv6: Allow link-local tunnel endpoints Allow link-local tunnel endpoints if the underlying link is defined. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/ip6_tunnel.c |8 ++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 2467b04..008bd11 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -460,6 +460,7 @@ ip6ip6_err(struct sk_buff *skb, struct i if (rel_msg pskb_may_pull(skb, offset + sizeof (*ipv6h))) { struct rt6_info *rt; struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC); + if (!skb2) goto out; @@ -823,7 +824,7 @@ static void ip6_tnl_set_cap(struct ip6_t if (ltype (IPV6_ADDR_UNICAST|IPV6_ADDR_MULTICAST) rtype (IPV6_ADDR_UNICAST|IPV6_ADDR_MULTICAST) !((ltype|rtype) IPV6_ADDR_LOOPBACK) - !((ltype|rtype) IPV6_ADDR_LINKLOCAL)) { + (!((ltype|rtype) IPV6_ADDR_LINKLOCAL) || p-link)) { if (ltypeIPV6_ADDR_UNICAST) p-flags |= IP6_TNL_F_CAP_XMIT; if (rtypeIPV6_ADDR_UNICAST) @@ -861,8 +862,11 @@ static void ip6ip6_tnl_link_config(struc dev-iflink = p-link; if (p-flags IP6_TNL_F_CAP_XMIT) { + int strict = (ipv6_addr_type(p-raddr) + (IPV6_ADDR_MULTICAST|IPV6_ADDR_LINKLOCAL)); + struct rt6_info *rt = rt6_lookup(p-raddr, p-laddr, -p-link, 0); +p-link, strict); if (rt == NULL) return; -- 1.4.3.2
[PATCH 4/6] IPv6: Don't allocate memory for Tunnel Encapsulation Limit Option
From b278633af228bcc986856f6492b02422c25656c7 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala [EMAIL PROTECTED] Date: Thu, 2 Nov 2006 12:47:54 +0200 Subject: [PATCH 4/6] IPv6: Don't allocate memory for Tunnel Encapsulation Limit Option Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/ip6_tunnel.c | 58 ++-- 1 files changed, 22 insertions(+), 36 deletions(-) diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 008bd11..a97eb52 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -571,31 +571,23 @@ discard: return 0; } -static inline struct ipv6_txoptions *create_tel(__u8 encap_limit) -{ - struct ipv6_tlv_tnl_enc_lim *tel; - struct ipv6_txoptions *opt; - __u8 *raw; - - int opt_len = sizeof(*opt) + 8; - - if (!(opt = kzalloc(opt_len, GFP_ATOMIC))) { - return NULL; - } - opt-tot_len = opt_len; - opt-dst0opt = (struct ipv6_opt_hdr *) (opt + 1); - opt-opt_nflen = 8; +struct ipv6_tel_txoption { + struct ipv6_txoptions ops; + __u8 dst_opt[8]; +}; - tel = (struct ipv6_tlv_tnl_enc_lim *) (opt-dst0opt + 1); - tel-type = IPV6_TLV_TNL_ENCAP_LIMIT; - tel-length = 1; - tel-encap_limit = encap_limit; +static void init_tel_txopt(struct ipv6_tel_txoption *opt, __u8 encap_limit) +{ + memset(opt, 0, sizeof(struct ipv6_tel_txoption)); - raw = (__u8 *) opt-dst0opt; - raw[5] = IPV6_TLV_PADN; - raw[6] = 1; + opt-dst_opt[2] = IPV6_TLV_TNL_ENCAP_LIMIT; + opt-dst_opt[3] = 1; + opt-dst_opt[4] = encap_limit; + opt-dst_opt[5] = IPV6_TLV_PADN; + opt-dst_opt[6] = 1; - return opt; + opt-ops.dst0opt = (struct ipv6_opt_hdr *) opt-dst_opt; + opt-ops.opt_nflen = 8; } /** @@ -665,8 +657,8 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, str struct ip6_tnl *t = netdev_priv(dev); struct net_device_stats *stats = t-stat; struct ipv6hdr *ipv6h = skb-nh.ipv6h; - struct ipv6_txoptions *opt = NULL; int encap_limit = -1; + struct ipv6_tel_txoption opt; __u16 offset; struct flowi fl; struct dst_entry *dst; @@ -695,9 +687,9 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, str goto tx_err; } encap_limit = tel-encap_limit - 1; - } else if (!(t-parms.flags IP6_TNL_F_IGN_ENCAP_LIMIT)) { + } else if (!(t-parms.flags IP6_TNL_F_IGN_ENCAP_LIMIT)) encap_limit = t-parms.encap_limit; - } + memcpy(fl, t-fl, sizeof (fl)); proto = fl.proto; @@ -707,9 +699,6 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, str if ((t-parms.flags IP6_TNL_F_USE_ORIG_FLOWLABEL)) fl.fl6_flowlabel |= (*(__u32 *) ipv6h IPV6_FLOWLABEL_MASK); - if (encap_limit = 0 (opt = create_tel(encap_limit)) == NULL) - goto tx_err; - if ((dst = ip6_tnl_dst_check(t)) != NULL) dst_hold(dst); else { @@ -730,7 +719,7 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, str goto tx_err_dst_release; } mtu = dst_mtu(dst) - sizeof (*ipv6h); - if (opt) { + if (encap_limit = 0) { max_headroom += 8; mtu -= 8; } @@ -768,9 +757,10 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, str skb-h.raw = skb-nh.raw; - if (opt) - ipv6_push_nfrag_opts(skb, opt, proto, NULL); - + if (encap_limit = 0) { + init_tel_txopt(opt, encap_limit); + ipv6_push_nfrag_opts(skb, opt.ops, proto, NULL); + } skb-nh.raw = skb_push(skb, sizeof(struct ipv6hdr)); ipv6h = skb-nh.ipv6h; *(u32*)ipv6h = fl.fl6_flowlabel | htonl(0x6000); @@ -794,9 +784,6 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, str stats-tx_aborted_errors++; } ip6_tnl_dst_store(t, dst); - - kfree(opt); - t-recursion--; return 0; tx_err_link_failure: @@ -804,7 +791,6 @@ tx_err_link_failure: dst_link_failure(skb); tx_err_dst_release: dst_release(dst); - kfree(opt); tx_err: stats-tx_errors++; stats-tx_dropped++; -- 1.4.3.2
[PATCH 5/6] IPv6: Improve IPv6 tunnel error reporting
From 6b9b468f1cf3f2897fbcda389e8c2514a69c4943 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala [EMAIL PROTECTED] Date: Thu, 2 Nov 2006 12:49:47 +0200 Subject: [PATCH 5/6] IPv6: Improve IPv6 tunnel error reporting Log an error if the remote tunnel endpoint is unable to handle tunneled packets. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/ip6_tunnel.c | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index a97eb52..4f64ed7 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -424,12 +424,9 @@ ip6ip6_err(struct sk_buff *skb, struct i } break; case ICMPV6_PARAMPROB: - /* ignore if parameter problem not caused by a tunnel - encapsulation limit sub-option */ - if (code != ICMPV6_HDR_FIELD) { - break; - } - teli = parse_tlv_tnl_enc_lim(skb, skb-data); + teli = 0; + if (code == ICMPV6_HDR_FIELD) + teli = parse_tlv_tnl_enc_lim(skb, skb-data); if (teli teli == ntohl(info) - 2) { tel = (struct ipv6_tlv_tnl_enc_lim *) skb-data[teli]; @@ -441,6 +438,10 @@ ip6ip6_err(struct sk_buff *skb, struct i tunnel!\n, t-parms.name); rel_msg = 1; } + } else if (net_ratelimit()) { + printk(KERN_WARNING + %s: Recipient unable to parse tunneled + packet!\n , t-parms.name); } break; case ICMPV6_PKT_TOOBIG: -- 1.4.3.2
[PATCH 6/6] IPv6: Fix infinite loop if no matching IPv6 tunnel found
From d9ecea2b1d88bc8702f70fbbca7cde2afb8312ee Mon Sep 17 00:00:00 2001 From: Ville Nuorvala [EMAIL PROTECTED] Date: Thu, 2 Nov 2006 13:07:35 +0200 Subject: [PATCH 6/6] IPv6: Fix infinite loop if no matching IPv6 tunnel found If no matching IPv6 tunnel was found ip6ip6_rcv() would cause ip6_input_finish() to resubmit the same skb to ip6ip6_rcv(). Many thanks to Tero Kauppinen at Ericsson for reporting this issue. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/ip6_tunnel.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 4f64ed7..603ed0d 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -565,11 +565,11 @@ ip6ip6_rcv(struct sk_buff *skb) return 0; } read_unlock(ip6ip6_lock); - return 1; - + icmpv6_send(skb, ICMPV6_DEST_UNREACH, + ICMPV6_ADDR_UNREACH, 0, skb-dev); discard: kfree_skb(skb); - return 0; + return -1; } struct ipv6_tel_txoption { -- 1.4.3.2
Re: [PATCH 9/13] [SCTP] Merge IPv4 and IPv6 versions of get_saddr() with their corresponding get_dst().
On 10/27/06 20:47, Sridhar Samudrala wrote: On Tue, 2006-10-17 at 03:19 +0300, Ville Nuorvala wrote: As the IPv6 route lookup now also returns the selected source address there is no need for a separate source address lookup. In fact, the source address selection needs to be moved to get_dst() because the selected IPv6 source address isn't always stored in the route. Sometimes this makes it impossible to guess the correct address later on. Ville, Overall the patch looks pretty good. I found only 1 issue in sctp_v6_get_dst(). See below. snip +/* Returns the dst cache entry for the given source and destination ip + * addresses. + */ +static struct dst_entry *sctp_v6_get_dst(struct sctp_association *asoc, + union sctp_addr *daddr, + union sctp_addr *saddr) +{ +struct dst_entry *dst; +struct flowi fl; +struct sctp_bind_addr *bp; +rwlock_t *addr_lock; +struct sctp_sockaddr_entry *laddr; +struct list_head *pos; +struct rt6_info *rt; +union sctp_addr baddr; +sctp_scope_t scope; +__u8 matchlen = 0; +__u8 bmatchlen; + +memset(fl, 0, sizeof(fl)); +ipv6_addr_copy(fl.fl6_dst, daddr-v6.sin6_addr); +if (ipv6_addr_type(daddr-v6.sin6_addr) IPV6_ADDR_LINKLOCAL) +fl.oif = daddr-v6.sin6_scope_id; + +ipv6_addr_copy(fl.fl6_src, saddr-v6.sin6_addr); +SCTP_DEBUG_PRINTK(%s: DST= NIP6_FMT SRC= NIP6_FMT , + __FUNCTION__, NIP6(fl.fl6_dst), NIP6(fl.fl6_src)); + +dst = ip6_route_output(NULL, fl); +if (dst-error) { +dst_release(dst); +dst = NULL; +} +if (!ipv6_addr_any(saddr-v6.sin6_addr)) +goto out; +if (!asoc) { +if (dst) +ipv6_addr_copy(saddr-v6.sin6_addr, fl.fl6_src); +goto out; +} +bp = asoc-base.bind_addr; +addr_lock = asoc-base.addr_lock; + +if (dst) { +/* Walk through the bind address list and look for a bind + * address that matches the source address of the returned rt. + */ +sctp_v6_fl_saddr(baddr, fl, bp-port); Here we are checking if the source address returned in the dst matches one of the address in the bind address list of the association. Not the source address that is passed to this routine(it could be INADDRY_ANY). So this should be changed back to sctp_v6_dst_saddr(). No, see that's the problem. The source address isn't always stored in the rt6_info. Therefore I copy the address into the flowi, so after ip6_route_output() fl does indeed contain the selected source address. Sorry if I didn't cc you all relevant patches from the patch set. Anyway there are still some unresolved issues with my code, but it's nice to know I didn't at least mess up SCTP in a big way ;-) Thanks, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: (usagi-core 31424) Re: [PATCH 7/13] [RFC] [IPV6] Move source address selection into route lookup.
On 10/19/06 05:27, Mitsuru Chinen wrote: Hello again Mitsuru-san, I haven't yet had time to follow up on this issue and I would in fact need some help to proceed. http://testlab.linux-ipv6.org/tahi-autorun.2/net-2.6_20061018/ The host testlab.linux-ipv6.org doesn't seem to be visible to the outside world so could you post the results somewhere where I could take a closer look at the results? Could you actually also send me the configuration files for the TAHI test, so I could try to set up an identical test environment? Thanks, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: (usagi-core 31424) Re: [PATCH 7/13] [RFC] [IPV6] Move source address selection into route lookup.
On 10/19/06 05:27, Mitsuru Chinen wrote: Hello Mitsuru-san, At the first Echo Request, its source address is a global address. | # ping6 -n -c 1 -i 1 -s 1452 -p 00 -w 2 -I eth1 FF1E: :1:2 | PATTERN: 0x00 | PING FF1E::1:2(ff1e::1:2) from 3ffe:501::100:2d0:b7ff:fe9a:455b eth1: 1452 data bytes | | --- FF1E::1:2 ping statistics --- | 3 packets transmitted, 0 received, 100% loss, time 1999ms However, after receiving a Too Big message, the source address of Echo Request is changed to :: at the failure case. thank you for the thorough test report! I'll try to track down the reason. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: (usagi-core 31424) Re: [PATCH 7/13] [RFC] [IPV6] Move source address selection into route lookup.
Mitsuru-san, could you apply patch #12 and rerun the test? I think this particular problem is caused by the routing cache entry not having a valid source address in the first place. As ip6_dst_lookup() doesn't do the source address lookup anymore, it is critical that we only store cache entries with complete address data for both destination and source. These two patches apparently shouldn't have been split in the first place. Sorry! Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 13/13] [RFC] [IPV6] Fix source prefix routing problems when source address undefined.
On 10/17/06 14:35, Thomas Graf wrote: Hi Thomas, Because otherwise a rule containing only a source prefix match is equivalent to a catch-all rule for all lookups not providing a source address. An example: Someone adding the rule ip rule add from 2001::1/128 unreachable results in _all_ lookups not providing a source address to resolve to unreachable which means that all source address lookups will fail. one quick but ugly hack would be to handle FR_ACT_TO_TBL rules differently than the others in fib6_rule_match(). I don't necessarily recommend this approach, but it could work. What do you say? The problem starts that both the routing decision and source address selection is both a routing decision sharing the same logic which are now conflicting as the behaviour for a from ANY requires different logic. In order to solve this, rules must be restricted to one of these paths, i.e. a rule intending to make certain prefixes unreachable may not apply to the source selection logic. This can be achieved using the 'reason' field I proposed in my netconf slides, it would allow turning the first rule example into rule add from 2001::1/128 for INPUT unreachable which would no longer apply when looking up the source address or deciding the outgoing route. Is there any reason we couldn't implement this? BTW I will not really able to participate in this discussion until Thursday, but please continue! Hopefully someone has a working solution that everyone is ok with by then ;-) I'm currently cooking something up, but I'm not yet sure what will become of it. It might not work at all. It will also change quite a lot of things, so there might be less invasive and cleaner solutions. Let's just say rt6_lookup(), which appears to cause most of these problems, will not be the same... Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/13] [RFC] Fix problems with IPv6 routing subtrees and source address selection
Hi, here are a bunch of more or less related patches having to do with fixing the IPv6 routing subtrees and source address selection. Most of the code is a cleaned up version of what I've written earlier for MIPL 2, where it has worked pretty well for a couple of years now. The SCTP code, however, turned out to be messier and more difficult to fix than I had originally thought. As I'm not that familiar with SCTP and don't really have an opportunity to test the code I'm especially grateful for any comments regarding those parts of the code. I've tried to split up the changes into logical parts to help digest them. Please comment! Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/13] [IPV6] Remove struct pol_chain.
Struct pol_chain has existed since at least the 2.2 kernel, but isn't used anymore. As the IPv6 policy routing is implemented in a totally different way in the current kernel, just get rid of it. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- include/net/ip6_route.h |7 --- 1 files changed, 0 insertions(+), 7 deletions(-) diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 6ca6b71..c14b70e 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -36,13 +36,6 @@ #define RT6_LOOKUP_F_IFACE 0x1 #define RT6_LOOKUP_F_REACHABLE 0x2 #define RT6_LOOKUP_F_HAS_SADDR 0x4 -struct pol_chain { - int type; - int priority; - struct fib6_node*rules; - struct pol_chain*next; -}; - extern struct rt6_info ip6_null_entry; #ifdef CONFIG_IPV6_MULTIPLE_TABLES -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/13] [SCTP] Fix minor typo
Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/sctp/socket.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 79c3e07..185d480 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -821,7 +821,7 @@ out: * addrs is a pointer to an array of one or more socket addresses. Each * address is contained in its appropriate structure (i.e. struct * sockaddr_in or struct sockaddr_in6) the family of the address type - * must be used to distengish the address length (note that this + * must be used to distinguish the address length (note that this * representation is termed a packed array of addresses). The caller * specifies the number of addresses in the array with addrcnt. * -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/13] [IPV6] Make sure error handling is done when calling ip6_route_output().
As ip6_route_output() never returns NULL, error checking must be done by looking at dst-error in stead of comparing dst against NULL. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/xfrm6_policy.c | 12 +++- net/sctp/ipv6.c | 10 +- 2 files changed, 12 insertions(+), 10 deletions(-) diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 6a252e2..db2d55c 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -25,12 +25,14 @@ #endif static struct dst_ops xfrm6_dst_ops; static struct xfrm_policy_afinfo xfrm6_policy_afinfo; -static int xfrm6_dst_lookup(struct xfrm_dst **dst, struct flowi *fl) +static int xfrm6_dst_lookup(struct xfrm_dst **xdst, struct flowi *fl) { - int err = 0; - *dst = (struct xfrm_dst*)ip6_route_output(NULL, fl); - if (!*dst) - err = -ENETUNREACH; + struct dst_entry *dst = ip6_route_output(NULL, fl); + int err = dst-error; + if (!err) + *xdst = (struct xfrm_dst *) dst; + else + dst_release(dst); return err; } diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index 249e503..78071c6 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -215,17 +215,17 @@ static struct dst_entry *sctp_v6_get_dst } dst = ip6_route_output(NULL, fl); - if (dst) { + if (!dst-error) { struct rt6_info *rt; rt = (struct rt6_info *)dst; SCTP_DEBUG_PRINTK( rt6_dst: NIP6_FMT rt6_src: NIP6_FMT \n, NIP6(rt-rt6i_dst.addr), NIP6(rt-rt6i_src.addr)); - } else { - SCTP_DEBUG_PRINTK(NO ROUTE\n); + return dst; } - - return dst; + SCTP_DEBUG_PRINTK(NO ROUTE\n); + dst_release(dst); + return NULL; } /* Returns the number of consecutive initial bits that match in the 2 ipv6 -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/13] [IPV6] Clean up BACKTRACK().
The fn check is unnecessary as fn can never be NULL in BACKTRACK(). Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/route.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index a1b0f07..263c057 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -484,7 +484,7 @@ #define BACKTRACK(saddr) \ do { \ if (rt == ip6_null_entry) { \ struct fib6_node *pn; \ - while (fn) { \ + while (1) { \ if (fn-fn_flags RTN_TL_ROOT) \ goto out; \ pn = fn-parent; \ -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/13] [IPV6] Make IPV6_SUBTREES depend on IPV6_MULTIPLE_TABLES.
As IPV6_SUBTREES can't work without IPV6_MULTIPLE_TABLES have IPV6_SUBTREES depend on it. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/Kconfig | 16 1 files changed, 8 insertions(+), 8 deletions(-) diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index a2d211d..5fd2ffd 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -152,9 +152,16 @@ config IPV6_TUNNEL If unsure, say N. +config IPV6_MULTIPLE_TABLES + bool IPv6: Multiple Routing Tables + depends on IPV6 EXPERIMENTAL + select FIB_RULES + ---help--- + Support multiple routing tables. + config IPV6_SUBTREES bool IPv6: source address based routing - depends on IPV6 EXPERIMENTAL + depends on IPV6_MULTIPLE_TABLES ---help--- Enable routing by source address or prefix. @@ -166,13 +173,6 @@ config IPV6_SUBTREES If unsure, say N. -config IPV6_MULTIPLE_TABLES - bool IPv6: Multiple Routing Tables - depends on IPV6 EXPERIMENTAL - select FIB_RULES - ---help--- - Support multiple routing tables. - config IPV6_ROUTE_FWMARK bool IPv6: use netfilter MARK value as routing key depends on IPV6_MULTIPLE_TABLES NETFILTER -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/13] [IPV6] Always copy rt-u.dst.error when copying a rt6_info.
Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/route.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 263c057..aa96be8 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -618,8 +618,6 @@ static struct rt6_info *rt6_alloc_clone( ipv6_addr_copy(rt-rt6i_dst.addr, daddr); rt-rt6i_dst.plen = 128; rt-rt6i_flags |= RTF_CACHE; - if (rt-rt6i_flags RTF_REJECT) - rt-u.dst.error = ort-u.dst.error; rt-u.dst.flags |= DST_HOST; rt-rt6i_nexthop = neigh_clone(ort-rt6i_nexthop); } @@ -1540,6 +1538,7 @@ static struct rt6_info * ip6_rt_copy(str rt-u.dst.output = ort-u.dst.output; memcpy(rt-u.dst.metrics, ort-u.dst.metrics, RTAX_MAX*sizeof(u32)); + rt-u.dst.error = ort-u.dst.error; rt-u.dst.dev = ort-u.dst.dev; if (rt-u.dst.dev) dev_hold(rt-u.dst.dev); -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 7/13] [RFC] [IPV6] Move source address selection into route lookup.
This patch moves the normal source address selection from ip6_dst_lookup() into ip6_pol_route_output(), but shouldn't change the routing or source address selection behavior in any way. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/ip6_output.c |6 -- net/ipv6/route.c | 37 ++--- 2 files changed, 22 insertions(+), 21 deletions(-) diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 6671691..0019007 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -855,12 +855,6 @@ static int ip6_dst_lookup_tail(struct so if ((err = (*dst)-error)) goto out_err_release; - if (ipv6_addr_any(fl-fl6_src)) { - err = ipv6_get_saddr(*dst, fl-fl6_dst, fl-fl6_src); - if (err) - goto out_err_release; - } - return 0; out_err_release: diff --git a/net/ipv6/route.c b/net/ipv6/route.c index aa96be8..b7b8148 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -536,7 +536,7 @@ struct rt6_info *rt6_lookup(struct in6_a int flags = strict ? RT6_LOOKUP_F_IFACE : 0; if (saddr) { - memcpy(fl.fl6_src, saddr, sizeof(*saddr)); + ipv6_addr_copy(fl.fl6_src, saddr); flags |= RT6_LOOKUP_F_HAS_SADDR; } @@ -629,13 +629,11 @@ static struct rt6_info *ip6_pol_route_in { struct fib6_node *fn; struct rt6_info *rt, *nrt; - int strict = 0; + int strict = flags RT6_LOOKUP_F_IFACE; int attempts = 3; int err; int reachable = RT6_LOOKUP_F_REACHABLE; - strict |= flags RT6_LOOKUP_F_IFACE; - relookup: read_lock_bh(table-tb6_lock); @@ -726,22 +724,22 @@ static struct rt6_info *ip6_pol_route_ou { struct fib6_node *fn; struct rt6_info *rt, *nrt; - int strict = 0; - int attempts = 3; - int err; + int has_saddr = flags RT6_LOOKUP_F_HAS_SADDR; + int strict = flags RT6_LOOKUP_F_IFACE; int reachable = RT6_LOOKUP_F_REACHABLE; + int attempts = 3; + struct in6_addr saddr; - strict |= flags RT6_LOOKUP_F_IFACE; - + ipv6_addr_copy(saddr, fl-fl6_src); relookup: read_lock_bh(table-tb6_lock); restart_2: - fn = fib6_lookup(table-tb6_root, fl-fl6_dst, fl-fl6_src); + fn = fib6_lookup(table-tb6_root, fl-fl6_dst, saddr); restart: rt = rt6_select(fn-leaf, fl-oif, strict | reachable); - BACKTRACK(fl-fl6_src); + BACKTRACK(saddr); if (rt == ip6_null_entry || rt-rt6i_flags RTF_CACHE) goto out; @@ -749,6 +747,13 @@ restart: dst_hold(rt-u.dst); read_unlock_bh(table-tb6_lock); + if (!has_saddr) { + /* policy rule doesn't restrict source address */ + if (ipv6_get_saddr(rt-u.dst, fl-fl6_dst, saddr)) + goto no_saddr; + has_saddr = RT6_LOOKUP_F_HAS_SADDR; + ipv6_addr_copy(fl-fl6_src, saddr); + } if (!rt-rt6i_nexthop !(rt-rt6i_flags RTF_NONEXTHOP)) nrt = rt6_alloc_cow(rt, fl-fl6_dst, fl-fl6_src); else { @@ -764,8 +769,7 @@ #endif dst_hold(rt-u.dst); if (nrt) { - err = ip6_ins_rt(nrt); - if (!err) + if (!ip6_ins_rt(nrt)) goto out2; } @@ -778,7 +782,6 @@ #endif */ dst_release(rt-u.dst); goto relookup; - out: if (reachable) { reachable = 0; @@ -790,6 +793,10 @@ out2: rt-u.dst.lastuse = jiffies; rt-u.dst.__use++; return rt; +no_saddr: + rt = ip6_null_entry; + dst_hold(rt-u.dst); + goto out2; } struct dst_entry * ip6_route_output(struct sock *sk, struct flowi *fl) @@ -2044,7 +2051,7 @@ #endif NLA_PUT_U32(skb, RTA_IIF, iif); else if (dst) { struct in6_addr saddr_buf; - if (ipv6_get_saddr(rt-u.dst, dst, saddr_buf) == 0) + if (!ipv6_get_saddr(rt-u.dst, dst, saddr_buf)) NLA_PUT(skb, RTA_PREFSRC, 16, saddr_buf); } -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 8/13] [RFC] [IPV6] Get rid of ipv6_get_saddr() in xfrm6_get_saddr().
As the source address is already selected in ip6_pol_route_output() there is no need to do the source address lookup a second time. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/xfrm6_policy.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index db2d55c..954c9ac 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -48,8 +48,7 @@ static int xfrm6_get_saddr(xfrm_address_ }; if (!xfrm6_dst_lookup((struct xfrm_dst **)rt, fl_tunnel)) { - ipv6_get_saddr(rt-u.dst, (struct in6_addr *)daddr-a6, - (struct in6_addr *)saddr-a6); + ipv6_addr_copy((struct in6_addr *)saddr, fl_tunnel.fl6_src); dst_release(rt-u.dst); return 0; } -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 9/13] [SCTP] Merge IPv4 and IPv6 versions of get_saddr() with their corresponding get_dst().
As the IPv6 route lookup now also returns the selected source address there is no need for a separate source address lookup. In fact, the source address selection needs to be moved to get_dst() because the selected IPv6 source address isn't always stored in the route. Sometimes this makes it impossible to guess the correct address later on. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- include/net/sctp/structs.h |7 - net/sctp/ipv6.c| 235 +++- net/sctp/protocol.c| 56 -- net/sctp/transport.c |8 + 4 files changed, 148 insertions(+), 158 deletions(-) diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h index c6d93bb..e0973a3 100644 --- a/include/net/sctp/structs.h +++ b/include/net/sctp/structs.h @@ -529,15 +529,8 @@ struct sctp_af { struct dst_entry *(*get_dst)(struct sctp_association *asoc, union sctp_addr *daddr, union sctp_addr *saddr); - void(*get_saddr)(struct sctp_association *asoc, -struct dst_entry *dst, -union sctp_addr *daddr, -union sctp_addr *saddr); void(*copy_addrlist) (struct list_head *, struct net_device *); - void(*dst_saddr)(union sctp_addr *saddr, -struct dst_entry *dst, -unsigned short port); int (*cmp_addr) (const union sctp_addr *addr1, const union sctp_addr *addr2); void(*addr_copy)(union sctp_addr *dst, diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index 78071c6..68ead54 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -188,46 +188,6 @@ static int sctp_v6_xmit(struct sk_buff * return ip6_xmit(sk, skb, fl, np-opt, ipfragok); } -/* Returns the dst cache entry for the given source and destination ip - * addresses. - */ -static struct dst_entry *sctp_v6_get_dst(struct sctp_association *asoc, -union sctp_addr *daddr, -union sctp_addr *saddr) -{ - struct dst_entry *dst; - struct flowi fl; - - memset(fl, 0, sizeof(fl)); - ipv6_addr_copy(fl.fl6_dst, daddr-v6.sin6_addr); - if (ipv6_addr_type(daddr-v6.sin6_addr) IPV6_ADDR_LINKLOCAL) - fl.oif = daddr-v6.sin6_scope_id; - - - SCTP_DEBUG_PRINTK(%s: DST= NIP6_FMT , - __FUNCTION__, NIP6(fl.fl6_dst)); - - if (saddr) { - ipv6_addr_copy(fl.fl6_src, saddr-v6.sin6_addr); - SCTP_DEBUG_PRINTK( - SRC= NIP6_FMT - , - NIP6(fl.fl6_src)); - } - - dst = ip6_route_output(NULL, fl); - if (!dst-error) { - struct rt6_info *rt; - rt = (struct rt6_info *)dst; - SCTP_DEBUG_PRINTK( - rt6_dst: NIP6_FMT rt6_src: NIP6_FMT \n, - NIP6(rt-rt6i_dst.addr), NIP6(rt-rt6i_src.addr)); - return dst; - } - SCTP_DEBUG_PRINTK(NO ROUTE\n); - dst_release(dst); - return NULL; -} - /* Returns the number of consecutive initial bits that match in the 2 ipv6 * addresses. */ @@ -250,69 +210,6 @@ static inline int sctp_v6_addr_match_len return (i*32); } -/* Fills in the source address(saddr) based on the destination address(daddr) - * and asoc's bind address list. - */ -static void sctp_v6_get_saddr(struct sctp_association *asoc, - struct dst_entry *dst, - union sctp_addr *daddr, - union sctp_addr *saddr) -{ - struct sctp_bind_addr *bp; - rwlock_t *addr_lock; - struct sctp_sockaddr_entry *laddr; - struct list_head *pos; - sctp_scope_t scope; - union sctp_addr *baddr = NULL; - __u8 matchlen = 0; - __u8 bmatchlen; - - SCTP_DEBUG_PRINTK(%s: asoc:%p dst:%p - daddr: NIP6_FMT , - __FUNCTION__, asoc, dst, NIP6(daddr-v6.sin6_addr)); - - if (!asoc) { - ipv6_get_saddr(dst, daddr-v6.sin6_addr,saddr-v6.sin6_addr); - SCTP_DEBUG_PRINTK(saddr from ipv6_get_saddr: NIP6_FMT \n, - NIP6(saddr-v6.sin6_addr)); - return; - } - - scope = sctp_scope(daddr); - - bp = asoc-base.bind_addr; - addr_lock = asoc-base.addr_lock; - - /* Go through the bind address list and find the best source address -* that matches the scope of the destination address. -*/ - sctp_read_lock(addr_lock); - list_for_each(pos
[PATCH 10/13] [RFC] [IPV6] Don't export ipv6_get_saddr().
To make sure the source address selection is done correctly, don't let users outside the ipv6 module call ipv6_get_saddr() directly. In stead have them go through ip6_route_output(). Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/ipv6_syms.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/net/ipv6/ipv6_syms.c b/net/ipv6/ipv6_syms.c index 0e8e067..94a9806 100644 --- a/net/ipv6/ipv6_syms.c +++ b/net/ipv6/ipv6_syms.c @@ -25,7 +25,6 @@ EXPORT_SYMBOL(inet6_release); EXPORT_SYMBOL(inet6_bind); EXPORT_SYMBOL(inet6_getname); EXPORT_SYMBOL(inet6_ioctl); -EXPORT_SYMBOL(ipv6_get_saddr); EXPORT_SYMBOL(ipv6_chk_addr); EXPORT_SYMBOL(in6_dev_finish_destroy); #ifdef CONFIG_XFRM -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/13] [RFC] [IPV6] Merge ipv6_dev_get_saddr() and ipv6_get_saddr().
The split into ipv6_get_saddr() and ipv6_dev_get_saddr() isn't necessary anymore, so they can be merged into just the function ipv6_get_saddr(). Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- include/net/addrconf.h |5 + net/ipv6/addrconf.c| 21 ++--- net/ipv6/ndisc.c |2 +- net/ipv6/route.c |5 +++-- 4 files changed, 11 insertions(+), 22 deletions(-) diff --git a/include/net/addrconf.h b/include/net/addrconf.h index 44f1b67..d075693 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -67,10 +67,7 @@ #endif extern struct inet6_ifaddr * ipv6_get_ifaddr(struct in6_addr *addr, struct net_device *dev, int strict); -extern int ipv6_get_saddr(struct dst_entry *dst, - struct in6_addr *daddr, - struct in6_addr *saddr); -extern int ipv6_dev_get_saddr(struct net_device *dev, +extern int ipv6_get_saddr(int pref_if, struct in6_addr *daddr, struct in6_addr *saddr); extern int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *); diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index c186763..09a22c8 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -904,8 +904,7 @@ static int inline ipv6_saddr_label(const return 1; } -int ipv6_dev_get_saddr(struct net_device *daddr_dev, - struct in6_addr *daddr, struct in6_addr *saddr) +int ipv6_get_saddr(int pref_if, struct in6_addr *daddr, struct in6_addr *saddr) { struct ipv6_saddr_score hiscore; struct inet6_ifaddr *ifa_result = NULL; @@ -937,7 +936,7 @@ int ipv6_dev_get_saddr(struct net_device */ if ((daddr_type IPV6_ADDR_MULTICAST || daddr_scope = IPV6_ADDR_SCOPE_LINKLOCAL) - daddr_dev dev != daddr_dev) + pref_if dev-ifindex != pref_if) continue; idev = __in6_dev_get(dev); @@ -1062,13 +1061,13 @@ #endif /* Rule 5: Prefer outgoing interface */ if (hiscore.rule 5) { - if (daddr_dev == NULL || - daddr_dev == ifa_result-idev-dev) + if (!pref_if || + pref_if == ifa_result-idev-dev-ifindex) hiscore.attrs |= IPV6_SADDR_SCORE_OIF; hiscore.rule++; } - if (daddr_dev == NULL || - daddr_dev == ifa-idev-dev) { + if (!pref_if || + pref_if == ifa-idev-dev-ifindex) { score.attrs |= IPV6_SADDR_SCORE_OIF; if (!(hiscore.attrs IPV6_SADDR_SCORE_OIF)) { score.rule = 5; @@ -1158,14 +1157,6 @@ record_it: return 0; } - -int ipv6_get_saddr(struct dst_entry *dst, - struct in6_addr *daddr, struct in6_addr *saddr) -{ - return ipv6_dev_get_saddr(dst ? ((struct rt6_info *)dst)-rt6i_idev-dev : NULL, daddr, saddr); -} - - int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr) { struct inet6_dev *idev; diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 0304b5f..3ac4e12 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -449,7 +449,7 @@ static void ndisc_send_na(struct net_dev src_addr = solicited_addr; in6_ifa_put(ifp); } else { - if (ipv6_dev_get_saddr(dev, daddr, tmpaddr)) + if (ipv6_get_saddr(dev-ifindex, daddr, tmpaddr)) return; src_addr = tmpaddr; } diff --git a/net/ipv6/route.c b/net/ipv6/route.c index b7b8148..7cd7747 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -748,8 +748,9 @@ restart: read_unlock_bh(table-tb6_lock); if (!has_saddr) { + int oif = rt-rt6i_dev-ifindex; /* policy rule doesn't restrict source address */ - if (ipv6_get_saddr(rt-u.dst, fl-fl6_dst, saddr)) + if (ipv6_get_saddr(oif, fl-fl6_dst, saddr)) goto no_saddr; has_saddr = RT6_LOOKUP_F_HAS_SADDR; ipv6_addr_copy(fl-fl6_src, saddr); @@ -2051,7 +2052,7 @@ #endif NLA_PUT_U32(skb, RTA_IIF, iif); else if (dst) { struct in6_addr saddr_buf; - if (!ipv6_get_saddr(rt-u.dst, dst, saddr_buf)) + if (!ipv6_get_saddr(rt-rt6i_dev-ifindex, dst, saddr_buf
[PATCH 12/13] [RFC] [IPV6] Make sure route cache entries have a valid source address.
Leaving out the source address from routing cache entries when using routing subtrees causes all kinds of problems. Make sure this doesn't happen. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- net/ipv6/route.c | 31 +-- 1 files changed, 17 insertions(+), 14 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 7cd7747..7c3438e 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -594,29 +594,28 @@ static struct rt6_info *rt6_alloc_cow(st ipv6_addr_copy(rt-rt6i_dst.addr, daddr); rt-rt6i_dst.plen = 128; - rt-rt6i_flags |= RTF_CACHE; - rt-u.dst.flags |= DST_HOST; - #ifdef CONFIG_IPV6_SUBTREES - if (rt-rt6i_src.plen saddr) { - ipv6_addr_copy(rt-rt6i_src.addr, saddr); - rt-rt6i_src.plen = 128; - } + ipv6_addr_copy(rt-rt6i_src.addr, saddr); + rt-rt6i_src.plen = 128; #endif - + rt-rt6i_flags |= RTF_CACHE; + rt-u.dst.flags |= DST_HOST; rt-rt6i_nexthop = ndisc_get_neigh(rt-rt6i_dev, rt-rt6i_gateway); - } return rt; } -static struct rt6_info *rt6_alloc_clone(struct rt6_info *ort, struct in6_addr *daddr) +static struct rt6_info *rt6_alloc_clone(struct rt6_info *ort, struct in6_addr *daddr, struct in6_addr *saddr) { struct rt6_info *rt = ip6_rt_copy(ort); if (rt) { ipv6_addr_copy(rt-rt6i_dst.addr, daddr); rt-rt6i_dst.plen = 128; +#ifdef CONFIG_IPV6_SUBTREES + ipv6_addr_copy(rt-rt6i_src.addr, saddr); + rt-rt6i_src.plen = 128; +#endif rt-rt6i_flags |= RTF_CACHE; rt-u.dst.flags |= DST_HOST; rt-rt6i_nexthop = neigh_clone(ort-rt6i_nexthop); @@ -654,7 +653,7 @@ restart: nrt = rt6_alloc_cow(rt, fl-fl6_dst, fl-fl6_src); else { #if CLONE_OFFLINK_ROUTE - nrt = rt6_alloc_clone(rt, fl-fl6_dst); + nrt = rt6_alloc_clone(rt, fl-fl6_dst, fl-fl6_src); #else goto out2; #endif @@ -756,10 +755,10 @@ restart: ipv6_addr_copy(fl-fl6_src, saddr); } if (!rt-rt6i_nexthop !(rt-rt6i_flags RTF_NONEXTHOP)) - nrt = rt6_alloc_cow(rt, fl-fl6_dst, fl-fl6_src); + nrt = rt6_alloc_cow(rt, fl-fl6_dst, saddr); else { #if CLONE_OFFLINK_ROUTE - nrt = rt6_alloc_clone(rt, fl-fl6_dst); + nrt = rt6_alloc_clone(rt, fl-fl6_dst, saddr); #else goto out2; #endif @@ -1429,6 +1428,10 @@ void rt6_redirect(struct in6_addr *dest, ipv6_addr_copy(nrt-rt6i_dst.addr, dest); nrt-rt6i_dst.plen = 128; +#ifdef CONFIG_IPV6_SUBTREES + ipv6_addr_copy(nrt-rt6i_src.addr, src); + nrt-rt6i_src.plen = 128; +#endif nrt-u.dst.flags |= DST_HOST; ipv6_addr_copy(nrt-rt6i_gateway, (struct in6_addr*)neigh-primary_key); @@ -1511,7 +1514,7 @@ void rt6_pmtu_discovery(struct in6_addr if (!rt-rt6i_nexthop !(rt-rt6i_flags RTF_NONEXTHOP)) nrt = rt6_alloc_cow(rt, daddr, saddr); else - nrt = rt6_alloc_clone(rt, daddr); + nrt = rt6_alloc_clone(rt, daddr, saddr); if (nrt) { nrt-u.dst.metrics[RTAX_MTU-1] = pmtu; -- 1.4.2.3 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/13] [RFC] [IPV6] Fix source prefix routing problems when source address undefined.
With IPv6 routing subtrees we need to take into account that the source address is typically not specified at the time of the route lookup. There are two separate cases where this can happen. In the typical case the source address hasn't been selected before the route lookup. Skipping a source prefix policy rule because of this will lead to inconsistent routing behavior between for example bound and unbound sockets. We avoid this by passing the policy rule source prefix to the lookup and source address selection functions. For source prefix rules the source address is selected before the route lookup, otherwise we do it the other way around. The source address selection algorithm remains virtually unchanged; the source prefix is just used to verify the selected address is compatible with the rule. If the source address doesn't match, the route lookup with the current rule is aborted and is started again with the next rule in the policy chain. The more uncommon case is where the unspecified address is actually used as a valid source address. When the kernel uses the unspecified address it doesn't touch the routing table. We need to make sure a userland application using a raw socket can do the same. If the user includes the IPv6 header we therefore have to bypass the source address selection even then the source address is unspecified. In addition, we don't insert any routing cache entry created by such a lookup. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- include/net/addrconf.h |4 +++- include/net/ip6_fib.h | 16 +++- net/ipv6/addrconf.c| 13 +++-- net/ipv6/fib6_rules.c | 16 ++-- net/ipv6/ip6_fib.c |2 +- net/ipv6/ndisc.c |2 +- net/ipv6/route.c | 41 + 7 files changed, 66 insertions(+), 28 deletions(-) diff --git a/include/net/addrconf.h b/include/net/addrconf.h index d075693..7066362 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -67,8 +67,10 @@ #endif extern struct inet6_ifaddr * ipv6_get_ifaddr(struct in6_addr *addr, struct net_device *dev, int strict); -extern int ipv6_get_saddr(int pref_if, +struct rt6key; +extern int ipv6_get_saddr(int pref_if, struct in6_addr *daddr, + struct rt6key *sconstr, struct in6_addr *saddr); extern int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *); extern int ipv6_rcv_saddr_equal(const struct sock *sk, diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index e4438de..8887b5c 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -21,6 +21,7 @@ #include linux/spinlock.h #include net/dst.h #include net/flow.h #include net/netlink.h +#include net/fib_rules.h struct rt6_info; @@ -77,6 +78,18 @@ struct rt6key int plen; }; +struct fib6_rule +{ + struct fib_rule common; + struct rt6key src; + struct rt6key dst; +#ifdef CONFIG_IPV6_ROUTE_FWMARK + u32 fwmark; + u32 fwmask; +#endif + u8 tclass; +}; + struct fib6_table; struct rt6_info @@ -174,7 +187,8 @@ #define RT6_TABLE_LOCAL RT6_TABLE_MAIN #endif typedef struct rt6_info *(*pol_lookup_t)(struct fib6_table *, -struct flowi *, int); +struct flowi *, int, +struct fib6_rule *); /* * exported functions diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 09a22c8..486af76 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -904,7 +904,8 @@ static int inline ipv6_saddr_label(const return 1; } -int ipv6_get_saddr(int pref_if, struct in6_addr *daddr, struct in6_addr *saddr) +int ipv6_get_saddr(int pref_if, struct in6_addr *daddr, + struct rt6key *sconstr, struct in6_addr *saddr) { struct ipv6_saddr_score hiscore; struct inet6_ifaddr *ifa_result = NULL; @@ -1151,7 +1152,15 @@ record_it: if (!ifa_result) return -EADDRNOTAVAIL; - +#ifdef CONFIG_IPV6_SUBTREES + /* Don't let source address based routing interfere with the + address selection, just make sure the selected address + matches the routing policy constraints */ + + if (sconstr sconstr-plen 0 + !ipv6_prefix_equal(saddr, sconstr-addr, sconstr-plen)) + return -EADDRNOTAVAIL; +#endif ipv6_addr_copy(saddr, ifa_result-addr); in6_ifa_put(ifa_result); return 0; diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index
[PATCH 9/13] [RFC] [SCTP] Merge IPv4 and IPv6 versions of get_saddr() with their corresponding get_dst().
Oops, this almost more than any other patch was RFC. Sorry about that! Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [IPv6] rules: Use RT6_LOOKUP_F_HAS_SADDR and fix source based selectors
On 10/13/06 05:16, David Miller wrote: From: YOSHIFUJI Hideaki [EMAIL PROTECTED] Date: Thu, 12 Oct 2006 18:55:51 +0900 (JST) I tend to agree. Ville, do you agree? I'll wait for Ville's response before applying this. Otherwise, I think the change looks fine. Acked-by: Ville Nuorvala [EMAIL PROTECTED] This doesn't however solve all the problems with source address based routing. For example we always need to have a valid source address when we cow or copy a route in ip6_pol_route_output() etc. In my original solution I moved the source address selection functionality into the route lookup code as the two very much depend on each other and can't IMO really be done separately. I'm currently working on a patch for all of this, but please go ahead and apply Thomas's and Kim's patch. I'll try to post an initial RFC version of my patch later today. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] [GIT PATCH] IPv6 Routing / Ndisc Fixes
YOSHIFUJI Hideaki wrote: As a result of original lookup (with possible ambiguous outout interface), one interface for original output is selected. Which means, we have a route for the (original) destination through that interface. Redirects shall come from that interface. So, it is enough to lookup routes on that interface. Yes, exactly. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][IPV6]: Make sure fib6_rule_lookup doesn't return NULL
David Miller wrote: From: Thomas Graf [EMAIL PROTECTED] Date: Wed, 9 Aug 2006 00:16:51 +0200 * Ville Nuorvala [EMAIL PROTECTED] 2006-08-09 01:05 [IPV6]: Make sure fib6_rule_lookup doesn't return NULL The callers of fib6_rule_lookup don't expect it to return NULL, therefore it must return ip6_null_entry whenever fib_rule_lookup fails. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] I think it would be a lot cleaner to return an error code like fib_lookup() and save the atomic operations. We only destroy the dst again anyway in case of a failure. Yes, the current scheme is a poor substitute for IS_ERR/PTR_ERR. But the ip6_null_entry not always discarded after a failed route lookup! On a forwarding router it triggers the ICMPv6 Destination Unreachable message to the sender. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][IPV6]: Make sure fib6_rule_lookup doesn't return NULL
David Miller wrote: From: Ville Nuorvala [EMAIL PROTECTED] Date: Wed, 09 Aug 2006 09:31:15 +0300 But the ip6_null_entry not always discarded after a failed route lookup! On a forwarding router it triggers the ICMPv6 Destination Unreachable message to the sender. Good point, but I remember that in some cases the caller only wants a real entry or an error. That's technically true, but the IPv6 code has largely been built around the assumption that the route lookup produces a valid pointer to a rt6_info entry. Of the three original route lookup functions (ip6_route_input, ip6_route_output and rt6_lookup), rt6_lookup was the only one that was allowed to produce a NULL entry. Of these three rt6_lookup was also the only one not actually being used for routing. The function that absolutely requires ip6_null_entry is ip6_route_input. As for ip6_route_output, even if it can return NULL or an error code, we still need to check for dst-error separately in all the places we call the function. There is also one more issue with ip6_null_entry: previously it has always been the result of an unsuccessful route lookup, now it can also be the result of a successful application of a FR_ACT_UNREACHABLE policy rule. From a networking point of view these two cases should IMO be considered equivalent and should therefore trigger the same response. This will however not be true if NULL (or an error code) is the result of an unsuccessful route lookup. Perhaps both cases can be accomodated with a SHIM inline for the forwarding case that returns ip6_null_entry when IS_ERR(retval). If no one else is worried about the FR_ACT_UNREACHABLE inconsistency, I can of course move the ip6_null_entry fallback code snippet to ip6_route_input and perhaps ip6_route_output, while letting the NULL route fall through fib6_rule_lookup and rt6_lookup. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][IPV6]: Make sure fib6_rule_lookup doesn't return NULL
Thomas Graf wrote: * Ville Nuorvala [EMAIL PROTECTED] 2006-08-09 11:36 Of the three original route lookup functions (ip6_route_input, ip6_route_output and rt6_lookup), rt6_lookup was the only one that was allowed to produce a NULL entry. Of these three rt6_lookup was also the only one not actually being used for routing. The function that absolutely requires ip6_null_entry is ip6_route_input. It would mean to change the logic of handling route errors like in the IPv4 path and not handle them in .input/.output. Instead of a dst we'd return a valid dst or a ERR_PTR() which would force the caller to take appropriate actions such as updating statistics and sending ICMPs. Ok, it might require quite big changes to the existing code, but if someone is willing to take a look at it I wouldn't be against it :-) There is also one more issue with ip6_null_entry: previously it has always been the result of an unsuccessful route lookup, now it can also be the result of a successful application of a FR_ACT_UNREACHABLE policy rule. From a networking point of view these two cases should IMO be considered equivalent and should therefore trigger the same response. This will however not be true if NULL (or an error code) is the result of an unsuccessful route lookup. Both would simply result in a -ENETUNREACHABLE. You know, I'm starting to think we could perhaps get rid of ip6_null_entry altogether. I at least don't really see any good reason to keep it after such changes. This would apply even more strongly to the new ip6_prohibit_entry and ip6_blk_hole_entry as they don't even serve as routing table dummy entries. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][IPV6]: Make sure fib6_rule_lookup doesn't return NULL
Thomas Graf wrote: * Ville Nuorvala [EMAIL PROTECTED] 2006-08-09 12:42 Ok, it might require quite big changes to the existing code, but if someone is willing to take a look at it I wouldn't be against it :-) I'll provide a patch soon. Great, but just out of curiosity: about when is soon? My source address routing changes are likely to cause some conflicts with your code, so I'm just trying to plan ahead so I don't need to spend half an eternity merging the afore mentioned conflicts. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Excess use of packed attribute
David Miller wrote: From: Stephen Hemminger [EMAIL PROTECTED] Date: Mon, 7 Aug 2006 13:34:23 -0700 Silly offenders: include/net/ipx.h include/net/ieee80211.h include/net/ip6_tunnel.h include/net/ndisc.h include/linux/if_ether.h include/linux/if_fddi.h include/linux/sctp.h -- really bad The ndisc.h one, for example, is needed for cases like ARM. The if_ether.h one is also needed, or else for: struct ethhdr *eth; eth + 1 would do the wrong thing as the compiler would align the structure to the native pointer size or similar. This is an issue because ethhdr is 14 bytes in size. Similarly the ipv6_tlv_tnl_enc_lim in ip6_tunnel.h is a 3-octet IPv6 destination sub-option. The sub-option can be located anywhere inside the destination option header and can only be found by parsing the whole header. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] [GIT PATCH] IPv6 Routing / Ndisc Fixes
Hi, I still seem to have serious problems with my mailer. Despite numerous resends, I still haven't seen my reply on netdev. Hopefully it finally gets through, sorry for any possible duplicates. Ville Nuorvala wrote: YOSHIFUJI Hideaki wrote: Hello. Hello Yoshifuji-san! Here's a set of changesets (on top of net-2.6.19 tree) to fix routing / ndisc. Changesets are available at: git://git.skbuff.net/gitroot/yoshfuji/net-2.6.19-20060809-polroute-fixes/ I'd like to comment some of the NDISC patches a bit (comments inline), but all other changes looked good. CHANGESETS -- commit 4f2956c43d77e1efbf044db305455493276fc6f2 Author: YOSHIFUJI Hideaki [EMAIL PROTECTED] Date: Wed Aug 9 16:53:52 2006 +0900 [IPV6] NDISC: Take source address into account for redirects. Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED] Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] --- commit 40ff54178bd3c5dbd80f9422e88f7539727cc4e7 Author: YOSHIFUJI Hideaki [EMAIL PROTECTED] Date: Wed Aug 9 16:53:53 2006 +0900 [IPV6] NDISC: Search over all possible rules on receipt of redirect. Split up function for finding routes for redirects. Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED] diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 91c9461..4650787 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1282,19 +1282,18 @@ static int ip6_route_del(struct in6_rtms /* * Handle redirects */ -void rt6_redirect(struct in6_addr *dest, struct in6_addr *src, - struct in6_addr *saddr, - struct neighbour *neigh, u8 *lladdr, int on_link) +struct ip6rd_flowi { +struct flowi fl; +struct in6_addr gateway; +}; + +static struct rt6_info *__ip6_route_redirect(struct fib6_table *table, + struct flowi *fl, + int flags) { -struct rt6_info *rt, *nrt = NULL; +struct ip6rd_flowi *rdfl = (struct ip6rd_flowi *)fl; +struct rt6_info *rt; struct fib6_node *fn; -struct fib6_table *table; -struct netevent_redirect netevent; - -/* TODO: Very lazy, might need to check all tables */ -table = fib6_get_table(RT6_TABLE_MAIN); -if (table == NULL) -return; /* * Get the current route for this destination and @@ -1308,7 +1307,7 @@ void rt6_redirect(struct in6_addr *dest, */ read_lock_bh(table-tb6_lock); -fn = fib6_lookup(table-tb6_root, dest, src); +fn = fib6_lookup(table-tb6_root, fl-fl6_dst, fl-fl6_src); restart: for (rt = fn-leaf; rt; rt = rt-u.next) { /* @@ -1323,29 +1322,67 @@ restart: continue; if (!(rt-rt6i_flags RTF_GATEWAY)) continue; -if (neigh-dev != rt-rt6i_dev) +if (fl-oif != rt-rt6i_dev-ifindex) continue; -if (!ipv6_addr_equal(saddr, rt-rt6i_gateway)) +if (!ipv6_addr_equal(rdfl-gateway, rt-rt6i_gateway)) continue; break; } -if (rt) -dst_hold(rt-u.dst); -else if (rt6_need_strict(dest)) { -while ((fn = fn-parent) != NULL) { -if (fn-fn_flags RTN_ROOT) -break; -if (fn-fn_flags RTN_RTINFO) -goto restart; + +if (!rt) { +if (rt6_need_strict(fl-fl6_dst)) { +while ((fn = fn-parent) != NULL) { +if (fn-fn_flags RTN_ROOT) +break; +if (fn-fn_flags RTN_RTINFO) +goto restart; +} } +rt = ip6_null_entry; } +dst_hold(rt-u.dst); + read_unlock_bh(table-tb6_lock); -if (!rt) { +return rt; +}; + +static struct rt6_info *ip6_route_redirect(struct in6_addr *dest, + struct in6_addr *src, + struct in6_addr *gateway, + struct net_device *dev) +{ +struct ip6rd_flowi rdfl = { +.fl = { +.oif = dev-ifindex, +.nl_u = { +.ip6_u = { +.daddr = *dest, +.saddr = *src, +}, +}, +}, +.gateway = *gateway, +}; +int flags = rt6_need_strict(dest) ? RT6_F_STRICT : 0; + +return (struct rt6_info *)fib6_rule_lookup((struct flowi *)rdfl, flags, __ip6_route_redirect); +} + +void rt6_redirect(struct in6_addr *dest, struct in6_addr *src, + struct in6_addr *saddr, + struct neighbour *neigh, u8
[PATCH][IPV6]: Make sure fib6_rule_lookup doesn't return NULL
Hi David, please apply the attached patch! Regards, Ville commit 1fb9864632af5a493353643e7acf970da1d4f59f tree 569fd122cad577f80138a3014c390a7999f5 parent a66c990b9f6b0b9b4c4d2b84f96ddd019b7a8eb3 author Ville Nuorvala [EMAIL PROTECTED] 1155073548 +0300 committer Ville Nuorvala [EMAIL PROTECTED] 1155073548 +0300 [IPV6]: Make sure fib6_rule_lookup doesn't return NULL The callers of fib6_rule_lookup don't expect it to return NULL, therefore it must return ip6_null_entry whenever fib_rule_lookup fails. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index bf9bba8..22a2fdb 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -63,7 +63,11 @@ struct dst_entry *fib6_rule_lookup(struc if (arg.rule) fib_rule_put(arg.rule); - return (struct dst_entry *) arg.result; + if (arg.result) + return (struct dst_entry *) arg.result; + + dst_hold(ip6_null_entry.u.dst); + return ip6_null_entry.u.dst; } static int fib6_rule_action(struct fib_rule *rule, struct flowi *flp,
Re: [RFC] Mobile IPv6 introduction
Hugo Santos wrote: David, On Tue, Aug 01, 2006 at 05:35:35PM -0700, David Miller wrote: This is partly why the multiple routing table code is being added as the initial infrastrucutre, so that source based things are possible. There have been other approaches for partial source-based stuff. For instance, in my tree i brought Subtrees back to a point of being usable. But what i was refering to was route-caching (some places only check cookies based on dst because we don't support source routing), APIs, etc. A few mails back you pointed the extension of a public structure to include a source attribute -- this is the kind of stuff we must add and that i think it's an independent work (read: even if the rest isn't merged, this should). Still regarding Subtrees, is there any interest in revitalizing that code? I have a couple patches that i could submit. Hi Hugo, I don't want to be dismissive towards your patches, but I've been working with the subtree routing stuff for several years now. And let me tell you: it has provided us with some nasty little surprises every now and then. I'm only saying it's surprisingly difficult to get right on the first try. To name just one issue: the chicken and egg problem of source address selection and source address based routing. I solved this problem by letting policy rules (with a source prefix) add additional constraints to the address selection. This did however mean the source address selection had to be moved inside the routing code. This is just one example; trust me, there are several more. My latest incarnation of source address routing is against my previous version of policy routing, which luckily isn't that different from current the version by Thomas. Unless Yoshifuji-san has already ported my code to Thomas'es policy routing code, I'll start working on it. Such a scheme would need provisions for handling the case where the user eats the message, but never tells us what to do. In such a case we'd need to emit some kind of ICMPv6 message, even if it would be just a timeout generated parameter problem. As i see it, the moment there is a raw socket open for dealing with a particular protocol, whoever opened that socket (handling the protocol) is responsible of generating any error messages associated with the protocol running. Which is the case, the kernel shouldn't need to know whether any of the Mobile IPv6 specific messages have problems. The particular patch i was refering to does partial MIPv6 message processing inside the kernel before handing it to the socket as you only have access to the full received headers there. Such a layer would be needed if we ever put some kernel level components of Mobile IPv4 into the tree, which I see no reason not to, since it has this route optimization as well. Yes, the functionality is needed. My only problem is with exposing MODE_ROUTEOPTIMIZATION, it isn't modular. But it's something i can live with. But route optimization is just one form of packet transform; it just adds a Routing Header type 2 and/or Home Address Option Destination Header to the outgoing packet. Isn't xfrm just the right place for this? You are right that we (HUT and USAGI) have mostly just looked at the xfrm framework from a MIPv6+IPsec perspective, but even this has helped us pinpoint several shortcomings in the current only IPsec specific framework. IMO, this doesn't hinder, but rather helps change xfrm into the generic packet transform framework it was originally envisioned to be. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/5] [IPV6]: Remove ndiscs rt6_lock dependency
David Miller wrote: From: Thomas Graf [EMAIL PROTECTED] Date: Thu, 27 Jul 2006 00:00:01 +0200 (Ab)using rt6_lock wouldn't work anymore if rt6_lock is converted into a per table lock. Signed-off-by: Thomas Graf [EMAIL PROTECTED] This one looks great. Signed-off-by: David S. Miller [EMAIL PROTECTED] Ditto. Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Multiple IPV6 Routing Tables Policy Routing
Thomas Graf wrote: Hello, Hi Thomas! Thought it might be time to go through a round of comments on this work. Even though I've almost rewritten all the code the patches are based on the work found on www.mobile-ipv6.org. I have no idea which code was written by whom so just email me to get the credits right. The policy routing stuff (multiple tables and source address based routing) was almost entirely written by me. Therefore you can apply my name as you see fit ;-) Tushar Gohad at MontaVista, Benjamin Thery at Bull and of course USAGI have also worked on the code. Main differences to the version found on mobile-ipv6.org is that I removed table refcnt and defined that tables cannot disappear once created to simplify things and avoid too many atomic operations when looking up routes. Yes, that sounds good. As the ipv6 module doesn't really seem to become unloadable anytime soon, there isn't really any good reason to refcount the tables. I've replaced the table array with a hash table to prepare it for 255 tables and made things aware of the new default router selection code and experimental route info stuff added recently. Good! I never had the time to merge our changes with 2.6.17. It's not final but somewhat working, I'm eager to see comments or patches. I'll try to comment on them the best I can. I apologize if I've tramped onto anybody's foot by taking this up and submitting it, this isn't meant as an attempt to steal credits but rather to pick up good code and finally get it upstream after a very long while. No offense taken! It's great that someone wants to push these things upstream as I personally have neither had the time nor the energy to do so lately. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/5] [IPV6]: Multiple Routing Tables
Thomas Graf wrote: Adds the framework to support multiple IPv6 routing tables. Currently all automatically generated routes are put into the same table. This could be changed at a later point after considering the produced locking overhead. Hi Thomes, some minor comments below. When locating routes for redirects only the main table is searched for now. Since policy rules will not be reversible it is unclear whether it makes sense to change this. This is a good point. You are absolutely correct about the policy rules. IIRC, I initially looked through all the tables, but skipped this behavior when I rewrote the code for 2.6.11. Currently I'm once again in favor of looping through them all. This is IMO at least closer to the spirit of RFC 2461 section 8.3. where a host SHOULD update its destination cache upon receiving a redirect. If we don't look through all tables, we can't ensure this happens. Index: net-2.6.git/include/net/ip6_fib.h === --- net-2.6.git.orig/include/net/ip6_fib.h +++ net-2.6.git/include/net/ip6_fib.h snip @@ -143,12 +146,41 @@ struct rt6_statistics { typedef void (*f_pnode)(struct fib6_node *fn, void *); -extern struct fib6_node ip6_routing_table; +struct fib6_table { + struct hlist_node tb6_hlist; + u32 tb6_id; + rwlock_ttb6_lock; + struct fib6_nodetb6_root; +}; + +#define RT6_TABLE_UNSPEC RT_TABLE_UNSPEC +#define RT6_TABLE_MAIN RT_TABLE_MAIN +#define RT6_TABLE_LOCAL RT6_TABLE_MAIN +#define RT6_TABLE_DFLT RT6_TABLE_MAIN +#define RT6_TABLE_INFO RT6_TABLE_MAIN IMO it's a bit inconsistent to define a separate table entry for Route Information generated routes, but not Prefix Information based ones. What do you say about adding a RT6_TABLE_PRFX? Index: net-2.6.git/net/ipv6/route.c === --- net-2.6.git.orig/net/ipv6/route.c +++ net-2.6.git/net/ipv6/route.c snip @@ -1435,12 +1523,15 @@ static struct rt6_info *rt6_add_route_in struct rt6_info *rt6_get_dflt_router(struct in6_addr *addr, struct net_device *dev) { struct rt6_info *rt; - struct fib6_node *fn; + struct fib6_table *table; - fn = ip6_routing_table; + /* TODO: It might be better to search all tables */ + table = fib6_get_table(RT6_TABLE_DFLT); As long as the table for default routes is RT6_TABLE_DFLT and can't be configured by the user, I think the correct behavior is just to search RT6_TABLE_DFLT. Otherwise it looks very good! Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RESEND 3/5] [NET]: Protocol Independant Policy Routing Rules Framework
Thomas Graf wrote: Hi Thomas, Derived from net/ipv6/fib_rules.c do you mean net/ipv4/fib_rules.c or net/ipv6/fib6_rules.c? :-) A couple of comments below. Signed-off-by: Thomas Graf [EMAIL PROTECTED] Index: net-2.6.git/include/linux/fib_rules.h === --- /dev/null +++ net-2.6.git/include/linux/fib_rules.h @@ -0,0 +1,60 @@ +#ifndef __LINUX_FIB_RULES_H +#define __LINUX_FIB_RULES_H + +#include linux/types.h +#include linux/rtnetlink.h + +/* rule is permanent, and cannot be deleted */ +#define FIB_RULE_PERMANENT 1 + +struct fib_rule_hdr +{ + __u8family; + __u8dst_len; + __u8src_len; + __u8tos; + + __u8table; + __u8res1; /* reserved */ + __u8res2; /* reserved */ + __u8action; + + __u32 flags; +}; I'm wondering if this is guaranteed to be equvalent to struct rtmsg? struct rtmsg { unsigned char rtm_family; unsigned char rtm_dst_len; unsigned char rtm_src_len; unsigned char rtm_tos; unsigned char rtm_table; /* Routing table id */ unsigned char rtm_protocol; /* Routing protocol; see below */ unsigned char rtm_scope; /* See below */ unsigned char rtm_type; /* See below*/ unsignedrtm_flags; }; Won't we otherwise be breaking the existing userland interface? +enum +{ + FRA_UNSPEC, + FRA_DST,/* destination address */ + FRA_SRC,/* source address */ + FRA_IFNAME, /* interface name */ + FRA_UNUSED1, + FRA_UNUSED2, + FRA_PRIORITY, /* priority/preference */ + FRA_UNUSED3, + FRA_UNUSED4, + FRA_UNUSED5, + FRA_FWMARK, /* netfilter mark (IPv4) */ + FRA_FLOW, /* flow/class id */ + __FRA_MAX +}; + +#define FRA_MAX (__FRA_MAX - 1) + +enum +{ + FR_ACT_UNSPEC, + FR_ACT_TO_TBL, /* Pass to fixed table */ + FR_ACT_RES1, + FR_ACT_RES2, + FR_ACT_RES3, + FR_ACT_RES4, + FR_ACT_BLACKHOLE, /* Drop without notification */ + FR_ACT_UNREACHABLE, /* Drop with ENETUNREACH */ + FR_ACT_PROHIBIT,/* Drop with EACCES */ + __FR_ACT_MAX, +}; + +#define FR_ACT_MAX (__FR_ACT_MAX - 1) + +#endif Shouldn't all these (struct fib_rule_hdr included) actually be defined in include/linux/rtnetlink.h? Otherwise, looks good. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RESEND 4/5] [IPV6]: Policy Routing Rules
Thomas Graf wrote: Adds support for policy routing rules including a new local table for routes with a local destination. Looks good! Signed-off-by: Thomas Graf [EMAIL PROTECTED] Signed-off-by: Ville Nuorvala [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ip6_tunnel keeping dst_cache after change of params
Hugo Santos wrote: Hi, Hi Hugo! ip6_tunnel keeps a cached dst (dst_cache in ip6_tnl) per tunnel instance. This cached dst is re-used while it's not marked obsolete. A change of the tunnel's parameters (via SIOCCHGTUNNEL) does not invalidate the dst_cache directly, which results on it being used by ip6ip6_tnl_xmit after the tunnel is configured with new parameters. Shouldn't ip6ip6_tnl_change dst_release() the cached dst and leave ip6ip6_tnl_xmit to pick a new one based on the new local/remote addresses etc? I can provide a patch to fix this, meanwhile just wanted to confirm the expected behaviour. Yes, you are right. Regards, Ville - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ip6_tunnel: release cached dst on change of tunnel params
Hugo Santos wrote: Hi, The included patch fixes ip6_tunnel to release the cached dst entry when the tunnel parameters (such as tunnel endpoints) are changed so they are used immediatly for the next encapsulated packets. Signed-off-by: Hugo Santos [EMAIL PROTECTED] --- linux-2.6.16-rc4/net/ipv6/ip6_tunnel.c 2006-02-17 22:23:45.0 + +++ linux-2.6.16-rc4-new/net/ipv6/ip6_tunnel.c 2006-02-24 01:40:17.0 + @@ -884,6 +884,7 @@ ip6ip6_tnl_change(struct ip6_tnl *t, str t-parms.encap_limit = p-encap_limit; t-parms.flowinfo = p-flowinfo; t-parms.link = p-link; + ip6_tnl_dst_reset(t); ip6ip6_tnl_link_config(t); return 0; } Acked-by: Ville Nuorvala [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html