On Wed, Jun 10, 2015 at 08:57:55AM -0700, Alexander Duyck wrote:
> On 06/09/2015 11:47 PM, Andy Gospodarek wrote:
> >Add a fib flag called RTNH_F_LINKDOWN to any ipv4 nexthops that are
> >reachable via an interface where carrier is off.  No action is taken,
> >but additional flags are passed to userspace to indicate carrier status.
> >
> >This also includes a cleanup to fib_disable_ip to more clearly indicate
> >what event made the function call to replace the more cryptic force
> >option previously used.
> >
> >v2: Split out kernel functionality into 2 patches, this patch simply sets and
> >clears new nexthop flag RTNH_F_LINKDOWN.
> >
> >Signed-off-by: Andy Gospodarek <go...@cumulusnetworks.com>
> >Signed-off-by: Dinesh Dutt <dd...@cumulusnetworks.com>
> >
> >---
> >  include/net/ip_fib.h           |  4 +--
> >  include/uapi/linux/rtnetlink.h |  1 +
> >  net/ipv4/fib_frontend.c        | 26 +++++++++++---------
> >  net/ipv4/fib_semantics.c       | 56 
> > ++++++++++++++++++++++++++++++++----------
> >  4 files changed, 60 insertions(+), 27 deletions(-)
> >
> >diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
> >index 54271ed..d1de1b7 100644
> >--- a/include/net/ip_fib.h
> >+++ b/include/net/ip_fib.h
> >@@ -305,9 +305,9 @@ void fib_flush_external(struct net *net);
> >
> >  /* Exported by fib_semantics.c */
> >  int ip_fib_check_default(__be32 gw, struct net_device *dev);
> >-int fib_sync_down_dev(struct net_device *dev, int force);
> >+int fib_sync_down_dev(struct net_device *dev, int event);
> >  int fib_sync_down_addr(struct net *net, __be32 local);
> >-int fib_sync_up(struct net_device *dev);
> >+int fib_sync_up(struct net_device *dev, unsigned int nh_flags);
> >  void fib_select_multipath(struct fib_result *res);
> >
> >  /* Exported by fib_trie.c */
> >diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
> >index 17fb02f..8dde432 100644
> >--- a/include/uapi/linux/rtnetlink.h
> >+++ b/include/uapi/linux/rtnetlink.h
> >@@ -338,6 +338,7 @@ struct rtnexthop {
> >  #define RTNH_F_PERVASIVE   2       /* Do recursive gateway lookup  */
> >  #define RTNH_F_ONLINK              4       /* Gateway is forced on link    
> > */
> >  #define RTNH_F_OFFLOAD             8       /* offloaded route */
> >+#define RTNH_F_LINKDOWN             16      /* carrier-down on nexthop */
> 
> So you could probably use some sort of define here to identify which flags
> are event based and which are configuration based.  Then it makes it easier
> to take care of code below such as the nh_comp call.
So are you saying something at the top to that would reserve a few bits
for whether the kernel can set it, userspace can set it, or both could
set it?  Seems like overkill to me and a waste of bits -- though maybe
there will not be that many nexthop flags. :)

> 
> >  /* Macros to handle hexthops */
> >
> >diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> >index 872494e..1e4c646 100644
> >--- a/net/ipv4/fib_frontend.c
> >+++ b/net/ipv4/fib_frontend.c
> >@@ -1063,9 +1063,9 @@ static void nl_fib_lookup_exit(struct net *net)
> >     net->ipv4.fibnl = NULL;
> >  }
> >
> >-static void fib_disable_ip(struct net_device *dev, int force)
> >+static void fib_disable_ip(struct net_device *dev, int event)
> 
> Event should be an unsigned long to match fib_inetaddr_event and avoid any
> unnecessary casts or warnings.
Fixed in upcoming v3

> 
> >  {
> >-    if (fib_sync_down_dev(dev, force))
> >+    if (fib_sync_down_dev(dev, event))
> >             fib_flush(dev_net(dev));
> >     rt_cache_flush(dev_net(dev));
> >     arp_ifdown(dev);
> >@@ -1080,9 +1080,7 @@ static int fib_inetaddr_event(struct notifier_block 
> >*this, unsigned long event,
> >     switch (event) {
> >     case NETDEV_UP:
> >             fib_add_ifaddr(ifa);
> >-#ifdef CONFIG_IP_ROUTE_MULTIPATH
> >-            fib_sync_up(dev);
> >-#endif
> >+            fib_sync_up(dev, RTNH_F_DEAD);
> >             atomic_inc(&net->ipv4.dev_addr_genid);
> >             rt_cache_flush(dev_net(dev));
> >             break;
> 
> Shouldn't this bit be left wrapped in CONFIG_IP_ROUTE_MULTIPATH?  I thought
> RTNH_F_DEAD was only used in that case.
I can double-check this one and the one referenced below in
fib_netdev_event, but I really struggle to understand why one would not
want to be sure that when IFF_UP is set the DEAD flags were definitely
going to be cleared before continuing?

> 
> >@@ -1093,7 +1091,7 @@ static int fib_inetaddr_event(struct notifier_block 
> >*this, unsigned long event,
> >                     /* Last address was deleted from this interface.
> >                      * Disable IP.
> >                      */
> >-                    fib_disable_ip(dev, 1);
> >+                    fib_disable_ip(dev, event);
> >             } else {
> >                     rt_cache_flush(dev_net(dev));
> >             }
> 
> Aren't you losing information here?  The line above this change is a call to
> see if ifa_list is NULL.  I don't see how that data is being communicated
> down to fib_disable_ip.  It seems like you could end up with the wrong
> scope.
Fixed in fib_sync_down_dev in upcoming v3.

[...]
> >diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> >index 28ec3c1..776e029 100644
> >--- a/net/ipv4/fib_semantics.c
> >+++ b/net/ipv4/fib_semantics.c
> >@@ -266,7 +266,7 @@ static inline int nh_comp(const struct fib_info *fi, 
> >const struct fib_info *ofi)
> >  #ifdef CONFIG_IP_ROUTE_CLASSID
> >                 nh->nh_tclassid != onh->nh_tclassid ||
> >  #endif
> >-                ((nh->nh_flags ^ onh->nh_flags) & ~RTNH_F_DEAD))
> >+                ((nh->nh_flags ^ onh->nh_flags) & 
> >~(RTNH_F_DEAD|RTNH_F_LINKDOWN)))
> >                     return -1;
> >             onh++;
> >     } endfor_nexthops(fi);
> >@@ -318,7 +318,7 @@ static struct fib_info *fib_find_info(const struct 
> >fib_info *nfi)
> >                 nfi->fib_type == fi->fib_type &&
> >                 memcmp(nfi->fib_metrics, fi->fib_metrics,
> >                        sizeof(u32) * RTAX_MAX) == 0 &&
> >-                ((nfi->fib_flags ^ fi->fib_flags) & ~RTNH_F_DEAD) == 0 &&
> >+                ((nfi->fib_flags ^ fi->fib_flags) & 
> >~(RTNH_F_DEAD|RTNH_F_LINKDOWN)) == 0 &&
> >                 (nfi->fib_nhs == 0 || nh_comp(fi, nfi) == 0))
> >                     return fi;
> >     }
> 
> Merging the two flags into some sort of define would probably help the
> readability here.
I can create something like RTNH_F_COMP_MASK for upcoming v3.

[...]
> 
> >@@ -604,6 +604,8 @@ static int fib_check_nh(struct fib_config *cfg, struct 
> >fib_info *fi,
> >                             return -ENODEV;
> >                     if (!(dev->flags & IFF_UP))
> >                             return -ENETDOWN;
> >+                    if (!netif_carrier_ok(dev))
> >+                            nh->nh_flags |= RTNH_F_LINKDOWN;
> >                     nh->nh_dev = dev;
> >                     dev_hold(dev);
> >                     nh->nh_scope = RT_SCOPE_LINK;
> >@@ -636,6 +638,8 @@ static int fib_check_nh(struct fib_config *cfg, struct 
> >fib_info *fi,
> >             if (!dev)
> >                     goto out;
> >             dev_hold(dev);
> >+            if (!netif_carrier_ok(dev))
> >+                    nh->nh_flags |= RTNH_F_LINKDOWN;
> >             err = (dev->flags & IFF_UP) ? 0 : -ENETDOWN;
> >     } else {
> >             struct in_device *in_dev;
> >@@ -654,6 +658,8 @@ static int fib_check_nh(struct fib_config *cfg, struct 
> >fib_info *fi,
> >             nh->nh_dev = in_dev->dev;
> >             dev_hold(nh->nh_dev);
> >             nh->nh_scope = RT_SCOPE_HOST;
> >+            if (!netif_carrier_ok(nh->nh_dev))
> >+                    nh->nh_flags |= RTNH_F_LINKDOWN;
> >             err = 0;
> >     }
> >  out:
> >@@ -920,11 +926,17 @@ struct fib_info *fib_create_info(struct fib_config 
> >*cfg)
> >             if (!nh->nh_dev)
> >                     goto failure;
> >     } else {
> >+            int linkdown = 0;
> >             change_nexthops(fi) {
> >                     err = fib_check_nh(cfg, fi, nexthop_nh);
> >                     if (err != 0)
> >                             goto failure;
> >+                    if (nexthop_nh->nh_flags & RTNH_F_LINKDOWN)
> >+                            linkdown++;
> >             } endfor_nexthops(fi)
> >+            if (linkdown == fi->fib_nhs) {
> >+                    fi->fib_flags |= RTNH_F_LINKDOWN;
> >+            }
> >     }
> >
> >     if (fi->fib_prefsrc) {
> >@@ -1103,7 +1115,7 @@ int fib_sync_down_addr(struct net *net, __be32 local)
> >     return ret;
> >  }
> >
> >-int fib_sync_down_dev(struct net_device *dev, int force)
> >+int fib_sync_down_dev(struct net_device *dev, int event)
> 
> I believe event should be unsigned long to match the original argument from
> fib_inetaddr_event.
Agreed, in upcoming v3.

> >  {
> >     int ret = 0;
> >     int scope = RT_SCOPE_NOWHERE;
> >@@ -1112,7 +1124,7 @@ int fib_sync_down_dev(struct net_device *dev, int 
> >force)
> >     struct hlist_head *head = &fib_info_devhash[hash];
> >     struct fib_nh *nh;
> >
> >-    if (force)
> >+    if (event == NETDEV_UNREGISTER)
> >             scope = -1;
> >
> 
> So I believe there is still a gap here in relation to fib_inetaddr_event.
> Specifically in the case of that function it is supposed to set the force
> value to 1 which would trigger this bit of code, but that isn't occurring
> with your change.
Agreed.  As mentioned above, I fixed this in my tree and it will be in
upcoming v3.

Thanks for the review, Alex!

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to