[PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-18 Thread Steve Wise

This patch adds netevent and netlink calls for neighbour change, route
add/del, pmtu change, and routing redirect events.

Netlink Details:

Neighbour change events are broadcast as a new ndmsg type RTM_NEIGHUPD.

Path mtu change events are broadcast as a new rtmsg type RTM_ROUTEUPD.

Routing redirect events are broadcast as a pair of rtmsgs, RTM_DELROUTE
and RTM_NEWROUTE.
---

 include/linux/rtnetlink.h |4 ++
 net/core/Makefile |2 +
 net/core/neighbour.c  |   37 ---
 net/ipv4/fib_semantics.c  |9 +
 net/ipv4/route.c  |   86 ++--
 net/ipv6/route.c  |   87 +
 6 files changed, 213 insertions(+), 12 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index facd9ee..340ca4f 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -35,6 +35,8 @@ #define RTM_NEWROUTE  RTM_NEWROUTE
 #define RTM_DELROUTE   RTM_DELROUTE
RTM_GETROUTE,
 #define RTM_GETROUTE   RTM_GETROUTE
+   RTM_ROUTEUPD,
+#define RTM_ROUTEUPD   RTM_ROUTEUPD
 
RTM_NEWNEIGH= 28,
 #define RTM_NEWNEIGH   RTM_NEWNEIGH
@@ -42,6 +44,8 @@ #define RTM_NEWNEIGH  RTM_NEWNEIGH
 #define RTM_DELNEIGH   RTM_DELNEIGH
RTM_GETNEIGH,
 #define RTM_GETNEIGH   RTM_GETNEIGH
+   RTM_NEIGHUPD,
+#define RTM_NEIGHUPD   RTM_NEIGHUPD
 
RTM_NEWRULE = 32,
 #define RTM_NEWRULERTM_NEWRULE
diff --git a/net/core/Makefile b/net/core/Makefile
index e9bd246..2645ba4 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -7,7 +7,7 @@ obj-y := sock.o request_sock.o skbuff.o 
 
 obj-$(CONFIG_SYSCTL) += sysctl_net_core.o
 
-obj-y   += dev.o ethtool.o dev_mcast.o dst.o \
+obj-y   += dev.o ethtool.o dev_mcast.o dst.o netevent.o \
neighbour.o rtnetlink.o utils.o link_watch.o filter.o
 
 obj-$(CONFIG_XFRM) += flow.o
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 7ad681f..11c7643 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -29,9 +29,11 @@ #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 
 #define NEIGH_DEBUG 1
 
@@ -58,6 +60,7 @@ static void neigh_app_notify(struct neig
 #endif
 static int pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev);
 void neigh_changeaddr(struct neigh_table *tbl, struct net_device *dev);
+static void rtm_neigh_change(struct neighbour *n);
 
 static struct neigh_table *neigh_tables;
 #ifdef CONFIG_PROC_FS
@@ -754,6 +757,7 @@ #endif
neigh->nud_state = NUD_STALE;
neigh->updated = jiffies;
neigh_suspect(neigh);
+   notify = 1;
}
} else if (state & NUD_DELAY) {
if (time_before_eq(now, 
@@ -762,6 +766,7 @@ #endif
neigh->nud_state = NUD_REACHABLE;
neigh->updated = jiffies;
neigh_connect(neigh);
+   notify = 1;
next = neigh->confirmed + neigh->parms->reachable_time;
} else {
NEIGH_PRINTK2("neigh %p is probed.\n", neigh);
@@ -819,6 +824,8 @@ #endif
 out:
write_unlock(&neigh->lock);
}
+   if (notify)
+   rtm_neigh_change(neigh);
 
 #ifdef CONFIG_ARPD
if (notify && neigh->parms->app_probes)
@@ -926,9 +933,7 @@ int neigh_update(struct neighbour *neigh
 {
u8 old;
int err;
-#ifdef CONFIG_ARPD
int notify = 0;
-#endif
struct net_device *dev;
int update_isrouter = 0;
 
@@ -948,9 +953,7 @@ #endif
neigh_suspect(neigh);
neigh->nud_state = new;
err = 0;
-#ifdef CONFIG_ARPD
notify = old & NUD_VALID;
-#endif
goto out;
}
 
@@ -1022,9 +1025,7 @@ #endif
if (!(new & NUD_CONNECTED))
neigh->confirmed = jiffies -
  (neigh->parms->base_reachable_time << 1);
-#ifdef CONFIG_ARPD
notify = 1;
-#endif
}
if (new == old)
goto out;
@@ -1055,7 +1056,11 @@ out:
(neigh->flags | NTF_ROUTER) :
(neigh->flags & ~NTF_ROUTER);
}
+
write_unlock_bh(&neigh->lock);
+
+   if (notify)
+   rtm_neigh_change(neigh);
 #ifdef CONFIG_ARPD
if (notify && neigh->parms->app_probes)
neigh_app_notify(neigh);
@@ -2369,9 +2374,27 @@ static void neigh_app_notify(struct neig
NETLINK_CB(skb).dst_group  = RTNLGRP_NEIGH;
netlink_broadcast(rtnl, skb, 0, RTNLGRP_NEIGH, GFP_ATOMIC);
 }
-
 #endif /* CONFIG_ARPD */
 
+static void rtm_neigh_change(struct neighbour *n)
+{
+   struct nlmsghdr *nlh;
+   int size = NLMSG_SPACE(sizeof(struct ndmsg) + 256);
+ 

Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-25 Thread Herbert Xu
Steve Wise <[EMAIL PROTECTED]> wrote:
> 
> Routing redirect events are broadcast as a pair of rtmsgs, RTM_DELROUTE
> and RTM_NEWROUTE.

This may confuse existing rtnetlink users since you're generating an
RTM_DELROUTE message that's identical to one triggered by something
like 'ip route del'.

As you're introducing a completely new RTM_ROUTEUPD type, it might
be better to attach any information from the existing route that you
need to the ROUTEUPD message.

Actually, what was the reason you need the existing route here?

> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index 5f87533..33d8a83 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -44,6 +44,7 @@ #include 
> #include 
> #include 
> #include 
> +#include 
> 
> #include "fib_lookup.h"
> 
> @@ -279,6 +280,14 @@ void rtmsg_fib(int event, u32 key, struc
>struct sk_buff *skb;
>u32 pid = req ? req->pid : n->nlmsg_pid;
>int size = NLMSG_SPACE(sizeof(struct rtmsg)+256);
> +   struct netevent_route_info nri;
> +   int netevent;
> +
> +   nri.family = AF_INET;
> +   nri.data = &fa->fa_info;
> +   netevent = event == RTM_NEWROUTE ? NETEVENT_ROUTE_ADD 
> +: NETEVENT_ROUTE_DEL;
> +   call_netevent_notifiers(netevent, &nri);

Hmm, this is broken.  These route events are meaningless without the
corresponding IP rule events.  Are you sure you really want to make
your hardware/driver grok multiple routing tables?

Perhaps you should simply stick to dst entries and flush all your
tables when the routes are changed.  This is what the Linux IP stack
does.

> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 2dc6dbb..18879e6 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -1117,6 +1120,52 @@ static void rt_del(unsigned hash, struct
>spin_unlock_bh(rt_hash_lock_addr(hash));
> }
> 
> +static void rtm_redirect(struct rtable *old, struct rtable *new)
> +{
> +   struct netevent_redirect netevent;
> +   struct sk_buff *skb;
> +   int err;
> +
> +   netevent.old = &old->u.dst;
> +   netevent.new = &new->u.dst;
> +
> +   /* notify netevent subscribers */
> +   call_netevent_notifiers(NETEVENT_REDIRECT, &netevent);
> +
> +   /* Post NETLINK messages:  RTM_DELROUTE for old route, 
> +  RTM_NEWROUTE for new route */
> +   skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC);

Please use a better size estimate rather than NLMSG_GOODSIZE here since
you're doing GFP_ATOMIC.

> @@ -1442,6 +1493,32 @@ unsigned short ip_rt_frag_needed(struct 
>return est_mtu ? : new_mtu;
> }
> 
> +static void rtm_pmtu_update(struct rtable *rt)
> +{
> +   struct sk_buff *skb;
> +   int err;
> +
> +   call_netevent_notifiers(NETEVENT_PMTU_UPDATE, &rt->u.dst);
> +
> +   skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC);

Ditto.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-25 Thread Steve Wise
On Tue, 2006-07-25 at 17:39 +1000, Herbert Xu wrote:
> Steve Wise <[EMAIL PROTECTED]> wrote:
> > 
> > Routing redirect events are broadcast as a pair of rtmsgs, RTM_DELROUTE
> > and RTM_NEWROUTE.
> 
> This may confuse existing rtnetlink users since you're generating an
> RTM_DELROUTE message that's identical to one triggered by something
> like 'ip route del'.
> 

Yea, I didn't really want to create a REDIRECT rtmsg, so I punted. :-)  

But they really are seeing a delete followed by an add.  That's what the
kernel is doing.

> As you're introducing a completely new RTM_ROUTEUPD type, it might
> be better to attach any information from the existing route that you
> need to the ROUTEUPD message.

Yea, the main change is the next hop ip address or gateway field.  

> 
> Actually, what was the reason you need the existing route here?
> 

The rdma driver needs to update all established rdma connections that
are using the next-hop information of the existing route and make them
use the next-hop information of the new route.  In addition, the rdma
driver might have a reference to the old dst entry.  So it can release
that ref and add a ref to the new dst entry.

> > diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> > index 5f87533..33d8a83 100644
> > --- a/net/ipv4/fib_semantics.c
> > +++ b/net/ipv4/fib_semantics.c
> > @@ -44,6 +44,7 @@ #include 
> > #include 
> > #include 
> > #include 
> > +#include 
> > 
> > #include "fib_lookup.h"
> > 
> > @@ -279,6 +280,14 @@ void rtmsg_fib(int event, u32 key, struc
> >struct sk_buff *skb;
> >u32 pid = req ? req->pid : n->nlmsg_pid;
> >int size = NLMSG_SPACE(sizeof(struct rtmsg)+256);
> > +   struct netevent_route_info nri;
> > +   int netevent;
> > +
> > +   nri.family = AF_INET;
> > +   nri.data = &fa->fa_info;
> > +   netevent = event == RTM_NEWROUTE ? NETEVENT_ROUTE_ADD 
> > +: NETEVENT_ROUTE_DEL;
> > +   call_netevent_notifiers(netevent, &nri);
> 
> Hmm, this is broken.  These route events are meaningless without the
> corresponding IP rule events.  Are you sure you really want to make
> your hardware/driver grok multiple routing tables?
> 
> Perhaps you should simply stick to dst entries and flush all your
> tables when the routes are changed.  This is what the Linux IP stack
> does.
> 

I have to admit I'm a little fuzzy on the routing stuff.  The main
netevents I've utilized in the the rdma driver I'm writing is the
neighbour update event and the redirect event.  Route add/del was added
for completeness of "routing" netevents.   

Can you expand further or point me to code where the IP stack "flushes
its tables" when routes are changed?

>From my experience, all the rdma driver needs is the dst entry.   It
using the routing table to determine the dst_entry at connection
establish time.  And it needs to know if the next-hop or PMTU ever
changes.



> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index 2dc6dbb..18879e6 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -1117,6 +1120,52 @@ static void rt_del(unsigned hash, struct
> >spin_unlock_bh(rt_hash_lock_addr(hash));
> > }
> > 
> > +static void rtm_redirect(struct rtable *old, struct rtable *new)
> > +{
> > +   struct netevent_redirect netevent;
> > +   struct sk_buff *skb;
> > +   int err;
> > +
> > +   netevent.old = &old->u.dst;
> > +   netevent.new = &new->u.dst;
> > +
> > +   /* notify netevent subscribers */
> > +   call_netevent_notifiers(NETEVENT_REDIRECT, &netevent);
> > +
> > +   /* Post NETLINK messages:  RTM_DELROUTE for old route, 
> > +  RTM_NEWROUTE for new route */
> > +   skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC);
> 
> Please use a better size estimate rather than NLMSG_GOODSIZE here since
> you're doing GFP_ATOMIC.
> 

ok

> > @@ -1442,6 +1493,32 @@ unsigned short ip_rt_frag_needed(struct 
> >return est_mtu ? : new_mtu;
> > }
> > 
> > +static void rtm_pmtu_update(struct rtable *rt)
> > +{
> > +   struct sk_buff *skb;
> > +   int err;
> > +
> > +   call_netevent_notifiers(NETEVENT_PMTU_UPDATE, &rt->u.dst);
> > +
> > +   skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC);
> 
> Ditto.
> 

ok


Thanks,

Steve.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-25 Thread Herbert Xu
On Tue, Jul 25, 2006 at 10:05:40AM -0500, Steve Wise wrote:
> 
> But they really are seeing a delete followed by an add.  That's what the
> kernel is doing.

Actually that's the other thing I don't really like.  The user-space
monitor may perceive that a route was actually deleted and replaced
by a new one even though this isn't what's happening at all.

In fact the problem here is that you're sending route notifications
when it's really the dst_entry that's changing.  User-space as it
stands only get notifications about fib changes which is quite different
from changes to the transient dst_entry objects which only exist in the
route cache.

Is anyone actually going to use the user-space interface of this? If not
perhaps we should wait until someone really needs it before adding the
netlink part of the patch.

We can change the kernel interface at will so if we make a mistake with
netevent it can be easily corrected.  For user-space though the rules
are totally different.  I'd really hate to be stuck with an interface
which turns out to not be the one that people actually want to have.

> The rdma driver needs to update all established rdma connections that
> are using the next-hop information of the existing route and make them
> use the next-hop information of the new route.  In addition, the rdma
> driver might have a reference to the old dst entry.  So it can release
> that ref and add a ref to the new dst entry.

Do you really need the old route for the user-space part of your patch?

> I have to admit I'm a little fuzzy on the routing stuff.  The main
> netevents I've utilized in the the rdma driver I'm writing is the
> neighbour update event and the redirect event.  Route add/del was added
> for completeness of "routing" netevents.   

So you mean you aren't going to use the route notifications? In that case
we should probably just drop them and add them when someone actually needs
it.  At that point they can tell us what semantics they want from it :)

> Can you expand further or point me to code where the IP stack "flushes
> its tables" when routes are changed?

Grep for rt_cache_flush in net/ipv4/fib_hash.c.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-26 Thread Steve Wise
On Wed, 2006-07-26 at 13:39 +1000, Herbert Xu wrote:
> On Tue, Jul 25, 2006 at 10:05:40AM -0500, Steve Wise wrote:
> > 
> > But they really are seeing a delete followed by an add.  That's what the
> > kernel is doing.
> 
> Actually that's the other thing I don't really like.  The user-space
> monitor may perceive that a route was actually deleted and replaced
> by a new one even though this isn't what's happening at all.
> 
> In fact the problem here is that you're sending route notifications
> when it's really the dst_entry that's changing.  User-space as it
> stands only get notifications about fib changes which is quite different
> from changes to the transient dst_entry objects which only exist in the
> route cache.
> 
> Is anyone actually going to use the user-space interface of this? If not
> perhaps we should wait until someone really needs it before adding the
> netlink part of the patch.
> 
> We can change the kernel interface at will so if we make a mistake with
> netevent it can be easily corrected.  For user-space though the rules
> are totally different.  I'd really hate to be stuck with an interface
> which turns out to not be the one that people actually want to have.
> 

The user interface is not needed for the rdma users.  They are all in
kernel.  I added this at the request of reviewers of this patch.  I have
no problem at all defering the rtnetlink integration until someone
really needs it.

> > The rdma driver needs to update all established rdma connections that
> > are using the next-hop information of the existing route and make them
> > use the next-hop information of the new route.  In addition, the rdma
> > driver might have a reference to the old dst entry.  So it can release
> > that ref and add a ref to the new dst entry.
> 
> Do you really need the old route for the user-space part of your patch?
> 

Not if we remove the user-space parts. :-)

> > I have to admit I'm a little fuzzy on the routing stuff.  The main
> > netevents I've utilized in the the rdma driver I'm writing is the
> > neighbour update event and the redirect event.  Route add/del was added
> > for completeness of "routing" netevents.   
> 
> So you mean you aren't going to use the route notifications? In that case
> we should probably just drop them and add them when someone actually needs
> it.  At that point they can tell us what semantics they want from it :)
> 

This is fine by me too!  The key events needed for rdma are:

neighbour update events
rtredirect events
pmtu change events

> > Can you expand further or point me to code where the IP stack "flushes
> > its tables" when routes are changed?
> 
> Grep for rt_cache_flush in net/ipv4/fib_hash.c.
> 

thanks.

Dave, what do you think about removing the user-space stuff for the
first round of integration?  IE:  Just add netevents and kernel hooks to
generate them.


Steve.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-26 Thread David Miller
From: Steve Wise <[EMAIL PROTECTED]>
Date: Wed, 26 Jul 2006 11:15:43 -0500

> Dave, what do you think about removing the user-space stuff for the
> first round of integration?  IE:  Just add netevents and kernel hooks to
> generate them.

Sure.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html