Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-26 Thread Steve Wise
On Wed, 2006-07-26 at 13:39 +1000, Herbert Xu wrote:
 On Tue, Jul 25, 2006 at 10:05:40AM -0500, Steve Wise wrote:
  
  But they really are seeing a delete followed by an add.  That's what the
  kernel is doing.
 
 Actually that's the other thing I don't really like.  The user-space
 monitor may perceive that a route was actually deleted and replaced
 by a new one even though this isn't what's happening at all.
 
 In fact the problem here is that you're sending route notifications
 when it's really the dst_entry that's changing.  User-space as it
 stands only get notifications about fib changes which is quite different
 from changes to the transient dst_entry objects which only exist in the
 route cache.
 
 Is anyone actually going to use the user-space interface of this? If not
 perhaps we should wait until someone really needs it before adding the
 netlink part of the patch.
 
 We can change the kernel interface at will so if we make a mistake with
 netevent it can be easily corrected.  For user-space though the rules
 are totally different.  I'd really hate to be stuck with an interface
 which turns out to not be the one that people actually want to have.
 

The user interface is not needed for the rdma users.  They are all in
kernel.  I added this at the request of reviewers of this patch.  I have
no problem at all defering the rtnetlink integration until someone
really needs it.

  The rdma driver needs to update all established rdma connections that
  are using the next-hop information of the existing route and make them
  use the next-hop information of the new route.  In addition, the rdma
  driver might have a reference to the old dst entry.  So it can release
  that ref and add a ref to the new dst entry.
 
 Do you really need the old route for the user-space part of your patch?
 

Not if we remove the user-space parts. :-)

  I have to admit I'm a little fuzzy on the routing stuff.  The main
  netevents I've utilized in the the rdma driver I'm writing is the
  neighbour update event and the redirect event.  Route add/del was added
  for completeness of routing netevents.   
 
 So you mean you aren't going to use the route notifications? In that case
 we should probably just drop them and add them when someone actually needs
 it.  At that point they can tell us what semantics they want from it :)
 

This is fine by me too!  The key events needed for rdma are:

neighbour update events
rtredirect events
pmtu change events

  Can you expand further or point me to code where the IP stack flushes
  its tables when routes are changed?
 
 Grep for rt_cache_flush in net/ipv4/fib_hash.c.
 

thanks.

Dave, what do you think about removing the user-space stuff for the
first round of integration?  IE:  Just add netevents and kernel hooks to
generate them.


Steve.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-26 Thread David Miller
From: Steve Wise [EMAIL PROTECTED]
Date: Wed, 26 Jul 2006 11:15:43 -0500

 Dave, what do you think about removing the user-space stuff for the
 first round of integration?  IE:  Just add netevents and kernel hooks to
 generate them.

Sure.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-25 Thread Steve Wise
On Tue, 2006-07-25 at 17:39 +1000, Herbert Xu wrote:
 Steve Wise [EMAIL PROTECTED] wrote:
  
  Routing redirect events are broadcast as a pair of rtmsgs, RTM_DELROUTE
  and RTM_NEWROUTE.
 
 This may confuse existing rtnetlink users since you're generating an
 RTM_DELROUTE message that's identical to one triggered by something
 like 'ip route del'.
 

Yea, I didn't really want to create a REDIRECT rtmsg, so I punted. :-)  

But they really are seeing a delete followed by an add.  That's what the
kernel is doing.

 As you're introducing a completely new RTM_ROUTEUPD type, it might
 be better to attach any information from the existing route that you
 need to the ROUTEUPD message.

Yea, the main change is the next hop ip address or gateway field.  

 
 Actually, what was the reason you need the existing route here?
 

The rdma driver needs to update all established rdma connections that
are using the next-hop information of the existing route and make them
use the next-hop information of the new route.  In addition, the rdma
driver might have a reference to the old dst entry.  So it can release
that ref and add a ref to the new dst entry.

  diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
  index 5f87533..33d8a83 100644
  --- a/net/ipv4/fib_semantics.c
  +++ b/net/ipv4/fib_semantics.c
  @@ -44,6 +44,7 @@ #include net/tcp.h
  #include net/sock.h
  #include net/ip_fib.h
  #include net/ip_mp_alg.h
  +#include net/netevent.h
  
  #include fib_lookup.h
  
  @@ -279,6 +280,14 @@ void rtmsg_fib(int event, u32 key, struc
 struct sk_buff *skb;
 u32 pid = req ? req-pid : n-nlmsg_pid;
 int size = NLMSG_SPACE(sizeof(struct rtmsg)+256);
  +   struct netevent_route_info nri;
  +   int netevent;
  +
  +   nri.family = AF_INET;
  +   nri.data = fa-fa_info;
  +   netevent = event == RTM_NEWROUTE ? NETEVENT_ROUTE_ADD 
  +: NETEVENT_ROUTE_DEL;
  +   call_netevent_notifiers(netevent, nri);
 
 Hmm, this is broken.  These route events are meaningless without the
 corresponding IP rule events.  Are you sure you really want to make
 your hardware/driver grok multiple routing tables?
 
 Perhaps you should simply stick to dst entries and flush all your
 tables when the routes are changed.  This is what the Linux IP stack
 does.
 

I have to admit I'm a little fuzzy on the routing stuff.  The main
netevents I've utilized in the the rdma driver I'm writing is the
neighbour update event and the redirect event.  Route add/del was added
for completeness of routing netevents.   

Can you expand further or point me to code where the IP stack flushes
its tables when routes are changed?

From my experience, all the rdma driver needs is the dst entry.   It
using the routing table to determine the dst_entry at connection
establish time.  And it needs to know if the next-hop or PMTU ever
changes.



  diff --git a/net/ipv4/route.c b/net/ipv4/route.c
  index 2dc6dbb..18879e6 100644
  --- a/net/ipv4/route.c
  +++ b/net/ipv4/route.c
  @@ -1117,6 +1120,52 @@ static void rt_del(unsigned hash, struct
 spin_unlock_bh(rt_hash_lock_addr(hash));
  }
  
  +static void rtm_redirect(struct rtable *old, struct rtable *new)
  +{
  +   struct netevent_redirect netevent;
  +   struct sk_buff *skb;
  +   int err;
  +
  +   netevent.old = old-u.dst;
  +   netevent.new = new-u.dst;
  +
  +   /* notify netevent subscribers */
  +   call_netevent_notifiers(NETEVENT_REDIRECT, netevent);
  +
  +   /* Post NETLINK messages:  RTM_DELROUTE for old route, 
  +  RTM_NEWROUTE for new route */
  +   skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC);
 
 Please use a better size estimate rather than NLMSG_GOODSIZE here since
 you're doing GFP_ATOMIC.
 

ok

  @@ -1442,6 +1493,32 @@ unsigned short ip_rt_frag_needed(struct 
 return est_mtu ? : new_mtu;
  }
  
  +static void rtm_pmtu_update(struct rtable *rt)
  +{
  +   struct sk_buff *skb;
  +   int err;
  +
  +   call_netevent_notifiers(NETEVENT_PMTU_UPDATE, rt-u.dst);
  +
  +   skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC);
 
 Ditto.
 

ok


Thanks,

Steve.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH Round 4 2/3] Core network changes to support network event notification.

2006-07-25 Thread Herbert Xu
On Tue, Jul 25, 2006 at 10:05:40AM -0500, Steve Wise wrote:
 
 But they really are seeing a delete followed by an add.  That's what the
 kernel is doing.

Actually that's the other thing I don't really like.  The user-space
monitor may perceive that a route was actually deleted and replaced
by a new one even though this isn't what's happening at all.

In fact the problem here is that you're sending route notifications
when it's really the dst_entry that's changing.  User-space as it
stands only get notifications about fib changes which is quite different
from changes to the transient dst_entry objects which only exist in the
route cache.

Is anyone actually going to use the user-space interface of this? If not
perhaps we should wait until someone really needs it before adding the
netlink part of the patch.

We can change the kernel interface at will so if we make a mistake with
netevent it can be easily corrected.  For user-space though the rules
are totally different.  I'd really hate to be stuck with an interface
which turns out to not be the one that people actually want to have.

 The rdma driver needs to update all established rdma connections that
 are using the next-hop information of the existing route and make them
 use the next-hop information of the new route.  In addition, the rdma
 driver might have a reference to the old dst entry.  So it can release
 that ref and add a ref to the new dst entry.

Do you really need the old route for the user-space part of your patch?

 I have to admit I'm a little fuzzy on the routing stuff.  The main
 netevents I've utilized in the the rdma driver I'm writing is the
 neighbour update event and the redirect event.  Route add/del was added
 for completeness of routing netevents.   

So you mean you aren't going to use the route notifications? In that case
we should probably just drop them and add them when someone actually needs
it.  At that point they can tell us what semantics they want from it :)

 Can you expand further or point me to code where the IP stack flushes
 its tables when routes are changed?

Grep for rt_cache_flush in net/ipv4/fib_hash.c.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html