Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-27 Thread jamal
On Mon, 2006-26-06 at 09:34 -0500, Steve Wise wrote:
> On Sat, 2006-06-24 at 10:30 -0400, jamal wrote:

> The route/hh cache insertions might work for the initial dst MAC per
> next-hop IP.  But this dst MAC can _change_ for various reasons (even
> though the next-hop IP remains the same).   Such a change, I think,
> doesn't generate a new route + hh cache insertion, just a change to the
> hh entry.
> 
> Also, I think the route cache entry is created _before_ the MAC addr is
> known. So we really need to know when the neighbour entry is updated
> with the MAC address as a result of ARP/ND.  Hooking the correct spot in
> the neighbour code where the mac address gets stored also gets us the
> change event I described above.
> 
> Does this make sense?
> 

no - but as long as you solve the problem it should be fine. 
[My goal was to help -factoring in some experiences- it seems you have
it under control though].

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-26 Thread Steve Wise
On Sat, 2006-06-24 at 10:30 -0400, jamal wrote:
> On Fri, 2006-23-06 at 08:24 -0500, Steve Wise wrote:
> 
> > 
> > > PS:- I do think what they need is to hear route cache generation
> > > as opposed to ARP+FIB updates; but lets wait and see how clever 
> > > the patches would look.
> > > 
> 
> > Can you expand on your statement above?  If hooking route cache
> > generation gets all the events I described, then I'd like to use that.
> > I'm still learning the Linux routing subsystem.  Any help would be
> > GREAT!
> > 
> 
> If my understanding is correct, what you are trying to do is:
> for a destination IP you are going to figure the source and destination
> MAC address. Most of that info is available at the route + hh cache. 
> There can be only one destination mac per device and so you only need to
> watch the device changes for that. The dst MAC per IP and any changes
> you can glean from the route cache created.
> But this is based on my understanding of what you are trying to do and
> so far i cant say i am 100% clear.

The route/hh cache insertions might work for the initial dst MAC per
next-hop IP.  But this dst MAC can _change_ for various reasons (even
though the next-hop IP remains the same).   Such a change, I think,
doesn't generate a new route + hh cache insertion, just a change to the
hh entry.

Also, I think the route cache entry is created _before_ the MAC addr is
known. So we really need to know when the neighbour entry is updated
with the MAC address as a result of ARP/ND.  Hooking the correct spot in
the neighbour code where the mac address gets stored also gets us the
change event I described above.
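
A minimal sketch of the point above, written as a userspace Python model rather than kernel code. It assumes a neighbour entry that exists before its MAC is known and reports an event only when the stored address actually changes. The class `Neighbour`, the method `update_mac`, and both event names are hypothetical and do not come from the patch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical event names for this sketch; they are not kernel constants.
NEIGH_RESOLVED = "neigh_resolved"
NEIGH_MAC_CHANGED = "neigh_mac_changed"


@dataclass
class Neighbour:
    """Toy stand-in for a neighbour entry keyed by next-hop IP."""
    ip: str
    mac: Optional[str] = None  # None until ARP/ND has resolved the address
    listeners: List[Callable[[str, "Neighbour"], None]] = field(default_factory=list)

    def update_mac(self, new_mac: str) -> None:
        """Store a MAC learned from ARP/ND and notify only on a real change."""
        if new_mac == self.mac:
            return  # refresh with the same address: nothing to report
        event = NEIGH_RESOLVED if self.mac is None else NEIGH_MAC_CHANGED
        self.mac = new_mac
        for listener in self.listeners:
            listener(event, self)
```

Hooking the single place where the address is stored covers both cases: the first resolution and any later change for the same next-hop IP.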

Does this make sense?

Steve.



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-24 Thread jamal
On Fri, 2006-23-06 at 08:24 -0500, Steve Wise wrote:

> 
> > PS:- I do think what they need is to hear route cache generation
> > as opposed to ARP+FIB updates; but lets wait and see how clever 
> > the patches would look.
> > 

> Can you expand on your statement above?  If hooking route cache
> generation gets all the events I described, then I'd like to use that.
> I'm still learning the Linux routing subsystem.  Any help would be
> GREAT!
> 

If my understanding is correct, what you are trying to do is:
for a destination IP you are going to figure the source and destination
MAC address. Most of that info is available at the route + hh cache. 
There can be only one destination mac per device and so you only need to
watch the device changes for that. The dst MAC per IP and any changes
you can glean from the route cache created.
But this is based on my understanding of what you are trying to do and
so far i cant say i am 100% clear.

cheers,
jamal



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-23 Thread Steve Wise
On Fri, 2006-06-23 at 12:57 -0700, David Miller wrote:
> From: Steve Wise <[EMAIL PROTECTED]>
> Date: Fri, 23 Jun 2006 08:24:43 -0500
> 
> > On Thu, 2006-06-22 at 20:56 -0400, jamal wrote:
> > > On Thu, 2006-22-06 at 15:58 -0700, David Miller wrote:
> > > 
> > > > Anyways, we can create normal notifiers for neighbour and route
> > > > events just like we have for network device stuff.
> > > >
> > 
> > So did you agree with a new notifier head for these events as in my
> > original patch?  Or do you think I should add these to the netdev
> > notifier?  
> 
> Pretty much.  I may not agree with the details of your implementation.
> 
> So let's start by you doing a repost of the first patch and let's
> review that, ok?

Ok.  Stay tuned.

Steve.







Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-23 Thread David Miller
From: Steve Wise <[EMAIL PROTECTED]>
Date: Fri, 23 Jun 2006 08:24:43 -0500

> On Thu, 2006-06-22 at 20:56 -0400, jamal wrote:
> > On Thu, 2006-22-06 at 15:58 -0700, David Miller wrote:
> > 
> > > Anyways, we can create normal notifiers for neighbour and route
> > > events just like we have for network device stuff.
> > >
> 
> So did you agree with a new notifier head for these events as in my
> original patch?  Or do you think I should add these to the netdev
> notifier?  

Pretty much.  I may not agree with the details of your implementation.

So let's start by you doing a repost of the first patch and let's
review that, ok?


Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-23 Thread Steve Wise
On Thu, 2006-06-22 at 20:56 -0400, jamal wrote:
> On Thu, 2006-22-06 at 15:58 -0700, David Miller wrote:
> 
> > Anyways, we can create normal notifiers for neighbour and route
> > events just like we have for network device stuff.
> >

So did you agree with a new notifier head for these events as in my
original patch?  Or do you think I should add these to the netdev
notifier?  

> > There should be netlink counterparts for that stuff too, which
> > are generated by the notifier calls or similar.
> 

Ok.

> PS:- I do think what they need is to hear route cache generation
> as opposed to ARP+FIB updates; but lets wait and see how clever 
> the patches would look.
> 

Based on what I understand from this thread, I should keep a notifier
block for these events and integrate that so the events also get passed
up to user space via netlink.  

Can you expand on your statement above?  If hooking route cache
generation gets all the events I described, then I'd like to use that.
I'm still learning the Linux routing subsystem.  Any help would be
GREAT!

Thanks,

Steve.





Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-23 Thread Steve Wise
On Thu, 2006-06-22 at 16:56 -0400, jamal wrote:
> On Thu, 2006-22-06 at 15:40 -0500, Steve Wise wrote:
> > On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:
> > > 
> > > No - what these 2 gents are saying was these events and infrastructure
> > > already exist. 
> > 
> > Notification of the exact events needed does not exist today.   
> > 
> 
> Ok, so you can't even make use of anything that already exists?
> Or is a subset of what you need already there?
> 
> > The key events, again, are:
> > 
> > - the neighbour entry mac address has changed.
> > 
> >
> > - the next hop ip address (ie the neighbour) for a given dst_entry has
> > changed.
> 
> 
> I dont see a difference for the above two from an L2 perspective.
> Are you keeping track of IP addresses?

There is no difference from an L2 perspective, but the RDMA driver needs
notification of each so it can correctly manipulate the L2 table in HW
and/or control block for the affected active connections.

> You didn't answer my question in the previous email as to what RDMA
> needs to keep track of in hardware.
> 

See my previous email.  To reiterate: The HW I'm working on maintains an
L2 table, and each active RDMA connection keeps an index into this
table.  If the mac addr of the next hop changes, then the L2 table gets
updated.  If the next hop itself changes, then each active connection
must be kicked to update its index into the L2 table.
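
The two cases above can be sketched as a small userspace Python model. This is an illustration under stated assumptions, not the driver's data structures: `L2Table`, `Connection`, `slot_for`, `mac_changed` and `next_hop_changed` are hypothetical names, and slots are never freed in this toy version.

```python
class L2Table:
    """Toy model: next-hop IP -> slot index, slot index -> MAC."""

    def __init__(self):
        self.slot_of = {}   # next-hop IP -> slot index
        self.macs = []      # slot index -> MAC address

    def slot_for(self, next_hop, mac):
        """Return the slot for next_hop, allocating one if needed."""
        if next_hop not in self.slot_of:
            self.slot_of[next_hop] = len(self.macs)
            self.macs.append(mac)
        return self.slot_of[next_hop]

    def mac_changed(self, next_hop, new_mac):
        """Neighbour MAC changed: rewrite one slot, connections untouched."""
        self.macs[self.slot_of[next_hop]] = new_mac


class Connection:
    """Toy RDMA connection that only remembers its index into the L2 table."""

    def __init__(self, dst_ip, l2_index):
        self.dst_ip = dst_ip
        self.l2_index = l2_index


def next_hop_changed(table, connections, dst_ip, new_hop, new_mac):
    """Next hop for dst_ip changed: re-point every affected connection."""
    idx = table.slot_for(new_hop, new_mac)
    for conn in connections:
        if conn.dst_ip == dst_ip:
            conn.l2_index = idx
    return idx
```

A MAC change touches one table slot regardless of how many connections share it, while a next-hop change has to walk the connections. That is why the driver wants the two events reported separately even though they look alike from an L2 perspective.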


> > 
> > - the path mtu for a given dst_entry has changed.
> > 
> 
> Same with this.
> 

The RDMA HW needs the path mtu for each connection in order to do
segmentation.
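
As a rough worked example of why the path MTU matters for segmentation: the sketch below assumes a TCP/IPv4 transport with no IP or TCP options and ignores any RDMA framing overhead. The thread states none of this, and the function names are hypothetical.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def max_payload(path_mtu):
    """Largest payload per packet that fits the path MTU."""
    return path_mtu - IPV4_HEADER - TCP_HEADER


def segment_count(message_len, path_mtu):
    """Number of packets needed to carry message_len bytes of payload."""
    payload = max_payload(path_mtu)
    if payload <= 0:
        raise ValueError("path MTU too small for headers")
    return math.ceil(message_len / payload)
```

Under these assumptions a 64 KiB message needs 45 packets at a path MTU of 1500 and 123 packets at 576, so hardware still using a stale, larger MTU would emit packets the path cannot carry.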



Steve.



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-23 Thread Steve Wise

> > 
> > > Out of curiosity - what does RDMA NIC have that would need these events?
> > > a route table or L2 table etc? Can you elucidate a little?
> > > 
> > 
> > Mainly the L2 table, next hop ip addr, and the path mtu.  RDMA NICs
> > implement the entire RDMA stack in HW.  How they deal with L2 and L3
> > changes vary to some degree, but what seems to be emerging is that they
> > get this information from the native stack because ARP and ICMP, for
> > example, are always passed up to the native stack.
> > 
> 
> I am still unclear: 
> You have destination IP address, the dstMAC of the nexthop to get the
> packet to this IP address and i suspect some srcMAC address you will use
> sending out as well as the pathMTU to get there correct?
> Because of the IP address it sounds to me like you are populating an L3
> table

I misspoke.  The HW I'm using really only maintains a table of next hop
mac addrs and a table of src mac addrs.  Each active RDMA connection in
HW keeps an index into each table for building the ethernet header. 

The _driver_ needs to know when the next hop mac addr changes, or when
the next hop itself changes for a given destination so that it can
update the active connections and/or the L2T table accordingly.   Same
deal with the path mtu...

> How is this info used in hardware? Can you explain how an arriving
> packet would be used by the RDMA in conjunction with this info once it
> is in the hardware?
> 

I think my stuff above explains this, eh?

> > These devices also act as a standard Ethernet NIC btw...
> > 
> 
> Meaning there is no funky hardware processing?
> 

If an incoming packet is not for one of the active RDMA connections (or
a listening RDMA endpoint), then the packet is passed up to the native
stack via the device's netdev driver.

Stevo.




Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread jamal
On Thu, 2006-22-06 at 15:58 -0700, David Miller wrote:

> Anyways, we can create normal notifiers for neighbour and route
> events just like we have for network device stuff.
>
> There should be netlink counterparts for that stuff too, which
> are generated by the notifier calls or similar.

Sounds reasonable - so someone deciding to do this from user space
would still be able to do so.

cheers,
jamal

PS:- I do think what they need is to hear route cache generation
as opposed to ARP+FIB updates; but lets wait and see how clever 
the patches would look.



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread David Miller
From: "Caitlin Bestler" <[EMAIL PROTECTED]>
Date: Thu, 22 Jun 2006 15:11:27 -0700

> I don't have any strong opinion on the best mechanism
> for implementing these subscriptions, but having correct
> consistent networking behaviour depend on a user-mode
> relay strikes me as odd.

Never heard of a routing daemon?  That all works in userspace.

Anyways, we can create normal notifiers for neighbour and route
events just like we have for network device stuff.

There should be netlink counterparts for that stuff too, which
are generated by the notifier calls or similar.


RE: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Caitlin Bestler
[EMAIL PROTECTED] wrote:
> On Thu, 2006-22-06 at 15:58 -0500, Steve Wise wrote:
>> On Thu, 2006-06-22 at 16:36 -0400, jamal wrote:
> 
>> I created a new notifier block in my patch for these network events.
>> I guess I thought I was using the existing infrastructure to provide
>> this notification service. (I thought my patch was lovely :)  But I
>> didn't integrate with netlink for user space notification. Mainly cuz
>> I didn't think these events should be propagated up to users unless
>> there was a need.
> 
> I think they will be useful in user space. Typically you only
> propagate them if there's a user space program subscribed to
> listening (there are hooks which will tell you if there's
> anyone listening).
> The netdevice events tend to be a lot more usable in a few
> other blocks because they are lower in the hierarchy (i.e
> routing depends on ip addresses which depend on netdevices)
> within the kernel unlike in this case where you are the only
> consumer; so it does sound logical to me to do it in user
> space; however, not totally unreasonable to do it in the kernel.
> 


These services are relevant to any RDMA connection. The user-space
consumer of RDMA services is no more interested in tracking the
routing of the remote IP address than the consumer of socket
services is.


>> 
>> 
>> Another issue I see with netlink is that the event notifications
>> aren't reliable.  Especially the CONFIG_ARPD stuff because it allocs
>> an sk_buff with ATOMIC.  A lost neighbour macaddr change is perhaps
>> fatal for an RDMA connection...
>> 
> 
> This would happen in the cases where you are short on memory;
> i would suspect you will need to allocate memory in your
> driver as well to update something in the hardware as well -
> so same problem.
> You can however work around issues like these in netlink.
>

A direct notification call to the driver makes the driver responsible
for providing whatever buffering it requires to save the information.
And if there is insufficient memory available at least the driver
is aware of the failure.

Allowing a third component to fail to relay information means that
the driver can no longer be responsible for maintaining its own
consistency with kernel routing, ARP and neighbor tables.

Maintaining that consistency is a matter of correct network
behaviour, not doing status reports. Obviously we cannot have
hardware looking at and interpreting these tables directly.
So a *reliable* subscription would seem to be the only option.

If the only subscribers who require reliable notifications are
kernel drivers, does it really make sense to make those changes
in code that also supports user space?
 

> 
> I am still unclear:
> You have destination IP address, the dstMAC of the nexthop to
> get the packet to this IP address and i suspect some srcMAC
> address you will use sending out as well as the pathMTU to
> get there correct?
> Because of the IP address it sounds to me like you are
> populating an L3 table How is this info used in hardware? Can
> you explain how an arriving packet would be used by the RDMA
> in conjunction with this info once it is in the hardware?
>

Some packets are associated with established RDMA (or iSCSI)
connections, and are processed on the RDMA (or iSCSI) device.
These devices will also pass other packets through to the
host stack for processing (non-matched Ethernet frames for
IP networks, and IPoIB tunneled frames for IB networks).

The device provides L5 services (RDMA and/or iSCSI) in addition
to L2 services (as an Ethernet device). The rest of the network
rightfully demands that the left hand knows what the right hand
is doing. So information that is provided to a host, ARP/ICMP,
should affect the behaviour of *all* connections from that host.

Do you agree that having the device subscribe to the kernel
maintained tables is a better solution than having it attempt
to guess the correct values in parallel?
 



RE: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread jamal
On Thu, 2006-22-06 at 15:11 -0700, Caitlin Bestler wrote:
> [EMAIL PROTECTED] wrote:


> 
> These subscriptions are an attempt to cede full control
> of these issues back to one place, the kernel, and to
> guarantee that an offload device can never think that
> the route to X is Y when the kernel says it is Z.
> Or that it has a different PMTU, etc.
> 

Ok, so it is the routing information then that you are syncing,
correct? 

> I don't have any strong opinion on the best mechanism
> for implementing these subscriptions, but having correct
> consistent networking behaviour depend on a user-mode
> relay strikes me as odd.
> 

And why does it sound odd? 
You will need to think about one issue:
Linux caches routing info - it is not just as simple as keeping track of
the FIB and somehow correlating that to the ARP entries.  

cheers,
jamal



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread jamal
On Thu, 2006-22-06 at 15:58 -0500, Steve Wise wrote:
> On Thu, 2006-06-22 at 16:36 -0400, jamal wrote:

> I created a new notifier block in my patch for these network events.   I
> guess I thought I was using the existing infrastructure to provide this
> notification service. (I thought my patch was lovely :)  But I didn't
> integrate with netlink for user space notification. Mainly cuz I didn't
> think these events should be propagated up to users unless there was a
> need.  

I think they will be useful in user space. Typically you only propagate
them if there's a user space program subscribed to listening (there are
hooks which will tell you if there's anyone listening).
The netdevice events tend to be a lot more usable in a few other blocks
because they are lower in the hierarchy (i.e routing depends on ip
addresses which depend on netdevices) within the kernel unlike in this
case where you are the only consumer; so it does sound logical to me
to do it in user space; however, not totally unreasonable to do it in
the kernel.

> Just to clarify, you're suggesting I add any needed netlink hooks for
> rt_redirect and the others that don't exist today, and use a NETLINK
> socket in user space to discover these events.  Yes?
> 

indeed.

> > Your mileage may vary. If you do it in user space you dont have to wait
> > for the next kernel release in case of a bug. 
> 
> As long as all the events are passed up correctly :-)
> 

They have been for years ;-> 

> > Additionally, it allows
> > for more feature richness that would tend to bloat the kernel/infiniband
> > otherwise. 
> 
> 
> Another issue I see with netlink is that the event notifications aren't
> reliable.  Especially the CONFIG_ARPD stuff because it allocs an sk_buff
> with ATOMIC.  A lost neighbour macaddr change is perhaps fatal for an
> RDMA connection...
> 

This would happen in the cases where you are short on memory; i would
suspect you will need to allocate memory in your driver as well to
update something in the hardware as well - so same problem.
You can however work around issues like these in netlink.

> 
> > Out of curiosity - what does RDMA NIC have that would need these events?
> > a route table or L2 table etc? Can you elucidate a little?
> > 
> 
> Mainly the L2 table, next hop ip addr, and the path mtu.  RDMA NICs
> implement the entire RDMA stack in HW.  How they deal with L2 and L3
> changes vary to some degree, but what seems to be emerging is that they
> get this information from the native stack because ARP and ICMP, for
> example, are always passed up to the native stack.
> 

I am still unclear: 
You have destination IP address, the dstMAC of the nexthop to get the
packet to this IP address and i suspect some srcMAC address you will use
sending out as well as the pathMTU to get there correct?
Because of the IP address it sounds to me like you are populating an L3
table
How is this info used in hardware? Can you explain how an arriving
packet would be used by the RDMA in conjunction with this info once it
is in the hardware?

> These devices also act as a standard Ethernet NIC btw...
> 

Meaning there is no funky hardware processing?

cheers,
jamal




RE: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Caitlin Bestler
[EMAIL PROTECTED] wrote:
> On Thu, 2006-22-06 at 15:40 -0500, Steve Wise wrote:
>> On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:
>>> 
>>> No - what these 2 gents are saying was these events and
>>> infrastructure already exist.
>> 
>> Notification of the exact events needed does not exist today.
>> 
> 
> Ok, so you can't even make use of anything that already exists?
> Or is a subset of what you need already there?
> 
>> The key events, again, are:
>> 
>> - the neighbour entry mac address has changed.
>> 
>> 
>> - the next hop ip address (ie the neighbour) for a given dst_entry
>> has changed.
> 
> 
> I dont see a difference for the above two from an L2 perspective.
> Are you keeping track of IP addresses?
> You didn't answer my question in the previous email as to
> what RDMA needs to keep track of in hardware.
> 

The RDMA device is handling L4 or L5 connections that 
have L3 Addresses (IP). Subscribing to the information
allows the device to keep its behaviour consistent
with the host stack.

The common alternative before proposing this integration
was to have the RDMA device sniff all incoming packets
and attempt to do parallel processing on a large set
of lower layer protocols (ICMP, ARP, routing, ...)
Or by simply trusting that the IB network administrator
has faithfully replicated all IP-relevant instructions
in two forums (traditional IP network administration
and IB network administration).

These subscriptions are an attempt to cede full control
of these issues back to one place, the kernel, and to
guarantee that an offload device can never think that
the route to X is Y when the kernel says it is Z.
Or that it has a different PMTU, etc.

I don't have any strong opinion on the best mechanism
for implementing these subscriptions, but having correct
consistent networking behaviour depend on a user-mode
relay strikes me as odd.





Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Steve Wise
On Thu, 2006-06-22 at 16:36 -0400, jamal wrote:
> On Thu, 2006-22-06 at 15:18 -0500, Steve Wise wrote:
> > On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:
> 
> > > As an example, search for NETDEV_CHANGEADDR,NETDEV_CHANGEMTU etc.
> > > Actually you are probably making this too complicated. 
> > 
> > NETDEV_CHANGEADDR uses a notifier block, and the network subsystem calls
> > call_netdevice_notifiers() when it sets an addr.  And any kernel module
> > can register for these events.  That's the model I used to create the
> > netevent_notifier mechanism in the patch I posted.
> > 
> 
> it also gets emitted as a netlink event.
> 

right.

> > I could add the new events to this netdevice notifier, but these aren't
> > really net device events.  They're network events.  
> > 
> 
> Different blocks for sure - the point is the infrastructure which
> constitutes using notifiers exists. And it is joined at the hip with
> netlink.
> 

I created a new notifier block in my patch for these network events.   I
guess I thought I was using the existing infrastructure to provide this
notification service. (I thought my patch was lovely :)  But I didn't
integrate with netlink for user space notification. Mainly cuz I didn't
think these events should be propagated up to users unless there was a
need.  


> > I can indeed extend the rtnetlink stuff to add the events in question
> > (neighbour mac addr change, route redirect, etc). In fact, there is
> > similar functionality under the CONFIG_ARPD option to support a user
> > space arp daemon.  It's not quite the same, and it doesn't cover redirect
> > and routing events, just neighbour events.
> > 
> 
> CONFIG_ARPD will give you all neighbor events you need. 
> => rt_redirect doesn't exist; neither do route cache
> creation/updates/deletions. FIB changes exist, etc.
> 

Just to clarify, you're suggesting I add any needed netlink hooks for
rt_redirect and the others that don't exist today, and use a NETLINK
socket in user space to discover these events.  Yes?
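
A minimal sketch of what such a user-space listener could look like, in Python on Linux. The numeric constants are the values from `<linux/rtnetlink.h>`; the helper names are hypothetical. Whether a given kernel actually multicasts every neighbour change to this group is an assumption here, not something the thread confirms.

```python
import socket
import struct

# Values from <linux/rtnetlink.h>.
RTMGRP_NEIGH = 0x4
RTM_NEWNEIGH = 28
RTM_DELNEIGH = 29

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid
NLMSG_HDR = struct.Struct("=IHHII")


def parse_nlmsg_types(buf):
    """Return the message types found in one netlink datagram."""
    types, offset = [], 0
    while offset + NLMSG_HDR.size <= len(buf):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(buf, offset)
        if length < NLMSG_HDR.size:
            break  # malformed header; stop rather than loop forever
        types.append(msg_type)
        offset += (length + 3) & ~3  # messages are padded to 4-byte alignment
    return types


def listen_for_neigh_events():
    """Linux only: block and print neighbour event types as they arrive."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_NEIGH))  # (pid 0: kernel assigns one, group bitmask)
    while True:
        for msg_type in parse_nlmsg_types(sock.recv(65535)):
            print("neighbour event", msg_type)
```

This only reports which message types arrived. Decoding the neighbour payload, and relaying the result back down to a kernel consumer, is the extra round trip questioned elsewhere in the thread.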


> > But in the case of the RDMA subsystem, the consumer of these events is
> > in the kernel.  Why is it better to propagate events all the way up to
> > user space, then send the event back down into the Infiniband kernel
> > subsystem?  That seems very inefficient.  
> 
> Your mileage may vary. If you do it in user space you dont have to wait
> for the next kernel release in case of a bug. 

As long as all the events are passed up correctly :-)

> Additionally, it allows
> for more feature richness that would tend to bloat the kernel/infiniband
> otherwise. 


Another issue I see with netlink is that the event notifications aren't
reliable.  Especially the CONFIG_ARPD stuff because it allocs an sk_buff
with ATOMIC.  A lost neighbour macaddr change is perhaps fatal for an
RDMA connection...
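
To make the reliability concern concrete, here is a small userspace Python model of a relay that can drop events, plus one possible workaround: remember that a loss happened and resynchronise from the authoritative table. This is a sketch under assumptions, not a description of netlink internals. `LossyRelay`, `drain` and the `snapshot` callback are hypothetical names.

```python
from collections import deque


class LossyRelay:
    """Bounded event queue that records when it had to drop an event."""

    def __init__(self, capacity):
        self.queue = deque()
        self.capacity = capacity
        self.overrun = False

    def publish(self, event):
        """Queue an event; return False if it had to be dropped."""
        if len(self.queue) >= self.capacity:
            self.overrun = True  # event lost, but the loss itself is remembered
            return False
        self.queue.append(event)
        return True


def drain(relay, shadow, snapshot):
    """Apply queued (ip, mac) events to shadow; resync fully after an overrun."""
    if relay.overrun:
        relay.queue.clear()
        relay.overrun = False
        shadow.clear()
        shadow.update(snapshot())  # re-read the authoritative table
        return "resynced"
    while relay.queue:
        ip, mac = relay.queue.popleft()
        shadow[ip] = mac
    return "applied"
```

The shadow copy converges either way, but only if the consumer can learn that a loss occurred. A silently dropped MAC change is the fatal case described above.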


> Out of curiosity - what does RDMA NIC have that would need these events?
> a route table or L2 table etc? Can you elucidate a little?
> 

Mainly the L2 table, next hop ip addr, and the path mtu.  RDMA NICs
implement the entire RDMA stack in HW.  How they deal with L2 and L3
changes vary to some degree, but what seems to be emerging is that they
get this information from the native stack because ARP and ICMP, for
example, are always passed up to the native stack.

These devices also act as a standard Ethernet NIC btw...

Steve.







Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread jamal
On Thu, 2006-22-06 at 15:40 -0500, Steve Wise wrote:
> On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:
> > 
> > No - what these 2 gents are saying was these events and infrastructure
> > already exist. 
> 
> Notification of the exact events needed does not exist today.   
> 

Ok, so you can't even make use of anything that already exists?
Or is a subset of what you need already there?

> The key events, again, are:
> 
> - the neighbour entry mac address has changed.
> 
>
> - the next hop ip address (ie the neighbour) for a given dst_entry has
> changed.


I dont see a difference for the above two from an L2 perspective.
Are you keeping track of IP addresses?
You didn't answer my question in the previous email as to what RDMA
needs to keep track of in hardware.

> 
> - the path mtu for a given dst_entry has changed.
> 

Same with this.

cheers,
jamal



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Steve Wise
On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:
> 
> No - what these 2 gents are saying was these events and infrastructure
> already exist. 

Notification of the exact events needed does not exist today.   

The key events, again, are:

- the neighbour entry mac address has changed.

- the next hop ip address (ie the neighbour) for a given dst_entry has
changed.

- the path mtu for a given dst_entry has changed.


Steve.



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread jamal
On Thu, 2006-22-06 at 15:18 -0500, Steve Wise wrote:
> On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:

> > As an example, search for NETDEV_CHANGEADDR,NETDEV_CHANGEMTU etc.
> > Actually you are probably making this too complicated. 
> 
> NETDEV_CHANGEADDR uses a notifier block, and the network subsystem calls
> call_netdevice_notifiers() when it sets an addr.  And any kernel module
> can register for these events.  That's the model I used to create the
> netevent_notifier mechanism in the patch I posted.
> 

it also gets emitted as a netlink event.

> I could add the new events to this netdevice notifier, but these aren't
> really net device events.  They're network events.  
> 

Different blocks for sure - the point is the infrastructure which
constitutes using notifiers exists. And it is joined at the hip with
netlink.


> I can indeed extend the rtnetlink stuff to add the events in question
> (neighbour mac addr change, route redirect, etc). In fact, there is
> similar functionality under the CONFIG_ARPD option to support a user
> space arp daemon.  It's not quite the same, and it doesn't cover redirect
> and routing events, just neighbour events.
> 

CONFIG_ARPD will give you all neighbor events you need. 
=> rt_redirect doesn't exist; neither do route cache
creation/updates/deletions. FIB changes exist, etc.

> But in the case of the RDMA subsystem, the consumer of these events is
> in the kernel.  Why is it better to propagate events all the way up to
> user space, then send the event back down into the Infiniband kernel
> subsystem?  That seems very inefficient.  

Your mileage may vary. If you do it in user space you dont have to wait
for the next kernel release in case of a bug. Additionally, it allows
for more feature richness that would tend to bloat the kernel/infiniband
otherwise. 

Out of curiosity - what does RDMA NIC have that would need these events?
a route table or L2 table etc? Can you elucidate a little?

cheers,
jamal



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Steve Wise
On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:
> On Thu, 2006-22-06 at 10:27 -0500, Steve Wise wrote:
> 
> > 
> > The in-kernel Infiniband subsystem needs to know when certain events
> > happen.  For example, if the mac address of a neighbour changes.  Any
> > rdma devices that are using said neighbour need to be notified of the
> > change.  You are asking that I extend the netlink facility (if
> > necessary) to provide this functionality.  
> > 
> 
> No - what these 2 gents are saying is that these events and infrastructure
> already exist. If some events don't exist, you need to extend
> what already exists. Your patch was a serious reinvention of the wheel
> (and, in the case of the neighbor code, looks very wrong).

ok.

> As an example, search for NETDEV_CHANGEADDR,NETDEV_CHANGEMTU etc.
> Actually you are probably making this too complicated. 

NETDEV_CHANGEADDR uses a notifier block, and the network subsystem calls
call_netdevice_notifiers() when it sets an addr.  And any kernel module
can register for these events.  That's the model I used to create the
netevent_notifier mechanism in the patch I posted.
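
A rough user-space model of that notifier-block pattern is sketched below.
It is only an approximation for readers unfamiliar with the idea: every
demo_* name and both event codes are hypothetical, and the code is neither
the kernel's notifier implementation nor the posted patch.

```c
/* Userspace model of a kernel-style notifier chain (all names hypothetical). */
#include <assert.h>
#include <stddef.h>

#define DEMO_NOTIFY_DONE 0   /* callback did not care about the event */
#define DEMO_NOTIFY_OK   1   /* callback handled the event */

/* Hypothetical event codes, loosely modelled on the events in this thread. */
enum demo_event { DEMO_NEIGH_UPDATE = 1, DEMO_REDIRECT = 2 };

struct demo_notifier_block {
    int (*notifier_call)(struct demo_notifier_block *self,
                         unsigned long event, void *data);
    struct demo_notifier_block *next;
    int priority;            /* higher priority runs first */
};

/* Insert nb into the chain, keeping it sorted by descending priority. */
static void demo_chain_register(struct demo_notifier_block **head,
                                struct demo_notifier_block *nb)
{
    assert(nb != NULL && nb->notifier_call != NULL);
    while (*head && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Unlink nb from the chain; returns 0 on success, -1 if it was not found. */
static int demo_chain_unregister(struct demo_notifier_block **head,
                                 struct demo_notifier_block *nb)
{
    for (; *head; head = &(*head)->next) {
        if (*head == nb) {
            *head = nb->next;
            nb->next = NULL;
            return 0;
        }
    }
    return -1;
}

/* Invoke every registered callback in order; returns how many were called. */
static int demo_chain_call(struct demo_notifier_block *head,
                           unsigned long event, void *data)
{
    int called = 0;

    for (; head; head = head->next) {
        head->notifier_call(head, event, data);
        called++;
    }
    return called;
}

/* Example consumer: counts the neighbour updates it has seen. */
static int demo_neigh_updates_seen;

static int demo_rdma_event(struct demo_notifier_block *self,
                           unsigned long event, void *data)
{
    (void)self;
    (void)data;
    if (event != DEMO_NEIGH_UPDATE)
        return DEMO_NOTIFY_DONE;
    demo_neigh_updates_seen++;
    return DEMO_NOTIFY_OK;
}
```

The producer only walks the chain; it knows nothing about the consumers,
which is what lets any module subscribe without changes to the code that
raises the event.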

I could add the new events to this netdevice notifier, but these aren't
really net device events.  They're network events.

> Listen to events
> in user space and tell infiniband from user space.
> 

I can indeed extend the rtnetlink stuff to add the events in question
(neighbour mac addr change, route redirect, etc). In fact, there is
similar functionality under the CONFIG_ARPD option to support a user
space arp daemon.  It's not quite the same, and it doesn't cover redirect
and routing events, just neighbour events.

But in the case of the RDMA subsystem, the consumer of these events is
in the kernel.  Why is it better to propagate events all the way up to
user space, then send the event back down into the Infiniband kernel
subsystem?  That seems very inefficient.  

Steve.



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread jamal
On Thu, 2006-22-06 at 10:27 -0500, Steve Wise wrote:

> 
> The in-kernel Infiniband subsystem needs to know when certain events
> happen.  For example, if the mac address of a neighbour changes.  Any
> rdma devices that are using said neighbour need to be notified of the
> change.  You are asking that I extend the netlink facility (if
> necessary) to provide this functionality.  
> 

No - what these 2 gents are saying is that these events and infrastructure
already exist. If some events don't exist, you need to extend
what already exists. Your patch was a serious reinvention of the wheel
(and, in the case of the neighbor code, looks very wrong).
As an example, search for NETDEV_CHANGEADDR,NETDEV_CHANGEMTU etc.
Actually you are probably making this too complicated. Listen to events
in user space and tell infiniband from user space.
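
As a rough sketch of the user-space side, the code below opens a
NETLINK_ROUTE socket bound to a set of RTMGRP_* multicast groups and walks
the messages in one datagram. The rtnl_* helper names are hypothetical, and
how the result would be handed to infiniband is left open, since the thread
does not specify that interface.

```c
/* Sketch: listen for rtnetlink multicast events from user space (Linux). */
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Open a NETLINK_ROUTE socket subscribed to the given RTMGRP_* groups.
 * Returns the file descriptor, or -1 on error. */
static int rtnl_open_listener(unsigned int groups)
{
    struct sockaddr_nl addr;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = groups;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map an rtnetlink message type to a short label for logging. */
static const char *rtnl_event_name(unsigned short type)
{
    switch (type) {
    case RTM_NEWLINK:  return "link-new";
    case RTM_DELLINK:  return "link-del";
    case RTM_NEWNEIGH: return "neigh-new";
    case RTM_DELNEIGH: return "neigh-del";
    case RTM_NEWROUTE: return "route-new";
    case RTM_DELROUTE: return "route-del";
    default:           return "other";
    }
}

/* Block for one datagram and call cb for each netlink message in it.
 * Returns the number of messages seen, or -1 if recv() failed. */
static int rtnl_dispatch_once(int fd, void (*cb)(const struct nlmsghdr *nh))
{
    union { struct nlmsghdr hdr; char bytes[8192]; } buf;
    struct nlmsghdr *nh;
    ssize_t got;
    int remaining, count = 0;

    assert(cb != NULL);
    got = recv(fd, buf.bytes, sizeof(buf.bytes), 0);
    if (got < 0)
        return -1;
    remaining = (int)got;
    for (nh = &buf.hdr; NLMSG_OK(nh, remaining);
         nh = NLMSG_NEXT(nh, remaining)) {
        cb(nh);
        count++;
    }
    return count;
}
```

A daemon would call rtnl_open_listener(RTMGRP_NEIGH | RTMGRP_IPV4_ROUTE)
and then loop on rtnl_dispatch_once(), acting only on the message types it
cares about.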

cheers,
jamal



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Steve Wise
On Thu, 2006-06-22 at 08:53 -0500, Steve Wise wrote:
> On Thu, 2006-06-22 at 01:57 -0700, David Miller wrote:
> > From: Steve Wise <[EMAIL PROTECTED]>
> > Date: Wed, 21 Jun 2006 13:45:19 -0500
> > 
> > > This patch implements a mechanism that allows interested clients to
> > > register for notification of certain network events.
> > 
> > We have a generic network event notification facility called
> > netlink, please use it and extend it for your needs if necessary.
> 
> I'll investigate this.  
> 
> Thanks,


The in-kernel Infiniband subsystem needs to know when certain events
happen.  For example, if the mac address of a neighbour changes.  Any
rdma devices that are using said neighbour need to be notified of the
change.  You are asking that I extend the netlink facility (if
necessary) to provide this functionality.  

Are you suggesting, then, that the Infiniband subsystem should create an
in-kernel NETLINK socket and obtain these events (and the pertinent
information) via the socket?  

I'm still learning about netlink, but my understanding to date is that
it's a way to pass events/commands between the kernel and user
applications.  It perhaps seems overkill to use this mechanism for
kernel->kernel event notifications.  That's why I started with notifier
blocks and added a netevent_notifier mechanism.

Any help is greatly appreciated.  Sorry if I'm being dense...

Steve.



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread Steve Wise
On Thu, 2006-06-22 at 01:57 -0700, David Miller wrote:
> From: Steve Wise <[EMAIL PROTECTED]>
> Date: Wed, 21 Jun 2006 13:45:19 -0500
> 
> > This patch implements a mechanism that allows interested clients to
> > register for notification of certain network events.
> 
> We have a generic network event notification facility called
> netlink, please use it and extend it for your needs if necessary.

I'll investigate this.  

Thanks,


Steve.



Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-22 Thread David Miller
From: Steve Wise <[EMAIL PROTECTED]>
Date: Wed, 21 Jun 2006 13:45:19 -0500

> This patch implements a mechanism that allows interested clients to
> register for notification of certain network events.

We have a generic network event notification facility called
netlink, please use it and extend it for your needs if necessary.


Re: [PATCH 0/2][RFC] Network Event Notifier Mechanism

2006-06-21 Thread YOSHIFUJI Hideaki / 吉藤英明
In article <[EMAIL PROTECTED]> (at Wed, 21 Jun 2006 13:45:19 -0500), Steve Wise 
<[EMAIL PROTECTED]> says:

> This patch implements a mechanism that allows interested clients to
> register for notification of certain network events. The intended use
> is to allow RDMA devices (linux/drivers/infiniband) to be notified of
> neighbour updates, ICMP redirects, path MTU changes, and route changes.

Why not netlink?
Neighbor / routing updates should be transmitted via netlink, at least.

--yoshfuji