Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-22 Thread Andi Kleen
On Sun, 21 Aug 2005 16:19:26 -0700 (PDT)
"David S. Miller" <[EMAIL PROTECTED]> wrote:

> From: Andi Kleen <[EMAIL PROTECTED]>
> Date: Mon, 22 Aug 2005 01:13:21 +0200
> 
> > > Basically, you'll have skb->free_callback(skb, ARG), and
> > > skb->free_callback_ARG.  And when the SKB and its memory
> > > is about to get liberated, we'll call the callback instead
> > > of doing the free if the callback is non-NULL.
> > 
> > One issue is that the NIC focus shouldn't be reprogrammed for every 
> > packet because that would be too expensive.
> 
> The NIC is going to track this state internally in a cache, completely
> transparently from the OS (besides the callback), and use MSI vectors
> to target specific cpus based upon that information.

Yes it does. But you need to update that information when the receiving
CPU changes. So you need to figure out when the CPU has changed
by storing the last received CPU. You cannot rely on the information inside the 
NIC here because accessing that over PCI would be too expensive.
Updating on every packet would also be too expensive.  So it has
to be stored in RAM somewhere.

In theory the NIC could store it in a separate data structure, but that would 
be wasteful IMHO because it would duplicate what a socket does. 
So it's best to add a last_rcv_cpu field to the struct sock and make sure 
the free callback can access it safely.
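
A minimal userspace sketch of the idea above, assuming a last_rcv_cpu
field as proposed. The struct and function names (sock_sketch,
nic_sketch, nic_retarget, rx_free_note_cpu) are hypothetical stand-ins,
not kernel API; the point is only that the expensive device write happens
when the CPU changes, not on every packet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; only the proposed field is shown. */
struct sock_sketch {
    int last_rcv_cpu;             /* CPU that last consumed data, -1 if unknown */
};

/* Hypothetical stand-in for the NIC's steering state. */
struct nic_sketch {
    unsigned long steer_updates;  /* counts the expensive device writes */
    int target_cpu;               /* CPU the NIC currently steers this flow to */
};

/* Stand-in for the expensive reprogramming step (e.g. a write over PCI). */
static void nic_retarget(struct nic_sketch *nic, int cpu)
{
    nic->target_cpu = cpu;
    nic->steer_updates++;
}

/*
 * Called from the free path with the CPU doing the free.
 * Returns 1 if the NIC was reprogrammed, 0 if the cached value in RAM
 * showed that nothing had changed.
 */
static int rx_free_note_cpu(struct nic_sketch *nic, struct sock_sketch *sk,
                            int this_cpu)
{
    assert(nic != NULL && sk != NULL);

    if (sk->last_rcv_cpu == this_cpu)
        return 0;                 /* common case: no device access at all */

    sk->last_rcv_cpu = this_cpu;
    nic_retarget(nic, this_cpu);
    return 1;
}
```

In a real kernel the comparison would use smp_processor_id() and the
field would need the safe-access guarantees mentioned above; neither is
modelled here.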

-Andi

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-22 Thread Andi Kleen

> Another approach would be:
> 
> 1) Determine that we don't care about the callback (ie. it gets
>    reset to NULL) when the skb->dev changes, as would occur for
>    forwarding, and certain kinds of firewalling and classification
>    actions.
> 
> 2) As a result of #1 we can put the callback into the netdev struct,
>    the opaque "ARG" becomes superfluous since the thing you'd pass
>    there (the device pointer) is implicit.

It would be better to still pass it the netdev so that the driver
can figure out what instance of its device it is for.

One problem is to figure out the input netdevice (skb->input_dev
is usually not valid). For TCP/connected UDP it is relatively
easy to figure out based on the routes, but not necessarily
for all kfree_skb calls.  But doing it only for these sockets
is fine I think. For unconnected UDP it could even be disabled.

I also prefer this over a generic free callback because
it will not tempt people to implement their own broken
skb reuse schemes in drivers using this.
 
> 3) Add a "received on cpu" number to sk_buff, and the callback can
>    inspect that.
> 
> If smp_processor_id() is different from skb->recv_cpu, then the
> driver updates the table.

Sounds good too. 

-Andi
 


Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-22 Thread David S. Miller
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Mon, 22 Aug 2005 05:00:17 +0200

> > Another approach would be:
> > 
> > 1) Determine that we don't care about the callback (ie. it gets
> >    reset to NULL) when the skb->dev changes, as would occur for
> >    forwarding, and certain kinds of firewalling and classification
> >    actions.
> > 
> > 2) As a result of #1 we can put the callback into the netdev struct,
> >    the opaque "ARG" becomes superfluous since the thing you'd pass
> >    there (the device pointer) is implicit.
> 
> It would be better to still pass it the netdev so that the driver
> can figure out what instance of its device it is for.

We would know, because the args would be:

   netdev->skb_free_callback(skb, netdev);

So there is no question about which device it arrived on.
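
As a rough illustration of that calling convention, here is a userspace
sketch in which one driver serves two device instances and tells them
apart through the netdev argument alone. The types netdev_sketch,
skb_sketch and drv_priv, and the priv pointer, are assumptions made for
the example, not the real kernel structures.

```c
#include <assert.h>
#include <stddef.h>

struct skb_sketch { int len; };

/* Hypothetical stand-in for struct net_device. */
struct netdev_sketch {
    void (*skb_free_callback)(struct skb_sketch *skb,
                              struct netdev_sketch *dev);
    void *priv;                   /* per-instance driver state */
};

/* Per-instance driver state; here just a counter of freed buffers. */
struct drv_priv { unsigned long freed; };

/* One callback shared by every instance; the device argument selects state. */
static void drv_skb_free(struct skb_sketch *skb, struct netdev_sketch *dev)
{
    struct drv_priv *p = dev->priv;

    (void)skb;                    /* the sketch does not inspect the buffer */
    p->freed++;
}

/* Attach per-instance state and the shared callback to a device. */
static void netdev_init_sketch(struct netdev_sketch *dev, struct drv_priv *p)
{
    assert(dev != NULL && p != NULL);
    p->freed = 0;
    dev->priv = p;
    dev->skb_free_callback = drv_skb_free;
}
```

Calling eth1.skb_free_callback(&skb, &eth1) touches only the state of
the second instance, so no separate opaque argument is needed.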


Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-22 Thread David S. Miller
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Mon, 22 Aug 2005 03:34:16 +0200

> In theory the NIC could store it in a separate data structure, but
> that would be wasteful IMHO because it would duplicate what a socket
> does.  So it's best to add a last_rcv_cpu field to the struct sock
> and make sure the free callback can access it safely.

That's an interesting idea.

Another approach would be:

1) Determine that we don't care about the callback (ie. it gets
   reset to NULL) when the skb->dev changes, as would occur for
   forwarding, and certain kinds of firewalling and classification
   actions.

2) As a result of #1 we can put the callback into the netdev struct,
   the opaque "ARG" becomes superfluous since the thing you'd pass
   there (the device pointer) is implicit.

3) Add a "received on cpu" number to sk_buff, and the callback can
   inspect that.

If smp_processor_id() is different from skb->recv_cpu, then the
driver updates the table.
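
The three points above could be sketched in userspace C roughly as
follows. This is an illustration under assumptions, not a patch: the
rx_dev pointer is one hypothetical way to express "the callback gets
reset when skb->dev changes", current_cpu stands in for
smp_processor_id(), and the actual release of the buffer is left out.

```c
#include <assert.h>
#include <stddef.h>

struct skb_sketch;

/* Hypothetical stand-in for struct net_device. */
struct netdev_sketch {
    void (*skb_free_callback)(struct skb_sketch *skb,
                              struct netdev_sketch *dev);
    int steer_cpu;                /* CPU recorded in the driver's table */
    unsigned long table_updates;  /* how often the table was rewritten */
};

/* Hypothetical stand-in for struct sk_buff. */
struct skb_sketch {
    struct netdev_sketch *dev;    /* current device, may change */
    struct netdev_sketch *rx_dev; /* receiving device, NULL once dev changes */
    int recv_cpu;                 /* point 3: CPU the packet was received on */
};

static int current_cpu;           /* stand-in for smp_processor_id() */

/* Receive path: remember the device and the receiving CPU. */
static void skb_receive(struct skb_sketch *skb, struct netdev_sketch *dev)
{
    skb->dev = dev;
    skb->rx_dev = dev;
    skb->recv_cpu = current_cpu;
}

/* Point 1: forwarding or classification changes the device, so the
 * free notification is dropped for this buffer. */
static void skb_set_dev(struct skb_sketch *skb, struct netdev_sketch *new_dev)
{
    if (skb->dev != new_dev)
        skb->rx_dev = NULL;
    skb->dev = new_dev;
}

/* Driver side: update the table only when the freeing CPU differs. */
static void driver_skb_free(struct skb_sketch *skb, struct netdev_sketch *dev)
{
    if (current_cpu != skb->recv_cpu) {
        dev->steer_cpu = current_cpu;
        dev->table_updates++;
    }
}

/* Point 2: the callback lives in the netdev and receives it implicitly. */
static void skb_free_sketch(struct skb_sketch *skb)
{
    struct netdev_sketch *dev;

    assert(skb != NULL);
    dev = skb->rx_dev;
    if (dev != NULL && dev->skb_free_callback != NULL)
        dev->skb_free_callback(skb, dev);
    /* releasing the buffer memory is omitted in this sketch */
}
```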


Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-21 Thread David S. Miller
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Mon, 22 Aug 2005 01:13:21 +0200

> > Basically, you'll have skb->free_callback(skb, ARG), and
> > skb->free_callback_ARG.  And when the SKB and its memory
> > is about to get liberated, we'll call the callback instead
> > of doing the free if the callback is non-NULL.
> 
> One issue is that the NIC focus shouldn't be reprogrammed for every 
> packet because that would be too expensive.

The NIC is going to track this state internally in a cache, completely
transparently from the OS (besides the callback), and use MSI vectors
to target specific cpus based upon that information.

Perhaps I'm missing something here.


Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-21 Thread Andi Kleen
> 
> Basically, you'll have skb->free_callback(skb, ARG), and
> skb->free_callback_ARG.  And when the SKB and its memory
> is about to get liberated, we'll call the callback instead
> of doing the free if the callback is non-NULL.

One issue is that the NIC focus shouldn't be reprogrammed for every 
packet because that would be too expensive. So someone needs to remember 
the last value of smp_processor_id(). Best is probably to do that in the 
struct sock, otherwise the driver would need to keep redundant 
data structures around which would be ugly. 

I think it's enough to make sure skb->sk stays valid during 
the callback, then the driver (or a common library callback)
can maintain it in a struct sock field and only tell the driver
when the CPU has changed.

This would imply that skb->destructor would need to run after
the callback which could cause problems if the callback actually
destroys the skb. 

-Andi 


RE: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-21 Thread Leonid Grossman
 

> -----Original Message-----
> From: David S. Miller [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, August 21, 2005 3:34 PM

> We'll be adding the RX free callback support soon, perhaps in 
> the 2.6.14 timeframe, once we shrink the sk_buff struct a 
> little bit more so that we can justify adding the extra 
> member necessary for the implementation.
> 
> Basically, you'll have skb->free_callback(skb, ARG), and
> skb->free_callback_ARG.  And when the SKB and its memory
> is about to get liberated, we'll call the callback instead of 
> doing the free if the callback is non-NULL.
> 


Thanks, this will work!

The driver-only RTD scheme seems to work reasonably well, but the ability to
"follow the scheduler" for both tx and rx driver processing should help
quite a bit. 
Since the ASIC association between rx queues and
saddr/daddr/sport/dport state is dynamic, the process doesn't even have
to stay on the same cpu forever - just "long enough".
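
To illustrate what a dynamic association between rx queues and
saddr/daddr/sport/dport could look like, here is a small userspace
sketch of a direct-mapped flow cache. It is not a description of the
actual ASIC; the table size, hash, and the names flow_key, flow_bind and
flow_lookup are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define FLOW_SLOTS 64

/* The 4-tuple that identifies a flow. */
struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

struct flow_slot {
    struct flow_key key;
    int queue;                    /* rx queue currently bound to this flow */
    int used;
};

static struct flow_slot flow_table[FLOW_SLOTS];

/* Simple illustrative hash; real hardware would use something stronger. */
static unsigned flow_hash(const struct flow_key *k)
{
    uint32_t h = k->saddr ^ k->daddr ^
                 (((uint32_t)k->sport << 16) | k->dport);

    h ^= h >> 16;
    return h % FLOW_SLOTS;
}

static int flow_key_eq(const struct flow_key *a, const struct flow_key *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Bind or rebind a flow to a queue. Like a cache, this overwrites
 * whatever flow previously occupied the slot. */
static void flow_bind(const struct flow_key *k, int queue)
{
    struct flow_slot *s = &flow_table[flow_hash(k)];

    assert(queue >= 0);
    s->key = *k;
    s->queue = queue;
    s->used = 1;
}

/* Return the bound queue, or default_queue if the flow is not cached. */
static int flow_lookup(const struct flow_key *k, int default_queue)
{
    const struct flow_slot *s = &flow_table[flow_hash(k)];

    if (s->used && flow_key_eq(&s->key, k))
        return s->queue;
    return default_queue;
}
```

Because a binding can be replaced at any time, a process that migrates
only needs its flow rebound once; it does not have to stay on one cpu.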

 


Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-21 Thread Christoph Hellwig
On Sun, Aug 21, 2005 at 03:33:36PM -0700, David S. Miller wrote:
> From: "Leonid Grossman" <[EMAIL PROTECTED]>
> Date: Sun, 21 Aug 2005 13:02:00 -0400
> 
> > Andi, can you provide a callback patch please? 
> 
> Andi isn't very active in networking these days,
> so asking him to do the work whilst he's so busy with
> x86_64 maintenance isn't the best idea :)
> 
> We'll be adding the RX free callback support soon, perhaps
> in the 2.6.14 timeframe, once we shrink the sk_buff struct
> a little bit more so that we can justify adding the extra
> member necessary for the implementation.
> 
> Basically, you'll have skb->free_callback(skb, ARG), and
> skb->free_callback_ARG.  And when the SKB and its memory
> is about to get liberated, we'll call the callback instead
> of doing the free if the callback is non-NULL.

Does it really need to be in the skbuff?  I think a
rx_free_skb method in struct net_device would be sufficient.



Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-21 Thread David S. Miller
From: Christoph Hellwig <[EMAIL PROTECTED]>
Date: Sun, 21 Aug 2005 23:38:24 +0100

> Does it really need to be in the skbuff?  I think a
> rx_free_skb method in struct net_device would be sufficient.

The device on the SKB can be changed long before we free
it, due to netfilter, traffic classification actions, etc.


Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-21 Thread David S. Miller
From: "Leonid Grossman" <[EMAIL PROTECTED]>
Date: Sun, 21 Aug 2005 13:02:00 -0400

> Andi, can you provide a callback patch please? 

Andi isn't very active in networking these days,
so asking him to do the work whilst he's so busy with
x86_64 maintenance isn't the best idea :)

We'll be adding the RX free callback support soon, perhaps
in the 2.6.14 timeframe, once we shrink the sk_buff struct
a little bit more so that we can justify adding the extra
member necessary for the implementation.

Basically, you'll have skb->free_callback(skb, ARG), and
skb->free_callback_ARG.  And when the SKB and its memory
is about to get liberated, we'll call the callback instead
of doing the free if the callback is non-NULL.
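
A minimal userspace sketch of that behaviour, assuming the two members
are named as above. The struct skb_sketch and the helpers
skb_alloc_sketch, skb_release_sketch and kfree_skb_sketch are
hypothetical stand-ins for the real sk_buff code; the example callback
releases the buffer itself, which a driver might replace with recycling.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff with the two proposed members. */
struct skb_sketch {
    void (*free_callback)(struct skb_sketch *skb, void *arg);
    void *free_callback_arg;
    unsigned char *data;
};

/* Allocate a buffer with no callback set; returns NULL on failure. */
static struct skb_sketch *skb_alloc_sketch(size_t len)
{
    struct skb_sketch *skb = calloc(1, sizeof(*skb));

    if (skb == NULL)
        return NULL;
    skb->data = malloc(len);
    if (skb->data == NULL) {
        free(skb);
        return NULL;
    }
    return skb;
}

/* The ordinary free: release the data and the descriptor. */
static void skb_release_sketch(struct skb_sketch *skb)
{
    free(skb->data);
    free(skb);
}

/* If a callback is set, call it instead of freeing; the callback then
 * owns the buffer. Otherwise do the ordinary free. */
static void kfree_skb_sketch(struct skb_sketch *skb)
{
    assert(skb != NULL);
    if (skb->free_callback != NULL) {
        skb->free_callback(skb, skb->free_callback_arg);
        return;
    }
    skb_release_sketch(skb);
}

/* Example callback: count the notification, then release the buffer. */
static void count_and_release(struct skb_sketch *skb, void *arg)
{
    unsigned *count = arg;

    (*count)++;
    skb_release_sketch(skb);
}
```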