Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-07 Thread David S. Miller
From: Jesse Brandeburg <[EMAIL PROTECTED]>
Date: Tue, 7 Feb 2006 14:11:46 -0800 (Pacific Standard Time)

> A recent commit in 2.6.14 broke this, see this git commit: 
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=bc8dfcb93970ad7139c976356bfc99d7e251deaf
>  
> Or for a shorter version http://tinyurl.com/drpu8

I think we should revert that thing, it's caused more grief than
anything else.  I thought it was a complete waste of time from the
get-go even assuming that fraglists within fraglists never occur...
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-07 Thread Herbert Xu
David S. Miller <[EMAIL PROTECTED]> wrote:
> 
> I think we should revert that thing, it's caused more grief than
> anything else.  I thought it was a complete waste of time from the
> get-go even assuming that fraglists within fraglists never occur...

I share your feelings towards this patch.  However, what e1000 is doing
is broken.  It should be filling in the frags array, not frag_list.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-07 Thread Jesse Brandeburg

On Tue, 7 Feb 2006, Herbert Xu wrote:

David S. Miller <[EMAIL PROTECTED]> wrote:
>
> I think we should revert that thing, it's caused more grief than
> anything else.  I thought it was a complete waste of time from the
> get-go even assuming that fraglists within fraglists never occur...

I share your feelings towards this patch.  However, what e1000 is doing
is broken.  It should be filling in the frags array, not frag_list.


so we generally call dev_alloc_skb to get the receive buffers to give to 
our hardware.  When we use multiple receive buffers what is the right way 
to allocate memory to give buffers to the hardware and then later, to 
"chain" the descriptors together to make the packet?  Using skb's is the 
common way as far as I've understood it.


Your input on the correct way to do these things is greatly appreciated.

Jesse
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-07 Thread David S. Miller
From: Jesse Brandeburg <[EMAIL PROTECTED]>
Date: Tue, 7 Feb 2006 17:41:28 -0800 (Pacific Standard Time)

> so we generally call dev_alloc_skb to get the receive buffers to give to 
> our hardware.  When we use multiple receive buffers what is the right way 
> to allocate memory to give buffers to the hardware and then later, to 
> "chain" the descriptors together to make the packet?  Using skb's is the 
> common way as far as I've understood it.
> 
> Your input on the correct way to do these things is greatly appreciated.

Allocate a single SKB and fill in the skb_shared_info() page/offset/len
pairs, making sure to take proper references to the pages you add.
Coalesce when possible.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-08 Thread Jesse Brandeburg

On Tue, 7 Feb 2006, David S. Miller wrote:

From: Jesse Brandeburg <[EMAIL PROTECTED]>
Date: Tue, 7 Feb 2006 17:41:28 -0800 (Pacific Standard Time)

> so we generally call dev_alloc_skb to get the receive buffers to give to
> our hardware.  When we use multiple receive buffers what is the right way
> to allocate memory to give buffers to the hardware and then later, to
> "chain" the descriptors together to make the packet?  Using skb's is the
> common way as far as I've understood it.
>
> Your input on the correct way to do these things is greatly appreciated.

Allocate a single SKB and fill in the skb_shared_info() page/offset/len
pairs, making sure to take proper references to the pages you add.
Coalesce when possible.


Given that there is code in the kernel in
net/ipv4/ip_fragment.c: ip_frag_reasm

/* If the first fragment is fragmented itself, we split
 * it to two chunks: the first with data and paged part
 * and the second, holding only fragments. */

checks for a frag_list and in response, *creates* the frag 
list inside a frag list, when does that code get used?


Basically it appears that code has been there for a long time (since 2.4) 
and both e1000 and other drivers use it.  If the change to 
skb_copy_datagram_iovec is reverted, this method of chaining skbs 
continues to work.  I don't see any clear reason why it shouldn't work.


Jesse
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-08 Thread David S. Miller
From: Jesse Brandeburg <[EMAIL PROTECTED]>
Date: Wed, 8 Feb 2006 16:30:04 -0800 (Pacific Standard Time)

> Basically it appears that code has been there for a long time (since 2.4) 
> and both e1000 and other drivers use it.  If the change to 
> skb_copy_datagram_iovec is reverted, this method of chaining skbs 
> continues to work.  I don't see any clear reason why it shouldn't work.

I think regardless of your valid points you should fix e1000
to chain page/offset/len pairs into a single SKB because using
a frag list of SKBs is very wasteful and is going to break
TCP socket buffer accounting.

I think there was even a thread about this side effect e1000
has on TCP recently.

Please fix e1000, then I'll revert the skb_copy_datagram_iovec()
change being discussed.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-08 Thread Jesse Brandeburg


On Tue, 7 Feb 2006, Herbert Xu wrote:

I share your feelings towards this patch.  However, what e1000 is doing
is broken.  It should be filling in the frags array, not frag_list.


Okay, this implies all of our data will be in pages.  Does this mean that 
we have to copy the header back to skb->data?  eth_type_trans requires 
that the header be at skb->data, and kmap'ing the header creates the 
non-trivial problem of having to call kunmap at some point.


My experience with alloc_page/put_page in the "packet split" e1000 code is
that it is slower to call alloc_page/put_page (they show up in top 
10 oprofile) at certain packet rates, and will negatively affect our

performance as compared to using the slab allocator.  Do you think so too?

All we want to do is have a single packet represented by several 
descriptors indicated up the stack.


9K jumbos:
2k
2k
2k
2k
1k

If you think it would be benefical, I think we and others could use some 
more structure in the stack to support this.


As another option, for now we can back out the patch that made memory 
utilization so much more efficient by using small descriptors all the 
time, and go back to our monolithic descriptor usage.


Jesse
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-09 Thread Andi Kleen
On Thursday 09 February 2006 02:33, Jesse Brandeburg wrote:

> My experience with alloc_page/put_page in the "packet split" e1000 code is
> that it is slower to call alloc_page/put_page (they show up in top 
> 10 oprofile) 

Did you resolve it down to specific lines in alloc_page/put_page?
Perhaps they can be fixed.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html