Netperf2 TOT now accesses the buffer that was just recv()'d rather than
the one that is about to be recv()'d.
We've posted netperf2 results with I/OAT enabled/disabled and the data
access option on/off at
http://kernel.org/pub/linux/kernel/people/grover/ioat/netperf-icb-1.5-postscaling-both.pdf
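For readers unfamiliar with the access-order change mentioned above, here is a minimal sketch of a receive loop that touches the data it has just received. It is not netperf's code: the function names `recv_and_touch` and `demo`, the buffer size, and the byte-sum used as the "access" are all assumptions made for illustration.

```python
import socket


def recv_and_touch(sock, bufsize=4096, touch=True):
    """Receive until EOF; optionally read every byte right after each recv.

    Hypothetical helper for illustration only, not netperf's implementation.
    """
    buf = bytearray(bufsize)
    view = memoryview(buf)
    total = 0
    checksum = 0
    while True:
        n = sock.recv_into(view)
        if n == 0:  # peer closed the connection
            break
        total += n
        if touch:
            # Access the bytes that were just received, rather than
            # touching the buffer before the next receive call fills it.
            checksum = (checksum + sum(view[:n])) & 0xFFFFFFFF
    return total, checksum


def demo():
    """Send a small payload over a local socket pair and receive it."""
    sender, receiver = socket.socketpair()
    payload = bytes(range(256)) * 16  # 4096 bytes
    sender.sendall(payload)
    sender.close()
    total, checksum = recv_and_touch(receiver)
    receiver.close()
    return total, checksum, sum(payload) & 0xFFFFFFFF
```

Calling `recv_and_touch(sock, touch=False)` gives the "receiver that never looks at the data" case, so the two modes can be compared on the same connection.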
Chris Leech wrote:
Netperf2 TOT now accesses the buffer that was just recv()'d rather than
the one that is about to be recv()'d.
We've posted netperf2 results with I/OAT enabled/disabled and the data
access option on/off at
David S. Miller wrote:
The first thing an application is going to do is touch that data. So
I think it's very important to prewarm the caches and the only
straightforward way I know of to always warm up the correct cpu's
caches is copy_to_user().
Hmm, what if the application is something like a
Hi, I'm reposting these, originally posted by Chris Leech a few weeks ago.
However, there is an extra part since I broke up one patch that was too
big for netdev last time into two (patches 2 and 3).
Of course we're always looking for more style improvement comments, but
more importantly we're
On Thu, Apr 20, 2006 at 01:49:16PM -0700, Andrew Grover wrote:
Hi, I'm reposting these, originally posted by Chris Leech a few weeks ago.
However, there is an extra part since I broke up one patch that was too
big for netdev last time into two (patches 2 and 3).
Of course we're always
On Thu, Apr 20, 2006 at 03:14:15PM -0700, Andrew Grover wrote:
Hah, I was just writing an email covering those. I'll incorporate that
into this response.
On 4/20/06, Olof Johansson [EMAIL PROTECTED] wrote:
I guess the overall question is, how much of this needs to be addressed
in the
From: Andrew Grover [EMAIL PROTECTED]
Date: Thu, 20 Apr 2006 15:14:15 -0700
First obviously it's a technology for RX CPU improvement so there's no
benefit on TX workloads. Second it depends on there being buffers to
copy the data into *before* the data arrives. This happens to be the
case for
From: Olof Johansson [EMAIL PROTECTED]
Date: Thu, 20 Apr 2006 18:33:43 -0500
On Thu, Apr 20, 2006 at 03:14:15PM -0700, Andrew Grover wrote:
In
addition, there may be workloads (file serving? backup?) where we
could do a skb-page-in-page-cache copy and avoid cache pollution?
Yes, NFS is
Unfortunately, many benchmarks just do raw bandwidth tests sending to
a receiver that just doesn't even look at the data. They just return
from recvmsg() and loop back into it. This is not what applications
using networking actually do, so it's important to make sure we look
intelligently at
David S. Miller wrote:
From: Andrew Grover [EMAIL PROTECTED]
Date: Thu, 20 Apr 2006 15:14:15 -0700
First obviously it's a technology for RX CPU improvement so there's no
benefit on TX workloads. Second it depends on there being buffers to
copy the data into *before* the data arrives. This
From: Rick Jones [EMAIL PROTECTED]
Date: Thu, 20 Apr 2006 18:00:37 -0700
Actually, that brings-up a question - presently, and for reasons that
are lost to me in the mists of time - netperf will access the buffer
before it calls recv(). I'm wondering if that should be changed to an
access
David S. Miller [EMAIL PROTECTED] wrote:
For I/O AT you'd really want to get the DMA engine going as soon
as you had those packets, but I do not see a clean and reliable way
to determine the target pages before the app gets back to recvmsg().
The vmsplice() system call proposed by Linus
On Thu, Apr 20, 2006 at 05:27:42PM -0700, David S. Miller wrote:
From: Olof Johansson [EMAIL PROTECTED]
Date: Thu, 20 Apr 2006 16:33:05 -0500
From the wiki:
3. Data copied by I/OAT is not cached
This is an I/OAT device limitation and not a global statement of the
DMA
On Thu, Apr 20, 2006 at 05:44:38PM -0700, David S. Miller wrote:
From: Olof Johansson [EMAIL PROTECTED]
Date: Thu, 20 Apr 2006 18:33:43 -0500
On Thu, Apr 20, 2006 at 03:14:15PM -0700, Andrew Grover wrote:
In
addition, there may be workloads (file serving? backup?) where we
could do
From: Olof Johansson [EMAIL PROTECTED]
Date: Thu, 20 Apr 2006 22:04:26 -0500
On Thu, Apr 20, 2006 at 05:27:42PM -0700, David S. Miller wrote:
Besides the control overhead of the DMA engines, the biggest thing
lost in my opinion is the perfect cache warming that a cpu based copy
does from
On Thu, Apr 20, 2006 at 08:42:00PM -0700, David S. Miller wrote:
This is basically why none of the performance gains add up to me. I
am thus very concerned that the current non-cache-warming
implementation may fall flat performance-wise.
Ok, I buy your arguments. It does seem unlikely that a