Alan,
On Wed, Oct 23, 2002 at 01:40:09AM +0100, José Fonseca wrote:
On Wed, Oct 23, 2002 at 01:01:39AM +0100, Alan Cox wrote:
... I would expect to want to prefetch the input data (please trust
copy_from_user to do this right, it doesn't do a good job yet but it's
the business of that code to do
On Wed, 30 Oct 2002, José Fonseca wrote:
But it doesn't seem I can get away from assembly due to the exception
table. So the only way to do it portably is to call __copy_user
inside my routine for every read, or do you have any other suggestion
you can give me?
If you want to do this a
On Wed, 2002-10-30 at 22:05, José Fonseca wrote:
http://kernelnewbies.org/documents/copy_user/ . But although I do
understand the assembly implementation and I actually plan to do an
assembly optimized version myself, I would like to start with a plain C
implementation that would be platform
On Wed, Oct 23, 2002 at 01:01:39AM +0100, Alan Cox wrote:
...
Prefetching tends to be a win. What to prefetch is a harder question
normally solved by benchmarking. When the card does DMA access to a
buffer it will suck it from the processor L2 caches.
Please define "suck" in this scenario.
On Wed, 2002-10-23 at 02:34, Keith Packard wrote:
The problem with cached writes is that each cacheline will be brought into
cache when the first write is issued. If the memory is across the PCI
Even the K6 has stuff to work on a partial cache line while filling it.
On Wed, 2002-10-23 at 07:53, Philip Brown wrote:
From my driver coding in non-linux areas, I was always taught that
the onus is on the OS to flush the various related areas of
processor cache, before DMA takes place.
On the PC it is handled in hardware. On SPARC hardware it's at least partially
On Tuesday 22 October 2002 03:00 pm, you wrote:
But wouldn't it be nice to allow the graphics card to directly access
the data from user space ?
It seems to defeat the whole point of DMA, if you have to do multiple
copies of the data.
DMA allows you to do other things while the display chip's
On Tue, Oct 22, 2002 at 03:13:09PM -0500, Frank C. Earl wrote:
On Tuesday 22 October 2002 03:00 pm, you wrote:
But wouldn't it be nice to allow the graphics card to directly access
the data from user space ?
It seems to defeat the whole point of DMA, if you have to do multiple
copies of
On Tue, 2002-10-22 at 21:13, Frank C. Earl wrote:
On Tuesday 22 October 2002 03:00 pm, you wrote:
But wouldn't it be nice to allow the graphics card to directly access
the data from user space ?
It seems to defeat the whole point of DMA, if you have to do multiple
copies of the data.
Around 13 o'clock on Oct 22, Ian Romanick wrote:
I would recommend an audit then a copy, with the DMA buffer being setup as
non-cached, write combining memory (like AGP mapped memory).
Non-cached write combining memory doesn't go through the regular cache
buffers and is significantly slower
Around 21 o'clock on Oct 22, Alan Cox wrote:
Unmapping something, especially with threaded apps on SMP boxes, is
really really expensive. If you do the audit as you copy the data it may
well actually cost no more than a single copy
How expensive is it on a uniprocessor system? Copying data
On Tue, 2002-10-22 at 21:51, Keith Packard wrote:
How expensive is it on a uniprocessor system? Copying data prior to DMA
is not free, especially if the buffers span a significant fraction of
the cache size.
It's actually very hard to measure, because the impact is felt down the
line not
On Tue, Oct 22, 2002 at 01:48:35PM -0700, Keith Packard wrote:
Around 13 o'clock on Oct 22, Ian Romanick wrote:
I would recommend an audit then a copy, with the DMA buffer being setup as
non-cached, write combining memory (like AGP mapped memory).
Non-cached write combining memory
On Tue, Oct 22, 2002 at 10:24:04PM +0100, Alan Cox wrote:
[...]
I can believe there are objects in 3D rendering big enough to be worth
mapping but I'd be guessing to name a size. Linus as a chip hacker might
actually have more detailed numbers.
At least with the Mach64 it is very rare for this
On Wed, 2002-10-23 at 00:19, José Fonseca wrote:
Is it necessary to copy all the data then DMA it or can you pipeline it
so that the DMA is writing out some of the cache while you copy data in
and verify it ?
I'm not sure what you mean by cache above, but the Mach64 has a ring
buffer
On Wed, Oct 23, 2002 at 01:01:39AM +0100, Alan Cox wrote:
On Wed, 2002-10-23 at 00:19, José Fonseca wrote:
[...]
I'm not sure what you mean by cache above, but the Mach64 has a ring
buffer with all the pending DMA buffers, so there will be DMA transfer
simultaneously with the copy/verify, but
Around 1 o'clock on Oct 23, Alan Cox wrote:
Uncached writes on PC hardware are almost always a complete loss. You
want the writeback caching so you are writing to the PCI bridge or sdram
in the largest chunk sizes possible.
The problem with cached writes is that each cacheline will be
On Mon, Oct 21, 2002 at 11:27:15AM -0400, Leif Delgass wrote:
[...]
I have a pretty good idea how to do the verification (just checking the
register count and offset range of each command, skipping the data), but
I'm not sure if it'll be faster to copy as we verify, or memcpy the entire
buffer
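
The "copy as we verify" option discussed in this thread can be sketched
roughly as below. This is only an illustration under stated assumptions:
the command layout (header word holding a register offset and a count,
followed by that many data words), the names verify_and_copy, REG_MIN,
REG_MAX and MAX_COUNT, and the limits themselves are all hypothetical and
are not taken from the Mach64 driver. It is plain user-space C; a kernel
version would have to read the source through copy_from_user or similar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command layout (an assumption, not the real Mach64 format):
 *   header word = ((count - 1) << 16) | register offset
 * followed by `count` data words. */
#define REG_MIN   0x0100u  /* hypothetical lowest register a client may write  */
#define REG_MAX   0x03ffu  /* hypothetical highest register a client may write */
#define MAX_COUNT 64u      /* hypothetical cap on data words per command       */

/* Copy src to dst one command at a time, checking only each header
 * (register count and offset range) and skipping over the data words.
 * Returns the number of words copied, or -1 if a command is rejected. */
long verify_and_copy(uint32_t *dst, const uint32_t *src, size_t nwords)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);

    while (i < nwords) {
        /* Read the header once; the checked value is the one written to
         * dst, so a later change to src cannot bypass the check. */
        uint32_t header = src[i];
        uint32_t reg    = header & 0xffffu;
        uint32_t count  = (header >> 16) + 1u;
        size_t k;

        if (count > MAX_COUNT)
            return -1;                      /* too many registers */
        if (reg < REG_MIN || reg + count - 1u > REG_MAX)
            return -1;                      /* outside allowed range */
        if (count > nwords - i - 1)
            return -1;                      /* data runs past the buffer */

        dst[i] = header;
        for (k = 0; k < count; k++)         /* data is copied unchecked */
            dst[i + 1 + k] = src[i + 1 + k];

        i += 1 + count;
    }
    return (long)i;
}
```

Note that a rejected buffer may already be partly copied into dst, so the
caller must not queue it for DMA unless the whole call succeeds. Whether
this is faster than a plain memcpy followed by a separate verify pass is
exactly the open question in the thread and would need benchmarking.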