On Thursday 12 July 2007, Timothy Normand Miller wrote:
> In testing and experimenting, I noticed something strange, and I was
> wondering if anyone could help shed some light on this.
>
> Keep in mind that OGD1 does not support DMA at this time, so
> everything we do is simple PIO access to the card.  If I use the test
> "x11perf -putimage500", I get a result of about 99/sec.  That
> translates to a bus throughput of about 94 megabytes/sec.
>
> If, on the other hand, I use memcpy, I only get about 24 megs/sec.
>
> What could possibly be making my code so much slower than theirs?
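
For reference, the 94 megabytes/sec figure quoted above is consistent with 500x500 images at 4 bytes per pixel, counted in binary megabytes. The pixel depth is an assumption (the original message does not state it), and the helper names below are hypothetical. A minimal sketch of the arithmetic:

```c
#include <stdint.h>

/* Bytes moved over the bus per second for a given image size and rate.
 * bytes_per_pixel = 4 is an assumption, not stated in the thread. */
static uint64_t bytes_per_second(uint32_t width, uint32_t height,
                                 uint32_t bytes_per_pixel,
                                 uint32_t images_per_second)
{
    return (uint64_t)width * height * bytes_per_pixel * images_per_second;
}

/* Convert a byte count to binary megabytes (1 MiB = 1048576 bytes). */
static double to_mib(uint64_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}
```

Under those assumptions, bytes_per_second(500, 500, 4, 99) gives 99,000,000 bytes/sec, which to_mib() turns into roughly 94.4.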

My guess is that you're not using the right cache/write-combining settings.

I don't remember the details, but a combination of MTRR and page table bits 
determines the cache and ordering constraints for accessing a particular 
memory region.

For control regions you want to disable both caching and write combining (to 
avoid needing an explicit memory barrier after every access), so I'd guess 
this is probably what Linux does by default.
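
One way to experiment with this from user space, sketched here under stated assumptions: on Linux kernels that provide it, a prefetchable PCI BAR is exposed in sysfs as a resourceN_wc file, and mmap() on that file yields a write-combining mapping, while the plain resourceN file gives an uncached one. Whether the kernel in use offers the _wc file, and whether the card's BAR qualifies, are assumptions to check. The function name map_bar is hypothetical.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `len` bytes of a PCI BAR through a sysfs resource file.
 * Passing a "resourceN_wc" path requests write combining on kernels
 * that provide that file; this is an assumption about the system.
 * Returns NULL on any failure. */
static volatile void *map_bar(const char *sysfs_path, size_t len)
{
    int fd = open(sysfs_path, O_RDWR);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */

    return p == MAP_FAILED ? NULL : (volatile void *)p;
}
```

A caller would pass something like /sys/bus/pci/devices/0000:01:00.0/resource0_wc, where the device address is only a placeholder, and then time the same copy loop against both the _wc and the plain mapping to see whether write combining accounts for the difference.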

Paul
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
