Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.

Christian Iversen Fri, 29 Feb 2008 06:47:30 -0800

Daniël Mantione wrote:

On Fri, 29 Feb 2008, Christian Iversen wrote:
Memory access. What happens is that the non-packed version causes more cache misses. A cache miss costs many cycles on a modern CPU, whereas a misaligned read just costs an extra memory access (which is fast if cached) on x86, and an extra load instruction on ARM. This is much cheaper than a cache miss.
It's much worse than that. Some architectures simply _can't_ do unaligned access, and they will trigger an exception.
In many configurations this exception will be caught by the OS, which might then simulate the read by doing two reads, putting the result together, writing it into the application's memory, and doing a task switch.
This, in total, is several _orders of magnitude_ worse than unaligned access on a supported platform.
Of course, unaligned access in itself is pretty bad.
True, but irrelevant, because the discussion was under the assumption that an unaligned read is done using the "unaligned" pseudo function. Unless there is a bug in the compiler, the use of "unaligned" will never cause an exception.


Oh, you're right of course. I didn't catch that part of the argument.

Instead, "unaligned" will simulate an unaligned load with two loads and some rotation etc. On the ARM, where every mnemonic can rotate operands, this isn't that bad of a penalty.
Therefore, I wouldn't be surprised if, even on ARM, arrays of packed structures are faster than arrays of unpacked structures.


That's possible. Why would it be faster, btw? Better cache coherency?

--
Kind regards
Christian Iversen
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
