On 5/16/11 12:26 PM, Jeremy Huddleston wrote:
> Is the one div needed for:
>
> bpp / 8
> bpp % 8
>
> really universally faster than the two bitwise ops needed for
>
> bpp >> 3
> bpp & 0x7
>
> ?  I'm sure most modern compilers will know how to optimize that
> based on the target CPU, but I've always tried to avoid doing mults
> and divs in fast paths where possible.

Even if it's ten cycles slower, I'm going to wager it pales next to the hundreds-to-millions of cycles of memcpy.
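
For reference, a minimal sketch of the two forms being compared. The helper names are hypothetical and not taken from the server source, and it assumes bpp is unsigned. For unsigned operands the divide/modulo pair and the shift/mask pair give identical results, which is why a compiler can usually lower one to the other. For signed operands they differ on negative values, so the equivalence only holds as written here.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; bpp is assumed unsigned. */

/* Whole bytes in bpp bits, written with a divide. */
static unsigned bytes_div(unsigned bpp)   { return bpp / 8; }

/* Leftover bits, written with a modulo. */
static unsigned bits_mod(unsigned bpp)    { return bpp % 8; }

/* Same two quantities written with bitwise ops. */
static unsigned bytes_shift(unsigned bpp) { return bpp >> 3; }
static unsigned bits_mask(unsigned bpp)   { return bpp & 0x7; }

/* Asserts that both forms agree for every bpp in 0..max. */
static void check_forms_agree(unsigned max)
{
    for (unsigned bpp = 0; bpp <= max; bpp++) {
        assert(bytes_div(bpp) == bytes_shift(bpp));
        assert(bits_mod(bpp) == bits_mask(bpp));
    }
}
```

Calling check_forms_agree(128) covers every depth likely to matter here. Whether the two forms also cost the same is up to the compiler and target, and is not something this sketch measures.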

- ajax
_______________________________________________
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel
