Here's a simple patch that speeds up glReadPixels in 24-bit mode by
about 50% (on my box). I measured using the benchmark
built into progs/demos/readpix. The interesting thing is that the core
MMX & SSE2 routines can be used for other cards as well. For example,
it looks like MGA, Unichrome, and others can use the same code for 24-bit.
Before pursuing this too far, I'd like to look at ways to make the
*compiled* code from spantmp.h be more device-independent. That would
make it easier to generate a bunch of these generic routines and just
plug them in.
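
To illustrate the "generic routines you just plug in" idea, here is a minimal sketch in C. It is not taken from the patch: the names read_span_fn, read_span_argb8888_c and choose_read_span are hypothetical, and it assumes that "24-bit" means pixels stored as packed 0xAARRGGBB words that are read back as R, G, B, A bytes. Only the scalar fallback is shown; an MMX or SSE2 version would have the same signature and be returned by the selector instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device-independent signature: convert n framebuffer
 * pixels from src into n RGBA byte quadruples in dst. */
typedef void (*read_span_fn)(const uint32_t *src, uint8_t (*dst)[4], size_t n);

/* Scalar fallback. Assumes each source pixel is a packed 0xAARRGGBB word. */
static void read_span_argb8888_c(const uint32_t *src, uint8_t (*dst)[4], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t p = src[i];
        dst[i][0] = (uint8_t)(p >> 16);  /* red   */
        dst[i][1] = (uint8_t)(p >> 8);   /* green */
        dst[i][2] = (uint8_t)p;          /* blue  */
        dst[i][3] = (uint8_t)(p >> 24);  /* alpha */
    }
}

/* Hypothetical selector. A driver would call this once at setup time and
 * store the result. The CPU feature flags are ignored in this sketch
 * because only the scalar routine exists here. */
static read_span_fn choose_read_span(int have_sse2, int have_mmx)
{
    (void)have_sse2;
    (void)have_mmx;
    return read_span_argb8888_c;
}
```

Because the routine only sees a source pointer, a destination pointer and a count, nothing in it depends on a particular card; the driver would be responsible for working out where each row of the framebuffer starts.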
r200_readpixels-01.tar.bz2
Description: BZip2 compressed data