Matt Turner <matts...@gmail.com> writes:

> The registers -- yes. The 8-byte aligned loads and stores I'm not
> sure. Can you do 8-byte aligned loads and stores to/from SSE
> registers?

I believe movq can use SSE registers.

> Indeed, runtime generation would be great. Something like LLVM or orc
> would be interesting options. I'm not sure I'm up to that kind of
> project yet/now though.
>
> I think adding pixman-sse*.c files is a reasonable measure for now.
> Think it's okay to split the static inline support functions from
> pixman-sse2.c out into a header to be shared with the other
> pixman-sse*.c files?

Sounds reasonable to me.

> Also, are we planning to change the bilinear scaling algorithm for
> 0.28 so that we can use pmaddubsw?

I wouldn't object to a patch that dropped precision to 7 bits for all
bilinear code, but it would require changes at least to the general
code, the fast path code, the NEON code and the SSE2 code.

An alternative idea: instead of changing the algorithm across the
board, we could stop requiring bit-exact results. The main piece of
work here is to change the test suite so that it will accept pixels up
to some maximum relative error. There is already some support for
this: the 'composite' test uses 'pixel_checker_t' to compare the
pixman output with perfect pixels computed in double precision.

This latter idea is ultimately more useful because it will allow much
more flexibility in the kinds of SIMD instruction sets we can support.

Søren
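
To make the tolerance idea concrete, here is a rough sketch of what
"accept pixels up to some maximum error" could look like for a single
a8r8g8b8 pixel. It is not the pixel_checker_t API: the function names,
the argument layout and the half-step rounding allowance are all
assumptions made for illustration, and for simplicity the tolerance is
an absolute deviation on the normalized [0, 1] channel value rather
than a true relative error.

```c
#include <stdint.h>

/* Hypothetical helper (not part of pixman): check one 8-bit channel
 * against an ideal value in [0, 1] computed in double precision.
 * Half an 8-bit step is always allowed, since even a perfect
 * implementation has to round to 8 bits; max_error is added on top.
 */
static int
channel_within_tolerance (uint8_t actual, double reference, double max_error)
{
    double diff = actual / 255.0 - reference;

    if (diff < 0.0)
        diff = -diff;

    return diff <= max_error + 0.5 / 255.0;
}

/* Hypothetical helper: check a whole a8r8g8b8 pixel. The reference
 * array is assumed to be ordered alpha, red, green, blue.
 */
static int
pixel_within_tolerance (uint32_t     actual,
                        const double reference[4],
                        double       max_error)
{
    int i, shift;

    for (i = 0, shift = 24; i < 4; i++, shift -= 8)
    {
        uint8_t channel = (actual >> shift) & 0xff;

        if (!channel_within_tolerance (channel, reference[i], max_error))
            return 0;
    }

    return 1;
}
```

With a check of this shape, an implementation that is off by a bit or
so in a channel (for example from 7-bit bilinear weights) would still
pass as long as it stays inside the chosen max_error, while a grossly
wrong pixel would still be rejected.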