[Pixman] [PATCH 1/2] MIPS: DSPr2: Added over_n_8_8888 and over_n_8_0565 fast paths.

2012-05-02 Nemanja Lukic
From: Nemanja Lukic nemanja.lu...@rt-rk.com

Performance numbers before/after on MIPS-74kc @ 1GHz

Reference (before):
lowlevel-blt-bench:
  over_n_8_ = L1: 10.40 L2: 9.79 M: 8.47 ( 33.62%) HT: 7.64 VT: 7.59 R: 7.48 RT: 5.30 ( 40Kops/s)
  over_n_8_0565 = L1: 7.40 L2:

[Pixman] [PATCH] sse2: Using MMX and SSE 4.1

2012-05-02 Matt Turner
I started porting my src__0565 MMX function to SSE2, and in the process started thinking about using SSE3+. The useful instructions added post SSE2 that I see are

  SSE3: lddqu - for unaligned loads across cache lines
  SSSE3: palignr - for unaligned loads (but requires software