Hi Soren,

You are right, this is a performance regression, and I have made a fix for it. The case where srca == 0xff was not handled at all in these two routines. It only showed up in over_n_8888 because the MIPS fast paths use the OVER combiner for 32-bit destinations. I'll push another commit for these two fast paths.
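To illustrate the special case, here is a rough sketch in plain C. This is not the DSPr2 assembly from the patch. The names over_n_8888_sketch and mul_un8x4 are made up for this example, and it assumes a premultiplied a8r8g8b8 solid source. When srca == 0xff the OVER result is just the source pixel, so the destination never needs to be read:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply each 8-bit channel of x by a, i.e. round(x * a / 255),
 * handling two channels at a time (hypothetical helper). */
static uint32_t mul_un8x4(uint32_t x, uint8_t a)
{
    uint32_t rb = (x & 0x00ff00ff) * a + 0x00800080;
    rb = ((rb + ((rb >> 8) & 0x00ff00ff)) >> 8) & 0x00ff00ff;

    uint32_t ag = ((x >> 8) & 0x00ff00ff) * a + 0x00800080;
    ag = (ag + ((ag >> 8) & 0x00ff00ff)) & 0xff00ff00;

    return rb | ag;
}

/* OVER with a solid source: dst = src + dst * (255 - srca) / 255.
 * Assumes src is premultiplied, so the per-channel add cannot overflow. */
static void over_n_8888_sketch(uint32_t *dst, size_t n, uint32_t src)
{
    uint8_t srca = (uint8_t)(src >> 24);
    size_t i;

    assert(dst != NULL || n == 0);

    if (srca == 0)
        return;                 /* fully transparent: dst is unchanged */

    if (srca == 0xff) {
        for (i = 0; i < n; i++) /* fully opaque: plain fill, no dst read */
            dst[i] = src;
        return;
    }

    for (i = 0; i < n; i++)     /* general case: blend with inverse alpha */
        dst[i] = src + mul_un8x4(dst[i], (uint8_t)(255 - srca));
}
```

Without the srca == 0xff branch, an opaque source still goes through the read, multiply and add path for every pixel, which matters most when the data sits in L1 or L2.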
Current results look like this:

Referent (before):
over_n_0565 = L1: 14.48 L2: 21.36 M: 17.57 ( 23.30%) HT: 6.95
              VT: 6.44 R: 6.39 RT: 2.16 ( 22Kops/s)
over_n_8888 = L1: 92.60 L2: 86.13 M: 24.41 ( 64.74%) HT: 8.94
              VT: 8.06 R: 8.00 RT: 2.53 ( 25Kops/s)

Optimized:
over_n_0565 = L1: 27.65 L2: 189.22 M: 58.19 ( 77.12%) HT: 52.80
              VT: 49.88 R: 47.53 RT: 23.67 ( 72Kops/s)
over_n_8888 = L1: 235.99 L2: 230.86 M: 29.09 ( 77.11%) HT: 27.95
              VT: 27.24 R: 26.58 RT: 18.10 ( 67Kops/s)

Thanks,
Nemanja Lukic

-----Original Message-----
From: Søren Sandmann [mailto:sandm...@cs.au.dk]
Sent: Monday, November 05, 2012 7:39 PM
To: Lukic, Nemanja
Cc: pixman@lists.freedesktop.org; nemanja.lu...@rt-rk.com
Subject: Re: [Pixman] [PATCH 3/3] MIPS: DSPr2: Added more fast-paths for OVER operation:

Nemanja Lukic <nlu...@mips.com> writes:

> From: Nemanja Lukic <nemanja.lu...@rt-rk.com>
>
> Performance numbers before/after on MIPS-74kc @ 1GHz:
>
> lowlevel-blt-bench results
>
> Referent (before):
> over_n_0565 = L1: 12.04 L2: 21.45 M: 18.50 ( 24.55%) HT: 6.93
> VT: 6.45 R: 6.38 RT: 2.16 ( 22Kops/s)

This one:

> over_n_8888 = L1: 93.76 L2: 85.96 M: 24.41 ( 64.78%) HT: 8.93
> VT: 8.08 R: 7.99 RT: 2.54 ( 25Kops/s)
> over_n_8888 = L1: 55.31 L2: 49.07 M: 28.60 ( 75.93%) HT: 23.99
> VT: 22.95 R: 22.34 RT: 12.85 ( 61Kops/s)

looks like a performance regression for L1 and L2?

Søren
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman