On Tue, Jan 29, 2013 at 11:21 AM, Siarhei Siamashka wrote:
> > +if (BILINEAR_INTERPOLATION_BITS < 8)
> > +{
> > + const __m128i xmm_xorc7 = _mm_set_epi16 (0, BMSK, 0, BMSK, 0, BMSK, 0, BMSK);
> > + const __m128i xmm_addc7 = _mm_set_epi16 (0, 1, 0, 1, 0, 1, 0, 1);
> > + con
Siarhei Siamashka writes:
> As for the affine transforms, they really depend on accessing memory
> in a cache-friendly way.
A simple experiment would be to switch to a tiled access pattern in
pixman-general.c and measure the performance impact.
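
To illustrate what a tiled access pattern means here, the following is a
minimal standalone sketch, not code from pixman-general.c. It assumes a
90 degree rotation as the simplest affine transform, and the names
rotate90_naive, rotate90_tiled and TILE are hypothetical. Both functions
produce the same output; the tiled one visits the destination in small
square blocks so that the source columns it reads stay in cache longer.

```c
#include <assert.h>
#include <stdint.h>

#define TILE 16 /* hypothetical tile edge, in pixels */

/* Rotate src (w x h) 90 degrees clockwise into dst (h x w), walking the
 * destination one full scanline at a time. Each destination scanline
 * reads a whole source column, i.e. one pixel per source scanline. */
static void rotate90_naive(const uint32_t *src, uint32_t *dst, int w, int h)
{
    assert(w > 0 && h > 0);
    for (int y = 0; y < w; y++)       /* destination rows */
        for (int x = 0; x < h; x++)   /* destination columns */
            dst[y * h + x] = src[(h - 1 - x) * w + y];
}

/* Same result, but the destination is walked in TILE x TILE blocks, so
 * each block touches at most TILE source scanlines and TILE source
 * columns before moving on. */
static void rotate90_tiled(const uint32_t *src, uint32_t *dst, int w, int h)
{
    assert(w > 0 && h > 0);
    for (int ty = 0; ty < w; ty += TILE) {
        for (int tx = 0; tx < h; tx += TILE) {
            int y_end = ty + TILE < w ? ty + TILE : w;
            int x_end = tx + TILE < h ? tx + TILE : h;
            for (int y = ty; y < y_end; y++)
                for (int x = tx; x < x_end; x++)
                    dst[y * h + x] = src[(h - 1 - x) * w + y];
        }
    }
}
```

Whether this helps in practice depends on image size, tile size and the
cache hierarchy, which is why it would have to be measured.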
> I w
Siarhei Siamashka wrote:
Going forward, we also need to add support for separable bilinear
scaling (first horizontal interpolation for single scanlines to
temporary buffers in L1 cache, then vertical interpolation of these
buffers to get the final result). Unless I misunderstood something,
Soere
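
The two-pass scheme described above (horizontal interpolation of single
scanlines into temporary buffers, then vertical interpolation of those
buffers) could be sketched roughly as follows. This is only an
illustration under stated assumptions, not pixman code: it handles one
8-bit channel, uses 16.16 fixed-point coordinates and 7-bit weights, and
the names hscale_line, vblend_lines, scale_bilinear_separable, WBITS and
MAX_W are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WBITS 7     /* assumed number of interpolation weight bits */
#define MAX_W 4096  /* assumed upper bound on destination width */

static int min_int(int a, int b) { return a < b ? a : b; }

/* Horizontal pass: resample one source scanline into dst_w intermediate
 * values. x and ux are 16.16 fixed point. Results keep WBITS extra bits. */
static void hscale_line(const uint8_t *src, int src_w, uint16_t *tmp,
                        int dst_w, uint32_t x, uint32_t ux)
{
    for (int i = 0; i < dst_w; i++, x += ux) {
        int x0 = min_int((int)(x >> 16), src_w - 1);
        int x1 = min_int(x0 + 1, src_w - 1);
        int wx = (int)((x >> (16 - WBITS)) & ((1 << WBITS) - 1));
        tmp[i] = (uint16_t)(src[x0] * ((1 << WBITS) - wx) + src[x1] * wx);
    }
}

/* Vertical pass: blend two horizontally scaled scanlines into 8-bit output. */
static void vblend_lines(const uint16_t *top, const uint16_t *bot,
                         uint8_t *dst, int dst_w, int wy)
{
    for (int i = 0; i < dst_w; i++) {
        uint32_t v = top[i] * (uint32_t)((1 << WBITS) - wy)
                   + bot[i] * (uint32_t)wy;
        dst[i] = (uint8_t)(v >> (2 * WBITS));
    }
}

/* Driver: keeps two scaled scanlines in small temporary buffers and only
 * redoes the horizontal pass when a new source row is needed. */
static void scale_bilinear_separable(const uint8_t *src, int src_w, int src_h,
                                     int src_stride, uint8_t *dst, int dst_w,
                                     int dst_h, int dst_stride,
                                     uint32_t ux, uint32_t uy)
{
    uint16_t line[2][MAX_W];
    int cached[2] = { -1, -1 };  /* source row held in each buffer */
    uint32_t y = 0;

    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_w <= MAX_W);
    for (int j = 0; j < dst_h; j++, y += uy) {
        int y0 = min_int((int)(y >> 16), src_h - 1);
        int y1 = min_int(y0 + 1, src_h - 1);
        int wy = (int)((y >> (16 - WBITS)) & ((1 << WBITS) - 1));
        int rows[2] = { y0, y1 };

        for (int k = 0; k < 2; k++) {
            int r = rows[k], slot = r & 1;  /* adjacent rows use different slots */
            if (cached[slot] != r) {
                hscale_line(src + r * src_stride, src_w, line[slot],
                            dst_w, 0, ux);
                cached[slot] = r;
            }
        }
        vblend_lines(line[y0 & 1], line[y1 & 1], dst + j * dst_stride,
                     dst_w, wy);
    }
}
```

When upscaling, consecutive destination rows often map to the same pair
of source rows, so the horizontal pass is skipped for most of them and
only the cheaper vertical blend runs.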
On Sun, 27 Jan 2013 14:10:27 +
Chris Wilson wrote:
> On an SNB i5-2500 using cairo-image:
>
> firefox-canvas 17.8 -> 10.3: 1.72x speedup
> firefox-tron 46.3 -> 28.4: 1.63x speedup
> swfdec-youtube 1.7 -> 1.4: 1.22x speedup
> firefox-fishbowl 64.6 -> 53.7: 1.
On an SNB i5-2500 using cairo-image:
firefox-canvas 17.8 -> 10.3: 1.72x speedup
firefox-tron 46.3 -> 28.4: 1.63x speedup
swfdec-youtube 1.7 -> 1.4: 1.22x speedup
firefox-fishbowl 64.6 -> 53.7: 1.20x speedup
firefox-paintball 40.8 -> 36.8: 1.11x speedup
firefo