On Tuesday 22 February 2011 23:23:48 you wrote:
> From: Siarhei Siamashka <siarhei.siamas...@nokia.com>
> 
> Initial NEON optimization for bilinear scaling. Can be probably
> improved more.
> 
> Benchmark on ARM Cortex-A8:
>  Microbenchmark (scaling 2000x2000 image with scale factor close to 1x):
>   before: op=1, src=20028888, dst=20028888, speed=10.72 MPix/s
>   after:  op=1, src=20028888, dst=20028888, speed=44.27 MPix/s

And indeed, just adding prefetch to the bilinear scaling code gives roughly
1.5x better performance on top of that. I'll try to make a separate patch
adding prefetch after testing how well it performs for different scale
factors.

Interestingly, prefetch did not help in the nearest scaling case, probably
because the LSU (load/store unit) was already overloaded with handling many
scattered memory accesses (or maybe because I did something wrong that
time). Bilinear scaling, however, also has a number-crunching part, so the
prefetches can overlap with computation. Adding prefetch there really
improves memory bandwidth utilization and provides a nice performance boost.
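
To illustrate the idea, here is a rough sketch in plain C. It is not the
NEON assembly from the patch, only a scalar model of the same loop shape:
each output pixel costs a few loads plus some arithmetic, and a prefetch
hint is issued for source data that will be needed a bit later. The names
scale_bilinear_scanline, lerp_argb and PREFETCH_AHEAD are made up for this
example, and the prefetch distance is an untuned guess. __builtin_prefetch
is the GCC/Clang builtin; the real code would more likely use the PLD
instruction directly.

```c
#include <stdint.h>

/* Hypothetical prefetch distance in source pixels; needs tuning per CPU. */
#define PREFETCH_AHEAD 64

/* Blend two a8r8g8b8 pixels channel by channel, weight w in 0..256. */
static uint32_t lerp_argb(uint32_t p, uint32_t q, uint32_t w)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t a = (p >> shift) & 0xff;
        uint32_t b = (q >> shift) & 0xff;
        out |= (((a * (256 - w) + b * w) >> 8) & 0xff) << shift;
    }
    return out;
}

/*
 * Scale one scanline. x and ux are 16.16 fixed-point source position and
 * step; wy (0..256) is the vertical weight of the bottom row. The caller
 * must make sure that index (x >> 16) + 1 stays inside both source rows.
 */
void scale_bilinear_scanline(uint32_t *dst,
                             const uint32_t *top, const uint32_t *bottom,
                             int width, uint32_t wy,
                             uint32_t x, uint32_t ux)
{
    for (int i = 0; i < width; i++) {
        uint32_t xi = x >> 16;           /* integer source column      */
        uint32_t wx = (x >> 8) & 0xff;   /* top 8 bits of the fraction */

        /* Hint only: ask for data we will read a few iterations from
         * now, so the fetch overlaps with the arithmetic below. Real
         * code would issue this once per cache line, not per pixel. */
        __builtin_prefetch(top + xi + PREFETCH_AHEAD, 0, 0);
        __builtin_prefetch(bottom + xi + PREFETCH_AHEAD, 0, 0);

        uint32_t t = lerp_argb(top[xi], top[xi + 1], wx);
        uint32_t b = lerp_argb(bottom[xi], bottom[xi + 1], wx);
        dst[i] = lerp_argb(t, b, wy);

        x += ux;
    }
}
```

In the nearest scaling case the loop body has almost no arithmetic to hide
the extra memory traffic behind, which would be consistent with prefetch
not helping there.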

-- 
Best regards,
Siarhei Siamashka
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman