On Thu, Aug 29, 2013 at 10:02 AM, Søren Sandmann Pedersen
wrote:
> This commit adds a new, empty SSSE3 implementation and the associated
> build system support.
>
> configure.ac: detect whether the compiler understands SSSE3
> intrinsics and set up the required CFLAGS
>
> Makefil
On Thu, 29 Aug 2013 13:02:52 -0400
"Søren Sandmann Pedersen" wrote:
> This commit adds a new, empty SSSE3 implementation and the associated
> build system support.
>
> configure.ac: detect whether the compiler understands SSSE3
> intrinsics and set up the required CFLAGS
>
> M
On Thu, 29 Aug 2013 13:02:53 -0400
"Søren Sandmann Pedersen" wrote:
> This new iterator uses the SSSE3 instructions pmaddubsw and pabsw to
> implement a fast iterator for bilinear scaling.
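For readers unfamiliar with pmaddubsw, here is a rough scalar model of one 16-bit lane of the instruction, plus a hypothetical helper (lerp_7bit, not taken from the patch) that uses it to blend two horizontally adjacent channel values with a 7-bit weight. How the actual patch arranges pixels and weights in registers is not shown in this thread, so treat this only as a sketch of the arithmetic.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of pmaddubsw: two unsigned bytes from
 * the first operand are multiplied by two signed bytes from the second,
 * and the adjacent products are added with signed 16-bit saturation. */
static int16_t
maddubs_lane (uint8_t a0, uint8_t a1, int8_t b0, int8_t b1)
{
    int32_t sum = (int32_t) a0 * b0 + (int32_t) a1 * b1;

    if (sum > INT16_MAX)
        sum = INT16_MAX;
    else if (sum < INT16_MIN)
        sum = INT16_MIN;

    return (int16_t) sum;
}

/* Hypothetical helper (an assumption, not code from the patch): blend
 * two neighbouring channel values with a 7-bit weight. distx must be in
 * 1..127 so that both weights fit in a signed byte. */
static uint8_t
lerp_7bit (uint8_t left, uint8_t right, int distx)
{
    int16_t acc = maddubs_lane (left, right,
                                (int8_t) (128 - distx), (int8_t) distx);

    /* The weights sum to 128, so shift the 7 weight bits back out. */
    return (uint8_t) (acc >> 7);
}
```

With weights that sum to 128 the accumulated value never exceeds 255 * 128 = 32640, so the saturation in the model is not triggered in this use.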
This patch shows some really good performance for upscaling. In fact
even better than I expected. And the t
d a combiner.
And the non-separable bilinear code still seems to be somewhat
competitive for downscaling. But the scaling ratio, where the
separable implementation becomes faster, differs for different
generations of hardware:
http://people.freedesktop.org/~siamashka/files/20130905/
http://peo
On Wed, Sep 4, 2013 at 7:49 PM, Søren Sandmann wrote:
> From: Søren Sandmann Pedersen
>
> The default has been 7-bit for a while now, and the quality
> improvement with 8-bit precision is not enough to justify keeping the
> code around as a compile-time option.
> ---
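To make the 7-bit versus 8-bit distinction concrete, below is a minimal sketch of bilinear interpolation on a single 8-bit channel where the number of weight bits is a parameter. The function name and signature are assumptions made for illustration; this is not pixman's actual code.

```c
#include <stdint.h>

/* Hypothetical helper: interpolate one 8-bit channel between four
 * neighbours (top-left, top-right, bottom-left, bottom-right).
 * distx and disty are fixed-point weights in 0 .. (1 << bits) - 1,
 * where bits is the weight precision (7 or 8 in this discussion). */
static uint8_t
bilinear_channel (uint8_t tl, uint8_t tr, uint8_t bl, uint8_t br,
                  unsigned distx, unsigned disty, unsigned bits)
{
    unsigned one = 1u << bits;

    /* Horizontal pass on the top and bottom rows. */
    uint32_t top = tl * (one - distx) + tr * distx;
    uint32_t bottom = bl * (one - distx) + br * distx;

    /* Vertical pass, then drop both sets of weight bits. */
    uint32_t acc = top * (one - disty) + bottom * disty;

    return (uint8_t) (acc >> (2 * bits));
}
```

With 7 bits there are 128 weight steps per axis, with 8 bits there are 256, so the two precisions can differ by a small amount in the result for the same sample position.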
I'm fine with this change, b
On Wed, 4 Sep 2013 03:12:51 +0300
Siarhei Siamashka wrote:
> The calloc call from pixman_image_create_bits may still
> rely on http://en.wikipedia.org/wiki/Copy-on-write
> Explicitly initializing the destination image results in
> a more predictable behaviour.
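The idea in the quoted text can be sketched with plain libc calls: a calloc'ed buffer may be backed by lazily mapped zero pages, so writing to every byte before the timed run forces the pages to be populated up front. The helper name below is hypothetical and the sketch does not use the pixman API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of pixman): allocate a zeroed 32 bpp
 * pixel buffer, then write to all of it so that the memory is actually
 * touched before any benchmark starts timing. */
static uint32_t *
alloc_touched_pixels (size_t width, size_t height, uint8_t fill)
{
    uint32_t *bits;

    /* Reject empty images and sizes whose byte count would overflow. */
    if (width == 0 || height == 0 ||
        width > SIZE_MAX / sizeof (uint32_t) / height)
        return NULL;

    bits = calloc (width * height, sizeof (uint32_t));
    if (!bits)
        return NULL;

    /* Explicit initialization: every byte of the buffer is written. */
    memset (bits, fill, width * height * sizeof (uint32_t));

    return bits;
}
```

The caller owns the returned buffer and releases it with free().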
A newer revision of this patch. To
s an iterator, primarily
intended for downscaling. Here are some benchmarks, comparing the SSE2
and SSSE3 implementations of src__ fast paths with the
performance of SSSE3 iterator (using the scaling-bench program, which
has been modified to use SRC instead of OVER, and pixman code patched
t