Re: [Pixman] ARMv6: Assorted improvements

2014-03-12 Thread Tomeu Vizoso
On 03/12/2014 09:24 AM, Tomeu Vizoso wrote:
> Hi, I'm resending a few patches that Ben Avison sent last year in March, rebased and make-checked on ARMv6. Just had to make a small change due to a function rename.
Ahem, sorry about that. Have just noticed, while perusing the archives, tha

[Pixman] [PATCH 09/12] ARMv6: Add fast path for in_reverse_8888_8888

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
lowlevel-blt-bench results:
        Before          After
        Mean   StdDev   Mean   StdDev   Confidence   Change
L1      21.3   0.1      32.5   0.2      100.0%       +52.1%
L2      12.1   0.2      19.5   0.5      100.0%       +61.2%
M       11.0   0.0      17.1   0.0      100.0%       +54.6%
HT      8.7

[Pixman] [PATCH 11/12] ARMv6: Add fast path for add_8888_8888

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
lowlevel-blt-bench results:
        Before          After
        Mean   StdDev   Mean    StdDev   Confidence   Change
L1      27.6   0.1      125.9   0.8      100.0%       +356.0%
L2      14.0   0.5      30.8    1.6      100.0%       +120.3%
M       12.2   0.0      26.7    0.1      100.0%       +118.8%
HT      10

[Pixman] [PATCH 10/12] ARMv6: Add fast path for over_reverse_n_8888

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
lowlevel-blt-bench results:
        Before          After
        Mean   StdDev   Mean    StdDev   Confidence   Change
L1      15.0   0.1      276.2   4.0      100.0%       +1743.3%
L2      13.4   0.3      154.8   17.4     100.0%       +1058.0%
M       11.4   0.0      73.7    0.8      100.0%       +549.4%
HT

[Pixman] [PATCH 12/12] ARMv6: Add fast path for src_x888_0565

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
This isn't used in the trimmed cairo-perf-trace tests at all, but these are the lowlevel-blt-bench results:
        Before          After
        Mean   StdDev   Mean    StdDev   Confidence   Change
L1      68.5   1.0      116.3   0.6      100.0%       +69.8%
L2      31.1   1.8      60.9    5.0      10

[Pixman] [PATCH 01/12] ARMv6: Fix some indentation in the composite macros

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
---
 pixman/pixman-arm-simd-asm.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/pixman/pixman-arm-simd-asm.h b/pixman/pixman-arm-simd-asm.h
index 6543606..74400c1 100644
--- a/pixman/pixman-arm-simd-asm.h
+++ b/pixman/pixman-arm-simd-asm.h
@@ -7

[Pixman] [PATCH 05/12] ARMv6: Force fast paths to have fixed alignment to the BTAC

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison Trying to produce repeatable, trustworthy profiling results from the cairo-perf-trace benchmark suite has proved tricky, especially when testing changes that have only a marginal (< ~5%) effect upon the runtime as a whole. One of the problems is that some traces appear to show s

[Pixman] [PATCH 04/12] ARMv6: Add fast path flag to force no preload of destination buffer

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
---
 pixman/pixman-arm-simd-asm.h | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/pixman/pixman-arm-simd-asm.h b/pixman/pixman-arm-simd-asm.h
index e481320..4c08b9e 100644
--- a/pixman/pixman-arm-simd-asm.h
+++ b/pixman/pixman-arm-simd-asm.h
@@

[Pixman] [PATCH 07/12] ARMv6: Macro to permit testing for early returns or alternate implementations

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison When the source or mask is solid (as opposed to a bitmap) there is the possibility of an immediate exit, or a branch to an alternate, more optimal implementation in some cases. This is best achieved with a brief prologue to the function; to permit this, the necessary boilerplate

[Pixman] [PATCH 06/12] Add extra test to lowlevel-blt-bench and fix an existing one

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
in_reverse__ is one of the more commonly used operations in the cairo-perf-trace suite that hasn't been in lowlevel-blt-bench until now. The source for over_reverse_n_ needed to be marked as solid.
---
 test/lowlevel-blt-bench.c | 1 +
 1 file changed, 1 insertion(+)

[Pixman] [PATCH 02/12] ARMv6: Minor optimisation

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
This knocks off one instruction per row. The effect is probably too small to be measurable, but might as well be included. The second occurrence of this sequence doesn't actually benefit at all, but is changed for consistency.
---
 pixman/pixman-arm-simd-asm.h | 11 ---
 1

[Pixman] [PATCH 08/12] ARMv6: Added fast path for over_n_8888_8888_ca

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison
lowlevel-blt-bench results:
        Before          After
        Mean   StdDev   Mean   StdDev   Confidence   Change
L1      2.7    0.0      16.2   0.1      100.0%       +501.7%
L2      2.4    0.0      14.8   0.2      100.0%       +502.5%
M       2.4    0.0      15.0   0.0      100.0%       +525.7%
HT      2.

[Pixman] [PATCH 03/12] ARMv6: Support for very variable-hungry composite operations

2014-03-12 Thread Tomeu Vizoso
From: Ben Avison Previously, the variable ARGS_STACK_OFFSET was available to extract values from function arguments during the init macro. Now this changes dynamically around stack operations in the function as a whole so that arguments can be accessed at any point. It is also joined by LOCALS_ST

[Pixman] ARMv6: Assorted improvements

2014-03-12 Thread Tomeu Vizoso
Hi, I'm resending a few patches that Ben Avison sent last year in March, rebased and make-checked on ARMv6. Just had to make a small change due to a function rename. Here is the original cover letter: While I have some pending contributions relating to pad-repeated images and over_n_ from 20