Siarhei Siamashka writes:
>> With this new alignment assumption, such an optimization becomes even more
>> impossible,
>
> Implementing this optimization does not seem to be too difficult in
> principle. I tried to hack a bit and here is the result:
>
> http://lists.freedesktop.org/archives/
On Thu, 29 Aug 2013 01:27:08 +0200
sandm...@cs.au.dk (Søren Sandmann) wrote:
> Siarhei Siamashka writes:
>
> > On Wed, 28 Aug 2013 16:01:27 -0400
> > Søren Sandmann wrote:
> >
> >> From: Søren Sandmann Pedersen
> >>
> >> Now that the general implementation guarantees that the iter buffers
> >
Because the redundant memcpy step is avoided, overall
performance is improved. Running lowlevel-blt-bench on
Intel Core-i7 860 @2.8GHz:
before: src_0565_ = L1:  931.54  L2:  888.93  M: 638.34
after:  src_0565_ = L1: 1031.66  L2: 1003.42  M: 871.54
---
pixman/pixman-fast-path.c | 6 -
If the combine step is going to be a simple memcpy from
the temporary buffer to the destination (SRC operator, no mask,
x8r8g8b8 or a8r8g8b8 destination format), just route the source
iterator-based fetch operation to the destination buffer.
Earlier the source iterator was getting a const
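
To illustrate the idea of skipping the temporary buffer, here is a minimal
sketch in plain C. It is not pixman's actual code: the fetch_fn type and the
functions fill_gray, composite_src_via_temp and composite_src_direct are
hypothetical names made up for this example. It only shows that when the
combine step is a plain copy, fetching straight into the destination gives
the same pixels with one pass less over the scanline.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fetcher signature: write `width` a8r8g8b8 pixels to `out`. */
typedef void (*fetch_fn) (uint32_t *out, int width);

/* Hypothetical source: a solid opaque gray scanline. */
static void
fill_gray (uint32_t *out, int width)
{
    int i;

    for (i = 0; i < width; i++)
        out[i] = 0xff808080u;
}

/* Two-step path: fetch into a temporary buffer, then copy to dest. */
static void
composite_src_via_temp (uint32_t *dest, uint32_t *temp, int width,
                        fetch_fn fetch)
{
    fetch (temp, width);
    memcpy (dest, temp, (size_t) width * sizeof (uint32_t));
}

/* Direct path: the fetcher writes into dest, so the memcpy disappears. */
static void
composite_src_direct (uint32_t *dest, int width, fetch_fn fetch)
{
    fetch (dest, width);
}
```

Both functions leave identical contents in dest. The direct variant is only
valid under the conditions the patch description lists (SRC operator, no
mask, 32-bit destination format), because only then is the combine step a
pure copy.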
Siarhei Siamashka writes:
> On Wed, 28 Aug 2013 16:01:27 -0400
> Søren Sandmann wrote:
>
>> From: Søren Sandmann Pedersen
>>
>> Now that the general implementation guarantees that the iter buffers
>> are aligned to 16 bytes, there is no longer any reason for the initial
>> loop to bring the de
On Wed, 28 Aug 2013 16:01:27 -0400
Søren Sandmann wrote:
> From: Søren Sandmann Pedersen
>
> Now that the general implementation guarantees that the iter buffers
> are aligned to 16 bytes, there is no longer any reason for the initial
> loop to bring the destination buffer up to an aligned posi
From: Søren Sandmann Pedersen
Now that the general implementation guarantees that the iter buffers
are aligned to 16 bytes, there is no longer any reason for the initial
loop to bring the destination buffer up to an aligned position.
---
pixman/pixman-mmx.c | 20
pixman/p
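
As a rough sketch of what dropping the initial loop looks like, the following
plain C fetcher converts r5g6b5 to a8r8g8b8 with only a block loop and a tail
loop. The names convert_0565_to_8888 and fetch_0565_no_head_loop are
hypothetical and this is scalar code, not the MMX or SSE2 code from the
patch. The assumption, stated in the comments, is that the caller hands in a
16-byte aligned dst, which is what lets a SIMD version start with aligned
stores right away.

```c
#include <stdint.h>

/* Expand one r5g6b5 pixel to a8r8g8b8 by replicating the high bits. */
static inline uint32_t
convert_0565_to_8888 (uint16_t s)
{
    uint32_t r = (s >> 11) & 0x1f;
    uint32_t g = (s >> 5) & 0x3f;
    uint32_t b = s & 0x1f;

    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);

    return 0xff000000u | (r << 16) | (g << 8) | b;
}

/*
 * Assumption: dst is 16-byte aligned by the caller. Without that
 * guarantee, a head loop of the form
 *     while (w && ((uintptr_t) dst & 15)) { ...one pixel...; w--; }
 * would be needed before the block loop.
 */
static void
fetch_0565_no_head_loop (uint32_t *dst, const uint16_t *src, int w)
{
    /* Block loop: four pixels (16 bytes of output) per iteration.
     * A SIMD version would do one aligned 16-byte store here. */
    while (w >= 4)
    {
        dst[0] = convert_0565_to_8888 (src[0]);
        dst[1] = convert_0565_to_8888 (src[1]);
        dst[2] = convert_0565_to_8888 (src[2]);
        dst[3] = convert_0565_to_8888 (src[3]);
        dst += 4;
        src += 4;
        w -= 4;
    }

    /* Tail loop: the remaining 0..3 pixels. */
    while (w--)
        *dst++ = convert_0565_to_8888 (*src++);
}
```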
From: Søren Sandmann Pedersen
At the moment iter buffers are only guaranteed to be aligned to an 8
byte boundary. It is useful for SIMD implementations to be able to
assume that these buffers are aligned to 16 bytes, so ensure this.
---
pixman/pixman-general.c | 22 +++---
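
One common way to get such a guarantee is to over-allocate the buffer by 15
bytes and round the start pointer up to the next multiple of 16. The sketch
below shows only that rounding step. The helper name align_up is hypothetical
and this is not necessarily how pixman-general.c does it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: round ptr up to the next multiple of `alignment`.
 * `alignment` must be a power of two. The caller must have allocated at
 * least (alignment - 1) spare bytes so the result stays inside the buffer.
 */
static inline uint8_t *
align_up (uint8_t *ptr, uintptr_t alignment)
{
    uintptr_t p = (uintptr_t) ptr;

    return (uint8_t *) ((p + alignment - 1) & ~(alignment - 1));
}
```

For example, with a stack array declared as uint8_t raw[SIZE + 15], calling
align_up (raw, 16) yields a pointer with at least SIZE usable bytes whose
address has its low four bits clear.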