[Pixman] [PATCH 2/2] Buffer override support for bilinear and x8r8g8b8/r5g6b5/a8 iterators

2013-09-01 Thread Siarhei Siamashka
By providing the new offer_buffer iterator method, redundant memcpy combiners are now avoided for compositing with the SRC operator in some cases. One good example is src_0565_. Running lowlevel-blt-bench on Intel Core i7 860 @ 2.8GHz: before: src_0565_ = L1: 931.54 L2: 888.93 M: 638.34

[Pixman] [PATCH 1/2] general: Allow to avoid redundant memcpy combiners for SRC operator

2013-09-01 Thread Siarhei Siamashka
For the SRC compositing operator, the combine step in the generic fetch -> combine -> writeback pipeline may be redundant. Some examples: 1. bilinear a8r8g8b8 -> a8r8g8b8 We have a redundant copy from the source iterator temporary buffer directly to the destination buffer, while just direct

Re: [Pixman] [PATCH 1/2] Shortcut for the source iterator to fetch directly to destination

2013-09-01 Thread Siarhei Siamashka
On Thu, 29 Aug 2013 04:16:07 +0300 Siarhei Siamashka siarhei.siamas...@gmail.com wrote: If the combine step is going to be a simple memcpy from the temporary buffer to the destination (SRC operator, no mask, x8r8g8b8 or a8r8g8b8 destination format), just route the source
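
To illustrate the idea discussed in this thread, here is a minimal sketch in C. It is not pixman code: every name in it (sketch_fetch_opaque, sketch_composite_src_generic, sketch_composite_src_direct, sketch_copies, SKETCH_MAX_WIDTH) is hypothetical, and the fetcher is a stand-in that only forces the alpha byte to 0xff. It assumes the SRC operator, no mask, and a 32-bit destination, so the combine step reduces to a plain copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_WIDTH 64          /* hypothetical scanline limit */

static size_t sketch_copies;         /* counts redundant memcpy "combine" steps */

/* Hypothetical source fetcher: reads x8r8g8b8 pixels and writes them as
 * opaque a8r8g8b8 pixels into whatever buffer it is handed. */
static void sketch_fetch_opaque(uint32_t *out, const uint32_t *src, size_t width)
{
    for (size_t i = 0; i < width; i++)
        out[i] = src[i] | 0xff000000u;
}

/* Generic path: fetch into a temporary buffer, then "combine" with SRC,
 * which is just a copy of the temporary buffer into the destination. */
static void sketch_composite_src_generic(uint32_t *dst, const uint32_t *src, size_t width)
{
    uint32_t tmp[SKETCH_MAX_WIDTH];

    assert(width <= SKETCH_MAX_WIDTH);
    sketch_fetch_opaque(tmp, src, width);          /* fetch */
    memcpy(dst, tmp, width * sizeof(uint32_t));    /* combine + writeback */
    sketch_copies++;
}

/* Shortcut path: hand the destination scanline to the fetcher as its
 * output buffer, so no temporary buffer and no copy are needed. */
static void sketch_composite_src_direct(uint32_t *dst, const uint32_t *src, size_t width)
{
    sketch_fetch_opaque(dst, src, width);          /* fetch straight to dst */
}
```

Both paths leave the same pixels in the destination; the direct path simply skips the temporary buffer and the copy. The shortcut is only valid when the combine step really is a copy, which is why the sketch restricts itself to SRC with no mask.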

Re: [Pixman] [PATCH 2/2] sse2, mmx: Remove initial unaligned loops in fetchers

2013-09-01 Thread Siarhei Siamashka
On Thu, 29 Aug 2013 05:59:26 +0200 sandm...@cs.au.dk (Søren Sandmann) wrote: Siarhei Siamashka siarhei.siamas...@gmail.com writes: With this new alignment assumption, such an optimization becomes even more impossible, Implementing this optimization does not seem to be too difficult in