By providing the new offer_buffer iterator method, redundant
memcpy combiners are now avoided when compositing with the SRC
operator in some cases. One good example is src_0565_.
Running lowlevel-blt-bench on Intel Core-i7 860 @2.8GHz:
  before: src_0565_ = L1: 931.54  L2: 888.93  M: 638.34
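
To make the idea concrete, here is a minimal sketch of how offering the
destination row to the source iterator can remove the memcpy combiner.
It is not pixman's actual code: sketch_iter_t, sketch_fetch,
sketch_offer_buffer and sketch_composite_src_row are hypothetical names,
and the r5g6b5 source with an a8r8g8b8 destination is just an assumed
case picked for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified source iterator; NOT pixman's real iterator type. */
typedef struct
{
    const uint16_t *src;    /* r5g6b5 source scanline (assumed format) */
    uint32_t       *buffer; /* where fetched a8r8g8b8 pixels get written */
    int             width;  /* pixels per scanline */
} sketch_iter_t;

/* Expand one r5g6b5 pixel to a8r8g8b8 by replicating the high bits. */
static uint32_t
convert_0565_to_8888 (uint16_t p)
{
    uint32_t r = (p >> 11) & 0x1f, g = (p >> 5) & 0x3f, b = p & 0x1f;

    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);
    return 0xff000000u | (r << 16) | (g << 8) | b;
}

/* Fetch step: convert the scanline into whatever buffer the iterator holds. */
static uint32_t *
sketch_fetch (sketch_iter_t *it)
{
    for (int i = 0; i < it->width; i++)
        it->buffer[i] = convert_0565_to_8888 (it->src[i]);
    return it->buffer;
}

/* Hypothetical "offer buffer" hook: the caller offers the destination row
 * and the iterator accepts it in place of its temporary buffer. */
static void
sketch_offer_buffer (sketch_iter_t *it, uint32_t *dst_row)
{
    assert (dst_row != NULL);
    it->buffer = dst_row;
}

/* Composite one scanline with SRC, no mask, a8r8g8b8 destination.
 * Returns how many memcpy "combine" steps were needed (0 or 1). */
static int
sketch_composite_src_row (sketch_iter_t *it, uint32_t *dst_row, int direct)
{
    uint32_t *fetched;

    if (direct)
        sketch_offer_buffer (it, dst_row);

    fetched = sketch_fetch (it);
    if (fetched != dst_row)
    {
        /* Combine step for SRC degenerates to a plain copy. */
        memcpy (dst_row, fetched, (size_t) it->width * sizeof (uint32_t));
        return 1;
    }
    return 0; /* pixels were fetched straight into the destination */
}
```

With direct set to 0 the scanline goes through the temporary buffer and
is then copied; with direct set to 1 the fetcher writes into the
destination row and the copy disappears, while the resulting pixels are
the same.
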
For the SRC compositing operator, the combine step in the generic
fetch -> combine -> writeback pipeline may be redundant.
Some examples:
1. bilinear a8r8g8b8 - a8r8g8b8
We have a redundant copy from the source iterator temporary
buffer directly to the destination buffer, while just direct
On Thu, 29 Aug 2013 04:16:07 +0300
Siarhei Siamashka siarhei.siamas...@gmail.com wrote:
If the combine step is going to be a simple memcpy from
the temporary buffer to the destination (SRC operator, no mask,
x8r8g8b8 or a8r8g8b8 destination format), just route the source
On Thu, 29 Aug 2013 05:59:26 +0200
sandm...@cs.au.dk (Søren Sandmann) wrote:
Siarhei Siamashka siarhei.siamas...@gmail.com writes:
With this new alignment assumption, such an optimization becomes even more
impossible,
Implementing this optimization does not seem to be too difficult in