On Thu, 05 Sep 2013 04:42:08 +0200
sandm...@cs.au.dk (Søren Sandmann) wrote:
Siarhei Siamashka siarhei.siamas...@gmail.com writes:
The loops are already unrolled, so it was just a matter of packing
4 pixels into a single XMM register and doing aligned 128-bit
writes to memory via MOVDQA instructions for the SRC compositing
operator fast path. For the other fast paths, this XMM register
is also directly routed to further