These are often used in place of a solid-fill gradient, with the
expectation that they have similar performance and hit the same fast
paths. One exception is general_composite_rect(), which does not identify
the 1x1R <-> solid-fill equivalence. By adding the special case to
_pixman_bits_image_src_iter_init(), we can recover the lost performance:

./tests/lowlevel-blt-bench -n src_n_8888 on a core2 @ 2.66GHz:

before:
add_n_8888 =  L1:   4.52  L2:   4.42  M:  1.63 (  0.13%)  HT:  1.67  VT:  1.65  R:  1.64  RT:  1.59 (  21Kops/s)

after:
add_n_8888 =  L1:1160.10  L2:1296.63  M:581.81 ( 47.46%)  HT:379.06  VT:307.07  R:239.88  RT: 79.37 ( 466Kops/s)

For reference, ./tests/lowlevel-blt-bench src_n_8888:
add_n_8888 =  L1:1116.99  L2:1254.43  M:578.46 ( 47.50%)  HT:369.20  VT:302.21  R:236.18  RT: 76.64 ( 750Kops/s)

Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
---
 pixman/pixman-bits-image.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/pixman/pixman-bits-image.c b/pixman/pixman-bits-image.c
index 75a39a1..b15305c 100644
--- a/pixman/pixman-bits-image.c
+++ b/pixman/pixman-bits-image.c
@@ -1496,6 +1496,17 @@ _pixman_bits_image_src_iter_init (pixman_image_t *image, pixman_iter_t *iter)
     pixman_format_code_t format = image->common.extended_format_code;
     uint32_t flags = image->common.flags;
     const fetcher_info_t *info;
+    bits_image_t *bits = &image->bits;
+
+    if ((bits->width | bits->height) == 1 && image->common.repeat)
+    {
+        if (iter->iter_flags & ITER_NARROW)
+            replicate_pixel_32 (bits, 0, 0, iter->width, iter->buffer);
+        else
+            replicate_pixel_float (bits, 0, 0, iter->width, iter->buffer);
+        iter->get_scanline = _pixman_iter_get_scanline_noop;
+        return;
+    }
 
     for (info = fetcher_info; info->format != PIXMAN_null; ++info)
     {
-- 
1.7.10.4
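
For readers who do not have the pixman tree at hand, the idea behind the
special case can be sketched in isolation. This is not pixman code: the
toy_* types and functions below are hypothetical, simplified stand-ins
(a8r8g8b8 only, no wide/float path), and only illustrate why a 1x1
repeating source can fill the scanline buffer once at iterator init and
then hand back that same buffer for every row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for pixman's image and iterator. */
typedef struct {
    const uint32_t *pixels;  /* a8r8g8b8 pixels, row-major */
    int             width;
    int             height;
    int             repeat;  /* non-zero: tile the image in x and y */
} toy_image_t;

typedef struct toy_iter toy_iter_t;
typedef uint32_t *(*toy_get_scanline_t) (toy_iter_t *iter);

struct toy_iter {
    const toy_image_t *image;
    uint32_t          *buffer;  /* caller-provided, `width` entries */
    int                width;
    int                y;       /* next row to fetch */
    toy_get_scanline_t get_scanline;
};

/* Fast path: the buffer was filled at init time, nothing left to do. */
static uint32_t *
toy_get_scanline_noop (toy_iter_t *iter)
{
    return iter->buffer;
}

/* General path: fetch one row per call, one pixel at a time. */
static uint32_t *
toy_get_scanline_general (toy_iter_t *iter)
{
    const toy_image_t *img = iter->image;
    int y = iter->y++;
    int x;

    for (x = 0; x < iter->width; x++)
    {
        int sx = x, sy = y;

        if (img->repeat)
        {
            sx %= img->width;
            sy %= img->height;
        }

        /* Outside a non-repeating image the source is transparent. */
        if (sx < img->width && sy < img->height)
            iter->buffer[x] = img->pixels[(size_t) sy * img->width + sx];
        else
            iter->buffer[x] = 0;
    }
    return iter->buffer;
}

static void
toy_iter_init (toy_iter_t *iter, const toy_image_t *image,
               uint32_t *buffer, int width)
{
    int x;

    assert (image != NULL && buffer != NULL && width >= 0);

    iter->image = image;
    iter->buffer = buffer;
    iter->width = width;
    iter->y = 0;

    /* Both dimensions are positive, so (w | h) == 1 only when w == h == 1. */
    if ((image->width | image->height) == 1 && image->repeat)
    {
        for (x = 0; x < width; x++)
            buffer[x] = image->pixels[0];
        iter->get_scanline = toy_get_scanline_noop;
        return;
    }

    iter->get_scanline = toy_get_scanline_general;
}
```

With a 1x1 repeating image, toy_iter_init() replicates the single pixel
across the buffer and installs the no-op fetcher, so the per-row cost
drops to a function call. Any other image keeps the general fetcher.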