On Mon, 13 Apr 2015 18:42:45 +0100 "Ben Avison" <bavi...@riscosopen.org> wrote:
> On Mon, 13 Apr 2015 12:31:35 +0100, Pekka Paalanen <ppaala...@gmail.com> > wrote: > > Apart from restructuring, there is one significant(?) difference to > > Ben's patches: for a solid mask, Ben used a8r8g8b8, but my code uses a8. > > I had to remind myself how things hang together to check if this is > significant or not... > > When lowlevel-blt-bench tests a solid image (source or mask), it achieves > it not by calling pixman_image_create_solid_fill() but by creating a 1x1 > pixel bitmap using pixman_image_create_bits(). However, the effect is > effectively the same because compute_image_info() detects single-pixel > bitmaps and records it in image->common.extended_format_code before we > start doing fast path lookup. However, image->type remains as BITS rather > than SOLID because the non-common parts of the pixman_image union retain > their original meaning - for a native solid fill image, it's just the > colour in three different formats, but for bitmap images it's a pointer > to and sizes of the bitmap and colour LUT (if applicable), various helper > function pointers and so on. > > A typical ARM assembly fast path is written to assume that any argument > corresponding to a solid colour is already in an 8888 format where the > red-blue ordering is the same as the destination image. The magic that > ensures this is hidden in the wrapper C function generated by > PIXMAN_ARM_BIND_FAST_PATH_N_DST and similar, where it calls > _pixman_image_get_solid(). You'll see similar calls more explicitly in > pixman-fast-path.c (architecture-neutral fast paths) and the equivalent > sources files for other architectures. > > There are a few circumstances where you might not want to use > _pixman_image_get_solid() - perhaps most obviously are the operations > which use a solid image in pixel format a2r10g10b10 because you'd lose 2 > bits of precision per colour component. There are a few example of those > amongst those you picked out in special_patterns[]. > > It's worth noting that _pixman_image_get_solid() has optimised code to > extract the colour from image->type==BITS images if the source/mask image > is of format a8r8g8b8, x8r8g8b8 or a8. Other formats will still work, but > more laboriously. > > Not that any of this should matter at all for the purposes of > lowlevel-blt-bench. _pixman_image_get_solid() is only called once per > call of pixman_image_composite(), so it is part of the overhead that > should be being accounted for by the code wrapped in #if EXCLUDE_OVERHEAD. Ok, so if I understood you right, this difference should be harmless, especially with the addition mentioned below. > > should we also add a condition, that if a test has CA flag and a solid > > mask, then that mask needs to be a8r8g8b8? > > That might be desirable, if only because lowlevel-blt-bench initialises > all its images using memset(0xCC) so an a8 solid image would be converted > by _pixman_image_get_solid() to 0xCC000000 whereas an a8r8g8b8 would be > 0xCCCCCCCC. When you're not in component alpha mode, only the alpha byte > matters for the mask image, but in the case of component alpha > operations, a fast path might decide that it can save itself a lot of > multiplications if it spots that 3 constant mask components are already 0. Ok, good. I think I will make that a separate follow-up patch, so I can properly record the reasons and effects. > > None of the existing tests has a solid mask for CA tests. I'm not sure > > how much it makes sense. > > When you're in component alpha mode, mathematically the source and mask > images are interchangeable. So you could implement over_8888_n_8888_ca by > calling over_n_8888_8888_ca, and we already have a number of fast paths > for operations of the latter form. (I'm not aware that Pixman is - yet - > aware that it can make such substitutions, but I'm sure it could be > added.) Hmm, yes, interesting, at least in the cases where the operator indeed allows it... I don't know the operators by heart so I don't know if there are any that wouldn't work like that. An idea to remember. > > Ben, when we eventually get to the actual optimization patches, I expect > > I will reproduce the benchmarks myself while reviewing. Is that ok? > > That's definitely fine by me, thanks for offering. Fingers crossed it > doesn't substantially change the results :-) > > > What do you think of this series? > > In general, it looks like an improvement to me. I don't think the test > programs have had this much attention in a while! One minor point in > patch 6's log message, you copied my typo: > > <operation>_<src>[_<mask]<dst>[_ca] > > should be > > <operation>_<src>[_<mask]_<dst>[_ca] Ahh, yes, I'll fix that. So, can I take it that you gave your Reviewed-by for the whole series? For the record, if the terminology of S-o-b, R-b, Acked-by, etc. is foreign to you or anyone, they are documented in the Linux kernel: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n407 Sections 11-13. IMO it's enough to read those once and understand the concepts. I have a feeling most projects use those tags on a bit looser terms than the kernel. Signed-off-by (depending on the project) and Reviewed-by are the most important. Thanks, pq _______________________________________________ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman