Hi, Comments on the patch series below. In the future, please consider sending smaller patch sets so that I can review them in smaller chunks of time. Those are easier to find than multiple consecutive hours ...
Thanks, Søren ================================================================== 0001: Typo in preload Looks good ================================================================== Looks good ================================================================== 0002: lowlevel parse Needs work 0003: affine benchmarker Needs work ================================================================== Both of these could benefit from taking the support for format/operator parsing and printing in check-format.c and moving it to utils.[ch], perhaps extended to deal with strings like 'none' etc. Also some coding style issues. (Use a typedef for structs, don't define a struct on a single line, braces around statements that span multiple lines -- see CODING_STYLE for more information. ================================================================== 0004: Temporary buffers Needs work ================================================================== The idea behind aligning the final buffer is that it allows fetchers to assume that they can do a full 16-byte SIMD write to the buffer. We may possibly not do that, though. The change to use 15 instead of 32 is alright I guess. Is this something that actually shows up in practice? ================================================================== 0005: Early detection of 1x1 Needs work ================================================================== I don't think this works as written. What if someone writes a non-opaque pixel to the 1x1 image after it has been used once? Then the flags would be wrong. We could potentially do the check inside pixman_image_composite32(), though you would have to also check at least for ~FAST_PATH_ACCESSORS. If you want to go through with this patch, I think we need a test that reads and writes opaque and non-opaque pixels to 1x1 pixel images, and verification that the results don't change before/after the patch. ================================================================== 0006: fast: over_n_8888 Looks good ================================================================== Looks good to me ================================================================== 0007: armv6: over_n_8888 Not reviewed ================================================================== Not reviewed ================================================================== 0008: fast: over_n_0565 Looks good ================================================================== Looks good to me ================================================================== 0009: armv6: over_n_0565 Not reviewed ================================================================== Not reviewed ================================================================== 0010: fast: in_8888_8 - sRGB Needs work 0011: armv6: in_8888_8 - sRGB? Needs work ================================================================== These can presumably also be used for the a8r8g8b8_sRGB format since all we care about is the alpha channel. Also, is this actually showing up in the wild? Which applications? ================================================================== 0012: armv6: over_8888_8_0565 Not reviewed 0013: armv6: over_8888_n_0565 Not reviewed 0014: in_n_8888 Not reviewed 0015: Improved over_8888_8888 Not reviewed 0016: single scanline Not reviewed 0017: arm: Move Bind_COMBINE Not reviewed 0018: armv6: ADD combiner Not reviewed 0019: armv6: OVER_REVERSE combiner Not reviewed 0020: armv6: IN, IN_REVERSE, ... Not reviewed 0021: armv6: OVER combiner Not reviewed 0022: armv6: SRC combiner Not reviewed ================================================================== These are all armv6 specific, and I did not review them beyond checking that there's nothing obviously insane in them. ================================================================== 0023: fetchers/writeback Needs work ================================================================== Don't you need "FAST_PATH_STANDARD_DST_FLAGS" in the ITER_DEST case? Also, since you have a writeback function for 565, you might want to copy the fast_dest_fetch_noop setup from pixman-fast-path.c ================================================================== 0024: fetcher for a1r5g5b5 Not reviewed ================================================================== Not reviewed ================================================================== 0025: src_1555_8888 Needs work ================================================================== Can these be used for x8r8g8b8/x8b8g8r8 targets as well? I did not look at the assembly. ================================================================== 0026: nearest_scaled_cover_a8r8g8b8 Not reviewed 0027: nearest_scaled_cover_r5g6b5 0028: nearest_scaled_cover_x8r8g8b8 0029: nearest_scaled_cover_a8 ================================================================== I only looked at the table initializations, which look good to me. ================================================================== 0030: nearest_scaled_cover src_8888_8888 0031: four more 0032: nearest_scaled_cover ================================================================== I can't say I'm a big fan of these patches. They are essentially workarounds for the longstanding problem that pixman sometimes won't pick the fastest code for some operations. In this case the general iter based code will be faster then the C code in pixman-fast-path.c because the iter code will use assembly fetchers. As a result you end up with a bunch of partial reimplementations of general_composite_rect() inside pixman-arm-simd.c. Maybe we need to admit failure and make general_composite_rect() available for other implementations to use. Essentially we would officially provide a way for implementations to say: My iterators are faster than pixman-fast-path.c for these specific operations, so just go directly to general_composite_rect(). It's definitely not a pleasant thing to do, but given that nobody is likely to have the time/skill combination required to do the necessary redesign of pixman's fast path system, it might still be preferable to to do this instead of duplicating the code like these patches do. With a setup like that, we could also fix the same issue for the bilinear SSSE3 fetchers and possibly other cases. Regarding 0031: If you had the noop setup mentioned in 0023, this looks like it could also just be a fallback to general_composite_rect. ================================================================== 0033 fetcher-for-a8r8g8b8-bilinear-interpolatio.patch 0034 fetcher-for-x8r8g8b8-bilinear-interpolatio.patch 0035 fetcher-for-r5g6b5-bilinear-interpolation-.patch 0036 fetcher-for-a8-bilinear-interpolation-scal.patch ================================================================== I think this one might benefit from having the infrastructure added first and some more comments on how the code works. And why does it have to be so much more complicated than the SSSE3 code? Coding style: Use /* */ instead of // Coding style: braces on separate lines Coding style: Comments look like this: /* blah blah * blah blah */ not like this: /* blah blah * blah blah */ Also, has this code been tested with BILINEAR_INTERPOLATION_BITS set to anything other than 7? ================================================================== 0037: More accurate COVER_CLIP_NEAREST|BILINEAR Needs work ================================================================== A concern I have here is that code might access pixels outside the image that have weight 0. Ie., in the bilinear case, some code might attempt to read the pixel at bits[-1] and then multiply it with 0. But we can't assume that bits[-1] is readable. If I remember correctly, the +pixman_fixed_e * 8 stuff was intended to handle this case. I think it would be worthwhile to have a test that uses fence_malloc for the source buffer and the matrix mentioned in the commit. In fact, the fence_malloc() testing could benefit from being extended in various ways: - having fence pages both before and after the image - having fence pages in the 'stride' part of the image _______________________________________________ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman