On Wed, Sep 16, 2015 at 2:25 PM, Pekka Paalanen <ppaala...@gmail.com> wrote:
> On Fri, 4 Sep 2015 03:09:20 +0100
> Ben Avison <bavi...@riscosopen.org> wrote:
>
>> As discussed in
>> http://lists.freedesktop.org/archives/pixman/2015-August/003905.html
>>
>> the 8 * pixman_fixed_e adjustment which was applied to the transformed
>> coordinates is a legacy of rounding errors which used to occur in old
>> versions of Pixman, but which no longer apply. For any affine transform,
>> you are now guaranteed to get the same result by transforming the upper
>> coordinate as though you transform the lower coordinate and add (size-1)
>> steps of the increment in source coordinate space. No projective
>> transform routines use the COVER_CLIP flags, so they cannot be affected.
>
> Hi all,
>
> as we doing these things not just for cleaning up but with the premise
> that there are missed optimization opportunities, I have benchmarked
> this patch series.
>
> The series as benchmarked is available at:
> https://git.collabora.com/cgit/user/pq/pixman.git/log/?h=cover-benchmark-1
>
> The benchmark points are:
>
> - baseline: "test: Add cover-test v5"
>
> - cleanup: "affine-bench: remove 8e margin from COVER area"
>   Includes the 8e extra safety margin removal.
>
> - tight: "pixman-fast-path: Make bilinear cover fetcher use
>   COVER_CLIP_TIGHT flag"
>   Includes all the COVER_CLIP_BILINEAR related patches from
>   Ben.
>
> Note, that ssse3_iters[] in pixman-ssse3.c still contains
> FAST_PATH_SAMPLES_COVER_CLIP_BILINEAR.
>
> Cairo version is 1.14.2 for the benchmarks, which are run like:
> $ CAIRO_TEST_TARGET=image cairo-perf-trace -r -v -i8 > baseline-image-2.txt
>
> I tried both "image" and "image16" on an x86_64 (Sandybridge), and got
> no performance differences in the trimmed-cairo-traces set in either
> baseline/cleanup or cleanup/tight.
>
> I also tried with PIXMAN_DISABLE=ssse3 and still got no difference. I
> verified I am really running what I think I am by editing Pixman and
> seeing the effect in the benchmark.
>
> Am I missing something?
>
> I thought we would see at least some improvements also on x86_64 when
> comparing cleanup/tight.
>
> Should I run the same on rpi2? Or is the best effect on the fast paths
> we haven't merged yet?
>
> I'd rather not run this on rpi1 due to the function address /
> performance quirk, doing the required iterations there would probably
> take too long and I'd need to rearrange the result files too.
>
> Or maybe our test set is not enough? I recall having some problems with
> that in the past.
>
> So, I patched Pixman to yell whenever TIGHT is set but
> COVER_CLIP_BILINEAR is not set. Only t-firefox-canvas-swscroll and
> t-firefox-fishtank hit it with source image, each twice per iteration.
> Definitely seems like this test set is not hitting the cases we are
> interested in. I think I need to dig up our old performance profiles
> and see if we could record a trace from a real app that would hit these
> cases, now that Cairo's trace recording is supposedly fixed.
>
> The removal of the 8e extra safety margins shouldn't need performance
> profiles as justification, but for the tightening patches they'd be
> nice to have, especially since the usefulness of them has been
> questioned.
>
>
> Thanks,
> pq

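As an aside on the affine guarantee quoted above, here is a minimal Python sketch (not Pixman code) of why "transform the last pixel" and "transform the first pixel, then add (size-1) increments" can agree exactly. It assumes 16.16 fixed point, pixel centres at integer + 0.5, and round-to-nearest when the 32.32 product is reduced back to 16.16; the helper names `to_fixed`, `transform_x` and `last_pixel_matches` are hypothetical. Under those assumptions a whole-pixel step contributes a multiple of 65536 to the product, so it passes through the rounding shift unchanged.

```python
FIXED_ONE = 1 << 16    # 1.0 in 16.16 fixed point
FIXED_HALF = 1 << 15   # 0.5 in 16.16 fixed point


def to_fixed(value):
    """Convert a float to 16.16 fixed point (illustrative helper)."""
    return int(round(value * FIXED_ONE))


def transform_x(m00, m01, m02, x, y):
    """Apply one row of an affine matrix to the fixed-point point (x, y).

    All arguments are 16.16 integers. The products are 32.32, so the sum
    is rounded to nearest and shifted back down to 16.16.
    """
    acc = m00 * x + m01 * y + m02 * FIXED_ONE
    return (acc + FIXED_HALF) >> 16


def last_pixel_matches(m00, m01, m02, x0, y0, width):
    """Compare transforming the last pixel directly against stepping."""
    x_first = x0 * FIXED_ONE + FIXED_HALF        # centre of the first pixel
    y_row = y0 * FIXED_ONE + FIXED_HALF          # centre of the scanline
    x_last = x_first + (width - 1) * FIXED_ONE   # centre of the last pixel
    direct = transform_x(m00, m01, m02, x_last, y_row)
    stepped = transform_x(m00, m01, m02, x_first, y_row) + (width - 1) * m00
    return direct == stepped
```

For example, `last_pixel_matches(to_fixed(0.7371), to_fixed(-0.25), to_fixed(3.5), 0, 0, 1000)` compares the two results for a 1000-pixel scanline. This only illustrates the arithmetic argument; it says nothing about the code paths Pixman actually takes.
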
Hi Pekka, Ben

I decided to also run the cairo trimmed benchmarks on my POWER8 ppc64le
and POWER7 ppc64.

To make things clearer, I used the same definitions for "baseline",
"cleanup" and "tight".

I used Cairo version 1.14.3, or more precisely a git checkout with HEAD
at 6f7a9b4.

I ran the benchmarks with the following command (taken from a script):
"cairo-perf-trace benchmark -r -i8 > ../${__output}.perf"

First of all, the diff between baseline and cleanup showed no change on
either platform, so that's good :)

Now, for cleanup/tight:

With POWER8 ppc64le, I got the following very modest boost:

image t-firefox-asteroids 483.10 (523.85 3.49%) -> 452.84 (480.34 3.16%): 1.07x speedup
image t-firefox-chalkboard 691.38 (692.09 0.06%) -> 653.07 (654.60 0.26%): 1.06x speedup

However, with POWER7 ppc64, I got the following regressions, which is
quite bad:

image t-firefox-asteroids 545.55 (559.64 1.79%) -> 781.07 (791.83 2.33%): 1.43x slowdown
image t-firefox-scrolling 1185.45 (1186.02 0.05%) -> 1748.76 (1754.85 0.20%): 1.48x slowdown
image t-firefox-chalkboard 1444.76 (1464.55 0.88%) -> 2315.76 (2333.10 0.34%): 1.60x slowdown
image t-firefox-paintball 681.43 (682.28 0.10%) -> 1138.15 (1140.19 0.08%): 1.67x slowdown
image t-firefox-canvas 890.14 (890.90 0.10%) -> 1492.83 (1493.51 0.20%): 1.68x slowdown
image t-firefox-canvas-swscroll 1369.94 (1371.66 0.05%) -> 2297.53 (2305.70 0.18%): 1.68x slowdown
image t-xfce4-terminal-a1 829.35 (832.39 0.16%) -> 1392.50 (1414.69 1.08%): 1.68x slowdown
image t-firefox-fishbowl 3112.93 (3114.13 0.02%) -> 5227.18 (5229.05 0.03%): 1.68x slowdown
image t-poppler 404.14 (407.43 0.52%) -> 680.27 (685.01 0.45%): 1.68x slowdown
image t-firefox-particles 3555.75 (3570.29 0.18%) -> 5990.93 (5995.00 0.05%): 1.68x slowdown
image t-midori-zoomed 555.84 (557.29 0.24%) -> 936.56 (937.69 0.08%): 1.68x slowdown
image t-gnome-system-monitor 844.70 (849.98 0.52%) -> 1426.26 (1427.60 0.12%): 1.69x slowdown
image t-firefox-planet-gnome 904.60 (908.31 0.18%) -> 1527.90 (1530.03 0.08%): 1.69x slowdown
image t-chromium-tabs 221.74 (221.87 0.04%) -> 374.75 (376.72 0.26%): 1.69x slowdown
image t-swfdec-youtube 929.86 (930.31 0.12%) -> 1571.61 (1572.76 0.09%): 1.69x slowdown
image t-firefox-fishtank 1787.33 (1787.36 0.00%) -> 3022.38 (3023.47 0.09%): 1.69x slowdown
image t-firefox-canvas-alpha 1026.19 (1030.55 0.24%) -> 1735.63 (1740.84 0.28%): 1.69x slowdown
image t-evolution 431.94 (433.98 0.36%) -> 731.76 (732.26 0.08%): 1.69x slowdown
image t-firefox-talos-svg 1381.38 (1388.40 0.26%) -> 2342.68 (2345.83 0.10%): 1.70x slowdown
image t-gvim 803.40 (806.02 0.29%) -> 1363.80 (1366.63 0.27%): 1.70x slowdown
image t-poppler-reseau 1416.96 (1443.14 0.74%) -> 2408.39 (2412.49 0.16%): 1.70x slowdown
image t-swfdec-giant-steps 827.47 (829.87 0.17%) -> 1407.90 (1410.93 0.18%): 1.70x slowdown
image t-gnome-terminal-vim 663.55 (669.39 0.71%) -> 1132.85 (1139.02 0.29%): 1.71x slowdown
image t-grads-heat-map 225.85 (225.92 0.02%) -> 386.23 (386.78 0.49%): 1.71x slowdown

BTW, out of curiosity, I checked cleanup/tight on my Haswell laptop and
got mixed-to-bad results:

image t-firefox-canvas 705.79 (869.04 11.16%) -> 563.55 (594.35 2.52%): 1.25x speedup
image t-poppler-reseau 619.46 (881.17 16.35%) -> 657.98 (679.11 7.95%): 1.06x slowdown
image t-firefox-planet-gnome 582.52 (605.63 1.82%) -> 627.80 (634.95 3.31%): 1.08x slowdown
image t-evolution 264.55 (271.81 3.30%) -> 288.95 (336.86 11.37%): 1.09x slowdown
image t-gnome-terminal-vim 264.74 (270.65 0.92%) -> 312.25 (516.79 20.96%): 1.18x slowdown
image t-grads-heat-map 93.61 (93.92 0.23%) -> 115.32 (136.32 10.96%): 1.23x slowdown
image t-chromium-tabs 115.36 (115.94 0.45%) -> 200.87 (254.77 11.90%): 1.74x slowdown

Opinions?

   Oded
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman
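
For anyone who wants to sanity-check or post-process result lines like the ones in this mail, here is a rough Python sketch. It assumes, based only on the lines quoted here, that the reported factor is the ratio of the first timing on each side of the arrow; the meaning of the parenthesised figures is left uninterpreted. `parse_result_line` is a hypothetical helper and not part of Cairo's perf tools.

```python
import re

# One result line as it appears in this mail, e.g.
# "image t-poppler 404.14 (407.43 0.52%) -> 680.27 (685.01 0.45%): 1.68x slowdown"
RESULT_RE = re.compile(
    r"^(?P<backend>\S+)\s+(?P<trace>\S+)\s+"
    r"(?P<old>[\d.]+)\s+\([^)]*\)\s+->\s+"
    r"(?P<new>[\d.]+)\s+\([^)]*\):\s+"
    r"(?P<reported>[\d.]+)x\s+(?P<kind>speedup|slowdown)$"
)


def parse_result_line(line):
    """Parse one result line; return a dict, or None if it does not match."""
    match = RESULT_RE.match(line.strip())
    if match is None:
        return None
    old = float(match.group("old"))
    new = float(match.group("new"))
    return {
        "backend": match.group("backend"),
        "trace": match.group("trace"),
        "old": old,
        "new": new,
        # Recomputed from the two leading timings (assumption, see above).
        "factor": max(old, new) / min(old, new),
        "kind": "speedup" if new < old else "slowdown",
        # Values as printed in the line, kept for cross-checking.
        "reported_factor": float(match.group("reported")),
        "reported_kind": match.group("kind"),
    }
```

Feeding each result line through `parse_result_line` and comparing `factor` with `reported_factor` is a quick way to confirm that a pasted table was not mangled in transit.
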