On Mon, 17 Dec 2012 03:34:34 +0100 sandm...@cs.au.dk (Søren Sandmann) wrote:
> Siarhei Siamashka <siarhei.siamas...@gmail.com> writes:
>
> > I just wonder how big is the performance cost for adding an extra
> > comparison operation. Probably much less than using -ffloat-store,
> > -fexcess-precision=standard, and -std=c99 options, but might be
> > interesting to confirm.
>
> It's not going to matter all that much in any case since we are talking
> about floating point variants of operations that involve
> divisions. These are not used that much, and the divisions will tend to
> swamp a lot of the difference.
>
> However, I added conjoint_over_8888_2a10 to lowlevel-blt-test and did
> some measurements:
>
> As a baseline, current master compiled with -m32 and == 0.0f checks:
>
>     conjoint_over_8888_2a10 = L1: 5.62 L2: 5.67 M: 5.65 ( 0.50%) HT: 5.59 VT: 5.52 R: 5.49 RT: 5.06 ( 68Kops/s)
>
> With the FLT_MIN checks:
>
>     conjoint_over_8888_2a10 = L1: 5.68 L2: 5.73 M: 5.72 ( 0.51%) HT: 5.65 VT: 5.53 R: 5.45 RT: 5.02 ( 67Kops/s)
>
> The numbers are actually slightly better with the checks, so I suspect
> the difference is just noise (although conceivably, the checks may
> filter out more divisions than before).
>
> When just pixman-combine-float.c is compiled with -ffloat-store:
>
>     conjoint_over_8888_2a10 = L1: 5.58 L2: 5.60 M: 5.60 ( 0.50%) HT: 5.53 VT: 5.44 R: 5.41 RT: 4.99 ( 67Kops/s)
>
> The numbers here are slightly worse than the baseline, but possibly
> still just noise.
>
> If all of pixman is compiled with -ffloat-store:
>
>     conjoint_over_8888_2a10 = L1: 4.31 L2: 4.34c M: 4.31 ( 0.38%) HT: 4.26 VT: 4.21 R: 4.14 RT: 3.92 ( 53Kops/s)
>
> the numbers are clearly worse.
>
> Finally, the numbers in x86_64 mode. Current master:
>
>     conjoint_over_8888_2a10 = L1: 19.09 L2: 19.58 M: 19.13 ( 1.75%) HT: 17.47 VT: 17.35 R: 17.32 RT: 13.72 ( 178Kops/s)
>
> With FLT_MIN checks:
>
>     conjoint_over_8888_2a10 = L1: 19.09 L2: 19.59 M: 19.51 ( 1.76%) HT: 17.52 VT: 17.02 R: 17.00 RT: 13.43 ( 175Kops/s)
>
> Ie., no real difference.
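For anyone reading along, the two kinds of check being compared could look roughly like the sketch below. This is only an illustration: the helper names (is_zero_exact, is_zero_flt_min, guarded_div) are hypothetical and not taken from pixman-combine-float.c, and the clamp-to-1.0f behaviour is an assumption made just to give the guard something to protect.

```c
#include <float.h>

/* Hypothetical helpers for illustration; not the actual pixman code. */

/* Baseline style: exact comparison against zero. */
static int is_zero_exact(float f)
{
    return f == 0.0f;
}

/* FLT_MIN style: anything with magnitude below FLT_MIN counts as zero.
 * This costs one extra comparison per check. */
static int is_zero_flt_min(float f)
{
    return -FLT_MIN < f && f < FLT_MIN;
}

/* Assumed usage: skip the division when the denominator is "zero",
 * and clamp the quotient to 1.0f otherwise. */
static float guarded_div(float num, float den)
{
    float r;

    if (is_zero_flt_min(den))
        return 1.0f;

    r = num / den;
    return r > 1.0f ? 1.0f : r;
}
```

The FLT_MIN variant also treats tiny non-zero denominators as zero, which fits the remark above that the checks may filter out more divisions than before.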
Agreed, now it looks clear. Thanks for the detailed benchmark results.

-- 
Best regards,
Siarhei Siamashka
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman