OK, I'm convinced enough that the (mostly minimal) performance impact isn't worth bothering about, so I pushed this (in a slightly altered version). Thanks!
Roland

On 03.11.2015 at 09:36, Oded Gabbay wrote:
> There are currently two methods in llvmpipe code to calculate coeffs to
> be used as inputs for the fragment shader. The two methods use slightly
> different ways to do the floating point calculations and thus produce
> slightly different results.
>
> Which method is used is determined by the size of the vector that is
> used by the platform.
>
> For vectors with size of more than 128bit, a single-step method is used,
> in which coeffs_init_simple() + attribs_update_simple() are called.
>
> For vectors with size of 128bit or less, a two-step method is used, in
> which coeffs_init() + attribs_update() are called.
>
> This causes some piglit tests (clip-distance-bulk-copy,
> interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with
> 128bit vectors (such as ppc64le or x86-64 without AVX).
>
> This patch makes platforms with 128bit vectors use the single-step
> method (aka "simple" method) instead of the two-step method.
> This would make the resulting coeffs identical between more platforms,
> make sure the piglit tests pass, and make debugging and maintainability
> a bit easier as the generated LLVM IR will be the same for more platforms.
>
> The performance impact is negligible for x86-64 without AVX, and
> basically non-existent for ppc64le, as can be seen from the following
> benchmarking results:
>
> - glxspheres, on ppc64le:
>
>   - original code:  4.892745317 frames/sec 5.460303857 Mpixels/sec
>   - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec
>   - Additional 0.8% performance boost
>
> - glxspheres, on x86-64 without AVX:
>
>   - original code:  20.16418809 frames/sec 22.50323395 Mpixels/sec
>   - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec
>   - Additional 0.74% performance boost
>
> - glmark2, on ppc64le:
>
>   - original code:  score of 58
>   - with the patch: score of 57
>
> - glmark2, on x86-64 without AVX:
>
>   - original code:  score of 175
>   - with the patch: score of 167
>   - Impact of -4.5% on performance
>
> - OpenArena, on ppc64le:
>
>   - original code:  3398 frames 1719.0 seconds 2.0 fps 255.0/505.9/2773.0/0.0 ms
>   - with the patch: 3398 frames 1690.4 seconds 2.0 fps 241.0/497.5/2563.0/0.2 ms
>   - 29 seconds faster with the patch, which is about 2%
>
> - OpenArena, on x86-64 without AVX:
>
>   - original code:  3398 frames 239.6 seconds 14.2 fps 38.0/70.5/719.0/14.6 ms
>   - with the patch: 3398 frames 244.4 seconds 13.9 fps 38.0/71.9/697.0/14.3 ms
>   - 0.3 fps slower with the patch (about 2%)
>
> Additional details can be found at:
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_archives_mesa-2Ddev_2015-2DOctober_098635.html&d=BQIBAg&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_I&m=leupoMWQQSziy-ONBqVRNVTPLKwGiZIiJ4rAJTwPcp0&s=G-j7DINld6T77nYUd6diDitYgoXqgWdJEsmLk6vpDw4&e=
>
> Signed-off-by: Oded Gabbay <oded.gab...@gmail.com>
> ---
>  src/gallium/drivers/llvmpipe/lp_bld_interp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/llvmpipe/lp_bld_interp.c b/src/gallium/drivers/llvmpipe/lp_bld_interp.c
> index df262fa..a2055d2 100644
> --- a/src/gallium/drivers/llvmpipe/lp_bld_interp.c
> +++ b/src/gallium/drivers/llvmpipe/lp_bld_interp.c
> @@ -746,7 +746,7 @@ lp_build_interp_soa_init(struct lp_build_interp_soa_context *bld,
>
>     pos_init(bld, x0, y0);
>
> -   if (coeff_type.length > 4) {
> +   if (coeff_type.length >= 4) {
>        bld->simple_interp = TRUE;
>        {
>           /* XXX this should use a global static table */
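
For readers skimming the thread, here is a minimal standalone sketch of what the one-character change does and why the two paths can disagree in the first place. It is not llvmpipe code. The helper names (float_lanes, simple_interp_before, simple_interp_after, sum_left_to_right, sum_increments_first) are hypothetical, and it assumes that coeff_type.length counts 32-bit float lanes, so that a 128bit vector has length 4. The two sum functions are only stand-ins for "same terms, different grouping"; they do not reproduce the formulas in coeffs_init() or coeffs_init_simple().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: coeff_type.length counts 32-bit float lanes, so a 128-bit
 * vector has length 4 and a 256-bit (AVX) vector has length 8. */
unsigned float_lanes(unsigned vector_bits)
{
    assert(vector_bits % 32 == 0);
    return vector_bits / 32;
}

/* Condition before the patch: only vectors wider than 128 bits
 * select the single-step ("simple") method. */
bool simple_interp_before(unsigned length)
{
    return length > 4;
}

/* Condition after the patch: 128-bit vectors select it as well. */
bool simple_interp_after(unsigned length)
{
    return length >= 4;
}

/* Hypothetical stand-ins for two evaluation orders of a0 + dx + dy.
 * These are NOT llvmpipe's formulas; they only show that regrouping
 * the same float terms can change the rounded result. */
float sum_left_to_right(float a0, float dx, float dy)
{
    return (a0 + dx) + dy;
}

float sum_increments_first(float a0, float dx, float dy)
{
    return a0 + (dx + dy);
}
```

Under that assumption, float_lanes(128) is 4, so simple_interp_before(4) is false and simple_interp_after(4) is true, while a 256bit vector (length 8) takes the simple path either way. For the rounding part, calling both sum functions with 1e8f, -1e8f and 1.0f gives 1.0f for the first grouping and 0.0f for the second when float arithmetic is evaluated in single precision, because -1e8f + 1.0f rounds back to -1e8f.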