On 13/04/16 06:51, Jose Fonseca wrote:
On 13/04/16 04:00, srol...@vmware.com wrote:
From: Roland Scheidegger <srol...@vmware.com>

We used to use sse roundps intrinsic directly, but switched to use the
llvm
intrinsics for rounding with e4f01da15d8c6ce3e8c77ff3ff3d2ce2574a3f7b.
However, llvm semantics follows standard math lib round function which is
specced to do roundNearestAwayFromZero but we really want
roundNearestEven
(moreoever, using round generates atrocious code since the cpu can't
do it
directly and it results in scalar calls to libm __roundf).
So, use llvm.nearbyint instead, which does exactly the right thing,
and even
has the advantage of being available with llvm 3.3 too. (I've verified it
actually generates a roundps instruction with llvm 3.3.)

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909

Thanks Roland.

I should've cought that.  I think I forget to actually look at
llvm.round's assembly as I only found that intrisic much later.  I
presumed it was roundps.

---
  src/gallium/auxiliary/gallivm/lp_bld_arit.c | 6 +-----
  1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
index 0c43617..74f1683 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
@@ -1863,11 +1863,7 @@ lp_build_round_arch(struct lp_build_context *bld,

        switch (mode) {
        case LP_BUILD_ROUND_NEAREST:
-         if (HAVE_LLVM >= 0x0304) {
-            intrinsic_root = "llvm.round";
-         } else {
-            return lp_build_nearest_sse41(bld, a);

You can remove lp_build_nearest_sse41 too while you're at it.

-         }
+         intrinsic_root = "llvm.nearbyint";
           break;
        case LP_BUILD_ROUND_FLOOR:
           intrinsic_root = "llvm.floor";


We should add a test case to lp_test_arit too.  Like this:

diff --git a/src/gallium/drivers/llvmpipe/lp_test_arit.c
b/src/gallium/drivers/llvmpipe/lp_test_arit.c
index a0f2db7..ba831f3 100644
--- a/src/gallium/drivers/llvmpipe/lp_test_arit.c
+++ b/src/gallium/drivers/llvmpipe/lp_test_arit.c
@@ -218,6 +218,7 @@ const float round_values[] = {
        -10.0, -1, 0.0, 12.0,
        -1.49, -0.25, 1.25, 2.51,
        -0.99, -0.01, 0.01, 0.99,
+      -1.5, -0.5, 0.5, 1.5,
        1.401298464324817e-45f, // smallest denormal
        -1.401298464324817e-45f,
        1.62981451e-08f,
@@ -283,7 +284,7 @@ unary_tests[] = {
     {"sin", &lp_build_sin, &sinf, sincos_values,
Elements(sincos_values), 20.0 },
     {"cos", &lp_build_cos, &cosf, sincos_values,
Elements(sincos_values), 20.0 },
     {"sgn", &lp_build_sgn, &sgnf, exp2_values, Elements(exp2_values),
20.0 },
-   {"round", &lp_build_round, &roundf, round_values,
Elements(round_values), 24.0 },
+   {"round", &lp_build_round, &nearbyintf, round_values,
Elements(round_values), 24.0 },
     {"trunc", &lp_build_trunc, &truncf, round_values,
Elements(round_values), 24.0 },
     {"floor", &lp_build_floor, &floorf, round_values,
Elements(round_values), 24.0 },
     {"ceil", &lp_build_ceil, &ceilf, round_values,
Elements(round_values), 24.0 },


It seems nearbyintf is available on MSVC too.

It's just not clear whether MSVC implementation follows round-to-even:
the nearbyint doc page is missing from MSDN, and rintf page explicitly
states that rint(2.5) is 3 per,
https://msdn.microsoft.com/en-us/library/dn465165.aspx

I've verified this works -- https://ci.appveyor.com/project/jrfonseca/mesa/build/20 --, so I went ahead and commited your change, plus the lp_test_arit one.

Jose

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to