On Sun, Feb 7, 2016 at 6:13 PM, Axel Davy <axel.d...@ens.fr> wrote:
> SQRT is not supported everywhere, so replace
> it by RSQ + RCP
>
> Signed-off-by: Axel Davy <axel.d...@ens.fr>
> ---
>  src/gallium/state_trackers/nine/nine_ff.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/nine/nine_ff.c b/src/gallium/state_trackers/nine/nine_ff.c
> index a5466a7..894fc63 100644
> --- a/src/gallium/state_trackers/nine/nine_ff.c
> +++ b/src/gallium/state_trackers/nine/nine_ff.c
> @@ -563,7 +563,8 @@ nine_ff_build_vs(struct NineDevice9 *device, struct vs_build_ctx *vs)
>          struct ureg_src cPsz2 = ureg_DECL_constant(ureg, 27);
>
>          ureg_DP3(ureg, tmp_x, ureg_src(r[1]), ureg_src(r[1]));
> -        ureg_SQRT(ureg, tmp_y, _X(tmp));
> +        ureg_RSQ(ureg, tmp_y, _X(tmp));
> +        ureg_RCP(ureg, tmp_y, _Y(tmp));

I'd recommend doing

ureg_MUL(ureg, tmp_y, _Y(tmp), _X(tmp));

instead of the RCP, i.e. computing sqrt(x) as x * rsq(x). That should be
(a) more numerically stable (rcp doesn't have great precision), and
(b) not blow up for 0.

>          ureg_MAD(ureg, tmp_x, _Y(tmp), _YYYY(cPsz2), _XXXX(cPsz2));
>          ureg_MAD(ureg, tmp_x, _Y(tmp), _X(tmp), _WWWW(cPsz1));
>          ureg_RCP(ureg, tmp_x, ureg_src(tmp));
> --
> 2.7.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev