On Thursday, May 19, 2016 12:57:44 PM PDT Rob Clark wrote:
> On Wed, May 18, 2016 at 6:00 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> > ffma is an explicitly fused multiply add with higher precision.
> > The optimizer will take care of promoting mul/add to fma when
> > it's beneficial to do so.
> >
> > This fixes failures on Gen4-5 when using this pass, as those platforms
> > don't actually implement fma().
>
> hmm, we can't rely on the opt-algebraic pass to do this?
>
> BR,
> -R

We can rely on either nir_opt_algebraic (with the fuse_ffma flag set) or
brw_nir_opt_peephole_ffma() (if someone wants to move it to
src/compiler/nir and use it) to fuse add+mul into ffma.

However, we can't rely on nir_opt_algebraic to split up ffma into
mul+add for us. We made it stop doing that a little while ago, so that
the GLSL fma() built-in is always higher precision. (The thinking is
that if apps didn't care, they would just write (a*b+c), and that
splitting fma() is pretty bunk...and splitting and reassembling so fma()
has /inconsistent/ precision is even more bunk...)

I suppose I could just set lower_ffma in i965's nir_compiler_options for
Gen < 6 where we don't have a MAD instruction (and don't support the
GLSL fma() built-in function, either). That might be more sensible.

--Ken
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev