Ah OK. So llvm.fmuladd is more like llvm.fmadontcare. Wrong assumption on my part.
On Wed, Oct 4, 2017 at 1:00 PM, Connor Abbott <cwabbo...@gmail.com> wrote: > No. From the LLVM langref: > > The ‘llvm.fmuladd.*‘ intrinsic functions represent multiply-add > expressions that can be fused if the code generator determines that > (a) the target instruction set has support for a fused operation, and > (b) that the fused operation is more efficient than the equivalent, > separate pair of mul and add instructions. > > The (b) part is especially important -- it says that LLVM can pick and > choose which fmuladd intrinsics to turn into FMA instructions, or > unfused MULADD instructions, or just a sequence of mul+add. For > example, if many instructions call fmuladd with the first two > arguments the same, it can break it up into a mul followed by a bunch > of adds. That wouldn't be ok under the GLSL precise semantics > (assuming the target would've used FMA otherwise, which I think some > GCN cards will do). > > Also, and maybe more importantly, if an app developer explicitly asks > for fma() with a precise modifier, it's probably not a great idea to > then give them an unfused mul+add -- it's legal, thanks to GLSL's > weasel-wording, but probably not what you really want, on HW which > actually does have an FMA instruction :) > > Connor > > > On Wed, Oct 4, 2017 at 11:25 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote: >> Wouldn't this guarantee that nothing is fused (and thus fine)? >> Presumably fmuladd always does mul+add either as 1 or 2 instructions? >> >> On Wed, Oct 4, 2017 at 10:57 AM, Connor Abbott <cwabbo...@gmail.com> wrote: >>> If the fma has the exact flag, then we need to use the llvm.fma >>> intrinsic. These come from fma() calls with the precise or invariant >>> qualifiers in GLSL, where you basically have to fuse everything or >>> fuse nothing consistently, and llvm.fmuladd doesn't guarantee that. >>> >>> On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie <airl...@gmail.com> wrote: >>>> From: Dave Airlie <airl...@redhat.com> >>>> >>>> For Vulkan SPIR-V the spec states >>>> fma() Inherited from OpFMul followed by OpFAdd. >>>> >>>> Matt says the backend will do the right thing depending on the >>>> hardware being compiled for, if you use the fmuladd intrinsic. >>>> >>>> Using the Mad Max pts test, on high settings at 4K: >>>> CHP: 55->60 >>>> HGDD: 46->50 >>>> LM: 55->60 >>>> No change on Stronghold. >>>> >>>> Thanks to Feral for spending the time to track this down. >>>> >>>> Signed-off-by: Dave Airlie <airl...@redhat.com> >>>> --- >>>> src/amd/common/ac_nir_to_llvm.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/src/amd/common/ac_nir_to_llvm.c >>>> b/src/amd/common/ac_nir_to_llvm.c >>>> index d7b6259..11ba487 100644 >>>> --- a/src/amd/common/ac_nir_to_llvm.c >>>> +++ b/src/amd/common/ac_nir_to_llvm.c >>>> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, >>>> const nir_alu_instr *instr) >>>> result); >>>> break; >>>> case nir_op_ffma: >>>> - result = emit_intrin_3f_param(&ctx->ac, "llvm.fma", >>>> + result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd", >>>> ac_to_float_type(&ctx->ac, >>>> def_type), src[0], src[1], src[2]); >>>> break; >>>> case nir_op_ibitfield_extract: >>>> -- >>>> 2.9.4 >>>> >>>> _______________________________________________ >>>> mesa-dev mailing list >>>> mesa-dev@lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >>> _______________________________________________ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev