craig.topper added inline comments.
================ Comment at: clang/include/clang/Basic/BuiltinsX86.def:2014 + +TARGET_BUILTIN(__builtin_ia32_vfmaddsh3_mask, "V8xV8xV8xV8xUcIi", "ncV:128:", "avx512fp16") +TARGET_BUILTIN(__builtin_ia32_vfmaddsh3_maskz, "V8xV8xV8xV8xUcIi", "ncV:128:", "avx512fp16") ---------------- LuoYuanke wrote: > What does "3" stand for? The 3 is there because AMD's 4 operand fma used vfmaddss/vfmaddsd. So Intel's 3 operand used vfmaddss3/vfmaddsd3. That naming is being carried forward here. ================ Comment at: clang/lib/Headers/avx512vlfp16intrin.h:1385 + __m128h __C) { + return (__m128h)__builtin_ia32_selectph_128( + (__mmask8)__U, ---------------- LuoYuanke wrote: > Sorry, I'm confused sometimes we use mask builtin, sometimes we use select > builtin. Any guideline on it? Ideally FP should never use select because it doesn't convey that exceptions should be masked for strictfp. But the mistake was already made for add/sub/mul/div/fma/etc years ago before strictfp support existed in llvm. fp16 is intentionally following float/double for consistency. ================ Comment at: llvm/include/llvm/IR/IntrinsicsX86.td:5709 + + def int_x86_avx512fp16_vfmadd_ph_512 + : Intrinsic<[ llvm_v32f16_ty ], ---------------- LuoYuanke wrote: > I notice there is no builtin bound to this intrinsic. What is it used for? It is manually selected in CGBuiltin.cpp ================ Comment at: llvm/include/llvm/IR/IntrinsicsX86.td:5727 + [ IntrNoMem, ImmArg<ArgIndex<3>> ]>; + def int_x86_avx512fp16_vfmadd_f16 + : Intrinsic<[ llvm_half_ty ], ---------------- LuoYuanke wrote: > ph? It's scalar so it shouldn't be ph. This matches int_x86_avx512_vfmadd_f32 and int_x86_avx512_vfmadd_f64. They don't use ss/sd because the ss/sd names are usually used for intrinsics that 128-bit operands and only modify the lower element. int_x86_avx512_vfmadd_f32 and int_x86_avx512_vfmadd_f64 have float/double inputs and produce float/double results. ================ Comment at: llvm/test/CodeGen/X86/avx512fp16-fma-commute.ll:9 + +define half @fma_123_f16(half %x, half %y, half %z) { +; CHECK-LABEL: fma_123_f16: ---------------- LuoYuanke wrote: > The name 123 is not the same with the generated instruction (213sh). Is it > expected? 123 represents how the 3 arguments to the fucntion are mapped to the 3 intrinsic arguments that it calls. There are 6 possible permutations which are all tested here, but only 3 instruction mnemonics. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105268/new/ https://reviews.llvm.org/D105268 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits