LuoYuanke added inline comments.

================
Comment at: clang/lib/Headers/avx512bf16intrin.h:13
+#ifdef __SSE2__
+
----------------
What is this macro check used for?

================
Comment at: clang/test/CodeGen/X86/avx512bf16-error.c:14
+__bfloat16 bar(__bfloat16 a, __bfloat16 b) {
+  return a + b;
+}
----------------
Do we need tests for the other operations (-, *, /) as well?

================
Comment at: llvm/include/llvm/IR/IntrinsicsX86.td:4928
               Intrinsic<[llvm_v4f32_ty],
               [llvm_v4f32_ty, llvm_v4i32_ty, llvm_v4i32_ty], [IntrNoMem]>;
   def int_x86_avx512bf16_dpbf16ps_256:
----------------
It seems we still use i32 to represent <2 x bf16>, but we don't have a better option, since each mask bit covers a pair of bf16 elements.

================
Comment at: llvm/lib/IR/AutoUpgrade.cpp:4095
+                   Intrinsic::x86_avx512bf16_mask_cvtneps2bf16_128)
+        Args[1] = Builder.CreateBitCast(
+            Args[1], FixedVectorType::get(Builder.getBFloatTy(), NumElts));
----------------
Why is there no bitcast of the inputs for the other intrinsics? I expected to see a bitcast from vXi16 to vXbf16.

================
Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:3916
+multiclass mask_move_lowering_f16_bf16<AVX512VLVectorVTInfo _> {
   let Predicates = [HasBWI] in {
+    def : Pat<(_.info512.VT (vselect VK32WM:$mask, (_.info512.VT VR512:$src1), (_.info512.VT VR512:$src0))),
----------------
I'm not sure whether the indentation here is correct.

================
Comment at: llvm/test/CodeGen/X86/bfloat.ll:32
+; BF16-NEXT:    shll $16, %eax
+; BF16-NEXT:    vmovd %eax, %xmm1
+; BF16-NEXT:    vaddss %xmm0, %xmm1, %xmm0
----------------
It seems the only difference between the SSE2 and BF16 checks is whether SSE or AVX instructions are used. What do we expect to test for BF16?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132329/new/

https://reviews.llvm.org/D132329

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits