[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `clang-hip-vega20` running on `hip-vega20-0` while building `clang` at step 3 "annotate". Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/5253 Here is the relevant piece of the build log for the reference ``` Step 3 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/hip-build.sh --jobs=' (failure) ... [38/40] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 -DNDEBUG External/HIP/CMakeFiles/InOneWeekend-hip-6.0.2.dir/workload/ray-tracing/InOneWeekend/main.cc.o -o External/HIP/InOneWeekend-hip-6.0.2 --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt -unwindlib=libgcc -frtlib-add-rpath && cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /usr/local/bin/cmake -E create_symlink /buildbot/llvm-test-suite/External/HIP/InOneWeekend.reference_output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend.reference_output-hip-6.0.2 [39/40] /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -DNDEBUG -O3 -DNDEBUG -w -Werror=date-time --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 --offload-arch=gfx1100 -xhip -mfma -MD -MT External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -MF External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o.d -o External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -c /buildbot/llvm-test-suite/External/HIP/workload/ray-tracing/TheNextWeek/main.cc [40/40] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 -DNDEBUG External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -o External/HIP/TheNextWeek-hip-6.0.2 --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt -unwindlib=libgcc 
-frtlib-add-rpath && cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /usr/local/bin/cmake -E create_symlink /buildbot/llvm-test-suite/External/HIP/TheNextWeek.reference_output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/TheNextWeek.reference_output-hip-6.0.2 + build_step 'Testing HIP test-suite' + echo '@@@BUILD_STEP Testing HIP test-suite@@@' @@@BUILD_STEP Testing HIP test-suite@@@ + ninja -v check-hip-simple [0/1] cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test memmove-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test -- Testing: 7 tests, 7 workers -- Testing: 0.. 10.. 20.. 30.. 40 FAIL: test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test (4 of 7) TEST 'test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test' FAILED /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out --redirect-input /dev/null --summary /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.time /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend-hip-6.0.2 cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP ; /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2 + cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP + /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target 
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2 /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target: Comparison failed, textual difference between 'M' and 'i' /usr/bin/strip: /bin/bash.stripped: Bad file descriptor Testing: 0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. Failed Tests (1): test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test Testing Time: 340.03s Total Discovered Tests: 7 Passed: 6 (85.71%) Failed: 1 (14.29%) FAILED: External/HIP/CMakeFiles/check-hip-simple-hip-6.0.2 cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test memmove-hip-6.0.2.tes
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
github-actions[bot] wrote: @JinjinLi868 Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our [build bots](https://lab.llvm.org/buildbot/). If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail [here](https://llvm.org/docs/MyFirstTypoFix.html#myfirsttypofix-issues-after-landing-your-pr). If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of [LLVM development](https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy). You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! https://github.com/llvm/llvm-project/pull/89051 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/wangpc-pp closed https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051

>From 6688d7ed8a0ce0c6885b82218e832320da57eef0 Mon Sep 17 00:00:00 2001
From: JinjinLi868
Date: Wed, 4 Sep 2024 16:43:48 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc and
fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp | 4 +++
 .../test/CodeGen/X86/bfloat16-convert-half.c | 25 +++
 2 files changed, 29 insertions(+)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 7aa2d3d89c2936..f315043411ae1e 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1454,6 +1454,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
     return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+    Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+    return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  }
   if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
     return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..55451dc6f092cd
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \
+// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT: [[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT: ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+  return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT: [[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT: ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+  return (__bf16)a;
+}
+
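The CHECK lines in the patch expect a half-to-bfloat conversion to widen to float and then narrow again. As a rough illustration of why a plain bit reinterpretation would give a different value, here is a minimal software sketch of the same two steps on raw bit patterns. The function names are hypothetical and not part of clang or LLVM, and round-to-nearest-even is assumed for the narrowing step.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: decode IEEE binary16 bits into a float.
// Every half value is exactly representable as a float, so this step is lossless.
inline float half_bits_to_float(std::uint16_t h) {
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  const std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t man = h & 0x3FFu;
  std::uint32_t bits;
  if (exp == 0x1Fu) {              // infinity or NaN
    bits = sign | 0x7F800000u | (man << 13);
  } else if (exp != 0) {           // normal: rebias the exponent from 15 to 127
    bits = sign | ((exp + 112u) << 23) | (man << 13);
  } else if (man == 0) {           // signed zero
    bits = sign;
  } else {                         // subnormal half: normalize the mantissa
    std::uint32_t shift = 0;
    while ((man & 0x400u) == 0) {
      man <<= 1;
      ++shift;
    }
    man &= 0x3FFu;                 // drop the now-implicit leading one
    bits = sign | ((113u - shift) << 23) | (man << 13);
  }
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Hypothetical helper: narrow a float to bfloat16 bits.
// Assumption: round-to-nearest-even; NaNs are kept as quiet NaNs.
inline std::uint16_t float_to_bf16_bits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  const std::uint32_t rounding = 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<std::uint16_t>((bits + rounding) >> 16);
}

// Widen, then narrow: the same shape as the fpext + fptrunc pair in the test.
inline std::uint16_t half_to_bf16_bits(std::uint16_t h) {
  return float_to_bf16_bits(half_bits_to_float(h));
}
```

With these helpers, the half encoding of 1.0 (0x3C00) maps to the bfloat16 encoding 0x3F80, so the two 16-bit formats do not share bit patterns for the same value.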
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> > > The vector tests should still be added
> >
> > Sorry. If I remove the vector change, I have to remove the test case, because the current code for converting between vector types of half and bfloat16 has a bug: it hits the assertion "Invalid cast!".
>
> OK, LGTM with the else before return fixed. Can you handle the vector case in a follow up?

OK, I will handle the vector case in a follow-up.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
arsenm wrote:

> > The vector tests should still be added
>
> Sorry. If I remove the vector change, I have to remove the test case, because the current code for converting between vector types of half and bfloat16 has a bug: it hits the assertion "Invalid cast!".

OK, LGTM with the else before return fixed. Can you handle the vector case in a follow up?

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

I added the gcc behavior for this case:

```
__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
  return (__bf16)a;
}

test_convert_from_fp16_to_bf16(_Float16):
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        movd    eax, xmm0
        mov     WORD PTR [rbp-2], ax
        pinsrw  xmm0, WORD PTR [rbp-2], 0
        call    __trunchfbf2
        movd    eax, xmm0
        movd    xmm0, eax
        leave
        ret
```

```
_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
  return (_Float16)a;
}

test_convert_from_bf16_to_fp16(std::bfloat16_t):
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        movd    eax, xmm0
        mov     WORD PTR [rbp-2], ax
        movzx   eax, WORD PTR [rbp-2]
        sal     eax, 16
        movd    xmm0, eax
        call    __truncsfhf2
        movd    eax, xmm0
        movd    xmm0, eax
        leave
        ret
```

https://github.com/llvm/llvm-project/pull/89051
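In the bf16-to-fp16 listing, gcc widens the input with `sal eax, 16` before calling `__truncsfhf2`. That shift is the whole bfloat16-to-float conversion, because bfloat16 is the upper 16 bits of an IEEE binary32 value. A minimal sketch of that widening step follows; `bf16_bits_to_float` is a hypothetical name used only for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen bfloat16 bits to float.
// bfloat16 keeps float's sign bit, 8 exponent bits and top 7 mantissa bits,
// so the conversion is exact and amounts to a 16-bit left shift.
inline float bf16_bits_to_float(std::uint16_t b) {
  const std::uint32_t bits = static_cast<std::uint32_t>(b) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}
```

The narrowing to half that follows in the listing is the lossy part, which is why gcc leaves it to a library routine rather than a shift.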
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> The vector tests should still be added

Sorry. If I remove the vector change, I have to remove the test case, because the current code for converting between vector types of half and bfloat16 has a bug: it hits the assertion "Invalid cast!".

    CastInst *CastInst::Create(Instruction::CastOps op, Value *S, Type *Ty,
                               const Twine &Name, InsertPosition InsertBefore) {
      assert(castIsValid(op, S, Ty) && "Invalid cast!");
      // Construct and return the appropriate CastInst subclass
      switch (op) {
      case Trunc:   return new TruncInst   (S, Ty, Name, InsertBefore);
      case ZExt:    return new ZExtInst    (S, Ty, Name, InsertBefore);
      case SExt:    return new SExtInst    (S, Ty, Name, InsertBefore);
      case FPTrunc: return new FPTruncInst (S, Ty, Name, InsertBefore);
      case FPExt:   return new FPExtInst   (S, Ty, Name, InsertBefore);
      default: llvm_unreachable("Invalid opcode provided");
      }
    }

    CastInst::castIsValid(Instruction::CastOps op, Type *SrcTy, Type *DstTy) {
      if (!SrcTy->isFirstClassType() || !DstTy->isFirstClassType() ||
          SrcTy->isAggregateType() || DstTy->isAggregateType())
        return false;

      // Get the size of the types in bits, and whether we are dealing
      // with vector types, we'll need this later.
      bool SrcIsVec = isa<VectorType>(SrcTy);
      bool DstIsVec = isa<VectorType>(DstTy);
      unsigned SrcScalarBitSize = SrcTy->getScalarSizeInBits();
      unsigned DstScalarBitSize = DstTy->getScalarSizeInBits();

      // If these are vector types, get the lengths of the vectors (using zero for
      // scalar types means that checking that vector lengths match also checks that
      // scalars are not being converted to vectors or vectors to scalars).
      ElementCount SrcEC = SrcIsVec ? cast<VectorType>(SrcTy)->getElementCount()
                                    : ElementCount::getFixed(0);
      ElementCount DstEC = DstIsVec ? cast<VectorType>(DstTy)->getElementCount()
                                    : ElementCount::getFixed(0);

      // Switch on the opcode provided
      switch (op) {
      default: return false; // This is an input error
      case Instruction::FPExt:
        return SrcTy->isFPOrFPVectorTy() && DstTy->isFPOrFPVectorTy() &&
               SrcEC == DstEC && SrcScalarBitSize < DstScalarBitSize;

Currently, a vector conversion between half and bfloat16 is emitted as an FPExt. For FPExt, castIsValid() requires SrcScalarBitSize < DstScalarBitSize, but for half and bfloat16 vectors SrcScalarBitSize is equal to DstScalarBitSize.

https://github.com/llvm/llvm-project/pull/89051
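The size rule quoted above can be modelled in a few lines to show why a direct half-to-bfloat cast is rejected while the two-step route through float is accepted. This is a simplified stand-alone model, not LLVM's API: `FPType`, `CastOp` and `cast_is_valid` are hypothetical names, and the FPTrunc condition is assumed to mirror the FPExt one shown in the excerpt.

```cpp
// Simplified stand-in for an LLVM floating-point type: scalar width in bits
// and lane count (0 lanes means a scalar, as in the quoted excerpt).
struct FPType {
  unsigned scalar_bits;
  unsigned lanes;
};

enum class CastOp { FPExt, FPTrunc };

// Model of the validity rule: lane counts must match, FPExt must strictly
// widen, and FPTrunc (assumed to be the mirror image) must strictly narrow.
constexpr bool cast_is_valid(CastOp op, FPType src, FPType dst) {
  if (src.lanes != dst.lanes)
    return false;
  switch (op) {
  case CastOp::FPExt:
    return src.scalar_bits < dst.scalar_bits;
  case CastOp::FPTrunc:
    return src.scalar_bits > dst.scalar_bits;
  }
  return false;
}

// Illustrative 4-lane types; half and bfloat are both 16 bits per element.
constexpr FPType kHalf4{16, 4};
constexpr FPType kBFloat4{16, 4};
constexpr FPType kFloat4{32, 4};

// A direct 16-bit to 16-bit cast is neither a valid FPExt nor a valid FPTrunc.
static_assert(!cast_is_valid(CastOp::FPExt, kHalf4, kBFloat4), "same width");
static_assert(!cast_is_valid(CastOp::FPTrunc, kHalf4, kBFloat4), "same width");
// Going through float satisfies both checks.
static_assert(cast_is_valid(CastOp::FPExt, kHalf4, kFloat4), "widening");
static_assert(cast_is_valid(CastOp::FPTrunc, kFloat4, kBFloat4), "narrowing");
```

Under this model the equal bit widths alone are enough to trip the check, independent of the lane count.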
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/arsenm commented: The vector tests should still be added https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -1431,9 +1431,13 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
     return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+    Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+    return Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())

arsenm wrote:

Still needs to be resolved. Also should probably get a comment

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> > OK, you mean I should remove the vector test case from this patch and keep only the scalar test case?
>
> No, keep the tests. Only keep the scalar behavior change. The previous revision was essentially correct and minimal

I have changed it. This patch now keeps only the scalar change.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> > > > @JinjinLi868 are you still working on this?
> > >
> > > Yes, I am. Could you give me some advice, and can I help you?
> >
> > Can we have a scalar-only patch as @arsenm requested?
>
> OK, you mean I should remove the vector test case from this patch and keep only the scalar test case?

I have changed it. This patch now keeps only the scalar change.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051 >From 3c2e873551aa85abe9011996c5adfaac544d0104 Mon Sep 17 00:00:00 2001 From: JinjinLi868 Date: Wed, 4 Sep 2024 16:43:48 +0800 Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes. --- clang/lib/CodeGen/CGExprScalar.cpp| 5 +++- .../test/CodeGen/X86/bfloat16-convert-half.c | 25 +++ 2 files changed, 29 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp index 7aa2d3d89c2936..338b6de4693ae9 100644 --- a/clang/lib/CodeGen/CGExprScalar.cpp +++ b/clang/lib/CodeGen/CGExprScalar.cpp @@ -1454,7 +1454,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType, return Builder.CreateFPToUI(Src, DstTy, "conv"); } - if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) + if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) { +Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext"); +return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc"); + } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) return Builder.CreateFPTrunc(Src, DstTy, "conv"); return Builder.CreateFPExt(Src, DstTy, "conv"); } diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c new file mode 100644 index 00..55451dc6f092cd --- /dev/null +++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c @@ -0,0 +1,25 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \ +// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s + +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat 
[[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half +// CHECK-NEXT:ret half [[FPTRUNC]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat +// CHECK-NEXT:ret bfloat [[FPTRUNC]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
arsenm wrote:

> OK, you mean I should remove the vector test case from this patch and keep only the scalar test case?

No, keep the tests. Only keep the scalar behavior change. The previous revision was essentially correct and minimal

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> > > @JinjinLi868 are you still working on this?
> >
> > Yes, I am. Could you give me some advice, and can I help you?
>
> Can we have a scalar-only patch as @arsenm requested?

OK, you mean I should remove the vector test case from this patch and keep only the scalar test case?

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
topperc wrote:

> > @JinjinLi868 are you still working on this?
>
> Yes, I am. Could you give me some advice, and can I help you?

Can we have a scalar-only patch as @arsenm requested?

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -1431,9 +1431,13 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
     return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+    Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+    return Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())

topperc wrote:

Why was this resolved? It doesn't seem to have been addressed.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
topperc wrote:

> > @JinjinLi868 are you still working on this?
>
> I can ask him. Is this PR blocking some of your recent work on float16/bf16?

I stumbled onto the verifier error earlier while writing a test. It's not blocking me.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> @JinjinLi868 are you still working on this?

Yes, I am. Could you give me some advice, and can I help you?

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
wangpc-pp wrote:

> @JinjinLi868 are you still working on this?

I can ask him. Is this PR blocking some of your work on float16/bf16?

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
topperc wrote: @JinjinLi868 are you still working on this? https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,25 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \ +// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s + +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half +// CHECK-NEXT:ret half [[FPTRUNC]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat +// CHECK-NEXT:ret bfloat [[FPTRUNC]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + arsenm wrote: I think these tests need to be additive. The vector behavior seems to be different between standard C and the proper vector languages? https://github.com/llvm/llvm-project/pull/89051 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
arsenm wrote: > ping Ping Do you have another review comment? This has now confused me. You should roll back to the case where you only changed the scalar behavior. Any vector behavior change should be a separate PR, if that is even correct. I would still like to know what the gcc behavior is in this case https://github.com/llvm/llvm-project/pull/89051 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote: ping https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/wangpc-pp edited https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \
+// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT: [[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT: ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+  return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT: [[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT: ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+  return (__bf16)a;
+}
+

wangpc-pp wrote:

The vector tests have now been moved to the HIP target, as the IR is weird on X86.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,165 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \ +// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s + +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half +// CHECK-NEXT:ret half [[FPTRUNC]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat +// CHECK-NEXT:ret bfloat [[FPTRUNC]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_half2_to_bfloat2( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x half> [[IN1]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[TMP0]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP1]] +// +bfloat2 test_cast_from_half2_to_bfloat2(half2 in) { + return (bfloat2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_half4_to_bfloat4( 
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8 +// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8 +// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x half> [[IN1]] to <4 x bfloat> +// CHECK-NEXT:store <4 x bfloat> [[TMP0]], ptr [[RETVAL]], align 8 +// CHECK-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8 +// CHECK-NEXT:ret double [[TMP1]] +// +bfloat4 test_cast_from_half4_to_bfloat4(half4 in) { + return (bfloat4)in; +} + +// CHECK-LABEL: define dso_local i32 @test_cast_from_bfloat2_to_half2( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x bfloat>, ptr [[IN]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x bfloat> [[IN1]] to <2 x half> +// CHECK-NEXT:store <2 x half> [[TMP0]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP1]] +// +half2 test_cast_from_bfloat2_to_half2(bfloat2 in) { + return (half2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_bfloat4_to_half4( +// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:[[IN:%.*]] = alloca <4 x bfloat>, align 8 +// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8 +// CHECK-NEXT:[[IN1:%.*]] = load <4 x bfloat>, ptr [[IN]], align 8 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x bfloat> [[IN1]] to <4 x half> +// CHECK-NEXT:store <4 x half> [[TMP0]], ptr [[RETVAL]], align 8 +// CHECK-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8 +// 
CHECK-NEXT:ret double [[TMP1]] +// +half4 test_cast_from_bfloat4_to_half4(bfloat4 in) { + return (half4)in; +} + + +// CHECK-LABEL: define dso_local i32 @test_convertvector_from_half2_to_bfloat2( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:[[FPEXT:%.*]] = fpext <2 x half> [[IN1]] to <2 x float> +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc <2 x float> [[FPEXT]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[FPTRUNC]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = load i32,
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -1906,7 +1909,15 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
     assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
            "Unknown real conversion");
-    if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+    if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {

arsenm wrote:

This should be a separate change if it's correct

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,25 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \
+// RUN:   %s -o - | opt -S -passes=mem2reg | FileCheck %s
+
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[FPEXT:%.*]] = fpext bfloat [[A]] to float
+// CHECK-NEXT:    [[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half
+// CHECK-NEXT:    ret half [[FPTRUNC]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+    return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[FPEXT:%.*]] = fpext half [[A]] to float
+// CHECK-NEXT:    [[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat
+// CHECK-NEXT:    ret bfloat [[FPTRUNC]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+    return (__bf16)a;
+}
+

arsenm wrote:

The regular C case should still test the vector conversions. Also the rebase is confusing me

https://github.com/llvm/llvm-project/pull/89051
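The scalar checks in this test expect the conversion to go through float (fpext, then fptrunc). A minimal sketch of why that matters, emulating both 16-bit formats on raw bit patterns in portable C++: the helper names (halfBitsToFloat, bf16BitsToFloat, floatToBf16Bits, halfToBf16ViaFloat) are hypothetical and not part of clang or LLVM, and the narrowing step assumes round-to-nearest-even.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret 32 raw bits as an IEEE-754 binary32 value.
inline float bitsToFloat(uint32_t U) {
  float F;
  std::memcpy(&F, &U, sizeof F);
  return F;
}

inline uint32_t floatToBits(float F) {
  uint32_t U;
  std::memcpy(&U, &F, sizeof U);
  return U;
}

// Widen IEEE binary16 bits (1 sign, 5 exponent, 10 mantissa) to float.
// Every binary16 value is exactly representable in float.
inline float halfBitsToFloat(uint16_t H) {
  bool Negative = (H >> 15) & 1u;
  int Exp = (H >> 10) & 0x1F;
  uint32_t Man = H & 0x3FFu;
  float Mag;
  if (Exp == 0)  // zero or subnormal: Man * 2^-24
    Mag = std::ldexp(static_cast<float>(Man), -24);
  else if (Exp == 31)  // infinity or NaN
    Mag = Man ? std::numeric_limits<float>::quiet_NaN()
              : std::numeric_limits<float>::infinity();
  else  // normal: (1024 + Man) * 2^(Exp - 25)
    Mag = std::ldexp(static_cast<float>(Man | 0x400u), Exp - 25);
  return Negative ? -Mag : Mag;
}

// Widen bfloat16 bits (1 sign, 8 exponent, 7 mantissa) to float:
// bfloat16 is the upper half of a binary32 encoding.
inline float bf16BitsToFloat(uint16_t B) {
  return bitsToFloat(static_cast<uint32_t>(B) << 16);
}

// Narrow float to bfloat16 bits, rounding to nearest even (an assumption).
inline uint16_t floatToBf16Bits(float F) {
  uint32_t U = floatToBits(F);
  if (std::isnan(F))
    return static_cast<uint16_t>((U >> 16) | 0x0040u);  // keep the NaN quiet
  U += 0x7FFFu + ((U >> 16) & 1u);
  return static_cast<uint16_t>(U >> 16);
}

// Value-preserving half -> bfloat16: widen to float, then narrow.
inline uint16_t halfToBf16ViaFloat(uint16_t H) {
  return floatToBf16Bits(halfBitsToFloat(H));
}

// 1.0 is 0x3C00 in binary16; the same bits read as bfloat16 mean 2^-7.
inline void demo() {
  assert(halfBitsToFloat(0x3C00) == 1.0f);
  assert(bf16BitsToFloat(0x3C00) == 0.0078125f);  // reinterpreting changes the value
  assert(halfToBf16ViaFloat(0x3C00) == 0x3F80);   // going through float keeps 1.0
}
```

Neither 16-bit format is a subset of the other (binary16 has more mantissa bits, bfloat16 more exponent bits), so a single fptrunc or fpext chosen by type ordering cannot describe the conversion.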
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051

>From 0afac9d8a6acedff53089f55eacb92a2880f58aa Mon Sep 17 00:00:00 2001
From: Jinjin Li
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc and
fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp            | 15 +++-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 25 +++
 .../test/CodeGenHIP/bfloat16-half-convert.hip | 71 +++
 3 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c
 create mode 100644 clang/test/CodeGenHIP/bfloat16-half-convert.hip

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..8e35c801bc9599 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,7 +1431,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
     return Builder.CreateFPToUI(Src, DstTy, "conv");
   }

-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+    Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+    return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
     return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
 }
@@ -1906,7 +1909,15 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
     } else {
       assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
              "Unknown real conversion");
-      if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+      if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {
+        auto *ScrVecTy = cast(SrcTy);
+        Value *FloatVal = Builder.CreateFPExt(
+            Src,
+            llvm::VectorType::get(Builder.getFloatTy(),
+ScrVecTy->getElementCount()), + "fpext"); + Res = Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc"); +} else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) Res = Builder.CreateFPTrunc(Src, DstTy, "conv"); else Res = Builder.CreateFPExt(Src, DstTy, "conv"); diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c new file mode 100644 index 00..55451dc6f092cd --- /dev/null +++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c @@ -0,0 +1,25 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \ +// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s + +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half +// CHECK-NEXT:ret half [[FPTRUNC]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat +// CHECK-NEXT:ret bfloat [[FPTRUNC]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + diff --git a/clang/test/CodeGenHIP/bfloat16-half-convert.hip b/clang/test/CodeGenHIP/bfloat16-half-convert.hip new file mode 100644 index 00..0ffebb44c969b4 --- /dev/null +++ b/clang/test/CodeGenHIP/bfloat16-half-convert.hip @@ -0,0 +1,71 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -x hip -disable-O0-optnone -emit-llvm -fcuda-is-device \ +// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s + +#define __device__ __attribute__((device)) + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef 
_Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local noundef <2 x bfloat> @_Z40test_convertvector_from_half2_to_bfloat2Dv2_DF16_ +// CHECK-SAME: (<2 x half> noundef [[IN:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4, addrspace(5) +// CHECK-NEXT:[[IN_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[IN_ADDR]] to ptr +// CHECK-NEXT:store <2 x half> [[IN]], ptr [[IN_ADDR_ASCAST]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR_ASCAST]], align 4 +// CHECK-NEXT:[[FPEXT:%.*]] =
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,165 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \ +// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s + +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half +// CHECK-NEXT:ret half [[FPTRUNC]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat +// CHECK-NEXT:ret bfloat [[FPTRUNC]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_half2_to_bfloat2( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x half> [[IN1]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[TMP0]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP1]] +// +bfloat2 test_cast_from_half2_to_bfloat2(half2 in) { + return (bfloat2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_half4_to_bfloat4( 
+// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8 +// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8 +// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x half> [[IN1]] to <4 x bfloat> +// CHECK-NEXT:store <4 x bfloat> [[TMP0]], ptr [[RETVAL]], align 8 +// CHECK-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8 +// CHECK-NEXT:ret double [[TMP1]] +// +bfloat4 test_cast_from_half4_to_bfloat4(half4 in) { + return (bfloat4)in; +} + +// CHECK-LABEL: define dso_local i32 @test_cast_from_bfloat2_to_half2( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x bfloat>, ptr [[IN]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x bfloat> [[IN1]] to <2 x half> +// CHECK-NEXT:store <2 x half> [[TMP0]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP1]] +// +half2 test_cast_from_bfloat2_to_half2(bfloat2 in) { + return (half2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_bfloat4_to_half4( +// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:[[IN:%.*]] = alloca <4 x bfloat>, align 8 +// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8 +// CHECK-NEXT:[[IN1:%.*]] = load <4 x bfloat>, ptr [[IN]], align 8 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x bfloat> [[IN1]] to <4 x half> +// CHECK-NEXT:store <4 x half> [[TMP0]], ptr [[RETVAL]], align 8 +// CHECK-NEXT:[[TMP1:%.*]] = load double, ptr [[RETVAL]], align 8 +// 
CHECK-NEXT:ret double [[TMP1]] +// +half4 test_cast_from_bfloat4_to_half4(bfloat4 in) { + return (half4)in; +} + + +// CHECK-LABEL: define dso_local i32 @test_convertvector_from_half2_to_bfloat2( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:[[FPEXT:%.*]] = fpext <2 x half> [[IN1]] to <2 x float> +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc <2 x float> [[FPEXT]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[FPTRUNC]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = load i32,
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051

>From 31ced11517042bcbd6f5f6e544cadf6943c1b1c0 Mon Sep 17 00:00:00 2001
From: Jinjin Li
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc and
fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp            | 15 +-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 165 ++
 2 files changed, 178 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..8e35c801bc9599 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,7 +1431,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
     return Builder.CreateFPToUI(Src, DstTy, "conv");
   }

-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+    Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+    return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
     return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
 }
@@ -1906,7 +1909,15 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
     } else {
       assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
              "Unknown real conversion");
-      if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+      if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {
+        auto *ScrVecTy = cast(SrcTy);
+        Value *FloatVal = Builder.CreateFPExt(
+            Src,
+            llvm::VectorType::get(Builder.getFloatTy(),
+                                  ScrVecTy->getElementCount()),
+            "fpext");
+        Res = Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+      } else if (DstEltTy->getTypeID() <
SrcEltTy->getTypeID()) Res = Builder.CreateFPTrunc(Src, DstTy, "conv"); else Res = Builder.CreateFPExt(Src, DstTy, "conv"); diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c new file mode 100644 index 00..3f60bd13b33d48 --- /dev/null +++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c @@ -0,0 +1,165 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \ +// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s + +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half +// CHECK-NEXT:ret half [[FPTRUNC]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat +// CHECK-NEXT:ret bfloat [[FPTRUNC]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_half2_to_bfloat2( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x half> 
[[IN1]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[TMP0]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP1]] +// +bfloat2 test_cast_from_half2_to_bfloat2(half2 in) { + return (bfloat2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_half4_to_bfloat4( +// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8 +// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8 +// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8 +// CHECK-NEXT:[[TMP0:%
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051

>From 45c6985815f7896c09c1be1eefc10cd4f9cd35af Mon Sep 17 00:00:00 2001
From: Jinjin Li
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc and
fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp            | 15 +-
 .../test/CodeGen/X86/bfloat16-convert-half.c  | 165 ++
 2 files changed, 178 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..8e35c801bc9599 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,7 +1431,10 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
     return Builder.CreateFPToUI(Src, DstTy, "conv");
   }

-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+    Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "fpext");
+    return Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
     return Builder.CreateFPTrunc(Src, DstTy, "conv");
   return Builder.CreateFPExt(Src, DstTy, "conv");
 }
@@ -1906,7 +1909,15 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
     } else {
       assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
              "Unknown real conversion");
-      if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+      if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) {
+        auto *ScrVecTy = cast(SrcTy);
+        Value *FloatVal = Builder.CreateFPExt(
+            Src,
+            llvm::VectorType::get(Builder.getFloatTy(),
+                                  ScrVecTy->getElementCount()),
+            "fpext");
+        Res = Builder.CreateFPTrunc(FloatVal, DstTy, "fptrunc");
+      } else if (DstEltTy->getTypeID() <
SrcEltTy->getTypeID()) Res = Builder.CreateFPTrunc(Src, DstTy, "conv"); else Res = Builder.CreateFPExt(Src, DstTy, "conv"); diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c new file mode 100644 index 00..85ebe8502bc033 --- /dev/null +++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c @@ -0,0 +1,165 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -disable-O0-optnone -emit-llvm \ +// RUN: %s -o - | opt -S -passes=mem2reg | FileCheck %s + +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext bfloat [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to half +// CHECK-NEXT:ret half [[FPTRUNC]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[FPEXT:%.*]] = fpext half [[A]] to float +// CHECK-NEXT:[[FPTRUNC:%.*]] = fptrunc float [[FPEXT]] to bfloat +// CHECK-NEXT:ret bfloat [[FPTRUNC]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_half2_to_bfloat2( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = bitcast <2 x half> 
[[IN1]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[TMP0]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP1]] +// +bfloat2 test_cast_from_half2_to_bfloat2(half2 in) { + return (bfloat2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_half4_to_bfloat4( +// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8 +// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8 +// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8 +// CHECK-NEXT:[[TMP0:%.*
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -1431,9 +1431,13 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
     return Builder.CreateFPToUI(Src, DstTy, "conv");
   }

-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) {
+    Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv");
+    return Builder.CreateFPTrunc(FloatVal, DstTy, "conv");
+  } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())

arsenm wrote:

No else after return (although it's already violated in the existing code)

https://github.com/llvm/llvm-project/pull/89051
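The "no else after return" point can be sketched with a standalone model of the cast selection. This is not LLVM's API: FPKind, is16Bit and selectFPCasts are hypothetical names, and the enum ordering is an assumption that only loosely stands in for the getTypeID() comparison in the hunk above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the floating-point element types involved.
// The ordering (narrower before wider) is an assumption for this sketch and
// is not llvm::Type::TypeID.
enum class FPKind { Half, BFloat, Float, Double };

inline bool is16Bit(FPKind K) {
  return K == FPKind::Half || K == FPKind::BFloat;
}

// Returns the sequence of cast opcodes a front end might emit for Src -> Dst.
inline std::vector<std::string> selectFPCasts(FPKind Src, FPKind Dst) {
  if (Src == Dst)
    return {};

  // Two different 16-bit formats: route the value through float.
  if (is16Bit(Src) && is16Bit(Dst))
    return {"fpext", "fptrunc"};

  // The branch above returned, so a plain `if` suffices here; no `else`.
  if (static_cast<int>(Dst) < static_cast<int>(Src))
    return {"fptrunc"};

  return {"fpext"};
}
```

With the early return in place, each remaining case reads as an independent guard, which is the shape the LLVM coding standards ask for.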
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/arsenm approved this pull request.

LGTM. Would be good to verify the vector case is "correct" in as far as it's what GCC does

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,194 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half +// CHECK-NEXT:ret half [[CONV1]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV1]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr 
[[IN]], align 4
+// CHECK-NEXT:    [[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:    store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>

arsenm wrote:

Does GCC have the same behavior for the bfloat x half case?

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,194 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s

wangpc-pp wrote:

```suggestion
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -disable-O0-optnone -emit-llvm %s -o - | opt -S -passes=mem2reg | FileCheck %s
```

We can run `mem2reg` to reduce CHECKs.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
efriedma-quic wrote:

Looks like automation didn't trigger, but:

> ⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.
> Please turn off [Keep my email addresses private](https://github.com/settings/emails) setting in your account.
> See [LLVM Discourse](https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it) for more information.

(I'll also wait a bit to give @arsenm a chance to respond.)

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,194 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half +// CHECK-NEXT:ret half [[CONV1]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV1]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr 
[[IN]], align 4
+// CHECK-NEXT:    [[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:    store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>

efriedma-quic wrote:

The CastKind in the AST is BitCast. This is inherited from old GNU vector_type semantics. Whether or not that's correct is orthogonal to this patch.

https://github.com/llvm/llvm-project/pull/89051
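The distinction drawn here, a C-style cast between vector types being a bit reinterpretation while __builtin_convertvector converts lane values, can be seen without half or bfloat support. The sketch below is an analogy only: it uses float and int32_t lanes with the GNU vector_size attribute rather than the ext_vector_type typedefs from the test, the names float4v, int4v, castLanes and convertLanes are hypothetical, and it assumes Clang or a GCC recent enough to provide __builtin_convertvector.

```cpp
#include <cstdint>

// GNU vector extension types with four 32-bit lanes each.
typedef float float4v __attribute__((vector_size(16)));
typedef int32_t int4v __attribute__((vector_size(16)));

// A C-style cast between same-sized vector types reinterprets the bits,
// which corresponds to the `bitcast` in the CHECK lines above.
inline int4v castLanes(float4v V) { return (int4v)V; }

// __builtin_convertvector converts each lane's value instead.
inline int4v convertLanes(float4v V) {
  return __builtin_convertvector(V, int4v);
}
```

For an input lane holding 1.0f, castLanes yields the raw encoding 0x3F800000 while convertLanes yields the integer 1, which is why only the convertvector tests are affected by a change to value conversion codegen.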
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/efriedma-quic approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/efriedma-quic edited https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,194 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half +// CHECK-NEXT:ret half [[CONV1]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV1]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr 
[[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[TMP1]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP2]] +// +bfloat2 test_cast_from_fp162_to_bf162(half2 in) { + return (bfloat2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_fp164_to_bf164( +// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8 +// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8 +// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8 +// CHECK-NEXT:store <4 x half> [[IN1]], ptr [[IN_ADDR]], align 8 +// CHECK-NEXT:[[TMP0:%.*]] = load <4 x half>, ptr [[IN_ADDR]], align 8 +// CHECK-NEXT:[[TMP1:%.*]] = bitcast <4 x half> [[TMP0]] to <4 x bfloat> +// CHECK-NEXT:store <4 x bfloat> [[TMP1]], ptr [[RETVAL]], align 8 +// CHECK-NEXT:[[TMP2:%.*]] = load double, ptr [[RETVAL]], align 8 +// CHECK-NEXT:ret double [[TMP2]] +// +bfloat4 test_cast_from_fp164_to_bf164(half4 in) { + return (bfloat4)in; +} + +// CHECK-LABEL: define dso_local i32 @test_cast_from_bf162_to_fp162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x bfloat>, ptr [[IN]], align 4 +// CHECK-NEXT:store <2 x bfloat> [[IN1]], ptr [[IN_ADDR]], align 4 
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x bfloat>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x bfloat> [[TMP0]] to <2 x half>
+// CHECK-NEXT:store <2 x half> [[TMP1]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP2]]
+//
+half2 test_cast_from_bf162_to_fp162(bfloat2 in) {

arsenm wrote:

Rename test to match the new typedef name

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,194 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half +// CHECK-NEXT:ret half [[CONV1]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV1]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr 
[[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[TMP1]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP2]] +// +bfloat2 test_cast_from_fp162_to_bf162(half2 in) { + return (bfloat2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_fp164_to_bf164( +// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <4 x bfloat>, align 8 +// CHECK-NEXT:[[IN:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <4 x half>, align 8 +// CHECK-NEXT:store double [[IN_COERCE]], ptr [[IN]], align 8 +// CHECK-NEXT:[[IN1:%.*]] = load <4 x half>, ptr [[IN]], align 8 +// CHECK-NEXT:store <4 x half> [[IN1]], ptr [[IN_ADDR]], align 8 +// CHECK-NEXT:[[TMP0:%.*]] = load <4 x half>, ptr [[IN_ADDR]], align 8 +// CHECK-NEXT:[[TMP1:%.*]] = bitcast <4 x half> [[TMP0]] to <4 x bfloat> +// CHECK-NEXT:store <4 x bfloat> [[TMP1]], ptr [[RETVAL]], align 8 +// CHECK-NEXT:[[TMP2:%.*]] = load double, ptr [[RETVAL]], align 8 +// CHECK-NEXT:ret double [[TMP2]] +// +bfloat4 test_cast_from_fp164_to_bf164(half4 in) { + return (bfloat4)in; +} + +// CHECK-LABEL: define dso_local i32 @test_cast_from_bf162_to_fp162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x bfloat>, ptr [[IN]], align 4 +// CHECK-NEXT:store <2 x bfloat> [[IN1]], ptr [[IN_ADDR]], align 4 
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x bfloat>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x bfloat> [[TMP0]] to <2 x half>

arsenm wrote:

This bitcast also doesn't look right. I'm shocked that the vector cast behavior seems to treat FP-to-int vectors as bitcast, radically different from the scalar case (which OpenCL doesn't even allow). The comment says it's allowing bitcast between fp/int of the same size, but that's not really what the cast is here.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,194 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half +// CHECK-NEXT:ret half [[CONV1]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV1]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr 
[[IN]], align 4
+// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>

arsenm wrote:

The vector case is still emitting the incorrect bitcast

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> For the vector case, I think you want __builtin_convertvector or something
> like that

Thanks, I have added a __builtin_convertvector test case.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051 >From f61686e42906886a0686158b3050767e60b576fa Mon Sep 17 00:00:00 2001 From: Jinjin Li Date: Wed, 17 Apr 2024 16:44:50 +0800 Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes. --- clang/lib/CodeGen/CGExprScalar.cpp| 18 +- .../test/CodeGen/X86/bfloat16-convert-half.c | 194 ++ 2 files changed, 209 insertions(+), 3 deletions(-) create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp index 1f18e0d5ba409a..1fbeb37de5de60 100644 --- a/clang/lib/CodeGen/CGExprScalar.cpp +++ b/clang/lib/CodeGen/CGExprScalar.cpp @@ -1431,9 +1431,13 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType, return Builder.CreateFPToUI(Src, DstTy, "conv"); } - if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) + if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) { +Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv"); +return Builder.CreateFPTrunc(FloatVal, DstTy, "conv"); + } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) return Builder.CreateFPTrunc(Src, DstTy, "conv"); - return Builder.CreateFPExt(Src, DstTy, "conv"); + else +return Builder.CreateFPExt(Src, DstTy, "conv"); } /// Emit a conversion from the specified type to the specified destination type, @@ -1906,7 +1910,15 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) { } else { assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() && "Unknown real conversion"); -if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) +if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) { + auto *ScrVecTy = cast(SrcTy); + Value *FloatVal = Builder.CreateFPExt( + Src, + llvm::VectorType::get(Builder.getFloatTy(), 
+ScrVecTy->getElementCount()), + "conv"); + Res = Builder.CreateFPTrunc(FloatVal, DstTy, "conv"); +} else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) Res = Builder.CreateFPTrunc(Src, DstTy, "conv"); else Res = Builder.CreateFPExt(Src, DstTy, "conv"); diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c new file mode 100644 index 00..a1b948c873e064 --- /dev/null +++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c @@ -0,0 +1,194 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half +// CHECK-NEXT:ret half [[CONV1]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV1]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 
bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
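The opcode selection in the CGExprScalar.cpp hunk above can be modeled outside the compiler. The following is a minimal sketch, not the actual clang code: `fp_kind`, `conv_plan`, `plan_before`, and `plan_after` are hypothetical names, and the enum order (half, bfloat, float, double) is an assumption that mirrors the ordering the `getTypeID()` comparison in the diff relies on. It assumes the source and destination types differ.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the element types; the order is an assumption
   modeled on the TypeID comparison used in the diff. */
enum fp_kind { FP_HALF, FP_BFLOAT, FP_FLOAT, FP_DOUBLE };

enum conv_plan { PLAN_FPTRUNC, PLAN_FPEXT, PLAN_FPEXT_THEN_FPTRUNC };

bool is_16bit_fp(enum fp_kind k) { return k == FP_HALF || k == FP_BFLOAT; }

/* Rule before the patch: choose the opcode from the ordering alone. */
enum conv_plan plan_before(enum fp_kind src, enum fp_kind dst) {
    return dst < src ? PLAN_FPTRUNC : PLAN_FPEXT;
}

/* Rule after the patch: two different 16-bit formats go through float. */
enum conv_plan plan_after(enum fp_kind src, enum fp_kind dst) {
    if (is_16bit_fp(src) && is_16bit_fp(dst))
        return PLAN_FPEXT_THEN_FPTRUNC;
    return dst < src ? PLAN_FPTRUNC : PLAN_FPEXT;
}
```

Under the old rule, half to bfloat would be planned as a single fpext between two 16-bit types, which is not a widening conversion; the patched rule routes that pair through float instead.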
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
efriedma-quic wrote:

For the vector case, I think you want __builtin_convertvector or something like that

https://github.com/llvm/llvm-project/pull/89051
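The distinction being drawn here (a raw vector cast versus __builtin_convertvector) can be seen with plain float and int vectors. This is a small sketch under stated assumptions: it uses the GCC/Clang `vector_size` attribute rather than the `ext_vector_type` typedefs from the test, and `float4`, `int4`, `reinterpret_bits`, and `convert_values` are illustrative names, not anything from the PR.

```c
/* Assumption: GCC/Clang vector extensions are available. */
typedef float float4 __attribute__((vector_size(16)));
typedef int   int4   __attribute__((vector_size(16)));

/* A C-style cast between same-sized vector types reinterprets the bits. */
int4 reinterpret_bits(float4 v) { return (int4)v; }

/* __builtin_convertvector converts each element by value,
   like a scalar (int) cast applied per lane. */
int4 convert_values(float4 v) { return __builtin_convertvector(v, int4); }
```

For an input of `{1.0f, 2.0f, 3.0f, 0.5f}`, the first function yields the IEEE bit patterns of the floats, while the second yields the truncated integer values.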
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> > But some targets provide a single HW instruction that completes the whole
> > process (fp16->float32->bf16), so they just expose one intrinsic (fp16 -> bf16)
>
> Which is not a bitcast. The correct IR representation of this conversion is
> fpext+fptrunc

I understand, and I have changed it. But for the vector conversion case, the generated IR is still a bitcast.

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051 >From 9e6c2a16172c66b7a9eec7957d95b4239f178368 Mon Sep 17 00:00:00 2001 From: Jinjin Li Date: Wed, 17 Apr 2024 16:44:50 +0800 Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes. --- clang/lib/CodeGen/CGExprScalar.cpp| 15 ++- .../test/CodeGen/X86/bfloat16-convert-half.c | 113 ++ 2 files changed, 125 insertions(+), 3 deletions(-) create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp index 1f18e0d5ba409a..d4c60a2a7ffcaf 100644 --- a/clang/lib/CodeGen/CGExprScalar.cpp +++ b/clang/lib/CodeGen/CGExprScalar.cpp @@ -1431,9 +1431,15 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType, return Builder.CreateFPToUI(Src, DstTy, "conv"); } - if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) + if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) { +Value *FloatVal = Builder.CreateFPExt(Src, Builder.getFloatTy(), "conv"); +// Value *Res = Builder.CreateFPTrunc(FloatVal, DstTy, "conv"); +// return Res; +return Builder.CreateFPTrunc(FloatVal, DstTy, "conv"); + } else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) return Builder.CreateFPTrunc(Src, DstTy, "conv"); - return Builder.CreateFPExt(Src, DstTy, "conv"); + else +return Builder.CreateFPExt(Src, DstTy, "conv"); } /// Emit a conversion from the specified type to the specified destination type, @@ -1906,7 +1912,10 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) { } else { assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() && "Unknown real conversion"); -if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) +if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) { + Value *FloatVal = Builder.CreateFPExt(Src, 
Builder.getFloatTy(), "conv"); + Res = Builder.CreateFPTrunc(FloatVal, DstTy, "conv"); +} else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) Res = Builder.CreateFPTrunc(Src, DstTy, "conv"); else Res = Builder.CreateFPExt(Src, DstTy, "conv"); diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c new file mode 100644 index 00..ad12bd3f654175 --- /dev/null +++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c @@ -0,0 +1,113 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext bfloat [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to half +// CHECK-NEXT:ret half [[CONV1]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = fpext half [[TMP0]] to float +// CHECK-NEXT:[[CONV1:%.*]] = fptrunc float [[CONV]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV1]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 
__attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
arsenm wrote:

> But some targets provide a single HW instruction that completes the whole
> process (fp16->float32->bf16), so they just expose one intrinsic (fp16 -> bf16)

Which is not a bitcast. The correct IR representation of this conversion is fpext+fptrunc

https://github.com/llvm/llvm-project/pull/89051
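To make the point concrete, here is a software sketch of why fp16 to bf16 is a value conversion and not a bitcast. It works on raw 16-bit patterns so that it does not depend on compiler support for `_Float16` or `__bf16`; all function names are hypothetical, and the rounding mode (round to nearest, ties to even) is an assumption about the default environment.

```c
#include <stdint.h>
#include <string.h>

/* "fpext" step: widen IEEE binary16 bits to a float. This is exact. */
float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {                 /* infinity or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {              /* normal: rebias exponent 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {              /* signed zero */
        bits = sign;
    } else {                            /* subnormal: normalize the mantissa */
        exp = 113u;
        while ((man & 0x400u) == 0) {
            man <<= 1;
            exp--;
        }
        bits = sign | (exp << 23) | ((man & 0x3FFu) << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* "fptrunc" step: narrow a float to bfloat16 bits, round to nearest even. */
uint16_t float_to_bf16_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)      /* NaN: keep it a quiet NaN */
        return (uint16_t)((bits >> 16) | 0x0040u);
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (uint16_t)(bits >> 16);
}

/* The conversion discussed in the thread: fpext followed by fptrunc. */
uint16_t half_to_bf16_bits(uint16_t h) {
    return float_to_bf16_bits(half_bits_to_float(h));
}

/* What a bitcast would mean: the same 16 bits read as a bfloat16 value. */
float bf16_bits_to_float(uint16_t b) {
    uint32_t bits = (uint32_t)b << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The half pattern `0x3C00` encodes 1.0. Converting it gives the bfloat16 pattern `0x3F80`, which is also 1.0, whereas reading `0x3C00` directly as bfloat16 gives 0.0078125.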
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051 >From a8d31ec6602f55f845b9e508f71a42f83e3a474e Mon Sep 17 00:00:00 2001 From: Jinjin Li Date: Wed, 17 Apr 2024 16:44:50 +0800 Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes. --- clang/lib/CodeGen/CGExprScalar.cpp| 11 +- .../test/CodeGen/X86/bfloat16-convert-half.c | 111 ++ 2 files changed, 119 insertions(+), 3 deletions(-) create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp index 1f18e0d5ba409a..da5a410f040d1b 100644 --- a/clang/lib/CodeGen/CGExprScalar.cpp +++ b/clang/lib/CodeGen/CGExprScalar.cpp @@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType, return Builder.CreateFPToUI(Src, DstTy, "conv"); } - if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) + if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) +return Builder.CreateBitCast(Src, DstTy, "conv"); + else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) return Builder.CreateFPTrunc(Src, DstTy, "conv"); - return Builder.CreateFPExt(Src, DstTy, "conv"); + else +return Builder.CreateFPExt(Src, DstTy, "conv"); } /// Emit a conversion from the specified type to the specified destination type, @@ -1906,7 +1909,9 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) { } else { assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() && "Unknown real conversion"); -if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) +if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) + Res = Builder.CreateBitCast(Src, DstTy, "conv"); +else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) Res = Builder.CreateFPTrunc(Src, DstTy, "conv"); else Res = Builder.CreateFPExt(Src, DstTy, "conv"); diff 
--git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c new file mode 100644 index 00..048789d3adecc8 --- /dev/null +++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c @@ -0,0 +1,111 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half +// CHECK-NEXT:ret half [[CONV]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = bitcast half [[TMP0]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 half2 __attribute__((ext_vector_type(2))); +typedef _Float16 half4 __attribute__((ext_vector_type(4))); + +typedef __bf16 bfloat2 __attribute__((ext_vector_type(2))); +typedef __bf16 bfloat4 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// 
CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4 +// CHECK-NEXT:[[TMP1:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat> +// CHECK-NEXT:store <2 x bfloat> [[TMP1]], ptr [[RETVAL]], align 4 +// CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[RETVAL]], align 4 +// CHECK-NEXT:ret i32 [[TMP2]] +// +bfloat2 test_cast_from_fp162_to_bf162(half2 in) { + return (bfloat2)in; +} + + +// CHECK-LABEL: define dso_local double @test_cast_from_fp164_to_bf164( +// CHECK-SAME: double noundef [[IN_COERCE:%.*]]) #[[A
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,109 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half +// CHECK-NEXT:ret half [[CONV]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = bitcast half [[TMP0]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 _Float162 __attribute__((ext_vector_type(2))); +typedef _Float16 _Float164 __attribute__((ext_vector_type(4))); + +typedef __bf16 __bf162 __attribute__((ext_vector_type(2))); +typedef __bf16 __bf164 __attribute__((ext_vector_type(4))); arsenm wrote: Can you use a different typedef name? Maybe half2/bfloat2? https://github.com/llvm/llvm-project/pull/89051 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/arsenm requested changes to this pull request.

Bitcast is not the correct behavior

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,109 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s +// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16( +// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca bfloat, align 2 +// CHECK-NEXT:store bfloat [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = bitcast bfloat [[TMP0]] to half +// CHECK-NEXT:ret half [[CONV]] +// +_Float16 test_convert_from_bf16_to_fp16(__bf16 a) { +return (_Float16)a; +} + +// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16( +// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[A_ADDR:%.*]] = alloca half, align 2 +// CHECK-NEXT:store half [[A]], ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2 +// CHECK-NEXT:[[CONV:%.*]] = bitcast half [[TMP0]] to bfloat +// CHECK-NEXT:ret bfloat [[CONV]] +// +__bf16 test_convert_from_fp16_to_bf16(_Float16 a) { +return (__bf16)a; +} + +typedef _Float16 _Float162 __attribute__((ext_vector_type(2))); +typedef _Float16 _Float164 __attribute__((ext_vector_type(4))); + +typedef __bf16 __bf162 __attribute__((ext_vector_type(2))); +typedef __bf16 __bf164 __attribute__((ext_vector_type(4))); + +// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162( +// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca <2 x bfloat>, align 4 +// CHECK-NEXT:[[IN:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:[[IN_ADDR:%.*]] = alloca <2 x half>, align 4 +// CHECK-NEXT:store i32 [[IN_COERCE]], ptr [[IN]], align 4 +// CHECK-NEXT:[[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4 +// CHECK-NEXT:store <2 x half> 
[[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:[[CONV:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>
+// CHECK-NEXT:store <2 x bfloat> [[CONV]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:ret i32 [[TMP1]]
+//
+__bf162 test_cast_from_fp162_to_bf162(_Float162 in) {
+ return __builtin_convertvector(in, __bf162);

arsenm wrote:

I meant test the raw cast, not this builtin

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051 >From 8d0ed7653c21315c5c920114b3a2d7686e54b9ca Mon Sep 17 00:00:00 2001 From: Jinjin Li Date: Wed, 17 Apr 2024 16:44:50 +0800 Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes. --- clang/lib/CodeGen/CGExprScalar.cpp| 11 +- .../test/CodeGen/X86/bfloat16-convert-half.c | 109 ++ 2 files changed, 117 insertions(+), 3 deletions(-) create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp index 1f18e0d5ba409a..da5a410f040d1b 100644 --- a/clang/lib/CodeGen/CGExprScalar.cpp +++ b/clang/lib/CodeGen/CGExprScalar.cpp @@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType, return Builder.CreateFPToUI(Src, DstTy, "conv"); } - if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) + if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy())) +return Builder.CreateBitCast(Src, DstTy, "conv"); + else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID()) return Builder.CreateFPTrunc(Src, DstTy, "conv"); - return Builder.CreateFPExt(Src, DstTy, "conv"); + else +return Builder.CreateFPExt(Src, DstTy, "conv"); } /// Emit a conversion from the specified type to the specified destination type, @@ -1906,7 +1909,9 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) { } else { assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() && "Unknown real conversion"); -if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) +if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy())) + Res = Builder.CreateBitCast(Src, DstTy, "conv"); +else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID()) Res = Builder.CreateFPTrunc(Src, DstTy, "conv"); else Res = Builder.CreateFPExt(Src, DstTy, "conv"); diff 
--git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..138793c242db16
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,109 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert_from_bf16_to_fp16(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:    store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:    [[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:    [[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:    ret half [[CONV]]
+//
+_Float16 test_convert_from_bf16_to_fp16(__bf16 a) {
+    return (_Float16)a;
+}
+
+// CHECK-LABEL: define dso_local bfloat @test_convert_from_fp16_to_bf16(
+// CHECK-SAME: half noundef [[A:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[A_ADDR:%.*]] = alloca half, align 2
+// CHECK-NEXT:    store half [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:    [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:    [[CONV:%.*]] = bitcast half [[TMP0]] to bfloat
+// CHECK-NEXT:    ret bfloat [[CONV]]
+//
+__bf16 test_convert_from_fp16_to_bf16(_Float16 a) {
+    return (__bf16)a;
+}
+
+typedef _Float16 _Float162 __attribute__((ext_vector_type(2)));
+typedef _Float16 _Float164 __attribute__((ext_vector_type(4)));
+
+typedef __bf16 __bf162 __attribute__((ext_vector_type(2)));
+typedef __bf16 __bf164 __attribute__((ext_vector_type(4)));
+
+// CHECK-LABEL: define dso_local i32 @test_cast_from_fp162_to_bf162(
+// CHECK-SAME: i32 noundef [[IN_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[RETVAL:%.*]] = alloca <2 x bfloat>, align 4
+// CHECK-NEXT:    [[IN:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:    [[IN_ADDR:%.*]] = alloca <2 x half>, align 4
+// CHECK-NEXT:    store i32 [[IN_COERCE]], ptr [[IN]], align 4
+// CHECK-NEXT:    [[IN1:%.*]] = load <2 x half>, ptr [[IN]], align 4
+// CHECK-NEXT:    store <2 x half> [[IN1]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load <2 x half>, ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[CONV:%.*]] = bitcast <2 x half> [[TMP0]] to <2 x bfloat>
+// CHECK-NEXT:    store <2 x bfloat> [[CONV]], ptr [[RETVAL]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[RETVAL]], align 4
+// CHECK-NEXT:    ret i32 [[TMP1]]
+//
+__bf162 test_cast_from_fp162_to_bf162(_Float162 in) {
+  return __builtin_convertvector(in, __bf162);
+}
+
+// CHECK-LABEL: define dso_local double @test_cast_from_fp164_to_bf164(
+// CHECK-SAME: dou
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> > This appears to just assert today, but interpreting this as bitcast doesn't make sense. I would expect this to emit a pair of casts, fpext to float, and fptrunc down to half
>
> If we don't just reject it as an invalid cast

I understand your point: converting fp16 to bf16 behaves like a two-step hardware operation (first fpext fp16 to float32, then fptrunc float32 to bfloat16). However, some targets provide a single hardware instruction that completes the whole process (fp16 -> float32 -> bf16), so they supply just one intrinsic (fp16 -> bf16).

https://github.com/llvm/llvm-project/pull/89051
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
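To make the difference between the two-step value conversion and a bitcast concrete, here is a minimal sketch that works purely on raw bit patterns. The helper names (`halfBitsToFloatBits`, `floatBitsToBF16Bits`, `halfBitsToBF16Bits`) are hypothetical and are not LLVM or Clang APIs, and round-to-nearest-even is assumed for the truncation step.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs) that operate on raw IEEE bit patterns.

// Models `fpext half -> float`: widening is exact for every half value.
constexpr std::uint32_t halfBitsToFloatBits(std::uint16_t h) {
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  int exp = (h >> 10) & 0x1F;     // 5-bit exponent field, bias 15
  std::uint32_t man = h & 0x3FFu; // 10-bit mantissa field
  if (exp == 0x1F)                // infinity or NaN
    return sign | 0x7F800000u | (man << 13);
  if (exp == 0) {
    if (man == 0)
      return sign;                // signed zero
    // Subnormal half: shift the mantissa until the implicit bit appears.
    exp = 1;
    while ((man & 0x400u) == 0) {
      man <<= 1;
      --exp;
    }
    man &= 0x3FFu;
  }
  // Rebias the exponent from 15 to 127 (difference of 112).
  return sign | (static_cast<std::uint32_t>(exp + 112) << 23) | (man << 13);
}

// Models `fptrunc float -> bfloat`, assuming round-to-nearest-even.
constexpr std::uint16_t floatBitsToBF16Bits(std::uint32_t f) {
  if ((f & 0x7FFFFFFFu) > 0x7F800000u) // NaN: truncate and set the quiet bit
    return static_cast<std::uint16_t>((f >> 16) | 0x0040u);
  const std::uint32_t bias = 0x7FFFu + ((f >> 16) & 1u); // ties go to even
  return static_cast<std::uint16_t>((f + bias) >> 16);
}

// The pair of casts discussed above: fp16 -> float32 -> bf16.
constexpr std::uint16_t halfBitsToBF16Bits(std::uint16_t h) {
  return floatBitsToBF16Bits(halfBitsToFloatBits(h));
}

// half 1.0 is 0x3C00; bf16 1.0 is 0x3F80. A bitcast would keep 0x3C00.
static_assert(halfBitsToBF16Bits(0x3C00) == 0x3F80, "1.0 stays 1.0");
static_assert(halfBitsToBF16Bits(0x3C00) != 0x3C00, "bitcast changes the value");
// The largest finite half (65504) is not representable in bf16 and rounds up.
static_assert(halfBitsToBF16Bits(0x7BFF) == 0x4780, "65504 rounds to 65536");
```

Under these assumptions, half 1.0 (`0x3C00`) converts to bf16 1.0 (`0x3F80`), whereas reinterpreting the same 16 bits as bf16 reads as 2^-7 (0.0078125). The two formats share a width but not an exponent/mantissa split, which is why a bitcast does not preserve the value.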
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
JinjinLi868 wrote:

> This appears to just assert today, but interpreting this as bitcast doesn't make sense. I would expect this to emit a pair of casts, fpext to float, and fptrunc down to half

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
arsenm wrote:

> This appears to just assert today, but interpreting this as bitcast doesn't make sense. I would expect this to emit a pair of casts, fpext to float, and fptrunc down to half

If we don't just reject it as an invalid cast

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
@@ -0,0 +1,14 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:    store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:    [[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:    [[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:    ret half [[CONV]]
+//
+_Float16 test_convert(__bf16 a) {
+    return (_Float16)a;
+}

arsenm wrote:

Can you also test some vector cases?

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/arsenm commented:

This appears to just assert today, but interpreting this as bitcast doesn't make sense. I would expect this to emit a pair of casts, fpext to float, and fptrunc down to half

https://github.com/llvm/llvm-project/pull/89051
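The cast-selection problem described in this comment can be modeled without any LLVM dependency. The following is a rough sketch, not the actual Clang code: `FPKind`, `CastPlan`, `planByTypeID`, `planViaFloat`, and `isValid` are hypothetical names, and the enum order is assumed to mirror the relative order of the half, bfloat, float, and double entries in LLVM's `Type::TypeID`. It shows why comparing type IDs alone picks an `fptrunc` or `fpext` between two 16-bit types, which LLVM IR does not allow because those casts require the bit width to shrink or grow.

```cpp
#include <cstdint>

// Hypothetical stand-in for the floating-point kinds involved. The numeric
// order is assumed to match the relative order in LLVM's Type::TypeID.
enum class FPKind : std::uint8_t { Half = 0, BFloat = 1, Float = 2, Double = 3 };

// Hypothetical description of which cast(s) a conversion would emit.
enum class CastPlan { None, FPTrunc, FPExt, ExtThenTrunc };

constexpr unsigned bitWidth(FPKind k) {
  switch (k) {
  case FPKind::Half:   return 16;
  case FPKind::BFloat: return 16;
  case FPKind::Float:  return 32;
  case FPKind::Double: return 64;
  }
  return 0;
}

// Models the rule in the code before the patch: compare type IDs only.
constexpr CastPlan planByTypeID(FPKind src, FPKind dst) {
  return static_cast<int>(dst) < static_cast<int>(src) ? CastPlan::FPTrunc
                                                       : CastPlan::FPExt;
}

// fptrunc needs a strictly narrower destination, fpext a strictly wider one.
constexpr bool isValid(CastPlan plan, FPKind src, FPKind dst) {
  switch (plan) {
  case CastPlan::None:    return src == dst;
  case CastPlan::FPTrunc: return bitWidth(dst) < bitWidth(src);
  case CastPlan::FPExt:   return bitWidth(dst) > bitWidth(src);
  case CastPlan::ExtThenTrunc:
    return src != dst && bitWidth(src) == 16 && bitWidth(dst) == 16;
  }
  return false;
}

// Models the reviewer's suggestion: route half <-> bfloat through float.
constexpr CastPlan planViaFloat(FPKind src, FPKind dst) {
  if (src == dst)
    return CastPlan::None;
  if (bitWidth(src) == 16 && bitWidth(dst) == 16)
    return CastPlan::ExtThenTrunc; // fpext to float, then fptrunc
  return bitWidth(dst) < bitWidth(src) ? CastPlan::FPTrunc : CastPlan::FPExt;
}

// half -> bfloat and bfloat -> half both get a same-width cast by type ID.
static_assert(!isValid(planByTypeID(FPKind::Half, FPKind::BFloat),
                       FPKind::Half, FPKind::BFloat), "same-width fpext");
static_assert(!isValid(planByTypeID(FPKind::BFloat, FPKind::Half),
                       FPKind::BFloat, FPKind::Half), "same-width fptrunc");
static_assert(isValid(planViaFloat(FPKind::Half, FPKind::BFloat),
                      FPKind::Half, FPKind::BFloat), "pair of casts is valid");
```

In this model the type-ID rule still behaves as expected for conversions such as float to half, so only the half/bfloat pair needs the extra branch. Whether that branch should emit the pair of casts or reject the conversion outright is the open question in the review.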
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 updated https://github.com/llvm/llvm-project/pull/89051

>From 69a584119d8978d0ea3177c59d8772f00df3a68e Mon Sep 17 00:00:00 2001
From: Jinjin Li
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] Fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp | 11 ---
 clang/test/CodeGen/X86/bfloat16-convert-half.c | 14 ++
 2 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGen/X86/bfloat16-convert-half.c

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..da5a410f040d1b 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
       return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy()))
+    return Builder.CreateBitCast(Src, DstTy, "conv");
+  else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
     return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+    return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination type,
@@ -1906,7 +1909,9 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
     assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
            "Unknown real conversion");
-    if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+    if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy()))
+      Res = Builder.CreateBitCast(Src, DstTy, "conv");
+    else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
       Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
     else
       Res = Builder.CreateFPExt(Src, DstTy, "conv");
diff --git a/clang/test/CodeGen/X86/bfloat16-convert-half.c b/clang/test/CodeGen/X86/bfloat16-convert-half.c
new file mode 100644
index 00..4a13e2c33c8149
--- /dev/null
+++ b/clang/test/CodeGen/X86/bfloat16-convert-half.c
@@ -0,0 +1,14 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s
+// CHECK-LABEL: define dso_local half @test_convert(
+// CHECK-SAME: bfloat noundef [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[A_ADDR:%.*]] = alloca bfloat, align 2
+// CHECK-NEXT:    store bfloat [[A]], ptr [[A_ADDR]], align 2
+// CHECK-NEXT:    [[TMP0:%.*]] = load bfloat, ptr [[A_ADDR]], align 2
+// CHECK-NEXT:    [[CONV:%.*]] = bitcast bfloat [[TMP0]] to half
+// CHECK-NEXT:    ret half [[CONV]]
+//
+_Float16 test_convert(__bf16 a) {
+    return (_Float16)a;
+}
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
llvmbot wrote:

@llvm/pr-subscribers-clang

Author: None (JinjinLi868)

Changes

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes.

---
Full diff: https://github.com/llvm/llvm-project/pull/89051.diff

1 Files Affected:

- (modified) clang/lib/CodeGen/CGExprScalar.cpp (+8-3)

```diff
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..da5a410f040d1b 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
       return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy()))
+    return Builder.CreateBitCast(Src, DstTy, "conv");
+  else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
     return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+    return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination type,
@@ -1906,7 +1909,9 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
     assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
            "Unknown real conversion");
-    if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+    if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy()))
+      Res = Builder.CreateBitCast(Src, DstTy, "conv");
+    else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
       Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
     else
       Res = Builder.CreateFPExt(Src, DstTy, "conv");
```

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
github-actions[bot] wrote:

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using `@` followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the [LLVM GitHub User Guide](https://llvm.org/docs/GitHub.html).

You can also ask questions in a comment on this PR, on the [LLVM Discord](https://discord.com/invite/xS7Z362) or on the [forums](https://discourse.llvm.org/).

https://github.com/llvm/llvm-project/pull/89051
[clang] [clang] fix half && bfloat16 convert node expr codegen (PR #89051)
https://github.com/JinjinLi868 created https://github.com/llvm/llvm-project/pull/89051

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes.

>From 02c11a9db49dd34839feb8329cfb8dbe4bc45763 Mon Sep 17 00:00:00 2001
From: Jinjin Li
Date: Wed, 17 Apr 2024 16:44:50 +0800
Subject: [PATCH] [clang] fix half && bfloat16 convert node expr codegen

Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes.
---
 clang/lib/CodeGen/CGExprScalar.cpp | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 1f18e0d5ba409a..da5a410f040d1b 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -1431,9 +1431,12 @@ Value *ScalarExprEmitter::EmitScalarCast(Value *Src, QualType SrcType,
       return Builder.CreateFPToUI(Src, DstTy, "conv");
   }
 
-  if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
+  if ((DstElementTy->is16bitFPTy() && SrcElementTy->is16bitFPTy()))
+    return Builder.CreateBitCast(Src, DstTy, "conv");
+  else if (DstElementTy->getTypeID() < SrcElementTy->getTypeID())
     return Builder.CreateFPTrunc(Src, DstTy, "conv");
-  return Builder.CreateFPExt(Src, DstTy, "conv");
+  else
+    return Builder.CreateFPExt(Src, DstTy, "conv");
 }
 
 /// Emit a conversion from the specified type to the specified destination type,
@@ -1906,7 +1909,9 @@ Value *ScalarExprEmitter::VisitConvertVectorExpr(ConvertVectorExpr *E) {
   } else {
     assert(SrcEltTy->isFloatingPointTy() && DstEltTy->isFloatingPointTy() &&
            "Unknown real conversion");
-    if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
+    if ((DstEltTy->is16bitFPTy() && SrcEltTy->is16bitFPTy()))
+      Res = Builder.CreateBitCast(Src, DstTy, "conv");
+    else if (DstEltTy->getTypeID() < SrcEltTy->getTypeID())
       Res = Builder.CreateFPTrunc(Src, DstTy, "conv");
     else
       Res = Builder.CreateFPExt(Src, DstTy, "conv");